Proteins are essential biomolecules for the functioning of cells, tissues, and organs. The proteome, defined as the complete set of proteins expressed in cells, has been analyzed mainly by mass spectrometry (MS) to characterize a wide variety of samples based on their protein profiles. In recent years, MS and related technologies have advanced, and a growing number of proteomic datasets have been collected by research groups all over the world. However, the raw MS data formats, data analysis workflows, and final results depend on several factors, including the MS instruments used, the analysis software employed, and the parameters optimized by each research group. Thus, it is very challenging to integrate datasets from different research groups.
Since 2011, the Database Integration Coordination Program, conducted by the Japanese government, has been promoting the sharing of life science data. We joined this program in 2015 and developed a proteomic database named jPOST (Japan ProteOme STandard Repository/Database) to integrate proteome datasets generated from multiple projects and institutions. Researchers can now upload their proteome datasets to the jPOST repository, where the raw MS data are re-processed using the jPOST standard protocol to automatically generate high-quality databases for data comparison and integration. Finally, peer researchers can access preset databases or create customized queries.
We hope that jPOST will facilitate research on the proteome and in other areas of the life sciences.