Miriam Bergeret, MSc
Ten years after the start of the Human Proteome Project (HPP), the Human Proteome Organization announced last month that the first draft of the human proteome map—based on approximately 470 terabytes of data from 5,658 human datasets—is now complete.
According to the recent report published in Nature Communications, the map includes high-quality entries for just over 90 percent of the 19,773 proteins predicted in the human proteome, with data from tangible sources such as mass spectrometry and amino acid sequencing, as well as transcriptional data gathered using reverse-transcriptase PCR, northern blotting, and other techniques.
The project also provides information about expressed proteins, including their spatiotemporal localization, transport, and post-translational modification, the researchers report. They hope that adding this proteomic dimension to existing genomic data could provide a more complete picture of diseases such as cancer and advance precision medicine, which relies on large-scale -omics data to guide treatment and identify new drug targets.
Although the human proteome blueprint is largely complete, the project is far from over. Ten percent of genome-encoded proteins remain uncategorized (a.k.a. the dark proteome). The researchers anticipate that identifying these 1,899 “missing proteins” will require collecting rare cell and tissue samples and enriching low-abundance proteins for analysis using mass spectrometry.
Researchers will also continue to explore protein drivers of disease and focus on developing new analytical tools and assays that can help diagnose and treat disease.