The Protein Data Bank



Human beings are composed of trillions of cells, each of which is powered by small biological machines called proteins. Almost all diseases, disorders, and infections that humanity faces can ultimately be traced back to a problem with a specific type of protein, or to an organism (a bacterium, parasite, etc.) that relies on the function of specific proteins. For this reason, essentially all drugs and treatments consist of small molecules or physical interventions that act by influencing particular proteins. Before we can identify candidate proteins and molecules to target in new drug research, we need to understand the nature and shapes of proteins.

For more than half a century, a massive database called the Protein Data Bank (PDB) has existed thanks to federal funding from the National Science Foundation and the National Institutes of Health. Originally created at Brookhaven National Laboratory, this database currently contains about 230,000 entries that show the physical structure of various proteins and molecules, with around 10,000 new structures added each year. The database is open-access, and we encourage readers to “explore the proteins” that make our bodies work (such as the nanoscale motor we use to generate energy, complete with motor, axle, stator, and generator).

Although drug development often involves private-sector research, incentivized by the returns companies earn on their products, that research relies on these basic protein structures to get started. Between 2010 and 2016, 88% of new FDA-approved molecules targeted proteins whose structures had been deposited in the PDB for free access, often a decade earlier, as a result of federally funded, non-proprietary research. Indeed, every single new FDA-approved drug in this timespan relied to some extent on previous NIH-funded research. Recent analyses suggest that the value of the time spent using the PDB equates to about $5.5 billion each year, which is 800 times more than its operating cost. A conservative estimate of its impact on society due to publicly funded research alone is $1.1 billion a year.

This database has supported research on proteins and molecules involved in cancer and cancer immunotherapy, lipid storage diseases, diabetes, muscular dystrophy, epilepsy, cystic fibrosis, cardiovascular disease, antibiotics, living fuel cells, antitoxins, coronavirus, plastic decomposition, genetics, Alzheimer’s, depression, obesity, vision, oral health, and many more. Data from the PDB were also vital in powering recent AI advances in predicting protein structures, which should accelerate future drug design. You can see a full list of all the diseases with associated structures on PDB here.


