Evolutionary-scale prediction of atomic-level protein structure with a language model
Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.
Top-30
Journals
- Briefings in Bioinformatics: 106 publications, 3.26%
- Journal of Chemical Information and Modeling: 100 publications, 3.08%
- Bioinformatics: 97 publications, 2.99%
- Nature Communications: 92 publications, 2.83%
- bioRxiv: 84 publications, 2.59%
- Computational and Structural Biotechnology Journal: 42 publications, 1.29%
- Protein Science: 39 publications, 1.20%
- Current Opinion in Structural Biology: 38 publications, 1.17%
- PLoS Computational Biology: 35 publications, 1.08%
- Nucleic Acids Research: 35 publications, 1.08%
- Nature Machine Intelligence: 34 publications, 1.05%
- Proceedings of the National Academy of Sciences of the United States of America: 34 publications, 1.05%
- Proteins: Structure, Function and Genetics: 31 publications, 0.95%
- Scientific Reports: 31 publications, 0.95%
- Methods in Molecular Biology: 31 publications, 0.95%
- Nature Methods: 30 publications, 0.92%
- Lecture Notes in Computer Science: 29 publications, 0.89%
- International Journal of Molecular Sciences: 28 publications, 0.86%
- International Journal of Biological Macromolecules: 26 publications, 0.80%
- Cell Systems: 24 publications, 0.74%
- Journal of Molecular Biology: 23 publications, 0.71%
- Nature: 22 publications, 0.68%
- Computers in Biology and Medicine: 19 publications, 0.58%
- Science: 18 publications, 0.55%
- Advanced Science: 18 publications, 0.55%
- eLife: 18 publications, 0.55%
- Nature Biotechnology: 16 publications, 0.49%
- BMC Bioinformatics: 15 publications, 0.46%
- Communications Biology: 15 publications, 0.46%
Publishers
- Cold Spring Harbor Laboratory: 889 publications, 27.37%
- Springer Nature: 516 publications, 15.89%
- Elsevier: 500 publications, 15.39%
- Oxford University Press: 291 publications, 8.96%
- American Chemical Society (ACS): 187 publications, 5.76%
- Wiley: 172 publications, 5.30%
- MDPI: 105 publications, 3.23%
- Institute of Electrical and Electronics Engineers (IEEE): 99 publications, 3.05%
- Public Library of Science (PLoS): 51 publications, 1.57%
- Frontiers Media S.A.: 44 publications, 1.35%
- Proceedings of the National Academy of Sciences (PNAS): 34 publications, 1.05%
- American Association for the Advancement of Science (AAAS): 33 publications, 1.02%
- Taylor & Francis: 26 publications, 0.80%
- Royal Society of Chemistry (RSC): 25 publications, 0.77%
- eLife Sciences Publications: 18 publications, 0.55%
- Association for Computing Machinery (ACM): 17 publications, 0.52%
- American Society for Microbiology: 16 publications, 0.49%
- International Union of Crystallography (IUCr): 14 publications, 0.43%
- Annual Reviews: 9 publications, 0.28%
- American Physical Society (APS): 9 publications, 0.28%
- AIP Publishing: 7 publications, 0.22%
- Research Square Platform LLC: 7 publications, 0.22%
- Science in China Press: 7 publications, 0.22%
- IOP Publishing: 6 publications, 0.18%
- Walter de Gruyter: 5 publications, 0.15%
- PeerJ: 5 publications, 0.15%
- Mary Ann Liebert: 4 publications, 0.12%
- SAGE: 4 publications, 0.12%
- The Royal Society: 4 publications, 0.12%
- We do not take into account publications without a DOI.
- Statistics are recalculated weekly.