Volume 36, Issue 7, Pages 7628-7636

Frozen Pretrained Transformers as Universal Computation Engines

Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
Publication type: Journal Article
Publication date: 2022-06-28
Abstract

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language can improve performance and compute efficiency on non-language downstream tasks. Additionally, we perform an analysis of the architecture, comparing the performance of a randomly initialized transformer to a randomly initialized LSTM. Combining the two insights, we find that language-pretrained transformers can obtain strong performance on a variety of non-language tasks.
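The core recipe described in the abstract, freezing the pretrained self-attention and feedforward blocks while training only small input and output layers for the new modality, is straightforward to sketch. The snippet below is a minimal illustration, assuming a Hugging Face GPT-2 backbone; the `.attn.`/`.mlp.` name filtering, the linear input projection, the last-token readout, and the toy dimensions are assumptions made for illustration, not the authors' exact code.

```python
# Minimal sketch of a Frozen Pretrained Transformer (FPT) classifier,
# assuming a Hugging Face GPT-2 backbone; module names follow that
# implementation, and details may differ from the paper's released code.
import torch
import torch.nn as nn
from transformers import GPT2Model


class FPTClassifier(nn.Module):
    def __init__(self, input_dim, num_classes, model_name="gpt2"):
        super().__init__()
        self.gpt2 = GPT2Model.from_pretrained(model_name)
        hidden = self.gpt2.config.n_embd

        # Freeze the self-attention and feedforward (MLP) blocks; layer norms
        # and positional embeddings are left trainable by default.
        for name, param in self.gpt2.named_parameters():
            if ".attn." in name or ".mlp." in name:
                param.requires_grad = False

        # New, trainable input and output layers for the target modality.
        self.input_proj = nn.Linear(input_dim, hidden)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, input_dim) sequence from the non-language task.
        hidden_states = self.gpt2(inputs_embeds=self.input_proj(x)).last_hidden_state
        # Read out the prediction from the final token position.
        return self.classifier(hidden_states[:, -1])


# Hypothetical usage: a toy sequence task with 8-dim inputs and 2 classes.
model = FPTClassifier(input_dim=8, num_classes=2)
logits = model(torch.randn(4, 16, 8))  # -> shape (4, 2)
```

In this sketch only the input projection, classifier head, layer norms, and positional embeddings receive gradients, so the bulk of the pretrained weights stay frozen during finetuning, which is the source of the compute-efficiency gains the abstract refers to.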

Top-30

Journals

Lecture Notes in Computer Science: 6 publications (8.96%)
Briefings in Bioinformatics: 2 publications (2.99%)
IEEE Transactions on Multimedia: 2 publications (2.99%)
IEEE Transactions on Intelligent Transportation Systems: 2 publications (2.99%)
Frontiers of Information Technology and Electronic Engineering: 2 publications (2.99%)
Information (Switzerland): 1 publication (1.49%)
Computers: 1 publication (1.49%)
Nature Machine Intelligence: 1 publication (1.49%)
Lecture Notes in Networks and Systems: 1 publication (1.49%)
Autonomous Robots: 1 publication (1.49%)
IEEE Transactions on Pattern Analysis and Machine Intelligence: 1 publication (1.49%)
IEEE/ACM Transactions on Computational Biology and Bioinformatics: 1 publication (1.49%)
Computers and Chemical Engineering: 1 publication (1.49%)
IEEE Signal Processing Letters: 1 publication (1.49%)
Reliability Engineering and System Safety: 1 publication (1.49%)
IEEE Transactions on Information Forensics and Security: 1 publication (1.49%)
Applied Sciences (Switzerland): 1 publication (1.49%)
Neural Computing and Applications: 1 publication (1.49%)
bioRxiv: 1 publication (1.49%)
BioData Mining: 1 publication (1.49%)
Technologies: 1 publication (1.49%)
Neurocomputing: 1 publication (1.49%)
Scientific Reports: 1 publication (1.49%)
Energy Conversion and Management: 1 publication (1.49%)
Geophysical Research Letters: 1 publication (1.49%)
IEEE Transactions on Geoscience and Remote Sensing: 1 publication (1.49%)
PLoS Computational Biology: 1 publication (1.49%)
Information Fusion: 1 publication (1.49%)
IEEE Transactions on Knowledge and Data Engineering: 1 publication (1.49%)

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 26 publications (38.81%)
Springer Nature: 15 publications (22.39%)
Association for Computing Machinery (ACM): 7 publications (10.45%)
MDPI: 5 publications (7.46%)
Elsevier: 5 publications (7.46%)
Oxford University Press: 2 publications (2.99%)
Cold Spring Harbor Laboratory: 2 publications (2.99%)
Wiley: 1 publication (1.49%)
Public Library of Science (PLoS): 1 publication (1.49%)
IntechOpen: 1 publication (1.49%)
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
Citing publications: 67
Cite this

GOST
Lu K. et al. Frozen Pretrained Transformers as Universal Computation Engines // Proceedings of the AAAI Conference on Artificial Intelligence. 2022. Vol. 36. No. 7. pp. 7628-7636.
GOST (all authors)
Lu K., Grover A., Abbeel P., Mordatch I. Frozen Pretrained Transformers as Universal Computation Engines // Proceedings of the AAAI Conference on Artificial Intelligence. 2022. Vol. 36. No. 7. pp. 7628-7636.
RIS
TY - JOUR
DO - 10.1609/aaai.v36i7.20729
UR - https://doi.org/10.1609/aaai.v36i7.20729
TI - Frozen Pretrained Transformers as Universal Computation Engines
T2 - Proceedings of the AAAI Conference on Artificial Intelligence
AU - Lu, Kevin
AU - Grover, Aditya
AU - Abbeel, Pieter
AU - Mordatch, Igor
PY - 2022
DA - 2022/06/28
PB - Association for the Advancement of Artificial Intelligence (AAAI)
SP - 7628-7636
IS - 7
VL - 36
SN - 2159-5399
SN - 2374-3468
ER -
BibTeX
@article{2022_Lu,
author = {Kevin Lu and Aditya Grover and Pieter Abbeel and Igor Mordatch},
title = {Frozen Pretrained Transformers as Universal Computation Engines},
journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
year = {2022},
volume = {36},
publisher = {Association for the Advancement of Artificial Intelligence (AAAI)},
month = {jun},
url = {https://doi.org/10.1609/aaai.v36i7.20729},
number = {7},
pages = {7628--7636},
doi = {10.1609/aaai.v36i7.20729}
}
MLA
Lu, Kevin, et al. “Frozen Pretrained Transformers as Universal Computation Engines.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 7, June 2022, pp. 7628-7636. https://doi.org/10.1609/aaai.v36i7.20729.