Frozen Pretrained Transformers as Universal Computation Engines
We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works that investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language can improve performance and compute efficiency on non-language downstream tasks. Additionally, we analyze the architecture, comparing the performance of a randomly initialized transformer to a randomly initialized LSTM. Combining these two insights, we find that language-pretrained transformers can obtain strong performance on a variety of non-language tasks.
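The recipe described in the abstract can be sketched in a few lines: the pretrained transformer's self-attention and feedforward sublayers are frozen, while small, newly added input and output projections (and, per the paper, layer norms and positional embeddings) remain trainable. Below is a minimal illustration assuming the Hugging Face transformers GPT-2 implementation; the FPTClassifier class, its layer sizes, and the last-token readout are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the Frozen Pretrained Transformer (FPT) setup, assuming the
# Hugging Face `transformers` GPT-2 implementation. Names and readout choices
# below are illustrative, not the authors' released code.
import torch
import torch.nn as nn
from transformers import GPT2Model


class FPTClassifier(nn.Module):
    def __init__(self, input_dim: int, num_classes: int):
        super().__init__()
        self.gpt2 = GPT2Model.from_pretrained("gpt2")

        # Freeze the self-attention and feedforward (MLP) sublayers of every
        # residual block; layer norms and positional embeddings stay trainable,
        # as do the new input/output projections added below.
        for block in self.gpt2.h:
            for module in (block.attn, block.mlp):
                for p in module.parameters():
                    p.requires_grad = False

        hidden = self.gpt2.config.n_embd
        self.input_proj = nn.Linear(input_dim, hidden)      # new, trainable
        self.output_head = nn.Linear(hidden, num_classes)   # new, trainable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_dim), e.g. flattened image patches or bit tokens
        h = self.input_proj(x)
        out = self.gpt2(inputs_embeds=h).last_hidden_state
        return self.output_head(out[:, -1])  # classify from the final position


if __name__ == "__main__":
    model = FPTClassifier(input_dim=16, num_classes=10)
    logits = model(torch.randn(2, 64, 16))
    print(logits.shape)  # torch.Size([2, 10])
```

Reading the prediction from the final sequence position is one common readout for a decoder-style backbone; the paper's exact output head and the full set of unfrozen parameters may differ from this sketch.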
Top-30 Journals
- Lecture Notes in Computer Science: 6 publications, 8.96%
- Briefings in Bioinformatics: 2 publications, 2.99%
- IEEE Transactions on Multimedia: 2 publications, 2.99%
- IEEE Transactions on Intelligent Transportation Systems: 2 publications, 2.99%
- Frontiers of Information Technology and Electronic Engineering: 2 publications, 2.99%
- Information (Switzerland): 1 publication, 1.49%
- Computers: 1 publication, 1.49%
- Nature Machine Intelligence: 1 publication, 1.49%
- Lecture Notes in Networks and Systems: 1 publication, 1.49%
- Autonomous Robots: 1 publication, 1.49%
- IEEE Transactions on Pattern Analysis and Machine Intelligence: 1 publication, 1.49%
- IEEE/ACM Transactions on Computational Biology and Bioinformatics: 1 publication, 1.49%
- Computers and Chemical Engineering: 1 publication, 1.49%
- IEEE Signal Processing Letters: 1 publication, 1.49%
- Reliability Engineering and System Safety: 1 publication, 1.49%
- IEEE Transactions on Information Forensics and Security: 1 publication, 1.49%
- Applied Sciences (Switzerland): 1 publication, 1.49%
- Neural Computing and Applications: 1 publication, 1.49%
- bioRxiv: 1 publication, 1.49%
- BioData Mining: 1 publication, 1.49%
- Technologies: 1 publication, 1.49%
- Neurocomputing: 1 publication, 1.49%
- Scientific Reports: 1 publication, 1.49%
- Energy Conversion and Management: 1 publication, 1.49%
- Geophysical Research Letters: 1 publication, 1.49%
- IEEE Transactions on Geoscience and Remote Sensing: 1 publication, 1.49%
- PLoS Computational Biology: 1 publication, 1.49%
- Information Fusion: 1 publication, 1.49%
- IEEE Transactions on Knowledge and Data Engineering: 1 publication, 1.49%
Publishers
- Institute of Electrical and Electronics Engineers (IEEE): 26 publications, 38.81%
- Springer Nature: 15 publications, 22.39%
- Association for Computing Machinery (ACM): 7 publications, 10.45%
- MDPI: 5 publications, 7.46%
- Elsevier: 5 publications, 7.46%
- Oxford University Press: 2 publications, 2.99%
- Cold Spring Harbor Laboratory: 2 publications, 2.99%
- Wiley: 1 publication, 1.49%
- Public Library of Science (PLoS): 1 publication, 1.49%
- IntechOpen: 1 publication, 1.49%
- Publications without a DOI are not taken into account.
- Statistics are recalculated weekly.