IEEE Transactions on Very Large Scale Integration (VLSI) Systems, volume 31, issue 8, pages 1223-1233
X-Former: In-Memory Acceleration of Transformers
Publication type: Journal Article
Publication date: 2023-08-01
Scimago: Q2
WoS: Q2
SJR: 0.744
CiteScore: 6.1
Impact factor: 3.1
ISSN: 1063-8210, 1557-9999
Electrical and Electronic Engineering
Hardware and Architecture
Software
Abstract
Transformers have achieved great success in a wide variety of natural language processing (NLP) tasks due to the self-attention mechanism, which assigns an importance score to every word relative to the other words in a sequence. However, these models are very large, often reaching hundreds of billions of parameters, and therefore require a large number of dynamic random access memory (DRAM) accesses. Hence, traditional deep neural network (DNN) accelerators such as graphics processing units (GPUs) and tensor processing units (TPUs) face limitations in processing Transformers efficiently. In-memory accelerators based on nonvolatile memory (NVM) promise to be an effective solution to this challenge, since they provide high storage density while performing massively parallel matrix–vector multiplications (MVMs) within memory arrays. However, attention score computations, which are frequently used in Transformers (unlike in convolutional neural networks (CNNs) and recurrent neural networks (RNNs)), require MVMs where both operands change dynamically for each input. As a result, conventional NVM-based accelerators incur high write latency and write energy when used for Transformers, and further suffer from the low endurance of most NVM technologies. To address these challenges, we present X-Former, a hybrid in-memory hardware accelerator that consists of both NVM and CMOS processing elements to execute Transformer workloads efficiently. To improve the hardware utilization of X-Former, we also propose a sequence blocking dataflow, which overlaps the computations of the two processing elements and reduces execution time. Across several benchmarks, we show that X-Former achieves up to 69.8× and 13× improvements in latency and energy over an NVIDIA GeForce GTX 1060 GPU and up to 24.1× and 7.95× improvements in latency and energy over a state-of-the-art in-memory NVM accelerator.
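The distinction the abstract draws can be made concrete with a small sketch. The following NumPy snippet is an editorial illustration, not the paper's implementation: all dimensions, variable names, and the formulation are assumptions chosen only to show why projection weights suit weight-stationary NVM crossbars while attention-score MVMs, whose operands both depend on the input, would force repeated NVM writes.

import numpy as np

# Illustrative sketch (not the paper's hardware or dataflow).
# Dimensions are arbitrary examples.
seq_len, d_model = 128, 64
np.random.seed(0)

X = np.random.randn(seq_len, d_model)      # input token embeddings

# Projection weights are static: they can be programmed once into NVM
# crossbars and reused for every input (weight-stationary MVMs).
W_q = np.random.randn(d_model, d_model)
W_k = np.random.randn(d_model, d_model)
W_v = np.random.randn(d_model, d_model)

Q, K, V = X @ W_q, X @ W_k, X @ W_v        # static-weight MVMs

# Attention scores multiply two activation matrices: both Q and K change
# with every input sequence, so mapping them onto NVM would require
# rewriting the arrays per input (high write cost, limited endurance).
scores = Q @ K.T / np.sqrt(d_model)        # dynamic-operand MVMs
probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True) # row-wise softmax
out = probs @ V                            # another dynamic-operand MVM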
Top-30
Journals
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems (3 publications, 6.67%)
- IEEE Transactions on Circuits and Systems I: Regular Papers (3 publications, 6.67%)
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (3 publications, 6.67%)
- ACM Computing Surveys (2 publications, 4.44%)
- Neurocomputing (2 publications, 4.44%)
- IEEE Journal on Emerging and Selected Topics in Circuits and Systems (2 publications, 4.44%)
- Nature Computational Science (2 publications, 4.44%)
- CAAI Transactions on Intelligence Technology (1 publication, 2.22%)
- ACM Transactions on Design Automation of Electronic Systems (1 publication, 2.22%)
- Applied Sciences (Switzerland) (1 publication, 2.22%)
- IEEE Circuits and Systems Magazine (1 publication, 2.22%)
- Frontiers in Psychology (1 publication, 2.22%)
- Journal of Systems Architecture (1 publication, 2.22%)
- Transactions on Embedded Computing Systems (1 publication, 2.22%)
- Science China Information Sciences (1 publication, 2.22%)
- Integration, the VLSI Journal (1 publication, 2.22%)
- IEEE Transactions on Emerging Topics in Computing (1 publication, 2.22%)
- Future Generation Computer Systems (1 publication, 2.22%)
- Communications Physics (1 publication, 2.22%)
- IEEE Access (1 publication, 2.22%)
- IEEE Transactions on Circuits and Systems for Artificial Intelligence (1 publication, 2.22%)
Publishers
- Institute of Electrical and Electronics Engineers (IEEE) (25 publications, 55.56%)
- Association for Computing Machinery (ACM) (8 publications, 17.78%)
- Elsevier (5 publications, 11.11%)
- Springer Nature (4 publications, 8.89%)
- Institution of Engineering and Technology (IET) (1 publication, 2.22%)
- MDPI (1 publication, 2.22%)
- Frontiers Media S.A. (1 publication, 2.22%)
- We do not take into account publications without a DOI.
- Statistics recalculated weekly.
Metrics
Total citations: 45
Citations from 2024: 43 (95.56%)
Most citing journal: 3 citations
Cite this
GOST
Sridharan S. et al. X-Former: In-Memory Acceleration of Transformers // IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2023. Vol. 31. No. 8. pp. 1223-1233.
GOST (all authors, up to 50)
Sridharan S., Stevens J. R., Roy K., Raghunathan A. X-Former: In-Memory Acceleration of Transformers // IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2023. Vol. 31. No. 8. pp. 1223-1233.
RIS
TY - JOUR
DO - 10.1109/tvlsi.2023.3282046
UR - https://ieeexplore.ieee.org/document/10155455/
TI - X-Former: In-Memory Acceleration of Transformers
T2 - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
AU - Sridharan, Shrihari
AU - Stevens, Jacob R.
AU - Roy, Kaushik
AU - Raghunathan, Anand
PY - 2023
DA - 2023/08/01
PB - Institute of Electrical and Electronics Engineers (IEEE)
SP - 1223-1233
IS - 8
VL - 31
SN - 1063-8210
SN - 1557-9999
ER -
BibTeX (up to 50 authors)
@article{2023_Sridharan,
author = {Shrihari Sridharan and Jacob R. Stevens and Kaushik Roy and Anand Raghunathan},
title = {X-Former: In-Memory Acceleration of Transformers},
journal = {IEEE Transactions on Very Large Scale Integration (VLSI) Systems},
year = {2023},
volume = {31},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
month = {aug},
url = {https://ieeexplore.ieee.org/document/10155455/},
number = {8},
pages = {1223--1233},
doi = {10.1109/tvlsi.2023.3282046}
}
MLA
Sridharan, Shrihari, et al. “X-Former: In-Memory Acceleration of Transformers.” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 8, Aug. 2023, pp. 1223-1233. https://ieeexplore.ieee.org/document/10155455/.