ACM Computing Surveys, volume 57, issue 8, pages 1-35

A Review on Edge Large Language Models: Design, Execution, and Applications

Publication type: Journal Article
Publication date: 2025-03-23
Scimago: Q1
SJR: 6.280
CiteScore: 33.2
Impact factor: 23.8
ISSN: 0360-0300, 1557-7341
Abstract

Large language models (LLMs) have revolutionized natural language processing with their exceptional understanding, synthesizing, and reasoning capabilities. However, deploying LLMs on resource-constrained edge devices presents significant challenges due to computational limitations, memory constraints, and edge hardware heterogeneity. This survey provides a comprehensive overview of recent advancements in edge LLMs, covering the entire lifecycle — from resource-efficient model design and pre-deployment strategies to runtime inference optimizations. It also explores on-device applications across various domains. By synthesizing state-of-the-art techniques and identifying future research directions, this survey bridges the gap between the immense potential of LLMs and the constraints of edge computing.
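
The pre-deployment compression strategies the survey covers can be made concrete with a small example. Below is a minimal, illustrative sketch (not taken from the paper) of symmetric per-channel INT8 weight quantization, a common way to shrink LLM weights roughly 4x before shipping them to an edge device; the function names and matrix sizes are assumptions for illustration only.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Symmetric per-output-channel INT8 quantization of a weight matrix.

    Returns the quantized weights and the per-channel scales needed to
    dequantize (w ~ q * scale) at inference time.
    """
    # One scale per output channel, chosen so the largest magnitude maps to 127.
    max_abs = np.abs(weights).max(axis=1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / 127.0, 1.0)
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    w = np.random.randn(4096, 4096).astype(np.float32)  # one LLM projection matrix
    q, s = quantize_int8(w)
    err = np.abs(w - dequantize(q, s)).mean()
    print(f"size ratio vs FP32: {q.nbytes / w.nbytes:.2f}, mean abs error: {err:.5f}")
```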

Wang R., Jing Y., Gu C., He S., Chen J.
IEEE Internet of Things Journal scimago Q1 wos Q1
2025-02-15 citations by CoLab: 2
Sridharan S., Stevens J.R., Roy K., Raghunathan A.
2023-08-01 citations by CoLab: 14
Jiang S., Lin Z., Li Y., Shu Y., Liu Y.
2021-10-25 citations by CoLab: 66 Abstract  
Object detection is a fundamental building block of video analytics applications. While Neural Network (NN)-based object detection models have shown excellent accuracy on benchmark datasets, they are not well suited to high-resolution image inference on resource-constrained edge devices. Common approaches, including down-sampling inputs and scaling up neural networks, fall short of adapting to video content changes and varying latency requirements. This paper presents Remix, a flexible framework for high-resolution object detection on edge devices. Remix takes a latency budget as input and produces an image partition and model execution plan that runs off-the-shelf neural networks on non-uniformly partitioned image blocks. As a result, it maximizes overall detection accuracy by allocating different amounts of compute to different areas of an image. We evaluate Remix on a public dataset as well as real-world videos we collected. Experimental results show that Remix can either improve detection accuracy by 18%-120% for a given latency budget, or achieve up to an 8.1× inference speedup with accuracy on par with state-of-the-art NNs.
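
Remix's actual planner is more sophisticated, but the core idea of spending a latency budget non-uniformly across image blocks can be sketched as a greedy assignment. The code below is an illustrative approximation, not the paper's algorithm; the `Detector` profiles and per-block density estimates are hypothetical inputs that a real system would obtain by offline profiling and lightweight content analysis.

```python
from dataclasses import dataclass

@dataclass
class Detector:
    name: str
    latency_ms: float  # per-block inference cost, profiled offline
    accuracy: float    # relative accuracy on the target hardware

def plan_execution(blocks, detectors, budget_ms):
    """Greedy sketch of a Remix-style plan: denser image blocks get the
    slower, more accurate detectors first, while reserving enough budget
    so every remaining block still gets at least the cheapest detector."""
    by_accuracy = sorted(detectors, key=lambda d: d.accuracy, reverse=True)
    cheapest = min(detectors, key=lambda d: d.latency_ms)
    ordered = sorted(blocks, key=lambda b: b[1], reverse=True)
    plan, spent = {}, 0.0
    for i, (block_id, _) in enumerate(ordered):
        # Budget reserved to run the cheapest model on all remaining blocks.
        reserve = cheapest.latency_ms * (len(ordered) - i - 1)
        for det in by_accuracy:
            if spent + det.latency_ms + reserve <= budget_ms:
                plan[block_id] = det.name
                spent += det.latency_ms
                break
        else:
            plan[block_id] = cheapest.name  # over budget: degrade gracefully
            spent += cheapest.latency_ms
    return plan, spent

if __name__ == "__main__":
    detectors = [Detector("det-large", 40.0, 0.9), Detector("det-small", 10.0, 0.7)]
    blocks = [("top-left", 0.80), ("bottom-left", 0.50),
              ("top-right", 0.10), ("bottom-right", 0.05)]
    plan, spent = plan_execution(blocks, detectors, budget_ms=100.0)
    print(plan, f"-> {spent:.0f} ms")
```

Under this toy profile, the two densest blocks receive the large detector and the sparse ones fall back to the small detector, exactly exhausting the 100 ms budget.
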
Lalapura V.S., Amudha J., Satheesh H.S.
ACM Computing Surveys scimago Q1 wos Q1
2021-05-24 citations by CoLab: 54 Abstract  
Recurrent Neural Networks are ubiquitous and pervasive in many artificial intelligence applications such as speech recognition, predictive healthcare, and creative art. Although they provide accurate, superior solutions, they pose a massive training challenge. The current expansion of the IoT demands that intelligent models be deployed at the edge, precisely to handle increasing model sizes and complex network architectures. Design efforts aimed at greater performance have had adverse effects on portability to edge devices with real-time constraints on memory, latency, and energy. This article provides detailed insight into the compression techniques widely disseminated in the deep learning regime, which have become key to mapping powerful RNNs onto resource-constrained devices. While RNN compression is the survey's main focus, it also highlights challenges encountered during training, since the training procedure directly influences both model performance and compressibility. Recent advances that overcome these training challenges are discussed, along with their strengths and drawbacks. In short, the survey covers a three-step process: architecture selection, an efficient training procedure, and a compression technique suitable for a resource-constrained environment. It thus serves as a comprehensive guide a developer can adapt for a time-series problem and an RNN solution at the edge.
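
As a concrete instance of the compression techniques such surveys cover, here is a minimal magnitude-pruning sketch applied to an LSTM weight matrix. It illustrates the general technique only, not code from the article; the layer sizes are arbitrary.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until roughly `sparsity`
    fraction of the matrix is zero.

    Pruned matrices can then be stored in sparse formats (e.g. CSR) so
    that memory use and multiply-accumulate counts shrink with sparsity.
    """
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

if __name__ == "__main__":
    # Stacked hidden-to-hidden weights for the four LSTM gates (sizes illustrative).
    w_hh = np.random.randn(4 * 256, 256).astype(np.float32)
    p = magnitude_prune(w_hh, sparsity=0.9)
    print(f"nonzero fraction after pruning: {(p != 0).mean():.3f}")
```
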
Wang X., Tang Z., Guo J., Meng T., Wang C., Wang T., Jia W.
ACM Computing Surveys scimago Q1 wos Q1
2025-04-04 citations by CoLab: 0 Abstract  
The rapid advancement of artificial intelligence (AI) technologies has led to an increasing deployment of AI models on edge and terminal devices, driven by the proliferation of the Internet of Things (IoT) and the need for real-time data processing. This survey comprehensively explores the current state, technical challenges, and future trends of on-device AI models. We define on-device AI models as those designed to perform local data processing and inference, emphasizing their characteristics such as real-time performance, resource constraints, and enhanced data privacy. The survey is structured around key themes, including the fundamental concepts of AI models, application scenarios across various domains, and the technical challenges faced in edge environments. We also discuss optimization and implementation strategies, such as data preprocessing, model compression, and hardware acceleration, which are essential for effective deployment. Furthermore, we examine the impact of emerging technologies, including edge computing and foundation models, on the evolution of on-device AI models. By providing a structured overview of the challenges, solutions, and future directions, this survey aims to facilitate further research and application of on-device AI, ultimately contributing to the advancement of intelligent systems in everyday life.
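
One of the compression steps this survey describes, post-training quantization, is a one-line operation in common frameworks. The sketch below uses PyTorch's dynamic quantization on a toy stand-in model; the model itself is illustrative and not drawn from the survey.

```python
import torch
import torch.nn as nn

# A toy stand-in for a small on-device model; sizes are illustrative.
model = nn.Sequential(
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
).eval()

# Post-training dynamic quantization: Linear weights are stored as INT8
# and activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 10])
```
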
Giorgetti G., Pau D.P.
2025-03-06 citations by CoLab: 0 Abstract  
Generative AI (GenAI) models are designed to produce realistic and natural data, such as images, audio, or written text. Due to their high computational and memory demands, these models traditionally run on powerful remote compute servers. However, there is growing interest in deploying GenAI models at the edge, on resource-constrained embedded devices. Since 2018, the TinyML community has shown that running fixed-topology AI models on edge devices offers several benefits, including independence from internet connectivity, low-latency processing, and enhanced privacy. Nevertheless, deploying resource-hungry GenAI models on embedded devices is challenging, since such devices have limited computational, memory, and energy resources. This review evaluates the progress made to date in Edge GenAI, an emerging research area within the broader domain of EdgeAI that focuses on bringing GenAI to edge devices. Papers released between 2022 and 2024 that address the design and deployment of GenAI models on embedded devices are identified and described, and their approaches and results are compared. This manuscript contributes to understanding the ongoing transition from TinyML to Edge GenAI and provides the AI research community with valuable insights into this emerging, impactful, and under-explored field.
