Open Access
Pages: 271-284
Adaptive Greedy Layer Pruning: Iterative Layer Pruning with Subsequent Model Repurposing
Publication type: Book Chapter
Publication date: 2024-09-19
scimago Q2
SJR: 0.352
CiteScore: 2.4
Impact factor: —
ISSN: 0302-9743, 1611-3349, 1861-2075, 1861-2083
Abstract
Reducing the memory requirements of pre-trained language models (PLMs) during inference constitutes a key challenge. In this paper, we rigorously investigate the possibility of progressively removing layers from PLMs during fine-tuning, in such a way that their final task performance degrades minimally. Our proposed approach not only considerably reduces the inference cost of using PLMs, but also highlights the importance of distinct layers by identifying those with marginal contributions to downstream task performance. Our experiments, encompassing seven diverse tasks, corroborate that excluding less pertinent transformer layers enables more efficient inference without seriously degrading task performance. Indeed, we were able to omit up to 2.2x more layers from the investigated PLMs (depending on the backbone model) than a strong layer pruning baseline while preserving at least 95% of the performance of the full backbone model.
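The greedy pruning idea described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: `evaluate` stands in for fine-tuning and scoring the pruned model on a validation set, and `min_score` is the performance floor (e.g. 95% of the full model's score). At each step the layer whose removal hurts the metric the least is dropped, and the loop stops once no removal keeps performance above the floor.

```python
# Hypothetical sketch of greedy layer pruning; `evaluate` and `min_score`
# are placeholders for the real validation metric and performance floor.

def greedy_layer_prune(layers, evaluate, min_score):
    """Iteratively remove the layer whose exclusion degrades the
    validation score the least, stopping at the performance floor.

    layers:    list of layer identifiers
    evaluate:  callable scoring a subset of layers on validation data
    min_score: minimum acceptable score for a pruned model
    """
    kept = list(layers)
    while len(kept) > 1:
        # Score each candidate configuration with one layer removed.
        candidates = [
            (evaluate([l for l in kept if l != drop]), drop)
            for drop in kept
        ]
        best_score, best_drop = max(candidates)
        if best_score < min_score:
            break  # any further pruning would violate the floor
        kept.remove(best_drop)
    return kept
```

With a toy metric where each layer contributes a fixed importance weight, the procedure first discards the layers with the smallest contributions and halts once the floor would be breached.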
Metrics
Total citations: 0
Cite this
GOST
Ficsor T., Berend G. Adaptive Greedy Layer Pruning: Iterative Layer Pruning with Subsequent Model Repurposing // Lecture Notes in Computer Science. 2024. pp. 271-284.
RIS
TY - GENERIC
DO - 10.1007/978-3-031-70239-6_19
UR - https://link.springer.com/10.1007/978-3-031-70239-6_19
TI - Adaptive Greedy Layer Pruning: Iterative Layer Pruning with Subsequent Model Repurposing
T2 - Lecture Notes in Computer Science
AU - Ficsor, Tamás
AU - Berend, Gábor
PY - 2024
DA - 2024/09/19
PB - Springer Nature
SP - 271-284
SN - 0302-9743
SN - 1611-3349
SN - 1861-2075
SN - 1861-2083
ER -
BibTex
@incollection{2024_Ficsor,
  author = {Tamás Ficsor and Gábor Berend},
  title = {Adaptive Greedy Layer Pruning: Iterative Layer Pruning with Subsequent Model Repurposing},
  booktitle = {Lecture Notes in Computer Science},
  publisher = {Springer Nature},
  year = {2024},
  month = {sep},
  pages = {271--284},
  doi = {10.1007/978-3-031-70239-6_19}
}