Squeeze-and-Threshold Based Quantization for Low-Precision Neural Networks

Publication type: Book Chapter
Publication date: 2021-06-23
ISSN: 2661-8141, 2661-815X
Abstract
In this work, we propose an attention-based method to quantize convolutional neural networks (CNNs) so that they run at low precision (binary and multi-bit). Intuitively, high-quality pictures make it easy to distinguish objects; however, even in low-quality black-and-white photos (analogous to low precision), various features can still be well distinguished and the content is easily understood. Based on this intuition, we introduce an attention-based block called squeeze-and-threshold (ST) that adjusts different features to different ranges and learns the best threshold to distinguish (quantize) them. Furthermore, to eliminate the extra computation the ST block would add at inference time, we propose a momentum-based method that learns the inference threshold during training. Additionally, with the help of the ST block, our quantization approach converges faster, requiring less than half the training epochs of prior multi-bit quantization works. Experimental results on different datasets and networks show the versatility of our method and demonstrate state-of-the-art performance.
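The abstract describes two ideas: a squeeze-and-threshold block that derives per-channel quantization thresholds from the features themselves, and a momentum-based running threshold that replaces the block at inference. A minimal sketch of both, for the binary case, is below. The chapter does not spell out the block's exact architecture here, so the squeeze-and-excitation-style structure (global average pool, then a two-layer MLP with ReLU and sigmoid producing one threshold per channel) and all function and parameter names are illustrative assumptions, not the authors' implementation.

```python
import math

def st_binarize(feature_map, w1, b1, w2, b2):
    # Hypothetical ST-style binarizer. feature_map is a list of C channels,
    # each a flat list of activation values; w1/b1 and w2/b2 are the weights
    # of an assumed tiny two-layer "excitation" MLP.
    C = len(feature_map)
    # Squeeze: global average pooling gives one statistic per channel.
    z = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation (assumed shape): ReLU hidden layer, then a sigmoid that
    # maps the squeezed statistics to one learned threshold per channel.
    hidden = [max(0.0, sum(w1[j][i] * z[i] for i in range(C)) + b1[j])
              for j in range(len(b1))]
    t = [1.0 / (1.0 + math.exp(-(sum(w2[c][j] * hidden[j]
                                     for j in range(len(hidden))) + b2[c])))
         for c in range(C)]
    # Threshold: binarize each activation against its channel's threshold.
    return [[1.0 if v > t[c] else -1.0 for v in ch]
            for c, ch in enumerate(feature_map)]

def update_running_threshold(running_t, batch_t, momentum=0.9):
    # Momentum update of a per-channel running threshold during training,
    # analogous to batch-norm running statistics; at inference the frozen
    # running value stands in for the ST block, avoiding its extra compute.
    return [momentum * r + (1.0 - momentum) * b
            for r, b in zip(running_t, batch_t)]
```

With all MLP weights zero the sigmoid yields a threshold of 0.5 per channel, so `st_binarize` degenerates to a fixed-threshold sign function; training the MLP lets each channel learn its own split point.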


Cite this

GOST
Wu B. et al. Squeeze-and-Threshold Based Quantization for Low-Precision Neural Networks // Proceedings of the International Neural Networks Society. 2021. pp. 232-243.
GOST (all authors)
Wu B., Waschneck B., Mayr C. Squeeze-and-Threshold Based Quantization for Low-Precision Neural Networks // Proceedings of the International Neural Networks Society. 2021. pp. 232-243.
RIS
TY - GENERIC
DO - 10.1007/978-3-030-80568-5_20
UR - https://doi.org/10.1007/978-3-030-80568-5_20
TI - Squeeze-and-Threshold Based Quantization for Low-Precision Neural Networks
T2 - Proceedings of the International Neural Networks Society
AU - Wu, Binyi
AU - Waschneck, Bernd
AU - Mayr, Christian
PY - 2021
DA - 2021/06/23
PB - Springer Nature
SP - 232-243
SN - 2661-8141
SN - 2661-815X
ER -
BibTeX
@incollection{2021_Wu,
author = {Binyi Wu and Bernd Waschneck and Christian Mayr},
title = {Squeeze-and-Threshold Based Quantization for Low-Precision Neural Networks},
booktitle = {Proceedings of the International Neural Networks Society},
publisher = {Springer Nature},
year = {2021},
pages = {232--243},
month = {jun},
doi = {10.1007/978-3-030-80568-5_20}
}