Volume 45, Issue 1, pages 1328-1334

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition

Publication type: Journal Article
Publication date: 2023-01-01
SCImago: Q1
WoS: Q1
White list level: БС1
SJR: 3.91
CiteScore: 35
Impact factor: 18.6
ISSN: 0162-8828, 2160-9292, 1939-3539
Computational Theory and Mathematics
Artificial Intelligence
Applied Mathematics
Software
Computer Vision and Pattern Recognition
Abstract
In this paper, we present Vision Permutator, a conceptually simple and data efficient MLP-like architecture for visual recognition. By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections. This allows Vision Permutator to capture long-range dependencies and meanwhile avoid the attention building process in transformers. The outputs are then aggregated in a mutually complementing manner to form expressive representations. We show that our Vision Permutators are formidable competitors to convolutional neural networks (CNNs) and vision transformers. Without the dependence on spatial convolutions or attention mechanisms, Vision Permutator achieves 81.5% top-1 accuracy on ImageNet without extra large-scale training data (e.g., ImageNet-22k) using only 25M learnable parameters, which is much better than most CNNs and vision transformers under the same model size constraint. When scaling up to 88M, it attains 83.2% top-1 accuracy, greatly improving the performance of recent state-of-the-art MLP-like networks for visual recognition. We hope this work could encourage research on rethinking the way of encoding spatial information and facilitate the development of MLP-like models. PyTorch/MindSpore/Jittor code is available at https://github.com/Andrew-Qibin/VisionPermutator.
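The abstract describes encoding features separately along the height and width dimensions with linear projections and then aggregating the outputs. The following is a minimal, dependency-free sketch of that idea only, not the authors' implementation (their code lives in the repository linked above). The function names (`linear`, `mix_height`, `mix_width`, `encode_height_width`), the use of square weight matrices, and aggregation by plain summation are all assumptions made for illustration.

```python
def linear(vec, weight):
    """Return weight @ vec for a list-of-rows matrix and a list vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def mix_height(x, w_h):
    """Project along the height axis, separately for every (column, channel).

    x is a nested list of shape [H][W][C]; w_h is an H x H matrix (assumed square).
    """
    H, W, C = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * C for _ in range(W)] for _ in range(H)]
    for j in range(W):
        for c in range(C):
            mixed = linear([x[i][j][c] for i in range(H)], w_h)
            for i in range(H):
                out[i][j][c] = mixed[i]
    return out


def mix_width(x, w_w):
    """Project along the width axis, separately for every (row, channel).

    w_w is a W x W matrix (assumed square).
    """
    H, W, C = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * C for _ in range(W)] for _ in range(H)]
    for i in range(H):
        for c in range(C):
            mixed = linear([x[i][j][c] for j in range(W)], w_w)
            for j in range(W):
                out[i][j][c] = mixed[j]
    return out


def encode_height_width(x, w_h, w_w):
    """Aggregate the two branches by summation (an assumption of this sketch)."""
    h_branch = mix_height(x, w_h)
    w_branch = mix_width(x, w_w)
    return [
        [[a + b for a, b in zip(h_tok, w_tok)] for h_tok, w_tok in zip(h_row, w_row)]
        for h_row, w_row in zip(h_branch, w_branch)
    ]
```

In this sketch every output position of the height branch is a weighted combination of all rows in its column (and likewise for the width branch across a row), which is one way to see how linear projections along a single axis can relate distant positions without building attention maps.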

Top 30

Journals

Lecture Notes in Computer Science: 7 publications, 4.58%
IEEE Transactions on Geoscience and Remote Sensing: 6 publications, 3.92%
IEEE Transactions on Pattern Analysis and Machine Intelligence: 5 publications, 3.27%
IEEE Access: 4 publications, 2.61%
Biomedical Signal Processing and Control: 4 publications, 2.61%
IEEE International Conference on Computer Vision (ICCV): 4 publications, 2.61%
IEEE Transactions on Circuits and Systems for Video Technology: 4 publications, 2.61%
Engineering Applications of Artificial Intelligence: 4 publications, 2.61%
Remote Sensing: 3 publications, 1.96%
Applied Soft Computing Journal: 3 publications, 1.96%
Computer Methods and Programs in Biomedicine: 2 publications, 1.31%
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): 2 publications, 1.31%
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing: 2 publications, 1.31%
Multimedia Tools and Applications: 2 publications, 1.31%
IEEE Journal of Biomedical and Health Informatics: 2 publications, 1.31%
Sensors: 2 publications, 1.31%
Pattern Recognition: 2 publications, 1.31%
Information Sciences: 2 publications, 1.31%
Neurocomputing: 2 publications, 1.31%
Artificial Intelligence Review: 2 publications, 1.31%
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery: 1 publication, 0.65%
Frontiers in Marine Science: 1 publication, 0.65%
Visual Computer: 1 publication, 0.65%
IEEE Transactions on Neural Systems and Rehabilitation Engineering: 1 publication, 0.65%
IEICE Transactions on Information and Systems: 1 publication, 0.65%
IEEE Wireless Communications Letters: 1 publication, 0.65%
CAAI Transactions on Intelligence Technology: 1 publication, 0.65%
Applied Sciences (Switzerland): 1 publication, 0.65%
Biomimetics: 1 publication, 0.65%

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 64 publications, 41.83%
Elsevier: 36 publications, 23.53%
Springer Nature: 28 publications, 18.3%
MDPI: 10 publications, 6.54%
Wiley: 5 publications, 3.27%
Frontiers Media S.A.: 2 publications, 1.31%
IOP Publishing: 2 publications, 1.31%
Institute of Electronics, Information and Communications Engineers (IEICE): 1 publication, 0.65%
Institution of Engineering and Technology (IET): 1 publication, 0.65%
Taylor & Francis: 1 publication, 0.65%
World Scientific: 1 publication, 0.65%
SPIE-Intl Soc Optical Eng: 1 publication, 0.65%
Association for Computing Machinery (ACM): 1 publication, 0.65%
  • Publications without a DOI are not counted.
  • Publication statistics are updated weekly.

Metrics: 153
Cite
GOST
Hou Q. et al. Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023. Vol. 45. No. 1. pp. 1328-1334.
GOST with all authors (up to 50)
Hou Q., Jiang Z., Yuan L., Cheng M., Yan S., Feng J. Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023. Vol. 45. No. 1. pp. 1328-1334.
RIS
TY  - JOUR
DO  - 10.1109/tpami.2022.3145427
UR  - https://doi.org/10.1109/tpami.2022.3145427
TI  - Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
T2  - IEEE Transactions on Pattern Analysis and Machine Intelligence
AU  - Hou, Qibin
AU  - Jiang, Zihang
AU  - Yuan, Li
AU  - Cheng, Ming-Ming
AU  - Yan, Shuicheng
AU  - Feng, Jiashi
PY  - 2023
DA  - 2023/01/01
PB  - Institute of Electrical and Electronics Engineers (IEEE)
SP  - 1328
EP  - 1334
IS  - 1
VL  - 45
PMID - 35077359
SN  - 0162-8828
SN  - 2160-9292
SN  - 1939-3539
ER  -
BibTeX (up to 50 authors)
@article{2023_Hou,
author = {Qibin Hou and Zihang Jiang and Li Yuan and Ming-Ming Cheng and Shuicheng Yan and Jiashi Feng},
title = {Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
month = {jan},
url = {https://doi.org/10.1109/tpami.2022.3145427},
number = {1},
pages = {1328--1334},
doi = {10.1109/tpami.2022.3145427}
}
MLA
Hou, Qibin, et al. “Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, Jan. 2023, pp. 1328-1334. https://doi.org/10.1109/tpami.2022.3145427.