Memory transformer with hierarchical attention for long document processing

Publication type: Proceedings Article
Publication date: 2021-11-24
Abstract
Transformers have attracted considerable interest from researchers. To date, they have achieved state-of-the-art results in a wide range of natural language processing tasks, including sequence modeling tasks such as language understanding, text summarization, and translation, with more transformer variants still to come. Nevertheless, transformers have limitations in tasks that require long document processing. This paper introduces a new transformer variant: a sentence-level transformer with global memory pooling and hierarchical attention for handling long texts. We replace the self-attention of the vanilla transformer with multi-head attention between memory and a sequence, and add a decoder sequence selector on top of the encoder output. In this architecture, sentences are encoded in parallel and then summarized with soft attention at every decoding step. The proposed model was validated on a machine translation task. We hypothesize that attaching memory slots to each sequence improves translation quality, and that tuning the model on a context-aware dataset using pre-trained sequence-level weights yields more precise translations and facilitates translating long documents. Results show that extending each sentence with a memory slot and applying attention over the encoder outputs improves translation results.
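The architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes PyTorch, learnable memory slots shared across all sentences, and illustrative names (MemoryAugmentedSentenceEncoder, pool_sentence_memories). It realizes "attention between memory and a sequence" simply as joint attention over their concatenation, and the hierarchical step as dot-product soft attention over mean-pooled per-sentence memories; the real model's layer structure, slot count, and pooling details may differ.

# Hypothetical sketch, not the authors' implementation.
import torch
import torch.nn as nn

class MemoryAugmentedSentenceEncoder(nn.Module):
    """Encode each sentence independently with extra memory slots prepended."""

    def __init__(self, d_model=512, n_heads=8, n_mem_slots=10):
        super().__init__()
        # Learnable memory slots shared across sentences (an assumption).
        self.memory = nn.Parameter(torch.randn(n_mem_slots, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, sent_tokens):
        # sent_tokens: (n_sentences, sent_len, d_model); sentences are encoded in parallel.
        n_sent = sent_tokens.size(0)
        mem = self.memory.unsqueeze(0).expand(n_sent, -1, -1)
        x = torch.cat([mem, sent_tokens], dim=1)      # prepend memory slots to each sentence
        attn_out, _ = self.attn(x, x, x)              # memory and tokens attend to each other
        x = self.norm1(x + attn_out)
        x = self.norm2(x + self.ff(x))
        n_mem = mem.size(1)
        return x[:, :n_mem], x[:, n_mem:]             # updated memory slots, updated tokens

def pool_sentence_memories(query, mem_out):
    # One possible hierarchical pooling step at decoding time:
    # query: (d_model,) decoder state; mem_out: (n_sentences, n_mem_slots, d_model).
    summaries = mem_out.mean(dim=1)                   # one summary vector per sentence
    scores = torch.softmax(summaries @ query, dim=0)  # soft attention over sentences
    return (scores.unsqueeze(-1) * summaries).sum(dim=0)  # document-level context vector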

Journals

Information (Switzerland): 1 publication, 16.67%
Knowledge-Based Systems: 1 publication, 16.67%
Studies in Computational Intelligence: 1 publication, 16.67%
Journal of Supercomputing: 1 publication, 16.67%

Publishers

Springer Nature: 2 publications, 33.33%
MDPI: 1 publication, 16.67%
Elsevier: 1 publication, 16.67%
Institute of Electrical and Electronics Engineers (IEEE): 1 publication, 16.67%
  • Publications without a DOI are not counted.
  • Publication statistics are updated weekly.

Metrics: 6

Cite
GOST
Al Adel A. et al. Memory transformer with hierarchical attention for long document processing // 2021 International Conference Engineering and Telecommunication (En&T). 2021.
GOST with all authors (up to 50)
Al Adel A., Burtsev M. Memory transformer with hierarchical attention for long document processing // 2021 International Conference Engineering and Telecommunication (En&T). 2021.
RIS
TY - CPAPER
DO - 10.1109/EnT50460.2021.9681776
UR - https://doi.org/10.1109/EnT50460.2021.9681776
TI - Memory transformer with hierarchical attention for long document processing
T2 - 2021 International Conference Engineering and Telecommunication (En&T)
AU - Al Adel, Arij
AU - Burtsev, Mikhail
PY - 2021
DA - 2021/11/24
PB - Institute of Electrical and Electronics Engineers (IEEE)
ER -
BibTeX (up to 50 authors)
@inproceedings{2021_AlAdel,
author = {Arij Al Adel and Mikhail Burtsev},
title = {Memory transformer with hierarchical attention for long document processing},
booktitle = {2021 International Conference Engineering and Telecommunication (En\&T)},
year = {2021},
month = {nov},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
doi = {10.1109/EnT50460.2021.9681776}
}