Volume 17, pages 2410-2414

Rule-Based Punctuation Algorithm for the Uzbek Language

Publication type: Proceedings Article
Publication date: 2024-06-28
Abstract
Punctuation analysis occupies an important place in natural language processing: a number of tasks, such as text pre-processing, spell checking, grammar checking, and information retrieval, need a model that can predict correct punctuation. Predicting the right punctuation marks is context-dependent, which makes building language-independent, general-purpose punctuation tools non-trivial. Although such tools already exist for many languages, Uzbek remains a low-resource language, and to our knowledge, punctuation analysis and prediction algorithms for Uzbek texts have not yet been developed. This paper proposes a rule-based algorithm and a model for punctuation analysis of periods and commas in Uzbek-language texts. While the major contribution of this paper is a rule-based algorithm for determining whether periods and commas are placed correctly in Uzbek-language text, the authors also present analysis results on a corpus covering various fields, and acknowledge the need for further work on the task, including machine learning and deep learning solutions. The proposed rule-based algorithm will not only help with Uzbek texts but will, the authors hope, also serve as a pivot point for other closely related Turkic languages.

Top-30

Journals

Data in Brief: 1 publication, 11.11%

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 8 publications, 88.89%
Elsevier: 1 publication, 11.11%
  • Publications without a DOI are not counted.
  • Publication statistics are updated weekly.
