Computational Linguistics and Intellectual Technologies
Current Landscape of the Russian Sentiment Corpora
Publication type: Proceedings Article
Publication date: 2021-06-19
ISSN: 2075-7182
Abstract
More than a dozen Russian-language corpora for sentiment analysis currently exist, differing in text source, domain, size, number and ratio of sentiment classes, and annotation method. This work examines the publicly available Russian-language corpora and presents their qualitative and quantitative characteristics, giving an overview of the current landscape of corpora for sentiment analysis. A ranking of the corpora by annotation quality is proposed, which can be useful when choosing corpora for training and testing. The influence of the training dataset on sentiment analysis performance is investigated using the deep neural network model BERT. Experiments with review corpora show that, on average, model quality increases as the number of training corpora grows. For the first time, BERT-based quality scores are obtained for the review corpus of the ROMIP seminars. The study also proposes the task of building a universal model for sentiment analysis.
Citations by journals
IEEE Access: 1 publication, 100%
Citations by publishers
IEEE: 1 publication, 100%
- Publications without a DOI are not taken into account.
- Statistics are recalculated only for publications connected to researchers, organizations, and labs registered on the platform.
- Statistics are recalculated weekly.
Cite this
GOST
Kotelnikov E. V. Current Landscape of the Russian Sentiment Corpora // Computational Linguistics and Intellectual Technologies. 2021.
RIS
TY - CPAPER
DO - 10.28995/2075-7182-2021-20-433-444
UR - https://doi.org/10.28995/2075-7182-2021-20-433-444
TI - Current Landscape of the Russian Sentiment Corpora
T2 - Computational Linguistics and Intellectual Technologies
AU - Kotelnikov, E V
PY - 2021
DA - 2021/06/19 00:00:00
PB - Russian State University for the Humanities
SN - 2075-7182
ER -
BibTeX
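@inproceedings{2021_Kotelnikov,
author = {E V Kotelnikov},
title = {Current Landscape of the Russian Sentiment Corpora},
booktitle = {Computational Linguistics and Intellectual Technologies},
year = {2021},
month = {jun},
publisher = {Russian State University for the Humanities},
doi = {10.28995/2075-7182-2021-20-433-444}
}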