Open Access
Volume 13, issue 13, page 7579

Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths

Publication type: Journal Article
Publication date: 2023-06-27
SCImago: Q2
WOS: Q2
БС2
SJR: 0.555
CiteScore: 5.5
Impact factor: 2.5
ISSN: 2076-3417
Computer Science Applications
Process Chemistry and Technology
General Materials Science
Instrumentation
General Engineering
Fluid Flow and Transfer Processes
Abstract

Speech is central to interpersonal communication, but not everyone communicates fluently. Speech disfluency, including stuttering and interruptions, affects both emotional expression and clarity of expression for people who stutter. Existing methods for detecting speech disfluency rely heavily on annotated data, which is costly to obtain. These methods also do not account for variable-length disfluent speech, which limits their scalability. To address these limitations, this paper proposes an automated method for detecting speech disfluency that can help individuals improve their communication skills and assist therapists in tracking the progress of patients who stutter. The method detects four types of disfluency features using single-task detection. It uses embeddings from the pre-trained wav2vec2.0 model, together with convolutional neural network (CNN) and Transformer models for feature extraction. To handle variable-length disfluent speech, the model is modified based on the entropy invariance of attention mechanisms, which improves its scalability. The model's scalability across languages and lengths enhances its practical applicability. Experiments show that the model outperforms baseline models on both English and Chinese datasets, supporting its universality and scalability in real-world applications.
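The abstract does not state how the attention mechanism is modified, so the following is only a minimal sketch of one commonly cited entropy-invariance formulation, in which the attention logits are additionally scaled by log(n) / log(train_len), where n is the number of keys. The function names, the `train_len` parameter, and the use of NumPy are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v, train_len=None):
    """Scaled dot-product attention for a single head.

    q: (m, d) queries, k: (n, d) keys, v: (n, d_v) values.
    If train_len (hypothetical parameter, must be > 1) is given, the usual
    1/sqrt(d) scale is multiplied by log(n) / log(train_len), so longer
    inputs get sharper logits than a model tuned at train_len would use.
    """
    n, d = k.shape
    scale = 1.0 / np.sqrt(d)
    if train_len is not None:
        scale *= np.log(n) / np.log(train_len)
    weights = softmax((q @ k.T) * scale, axis=-1)
    return weights @ v, weights


def mean_entropy(weights):
    # Average Shannon entropy (in nats) of the attention rows.
    return float(-(weights * np.log(weights + 1e-12)).sum(axis=-1).mean())
```

With plain 1/sqrt(d) scaling, attention rows tend to become more diffuse as the input grows; the extra log-length factor counteracts this, which is the intuition behind applying it to variable-length speech segments.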


Top 30

Journals

Lecture Notes in Computer Science: 2 publications, 22.22%
Expert Systems with Applications: 1 publication, 11.11%
Clinical Linguistics and Phonetics: 1 publication, 11.11%
Lecture Notes in Networks and Systems: 1 publication, 11.11%
Computers in Biology and Medicine: 1 publication, 11.11%
Intelligent Systems with Applications: 1 publication, 11.11%

Publishers

Elsevier: 3 publications, 33.33%
Springer Nature: 3 publications, 33.33%
Taylor & Francis: 1 publication, 11.11%
Cold Spring Harbor Laboratory: 1 publication, 11.11%
Institute of Electrical and Electronics Engineers (IEEE): 1 publication, 11.11%
  • Publications without a DOI are not counted.
  • Publication statistics are updated weekly.

Metrics
10
Cite
GOST
Liu J. et al. Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths // Applied Sciences (Switzerland). 2023. Vol. 13. No. 13. p. 7579.
GOST with all authors (up to 50)
Liu J., Wumaier A., Wei D., Guo S. Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths // Applied Sciences (Switzerland). 2023. Vol. 13. No. 13. p. 7579.
RIS
TY - JOUR
DO - 10.3390/app13137579
UR - https://doi.org/10.3390/app13137579
TI - Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths
T2 - Applied Sciences (Switzerland)
AU - Liu, Jiajun
AU - Wumaier, Aishan
AU - Wei, Dongping
AU - Guo, Shen
PY - 2023
DA - 2023/06/27
PB - MDPI
SP - 7579
IS - 13
VL - 13
SN - 2076-3417
ER -
BibTeX (up to 50 authors)
@article{2023_Liu,
author = {Jiajun Liu and Aishan Wumaier and Dongping Wei and Shen Guo},
title = {Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths},
journal = {Applied Sciences (Switzerland)},
year = {2023},
volume = {13},
publisher = {MDPI},
month = {jun},
url = {https://doi.org/10.3390/app13137579},
number = {13},
pages = {7579},
doi = {10.3390/app13137579}
}
MLA
Liu, Jiajun, et al. “Automatic Speech Disfluency Detection Using wav2vec2.0 for Different Languages with Variable Lengths.” Applied Sciences (Switzerland), vol. 13, no. 13, Jun. 2023, p. 7579. https://doi.org/10.3390/app13137579.