Article

Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

Abstract

Speech is central to how humans express and understand emotion. Emotional speech processing faces challenges in expert data sampling, dataset organization, and the computational cost of large-scale analysis. This study introduces a new speech emotion recognition system that aims to reduce data redundancy and high dimensionality. The system uses a Diffusion Map for dimensionality reduction and ensemble classifiers built from Decision Trees and K-Nearest Neighbors (KNN), strategies proposed to increase recognition accuracy. Speech emotion recognition is gaining popularity in affective computing, with applications in medicine, industry, and academia. The project aims to provide an efficient and robust framework for real-time emotion identification. To identify emotions with supervised machine learning models, the work uses paralinguistic features such as intensity, pitch, and Mel-frequency cepstral coefficients (MFCC). The experimental analysis combines prosodic and spectral information and classifies it with Random Forest, Multilayer Perceptron (MLP), Support Vector Machine (SVM), KNN, and Gaussian Naïve Bayes. Fast training times make these models well suited to real-time applications. SVM and MLP achieve the highest accuracies, at 70.86% and 79.52%, respectively. Comparisons with benchmarks show significant improvements over earlier models.
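As a rough illustration of how prosodic and spectral features can be combined into one vector and passed to a KNN classifier, here is a minimal sketch using only the Python standard library. It is not the authors' implementation. The feature values, the emotion labels, and the helper names (`fit_scaler`, `scale`, `knn_predict`) are hypothetical, and feature extraction from audio is assumed to have happened already.

```python
import math
from collections import Counter

# Hypothetical per-utterance features:
# [mean pitch (Hz), mean intensity (dB), MFCC-1, MFCC-2].
# Values and labels are made up for illustration; they are not from the study.
TRAIN = [
    ([220.0, 72.0, -180.0, 95.0], "angry"),
    ([235.0, 75.0, -170.0, 90.0], "angry"),
    ([228.0, 74.0, -175.0, 92.0], "angry"),
    ([140.0, 58.0, -260.0, 60.0], "sad"),
    ([132.0, 55.0, -270.0, 55.0], "sad"),
    ([138.0, 57.0, -265.0, 58.0], "sad"),
]


def fit_scaler(rows):
    """Return per-column means and standard deviations for z-score scaling."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Z-score one feature vector so pitch, intensity, and MFCCs are comparable."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(query, train, k=3):
    """Label a query by majority vote among its k nearest training vectors."""
    means, stds = fit_scaler([features for features, _ in train])
    q = scale(query, means, stds)
    dists = sorted(
        (math.dist(q, scale(features, means, stds)), label)
        for features, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


prediction = knn_predict([225.0, 73.0, -178.0, 93.0], TRAIN)
print(prediction)
```

Scaling matters here because pitch in hertz and MFCC values sit on very different numeric ranges; without it, one feature would dominate the Euclidean distance.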
https://doi.org/10.38124/ijisrt/ijisrt24apr872
2024
gold
ZEWAR Shah; SHAN Zhiyong; Adnan
Article retrieved from:
OpenAlex