UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014


Published in:

Volume 12 Issue 4
April-2025
eISSN: 2349-5162

UGC and ISSN approved | UGC Approved Journal no 63975 | Impact factor 7.95 (calculated by Google Scholar)

Unique Identifier

Published Paper ID:
JETIR2504D45


Registration ID:
560234

Page Number

n347-n360


Title

Multi-Modal Speech Emotion Recognition: Integrating Transformer Models and Contextual Analysis

Abstract

Speech Emotion Recognition (SER) is an interdisciplinary area concerned with helping machines understand human emotion from voice. It matters for healthcare, assistive technologies, intelligent virtual assistants, and affective computing. This paper introduces an SER system that combines deep learning with context-awareness, built on the Wav2Vec2 transformer model. The system takes raw speech as input, avoids traditional feature engineering, and improves emotion classification accuracy by integrating speech transcription with sentiment analysis. We trained and evaluated the model on the RAVDESS dataset to classify eight emotions, with real-time inference capability. Results show 78% overall accuracy, with high performance on emotions such as Calm, Disgust, and Surprise. The system illustrates the potential of context-enriched emotional AI.
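The eight classes mentioned in the abstract correspond to the emotion codes that RAVDESS encodes in each file name (the third of seven hyphen-separated fields). The sketch below shows one way labels might be read from those names. The helper name `emotion_from_filename` and the label spellings are illustrative assumptions, not taken from the paper, which does not describe its data-loading code.

```python
from pathlib import Path

# RAVDESS emotion codes (third field of the file name).
# Label spellings here are an assumption; the paper may name classes differently.
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def emotion_from_filename(path: str) -> str:
    """Return the emotion label encoded in a RAVDESS-style file name.

    Hypothetical helper: expects names such as
    '03-01-06-01-02-01-12.wav' (seven hyphen-separated fields).
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path!r}")
    return RAVDESS_EMOTIONS[fields[2]]


if __name__ == "__main__":
    print(emotion_from_filename("Actor_12/03-01-06-01-02-01-12.wav"))
```

Labels obtained this way could then be paired with the raw waveforms when fine-tuning a classifier such as Wav2Vec2.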

Key Words

Audio Classification, Context-Aware Analysis, Deep Learning, Human-Computer Interaction, Real-Time Systems, Speech Emotion Recognition, Transformer Models, Wav2Vec2.

Cite This Article

"Multi-Modal Speech Emotion Recognition: Integrating Transformer Models and Contextual Analysis", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 12, Issue 4, pp. n347-n360, April 2025. Available: http://www.jetir.org/papers/JETIR2504D45.pdf


Publication Details

Published Paper ID: JETIR2504D45
Registration ID: 560234
Published In: Volume 12 | Issue 4 | Year April-2025
DOI (Digital Object Identifier):
Page No: n347-n360
Country: Mumbai, Maharashtra, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

