UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 10 Issue 12
December-2023
eISSN: 2349-5162

UGC and ISSN approved | Impact factor 7.95 (calculated by Google Scholar)

Unique Identifier

Published Paper ID:
JETIRGA06042


Registration ID:
530124

Page Number

375-384

Title

MULTI-MODAL FUSION FOR ENHANCED IMAGE AND SPEECH RECOGNITION IN AI SYSTEMS

Abstract

This research investigates the integration of multi-modal information, specifically images and speech, to enhance the recognition capabilities of artificial intelligence (AI) systems. Adopting an interpretive philosophy and employing a deductive approach, the study explores the potential of dynamic attention mechanisms, semi-supervised learning, and cross-domain adaptation techniques. A descriptive research design is employed, utilizing secondary data collection from reputable academic sources. The research critically evaluates the feasibility and applicability of hardware optimization for efficient multi-modal processing, considering factors like specialized processors and parallel computing. The study presents a thorough analysis of dynamic attention mechanisms, emphasizing their role in dynamically allocating attention across different modalities based on contextual relevance. Additionally, it delves into semi-supervised learning techniques, showcasing their ability to leverage both labeled and unlabeled data for improved recognition performance. Cross-domain adaptation techniques are explored to facilitate the seamless deployment of multi-modal fusion models in diverse real-world scenarios.
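
As a rough illustration of what allocating attention across modalities by contextual relevance could look like, the sketch below weights an image embedding and a speech embedding by their scaled dot-product similarity to a context vector and then mixes them. This is not taken from the paper: the function names, the toy vectors, and the choice of softmax-weighted averaging are all assumptions made for the example.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(context, modalities):
    """Fuse modality embeddings using attention weights from a context vector.

    context:    list of floats (hypothetical query/context embedding)
    modalities: dict mapping a modality name to an embedding of the same length
    Returns (weights_by_name, fused_embedding).
    """
    d = len(context)
    names = list(modalities)
    # Scaled dot-product relevance score of each modality to the context.
    scores = [
        sum(c * x for c, x in zip(context, modalities[n])) / math.sqrt(d)
        for n in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality embeddings, one dimension at a time.
    fused = [
        sum(w * modalities[n][i] for w, n in zip(weights, names))
        for i in range(d)
    ]
    return dict(zip(names, weights)), fused

# Toy embeddings (assumed values, for illustration only).
image_vec = [0.9, 0.1, 0.0, 0.2]
speech_vec = [0.1, 0.8, 0.3, 0.0]
context = [1.0, 0.0, 0.0, 0.0]  # this context happens to align with the image

weights, fused = attention_fuse(context, {"image": image_vec, "speech": speech_vec})
print(weights)
print(fused)
```

With this context vector the image embedding receives the larger weight; a context aligned with the speech embedding would shift the weight the other way. A trained system would learn the projections that produce the context and modality vectors rather than use fixed ones.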

Key Words

AI systems, knowledge, connecting, integrating, multi-modal classification, aural, visual information

Cite This Article

"MULTI-MODAL FUSION FOR ENHANCED IMAGE AND SPEECH RECOGNITION IN AI SYSTEMS", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 10, Issue 12, pp. 375-384, December 2023. Available: http://www.jetir.org/papers/JETIRGA06042.pdf

An international, scholarly, open-access, peer-reviewed, refereed journal. Impact factor 7.95 (calculated by Google Scholar and Semantic Scholar). Multidisciplinary, monthly, multi-language; indexed in major databases.

Publication Details

Published Paper ID: JETIRGA06042
Registration ID: 530124
Published In: Volume 10 | Issue 12 | Year December-2023
DOI (Digital Object Identifier):
Page No: 375-384
Country: India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

