UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014


Published in:

Volume 12 Issue 3
March-2025
eISSN: 2349-5162

Impact factor: 7.95 (calculated by Google Scholar)

Unique Identifier

Published Paper ID: JETIR2503119
Registration ID: 556359
Page Number: b156-b163


Title

Automated Image Captioning And Voice Generation Using Deep Learning Techniques

Abstract

Automated image captioning and voice generation have emerged as transformative technologies, enabling machines to interpret visual content and generate human-like descriptions. This study explores the integration of deep learning models, particularly Convolutional Neural Networks (CNNs) for image analysis and Recurrent Neural Networks (RNNs), specifically Long Short-Term Memory (LSTM) networks, for generating descriptive text. The research further investigates the role of text-to-speech (TTS) systems in converting these generated captions into natural-sounding speech. These technologies are crucial for improving accessibility, particularly for visually impaired individuals, and enhancing user engagement across multimedia platforms. The study highlights the impact of automated image captioning and voice generation in content creation, education, and accessibility. Challenges such as dataset availability, model accuracy, and computational complexity are discussed, with a focus on potential solutions and future research directions. Ultimately, the findings underscore the potential of these technologies to foster more inclusive, interactive, and engaging digital experiences.
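The pipeline described in the abstract (image encoder, caption decoder, speech stage) can be illustrated with a minimal sketch of the data flow. This is not the authors' implementation: `encode_image`, `toy_decoder_step`, `greedy_caption`, and `speak` are hypothetical stand-ins, the "CNN" is a row average, the "LSTM" is a lookup table, and the speech stage only returns the text that a TTS engine would receive. Only the greedy decoding loop reflects how such systems typically emit a caption one token at a time.

```python
from typing import Callable, List, Sequence

START, END = "<start>", "<end>"  # hypothetical sentinel tokens


def encode_image(pixels: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for a CNN encoder: reduce each pixel row to its mean."""
    return [sum(row) / len(row) for row in pixels]


def toy_decoder_step(features: List[float], prev_token: str) -> str:
    """Stand-in for one LSTM decoding step: pick the next token.

    A trained decoder would score a whole vocabulary from its hidden
    state; this toy version uses a fixed table keyed on the last token.
    """
    bright = sum(features) / len(features) > 0.5
    table = {
        START: "a",
        "a": "bright" if bright else "dark",
        "bright": "image",
        "dark": "image",
        "image": END,
    }
    return table.get(prev_token, END)


def greedy_caption(
    features: List[float],
    step: Callable[[List[float], str], str],
    max_len: int = 20,
) -> str:
    """Greedy decoding: feed each emitted token back in until END or max_len."""
    tokens: List[str] = []
    prev = START
    for _ in range(max_len):
        nxt = step(features, prev)
        if nxt == END:
            break
        tokens.append(nxt)
        prev = nxt
    return " ".join(tokens)


def speak(caption: str) -> str:
    """Placeholder for the TTS stage: return the sentence to be synthesized."""
    return caption.capitalize() + "."


features = encode_image([[0.9, 0.8], [0.7, 1.0]])
caption = greedy_caption(features, toy_decoder_step)
utterance = speak(caption)
```

In a real system, `encode_image` would be a pretrained CNN, `toy_decoder_step` a trained LSTM, and `speak` a call into a speech synthesis engine; the loop structure in `greedy_caption` would stay essentially the same.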

Key Words

Automated Image Captioning, Deep Learning, Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory Networks, Text-to-Speech, Accessibility, Multimedia, Artificial Intelligence, Natural Language Processing, Human-Computer Interaction.

Cite This Article

"Automated Image Captioning And Voice Generation Using Deep Learning Techniques", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 12, Issue 3, pp. b156-b163, March 2025. Available: http://www.jetir.org/papers/JETIR2503119.pdf

An international, scholarly, open-access journal: peer-reviewed, refereed, multidisciplinary, monthly, and multilanguage.


Publication Details

Published Paper ID: JETIR2503119
Registration ID: 556359
Published In: Volume 12 | Issue 3 | March-2025
DOI (Digital Object Identifier):
Page No: b156-b163
Country: Natepute, Maharashtra, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

