UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 11 Issue 5
May-2024
eISSN: 2349-5162

UGC and ISSN approved | Impact factor 7.95 (calculated by Google Scholar) | UGC Approved Journal no. 63975

Unique Identifier

Published Paper ID: JETIR2405095
Registration ID: 537974
Page Number: a798-a811

Title

SpeechSnap- Describing Images with voice and multi-language support features

Abstract

In this study, we develop an image captioning model that uses InceptionV3 to classify images and extract relevant features. A Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM) network, generates descriptive English captions for the processed images, and GloVe word embeddings enhance the model's linguistic representation. The system identifies the elements within an input image and generates a meaningful statement describing it. The generated captions have potential applications in social media posts, blog articles, accessibility for visually impaired individuals, e-commerce, advertising, surveillance, medical diagnosis, and education, among other domains. The model is trained with categorical cross-entropy loss to optimize the accuracy of caption predictions. As the model evolves, it is expected to describe images of diverse formats and content more accurately, which would broaden its utility across a wide range of applications.
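
As a rough illustration of the training objective named in the abstract, the sketch below computes categorical cross-entropy for a single caption: at each decoding step the loss is the negative log of the probability the model assigns to the correct next word, and the per-step losses are averaged. The four-word vocabulary, the probability values, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def categorical_cross_entropy(probs, target_index, eps=1e-12):
    """Loss for one decoding step: -log of the probability given to the true word."""
    # eps guards against log(0) when the model assigns zero probability.
    return -math.log(max(probs[target_index], eps))

def caption_loss(step_probs, target_indices):
    """Mean per-step loss over one caption."""
    losses = [categorical_cross_entropy(p, t)
              for p, t in zip(step_probs, target_indices)]
    return sum(losses) / len(losses)

# Hypothetical vocabulary and softmax outputs (assumed values, not from the paper).
vocab = ["<start>", "dog", "runs", "<end>"]
step_probs = [
    [0.05, 0.80, 0.10, 0.05],  # step 1: true next word is "dog"
    [0.05, 0.10, 0.70, 0.15],  # step 2: true next word is "runs"
    [0.02, 0.03, 0.05, 0.90],  # step 3: true next word is "<end>"
]
targets = [vocab.index(w) for w in ["dog", "runs", "<end>"]]

loss = caption_loss(step_probs, targets)
print(f"mean cross-entropy per word: {loss:.4f}")
```

The loss falls toward zero as the probability assigned to each correct word approaches one, which is the sense in which minimizing it is meant to improve caption accuracy.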

Key Words

Image Captioning, InceptionV3, Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), GloVe, Categorical Cross Entropy Loss, Gated Recurrent Unit (GRU), Encoder, Decoder

Cite This Article

"SpeechSnap- Describing Images with voice and multi-language support features", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 11, Issue 5, pp. a798-a811, May 2024. Available: http://www.jetir.org/papers/JETIR2405095.pdf

Publication Details

Published Paper ID: JETIR2405095
Registration ID: 537974
Published In: Volume 11 | Issue 5 | Year May-2024
DOI (Digital Object Identifier):
Page No: a798-a811
Country: Buldhana, Maharashtra, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

