UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 9 Issue 4
April-2022
eISSN: 2349-5162

Impact factor: 7.95 (calculated by Google Scholar)

Published Paper ID: JETIR2204766
Registration ID: 401352
Page Number: h496-h501


Title

Automated Image Caption Generator For Visually Impaired

Abstract

Image processing is used across many industries, from Google's products to the medical field. Google Vision is one of the most widely used APIs for image labelling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content. The widely used Google Photos and iCloud also rely on image processing to recognize the scene in a photo and the faces in it. Image processing can likewise help describe their surroundings to visually impaired people. A voice-based image caption generator, built on an encoder-decoder architecture, is used to describe the image: images of the surroundings are used to generate captions, which are read aloud so that visually impaired users get a stronger sense of what is happening around them. The encoder is a pre-trained ResNet-50 model, a convolutional neural network that is 50 layers deep and learns rich feature representations for a wide range of images. Besides extracting high-level features with ResNet-50, we also preserve the image colour composition using OpenCV techniques, which helps the model extract features from small components within the image. The decoder is an LSTM network. Long short-term memory (LSTM) is a recurrent neural network (RNN) architecture; the decoder also contains a Time Distributed layer, so it can process not only single data points but also entire sequences of data.
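
The caption generation step of an encoder-decoder model like the one described above can be sketched as a word-by-word decoding loop. The sketch below is illustrative only and is not taken from the paper: greedy decoding, the `<start>` and `<end>` tokens, and the names `greedy_caption`, `next_word_scores`, and `toy_decoder` are all assumptions, and a scripted stand-in replaces both the ResNet-50 features and the trained LSTM decoder.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical sentence-boundary tokens (an assumption, not from the paper).
START, END = "<start>", "<end>"


def greedy_caption(
    image_features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word.

    `image_features` stands in for the encoder output and `next_word_scores`
    for the decoder: given the features and the words so far, it returns a
    score for each candidate next word.
    """
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)  # greedy choice
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_decoder(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Scripted stand-in for a trained decoder; it ignores the image features."""
    script = ["a", "dog", "runs", END]
    position = len(words) - 1  # number of words generated so far
    target = script[min(position, len(script) - 1)]
    return {word: (1.0 if word == target else 0.0) for word in script}


if __name__ == "__main__":
    print(greedy_caption([0.0], toy_decoder))
```

In a real system the returned string would then be passed to a text-to-speech component so the caption can be read aloud; that step is omitted here.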

Key Words

Deep learning, ResNet-50, LSTM, Flutter, Encoder-Decoder, Flickr8k

Cite This Article

"Automated Image Caption Generator For Visually Impaired", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 9, Issue 4, page no. h496-h501, April-2022, Available: http://www.jetir.org/papers/JETIR2204766.pdf

Publication Details

Published Paper ID: JETIR2204766
Registration ID: 401352
Published In: Volume 9 | Issue 4 | Year April-2022
DOI (Digital Object Identifier):
Page No: h496-h501
Country: Bangalore, Karnataka, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

