UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 10 Issue 5
May-2023
eISSN: 2349-5162

UGC and ISSN approved 7.95 impact factor UGC Approved Journal no 63975


Unique Identifier

Published Paper ID: JETIR2305946
Registration ID: 516639
Page Number: j1-j5


Title

Image to Text and Speech Synthesis

Abstract

Generating textual descriptions of images is an important topic in computer vision and natural language processing, and a number of deep learning techniques have been proposed for it. Several existing technologies perform image-to-text conversion, including optical character recognition (OCR) systems, which use computer vision algorithms to identify and extract text from images. Text-to-speech (TTS) systems, in turn, convert textual information into audible speech. Text-to-image conversion, on the other hand, relies on generative models such as deep neural networks, which learn to generate images from textual descriptions. Our proposed methodology converts images into text and speech, and also converts text into images. This process has various applications, such as helping visually impaired individuals understand visual content or generating visual representations of textual information. A Generative Adversarial Network (GAN) based text-to-image generator is used to generate images, and an image captioning method trained on real images is used to generate the captions. The Flickr8k dataset is used. The models are evaluated using both qualitative and quantitative analysis on commonly used evaluation metrics. A text-to-speech synthesizer is an application that converts text into spoken words: it analyzes and processes the text using Natural Language Processing (NLP) and then uses Digital Signal Processing (DSP) technology to convert the processed text into a synthesized speech representation.
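
The two-stage text-to-speech structure described in the abstract (an NLP front end followed by a DSP back end) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the paper's implementation: the abbreviation table, the letter-to-tone mapping, the 16 kHz sample rate, and all function names are hypothetical and chosen only to make the two stages visible. A real synthesizer would produce phonemes and speech waveforms rather than sine tones.

```python
import io
import math
import re
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second (assumed for this sketch)

# Hypothetical lookup tables for the toy NLP front end.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "etc.": "et cetera"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """NLP stage (toy): lowercase, expand abbreviations and digits, drop punctuation."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
            continue
        token = re.sub(r"[^a-z0-9]", "", token)
        if not token:
            continue
        if token.isdigit():
            # Read numbers digit by digit, e.g. "42" -> "four", "two".
            words.extend(DIGITS[int(d)] for d in token)
        else:
            words.append(token)
    return words


def synthesize(words, tone_seconds=0.05):
    """DSP stage (toy): one short sine tone per character, silence between words."""
    n = round(SAMPLE_RATE * tone_seconds)
    samples = []
    for word in words:
        for ch in word:
            # Hypothetical mapping from character to pitch; not a speech model.
            freq = 200.0 + 20.0 * (ord(ch) - ord("a"))
            samples.extend(
                int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
                for i in range(n)
            )
        samples.extend([0] * n)
    return samples


def to_wav_bytes(samples):
    """Pack 16-bit mono PCM samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def text_to_speech(text):
    """Chain the two stages: text -> normalized words -> waveform bytes."""
    return to_wav_bytes(synthesize(normalize(text)))
```

In an image-to-speech pipeline of the kind the abstract outlines, the input to `text_to_speech` would be the caption produced by the image captioning model; that model is outside the scope of this sketch.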

Key Words

Natural Language Processing, Digital Signal Processing, Image captioning, Speech generation, Generative Adversarial Network.

Cite This Article

"Image to Text and Speech Synthesis", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 10, Issue 5, pp. j1-j5, May 2023. Available: http://www.jetir.org/papers/JETIR2305946.pdf



Publication Details

Published Paper ID: JETIR2305946
Registration ID: 516639
Published In: Volume 10 | Issue 5 | Year May-2023
DOI (Digital Object Identifier):
Page No: j1-j5
Country: Bangalore, Karnataka, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

