UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 7 Issue 6
June-2020
eISSN: 2349-5162

UGC and ISSN approved 7.95 impact factor UGC Approved Journal no 63975

7.95 impact factor calculated by Google scholar

Unique Identifier

Published Paper ID:
JETIR2006189


Registration ID:
234179

Page Number

1306-1312

Title

REAL TIME VOICE CLONING

Abstract

Recent progress in deep learning has shown impressive results in the area of text-to-speech. To this end, a deep neural network is usually trained on a corpus of several hours of professionally recorded speech from a single speaker. Giving such a model a new voice is highly expensive, as it requires collecting a new dataset and retraining the model. Recent research has introduced a three-stage pipeline that can clone a voice unseen during training from just a few seconds of reference speech, without retraining the model. The researchers share strikingly natural-sounding results. The plan is to replicate this model and open-source it. With a new vocoder model, the aim is to adapt the framework so that it runs in real time, yielding a three-stage deep learning system that performs real-time voice cloning. This framework originates from a 2018 paper by Google, for which only one public implementation existed before ours. From a speech utterance of only 5 seconds, the system can capture a realistic digital representation of the speaker's voice. Given a text prompt, it can then perform text-to-speech in any voice extracted by this process. The plan is to replicate each of the three stages of the model, using our own implementations or open-source ones, and to implement effective deep learning models and appropriate data pre-processing pipelines. The next step is to train these models for weeks or months on large datasets comprising tens of thousands of hours of speech from several thousand speakers, and then to examine their strengths and drawbacks. The main focus is on making the system run in real time, that is, on capturing a voice and generating speech in less time than the duration of the speech produced. The framework will be able to clone voices it has never heard during training and to generate speech from text it has never seen.
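The real-time criterion stated in the abstract (total generation time shorter than the duration of the speech produced) can be illustrated with a minimal sketch. The function names and the idea of passing one timing per pipeline stage are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def real_time_factor(stage_seconds: Sequence[float], audio_seconds: float) -> float:
    """Ratio of total processing time to the duration of the generated speech.

    stage_seconds holds one wall-clock timing per pipeline stage
    (hypothetical inputs; the paper's own measurements are not reproduced here).
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return sum(stage_seconds) / audio_seconds


def is_real_time(stage_seconds: Sequence[float], audio_seconds: float) -> bool:
    """True when generating the speech took less time than the speech lasts."""
    return real_time_factor(stage_seconds, audio_seconds) < 1.0
```

Under this definition, a factor below 1 means the system keeps up with playback; for example, three stages that together take 2 seconds to produce 4 seconds of audio give a factor of 0.5.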

Key Words

Cite This Article

"REAL TIME VOICE CLONING", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 7, Issue 6, pp. 1306-1312, June 2020. Available: http://www.jetir.org/papers/JETIR2006189.pdf

Publication Details

Published Paper ID: JETIR2006189
Registration ID: 234179
Published In: Volume 7 | Issue 6 | Year June-2020
DOI (Digital Object Identifier):
Page No: 1306-1312
Country: Pune, Maharashtra, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication
