UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 10 Issue 10
October-2023
eISSN: 2349-5162

UGC and ISSN approved 7.95 impact factor UGC Approved Journal no 63975

7.95 impact factor calculated by Google scholar

Unique Identifier

Published Paper ID:
JETIR2310506


Registration ID:
526744

Page Number

f36-f44


Title

NEURAL MACHINE TRANSLATION FOR UNDER-REPRESENTED INDIAN LANGUAGE

Abstract

Much useful content on the internet is available only in English, which many people do not understand well. Making it accessible requires translation into local languages, but manual translation is difficult, expensive, and slow. Machine translation addresses this by automatically converting text from one language to another without human intervention. Among the available approaches, neural machine translation (NMT) is one of the most effective. In this paper, we apply NMT to two language pairs, English-Tamil and English-Malayalam. Tamil and Malayalam are challenging because they have many distinctive linguistic features and few online translation resources. We propose an NMT approach that combines Multihead self-attention with pre-trained Byte-Pair-Encoded (BPE) and MultiBPE embeddings, which help the system handle Out Of Vocabulary (OOV) words. We also collected text from several sources, corrected problems in the data found online, and refined it for use by our system. We evaluated the system with the BLEU score: it achieved 24.34 for English-Tamil and 9.78 for English-Malayalam, compared with 9.40 and 5.94, respectively, for Google Translator on the same translations.
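To illustrate why byte-pair encoding helps with OOV words, the following is a minimal sketch of the generic BPE merge procedure on a toy English word list. It is not the paper's implementation, which relies on pre-trained BPE and MultiBPE embeddings; the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the toy corpus are assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Start from characters, with an end-of-word marker on each word.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere in the vocabulary.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word (seen or unseen) into subword units by replaying the learned merges."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    toy_corpus = ["low", "lower", "lowest", "newer", "newest", "wider", "widest"]
    rules = learn_bpe(toy_corpus, 10)
    # "slowest" never appears in the toy corpus, yet it is still segmented
    # into pieces built from characters and merges learned from other words.
    print(segment("slowest", rules))
```

Because an unseen word is always decomposed into characters and learned subword units, the model never has to map it to a single unknown token, which is the property the abstract refers to when it mentions handling OOV words.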

Key Words

Multihead self-attention, Byte-Pair-Encoding, MultiBPE, low-resourced, Morphology, Indian Languages.

Cite This Article

"NEURAL MACHINE TRANSLATION FOR UNDER-REPRESENTED INDIAN LANGUAGE", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 10, Issue 10, page no. f36-f44, October-2023, Available: http://www.jetir.org/papers/JETIR2310506.pdf



Publication Details

Published Paper ID: JETIR2310506
Registration ID: 526744
Published In: Volume 10 | Issue 10 | Year October-2023
DOI (Digital Object Identifier):
Page No: f36-f44
Country: Junagadh, Gujarat, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

