UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 10 Issue 7
July-2023
eISSN: 2349-5162

UGC and ISSN approved 7.95 impact factor UGC Approved Journal no 63975


Unique Identifier

Published Paper ID:
JETIR2307073


Registration ID:
520792

Page Number

a595-a603


Title

Gurmukhi Punjabi Part of speech tagging using IndicBERT-BiLSTM architecture

Abstract

Recent advances in neural network-based language representation allow the internal states learned by trained models to be transferred to downstream natural language processing tasks, including part-of-speech (POS) tagging and question answering. Pre-trained language models have delivered notable improvements on such tasks, particularly when labeled data is limited. We collected our dataset from the publicly accessible Indian Languages Corpora Initiative (ILCI) phase-II project; it comprises 28,733 sentences from six text domains: science and technology, religion, health, entertainment, sports, and agriculture. The dataset is annotated with the Bureau of Indian Standards (BIS) tagset, which defines 34 grammatical categories. It is divided into 25,859 sentences for training, 1,437 for validation, and 1,437 for testing. To the best of our knowledge, this is a first attempt to build a part-of-speech tagger for the low-resource Punjabi language using a transformer architecture. With only 28,733 sentences, we developed a system that achieves an F1 score of 84.46% on unseen data from the six domains for the POS tagging task.
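
The abstract reports the split sizes but not how the split was made. The sizes are consistent with holding out roughly 5% of the 28,733 sentences for validation and 5% for testing. The following is a minimal sketch under that assumption: the function name `split_corpus`, the 5% fractions, the seeded shuffle, and the integer placeholders standing in for sentences are all illustrative and not taken from the paper.

```python
import random


def split_corpus(sentences, val_fraction=0.05, test_fraction=0.05, seed=0):
    """Shuffle the corpus and split it into (train, validation, test) lists.

    Hypothetical helper: the fractions and the seeded shuffle are assumptions,
    not the authors' documented procedure.
    """
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)

    # Round the held-out sizes; everything left over goes to training.
    n_val = round(len(shuffled) * val_fraction)
    n_test = round(len(shuffled) * test_fraction)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Integer IDs stand in for the 28,733 annotated ILCI sentences.
corpus = range(28733)
train, val, test = split_corpus(corpus)
sizes = (len(train), len(val), len(test))
print(sizes)
```

With these assumed fractions the three parts have the sizes stated in the abstract (25,859 / 1,437 / 1,437), and every sentence lands in exactly one part. A real pipeline might instead stratify by domain so that all six domains appear in each part; the abstract does not say whether this was done.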

Key Words

Part of speech tagging, Transformers, IndicBERT, BiLSTM, Natural language processing

Cite This Article

"Gurmukhi Punjabi Part of speech tagging using IndicBERT-BiLSTM architecture", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 10, Issue 7, page no. a595-a603, July-2023, Available: http://www.jetir.org/papers/JETIR2307073.pdf



Publication Details

Published Paper ID: JETIR2307073
Registration ID: 520792
Published In: Volume 10 | Issue 7 | Year July-2023
DOI (Digital Object Identifier): http://doi.one/10.1729/Journal.35065
Page No: a595-a603
Country: Bengaluru, Karnataka, India
Area: Science & Technology
ISSN Number: 2349-5162
Publisher: IJ Publication

