UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 5 Issue 7
July-2018
eISSN: 2349-5162


Unique Identifier

Published Paper ID:
JETIR180Z029


Registration ID:
212408

Page Number

950-956


Title

Preprocessing Phase to Develop an Interface to Query Relational Databases in Punjabi Language: Query Normalization

Abstract

A natural language sentence needs pre-processing before it can be used in Natural Language Processing (NLP). This pre-processing, often called text normalization, prepares the sentence for further processing, so its form depends largely on the task to be performed. The first step is cleaning: removing unwanted special characters from the raw text. The second step is substituting: replacing certain words or multiword expressions with standard alternative terms that are easier to process. The third step is tokenization: splitting the sentence into tokens. The last step is stemming: removing any affixes attached to the words. This paper presents a technique for normalizing text in the development of an interface for querying relational databases in the Punjabi language. In this interface, the sentence is not considered as a whole; instead, the important words are extracted from the query sentence and some undesired words are ignored during further processing. These undesired words may not carry information themselves, but they may help in extracting information from the text. The proposed normalization methodology comprises four steps: Cleaning, Substituting, Tokenizing and Stemming.
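
The four steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the character ranges kept during cleaning, and the entries in `SUBSTITUTIONS` and `SUFFIXES` are all assumptions made for the example, since the abstract does not list the actual rules or word lists.

```python
import re

# Hypothetical rule tables for illustration; the paper's real lists are not given.
SUBSTITUTIONS = {"ਤੋਂ ਜ਼ਿਆਦਾ": "ਤੋਂ ਵੱਧ"}  # assumed: variant phrase -> standard term
SUFFIXES = ("ਆਂ",)                         # assumed: one plural suffix


def clean(text):
    """Step 1: drop special characters, keeping Gurmukhi, ASCII letters and digits."""
    kept = re.sub(r"[^\u0A00-\u0A7FA-Za-z0-9\s]", " ", text)
    # Collapse runs of whitespace so multiword substitutions can still match.
    return " ".join(kept.split())


def substitute(text, table):
    """Step 2: replace words or multiword expressions with standard terms."""
    # Longest expressions first, so a multiword entry wins over its parts.
    for phrase in sorted(table, key=len, reverse=True):
        text = text.replace(phrase, table[phrase])
    return text


def tokenize(text):
    """Step 3: split the sentence into whitespace-separated tokens."""
    return text.split()


def stem(token, suffixes):
    """Step 4: strip the longest matching suffix, never emptying the token."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if token.endswith(suffix) and len(token) > len(suffix):
            return token[: -len(suffix)]
    return token


def normalize(query, substitutions=SUBSTITUTIONS, suffixes=SUFFIXES):
    """Run cleaning, substituting, tokenizing and stemming in that order."""
    text = substitute(clean(query), substitutions)
    return [stem(token, suffixes) for token in tokenize(text)]
```

Passing the rule tables as parameters keeps the sketch independent of any particular word list; a real system would presumably load much larger, linguistically validated tables. Note that the substitution step here uses plain substring replacement, which can also match inside longer words.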

Key Words

Normalization, Natural Language Processing, Punjabi Language Processing, Cleaning, Substituting, Tokenization, Stemming.

Cite This Article

"Preprocessing Phase to Develop an Interface to Query Relational Databases in Punjabi Language: Query Normalization", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 5, Issue 7, pp. 950-956, July 2018. Available: http://www.jetir.org/papers/JETIR180Z029.pdf



Publication Details

Published Paper ID: JETIR180Z029
Registration ID: 212408
Published In: Volume 5 | Issue 7 | Year July-2018
DOI (Digital Object Identifier): http://doi.one/10.1729/Journal.20880
Page No: 950-956
Country: Patiala, Punjab, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

