UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 7 Issue 4
April-2020
eISSN: 2349-5162

UGC and ISSN approved | Impact factor 7.95 (calculated by Google Scholar)

Unique Identifier

Published Paper ID: JETIR2004101
Registration ID: 230347
Page Number: 774-781


Title

A new framework for ensembling of text clustering of data

Abstract

A clustering ensemble algorithm combines multiple clustering techniques to produce a better result than any individual clustering algorithm. In this paper we describe a novel approach to clustering text data. Text clustering divides text documents into a given number of groups so that the documents within each group are related in content. For this purpose we use two clustering algorithms, k-means and Birch. Before applying these algorithms, we preprocess the documents using stopword removal, pruning, stemming, and a Vector Space Model document representation. After preprocessing, the inverse document frequency (IDF) is computed and used as input to the k-means and Birch clustering algorithms. The common weighting scheme is TF-IDF (Term Frequency-Inverse Document Frequency); it has been found that the new weighting scheme, Word to Vector, provides better results than TF-IDF. We aim to apply text clustering to articles such as those in newspapers, using the Word to Vector scheme to calculate the term weights in the document vector. In this project, the ensemble text clustering is based on majority voting.
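The abstract does not show an implementation, so the following is only a minimal sketch of two of the steps it names: TF-IDF weighting after stopword removal, and majority voting over several base clusterings. Everything here is an assumption made for illustration: the function names (`preprocess`, `tfidf`, `align`, `majority_vote`), the tiny stopword list, the `tf * log(N / df)` variant of TF-IDF, and the label-alignment step (cluster IDs from different algorithms are arbitrary, so they are matched to the first clustering before voting). Stemming, pruning, the Word to Vector weighting, and the k-means and Birch runs themselves are not covered; the base clusterings are assumed to be supplied as lists of integer labels.

```python
import math
from collections import Counter
from itertools import permutations

# Illustrative stopword list (an assumption; the paper does not give one).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "on"}


def preprocess(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (t.strip(".,;:!?\"'()").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (c / len(toks)) * math.log(n_docs / df[term])
            for term, c in counts.items()
        })
    return vectors


def align(reference, labels, k):
    """Relabel `labels` with the permutation of 0..k-1 that agrees most with `reference`.

    Brute force over k! permutations, so only suitable for small k.
    """
    best = max(
        permutations(range(k)),
        key=lambda p: sum(p[l] == r for l, r in zip(labels, reference)),
    )
    return [best[l] for l in labels]


def majority_vote(clusterings, k):
    """Align every clustering to the first one, then take the most common label per document."""
    ref = clusterings[0]
    aligned = [ref] + [align(ref, c, k) for c in clusterings[1:]]
    return [Counter(column).most_common(1)[0][0] for column in zip(*aligned)]
```

A possible use: call `tfidf(docs)` to build the document vectors, cluster them with each base algorithm, and pass the resulting label lists to `majority_vote(label_lists, k)`. With only two base clusterings a vote can tie on every disputed document; in this sketch a tie falls back to the label from the first clustering.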

Key Words

K-means, Birch, Word to Vector, Pre-processing.

Cite This Article

"A new framework for ensembling of text clustering of data", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 7, Issue 4, pp. 774-781, April 2020. Available: http://www.jetir.org/papers/JETIR2004101.pdf


An international, scholarly, open-access, peer-reviewed, refereed journal. Impact factor 7.95 (calculated by Google Scholar). Multidisciplinary, monthly, multilanguage.


Publication Details

Published Paper ID: JETIR2004101
Registration ID: 230347
Published In: Volume 7 | Issue 4 | Year April-2020
DOI (Digital Object Identifier):
Page No: 774-781
Country: Visakhapatnam, Andhra Pradesh, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

