


Published in:

Volume 7 Issue 10
October-2020
eISSN: 2349-5162

Unique Identifier

JETIR2010019

Page Number

167-172


Title

To Determine the Optimal Number of Clusters

ISSN

2349-5162

Cite This Article

"To Determine the Optimal Number of Clusters", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 7, Issue 10, pp. 167-172, October 2020. Available: http://www.jetir.org/papers/JETIR2010019.pdf

Authors

Abstract

Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering and is a distinct issue from the process of actually solving the clustering problem. In this paper we propose an approach to determining this number. For a certain class of clustering algorithms, in particular k-means, k-medoids and the expectation–maximization algorithm, a parameter commonly referred to as k specifies the number of clusters to detect. Other algorithms, such as DBSCAN and OPTICS, do not require this parameter to be specified; hierarchical clustering avoids the problem altogether. The correct choice of k is often ambiguous, with interpretations depending on the shape and scale of the distribution of points in the data set and on the clustering resolution the user wants. In addition, increasing k without penalty will always reduce the amount of error in the resulting clustering, reaching the extreme case of zero error when each data point is considered its own cluster (i.e., when k equals the number of data points, n). Intuitively, then, the optimal choice of k strikes a balance between maximum compression of the data using a single cluster and maximum accuracy by assigning each data point to its own cluster. If an appropriate value of k is not apparent from prior knowledge of the properties of the data set, it must be chosen by some other means. There are several categories of methods for making this decision.
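The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it is a plain Lloyd's-style k-means written only with the Python standard library, run on made-up toy points, with the within-cluster sum of squares (WCSS) standing in as the "error". The function names, the toy data, and the seed are all assumptions chosen for illustration. Note that a single k-means run can settle in a local optimum, so the decrease in error is only guaranteed for the best clustering at each k, not for every run.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_wcss(points, k, iters=50, seed=0):
    """Run a basic Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    # Illustrative initialisation: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Hypothetical toy data: three loose groups of three distinct 2-D points.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

# Error for every k from 1 (one cluster) up to n (each point its own cluster).
errors = {k: kmeans_wcss(points, k) for k in range(1, len(points) + 1)}

for k, err in errors.items():
    print(f"k={k}: WCSS={err:.3f}")
```

With k = 1 the error is the total spread of the data, and with k = n every point coincides with its own centroid, so the error is zero. Neither extreme is a useful clustering, which is why the error alone cannot be used to pick k.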

Key Words

data clustering, k-means algorithm, k-medoids, expectation–maximization algorithm, OPTICS algorithm, hierarchical clustering


Publication Details

Published Paper ID: JETIR2010019
Registration ID: 301635
Published In: Volume 7 | Issue 10 | Year October-2020
DOI (Digital Object Identifier):
Page No: 167-172
ISSN Number: 2349-5162
