K-means clustering is one of the most popular algorithms for clustering and segmentation. It treats each feature point as a location in a multidimensional measurement space. Given a chosen number of clusters k, the basic algorithm arbitrarily places k cluster centers in that space. Each point is then assigned to the cluster whose mean vector is nearest, and each mean vector is recomputed from the points assigned to it. The procedure repeats until the locations of the cluster mean vectors no longer change significantly between successive iterations. K-means clustering is a partitioning method: the kmeans function divides the data into k mutually exclusive clusters and returns the index of the cluster to which each observation has been assigned. Unlike hierarchical clustering, k-means clustering operates on the actual observations (rather than on the larger set of dissimilarity measures) and creates a single level of clusters. These distinctions mean that k-means clustering is often better suited than hierarchical clustering to large amounts of data.
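The iterative procedure described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the kmeans function referred to in the text: the name kmeans_lloyd, the parameters max_iter, tol and seed, and the choice of random observations as starting centers are all assumptions made for the example.

```python
import numpy as np

def kmeans_lloyd(X, k, max_iter=100, tol=1e-6, seed=0):
    """Illustrative k-means loop (hypothetical helper, squared Euclidean distance)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Arbitrary starting centers: k distinct observations picked at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:  # centers no longer move significantly
            break
    # Final assignment against the final centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

# Two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centers = kmeans_lloyd(X, k=2)
```

The function returns one cluster index per observation, which mirrors the single-level partition described above. The result of such a loop generally depends on the starting centers, which is one reason initialization is exposed as an option.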
kmeans treats every observation in your data as an object that has a location in space. It finds a partition in which objects within each cluster are as close as possible to each other and as far as possible from objects in other clusters. You can choose from five different distance measures, depending on the type of data you are clustering.
Each cluster in the partition is defined by its member objects and by its centroid, or center. The centroid of a cluster is the point for which the sum of distances from all objects in that cluster is minimized. kmeans computes cluster centroids differently for each distance measure, so as to minimize this sum with respect to the measure you specify.
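To make the dependence of the centroid on the distance measure concrete, the sketch below compares two common cases: under squared Euclidean distance the minimizing point is the component-wise mean, while under city-block (sum of absolute differences) distance it is the component-wise median. The helper names centroid and total_distance and the sample cluster are hypothetical, chosen only for illustration; other distance measures are not covered here.

```python
import numpy as np

def centroid(points, distance="sqeuclidean"):
    """Point minimizing the summed distance to `points` (illustrative helper)."""
    points = np.asarray(points, dtype=float)
    if distance == "sqeuclidean":
        return points.mean(axis=0)        # mean minimizes squared Euclidean
    if distance == "cityblock":
        return np.median(points, axis=0)  # median minimizes city-block
    raise ValueError(f"unsupported distance: {distance}")

def total_distance(points, c, distance="sqeuclidean"):
    """Sum of distances from every point to the candidate center c."""
    diff = np.asarray(points, dtype=float) - np.asarray(c, dtype=float)
    if distance == "sqeuclidean":
        return float((diff ** 2).sum())
    if distance == "cityblock":
        return float(np.abs(diff).sum())
    raise ValueError(f"unsupported distance: {distance}")

# A small cluster with one outlying member.
cluster = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [9.0, 4.0]])
mean_c = centroid(cluster, "sqeuclidean")   # [3.0, 1.0]
median_c = centroid(cluster, "cityblock")   # [1.5, 0.0]
```

For this sample the outlier pulls the mean toward it, while the median stays with the bulk of the points, so the two measures yield visibly different centroids for the same members.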
You can control the details of the minimization with several optional input parameters to kmeans, including the initial values of the cluster centroids and the maximum number of iterations. By default, kmeans uses the k-means++ algorithm to initialize the cluster centroids and the squared Euclidean metric to determine distances.
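As a rough illustration of the k-means++ seeding idea mentioned above, the following sketch picks the first center uniformly at random and each later center with probability proportional to its squared distance from the nearest center chosen so far. It is a minimal sketch under stated assumptions, not the initialization code of the kmeans function itself; the name kmeans_pp_init and the seed parameter are hypothetical.

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    """Illustrative k-means++ seeding (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = len(X)
    # First center: one observation chosen uniformly at random.
    centers = [X[rng.integers(n)]]
    # d2[i] = squared distance from point i to its nearest chosen center.
    d2 = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        total = d2.sum()
        # Far-away points are more likely to be picked; fall back to a
        # uniform draw if every point coincides with a chosen center.
        probs = d2 / total if total > 0 else np.full(n, 1.0 / n)
        idx = rng.choice(n, p=probs)
        centers.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    return np.array(centers)

# Three coincident points and one distant point.
X = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [5.0, 5.0]])
init = kmeans_pp_init(X, k=2)
```

Because an already-covered location has zero selection probability, the two seeds in this sample always land on the two distinct locations, which shows how the scheme spreads initial centers across the data.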