Cluster analysis not only groups the objects under study according to a chosen criterion; when those objects have regional characteristics, it also reveals their spatial distribution pattern. Two commonly used methods are K-Means cluster analysis and C-Means cluster analysis. Of the two, K-Means is currently the more widely used and the simpler. Its basic principle is as follows. Given a sample {x_1, …, x_N} with each x_n ∈ R^d, randomly select K cluster centroids U_1, U_2, …, U_K ∈ R^d. K-Means then seeks to minimize

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} \Gamma_{nk} \, \lVert x_n - U_k \rVert^2$$

where Γ_nk takes the value 1 when data point x_n is assigned to cluster k and 0 otherwise. In general it is very difficult to find the Γ_nk and U_k that minimize J directly, so the solution is usually obtained by iteration. First, hold U_k fixed and find the optimal Γ_nk: this is easy, because assigning each data point to its nearest centroid guarantees that J is as small as possible for the current centroids. Next, hold Γ_nk fixed and find the optimal U_k. Taking the derivative of J with respect to U_k and setting it to zero shows that J is smallest when U_k satisfies

$$U_k = \frac{\sum_{n} \Gamma_{nk} \, x_n}{\sum_{n} \Gamma_{nk}}$$

That is, the optimal U_k is the mean of all data points currently assigned to cluster k. Since each step minimizes J with respect to one set of variables, J can only decrease or stay the same from one iteration to the next and never increases, which ensures that K-Means eventually reaches a minimum (a local minimum, not necessarily the global one). Once that minimum is reached, the repeated assignment and update steps have produced the cluster grouping implied by the initial clustering criterion.
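The two alternating steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `squared_distance` and `k_means`, the `max_iters` and `seed` parameters, and the choice of initial centroids by sampling from the data points are all assumptions introduced here, and squared Euclidean distance is assumed as the measure of closeness.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, max_iters=100, seed=0):
    """Alternate the assignment and update steps until assignments stop changing.

    Assumes max_iters >= 1 and k <= len(points).
    Returns (centroids, assignments, objective), where assignments[n] is the
    index of the cluster that points[n] belongs to and objective is J.
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None

    for _ in range(max_iters):
        # Assignment step: centroids fixed, each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so J cannot decrease any further
        assignments = new_assignments

        # Update step: assignments fixed, each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]

    # J: sum of squared distances from each point to its assigned centroid.
    objective = sum(
        squared_distance(p, centroids[a]) for p, a in zip(points, assignments)
    )
    return centroids, assignments, objective
```

Calling `k_means(points, 2)` on a list of coordinate tuples returns the final centroids, the cluster index of each point, and the value of J. Because the result depends on the initial centroids, running the sketch with several values of `seed` and keeping the run with the smallest J is a common precaution.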