Here’s a list of 100 technical facts about clustering:
- Clustering is an unsupervised machine learning technique for grouping similar data points.
- K-means clustering partitions data into k clusters by assigning each data point to the cluster with the nearest mean.
- Hierarchical clustering builds a tree of clusters by recursively merging or splitting them.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise) identifies clusters as dense regions of points and labels points in sparse regions as noise.
- Agglomerative clustering uses a bottom-up approach, starting with individual data points as clusters.
- Divisive clustering employs a top-down approach, starting with all data points in one cluster.
- The term “centroid” refers to the center point of a cluster in K-means clustering, computed as the mean of the points assigned to it.
- Dendrograms are tree-like diagrams representing hierarchical clustering relationships.
- The silhouette score is a metric for evaluating the cohesion and separation of clusters; it ranges from -1 to 1, with higher values indicating better-defined clusters.
- The Davies-Bouldin Index is another metric for evaluating clustering performance; lower values indicate better-separated clusters.
- The curse of dimensionality can affect clustering algorithms in high-dimensional spaces, where distances between points become less informative.
- Euclidean distance is commonly used in distance-based clustering algorithms.
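
To make a few of the facts above concrete (K-means, centroids, Euclidean distance, and the silhouette score), here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not a production implementation: the function names `kmeans` and `silhouette_score`, the toy `points` data, and the choice to seed centroids from the first k points are all assumptions made for this example. Real libraries typically use smarter seeding and vectorized math. It needs Python 3.8+ for `math.dist`.

```python
import math

def kmeans(points, k, n_iter=100):
    """Sketch of the standard K-means loop; returns (centroids, labels)."""
    # Assumption: seed centroids with the first k points (kept simple and deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins the nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

def silhouette_score(points, labels):
    """Mean silhouette over all points; singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster (cohesion).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster (separation).
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy data (hypothetical): two well-separated groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
score = silhouette_score(points, labels)
print(labels, centroids, round(score, 3))
```

On this toy data the two groups are far apart, so the silhouette score should come out close to 1. With overlapping groups or a poorly chosen k it would drop toward 0 or below.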