1. Measures of Cluster Validity
- Numerical measures applied to judge various aspects of cluster validity are classified into the following three types:
- External Index: Used to measure the extent to which cluster labels match externally supplied class labels
- entropy
- Internal Index: Used to measure the goodness of a clustering structure without reference to external information
- sum of squared error (SSE)
- Relative Index: Used to compare two different clusterings or clusters
- often an external or internal index is used for this function, e.g., SSE or entropy
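The two example indices named above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names `cluster_entropy` and `sse` are hypothetical, entropy is assumed to be the size-weighted average of per-cluster class entropy (base 2), and SSE is assumed to be measured against each cluster's centroid with squared Euclidean distance.

```python
import numpy as np

def cluster_entropy(cluster_labels, class_labels):
    """External index: size-weighted average entropy of the class labels
    found inside each cluster (0 means every cluster is pure)."""
    cluster_labels = np.asarray(cluster_labels)
    class_labels = np.asarray(class_labels)
    n = len(cluster_labels)
    total = 0.0
    for c in np.unique(cluster_labels):
        members = class_labels[cluster_labels == c]
        _, counts = np.unique(members, return_counts=True)
        p = counts / counts.sum()          # class distribution in cluster c
        h = -np.sum(p * np.log2(p))        # entropy of that distribution
        total += (len(members) / n) * h    # weight by cluster size
    return total

def sse(points, cluster_labels):
    """Internal index: sum of squared Euclidean distances from each point
    to the centroid of its own cluster (assumed definition)."""
    points = np.asarray(points, dtype=float)
    cluster_labels = np.asarray(cluster_labels)
    total = 0.0
    for c in np.unique(cluster_labels):
        members = points[cluster_labels == c]
        centroid = members.mean(axis=0)
        total += np.sum((members - centroid) ** 2)
    return total

# Toy data (made up for illustration): two well-separated groups.
points = [[0, 0], [0, 2], [10, 0], [10, 2]]
clusters = [0, 0, 1, 1]
print(cluster_entropy(clusters, ["a", "a", "b", "b"]))  # pure clusters
print(sse(points, clusters))
```

Used as a relative index, either function would simply be evaluated on two candidate clusterings of the same data and the values compared; lower is better for both.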
2. Measuring Cluster Validity via Correlation
- Two matrices
- Proximity Matrix
- Incidence Matrix
- one row and one column for each data point
- an entry is 1 if the associated pair of points belong to the same cluster, else 0
- Compute the correlation between the two matrices
- since the matrices are symmetric, only the correlation between n(n-1)/2 entries needs to be calculated
- High correlation indicates that points that belong to the same cluster are close to each other
- Not a good measure for some density-based clusters
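The procedure in this section can be sketched as follows. The helper names (`incidence_matrix`, `similarity_matrix`, `validity_correlation`) are hypothetical, and the proximity matrix is assumed to be a similarity obtained by rescaling Euclidean distances to [0, 1]; the notes do not say which proximity measure is intended.

```python
import numpy as np

def incidence_matrix(labels):
    """Entry (i, j) is 1 if points i and j share a cluster, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def similarity_matrix(points):
    """Assumed proximity: 1 - (Euclidean distance / largest distance).
    Requires at least two distinct points so the maximum is nonzero."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return 1.0 - dist / dist.max()

def validity_correlation(points, labels):
    """Pearson correlation between the n(n-1)/2 entries above the diagonal
    of the proximity matrix and of the incidence matrix."""
    prox = similarity_matrix(points)
    inc = incidence_matrix(labels)
    upper = np.triu_indices(len(inc), k=1)   # skip diagonal and lower half
    return np.corrcoef(prox[upper], inc[upper])[0, 1]

# Toy data (made up for illustration): two tight, well-separated groups.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
print(validity_correlation(points, [0, 0, 1, 1]))  # labels match the groups
print(validity_correlation(points, [0, 1, 0, 1]))  # labels cut across them
```

If raw distances are used as the proximity matrix instead of similarities, the sign flips: a good clustering then shows a strongly negative correlation. The result is undefined (NaN) when every point is in one cluster, because the incidence entries are then constant.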