# Clustering Measurements

1. Measures of Cluster Validity
• Numerical measures used to judge various aspects of cluster validity are classified into the following three types:
• External Index: measures the extent to which cluster labels match externally supplied class labels
  • Example: entropy
• Internal Index: measures the goodness of a clustering structure without reference to external information
  • Example: sum of squared error (SSE)
• Relative Index: compares two different clusterings or clusters
  • Often an external or internal index is used for this purpose, e.g., SSE or entropy
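
The two example indices above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sse` and `entropy` are hypothetical, SSE is assumed to use Euclidean distance to each cluster's centroid, and entropy is assumed to be the size-weighted average of the per-cluster class entropy (in bits).

```python
import math
from collections import Counter, defaultdict

def sse(points, labels):
    """Internal index: sum of squared Euclidean distances to cluster centroids."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    total = 0.0
    for members in clusters.values():
        # Centroid = coordinate-wise mean of the cluster's points.
        centroid = [sum(coords) / len(members) for coords in zip(*members)]
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total

def entropy(labels, classes):
    """External index: size-weighted average class entropy per cluster (bits)."""
    clusters = defaultdict(list)
    for label, cls in zip(labels, classes):
        clusters[label].append(cls)
    n = len(labels)
    total = 0.0
    for members in clusters.values():
        probs = [count / len(members) for count in Counter(members).values()]
        cluster_entropy = -sum(p * math.log2(p) for p in probs)
        total += len(members) / n * cluster_entropy
    return total

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
print(sse(points, labels))
print(entropy(labels, ["a", "a", "b", "b"]))
print(entropy(labels, ["a", "b", "a", "b"]))
```

Lower values are better for both indices: a small SSE means compact clusters, and an entropy near zero means each cluster contains mostly one class. Used as a relative index, either function can be evaluated on two candidate clusterings of the same data and the results compared.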
2. Measuring Cluster Validity via Correlation

• Two matrices are used:
  • Proximity Matrix
  • Incidence Matrix
    • One row and one column for each data point
    • An entry is 1 if the associated pair of points belongs to the same cluster, otherwise 0
• Compute the correlation between the two matrices
  • Since both matrices are symmetric, only the correlation between the n(n-1)/2 off-diagonal entries on one side of the diagonal needs to be calculated
• High correlation indicates that points that belong to the same cluster are close to each other
• Not a good measure for some density-based or contiguity-based clusters
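
The procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `validity_correlation` is hypothetical, the proximity matrix is assumed to hold Euclidean distances, and the correlation is assumed to be Pearson's. Only the n(n-1)/2 pairs above the diagonal are visited. Note that when proximity is expressed as a distance rather than a similarity, a good clustering shows up as a strongly negative correlation, because same-cluster pairs (incidence 1) have small distances.

```python
import math
from itertools import combinations

def validity_correlation(points, labels):
    """Pearson correlation between pairwise distances and same-cluster flags."""
    # Upper-triangle index pairs: exactly n(n-1)/2 of them.
    pairs = list(combinations(range(len(points)), 2))
    # Proximity entries (assumed here to be Euclidean distances).
    dist = [math.dist(points[i], points[j]) for i, j in pairs]
    # Incidence entries: 1 if both points share a cluster, else 0.
    same = [1.0 if labels[i] == labels[j] else 0.0 for i, j in pairs]

    mean_d = sum(dist) / len(dist)
    mean_s = sum(same) / len(same)
    cov = sum((d - mean_d) * (s - mean_s) for d, s in zip(dist, same))
    var_d = sum((d - mean_d) ** 2 for d in dist)
    var_s = sum((s - mean_s) ** 2 for s in same)
    # Undefined (division by zero) if all distances or all flags are equal,
    # e.g. when every point is placed in a single cluster.
    return cov / math.sqrt(var_d * var_s)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(validity_correlation(points, [0, 0, 1, 1]))  # clusters match the two groups
print(validity_correlation(points, [0, 1, 0, 1]))  # clusters cut across the groups
```

On this toy data the first labeling gives a correlation close to -1, while the second gives a positive value, which reflects that its same-cluster pairs are far apart.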