1. DBSCAN

  • DBSCAN is a density-based algorithm
    • Density = number of points within a specified radius (Eps)
    • A point is a core point if it has more than a specified number of points (MinPts) within Eps
      • These are points that are at the interior of a cluster
    • A border point has fewer than MinPts within Eps, but is in the neighborhood of a core point
    • A noise point is any point that is not a core point or a border point
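The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_points` is made up for this note, distance is Euclidean, the point itself is not counted in its own neighborhood, and "more than MinPts" is read as a strict comparison. Some implementations instead count the point itself and use "at least MinPts".

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'border', or 'noise'."""
    # Eps-neighborhood of each point: indices of the OTHER points within eps
    # (assumption: the point itself is not counted).
    neighbors = [
        [j for j, q in enumerate(points) if j != i and dist(p, q) <= eps]
        for i, p in enumerate(points)
    ]
    # Core point: more than min_pts points within eps (strict, as in the notes).
    is_core = [len(nb) > min_pts for nb in neighbors]

    labels = []
    for i in range(len(points)):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            # Not core, but inside the neighborhood of a core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


# A plus-shaped group, one point hanging off an arm, and one far-away point.
pts = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (2, 0), (10, 10)]
print(classify_points(pts, eps=1.1, min_pts=3))
```

With these values only the center of the plus has more than 3 neighbors within Eps, so it is the single core point. The four arms are border points, and the remaining two points are noise because no core point lies within Eps of them.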
2. DBSCAN Algorithm
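A minimal sketch of how the clustering step is commonly described: start from an unassigned core point, give it a new cluster id, and keep expanding through the Eps-neighborhoods of core points. Border points join the first cluster that reaches them, and whatever is left is noise. The names `dbscan` and `NOISE` are illustrative, not taken from any library, and the sketch assumes Euclidean distance, a neighborhood that excludes the point itself, and a strict "more than MinPts" test. The brute-force neighbor search is O(n²), so it is meant for reading rather than for real datasets.

```python
from math import dist

NOISE = -1  # hypothetical marker for points that end up in no cluster


def dbscan(points, eps, min_pts):
    """Return one cluster id per point; NOISE marks noise points."""
    n = len(points)
    # Brute-force Eps-neighborhoods (the point itself is excluded).
    neighbors = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) > min_pts for nb in neighbors]

    labels = [NOISE] * n
    cluster_id = 0
    for i in range(n):
        # Only an unassigned core point can start a new cluster.
        if not is_core[i] or labels[i] != NOISE:
            continue
        labels[i] = cluster_id
        frontier = [i]  # core points whose neighborhoods still need expanding
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster_id
                    # Border points are labeled but never expanded.
                    if is_core[q]:
                        frontier.append(q)
        cluster_id += 1
    return labels


# Two plus-shaped groups and one isolated point.
plus = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
pts = plus + [(x + 10, y + 10) for x, y in plus] + [(50, 50)]
print(dbscan(pts, eps=1.1, min_pts=3))
```

In this toy input each plus forms its own cluster and the isolated point keeps the noise label.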

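3. Strengths vs. Weaknesses
  • Strengths
    • Resistant to noise
    • Can handle clusters of different shapes and sizes
  • Weaknesses
    • Struggles when the dataset has clusters of varying densities, since a single Eps and MinPts cannot fit all of them
    • Struggles with high-dimensional data, where distance-based density becomes less meaningful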
4. Determining Eps and MinPts
  • The idea is that, for points in a cluster, the k-th nearest neighbors are at roughly the same distance
  • Noise points have their k-th nearest neighbor at a farther distance
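The observation above leads to the usual k-distance heuristic, sketched here under a few assumptions: `k_distances` is a name made up for this note, distance is Euclidean, and the search is brute force. The function computes each point's distance to its k-th nearest neighbor and returns those distances sorted in increasing order.

```python
from math import dist


def k_distances(points, k):
    """Sorted distance from every point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to all other points, nearest first.
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(d[k - 1])
    return sorted(result)


# A tight square of four points plus one far-away point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(k_distances(pts, k=2))
```

In this toy input the four clustered points share the same small 2nd-neighbor distance, while the far-away point's value is much larger. On real data, plotting the sorted values and looking for the point where the curve rises sharply suggests a candidate Eps, with k taken as MinPts.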
