当前位置:Gxlcms > 数据库问题 > Intro to DBSCAN

Intro to DBSCAN

时间:2021-07-01 10:21:17 帮助过:3人阅读

DBSCAN

  • Density-Based Spatial Clustering of Application with Noise
  • It can discover cluster of arbitrary shape

  • A cluster is defined as a maximal set of density-connected points

  • Two parameters

    1. Eps: Maximun radius of the neighbourhood
    2. MinPts: Minimum number of points in the Eps-Neighbourhood of a point.
  • Suppose we have a point q, with the pre-determined parameters. If the number of neighbourhood within the Eps技术分享 is larger than the value of MinPts, we say this point is a core.

  • Three types of points

    1. Core point: dense neighborhood
    2. Border point: in the cluster, but neighbourhood is not dense, or can be reached by other cluster
    3. Noise/Outlier: not in a cluster and also cannot be reached by other cluster.
  • Directly density-reachable: A point p is directly density-reachable from q if:

    1. p belongs to 技术分享
    2. q itself is a core point: 技术分享
  • Density-reachable

    A point p is density-reachable from a point q if there is a chain of points p1,...pn, s.t p1=q, pn=p and pi+1 is directly density-reachable from pi

  • Density-connected

    A point is density-connected to a point q if there is a point o such that both p and q are density-reachable from o. Even if both p and q can be a border, they could be in the same cluster as long as there is a point o that it is density-reachable to p and q.

Algorithm

  1. Arbitrarily select a point p.
  2. Retrieve all points density-reachable from p under the constrain of Eps and MinPts.

    1. if p is a core point, a cluster is formed that the border is also found.
    2. if p is a border, no points are density-reachable from p. Then p is a noise or outlier, DBSCAN just skips to the next point.

  3. Continue the process until all the points have been processed.

But DBSCAN is sensitive to the setting of Eps and MinPts.

Intro to DBSCAN

标签:

人气教程排行