python - sklearn.cluster.DBSCAN gives unexpected result
I'm using the DBSCAN method for clustering images, but it gives an unexpected result. Let's assume I have 10 images.
First, I read the images in a loop using cv2.imread. Then I compute the structural similarity index between each pair of images. After that, I have a matrix like this:
[[ 1. -0.00893619 0. 0. 0. 0.50148778 0.47921832 0. 0. 0. ]
 [-0.00893619 1. 0. 0. 0. 0.00996088 -0.01873205 0. 0. 0. ]
 [ 0. 0. 1. 0.57884212 0. 0. 0. 0. 0. 0. ]
 [ 0. 0. 0.57884212 1. 0. 0. 0. 0. 0. 0. ]
 [ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. ]
 [ 0.50148778 0.00996088 0. 0. 0. 1. 0.63224396 0. 0. 0. ]
 [ 0.47921832 -0.01873205 0. 0. 0. 0.63224396 1. 0. 0. 0. ]
 [ 0. 0. 0. 0. 0. 0. 0. 1. 0.77507487 0.69697053]
 [ 0. 0. 0. 0. 0. 0. 0. 0.77507487 1. 0.74861881]
 [ 0. 0. 0. 0. 0. 0. 0. 0.69697053 0.74861881 1. ]]
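It looks good. Then I have a simple invocation of DBSCAN:
from sklearn.cluster import DBSCAN
# distances is the matrix shown above
db = DBSCAN(eps=0.4, min_samples=3, metric='precomputed').fit(distances)
labels = db.labels_
# number of clusters, not counting the noise label -1
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)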
And the result is:
[0 0 0 0 0 0 0 0 0 0]
What am I doing wrong? Why does it put all the images into one cluster?
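DBSCAN assumes a dissimilarity (distance), not a similarity. With metric='precomputed', every 0 in your matrix is read as a distance of 0, so almost every pair of images falls within eps=0.4 and everything ends up in one cluster. DBSCAN can also be implemented with a similarity threshold (see Generalized DBSCAN).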
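One simple way to apply this with scikit-learn is to convert the similarity matrix into a distance matrix before fitting. The sketch below is only an illustration under a few assumptions: it uses distance = 1 - similarity (one common choice, not the only one), the helper name cluster_from_similarity is made up for this example, and eps=0.6 assumes you meant "similarity of at least 0.4" by your original threshold.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Similarity matrix copied from the question (SSIM values, 1.0 on the diagonal).
similarity = np.array([
    [1.0, -0.00893619, 0.0, 0.0, 0.0, 0.50148778, 0.47921832, 0.0, 0.0, 0.0],
    [-0.00893619, 1.0, 0.0, 0.0, 0.0, 0.00996088, -0.01873205, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.57884212, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.57884212, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.50148778, 0.00996088, 0.0, 0.0, 0.0, 1.0, 0.63224396, 0.0, 0.0, 0.0],
    [0.47921832, -0.01873205, 0.0, 0.0, 0.0, 0.63224396, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.77507487, 0.69697053],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.77507487, 1.0, 0.74861881],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.69697053, 0.74861881, 1.0],
])


def cluster_from_similarity(sim, eps, min_samples=3):
    """Hypothetical helper: turn similarities into distances, then run DBSCAN."""
    # Assumption: distance = 1 - similarity, so identical images get distance 0.
    dist = 1.0 - np.asarray(sim, dtype=float)
    # Keep the matrix non-negative and the diagonal exactly zero.
    dist = np.clip(dist, 0.0, None)
    np.fill_diagonal(dist, 0.0)
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed")
    return model.fit(dist).labels_


# eps is now a distance threshold: 0.6 corresponds to similarity >= 0.4.
labels = cluster_from_similarity(similarity, eps=0.6)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(labels, n_clusters)
```

With these settings, images 0, 5, 6 land in one cluster and images 7, 8, 9 in another, while the rest are labelled as noise (-1), because min_samples=3 counts the point itself and a pair such as images 2 and 3 is too small to form a cluster. If you keep eps=0.4 as a distance threshold instead, only images 7, 8, 9 are close enough to cluster.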