Scikit-learn: Examples using precomputed distance matrix for clustering

Created on 11 Dec 2015  ·  4 Comments  ·  Source: scikit-learn/scikit-learn

Hi, I want to use clustering methods with a precomputed distance matrix (NxN). I found that DBSCAN has a "metric" parameter, but I can't find any examples to follow. Can you please help? Examples for other clustering methods would also be very helpful. Thanks.

All 4 comments

What part are you unclear about?
Use metric="precomputed" and pass a precomputed distance matrix as X. Note that DBSCAN expects distances (dissimilarities) here, not similarities.
See http://scikit-learn.org/dev/auto_examples/manifold/plot_mds.html for an example using MDS.
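
For DBSCAN specifically, a minimal sketch of the above might look like the following. The toy points, `eps`, and `min_samples` are made-up values for illustration, and Euclidean distances are used only to build some N x N matrix; any symmetric, non-negative distance matrix you already have could be passed the same way.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

# Toy data (illustrative only): two tight groups plus one far-away point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [20.0, 20.0],
])

# Build an N x N distance matrix. In practice this would be the
# matrix you computed yourself with whatever metric you need.
D = pairwise_distances(points, metric="euclidean")

# With metric="precomputed", X is interpreted as pairwise distances,
# so eps is a threshold on the entries of D.
# eps and min_samples are arbitrary choices for this toy data.
db = DBSCAN(eps=0.5, min_samples=2, metric="precomputed")
labels = db.fit_predict(D)

# Points labelled -1 are treated as noise by DBSCAN.
print(labels)
```

If what you have is a similarity matrix instead, it would need to be converted to distances first (for example `1 - similarity` for similarities in [0, 1]); which conversion is appropriate depends on your data.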

No, that is not algorithmically feasible for k-means.

Can we change the type of distance to cosine similarity? Thanks!

No. We have a PR in the works for k-medoids, a related algorithm
that can take an arbitrary distance metric. Try it out: #7694. K-means
needs to repeatedly compute the Euclidean distance from each point to an
arbitrary vector (the cluster mean), and it requires the mean to be
meaningful; it cannot work with a metric of your choice.
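
To illustrate why a medoid-based method can work from a distance matrix alone, here is a rough NumPy sketch of a naive alternating k-medoids loop. This is not the implementation from the PR mentioned above; `k_medoids` is a hypothetical helper written for this comment, and the toy vectors are made up. Because each cluster centre is always one of the input points, only entries of the precomputed matrix are ever needed, so a cosine distance matrix works fine.

```python
import numpy as np

def k_medoids(D, k, n_iter=100, seed=0):
    """Naive alternating k-medoids on a precomputed N x N distance matrix.

    Hypothetical helper for illustration; not a scikit-learn API.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # Update step: the new medoid is the member with the
            # smallest total distance to the other members.
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # medoids stopped changing
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Toy vectors (illustrative only): three point roughly along the
# x-axis, three roughly along the y-axis.
X = np.array([[1.0, 0.0], [2.0, 0.1], [0.9, 0.05],
              [0.0, 1.0], [0.1, 2.0], [0.05, 0.9]])

# Cosine distance = 1 - cosine similarity, computed by hand.
unit = X / np.linalg.norm(X, axis=1, keepdims=True)
D = np.clip(1.0 - unit @ unit.T, 0.0, None)

medoids, labels = k_medoids(D, k=2)
print(medoids, labels)
```

Note that nothing in the loop computes a mean: the "centre" of a cluster is always an existing row of `D`, which is exactly what k-means cannot do.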
