compute_dist
hyppo.tools.compute_dist(x, y, metric='euclidean', workers=1, **kwargs)
Distance matrices for the inputs.
- Parameters
x, y (ndarray) -- Input data matrices. x and y must have the same number of samples. That is, the shapes must be (n, p) and (n, q), where n is the number of samples and p and q are the number of dimensions. Alternatively, x and y can be distance matrices, where the shapes must both be (n, n).
metric (str, callable, or None, default: "euclidean") -- A function that computes the distance among the samples within each data matrix. Valid strings for metric are, as defined in sklearn.metrics.pairwise_distances:
From scikit-learn: ['cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan']
From scipy.spatial.distance: ['braycurtis', 'canberra', 'chebyshev', 'correlation', 'dice', 'hamming', 'jaccard', 'kulsinski', 'mahalanobis', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule'] See the documentation for scipy.spatial.distance for details on these metrics.
Set to None or 'precomputed' if x and y are already distance matrices. To call a custom function, either create the distance matrix beforehand or create a function of the form metric(x, **kwargs), where x is the data matrix for which pairwise distances are calculated and **kwargs are extra arguments to send to your custom function.
workers (int, default: 1) -- The number of cores to parallelize the distance computation over. Supply -1 to use all cores available to the Process.
**kwargs -- Arbitrary keyword arguments provided to sklearn.metrics.pairwise_distances or a custom distance function.
- Returns
distx, disty (ndarray) -- Distance matrices based on the metric provided by the user.
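
- Example
The shape contract and the custom-metric form metric(x, **kwargs) described above can be sketched as follows. This is a minimal illustration that assumes only NumPy and is not hyppo's implementation: euclidean_metric and sketch_compute_dist are hypothetical names introduced here, and returning the inputs unchanged for None or 'precomputed' is an assumption based on the parameter description. With hyppo installed, the documented call would be compute_dist(x, y, metric=...) imported from hyppo.tools.

```python
import numpy as np


def euclidean_metric(x, **kwargs):
    """Hypothetical custom metric of the form metric(x, **kwargs).

    Takes an (n, p) data matrix and returns the (n, n) matrix of
    pairwise Euclidean distances between its rows.
    """
    diff = x[:, None, :] - x[None, :, :]  # shape (n, n, p)
    return np.sqrt((diff ** 2).sum(axis=-1))


def sketch_compute_dist(x, y, metric=euclidean_metric, **kwargs):
    """Illustrative stand-in for the documented behavior, not hyppo's code."""
    if metric is None or metric == "precomputed":
        # Assumption: inputs are already (n, n) distance matrices,
        # so they are returned unchanged.
        return x, y
    if not callable(metric):
        raise ValueError("this sketch handles only callables, None, or 'precomputed'")
    # Apply the same metric to each data matrix separately.
    return metric(x, **kwargs), metric(y, **kwargs)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))  # n = 5 samples, p = 3 dimensions
y = rng.normal(size=(5, 2))  # n = 5 samples, q = 2 dimensions

distx, disty = sketch_compute_dist(x, y)
print(distx.shape, disty.shape)  # (5, 5) (5, 5)
```

Both outputs are (n, n) even though x and y have different numbers of dimensions, because each matrix holds the distances among the n samples of a single input.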