Cite

[1] K. Beyer, J. Goldstein, R. Ramakrishnan, e U. Shaft, «When Is “Nearest Neighbor” Meaningful?», in Database Theory — ICDT’99, C. Beeri e P. Buneman, A c. di, Berlin, Heidelberg: Springer, 1999, pp. 217–235. doi: 10.1007/3-540-49257-7_15.

Synthesis

Contribution::

Strong Points::

Weak Points::

Related::

Metadata

FirstAuthor:: Beyer, Kevin
Author:: Goldstein, Jonathan
Author:: Ramakrishnan, Raghu
Author:: Shaft, Uri
~> FirstEditor:: Beeri, Catriel
Editor:: Buneman, Peter
~
Title:: When Is “Nearest Neighbor” Meaningful?
Year:: 1999
Citekey:: beyerWhenNearestNeighbor1999
itemType:: conferencePaper
Publisher:: Springer
Location:: Berlin, Heidelberg
Pages:: 217-235
DOI:: 10.1007/3-540-49257-7_15
ISBN:: 978-3-540-49257-3

LINK

Beyer et al. - 1999 - When Is “Nearest Neighbor” Meaningful.pdf.

Abstract

We explore the effect of dimensionality on the “nearest neighbor” problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10–15 dimensions. .

Notes

$lim_{m \to \infty} v a r (\frac{( d _{m} ( P _{m, 1} , Q _{m} ) ) ^{p}}{E [( d _{m} ( P _{m, 1} , Q _{m} ) ) ^{p} ]}) = 0$

then for every $ϵ > 0$ :

$lim_{m \to \infty} P [D ma x_{m} \leq (1 + ϵ) D mi n_{m}] = 1$

Basically, if the condition holds (and it holds quite frequently) all points converge to the same distance to the query point

The condition holds for:

iid dimensions with query and data independence
unique dimensions with correlation between all dimensions
variance converging to 0
marginal data and query distributions change with dimensionality

do not hold for:

identical dimensions with no independence ⇒ points fall on a line.

🌱 Enrico's Digital Garden

Explorer

@beyerWhenNearestNeighbor1999

Notes

Annotations

Graph View

Table of Contents

Backlinks