Cite
[1] E. M. Mirkes, J. Allohibi, and A. N. Gorban, "Fractional norms and quasinorms do not help to overcome the curse of dimensionality," Entropy, vol. 22, no. 10, p. 1105, Sep. 2020, doi: 10.3390/e22101105.
Synthesis
Contribution::
Strong Points::
Weak Points::
Related::
Metadata
FirstAuthor:: Mirkes, Evgeny M.
Author:: Allohibi, Jeza
Author:: Gorban, Alexander N.
~
Title:: Fractional norms and quasinorms do not help to overcome the curse of dimensionality
Year:: 2020
Citekey:: mirkesFractionalNormsQuasinorms2020
itemType:: journalArticle
Journal:: Entropy
Volume:: 22
Issue:: 10
Pages:: 1105
DOI:: 10.3390/e22101105
LINK
Abstract
The curse of dimensionality causes the well-known and widely discussed problems for machine learning methods. There is a hypothesis that using of the Manhattan distance and even fractional quasinorms lp (for p less than 1) can help to overcome the curse of dimensionality in classification problems. In this study, we systematically test this hypothesis. We confirm that fractional quasinorms have a greater relative contrast or coefficient of variation than the Euclidean norm l2, but we also demonstrate that the distance concentration shows qualitatively the same behaviour for all tested norms and quasinorms and the difference between them decays as dimension tends to infinity. Estimation of classification quality for kNN based on different norms and quasinorms shows that a greater relative contrast does not mean better classifier performance and the worst performance for different databases was shown by different norms (quasinorms). A systematic comparison shows that the difference of the performance of kNN based on lp for p=2, 1, and 0.5 is statistically insignificant.
Notes
The $l_p$ functional in $d$-dimensional vector space is defined as $\|x\|_p = \left( \sum_{i=1}^{d} |x_i|^p \right)^{1/p}$:
- norm for $p \geq 1$
- quasinorm for $0 < p < 1$ (violates the triangle inequality; see the sketch below)
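A minimal numpy sketch (not the authors' code) of this functional; `lp_functional` is a hypothetical helper name used only here. The last lines illustrate the triangle-inequality violation for p = 0.5 with unit basis vectors.

```python
# Minimal sketch of the l_p functional: a norm for p >= 1, a quasinorm for 0 < p < 1.
import numpy as np

def lp_functional(x, p):
    """(sum_i |x_i|^p)^(1/p) for a vector x and p > 0."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
print(lp_functional(x, 2.0))   # 5.0    (Euclidean norm)
print(lp_functional(x, 1.0))   # 7.0    (Manhattan norm)
print(lp_functional(x, 0.5))   # ~13.93 (quasinorm, p < 1)

# For p < 1 the triangle inequality fails, e.g. for the unit basis vectors:
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(lp_functional(e1 + e2, 0.5))                      # 4.0
print(lp_functional(e1, 0.5) + lp_functional(e2, 0.5))  # 2.0, i.e. ||e1+e2|| > ||e1|| + ||e2||
```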
Using quasinorms was proposed in (Aggarwal et al., 2001) because they can compensate for the distance concentration.
Problems: computation time, and algorithms that assume the triangle inequality cannot use quasinorms.
This paper demonstrated:
- small p does not prevent distance concentration;
- smaller distance concentration does not mean better accuracy (see the empirical sketch below).
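A minimal empirical sketch of the distance-concentration claim, under assumed settings (i.i.d. uniform data, distances measured from the origin, relative contrast defined as (D_max - D_min)/D_min); this is an illustration, not the paper's experimental protocol. Smaller p yields a larger relative contrast, but for every p the contrast decays as the dimension grows.

```python
# Relative contrast of l_p distances for several p and dimensions:
# all values shrink as d grows; p = 0.5 stays larger but shows the same decay.
import numpy as np

rng = np.random.default_rng(0)

def lp_dist(X, p):
    # row-wise distance from the origin under the l_p functional
    return np.sum(np.abs(X) ** p, axis=1) ** (1.0 / p)

n = 1000
for d in (3, 10, 100, 1000):
    X = rng.uniform(size=(n, d))
    row = []
    for p in (0.5, 1.0, 2.0):
        dist = lp_dist(X, p)
        rc = (dist.max() - dist.min()) / dist.min()  # relative contrast
        row.append(f"p={p}: RC={rc:.3f}")
    print(f"d={d:5d}  " + "  ".join(row))
```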