Approximate nearest neighbor: towards removing the curse of dimensionality (Q2913814)

In the nearest neighbor problem we are given a set of points \(P\) in a metric space and the goal is to preprocess \(P\) so that queries asking for a point in \(P\) closest to a query point \(q\) can be answered efficiently. This is a fundamental problem in computational geometry and it has been extensively studied. There are many algorithms for the nearest neighbor problem in the \(d\)-dimensional Euclidean space, however, for all these algorithms their space and time complexities grow exponentially in the dimension (this is the so-called ``curse of dimensionality'').NEWLINENEWLINEOne way to try to avoid the exponential dependency of time and space complexity on the dimension is by considering a relaxed version of the problem where the goal is not to produce the closest point \(p\) to a given query point \(q\), but to produce an approximate nearest point to \(q\): A point at distance at most a factor \((1+\epsilon)\) times the distance between \(p\) and \(q\), where \(\epsilon > 0\). This paper presents several results for the approximate nearest neighbor problem under the \(\ell_1\) norm:NEWLINENEWLINE- A deterministic data structure that allows to answer an approximate nearest neighbor query in time \(O(d \log n)\), where \(n\) is the number of points in the input data set \(P\). The space and time needed to preprocess the data structure is \(O(n \log^2 n)\times O(1/\epsilon)^d\).NEWLINENEWLINE- A randomized data structure with query time polynomial in \(d\), \(\log n\), and \(1/\epsilon\), that requires space and preprocessing time bounded by \(n^{O(\log(1/\epsilon)/\epsilon^2)}\).NEWLINENEWLINE- A randomized data structure with query time \(O(dn^{1/(1+\epsilon)}\log^2n \log\log n)\) that requires space \(O(dn+n^{1+1/(1+\epsilon)}\log^2 n \log \log n)\).NEWLINENEWLINEThese results can be extended to the \(\ell_2\) norm with a slight increase in the query time. The paper also points out that one of their algorithms can be used to design a low complexity approximate Voronoi decomposition. Finally, the authors present an algorithm for computing an approximate geometric minimum spanning tree of a given set of \(n\) points in time \(O(dn^{1+1/(1+\epsilon)}\log ^3 n)\).

0 references

reviewed by

Roberto Solis-Oba

0 references