
Fast and exact fixed-radius neighbor search based on sorting

Published: 2024-03-29 Volume: 10 Page: e1929
ISSN: 2376-5992
Container-title: PeerJ Computer Science
Language: en

Author:

Chen Xinye¹, Güttel Stefan²

Affiliation:

1. Charles University Prague, Prague, Czech Republic

2. University of Manchester, Manchester, United Kingdom

Abstract

Fixed-radius near neighbor search is a fundamental data operation that retrieves all data points within a user-specified distance to a query point. There are efficient algorithms that can provide fast approximate query responses, but they often have a very compute-intensive indexing phase and require careful parameter tuning. Therefore, exact brute force and tree-based search methods are still widely used. Here we propose a new fixed-radius near neighbor search method, called SNN, that significantly improves over brute force and tree-based methods in terms of index and query time, provably returns exact results, and requires no parameter tuning. SNN exploits a sorting of the data points by their first principal component to prune the query search space. Further speedup is gained from an efficient implementation using high-level basic linear algebra subprograms (BLAS). We provide theoretical analysis of our method and demonstrate its practical performance when used stand-alone and when applied within the DBSCAN clustering algorithm.
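The pruning idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' SNN implementation: the function names `build_index` and `query` and the dictionary layout are hypothetical, chosen only for this illustration. The sketch assumes Euclidean distance and relies on the fact that the projection of x - q onto a unit vector can never exceed the distance between x and q, so any point within the radius must have a first-principal-component score within that radius of the query's score. Distances to the surviving candidates are then evaluated with a single matrix-vector product, which NumPy typically dispatches to BLAS.

```python
import numpy as np


def build_index(X):
    """Sort the centered data by its first principal component score.

    Hypothetical helper for illustration; not the SNN library API.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # The first right singular vector of the centered data is the
    # first principal direction (a unit vector).
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    v1 = vt[0]
    scores = Xc @ v1
    order = np.argsort(scores)
    Xs = Xc[order]
    return {
        "mu": mu,
        "v1": v1,
        "order": order,            # maps sorted position -> original row
        "scores": scores[order],   # ascending principal component scores
        "X": Xs,                   # centered data in sorted order
        "sqnorm": np.einsum("ij,ij->i", Xs, Xs),  # squared row norms
    }


def query(index, q, radius):
    """Return the original indices of all points within `radius` of `q`."""
    qc = q - index["mu"]
    s = qc @ index["v1"]
    # Only points whose score lies in [s - radius, s + radius] can be
    # within `radius` of q; binary search finds that window.
    lo = np.searchsorted(index["scores"], s - radius, side="left")
    hi = np.searchsorted(index["scores"], s + radius, side="right")
    cand = index["X"][lo:hi]
    # ||x - q||^2 = ||x||^2 - 2 x.q + ||q||^2, using one matrix-vector product.
    d2 = index["sqnorm"][lo:hi] - 2.0 * (cand @ qc) + qc @ qc
    keep = d2 <= radius ** 2
    return np.sort(index["order"][lo:hi][keep])
```

Because the score window is a necessary condition for a point to lie within the radius, the filter discards no true neighbor, so the result matches a brute-force scan up to floating-point rounding at the boundary. How much work is saved depends on how much of the data's variance the first principal component captures.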

Funder

Royal Society Industry Fellowship

Publisher

PeerJ

Link

https://peerj.com/articles/cs-1929.pdf
