Affiliation:
1. Computer & Information Science Department, University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA
Abstract
An ε-approximate quantile summary of a sequence of N elements is a data structure that can answer quantile queries about the sequence to within a precision of εN.
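To make the precision guarantee concrete, the following is a minimal sketch of a brute-force check that an answer to a quantile query is within εN of the target rank. The function name `is_eps_approximate`, the 1-based rank convention, and the use of φN as the target rank are illustrative assumptions; none of them are taken from the paper.

```python
from bisect import bisect_left, bisect_right


def is_eps_approximate(data, phi, answer, eps):
    """Check whether `answer` is an acceptable reply to a phi-quantile query.

    Assumption: ranks are 1-based and the target rank is phi * N.
    The answer is accepted if at least one of the ranks it occupies in the
    sorted data lies within eps * N of the target rank.
    """
    ordered = sorted(data)
    n = len(ordered)
    target = phi * n

    # Ranks (1-based) occupied by `answer` in sorted order: lo..hi.
    lo = bisect_left(ordered, answer) + 1
    hi = bisect_right(ordered, answer)
    if hi < lo:
        return False  # `answer` does not occur in the data at all

    # Some rank r in [lo, hi] satisfies |r - target| <= eps * n.
    return lo - eps * n <= target <= hi + eps * n


data = list(range(1, 101))  # N = 100, the value v has rank v
print(is_eps_approximate(data, 0.5, 55, 0.1))  # rank 55, target 50, slack 10
print(is_eps_approximate(data, 0.5, 75, 0.1))  # rank 75 is too far from 50
```

This check sorts the full input, so it is only useful for validating a summary on small test data; the point of a quantile summary is to avoid storing the whole sequence.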
We present a new online algorithm for computing ε-approximate quantile summaries of very large data sequences. The algorithm has a worst-case space requirement of O((1/ε) log(εN)). This improves upon the previous best result of O((1/ε) log²(εN)). Moreover, in contrast to earlier deterministic algorithms, our algorithm does not require a priori knowledge of the length of the input sequence.
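As a rough illustration of the gap between the two worst-case bounds, the sketch below evaluates both expressions for sample values of ε and N. The function names are hypothetical, the constants hidden by the O-notation are ignored, and base-2 logarithms are an assumption, so the numbers indicate relative growth only and are not actual memory usage.

```python
import math


def new_bound(eps, n):
    """(1/eps) * log(eps * N), ignoring the constant hidden by O(.)."""
    return (1 / eps) * math.log2(eps * n)


def previous_bound(eps, n):
    """(1/eps) * log^2(eps * N), ignoring the constant hidden by O(.)."""
    return (1 / eps) * math.log2(eps * n) ** 2


eps = 0.01
for n in (10**6, 10**9, 10**12):
    ratio = previous_bound(eps, n) / new_bound(eps, n)
    print(f"N={n:.0e}  new={new_bound(eps, n):.0f}  "
          f"previous={previous_bound(eps, n):.0f}  ratio={ratio:.1f}")
```

Under these assumptions the ratio between the two expressions is exactly log(εN), so the improvement grows as the sequence gets longer.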
Finally, the actual space bounds obtained on experimental data are significantly better than the worst-case guarantees of our algorithm as well as the observed space requirements of earlier algorithms.
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software
Cited by
130 articles.
1. Learning to Rank for Non Independent and Identically Distributed Datasets;Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval;2024-08-02
2. Robust and Memoryless Median Estimation for Real-Time Spike Detection;2024-07-23
3. M4: A Framework for Per-Flow Quantile Estimation;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
4. Online Detection of Outstanding Quantiles with QuantileFilter;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
5. Simple & Optimal Quantile Sketch: Combining Greenwald-Khanna with Khanna-Greenwald;Proceedings of the ACM on Management of Data;2024-05-10