Streaming Algorithms with Few State Changes

Author:

Rajesh Jayaram (1), David P. Woodruff (2), Samson Zhou (3)

Affiliation:

1. Google Research, New York City, NY, USA

2. Carnegie Mellon University, Google Research, Pittsburgh, PA, USA

3. Texas A&M University, College Station, TX, USA

Abstract

In this paper, we study streaming algorithms that minimize the number of changes made to their internal state (i.e., memory contents). While the design of streaming algorithms typically focuses on minimizing space and update time, these metrics fail to capture the asymmetric costs, inherent in modern hardware and database systems, of reading versus writing to memory. In fact, most streaming algorithms write to their memory on every update, which is undesirable when writing is significantly more expensive than reading. This raises the question of whether streaming algorithms that use both small space and few memory writes are possible. We first demonstrate that, for the fundamental $F_p$ moment estimation problem with $p \ge 1$, any streaming algorithm that achieves a constant-factor approximation must make $\Omega(n^{1-1/p})$ internal state changes, regardless of how much space it uses. Perhaps surprisingly, we show that this lower bound can be matched by an algorithm that also has near-optimal space complexity. Specifically, we give a $(1+\varepsilon)$-approximation algorithm for $F_p$ moment estimation that uses a near-optimal $\tilde{O}_\varepsilon(n^{1-1/p})$ number of state changes while simultaneously achieving near-optimal space: for $p \in [1,2)$, our algorithm uses $\mathrm{poly}(\log n, 1/\varepsilon)$ bits of space, while for $p > 2$, the algorithm uses $\tilde{O}_\varepsilon(n^{1-1/p})$ space. We similarly design streaming algorithms that are simultaneously near-optimal in both space complexity and the number of state changes for the heavy-hitters problem, sparse support recovery, and entropy estimation. Our results demonstrate that an optimal number of state changes can be achieved without sacrificing space complexity.
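To make the cost measure concrete, the following Python sketch contrasts a counter that writes on every update with a classical Morris-style approximate counter, which changes its stored state only occasionally. This is an illustration under stated assumptions, not the algorithm from the paper: the class names, the `state_changes` bookkeeping field, and the choice of the Morris counter are all introduced here for exposition. The helper `frequency_moment` computes $F_p$ exactly under the standard definition, the sum over distinct items of (frequency)^p.

```python
import random
from collections import Counter


def frequency_moment(stream, p):
    """Exact F_p: sum over distinct items of (frequency ** p)."""
    return sum(f ** p for f in Counter(stream).values())


class ExactCounter:
    """Baseline counter: modifies its state on every update."""

    def __init__(self):
        self.count = 0
        self.state_changes = 0  # illustrative bookkeeping, not part of the state

    def update(self):
        self.count += 1
        self.state_changes += 1

    def estimate(self):
        return self.count


class MorrisCounter:
    """Morris-style approximate counter (an assumption chosen for illustration).

    Stores only an exponent x and increments it with probability 2**-x,
    so the stored state changes exactly x times over the whole stream.
    """

    def __init__(self, rng):
        self.x = 0
        self.state_changes = 0  # illustrative bookkeeping, not part of the state
        self.rng = rng

    def update(self):
        # Reading x is free in this cost model; only the write below counts.
        if self.rng.random() < 2.0 ** -self.x:
            self.x += 1
            self.state_changes += 1

    def estimate(self):
        # Unbiased estimate of the number of updates; a single counter has high variance.
        return 2 ** self.x - 1


if __name__ == "__main__":
    n = 10_000
    exact, morris = ExactCounter(), MorrisCounter(random.Random(0))
    for _ in range(n):
        exact.update()
        morris.update()
    print("exact counter state changes:", exact.state_changes)
    print("Morris counter state changes:", morris.state_changes)
    print("F_2 of [1, 1, 2, 3, 3, 3]:", frequency_moment([1, 1, 2, 3, 3, 3], 2))
```

The exact counter performs one write per stream update, whereas the Morris-style counter performs as many writes as the final value of its exponent, at the price of a noisy estimate. The paper's algorithms address the harder setting of $F_p$ estimation, heavy hitters, and related problems; this sketch only shows what counting state changes means.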

Funder

NSF

Publisher

Association for Computing Machinery (ACM)
