Continual Release of Differentially Private Synthetic Data from Longitudinal Data Collections

Author:

Mark Bun (1), Marco Gaboardi (1), Marcel Neunhoeffer (2), Wanrong Zhang (3)

Affiliation:

1. Boston University, Boston, MA, USA

2. Institute for Employment Research & LMU Munich, Nuremberg, Germany

3. Harvard University, Cambridge, MA, USA

Abstract

Motivated by privacy concerns in long-term longitudinal studies in medical and social science research, we study the problem of continually releasing differentially private synthetic data from longitudinal data collections. We introduce a model where, in every time step, each individual reports a new data element, and the goal of the synthesizer is to incrementally update a synthetic dataset in a consistent way to capture a rich class of statistical properties. We give continual synthetic data generation algorithms that preserve two basic types of queries: fixed time window queries and cumulative time queries. We show nearly tight upper bounds on the error rates of these algorithms and demonstrate their empirical performance on realistically sized datasets from the U.S. Census Bureau's Survey of Income and Program Participation.
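To make the two query types named in the abstract concrete, here is a minimal, non-private sketch in Python. It is not the authors' algorithm and adds no differential privacy noise; it only shows what a fixed time window query and a cumulative time query might compute on a toy longitudinal dataset. The binary data elements, the function names, and the exact query definitions (pattern match over the most recent steps, threshold on the running count) are illustrative assumptions, not taken from the abstract.

```python
from typing import Sequence


def fixed_window_query(data: Sequence[Sequence[int]], t: int,
                       pattern: Sequence[int]) -> float:
    """Fraction of individuals whose reports in the window of
    len(pattern) time steps ending at step t (1-indexed) equal `pattern`.

    Assumed query form, for illustration only.
    """
    w = len(pattern)
    if w > t:
        raise ValueError("window is longer than the elapsed time")
    matches = sum(1 for row in data if tuple(row[t - w:t]) == tuple(pattern))
    return matches / len(data)


def cumulative_query(data: Sequence[Sequence[int]], t: int,
                     threshold: int) -> float:
    """Fraction of individuals who reported at least `threshold` ones
    in time steps 1..t.

    Assumed query form, for illustration only.
    """
    matches = sum(1 for row in data if sum(row[:t]) >= threshold)
    return matches / len(data)


# Toy data (hypothetical): one row per individual, one binary report per step.
data = [
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]

# Share of individuals reporting 1 in both of the two most recent steps at t = 4.
window_answer = fixed_window_query(data, t=4, pattern=(1, 1))

# Share of individuals with at least three 1-reports accumulated by t = 4.
cumulative_answer = cumulative_query(data, t=4, threshold=3)
```

In the continual release setting described above, a synthesizer would need to answer queries like these at every time step from a synthetic dataset that is only ever extended, never rewritten, while adding noise calibrated to a privacy budget; that part is deliberately left out of this sketch.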

Funder

U.S. Census Bureau

Computing Research Association

Computing Community Consortium

Publisher

Association for Computing Machinery (ACM)

