DOI: 10.1145/3133956.3134012

Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

Published: 30 October 2017

ABSTRACT

Deep Learning has recently become hugely popular in machine learning for its ability to build end-to-end learning systems, in which the features and the classifiers are learned jointly, providing significant improvements in classification accuracy on large, highly structured databases.

Its success is due to a combination of recent algorithmic breakthroughs, increasingly powerful computers, and access to significant amounts of data.

Researchers have also considered the privacy implications of deep learning. Models are typically trained in a centralized manner, with all the data processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed in which parties train their deep learning structures locally and share only a subset of the parameters in an attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15.
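To make the sharing scheme described above concrete, here is a minimal sketch (not the authors' implementation) of one round of collaborative training: a participant takes a local SGD step on its private data and uploads only a fraction of its gradients, optionally perturbed with Laplace noise as a stand-in for the differential-privacy mechanism. The helper local_grad_fn, the upload fraction, and the noise scale are illustrative assumptions.

import numpy as np

def local_update(global_params, local_grad_fn, private_data, lr=0.01,
                 upload_fraction=0.1, dp_scale=None):
    # One participant's round: local SGD step on private data, then
    # selection of a small fraction of gradients to upload to the server.
    params = global_params.copy()
    grads = local_grad_fn(params, private_data)   # gradients computed on private data only
    params -= lr * grads                          # local model update

    # Share only the largest-magnitude fraction of the gradients.
    k = max(1, int(upload_fraction * grads.size))
    idx = np.argsort(np.abs(grads).ravel())[-k:]
    shared = np.zeros(grads.size)
    shared[idx] = grads.ravel()[idx]
    if dp_scale is not None:
        # Optional obfuscation: Laplace noise on the shared coordinates.
        shared[idx] += np.random.laplace(0.0, dp_scale, size=k)
    return params, shared.reshape(grads.shape)

def server_aggregate(global_params, shared_grads_list, lr=0.01):
    # The parameter server applies the sparse updates received from all parties.
    for g in shared_grads_list:
        global_params = global_params - lr * g
    return global_params

Each party downloads the latest global parameters before its round, so every participant's contribution is reflected, in near real time, in the model seen by every other participant; this is exactly the property the attack described next exploits.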

Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level differential privacy applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
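For intuition, the following is a minimal sketch of one round of the GAN-based attack, assuming a PyTorch setting; the generator architecture, shared_model, and target_class are illustrative placeholders rather than the paper's exact configuration. The insider adversary treats the current shared model as the GAN discriminator and trains a local generator toward the class held by the victim.

import torch
import torch.nn as nn

def gan_attack_round(shared_model, generator, opt_g, target_class,
                     batch_size=64, latent_dim=100, device="cpu"):
    # One attack round: fit the local generator against the current shared
    # model, which plays the role of the discriminator.
    generator.train()
    shared_model.eval()                               # shared model is not updated here
    z = torch.randn(batch_size, latent_dim, device=device)
    fake = generator(z)                               # candidate samples of the victim's class
    logits = shared_model(fake)
    target = torch.full((batch_size,), target_class,
                        dtype=torch.long, device=device)
    loss = nn.functional.cross_entropy(logits, target)  # push fakes toward the target class
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return fake.detach(), loss.item()

In the setting described above, the adversary can then inject these generated samples into its own local training under a wrong label, inducing the victim to release finer-grained information about the target class in later rounds; since the leakage flows through the shared parameters themselves, record-level differential privacy on those parameters does not prevent it.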



Published in

CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
October 2017, 2682 pages
ISBN: 9781450349468
DOI: 10.1145/3133956

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Qualifiers

          • research-article

Acceptance Rates

CCS '17 Paper Acceptance Rate: 151 of 836 submissions, 18%
Overall Acceptance Rate: 1,261 of 6,999 submissions, 18%

