•   When: Monday, November 30, 2020 from 02:00 PM to 04:00 PM
  •   Speakers: Sahar Sadat Seyed Mazloom
  •   Location: Virtual

Abstract

Recent advances in the machine learning domain have been enabled by the ability to analyze massive volumes of data and to extract and learn the patterns within it. However, large-scale data collection raises privacy concerns, as it can expose individuals' sensitive data to actors with malicious intent. This lack of privacy can lead to data breaches and, consequently, compromise the successful development of machine learning techniques. Secure Computation is a branch of modern cryptography that offers promising solutions for processing data in a privacy-preserving manner: it enables computing any functionality on data while the data remains "encrypted". The field has been the topic of extensive research in recent years and has made remarkable progress. However, most results have remained impractical for real applications, and deployment has stayed limited due to efficiency and scalability constraints.
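To make the idea of computing on "encrypted" data concrete, the sketch below illustrates additive secret sharing, one classic building block of secure computation. This is a generic textbook construction, not the dissertation's actual protocol; the modulus, function names, and three-server setup are illustrative assumptions.

```python
import secrets

# Illustrative sketch of additive secret sharing (a standard secure
# computation primitive, not the dissertation's specific protocol).
# All arithmetic is over the integers modulo a public prime P.
P = 2**61 - 1  # public modulus, chosen here only for illustration

def share(x, n=3):
    """Split secret x into n additive shares that sum to x mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares mod P."""
    return sum(shares) % P

# Any n-1 shares look uniformly random, so no single server learns x or y.
# Yet the servers can compute shares of x + y locally, without seeing
# either input — addition "under encryption".
x_shares = share(20)
y_shares = share(22)
sum_shares = [(a + b) % P for a, b in zip(x_shares, y_shares)]
print(reconstruct(sum_shares))  # 42
```

Linear operations like this are cheap; the efficiency challenges the abstract refers to arise for richer functionalities, which is where relaxed security notions and parallelism come in.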

The goal of this dissertation is to present novel protocol designs and development techniques that overcome these efficiency and scalability limitations. We demonstrate how to construct secure and privacy-preserving machine learning schemes that are practical for real-world applications, handle large-scale data, and guarantee security against different types of adversaries. In the first part of this dissertation, we design and develop privacy-preserving machine learning frameworks using secure computation techniques and explore the trade-off between security and efficiency in these frameworks. To improve efficiency, we relax the security notion by allowing the adversary to learn a small amount of information during the computation. We then use Differential Privacy mechanisms to provide a formal bound on the amount of leakage and prove that what the adversary learns is differentially private. We also leverage Parallel Computation techniques to improve the performance and running time of these novel algorithms. These frameworks follow a centralized computation architecture in which users send their private data to untrusted computation servers that perform the computations on it.

In the second part, we design and develop secure and privacy-preserving machine learning algorithms in the distributed setting known as Federated Learning. In federated learning, users do not share their sensitive data with the computation servers; instead, each user trains a local model on their private data and sends only the model parameters to the computation servers, which in turn aggregate those local parameters to construct a global model over all participants' data. Our secure and privacy-preserving federated learning protocols are designed to have low communication cost and to be robust to users dropping out of the protocol at any point. We leverage secure computation and differential privacy techniques to preserve the privacy of users' data as well as of the trained models' parameters. All of the secure and privacy-preserving frameworks presented in this dissertation are designed to be secure against two adversarial models: passive and active adversaries.
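The federated aggregation step described above can be sketched with pairwise masking, a standard secure-aggregation idea: each pair of users agrees on a random mask that one adds and the other subtracts, so the masks cancel in the server's sum while individual updates stay hidden. This toy version is an assumption for illustration only, not the dissertation's actual protocol, and all names below are hypothetical.

```python
import random

# Toy sketch of mask-based secure aggregation for federated learning.
# Each pair of users (i, j) shares a random mask m: user i adds m, user j
# subtracts m. The server sees only masked updates, but the masks cancel
# pairwise, so the sum of masked updates equals the true aggregate mod P.
P = 2**31 - 1  # public modulus for the mask arithmetic (illustrative)

def masked_updates(updates, rng=random.Random(0)):
    """Return per-user masked updates whose sum equals the true sum mod P."""
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.randrange(P)         # pairwise mask shared by (i, j)
            masked[i] = (masked[i] + m) % P
            masked[j] = (masked[j] - m) % P
    return masked

updates = [5, 11, 26]                    # toy scalar "model updates"
masked = masked_updates(updates)
# Each masked value looks random on its own, yet the aggregate is exact.
print(sum(masked) % P)                   # 42
```

This toy omits the dropout handling the abstract highlights: practical protocols derive the pairwise masks from key agreement so that, if a user drops out mid-protocol, the surviving users can help the server reconstruct and remove that user's masks.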

 
