Lecture 14: Distributed Learning, Security, and Privacy


Announcements
● TA office hours will be project advising sessions during this week and the week of the 29th
○ Sign up on the spreadsheet (see Ed announcement)
○ Attendance is worth 5% of the project grade
● No class or OH next week due to university holiday
● Wed 11/17 (over Zoom): Guest lecture: Prof. Julia Salzman (molecular discovery, single-cell sequencing)
○ Extra credit opportunity!
● Mon 11/29: Course conclusion


Agenda
- Distributed Learning and Federated Learning
- Privacy and Differential Privacy


Distributed Learning
- Sharing the computational load of training a model among multiple worker nodes
- The data and the task of computing gradient updates are distributed among the nodes
- Can have data parallelism or model parallelism

Figure credit: Alsheikh et al. Mobile big data analytics using deep learning and Apache Spark, 2016.
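To make the data-parallel case concrete, here is a minimal sketch (plain NumPy, a toy linear model, and hypothetical helper names; not the lecture's code) of synchronous data parallelism: each worker computes a gradient on its own shard of the data, and the averaged gradient is applied as a single update.

import numpy as np

def local_gradient(w, X_shard, y_shard):
    """Gradient of the mean squared error for a linear model on one worker's shard."""
    preds = X_shard @ w
    return X_shard.T @ (preds - y_shard) / len(y_shard)

def data_parallel_step(w, shards, lr=0.1):
    """One synchronous update: every worker computes a gradient on its shard,
    the server averages them and applies a single gradient-descent step."""
    grads = [local_gradient(w, X, y) for X, y in shards]  # in practice, computed in parallel
    avg_grad = np.mean(grads, axis=0)
    return w - lr * avg_grad

# Toy usage: 4 workers, each holding a shard of a synthetic regression dataset.
rng = np.random.default_rng(0)
X, true_w = rng.normal(size=(400, 5)), np.arange(5.0)
y = X @ true_w + 0.01 * rng.normal(size=400)
shards = [(X[i::4], y[i::4]) for i in range(4)]

w = np.zeros(5)
for _ in range(200):
    w = data_parallel_step(w, shards)

Model parallelism would instead split the model's parameters or layers across nodes, with each node computing only its part of the forward and backward pass.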


Federated Learning
- Related to distributed computing, but with an important property for many medical settings: the data is decentralized and never leaves its local silos. A central server controls training across the decentralized sources.

Figure credit: https://blogs.nvidia.com/wp-content/uploads/2019/10/federated_learning_animation_still_white.png


Federated Learning
- Example: learning a next-word prediction model from many individual cell phones

One round of federated learning training:
1. The current copy of the global model is shipped to local devices that are ready to contribute to training
2. Local gradient updates are computed on local data after one or several iterations of gradient descent
3. Local gradient updates are shipped back to the central server
4. Local gradient updates are combined and used to update the global model
5. The updated global model is shipped to local devices for future rounds of federated learning training

Figure credit: https://blog.ml.cmu.edu/2019/11/12/federated-learning-challenges-methods-and-future-directions/
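The round described above can be sketched as follows (plain NumPy, toy linear model, hypothetical function names; combining updates with a dataset-size-weighted average is the federated averaging heuristic, one common choice). The raw data stays on each client; only model updates travel to the server.

import numpy as np

def local_update(global_w, X, y, lr=0.1, local_steps=5):
    """Run a few iterations of gradient descent on one client's private data
    and return the locally updated weights (the data itself never leaves)."""
    w = global_w.copy()
    for _ in range(local_steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """One round: ship the global model to clients, collect their local
    updates, and combine them (weighted by local dataset size)."""
    updates, sizes = [], []
    for X, y in clients:                 # in practice, only a sampled subset of devices
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    weights = np.array(sizes) / sum(sizes)
    return sum(wi * u for wi, u in zip(weights, updates))

# Toy usage: 10 "devices", each with a small private dataset.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(10):
    X = rng.normal(size=(30, 3))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=30)))

w = np.zeros(3)
for _ in range(50):
    w = federated_round(w, clients)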


Federated Learning


- Example: learning a personalized healthcare model from data across different healthcare organizations


From earlier: BRATS brain tumor segmentation dataset
- Segmentation of tumors in brain MR image slices
- BRATS 2015 dataset: 220 high-grade brain tumor and 54 low-grade brain tumor MRIs
- U-Net architecture, Dice loss function

Dong et al. Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks. MIUA, 2017.


Li et al. 2019
- NVIDIA Clara's Federated Learning system for medical imaging data
- Used federated learning to train a segmentation model on BRATS
- Achieved performance comparable to non-federated training; somewhat slower, but the data "silos" were preserved
- Also a differentially private version... will talk about this in a moment

Li et al. Privacy-preserving Federated Brain Tumour Segmentation, 2019.


Privacy: HIPAA

Figure credit: https://www.jet-software.com/en/data-masking-hipaa/

Health Insurance Portability and Accountability Act (HIPAA), 1996: created the "Privacy Rule" for how healthcare entities must protect the privacy of patients' medical information

18 HIPAA identifiers (Protected Health Information)


Risks of data re-identification

Figure credit: Sweeney. Matching Known Patients to Health Records in Washington State Data, 2011.

Data triangulation: a person may be de-identified with respect to one data set, but the knowledge that they are a member of another available data set may allow them to be re-identified
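As a hypothetical illustration of such a linkage attack (pandas, toy data invented here, not from the lecture), joining a "de-identified" medical table to a public voter list on shared quasi-identifiers re-attaches names to diagnoses:

import pandas as pd

# "De-identified" medical records: names removed, but quasi-identifiers remain.
medical = pd.DataFrame({
    "zip": ["02138", "02139"], "birth_date": ["1945-07-21", "1962-03-02"],
    "sex": ["F", "M"], "diagnosis": ["hypertension", "diabetes"],
})

# Public voter list containing the same quasi-identifiers plus names.
voters = pd.DataFrame({
    "name": ["Alice Smith", "Bob Jones"],
    "zip": ["02138", "02139"], "birth_date": ["1945-07-21", "1962-03-02"],
    "sex": ["F", "M"],
})

# Data triangulation: the join re-identifies the "anonymous" diagnoses.
reidentified = medical.merge(voters, on=["zip", "birth_date", "sex"])
print(reidentified[["name", "diagnosis"]])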


Matching Known Patients to Health Records in Washington State Data

Sweeney. Matching Known Patients to Health Records in Washington State Data, 2011.

News stories (e.g., those containing the word "hospitalized") contain identifying information that could be used to identify medical records in the state medical record database for 43% of the studied cases

Figure: distribution of values for fields harvested from news stories


Identifying Participants in the Personal Genome Project by Name

Sweeney. Identifying Participants in the Personal Genome Project by Name, 2013.

Linked demographic information in the Personal Genome Project (PGP) to public records such as voter lists, correctly identifying 84 to 97% of profiles for which guessed names were provided to PGP staff


K-anonymity
A data release provides k-anonymity protection if the information for each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release.

Sweeney. K-anonymity: a model for protecting privacy. 2002.


K-anonymity

Sweeney. K-anonymity: a model for protecting privacy. 2002.

Figure: two k-anonymity tables (where k = 2)
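A simple way to check the property programmatically, sketched here with pandas on invented toy data (the column names and generalized values are illustrative assumptions):

import pandas as pd

def is_k_anonymous(df, quasi_identifiers, k):
    """A table is k-anonymous over the quasi-identifiers if every combination
    of their values appears in at least k rows."""
    return df.groupby(quasi_identifiers).size().min() >= k

# Toy example: ZIP code and birth year generalized so each combination covers >= 2 people.
released = pd.DataFrame({
    "zip":        ["021**", "021**", "940**", "940**"],
    "birth_year": ["196*",  "196*",  "197*",  "197*"],
    "diagnosis":  ["flu",   "asthma", "flu",  "diabetes"],
})
print(is_k_anonymous(released, ["zip", "birth_year"], k=2))  # True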


Re-identification from ML models

Fredrikson et al. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures, 2015.

- White-box (as opposed to black-box) setting: the attacker has access to the model parameters, e.g., a local model downloaded on a device to run inference

- Model inversion attack: if the model parameters are available, gradient descent can be used to infer sensitive features
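A minimal white-box sketch of the idea (NumPy, a toy softmax classifier with randomly invented "leaked" parameters, so purely illustrative): run gradient ascent on the input to maximize the model's confidence for a target class, recovering a representative input that may reveal sensitive features.

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def invert_class(W, b, target_class, steps=500, lr=0.5):
    """White-box model inversion for a softmax classifier p = softmax(W x + b):
    gradient ascent on the *input* x to maximize the target class probability."""
    x = np.zeros(W.shape[1])
    for _ in range(steps):
        p = softmax(W @ x + b)
        grad_x = W[target_class] - p @ W   # gradient of log p[target] w.r.t. x
        x += lr * grad_x
        x = np.clip(x, 0.0, 1.0)           # keep the reconstruction in a valid input range
    return x

# Toy usage with random "leaked" parameters standing in for a downloaded model.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 8)), np.zeros(3)
reconstruction = invert_class(W, b, target_class=0)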


Differential privacy
Key idea: the output for a dataset, vs. for the same dataset differing in a single entry (e.g., one individual), is "hardly different". Mathematical guarantees on this idea.

Abadi et al. Deep Learning with Differential Privacy, 2016.
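Concretely, the standard formalization used in Abadi et al. is (ε, δ)-differential privacy: a randomized mechanism M is (ε, δ)-differentially private if, for any two datasets D and D′ differing in a single entry and any set of outputs S,

Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ

so smaller ε (and δ) mean that any single individual's data has little influence on the distribution of outputs.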


Differential privacy
Simple intuition behind how we can achieve differential privacy: adding noise!

Figure credit: https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md

Example of reporting a value with Laplacian noise added
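A minimal sketch of that idea (NumPy; the epsilon value and the sensitivity-1 counting query are illustrative assumptions): report a count plus Laplace noise with scale sensitivity/ε.

import numpy as np

def laplace_count(data, predicate, epsilon, sensitivity=1.0):
    """Release a differentially private count: a counting query changes by at
    most 1 when one individual is added or removed (sensitivity 1), so adding
    Laplace(sensitivity / epsilon) noise gives epsilon-differential privacy."""
    true_count = sum(predicate(x) for x in data)
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Toy usage: noisy count of records with a sensitive attribute, epsilon = 0.5.
records = [{"hospitalized": True}, {"hospitalized": False}, {"hospitalized": True}]
print(laplace_count(records, lambda r: r["hospitalized"], epsilon=0.5))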


Training differentially private deep learning models (DP-SGD)
1. Compute the gradient as usual
2. Clip the gradient
3. Add noise for differential privacy
4. Compute the overall privacy cost

Abadi et al. Deep Learning with Differential Privacy, 2016.
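A minimal sketch of these steps (NumPy, toy linear model; the clip norm and noise multiplier are illustrative assumptions, and the privacy-accounting step is only indicated in a comment):

import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD step for a toy linear regression model."""
    # 1. Compute the gradient as usual, but per example.
    per_example_grads = [(x @ w - yi) * x for x, yi in zip(X, y)]
    # 2. Clip each per-example gradient to L2 norm <= clip_norm.
    clipped = [g / max(1.0, np.linalg.norm(g) / clip_norm) for g in per_example_grads]
    # 3. Add Gaussian noise scaled to the clipping bound.
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (np.sum(clipped, axis=0) + noise) / len(X)
    return w - lr * noisy_grad

# 4. The overall privacy cost (epsilon) is tracked across all steps with a
#    privacy accountant (Abadi et al.'s moments accountant), not shown here.

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
y = X @ np.arange(5.0) + 0.1 * rng.normal(size=64)
w = np.zeros(5)
for _ in range(200):
    w = dp_sgd_step(w, X, y)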


Private Aggregation of Teacher Ensembles (PATE)

PATE teacher model: an approach to combine data from multiple disjoint sensitive populations, with privacy guarantees
- Train separate classifiers ("teachers") on disjoint data sets -- no privacy guarantees yet
- To get a privacy-preserving prediction, first obtain predictions from all distinct classifiers
- Then add noise to the vote histogram (giving differential privacy guarantees), and take the class with the most votes as the final output

Papernot et al. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, 2017.

Figure credit: http://www.cleverhans.io/privacy/2018/04/29/privacy-and-machine-learning.html
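A minimal sketch of the noisy-vote aggregation (NumPy; the noise scale is an illustrative assumption): each teacher votes for a class, Laplace noise is added to the vote histogram, and the noisy argmax is returned.

import numpy as np

def pate_aggregate(teacher_predictions, num_classes, noise_scale=1.0):
    """Noisy-max aggregation: count each teacher's vote, add Laplace noise to
    the histogram, and return the class with the highest noisy count."""
    votes = np.bincount(teacher_predictions, minlength=num_classes)
    noisy_votes = votes + np.random.laplace(0.0, noise_scale, size=num_classes)
    return int(np.argmax(noisy_votes))

# Toy usage: 250 teachers trained on disjoint data shards vote on one query.
rng = np.random.default_rng(0)
teacher_predictions = rng.choice([0, 1, 2], size=250, p=[0.7, 0.2, 0.1])
label = pate_aggregate(teacher_predictions, num_classes=3)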


Private Aggregation of Teacher Ensembles (PATE)

Papernot et al. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, 2017.

The PATE student model uses public data to train a model that replicates the noisy aggregated teacher outputs.

The teacher ensemble alone can still be compromised if too many queries are performed (the privacy cost builds up with each query, so the privacy guarantees become meaningless with too many queries), or if the model parameters are made accessible (and attackable), e.g., distributed in a local application.

Figure credit: http://www.cleverhans.io/privacy/2018/04/29/privacy-and-machine-learning.html


Remember GANs: Two-player game
- Generator network: tries to fool the discriminator by generating real-looking images
- Discriminator network: tries to distinguish between real and fake images
- (Figure: random noise z is fed to the Generator Network to produce fake images; fake images from the generator and real images from the training set are fed to the Discriminator Network, which outputs real or fake)
- After training, use the generator network to generate new images

Ian Goodfellow et al., "Generative Adversarial Nets", NIPS 2014.
Fake and real images copyright Emily Denton et al. 2015. Reproduced with permission.

Can train GANs using differentially private SGD (DP-SGD)! Afterwards, using the generator to generate synthetic data does not incur additional privacy cost (this follows from the post-processing property of differential privacy).

Xie et al. Differentially Private Generative Adversarial Network, 2018.


Can work with differential privacy within deep learning frameworks:
- Implementation of DP-SGD
- Utilities for calculating epsilon

https://blog.tensorflow.org/2019/03/introducing-tensorflow-privacy-learning.html
http://www.cleverhans.io/privacy/2019/03/26/machine-learning-with-differential-privacy-in-tensorflow.html
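A rough sketch of what this looks like with the TensorFlow Privacy library from the links above (module paths and argument names have shifted across library versions, so treat this as an approximation rather than the exact API):

# Assumes `pip install tensorflow tensorflow-privacy`; import paths vary by version.
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import DPKerasSGDOptimizer
from tensorflow_privacy.privacy.analysis.compute_dp_sgd_privacy import compute_dp_sgd_privacy

# Any Keras model; a small classifier stands in here.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(2),
])

# Drop-in DP-SGD optimizer: clips per-example gradients and adds Gaussian noise.
optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,       # gradient clipping bound
    noise_multiplier=1.1,   # noise scale relative to the clipping bound
    num_microbatches=32,    # must evenly divide the batch size
    learning_rate=0.1,
)

# The loss must be kept per-example so gradients can be clipped individually.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
model.compile(optimizer=optimizer, loss=loss)
# model.fit(x_train, y_train, batch_size=32, epochs=10) as usual.

# Utility for calculating epsilon: total privacy cost of such a training run.
eps, _ = compute_dp_sgd_privacy(
    n=60000, batch_size=32, noise_multiplier=1.1, epochs=10, delta=1e-5)
print("epsilon =", eps)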


Today we covered:
- Distributed Learning and Federated Learning
- Privacy and Differential Privacy

Next time: Guest lecture from Prof. Julia Salzman over Zoom, discussing molecular discovery and single-cell sequencing