
Towards Practical Differentially Private Convex Optimization

ROGER IYENGAR, CARNEGIE MELLON UNIVERSITY

JOSEPH P. NEAR, UNIVERSITY OF VERMONT

DAWN SONG, UNIVERSITY OF CALIFORNIA, BERKELEY

ABHRADEEP THAKURTA, UNIVERSITY OF CALIFORNIA, SANTA CRUZ

LUN WANG, UNIVERSITY OF CALIFORNIA, BERKELEY

OM THAKKAR, BOSTON UNIVERSITY

Contributions
• New Algorithm for Differentially Private Convex Optimization: Approximate Minima Perturbation (AMP)
  • Can leverage any off-the-shelf optimizer
  • Works for all convex loss functions
  • Has a competitive hyperparameter-free variant

• Broad Empirical Study
  • 6 state-of-the-art techniques
  • 2 models: Logistic Regression and Huber SVM
  • 13 datasets: 9 public (4 high-dimensional), 4 real-world use cases
  • Open-source repo: https://github.com/sunblaze-ucb/dpml-benchmark

This Talk
• Why Privacy for Learning?
• Background
  • Differential Privacy (DP)
  • Convex Optimization
• Approximate Minima Perturbation (AMP)
• Broad Empirical Study

Why Privacy for Learning?
[Diagram: Sensitive Data 𝐷 (Input) → Training Algorithm 𝐴 → Trained Model 𝜃 (Output)]

• Models can leak information about training data
  • Membership inference attacks [Shokri Stronati Song Shmatikov '17, Carlini Liu Kos Erlingsson Song '18, Melis Song Cristofaro Shmatikov '18]
  • Model inversion attacks [Fredrikson Jha Ristenpart '15, Wu Fredrikson Jha Naughton '16]

• Solution?

Differential Privacy [Dwork McSherry Nissim Smith '06]
[Illustration over three slides: a randomized algorithm 𝐴 maps a dataset 𝐷 (Alice, Bob, Cathy, Doug, Emily, Om) and a neighboring dataset 𝐷′ (the same records plus Felix) to outcomes 𝜽 ∈ 𝚯; the output distributions Pr(𝐴(𝐷) = 𝜽) and Pr(𝐴(𝐷′) = 𝜽) differ only by a small amount.]

Differential Privacy [Dwork McSherry Nissim Smith '06]
• Privacy parameters: (𝜀, 𝛿)
• A randomized algorithm 𝐴: 𝒟ⁿ → Θ is (𝜀, 𝛿)-DP if
  • for all neighboring datasets 𝐷, 𝐷′ ∈ 𝒟ⁿ, i.e., dist(𝐷, 𝐷′) = 1, and
  • for all sets of outcomes 𝑆 ⊆ Θ, we have

  Pr(𝐴(𝐷) ∈ 𝑆) ≤ e^𝜀 · Pr(𝐴(𝐷′) ∈ 𝑆) + 𝛿

• 𝜀: Multiplicative change. Typically, 𝜀 = 𝑂(1)
• 𝛿: Additive change. Typically, 𝛿 = 𝑂(1/𝑛²)
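As a small illustration of this definition (not from the talk), the sketch below uses the Laplace mechanism to release a counting query with (𝜀, 0)-DP; the function name and data are made up for the example.

```python
# Illustrative sketch (not from the talk): the Laplace mechanism for a
# counting query satisfies (eps, 0)-DP, since neighboring datasets change
# the count by at most 1, giving Pr(A(D) in S) <= e^eps * Pr(A(D') in S).
import numpy as np

def laplace_count(data, predicate, eps):
    rng = np.random.default_rng()
    true_count = sum(1 for x in data if predicate(x))  # sensitivity-1 query
    noise = rng.laplace(loc=0.0, scale=1.0 / eps)      # scale = sensitivity / eps
    return true_count + noise

# Example: privately count records with value > 50 under eps = 0.5
print(laplace_count([12, 87, 55, 43, 91], lambda x: x > 50, eps=0.5))
```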

Convex Optimization
• Input:
  • Dataset 𝐷 ∈ 𝒟ⁿ
  • Loss function 𝐿(𝜃, 𝐷), where
    • 𝜃 ∈ ℝᵖ is a model
    • the loss 𝐿 is convex in the first parameter 𝜃
• Goal: Output a model 𝜃̂ such that 𝜃̂ ∈ arg min_{𝜃 ∈ ℝᵖ} 𝐿(𝜃, 𝐷)
• Applications: Machine Learning, Deep Learning, Collaborative Filtering, etc.

[Plot: a convex loss 𝐿(𝜃, 𝐷) over 𝜃, with the minimizer 𝜃̂ marked]
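For concreteness (this code is not part of the talk), here is the non-private version of this problem: minimizing an average logistic loss with an off-the-shelf optimizer, on synthetic data.

```python
# Non-private convex optimization example: minimize the average logistic
# loss L(theta, D) over theta in R^p with an off-the-shelf optimizer.
import numpy as np
from scipy.optimize import minimize

def logistic_loss(theta, X, y):
    # labels y in {-1, +1}; the loss is convex in theta
    return np.mean(np.logaddexp(0.0, -y * (X @ theta)))

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))                  # n = 1000 examples, p = 20 features
y = np.sign(X @ np.ones(20) + 0.1 * rng.standard_normal(1000))

res = minimize(logistic_loss, x0=np.zeros(20), args=(X, y), method="L-BFGS-B")
theta_hat = res.x                                    # approximate minimizer of L(theta, D)
```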

DP Convex Optimization - Prior Work
[Diagram: Sensitive Data 𝐷 (Input) → Training Algorithm 𝐴 → Trained Model 𝜃 (Output)]

• Objective Perturbation [Chaudhuri Monteleoni Sarwate '11, Kifer Smith Thakurta '12, Jain Thakurta '14]
• DP GD/SGD [Song Chaudhuri Sarwate '13, Bassily Smith Thakurta '14, Abadi Chu Goodfellow McMahan Mironov Talwar Zhang '16]
• DP Frank-Wolfe [Talwar Thakurta Zhang '14]
• Output Perturbation [CMS '11, KST '12, JT '14]
• DP Permutation-based SGD [Wu Li Kumar Chaudhuri Jha Naughton '17]

Drawbacks: these approaches either require the exact minima of the loss, or require a custom optimizer.

Approximate Minima Perturbation (AMP)
• Input:
  • Dataset 𝐷, loss function 𝐿(𝜃, 𝐷)
  • Privacy parameters: 𝑏 = (𝜖, 𝛿)
  • Gradient norm bound 𝛾
• Algorithm (high-level):
  1. Split the privacy budget into 2 parts, 𝑏₁ and 𝑏₂
  2. Perturb the loss: 𝐿_priv(𝜃, 𝐷) = 𝐿(𝜃, 𝐷) + Reg(𝜃, 𝑏₁)
  3. Let 𝜃_approx = 𝜃 s.t. ‖∇𝐿_priv(𝜃, 𝐷)‖₂ ≤ 𝛾
  4. Output 𝜃_approx + Noise(𝑏₂, 𝛾)

[Plot: the loss 𝐿(𝜃, 𝐷) and the perturbed loss 𝐿_priv(𝜃, 𝐷) over 𝜃, with 𝜃̂, 𝜃_priv, and 𝜃_approx marked, and the region where ‖∇𝐿_priv(𝜃, 𝐷)‖₂ ≤ 𝛾 highlighted]

Similar to standard Objective Perturbation [KST '12]
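A minimal Python sketch of these four steps, assuming a quadratic regularizer with an added linear noise term and Gaussian output noise. The calibration of Λ, σ₁, σ₂ from the budgets 𝑏₁, 𝑏₂ and the loss's sensitivity is left abstract; the placeholder arguments below are not the paper's actual calibration.

```python
# Sketch of AMP's high-level steps; Lambda, sigma1, sigma2 are placeholder
# inputs standing in for the paper's calibration from b1, b2, and gamma.
import numpy as np
from scipy.optimize import minimize

def amp_sketch(loss, grad, D, p, gamma, Lambda, sigma1, sigma2, seed=None):
    rng = np.random.default_rng(seed)
    noise_vec = rng.normal(0.0, sigma1, size=p)        # perturbation drawn under budget b1

    def L_priv(theta):                                 # step 2: perturbed loss
        return loss(theta, D) + 0.5 * Lambda * theta @ theta + noise_vec @ theta

    def grad_priv(theta):
        return grad(theta, D) + Lambda * theta + noise_vec

    # step 3: any off-the-shelf optimizer, run until the gradient is small.
    # L-BFGS-B's gtol bounds the max-norm of the gradient, so gamma / sqrt(p)
    # guarantees the 2-norm condition ||grad L_priv(theta)||_2 <= gamma.
    res = minimize(L_priv, np.zeros(p), jac=grad_priv, method="L-BFGS-B",
                   options={"gtol": gamma / np.sqrt(p), "maxiter": 100_000})
    theta_approx = res.x

    # step 4: release theta_approx plus output noise calibrated to (b2, gamma)
    return theta_approx + rng.normal(0.0, sigma2, size=p)
```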

Utility Guarantees
• Let 𝜃̂ minimize 𝐿(𝜃; 𝐷), and let the regularization parameter Λ = Θ(𝜉√𝑝 / (𝜖 𝑛 ‖𝜃̂‖)).
• Objective Perturbation [KST '12]: If 𝜃_priv is the output of obj. pert., then
  𝔼(𝐿(𝜃_priv; 𝐷) − 𝐿(𝜃̂; 𝐷)) = 𝑂(𝜉√𝑝 ‖𝜃̂‖ / (𝜖𝑛)).
• AMP (adapted from [KST '12]): For output 𝜃_AMP,
  𝔼(𝐿(𝜃_AMP; 𝐷) − 𝐿(𝜃̂; 𝐷)) = 𝑂(𝜉√𝑝 ‖𝜃̂‖ / (𝜖𝑛) + ‖𝜃̂‖𝛾𝑛).
  • For 𝛾 = 𝑂(1/𝑛²), the utility of AMP is asymptotically the same as that of Obj. Pert.
• Private PSGD [WLK⁺ '17]: For output 𝜃_PSGD and model-space radius 𝑅,
  𝔼(𝐿(𝜃_PSGD; 𝐷) − 𝐿(𝜃̂; 𝐷)) = 𝑂(𝜉√𝑝 𝑅 / (𝜖√𝑛)).
  • For 𝛾 = 𝑂(1/𝑛²), the utility of AMP has a better dependence on 𝑛 than Private PSGD.
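To make the comparison with Objective Perturbation concrete, the short substitution below (written in LaTeX, only rearranging the bounds stated above) assumes the typical regime where 𝜉√𝑝/𝜖 = Ω(1).

```latex
% With \gamma = O(1/n^2), AMP's extra term is negligible:
\|\hat{\theta}\|\,\gamma\,n
  = O\!\left(\frac{\|\hat{\theta}\|}{n}\right)
  \le O\!\left(\frac{\xi\sqrt{p}\,\|\hat{\theta}\|}{\epsilon n}\right)
  \quad\text{whenever}\quad \frac{\xi\sqrt{p}}{\epsilon} = \Omega(1),
% so the AMP bound collapses to the Objective Perturbation bound
% O\!\left(\xi\sqrt{p}\,\|\hat{\theta}\| / (\epsilon n)\right).
```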

AMP - Takeaways
• Can leverage any off-the-shelf optimizer
• Works for all standard convex loss functions
• For 𝛾 = 𝑂(1/𝑛²), the utility of AMP:
  • is asymptotically the same as Objective Perturbation [KST '12]
  • has a better dependence on 𝑛 than Private PSGD [WLK⁺ '17]
• 𝛾 = 1/𝑛² is achievable using standard Python libraries

Empirical Evaluation
• Algorithms evaluated:
  • Approximate Minima Perturbation (AMP)
  • Private SGD [BST '14, ACG⁺ '17]
  • Private Frank-Wolfe (FW) [TTZ '14]
  • Private Permutation-based SGD (PSGD) [WLK⁺ '17]
  • Private Strongly-convex (SC) PSGD [WLK⁺ '17]
  • Hyperparameter-free (HF) AMP
    • Splitting the privacy budget: we provide a schedule for low- and high-dimensional data by evaluating AMP only on synthetic data
  • Non-private (NP) Baseline

Empirical Evaluation
• Loss functions considered:
  • Logistic loss (this talk)
  • Huber SVM
• Procedure:
  • 80/20 train/test random split
  • Fix 𝛿 = 1/𝑛², and vary 𝜖 from 0.01 to 10
  • Measure the accuracy of the final tuned* private model over the test set
  • Report the mean accuracy and std. dev. over 10 independent runs

*Does not apply to Hyperparameter-free AMP.
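The protocol above is straightforward to script. The sketch below is not the benchmark's actual code (that lives in the open-source repo); `train_private_model` is a hypothetical stand-in for any of the evaluated algorithms.

```python
# Sketch of the evaluation protocol: 80/20 splits, delta = 1/n^2,
# an epsilon sweep, and mean accuracy / std. dev. over 10 runs.
import numpy as np
from sklearn.model_selection import train_test_split

def evaluate(X, y, train_private_model, eps_grid=(0.01, 0.1, 1.0, 10.0), runs=10):
    delta = 1.0 / len(y) ** 2                             # fixed as on the slide
    results = {}
    for eps in eps_grid:
        accs = []
        for seed in range(runs):                          # 10 independent runs
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=0.2, random_state=seed)   # 80/20 split
            theta = train_private_model(X_tr, y_tr, eps, delta)
            accs.append(np.mean(np.sign(X_te @ theta) == y_te))
        results[eps] = (np.mean(accs), np.std(accs))
    return results
```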

Synthetic Datasets
[Plots: accuracy vs. 𝜖 on Synthetic-L (10k × 20) and Synthetic-H (2k × 2k). Legend: NP Baseline, AMP, HF AMP, Private SGD, Private PSGD, Private SC PSGD, Private FW.]

- Synthetic-H is high-dimensional, but low-rank
- Private Frank-Wolfe performs the best on Synthetic-H

High-dimensional Datasets
[Plots: accuracy vs. 𝜖 on Real-sim (72k × 21k) and RCV-1 (50k × 47k). Legend: NP Baseline, AMP, HF AMP, Private SGD, Private PSGD, Private SC PSGD, Private FW.]

- Both variants of AMP almost always provide the best performance

Real-world Use Cases (Uber)
[Plots: accuracy vs. 𝜖 on Dataset 1 (4m × 23) and Dataset 2 (18m × 294). Legend: NP Baseline, AMP, HF AMP, Private SGD, Private PSGD, Private SC PSGD, Private FW.]

- DP as a regularizer [BST '14, Dwork Feldman Hardt Pitassi Reingold Roth '15]
- Even for 𝜖 = 10⁻², the accuracy of AMP is close to the non-private baseline

Conclusions
• For large datasets, the cost of privacy is low
  • The private model is within 4% accuracy of the non-private one for 𝜖 = 0.01, and within 2% for 𝜖 = 0.1
• AMP almost always provides the best accuracy, and is easily deployable in practice
• Hyperparameter-free AMP is competitive with tuned state-of-the-art private algorithms
• Open-source repo: https://github.com/sunblaze-ucb/dpml-benchmark

Thank You!