1 17-214
Principles of Software Construction: Objects, Design, and Concurrency
Part 3: Concurrency
Introduction to concurrency, part 4
In the trenches of parallelism
Josh Bloch, Charlie Garrod
Administrivia
• Homework 5 Best Frameworks available today
• Homework 5c due Tuesday, 11:59 p.m.
Key concepts from Tuesday
Policies for thread safety
1. Thread-confined state – mutate but don’t share
2. Shared read-only state – share but don’t mutate
3. Shared thread-safe – object synchronizes itself internally
4. Shared guarded – client synchronizes object(s) externally
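A minimal sketch of the four policies side by side. The class and field names (`ThreadSafetyPolicies`, `names`, `hits`, `log`) are hypothetical and chosen only for illustration; they do not come from the course code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class ThreadSafetyPolicies {
    // 2. Shared read-only state: shared by all threads, never mutated.
    final List<String> names = List.of("a", "b", "c");
    // 3. Shared thread-safe: AtomicLong synchronizes itself internally.
    final AtomicLong hits = new AtomicLong();
    // 4. Shared guarded: every client must lock `log` before touching it.
    final List<String> log = new ArrayList<>();

    void handle(int id) {
        // 1. Thread-confined state: this builder never leaves the calling thread.
        StringBuilder sb = new StringBuilder();
        for (String n : names) sb.append(n);
        hits.incrementAndGet();
        synchronized (log) {
            log.add(id + ":" + sb);
        }
    }

    // Runs handle() once on each of `threads` threads and waits for all of them.
    static ThreadSafetyPolicies run(int threads) {
        ThreadSafetyPolicies p = new ThreadSafetyPolicies();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int id = i;
            Thread t = new Thread(() -> p.handle(id));
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return p;
    }
}
```

Reading `log` after `run` returns needs no lock, because `join()` establishes the required happens-before edge.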
3. Shared thread-safe state
• Thread-safe objects that perform internal synchronization
• You can build your own, but not for the faint of heart
• You’re better off using ones from java.util.concurrent
• j.u.c also provides skeletal implementations
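As an illustration of leaning on java.util.concurrent rather than building your own, the sketch below counts into a `ConcurrentHashMap` from many threads with no external locking. The class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class SharedCounterSketch {
    // Counts how many numbers in [0, n) fall into each residue class mod k.
    static ConcurrentHashMap<Integer, Integer> countByResidue(int n, int k) {
        ConcurrentHashMap<Integer, Integer> counts = new ConcurrentHashMap<>();
        // merge() is atomic per key, so the map synchronizes itself internally.
        IntStream.range(0, n).parallel()
                 .forEach(i -> counts.merge(i % k, 1, Integer::sum));
        return counts;
    }
}
```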
Advice for building thread-safe objects
• Do as little as possible in synchronized region: get in, get out
  – Obtain lock
  – Examine shared data
  – Transform as necessary
  – Drop the lock
• If you must do something slow, move it outside the synchronized region
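One way to apply "get in, get out" is to copy the shared data under the lock and do the slow part on the copy. This is a sketch with a hypothetical `EventLog` class, where string formatting stands in for the slow work.

```java
import java.util.ArrayList;
import java.util.List;

class EventLog {
    private final List<String> events = new ArrayList<>(); // guarded by this

    void record(String event) {
        synchronized (this) {
            events.add(event);
        }
    }

    String report() {
        List<String> snapshot;
        synchronized (this) {                     // obtain lock
            snapshot = new ArrayList<>(events);   // examine shared data
        }                                         // drop the lock
        // Slow work happens on the private copy, with no lock held.
        return String.join("\n", snapshot);
    }
}
```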
The fork-join pattern
    if (my portion of the work is small)
        do the work directly
    else
        split my work into pieces
        recursively process the pieces
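The pattern above can be sketched with `java.util.concurrent.RecursiveTask`. The example sums an array; the class name `SumTask` and the cutoff `SMALL` are assumptions for illustration, and the cutoff is not tuned.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    static final int SMALL = 10_000; // hypothetical cutoff for "small"
    private final long[] a;
    private final int lo, hi;       // this task sums a[lo..hi)

    SumTask(long[] a, int lo, int hi) {
        this.a = a;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= SMALL) {               // my portion is small: do it directly
            long s = 0;
            for (int i = lo; i < hi; i++) s += a[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;            // split my work into pieces
        SumTask left = new SumTask(a, lo, mid);
        SumTask right = new SumTask(a, mid, hi);
        left.fork();                          // process one piece asynchronously
        long r = right.compute();             // and the other in this thread
        return left.join() + r;
    }

    static long sum(long[] a) {
        return ForkJoinPool.commonPool().invoke(new SumTask(a, 0, a.length));
    }
}
```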
Today
• Concurrency in practice: In the trenches of parallelism
Concurrency at the language level
• Consider:
    Collection<Integer> collection = …;
    int sum = 0;
    for (int i : collection) {
        sum += i;
    }
• In Python:
    collection = …
    sum = 0
    for item in collection:
        sum += item
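Both loops spell out an iteration order, so as written they run sequentially. As a sketch of how the same sum can be expressed without fixing an order, Java's stream API lets the library decide how to split the work. This is an illustration rather than code from the lecture, and `ParallelSumSketch` is a hypothetical name.

```java
import java.util.Collection;

class ParallelSumSketch {
    // Says what to compute (a sum) and leaves the evaluation order to the library.
    static int sum(Collection<Integer> collection) {
        return collection.parallelStream()
                         .mapToInt(Integer::intValue)
                         .sum();
    }
}
```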
Parallel quicksort in Nesl
    function quicksort(a) =
      if (#a < 2) then a
      else
        let pivot   = a[#a/2];
            lesser  = {e in a | e < pivot};
            equal   = {e in a | e == pivot};
            greater = {e in a | e > pivot};
            result  = {quicksort(v): v in [lesser, greater]};
        in result[0] ++ equal ++ result[1];
• Operations in {} occur in parallel
• 210-esque questions: What is total work? What is span?
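For readers who do not know Nesl, here is a rough Java analogue in which each `{}` comprehension becomes a parallel stream. It is a sketch of the same structure, not a tuned sort, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelQuicksortSketch {
    static List<Integer> quicksort(List<Integer> a) {
        if (a.size() < 2) return a;
        int pivot = a.get(a.size() / 2);
        // Each filter may run in parallel, like a Nesl {e in a | ...} comprehension.
        List<Integer> lesser  = a.parallelStream().filter(e -> e < pivot).collect(Collectors.toList());
        List<Integer> equal   = a.parallelStream().filter(e -> e == pivot).collect(Collectors.toList());
        List<Integer> greater = a.parallelStream().filter(e -> e > pivot).collect(Collectors.toList());
        // The two recursive calls may also run in parallel.
        List<List<Integer>> result = Stream.of(lesser, greater).parallel()
                .map(ParallelQuicksortSketch::quicksort)
                .collect(Collectors.toList());
        List<Integer> out = new ArrayList<>(a.size());
        out.addAll(result.get(0));
        out.addAll(equal);
        out.addAll(result.get(1));
        return out;
    }
}
```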
Prefix sums (a.k.a. inclusive scan, a.k.a. scan)
• Goal: given array x[0…n-1], compute array of the sum of each prefix of x
    [sum(x[0…0]), sum(x[0…1]), sum(x[0…2]), … sum(x[0…n-1])]
• e.g., x = [13, 9, -4, 19, -6, 2, 6, 3]
  prefix sums: [13, 22, 18, 37, 31, 33, 39, 42]
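As a baseline, the definition translates directly into a sequential in-place loop. This is a minimal sketch with a hypothetical class name; it is not the course's PrefixSumsSequential.java.

```java
class SequentialPrefixSumsSketch {
    // Replaces each a[i] with a[0] + ... + a[i], using n-1 additions.
    static void prefixSums(long[] a) {
        for (int i = 1; i < a.length; i++) {
            a[i] += a[i - 1];
        }
    }
}
```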
Parallel prefix sums
• Intuition: Partial sums can be efficiently combined to form much larger partial sums. E.g., if we know sum(x[0…3]) and sum(x[4…7]), then we can easily compute sum(x[0…7])
• e.g.,x = [13, 9, -4, 19, -6, 2, 6, 3]
Parallel prefix sums algorithm, upsweep
Compute the partial sums in a more useful manner
[13, 9, -4, 19, -6, 2, 6, 3]
[13, 22, -4, 15, -6, -4, 6, 9]
[13, 22, -4, 37, -6, -4, 6, 5]
[13, 22, -4, 37, -6, -4, 6, 42]
Parallel prefix sums algorithm, downsweep
Now unwind to calculate the other sums
[13, 22, -4, 37, -6, -4, 6, 42]
[13, 22, -4, 37, -6, 33, 6, 42]
[13, 22, 18, 37, 31, 33, 39, 42]
• Recall, we started with: [13, 9, -4, 19, -6, 2, 6, 3]
Doubling array size adds two more levels
(Figure: the upsweep and downsweep levels)
Parallel prefix sums pseudocode
    prefix_sums(x):
      // Upsweep
      for d in 0 to (lg n)-1:                      // d is depth
        parallel for i in 2^d-1 to n-1, by 2^(d+1):
          x[i+2^d] = x[i] + x[i+2^d]
      // Downsweep
      for d in (lg n)-1 to 0:
        parallel for i in 2^d-1 to n-1-2^d, by 2^(d+1):
          if (i-2^d >= 0):
            x[i] = x[i] + x[i-2^d]
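Parallel prefix sums algorithm, in code
• An iterative Java-esque implementation:
    // parfor is not Java: it marks a loop whose iterations may run in parallel
    void iterativePrefixSums(long[] a) {
        int gap = 1;
        // Upsweep
        for (; gap < a.length; gap *= 2) {
            parfor (int i = gap-1; i+gap < a.length; i += 2*gap) {
                a[i+gap] = a[i] + a[i+gap];
            }
        }
        // Downsweep
        for (; gap > 0; gap /= 2) {
            parfor (int i = gap-1; i < a.length; i += 2*gap) {
                a[i] = a[i] + ((i-gap >= 0) ? a[i-gap] : 0);
            }
        }
    }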
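Since `parfor` is not a Java keyword, here is one runnable rendering of the iterative version that uses a parallel `IntStream` as a stand-in for each `parfor`. The class name is hypothetical, and the sketch assumes `a.length` is at most 2^29 so the `int` index arithmetic cannot overflow.

```java
import java.util.stream.IntStream;

class StreamPrefixSums {
    static void prefixSums(long[] a) {
        int n = a.length;
        int gap = 1;
        // Upsweep: at each level, step k handles index i = gap-1 + 2*gap*k.
        for (; gap < n; gap *= 2) {
            final int g = gap;
            IntStream.range(0, n / (2 * g)).parallel().forEach(k -> {
                int i = g - 1 + 2 * g * k;
                a[i + g] = a[i] + a[i + g];
            });
        }
        // Downsweep: same indexing, each step reads the element gap to its left.
        for (; gap > 0; gap /= 2) {
            final int g = gap;
            IntStream.range(0, (n + g) / (2 * g)).parallel().forEach(k -> {
                int i = g - 1 + 2 * g * k;
                if (i - g >= 0) {
                    a[i] = a[i] + a[i - g];
                }
            });
        }
    }
}
```

Within one level, every step writes a distinct element and reads only elements that no other step writes, so the steps are independent.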
Parallel prefix sums algorithm, in code
• A recursive Java-esque implementation:
    void recursivePrefixSums(long[] a, int gap) {
        if (2*gap - 1 >= a.length) {
            return;
        }
        // Upsweep at this level
        parfor (int i = gap-1; i+gap < a.length; i += 2*gap) {
            a[i+gap] = a[i] + a[i+gap];
        }
        recursivePrefixSums(a, gap*2);
        // Downsweep at this level
        parfor (int i = gap-1; i < a.length; i += 2*gap) {
            a[i] = a[i] + ((i-gap >= 0) ? a[i-gap] : 0);
        }
    }
Parallel prefix sums algorithm
• How good is this?
  – Work: O(n)
  – Span: O(lg n)
• See PrefixSums.java, PrefixSumsSequentialWithParallelWork.java
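A short derivation of these bounds, counting the additions in the pseudocode. It assumes n is a power of two and an idealized parallel loop whose iterations all run at once in O(1) span; both are simplifying assumptions.

```latex
% Level d of the upsweep performs n / 2^{d+1} additions.
% Level d of the downsweep performs one fewer, because the first index has no left neighbour.
\begin{align*}
W_{\text{up}}(n)   &= \sum_{d=0}^{\lg n - 1} \frac{n}{2^{d+1}} = n - 1 \\
W_{\text{down}}(n) &= \sum_{d=0}^{\lg n - 1} \left( \frac{n}{2^{d+1}} - 1 \right) = n - 1 - \lg n \\
W(n) &= W_{\text{up}}(n) + W_{\text{down}}(n) = 2n - 2 - \lg n = O(n) \\
S(n) &= \underbrace{\lg n}_{\text{upsweep levels}} + \underbrace{\lg n}_{\text{downsweep levels}} = O(\lg n)
\end{align*}
```

For the running example with n = 8 this gives 7 upsweep additions and 4 downsweep additions, 11 in total.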
Goal: parallelize the PrefixSums implementation
• Specifically, parallelize the parallelizable loops
    parfor (int i = gap-1; i+gap < a.length; i += 2*gap) {
        a[i+gap] = a[i] + a[i+gap];
    }
• Partition into multiple segments, run in different threads
    for (int i = left+gap-1; i+gap < right; i += 2*gap) {
        a[i+gap] = a[i] + a[i+gap];
    }
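A sketch of that partitioning for a single upsweep level, using plain threads. The class name, the method name, and the choice to align segment boundaries to multiples of 2*gap are assumptions made for illustration, not details taken from the course code.

```java
class SegmentedUpsweep {
    // Runs one upsweep level, splitting the array into nThreads segments.
    static void upsweepLevel(long[] a, int gap, int nThreads) {
        int stride = 2 * gap;
        int blocks = (a.length + stride - 1) / stride;      // blocks of 2*gap elements
        int perThread = (blocks + nThreads - 1) / nThreads; // blocks per segment
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            // Boundaries are multiples of 2*gap, so no pair (i, i+gap) straddles two segments.
            final int left = (int) Math.min((long) t * perThread * stride, a.length);
            final int right = (int) Math.min((long) (t + 1) * perThread * stride, a.length);
            workers[t] = new Thread(() -> {
                for (int i = left + gap - 1; i + gap < right; i += 2 * gap) {
                    a[i + gap] = a[i] + a[i + gap];
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // the level is done only when every segment is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

Creating fresh threads for every level is expensive, which is one motivation for a thread pool such as `ForkJoinPool`.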
Recall from the previous lecture: Fork/join in Java
• The java.util.concurrent.ForkJoinPool class
  – Implements ExecutorService
  – Executes java.util.concurrent.ForkJoinTask<V> or java.util.concurrent.RecursiveTask<V> or java.util.concurrent.RecursiveAction
• In a long computation:
  – Fork a thread (or more) to do some work
  – Join the thread(s) to obtain the result of the work
The RecursiveAction abstract class
    public class MyActionFoo extends RecursiveAction {
        public MyActionFoo(…) {
            store the data fields we need
        }
        @Override
        public void compute() {
            if (the task is small) {
                do the work here;
                return;
            }
            invokeAll(new MyActionFoo(…),   // smaller
                      new MyActionFoo(…),   // subtasks
                      …);                   // …
        }
    }
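To make the template concrete, here is a sketch that fills it in for prefix sums: one `RecursiveAction` per sweep level, split in half until the piece is small. It is not the course's PrefixSumsParallelForkJoin.java. The names `ForkJoinPrefixSums`, `Level`, and `THRESHOLD` are hypothetical, the cutoff is untuned, and `a.length` is assumed to be at most 2^29 so `int` arithmetic cannot overflow.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ForkJoinPrefixSums {
    static final int THRESHOLD = 1 << 12; // hypothetical cutoff for "small"

    // Steps [lo, hi) of one sweep level; step k touches index i = gap-1 + 2*gap*k.
    static final class Level extends RecursiveAction {
        final long[] a;
        final int gap, lo, hi;
        final boolean up;

        Level(long[] a, int gap, boolean up, int lo, int hi) {
            this.a = a; this.gap = gap; this.up = up; this.lo = lo; this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {            // the task is small: do the work here
                for (int k = lo; k < hi; k++) {
                    int i = gap - 1 + 2 * gap * k;
                    if (up) {
                        a[i + gap] = a[i] + a[i + gap];
                    } else if (i - gap >= 0) {
                        a[i] = a[i] + a[i - gap];
                    }
                }
                return;
            }
            int mid = (lo + hi) >>> 1;
            invokeAll(new Level(a, gap, up, lo, mid),   // smaller
                      new Level(a, gap, up, mid, hi));  // subtasks
        }
    }

    static void prefixSums(long[] a) {
        ForkJoinPool pool = ForkJoinPool.commonPool();
        int n = a.length;
        int gap = 1;
        for (; gap < n; gap *= 2) {               // upsweep, one level at a time
            pool.invoke(new Level(a, gap, true, 0, n / (2 * gap)));
        }
        for (; gap > 0; gap /= 2) {               // downsweep
            pool.invoke(new Level(a, gap, false, 0, (n + gap) / (2 * gap)));
        }
    }
}
```

Each `invoke` returns only after its whole level has finished, which gives the barrier the algorithm needs between levels.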
A ForkJoin example
• See PrefixSumsParallelForkJoin.java
• See the processor go, go go!
Parallel prefix sums algorithm
• How good is this?
  – Work: O(n)
  – Span: O(lg n)
• See PrefixSumsParallelArrays.java
• See PrefixSumsSequential.java
  – n-1 additions
  – Memory access is sequential
• For PrefixSumsSequentialWithParallelWork.java
  – About 2n useful additions, plus extra additions for the loop indexes
  – Memory access is non-sequential
• The punchline:
  – Don't roll your own. Know the libraries
  – Cache and constants matter
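In the spirit of "know the libraries", the JDK already ships a parallel prefix operation, `java.util.Arrays.parallelPrefix`. The sketch below shows the call. Whether PrefixSumsParallelArrays.java uses exactly this method is an assumption based on its name, and `LibraryPrefixSums` is a hypothetical class.

```java
import java.util.Arrays;

class LibraryPrefixSums {
    // Returns the inclusive prefix sums of x, leaving x itself unchanged.
    static long[] prefixSums(long[] x) {
        long[] a = x.clone();
        Arrays.parallelPrefix(a, Long::sum); // cumulates in place, in parallel
        return a;
    }
}
```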