Parallelizing R with the Snow Library
Advanced Research Computing
September 22, 2015
Outline
• Introduction
• Snow Basics
• Examples
• Conclusions
Introduction
R
• Programming language and environment for statistical computing
• Free
• Intrinsic support for a wide array of statistical functionality
• Huge number of user-created packages to add or improve functionality
An Aside: Optimizing R
• Pre-allocate variables
• Vectorize (or perhaps apply functions)
  – Yes: z = x * y
  – No:
    for (i in 1:length(x)) { z[i] = x[i] * y[i] }
• Reference: The R Inferno, http://www.burns-stat.com/documents/books/the-r-inferno/
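The pre-allocation point deserves a small illustration. A sketch (the function names here are illustrative, not from the slides) that fills a vector of known length both ways:

```r
n <- 1e5

# Slow: z grows on every iteration, so R repeatedly reallocates and copies it
grow <- function(n) {
  z <- c()
  for (i in 1:n) z[i] <- i^2
  z
}

# Fast: z is allocated once at its final length and filled in place
prealloc <- function(n) {
  z <- numeric(n)
  for (i in 1:n) z[i] <- i^2
  z
}

identical(grow(n), prealloc(n))  # same result; the second is much faster
```

Wrapping each call in system.time() shows the gap widening as n grows.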
An Aside: Optimizing R (continued)
• Many R operations use Basic Linear Algebra Subroutines (BLAS)
• Build R with optimized BLAS → optimized R
[Figure: Run Time (s) for the R 2.5 Benchmark by Build Type, gcc and Intel builds, standard vs. optimized BLAS]
The Need for Parallelism
Snow
Snow Basics
• Simple Network of Workstations (SNOW)
• For embarrassingly parallel tasks
• Master/Slave model:
  $ ps -u jkrometi -o cmd | grep R
  R -f time_mh.r --restore --no-save
  R --slave <etc>
  R --slave <etc>
Snow: Start/Stop Cluster
• Load libraries:
  library(snow)
  library(Rmpi)
• Start a cluster with ncores cores:
  cl <- makeCluster(ncores, type = 'MPI')
• Initialize the random number generator:
  clusterSetupRNG(cl, type = 'RNGstream')
• Stop the cluster (important):
  stopCluster(cl)
Snow: Computing
• Call the same function across the cluster (ncores times):
  clusterCall(cl, fun, ...)
• Parallel versions of apply:
  clusterApply(cl, x, fun, ...)
  parApply(cl, X, MARGIN, FUN, ...)
  parLapply(cl, x, fun, ...)
  parRapply(cl, x, fun, ...)
  parCapply(cl, x, fun, ...)
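A minimal runnable sketch of two of these calls, using the base parallel package (which absorbed snow's interface, so the signatures match) with a local socket cluster standing in for MPI:

```r
library(parallel)

cl <- makeCluster(2)  # two local workers; stands in for makeCluster(ncores, type = 'MPI')

# parLapply: split the list 1:4 across the workers and apply the function
squares <- parLapply(cl, 1:4, function(i) i^2)

# clusterCall: run the same call once on every worker (here: report each worker's PID)
pids <- clusterCall(cl, Sys.getpid)

stopCluster(cl)

unlist(squares)  # 1 4 9 16
```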
Examples
Monte Carlo: Calculating π
• The ratio of the area of the unit circle to the area of the unit square is π/4
• So:
  – Randomly pick S points in the unit square
  – Count the number in the unit circle (C)
  – Then π ≈ 4C/S
MC π: Code
mcpi <- function(n.pts) {
  #generate n.pts (x,y) points in the unit square
  m = matrix(runif(2*n.pts), n.pts, 2)
  #determine if they are in the unit circle
  in.ucir = function(x) { as.integer((x[1]^2 + x[2]^2) <= 1) }
  cir = apply(m, 1, in.ucir)
  #return the proportion of points in the unit circle * 4
  return(4*mean(cir))
}
MC π: Parallelize
#start up and initialize the cluster
cl <- makeCluster(ncores, type = 'MPI')
clusterSetupRNG(cl, type = 'RNGstream')
#determine if points are in the unit circle
cir = parRapply(cl, m, in.ucir)
#calculate pi
pi.approx = 4*mean(cir)
#stop the cluster
stopCluster(cl)
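The snippet above assumes m and in.ucir already exist; a self-contained version, using the base parallel package (socket cluster) as a stand-in for the Rmpi/MPI setup so it runs anywhere, could look like this:

```r
library(parallel)

ncores <- 2
n.pts  <- 200000  # points per worker

cl <- makeCluster(ncores)            # stands in for makeCluster(ncores, type = 'MPI')
clusterSetRNGStream(cl, iseed = 42)  # parallel's analog of clusterSetupRNG()

# each worker generates its own points and returns its proportion in the unit circle
props <- clusterCall(cl, function(n) {
  m <- matrix(runif(2 * n), n, 2)
  mean(m[, 1]^2 + m[, 2]^2 <= 1)
}, n.pts)

pi.approx <- 4 * mean(unlist(props))
stopCluster(cl)

pi.approx  # close to 3.14159
```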
MC π: An Optimization Example
> n.pts <- 500000
> m = matrix(runif(2*n.pts), n.pts, 2)
> in.ucir <- function(x) { as.integer((x[1]^2 + x[2]^2) <= 1) }
> system.time( apply(m, 1, in.ucir) )
   user  system elapsed
  5.037   0.025   5.069
> system.time( as.integer(m[,1]^2 + m[,2]^2 <= 1) )
   user  system elapsed
   0.02    0.00    0.02
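Applying that timing result back to mcpi() gives a vectorized variant with the same logic but no per-row apply() call:

```r
mcpi.vec <- function(n.pts) {
  # generate n.pts (x,y) points in the unit square
  m <- matrix(runif(2 * n.pts), n.pts, 2)
  # vectorized membership test replaces apply(m, 1, in.ucir)
  4 * mean(m[, 1]^2 + m[, 2]^2 <= 1)
}

mcpi.vec(500000)  # close to 3.14159
```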
MCMC: Metropolis-Hastings
• Goal: Draw random samples with probability density approximating a given distribution
• Used to model stochastic inputs
• Do not need to know the normalizing factor
  – Useful for functions in high dimensions
MCMC: Metropolis-Hastings
• Given a:
  – Target distribution
  – Jumping distribution
  – Initial sample
• Choose a candidate sample from the jumping distribution centered at the initial sample
• Accept the candidate as the new sample:
  – Always, if the candidate is a better fit (per the target distribution)
  – With probability < 1 if the candidate is a worse fit
• Repeat with the new sample as the initial sample
M-H: Code (Markov Chain Part)

#function to calculate the next sample
theta.update <- function(theta.cur) {
  #candidate sample
  theta.can <- jump(theta.cur)
  #acceptance probability
  accept.prob <- samp(theta.can)/samp(theta.cur)
  #compare with a sample from the uniform dist (0 to 1)
  if (runif(1) <= accept.prob) theta.can else theta.cur
}
Reference: Lam, Patrick. "MCMC Methods: Gibbs Sampling and the Metropolis-Hastings Algorithm."
Metropolis-Hastings: Code

#function to generate (n.sims - burnin) samples
mh <- function(n.sims, start, burnin, samp, jump) {
  theta.cur <- start
  draws <- c()
  #call theta.update() n.sims times
  for (i in 1:n.sims) {
    draws[i] <- theta.cur <- theta.update(theta.cur)
  }
  #return the samples after the burn-in
  return( draws[(burnin + 1):n.sims] )
}
Metropolis-Hastings: Parallelize

#start up and initialize the cluster
cl <- makeCluster(ncores, type = 'MPI')
clusterSetupRNG(cl, type = 'RNGstream')
#samples per core
mh.n.sims.cl <- ceiling(mh.n.sims / ncores)
#call mh on each core
mh.draws.cl <- clusterCall(cl, mh, mh.n.sims.cl, start = 1,
                           burnin = mh.burnin, samp = samp.fcn,
                           jump = jump.fcn)
#reduce the list to 1-D
mh.draws <- unlist(mh.draws.cl)
#stop the cluster
stopCluster(cl)
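The split-and-combine pattern above can be exercised with the base parallel package and a toy stand-in for mh() (the real samp.fcn and jump.fcn would be passed the same way). Note that each worker produces an independent chain, so unlist() concatenates ncores separate chains rather than extending one long chain:

```r
library(parallel)

ncores    <- 2
mh.n.sims <- 1000
mh.burnin <- 100

cl <- makeCluster(ncores)
clusterSetRNGStream(cl, iseed = 1)

# samples per core
mh.n.sims.cl <- ceiling(mh.n.sims / ncores)

# toy stand-in for mh(): returns (n.sims - burnin) draws, like the real function
toy.mh <- function(n.sims, burnin) rnorm(n.sims - burnin)

mh.draws.cl <- clusterCall(cl, toy.mh, mh.n.sims.cl, mh.burnin)
mh.draws    <- unlist(mh.draws.cl)  # two independent chains, concatenated

stopCluster(cl)

length(mh.draws)  # 2 * (500 - 100) = 800
```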
Conclusions
R on ARC’s Systems
• R 3.0.3, 2.14.1:
  – Each R build comes with Rmpi and Snow
    module load intel R openmpi
• R 3.2.0:
  – Built with rlecuyer and ggplot2
  – Plotting via cairo (offline), X11 (interactive)
  – Parallel packages (Rmpi, snow, snowfall, pbdR) built into the R-parallel module
    module load intel R/3.2.0 openmpi hdf5 netcdf R-parallel/3.2.0
Getting Started on ARC Systems
• Request an account (anyone with a VT PID): http://www.arc.vt.edu/account
  – Can also request accounts for external collaborators
• Request a system unit allocation: http://www.arc.vt.edu/allocations
References
• Snow Manual: http://cran.r-project.org/web/packages/snow/snow.pdf
• Snow Functions: http://www.sfu.ca/~sblay/R/snow.html
• ARC’s R page: http://www.arc.vt.edu/r
• Course Slides: http://www.arc.vt.edu/?class_note=parallel-r-i-snow
Questions?