The useR! Conference
• June 12-15, 2012
• Nashville, TN
• 482 attendees
• Very open environment
• Presenters: probably no one rejected
The useR! Attendees
• Lots of academia
• Mostly advanced degrees
• They are really into graphs, but call them plots
The useR! Conference: Reproducible Results
• The cautionary tale
• Potti et al. of Duke, 2006
• Biogenetics for cancer treatment
• Kevin Coombes's team discovered:
  • Off-by-one error
  • Willful data manipulations and fraud
• National news coverage
• Fallout: people died, lawsuits, Potti
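To make the off-by-one bullet concrete, here is a minimal Python sketch of how a one-position shift between a score column and its labels silently changes which item looks most important. It is a generic illustration only, not a reconstruction of the analysis discussed in the talk; the gene names, scores, and the `attach_labels` helper are all hypothetical.

```python
def attach_labels(scores, labels, offset=0):
    """Pair each score with a label; a nonzero offset simulates a shifted index
    (for example, a header row counted as data)."""
    paired = {}
    for i, score in enumerate(scores):
        j = i + offset
        if 0 <= j < len(labels):  # rows pushed past the end are silently dropped
            paired[labels[j]] = score
    return paired

# Hypothetical data, for illustration only.
labels = ["gene_A", "gene_B", "gene_C", "gene_D"]
scores = [0.91, 0.12, 0.85, 0.07]

correct = attach_labels(scores, labels)
shifted = attach_labels(scores, labels, offset=1)

top_correct = max(correct, key=correct.get)
top_shifted = max(shifted, key=shifted.get)
print(top_correct, top_shifted)
```

No error is raised in the shifted case: every score still has a plausible label, which is why this kind of mistake is only caught when someone reruns the analysis from the raw data.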
The useR! Conference: Moving your org to R
• Robert Muenchen
• Manager, University of Tennessee
• Office of Information Technology, Research Computing Support
• In-house
• Do what users know
• Stop licensing little-used SAS modules like ETS, QC
• Replace with R packages
• Freeze new development in SAS
The useR! R and Big Data: PL/R in EMC Greenplum
• PL/R is a PostgreSQL language extension that allows you to write PostgreSQL functions and aggregate functions in the R statistical computing language.
• Use PostgreSQL to query Greenplum data.
• Parallel processing
• Install R on every Greenplum node
• Know how data is split
• Define and query on the splits
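The "know how data is split, then query on the splits" idea can be sketched outside the database. The Python below is only an analogy under stated assumptions: it is not PL/R or Greenplum code, and the modulo-based `split_by_key` scheme, the function names, and the sample rows are all hypothetical. Each "segment" computes partial sums and counts on its own rows, and a final step combines them.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def split_by_key(rows, n_segments):
    """Assign each (key, value) row to a segment by its integer key (assumed scheme)."""
    segments = [[] for _ in range(n_segments)]
    for key, value in rows:
        segments[key % n_segments].append((key, value))
    return segments

def partial_stats(segment):
    """Work done locally on one segment: per-key sum and count."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in segment:
        acc[key][0] += value
        acc[key][1] += 1
    return {k: (s, n) for k, (s, n) in acc.items()}

def combine(partials):
    """Merge the per-segment partial results into per-key means."""
    totals = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for key, (s, n) in part.items():
            totals[key][0] += s
            totals[key][1] += n
    return {k: s / n for k, (s, n) in totals.items()}

def parallel_mean(rows, n_segments=4):
    segments = split_by_key(rows, n_segments)
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        partials = list(pool.map(partial_stats, segments))
    return combine(partials)

rows = [(1, 2.0), (2, 4.0), (1, 4.0), (3, 6.0), (2, 8.0)]
means = parallel_mean(rows, n_segments=2)
print(means)
```

Because rows are split on the same key the query groups by, every key lives on exactly one segment and the combine step never has to reconcile a key across segments. Grouping on a different column would force that reconciliation, which is why knowing the split matters.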
The useR! R and Big Data: in-database vendor solutions
• Oracle
• Netezza
• EMC Greenplum
• PL/R: explicit parallel
• MADlib: implicit parallel
• GPHD: Greenplum + Hadoop
The useR! Conclusion
Knitr /= Sweave
HiveR /= RHive
Eclipse/RStudio
PL/R to PostgreSQL
PostgreSQL to EMC Greenplum
GPHD
R/R/R/R/R/R/R/R!