
R as supporting tool for analytics and simulation

Date post: 15-Jan-2017
Upload: alvaro-gil
R as Supporting Tool for Analytics and Simulation Alvaro Gil Simulation & Optimization Consultant http://agiltools.com June 2016
Transcript
Page 1: R as supporting tool for analytics and simulation

R as Supporting Tool for Analytics and Simulation

Alvaro Gil

Simulation & Optimization Consultant

http://agiltools.com

June 2016

Page 2: R as supporting tool for analytics and simulation

Agenda

Introduction

What is R? Why use it?

What to Install

Example of Companies Using R

Some Facts About R

Interesting R Applications

R: The Generic Scripting Language

R + AnyLogic

Interfacing Programming Languages from R

R and IoT

Useful Links

Page 3: R as supporting tool for analytics and simulation

Introduction

Pre- and post-processing of information is a necessary step in modeling and simulation.

Information processing is part of the Analytics field.

◦ Analytics is a discipline that combines descriptive, predictive and prescriptive techniques on all types of data (INFORMS).
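The three kinds of technique can be illustrated with a minimal base R sketch. The built-in `mtcars` dataset, the target of 25 mpg and the weight range searched are assumptions chosen only for illustration; they are not part of the original slides.

```r
# Descriptive: summarise what the data already shows.
descriptive <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Predictive: fit a model and estimate an unseen case.
fit <- lm(mpg ~ wt, data = mtcars)
predicted_mpg <- predict(fit, newdata = data.frame(wt = 3))

# Prescriptive: search for the decision that best meets an objective.
# Hypothetical objective: the car weight whose predicted mpg is closest to 25.
target_mpg <- 25
gap <- function(wt) {
  (predict(fit, newdata = data.frame(wt = wt)) - target_mpg)^2
}
best <- optimize(gap, interval = c(1.5, 5.5))

print(descriptive)
print(predicted_mpg)
print(best$minimum)
```

The prescriptive step here is a one-dimensional search with `optimize`; real prescriptive models usually involve constraints and a dedicated optimization package.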

Applying Analytics requires special skills as well as knowledge of specialized software (SPSS, SAS, R, Python, JMP, Stata, etc.).

Several specialists are promoting the use of R as the standard language for data analysis (reasons to come in the following slides)

This presentation is an overview of R and what we expect to achieve with it.

Page 4: R as supporting tool for analytics and simulation

What is R? Why use it?

R is a high-level, matrix-oriented programming language for statistics and data analysis.

It runs on multiple platforms including Windows, MacOS and Linux.

R is an interpreted language, meaning that the user gets an immediate response to each command without a separate compilation step.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible.

Free and open source

R’s main selling point is the large number of packages, which let you perform almost any statistical procedure in a single command.

◦ There are more than 8,000 packages available on CRAN, all of which must pass CRAN's automated checks before publication.
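As a rough illustration of the "single command" point, the sketch below runs three common procedures with functions from the `stats` package that ships with R. The built-in datasets (`sleep`, `cars`, `iris`) and the choice of three clusters are assumptions made for the example.

```r
# Two-sample t-test in one call.
test_result <- t.test(extra ~ group, data = sleep)

# Linear regression in one call.
model <- lm(dist ~ speed, data = cars)

# k-means clustering in one call (seed fixed so the run is repeatable).
set.seed(1)
clusters <- kmeans(iris[, 1:4], centers = 3)

print(test_result$p.value)
print(coef(model))
print(clusters$size)
```

Because R is interpreted, each of these lines can also be typed at the console and inspected immediately.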

R is great for performing analysis on a dataset, and presenting findings in a static set of graphics

R is well suited to distributed, automated data analysis workflows.

Page 5: R as supporting tool for analytics and simulation

R Video

Page 6: R as supporting tool for analytics and simulation

What to Install?

R language

◦ CRAN R, or
◦ Microsoft R Open (MRO)

An IDE

◦ RStudio
◦ Red-R
◦ Rattle
◦ Emacs + Emacs Speaks Statistics (ESS)
◦ Eclipse (StatET)
◦ Visual Studio

A full set of packages

Page 7: R as supporting tool for analytics and simulation

Microsoft R

MRO: Microsoft R Open (personal version link)

MRS: Microsoft R Server (professional version link)

An enhanced distribution of R from Microsoft Corporation. It includes the R language plus additional capabilities for improved performance, reproducibility and platform support.

◦ The installation includes all base and recommended R packages plus a set of specialized packages released by Microsoft Corporation to further enhance the Microsoft R Open experience.
◦ Multi-threaded math libraries (Math Kernel Library, MKL).
◦ A high-performance default CRAN repository that provides a consistent and static set of packages to all Microsoft R Open users.
◦ The checkpoint package, which makes it easy to share R code and replicate results using specific R package versions.
◦ Platforms: Windows, Mac OS X and Linux.
◦ MRS also includes specialized packages for big data.

Visit https://mran.microsoft.com/open/ for more info

Page 8: R as supporting tool for analytics and simulation

R Studio

Visit https://www.rstudio.com/ for more info

Page 9: R as supporting tool for analytics and simulation

R Packages

More than 8,000 available packages

source('http://agiltools.com/R/rp.R')
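Independently of the script above, a common pattern for working with packages is to install one only when it is missing. The sketch below is a minimal version of that pattern; the helper name `use_package` and the mirror URL are assumptions introduced for illustration.

```r
# Hypothetical helper: install a package only if it is not already available,
# then report whether it can be loaded.
use_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call never triggers a download.
available <- use_package("stats")
print(available)
```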

Page 10: R as supporting tool for analytics and simulation

Examples of Companies Using R

Sources:

http://www.revolutionanalytics.com/companies-using-r

http://www.r-bloggers.com/airbnb-uses-r-to-scale-data-science/

http://data-informed.com/companies-use-r-compete-data-driven-world/

http://www.r-bloggers.com/companies-using-open-source-r-in-2013/

Page 11: R as supporting tool for analytics and simulation

Some Facts About R

R is the highest paid IT skill (Dice.com survey, January 2014)

R is the most-used data science language after SQL (O'Reilly survey, January 2014)

R is used as an analytics tool by 75% of professionals (Rexer survey, October 2015)

R is #13 of all programming languages (RedMonk language rankings, June 2015)

R is growing faster than any other data science language (KDNuggets survey, August 2014)

R is the #1 Google Search for Advanced Analytics software (Google Trends April 2016)

R has more than 2 million users worldwide (Oracle estimate, February 2012)

Page 12: R as supporting tool for analytics and simulation

Interesting R Applications

Complete libraries specialized by topic, e.g.:

◦ Econometrics
◦ Finance (e.g. actuar, fPortfolio, financial, etc.)
◦ Machine Learning (e.g. nnet, neuralnet, RSNNS, deepnet, darch, h2o, etc.)
◦ Optimization (e.g. Rquadprog, optmix, etc.)
◦ Simulation (e.g. simmer)
◦ Social Sciences
◦ Spatial (e.g. maps)
◦ See more at https://cran.r-project.org/web/views/

Markdown (R-Studio)

Shiny (R-Studio)

Big Data (e.g. bigmemory, ff, RevoScaleR)

Page 13: R as supporting tool for analytics and simulation

Interesting R Applications: Markdown

Markdown is a text-to-HTML conversion tool for reporting.

It allows users to share and/or present their work.

External examples:

◦ 1 (pdf): https://github.com/yihui/knitr/releases/download/doc/knitr-minimal.pdf
◦ 2 (html): https://rawgit.com/yihui/knitr-examples/master/003-minimal.html
◦ 3 (knitr + googleVis): https://cran.r-project.org/web/packages/googleVis/vignettes/Using_googleVis_with_knitr.html
◦ 4 (with Shiny): https://cpsievert.shinyapps.io/animintRmarkdown/
◦ 5 (combined with JavaScript): http://www.nytimes.com/interactive/2014/01/23/business/case-shiller-slider.html?_r=0

Page 14: R as supporting tool for analytics and simulation

Interesting R Applications: Shiny

Web application framework for R.

Interactive visualization tool based on JavaScript libraries like d3, Leaflet and Google Charts.

This reporting tool runs on all types of devices.

It can be connected to R to perform any kind of data analysis in real time (data mining, optimization, etc.).

See some examples at:

◦ Shiny User Showcase
◦ Shiny + JavaScript (https://frissdemo.shinyapps.io/FrissDashboard/)

Shiny can be hosted on dedicated servers to add security and increase performance.

Shiny is available at Predix through cf-buildpack-r (check link)
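For readers who have not seen a Shiny application, the following is a minimal sketch of one: a slider that controls how many random values are drawn into a histogram. It assumes the shiny package is installed (the code skips building the app otherwise); the input and output names (`n`, `hist`) are arbitrary choices for this example.

```r
app <- NULL

if (requireNamespace("shiny", quietly = TRUE)) {
  # User interface: one slider and one plot area.
  ui <- shiny::fluidPage(
    shiny::titlePanel("Random sample histogram"),
    shiny::sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
    shiny::plotOutput("hist")
  )

  # Server logic: the plot is redrawn whenever the slider changes.
  server <- function(input, output) {
    output$hist <- shiny::renderPlot({
      hist(rnorm(input$n), main = "Standard normal sample", xlab = "Value")
    })
  }

  app <- shiny::shinyApp(ui = ui, server = server)

  # Launch the app only when running interactively.
  if (interactive()) shiny::runApp(app)
}
```

Any R computation can replace the `rnorm` call inside `renderPlot`, which is how a Shiny front end is tied to an analysis running in R.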

Page 15: R as supporting tool for analytics and simulation

Interesting R Applications: Big Data

Specialized libraries to manipulate big data

◦ bigmemory + biganalytics (article)
◦ ff + ffbase (article)

R has proven very effective at manipulating millions of rows in a short time (e.g. less than 30 seconds to fit a linear regression on a sample of 10M rows).

Machine learning algorithms with millions of rows can run in seconds with the right libraries and configuration

MRS implements RevoScaleR to manipulate big data and handle parallelism
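The kind of regression timing mentioned above can be reproduced in a simplified form with base R alone. This sketch uses simulated data and plain `lm` rather than bigmemory, ff or RevoScaleR, and uses 1 million rows instead of 10 million to keep the run short; the coefficients and variable names are assumptions for illustration. Actual timings depend on hardware and the math libraries in use.

```r
set.seed(42)
n <- 1e6  # assumption: smaller than the 10M rows cited, to keep the sketch quick

# Simulated data with known coefficients (2, 3 and -1.5).
x1 <- rnorm(n)
x2 <- runif(n)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n)
big <- data.frame(y = y, x1 = x1, x2 = x2)

# Time the model fit.
elapsed <- system.time(fit <- lm(y ~ x1 + x2, data = big))

print(elapsed["elapsed"])
print(coef(fit))
```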

Page 16: R as supporting tool for analytics and simulation

R: The Generic Scripting Language

Given the popularity and versatility of R, many companies are adapting their services to be compatible with R

Oracle, Microsoft, GE among others

Since 2016 SQL Server has the ability to run R scripts directly in database using SQL Server R Services. This means the R code will run directly on the server, as opposed to first extracting the data to a local R session.

In the words of Joseph Sirosh, corporate VP at Microsoft Data Group, “[Microsoft R Server enables] enterprise customers to standardize advanced analytics on one core tool, regardless of whether they are using Hadoop (Hortonworks, Cloudera and MapR), Linux (Red Hat and SUSE) or Teradata. [We are committed to] building R and Revolution’s technology into our broader database, big data and business intelligence offerings and to bring these benefits to customers and students – on-premises, in the Azure cloud and to new platforms.”

Forbes January 2016 https://t.co/AJicDBqv47

Page 17: R as supporting tool for analytics and simulation

R: The Generic Scripting Language

R and Azure

Microsoft is adapting services like Azure to include R as the scripting language for data analysis

Page 18: R as supporting tool for analytics and simulation

Calling R from AnyLogic

AnyLogic can work with R by using the Java library RCaller.

RCaller is a software library developed to simplify calling R from Java (see link)

It wraps and simplifies type conversions and makes variables in each language accessible from the other

Multiple R processes can be created and handled by multiple RCaller instances in Java

Page 19: R as supporting tool for analytics and simulation

Example

Watch demo video

More info: http://agiltools.com/blogsp/anylogic_r_qchart/

Page 20: R as supporting tool for analytics and simulation

Interfacing Programming Languages from R

The R environment can interface with other programming languages, such as Fortran, C and Java.

Examples of interfaces with C and Java can be found in:

C: http://adv-r.had.co.nz/C-interface.html

Java: http://rforge.net/rJava/

Page 21: R as supporting tool for analytics and simulation

R and IoT

R can be executed inside Internet of Things (IoT) platforms like Bluemix, Amazon Web Services, Azure and Predix

Libraries like cf-buildpack-r allow users to execute R scripts on Cloud Foundry-based platforms and even embed Shiny applications.

On Microsoft platforms, R scripting is already embedded in Azure

