Introduction to R

Data Mining & Matrices, Lecture 2

Saskia Metzler, 28 April 2015

DMM, summer 2015

Agenda

Part 1: Why R?

Part 2: Learn basic tasks in R

Get R now from http://www.r-project.org/ if you haven't done so yet!

Why R?

Why learn another language?

Why learn R?

R is ...

... good for statistical programming and data analysis tasks

... used in companies like Google, Bank of America, Shell

... free

... available for Linux, Windows, OS X

R is ...

“R is really important to the point that it’s hard to overvalue it,” said Daryl Pregibon, a research scientist at Google, which uses the software widely. “It allows statisticians to do very intricate and complicated analyses without knowing the blood and guts of computing systems.”

New York Times, 2009: http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=0

R is ...

... good at reading and writing data

... equipped with built-in tools for statistics and plotting

... designed for vectorized thinking
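
As a small illustration of vectorized thinking (a sketch that is not part of the original slides; the data and variable names are made up), arithmetic and comparisons in R apply to whole vectors at once, so no explicit loop is needed:

```r
# Hypothetical measurements; the names are illustrative only.
heights_cm <- c(150, 160, 170, 180)

# Arithmetic applies element by element to the whole vector.
heights_m <- heights_cm / 100

# Comparisons are vectorized too and yield a logical vector,
# which can be used to select elements.
tall <- heights_cm[heights_cm > 165]

# Summary functions take the whole vector as input.
avg <- mean(heights_cm)

print(heights_m)
print(tall)
print(avg)
```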

R in Comparison

C(++), Java, Python, Matlab, Excel

Take the right tool for each task.

R vs. Excel

Importing data is easy, and statistics and plotting are supported.

But did you ever try to do the same analysis on various data sets?
- on big data sets?
- with different sizes?
- on many of them?

Excel is designed for accounting.
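
To make the point about repeating an analysis concrete, here is a minimal sketch in R (not from the original slides). The function name column_means is hypothetical, and the three data sets are ones that ship with R; the same function runs unchanged on each of them, regardless of size:

```r
# A hypothetical analysis step: the mean of every numeric column.
column_means <- function(df) {
  numeric_cols <- df[sapply(df, is.numeric)]
  sapply(numeric_cols, mean)
}

# Three data sets that ship with R, of different sizes.
data_sets <- list(cars = cars, women = women, faithful = faithful)

# Run the same analysis on each of them.
results <- lapply(data_sets, column_means)
print(results)
```

Adding another data set only means adding one entry to the list; the analysis code itself stays the same.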

R vs. Java or C(++)

Java (or C, C++, ...) is capable but requires considerable programming overhead for
- reading data
- plotting data
- manipulating matrix/table data
- reformatting data to use different libraries
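
For comparison, the first three tasks in the list above can be sketched in a few lines of R. This is an illustrative example, not part of the original slides; the data, file names, and variable names are made up, and the files are temporary so the example is self-contained:

```r
# Write a tiny made-up table to a temporary CSV file.
csv_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:5, y = c(2, 4, 6, 8, 10)), csv_file,
          row.names = FALSE)

# Reading data: one call returns a data frame.
d <- read.csv(csv_file)

# Manipulating matrix/table data: convert, transpose, multiply.
m <- as.matrix(d)
gram <- t(m) %*% m   # 2 x 2 matrix of column inner products

# Plotting data: one call; here the plot goes to a temporary PDF.
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
plot(d$x, d$y)
invisible(dev.off())

print(gram)
```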

R vs. Matlab

Conceptually similar.

R is free, Matlab is not.

R is developed for statistics, Matlab for matrices.

R vs. Python

Libraries make prototyping easy. But these libraries are not part of the core language, and each might require differently formatted input.

Python doesn't come with the concept of vectors built in.

Let's learn some R

Topics

1. Syntax basics & getting help
2. Vectors
3. Sequence generation
4. Matrices
5. Data frames
6. Reading & writing data
7. Plotting
8. Saving scripts
9. Function definitions

Useful Links

What you also get from help.start(): https://stat.ethz.ch/R-manual/R-patched/doc/html/

"Quick R": http://www.statmethods.net/

FAQ for very basic things: http://www.ats.ucla.edu/stat/r/faq/R_basics.htm

More tutorials: http://www.ats.ucla.edu/stat/r/

