About the Talk Basics of R Modeling The Package System Some Practical Issues Graphics Programming
An Introduction to the R Environment
Peter Dalgaard
Department of Biostatistics, University of Copenhagen
Center for Bioinformatics, Univ. Copenhagen, June 2005
Outline
About the Talk
Basics of R
Modeling
The Package System
Some Practical Issues
Graphics
Programming
Practicalities
• Short tutorial (approx. 2hr)
• High coverage, not great depth
• Little time for interaction, but room should be made for clarification.
• Short break (5–10 minutes) in the middle.
Plan
• Elementary things about R, simple interactive demo
• Modeling tools
• R packages
• Dealing with the R workspace
• Graphics in R
• Ad-hoc programming
• Not covering: Installation, Data summaries, GUIs, Computing on the language, C/Fortran interface, Advanced analysis, ...
The R environment
• Built around the programming language R, an Open Source dialect of the S language
• R is Free Software and runs on a variety of platforms (I’ll be using Linux here, mainly to avoid technical surprises).
• Command-line execution based on function calls
• Extensible with user functions
• Workspace containing data and functions
• Various graphics devices (interactive and non-interactive)
The basic vector types
• Numeric (integer/double)
• Character (strings)
• Logical
• Factor (really integer codes + a levels attribute)
• Lists (generic vectors)
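A minimal sketch of the types listed above. All object names (num, int, chr, lgl, fac, lst) are made up for illustration.

```r
# One small example of each basic vector type
num <- c(1.5, 2, 3.25)            # numeric, stored as double
int <- 1:3                        # numeric, stored as integer
chr <- c("a", "b", "c")           # character
lgl <- c(TRUE, FALSE, NA)         # logical (NA marks a missing value)
fac <- factor(c("low", "high", "low"), levels = c("low", "high"))
lst <- list(id = 1L, name = "x", values = num)  # elements may differ in type

typeof(num)         # "double"
typeof(int)         # "integer"
typeof(fac)         # "integer": the codes stored inside the factor
levels(fac)         # "low" "high"
unclass(fac)        # the integer codes together with the levels attribute
sapply(lst, class)  # class of each list element
```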
Basic operations
• Standard arithmetic (x + y, etc.)
• Recycling: when operating on two vectors of different lengths, the shorter one is replicated (with a warning if the longer length is not a whole multiple of the shorter)
• c — concatenate
• seq — sequences
• rep — replication
• sum, mean, range, ...
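A short sketch of recycling and the helper functions named above. The vectors x and y are arbitrary illustration values; the warning is captured with tryCatch only so that its text can be inspected.

```r
x <- 1:6
y <- c(10, 20)

x + y    # y is recycled three times: 11 22 13 24 15 26
x * 2    # a length-one vector is recycled silently

# 6 is not a multiple of 4: R still recycles, but issues a warning
w <- tryCatch(x + 1:4, warning = function(w) conditionMessage(w))
w

c(x, y)                        # concatenate
seq(0, 1, by = 0.25)           # 0.00 0.25 0.50 0.75 1.00
rep(c("a", "b"), times = 2)    # "a" "b" "a" "b"
rep(c("a", "b"), each = 2)     # "a" "a" "b" "b"
range(x)                       # 1 6
```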
Demo 1
x <- round(rnorm(10,mean=20,sd=5)) # simulate data
x
mean(x)
m <- mean(x)
x - m # notice recycling
(x - m)^2
sum((x - m)^2)
sqrt(sum((x - m)^2)/9)
sd(x)
Smart indexing
• a[5] single element
• a[5:7] several elements
• a[-6] all except the 6th
• a[b>200] logical index
• a["name"] by name
• a$b list elements
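The indexing forms above, tried on made-up data. The vector a, its names, the companion vector b, and the list l are all hypothetical.

```r
a <- c(first = 110, second = 250, third = 90, fourth = 310,
       fifth = 205, sixth = 180, seventh = 400)
b <- a * 1.1          # a second vector of the same length

a[5]                  # single element
a[5:7]                # several elements
a[-6]                 # everything except the 6th
a[b > 200]            # elements of a where the condition on b is TRUE
a["third"]            # by name

l <- list(b = 1:3, label = "demo")
l$b                   # list element by name
l[["b"]]              # the same element
l["b"]                # a list of length one, not the element itself
```

The last two lines show the distinction that most often trips people up: [[ ]] and $ extract an element, whereas [ ] returns a sub-list.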
Matrices/tables/arrays
• Used in matrix calculus and as input to, e.g., chisq.test(). Also what you get as the result of tabulation.
• Vectors with dimensions
• Dimnames
• Matrices: Generate with matrix
• Indexing methods include [i,j], [i,], [,j]
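A small sketch of the points above, with invented values. The matrix m and the vectors smoker and sex are illustration data only.

```r
# A matrix is a vector with dimensions; matrix() fills column by column
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))
dim(m)         # 2 3
m[2, 3]        # single entry
m[1, ]         # first row, returned as a named vector
m[, "b"]       # a column, selected through its dimname
m %*% t(m)     # matrix product (2 x 2)

# Tabulation produces a table, i.e. an array of counts with dimnames
smoker <- c("no", "yes", "no", "yes", "no", "no")
sex    <- c("f", "f", "m", "m", "f", "m")
tab <- table(smoker, sex)
tab
dim(tab)
```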
Data frames
• Like data sets in other statistical packages
• Technically: lists of vectors/factors of the same length
• Row names (must be unique)
• Indexed like matrices (beware, though: data frames are not matrices)
• Generate from a read operation or with data.frame
• Many sample data frames are available using data()
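A minimal sketch of a data frame built by hand. The data frame d and its columns are invented for illustration.

```r
d <- data.frame(id     = 1:4,
                weight = c(61.5, 72.0, 58.3, 80.1),
                group  = factor(c("a", "b", "a", "b")),
                row.names = c("s1", "s2", "s3", "s4"))
str(d)

is.list(d)              # TRUE: a data frame is a list of columns
d$weight                # list-style access to one column
d[d$group == "a", ]     # matrix-style row selection
d["s3", "weight"]       # by row name and column name

is.matrix(d)            # FALSE: columns may have different types
as.matrix(d[, c("id", "weight")])   # explicit conversion when a matrix is needed
```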
Demo 2
data(airquality)
airquality$Month
airquality[airquality$Month==5,]
oz <- airquality[airquality$Month==5,]$Ozone
mean(oz)
mean(oz, na.rm=TRUE)
attach(airquality)
mean(Ozone, na.rm=TRUE)
tapply(Ozone, Month, mean, na.rm=TRUE)
detach()
Some standard procedures
• Continuous data by group: t.test, wilcox.test, oneway.test, kruskal.test
• Categorical data: prop.test, chisq.test, fisher.test
• Correlations: cor.test, with options for nonparametric versions
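A sketch of how these procedures are called, using only data sets that ship with R (sleep, airquality) so that no extra package is needed. The 2 x 2 table tab contains made-up counts.

```r
# Continuous data by group: Welch t test and its rank-based counterpart
tt <- t.test(extra ~ group, data = sleep)
wt <- wilcox.test(extra ~ group, data = sleep, exact = FALSE)
tt
wt

# Categorical data: hypothetical counts, rows = treatment, columns = outcome
tab <- matrix(c(15, 5, 9, 11), nrow = 2,
              dimnames = list(treatment = c("A", "B"),
                              outcome   = c("yes", "no")))
cs <- chisq.test(tab)
ft <- fisher.test(tab)
pt <- prop.test(tab)    # columns are read as successes and failures
cs
ft
pt

# Correlation: Pearson by default, Spearman as a nonparametric option
ct <- cor.test(airquality$Wind, airquality$Temp)
sp <- cor.test(airquality$Wind, airquality$Temp,
               method = "spearman", exact = FALSE)
ct
sp
```

Each call returns an object of class "htest", so components such as tt$p.value or ct$estimate can be extracted for further use.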
Demo 3
library(ISwR)
data(intake)
attach(intake)
t.test(pre, post, paired=TRUE)
detach()
data(caesarean) # loads a table
caesar.shoe
chisq.test(caesar.shoe)
fisher.test(caesar.shoe)
Modeling Tools: Overview
• Model formulas
• Model objects and summaries
• Comparing models
• Evaluating model fit
• Generalized linear models
Model formulas
• Linear model, y = Xβ + ε
• In practice something like
y = β0 + β1 × height + β2 × 1(type=2) + β3 × 1(type=3) + ε
• Wilkinson-Rogers formulas:
y = height + type
(Interpretation depends on whether variables are categorical or continuous)
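One way to see the indicator columns 1(type=2) and 1(type=3) is to ask R for the design matrix X. This is a sketch with invented data; d, height, type and type_num are hypothetical names.

```r
d <- data.frame(height   = c(150, 160, 170, 180, 165, 175),
                type     = factor(c(1, 1, 2, 2, 3, 3)),
                type_num = c(1, 1, 2, 2, 3, 3))

# type as a factor: intercept, height, and one indicator per non-reference level
X <- model.matrix(~ height + type, data = d)
X
colnames(X)

# the same codes treated as a continuous variable: a single slope column
Xnum <- model.matrix(~ height + type_num, data = d)
colnames(Xnum)
```

With the default treatment contrasts the first level of type is the reference, which is why only type2 and type3 appear as columns.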
Model formulas in R
• R representation: y ~ height + type, where type is a factor
• Interactions: a:b, a*b = a + b + a:b
• Algebra (a:(b + c) = a:b + a:c etc.)
• Notice the special interpretation of operators
• Special items: offset, -1 (no intercept)
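The formula algebra can be checked without any data by expanding formulas with terms(). A small sketch; the variable names y, a, b, c and n are placeholders.

```r
# Helper (hypothetical name) that lists the terms a formula expands to
expand <- function(f) attr(terms(f), "term.labels")

expand(y ~ a * b)                  # "a" "b" "a:b"
expand(y ~ a:(b + c))              # "a:b" "a:c"
expand(y ~ (a + b)^2)              # ^ means crossing, not squaring: "a" "b" "a:b"
expand(y ~ a + I(b^2))             # I() protects ordinary arithmetic
expand(y ~ a + offset(log(n)))     # the offset is not a term with a coefficient

attr(terms(y ~ a + b), "intercept")       # 1: intercept included by default
attr(terms(y ~ a + b - 1), "intercept")   # 0: intercept removed
```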
Fitting linear models
data(airquality)
aq <- transform(airquality, Month=factor(Month))
fit.aq <- lm(log(Ozone) ~ Solar.R + Wind +
             Temp + Month, data=aq)
• lm generates a fitted model object
• Extract information from model object
• Fit other models based on model object
Inspecting model objects
• Extract information about the fit
• summary(fit.aq)
• fitted(fit.aq), resid(fit.aq)
• anova(model1, model2)
• plot(fit.aq) – diagnostics
• predict(fit.aq, newdata)
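The predict step needs a data frame with one column per predictor. A sketch of how newdata might be built for the airquality model; the predictor values (Solar.R = 200, Wind = 10, three temperatures, July) are arbitrary choices.

```r
aq <- transform(airquality, Month = factor(Month))
fit.aq <- lm(log(Ozone) ~ Solar.R + Wind + Temp + Month, data = aq)

coef(fit.aq)              # estimated coefficients
length(fitted(fit.aq))    # rows with missing values were dropped by lm

# The Month column must be a factor with the same levels as in the fit
newdata <- data.frame(Solar.R = 200, Wind = 10, Temp = c(70, 80, 90),
                      Month = factor(7, levels = levels(aq$Month)))
pred <- predict(fit.aq, newdata, interval = "confidence")
pred          # fit, lwr, upr on the log scale
exp(pred)     # back-transformed to the ozone scale
```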
Model search
• anova(model): “Type I” (sequential) sums of squares
• drop1 (“Type III”), add1
• step (AIC/BIC criteria)
• update
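A sketch of these tools on the airquality model. Rows with missing values are removed first, because step() requires every candidate model to be fitted to the same observations; the object names full, small, d1, a1, sel and sel.bic are made up.

```r
aq <- na.omit(transform(airquality, Month = factor(Month)))
full <- lm(log(Ozone) ~ Solar.R + Wind + Temp + Month, data = aq)

anova(full)                        # sequential table, terms in formula order
d1 <- drop1(full, test = "F")      # each term dropped from the full model
d1

small <- update(full, . ~ . - Month)
a1 <- add1(small, scope = ~ . + Month, test = "F")   # each candidate term added
a1

sel     <- step(full, trace = 0)                      # AIC (penalty k = 2)
sel.bic <- step(full, trace = 0, k = log(nrow(aq)))   # BIC-type penalty
formula(sel)
formula(sel.bic)
```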
Demo 4
data(airquality)
aq <- transform(airquality, Month=factor(Month))
fit.aq <- lm(log(Ozone) ~ Solar.R + Wind +
             Temp + Month, data=aq)
fit.aq2 <- update(fit.aq, ~ . - Month)
summary(fit.aq)
plot(fit.aq)
drop1(fit.aq, test="F")
anova(fit.aq, fit.aq2)
Generalized linear models
• Statistical distribution (exponential) family
• Link function transforming mean to linear scale
• Deviance
• Examples: Binomial, Poisson, Gaussian (σ known, in principle)
• Canonical link functions
• Fit using glm in R
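A minimal sketch of a Poisson fit with its canonical log link, using the InsectSprays data that ships with R (my choice of example, not one from the talk).

```r
# Canonical (default) links of the three families mentioned above
binomial()$link    # "logit"
poisson()$link     # "log"
gaussian()$link    # "identity"

fit.pois <- glm(count ~ spray, family = poisson, data = InsectSprays)
summary(fit.pois)
deviance(fit.pois)       # residual deviance

# The link maps the mean to the linear scale, so exp() undoes it:
exp(coef(fit.pois)[1])                                  # fitted mean, spray A
mean(InsectSprays$count[InsectSprays$spray == "A"])     # observed mean, spray A
```

With a single factor as predictor, the fitted group means reproduce the observed group means, which makes the role of the link easy to verify.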
Demo 5
no.yes <- c("No","Yes")
smoking <- gl(2, 1, 8, no.yes)
obesity <- gl(2, 2, 8, no.yes)
snoring <- gl(2, 4, 8, no.yes)
n.tot <- c(60,17,8,2,187,85,51,23)
n.hyp <- c(5,2,1,0,35,13,15,8)
data.frame(smoking,obesity,snoring,n.tot,n.hyp)
hyp.tbl <- cbind(n.hyp,n.tot-n.hyp)
glm.hyp <- glm(hyp.tbl~smoking+obesity+snoring,
               family=binomial("logit"))
summary(glm.hyp)
0.87194 + qnorm(c(.025,.975))*0.39757
library(MASS)
confint(glm.hyp)
Likelihood-based inference
• Wald approximations (β̂/s.e.(β̂), etc.) can be badly inaccurate in small samples
• Likelihood-based inference is preferable
• Use drop1(model, test="Chisq") (for binomial/Poisson)
• profile (in MASS for glm) investigates the behaviour of the likelihood around the maximum
• plot(profile(model)) shows the signed LR statistic sign(β − β̂)√Q when varying each parameter.
• confint gives likelihood-based confidence intervals
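To compare the two kinds of interval side by side, the hypertension data from Demo 5 can be refitted and both sets of limits printed. This is a sketch: confint.default() is used here for the Wald intervals, and library(MASS) is loaded because older R versions need it for the profile-based glm method (assumed harmless in newer ones).

```r
library(MASS)

no.yes  <- c("No", "Yes")
smoking <- gl(2, 1, 8, no.yes)
obesity <- gl(2, 2, 8, no.yes)
snoring <- gl(2, 4, 8, no.yes)
n.tot <- c(60, 17, 8, 2, 187, 85, 51, 23)
n.hyp <- c(5, 2, 1, 0, 35, 13, 15, 8)

glm.hyp <- glm(cbind(n.hyp, n.tot - n.hyp) ~ smoking + obesity + snoring,
               family = binomial("logit"))

wald <- confint.default(glm.hyp)   # estimate +/- normal quantile * s.e.
prof <- confint(glm.hyp)           # based on the profile likelihood
cbind(wald, prof)                  # compare the limits row by row

drop1(glm.hyp, test = "Chisq")     # likelihood ratio test for each term
```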
R packages
• Collections of R functions, data, and compiled code
• Well-defined format that ensures easy installation, a basic standard of documentation, and enhances portability and reliability.
• You can write your own packages! It is not entirely trivial, but tools are there to help you.
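One of the tools alluded to is package.skeleton() in utils. The sketch below uses it to generate a package directory for a single function; the package name demoPkg, the function logit, and the temporary output directory are all hypothetical choices for illustration.

```r
# A hypothetical one-function package, built in a temporary directory
pkg_env <- new.env()
assign("logit", function(p) log(p / (1 - p)), envir = pkg_env)

out_dir <- tempfile("pkgdemo")
dir.create(out_dir)

# package.skeleton() writes DESCRIPTION, R/, man/ and a few helper files
utils::package.skeleton(name = "demoPkg", list = "logit",
                        environment = pkg_env, path = out_dir)

pkg_files <- list.files(file.path(out_dir, "demoPkg"))
print(pkg_files)
```

The generated help files are templates that still need to be filled in before R CMD build and R CMD check will accept the package.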
Packages that come with R
• Standard R (1.9.0) loads with these packages available
  • methods: New (S4) class system
  • stats: Statistical procedures
  • graphics: Graphics
  • utils: Utilities (file handling, packages, ...)
  • base: All the basic stuff
• Further packages are available in core R, but not automatically loaded (tcltk, grid, splines, stats4, ...)
• 10 further packages and package bundles are maintained separately, but included with source and binary distributions (survival, nlme, MASS, ...).
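The difference between "available" and "automatically loaded" can be checked from within a session. This is a minimal sketch; the exact list of default packages depends on the R version and configuration, so no particular output is assumed.

```r
# Packages attached automatically at startup (controlled by an option)
default_pkgs <- getOption("defaultPackages")
print(default_pkgs)

# splines ships with R but is not attached until requested
was_attached <- "package:splines" %in% search()
library(splines)
now_attached <- "package:splines" %in% search()

print(c(before = was_attached, after = now_attached))
```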
CRAN
• The Comprehensive R Archive Network
• Collection of servers mirroring a central server in Vienna. Modeled on CTAN and CPAN (for TeX and Perl code)
• http://cran.us.r-project.org
• Maintains a curated collection of R packages as well as the source and binary distributions of R itself
• About 320 packages available
• Unix/Linux variants generally install packages from sources. Windows and Mac OS X have binary package formats which are even easier to install
• See also: Bioconductor, http://www.bioconductor.org
Demo 6
# Cheat for offline demo:
# Pretend CRAN is local directory
options(CRAN="file:/home/pd/cran.r-project.org")
# Manipulate install path
.libPaths("~/Rlibrary")
.libPaths()
# Source install (gives harmless warning)
install.packages("mvtnorm")
library(mvtnorm)
library(help=mvtnorm)
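The demo depends on a local CRAN copy and a home directory that readers will not have. A variant that runs anywhere might look like the sketch below; the library location under tempdir() is a hypothetical choice, and the install step is left commented out because it needs network access. In current R versions the repository is usually given through the repos argument or option rather than options(CRAN=...).

```r
# A throwaway library directory (hypothetical location under tempdir())
lib_dir <- file.path(tempdir(), "Rlibrary")
dir.create(lib_dir, showWarnings = FALSE)

# Put it first on the library search path
.libPaths(c(lib_dir, .libPaths()))
print(.libPaths())

# With network access, a package could then be installed into it:
# install.packages("mvtnorm", lib = lib_dir,
#                  repos = "https://cloud.r-project.org")
# library(mvtnorm)
```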
Practical issues
• Dealing with the workspace
• Reading data
• Saving and restoring data and results
The workspace
• The global environment contains R objects created on the command line.
• There is an additional search path of loaded packages and attached data frames.
• The search path is maintained by library(), attach(), and detach()
• This determines the way R looks up objects by name
• Notice that objects in the global environment may mask objects in packages.
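Masking and the search path can be seen directly in a few lines. The sketch below deliberately shadows mean() and then attaches the built-in airquality data frame; both choices are illustrative assumptions, not part of the talk.

```r
# Define an object that shares its name with a function in base
mean <- function(x) "masked!"
masked_result <- mean(1:10)        # the user-defined version is found first
base_result   <- base::mean(1:10)  # the packaged version is still reachable
rm(mean)                           # removing the copy unmasks the original
restored_result <- mean(1:10)

# Attaching a data frame adds an entry to the search path
attach(airquality)
on_path <- "airquality" %in% search()
mean_temp <- mean(Temp)            # Temp is found via the search path
detach(airquality)

print(list(masked_result, base_result, restored_result, on_path, mean_temp))
```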
Demo 7
search()
data(intake) # From ISwR
ls()
attach(intake)
search()
ls("intake") # show variables in data frame
post - pre
rm(intake) # remove data frame
detach() # remove from search path
Reading data
• Simple data vectors can be read using scan()
• Data frames can be read from most reasonably structured text file formats (space-separated columns, tab- and comma-delimited files) using read.table() or read.delim(). Note colClasses.
• The foreign package can read files from Stata, SAS export libraries, SPSS, Epi-Info, Minitab, and some S-PLUS versions.
• For spreadsheets and databases, the quick and easy way is to export to a delimited file, but you can work via ODBC connections and database access packages
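The remark about colClasses is easiest to appreciate with an example. The sketch below writes a tiny comma-delimited file (the file contents and column names are made up for illustration) and reads it with and without explicit column classes, using read.csv(), a wrapper around read.table().

```r
# Write a tiny comma-delimited file to a temporary location
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,group,weight",
             "001,a,61.5",
             "002,b,70.2",
             "003,a,58.9"), csv_file)

# Default type guessing turns the id column into integers
d_guess <- read.csv(csv_file)

# colClasses keeps id as text and makes group a factor
d_typed <- read.csv(csv_file,
                    colClasses = c("character", "factor", "numeric"))

str(d_guess)
str(d_typed)

# scan() reads a plain vector; here from a string rather than a file
x <- scan(text = "3.1 4.1 5.9")
```

Without colClasses the leading zeros in id are lost, which matters whenever an identifier only looks numeric.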
Getting organized
Several possibilities:
• Save/restore entire workspace (objects only)
• Save selected objects and load them
• source() script files
• Batch processing (R CMD BATCH file.R)
• ESS – Emacs Speaks Statistics: Integrated environment for maintaining scripts, running R, saving results, etc.
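Two of these options, saving selected objects and sourcing a script, can be sketched as follows. File names are temporary and the object names are arbitrary; source() is called with local = TRUE here only so that the sketch behaves the same wherever it is run.

```r
x <- c(2.5, 3.1, 4.7)
x_copy <- x

# Save a selected object, remove it, and load it back
rda_file <- tempfile(fileext = ".RData")
save(x, file = rda_file)
rm(x)
load(rda_file)                      # recreates x under its original name

# Keep commands in a script file and run them with source()
script_file <- tempfile(fileext = ".R")
writeLines("y <- sum(x)", script_file)
source(script_file, local = TRUE)   # evaluate in the current environment
print(y)
```

save.image() does the same for the whole workspace, which is what happens when answering "yes" on quitting R.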
R graphics
• The standard interface
• Customizing plots
• Graphics parameters
• Math on plots
• Grid and lattice
Standard R graphics
• Ink on paper model; once something is drawn it cannot be erased.
• Sensible default plots
• Arguments can override defaults
• Options to turn off various elements of plots (e.g. the axes)
• Functions to add elements.
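The pattern of switching off a default element and then adding a customized one might look like this. The data and axis labels are made up, and the plot is written to a temporary PDF file so that the sketch runs without a screen device.

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)                        # draw to a file so the sketch runs anywhere

x <- 1:10
y <- x^2
plot(x, y, axes = FALSE, xlab = "dose", ylab = "response")  # no axes, no box
axis(1, at = c(1, 5, 10))            # add a custom x axis
axis(2)                              # default y axis
box()                                # frame around the plot region
title(main = "Built up step by step")

dev.off()
```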
Basic x-y plots
• The plot function with one or two numeric arguments
• Scatterplot or line plot (or both) depending on type argument: "l" for lines, "p" for points (the default), "b" for both, plus quite a few more.
• Functions for adding to a plot: lines, points, segments, abline, text, mtext, axis
• Also: formula interface, plot(y~x)
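A short sketch of the type argument, some of the adding functions, and the formula interface, using sine and cosine curves as stand-in data:

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

x <- seq(0, 2 * pi, length.out = 25)
y <- sin(x)

plot(x, y, type = "b")               # points joined by lines
lines(x, cos(x), lty = "dashed")     # add a second curve
abline(h = 0, col = "grey")          # horizontal reference line
text(pi, 0.5, "sin(x) and cos(x)")   # annotation inside the plot region
mtext("added with mtext()", side = 3)

plot(y ~ x, type = "l")              # formula interface, starts a new page

dev.off()
```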
Graphical parameters
• Arguments to plot et al. (67 possibilities!)
• The par function can be used to set most of them persistently. Most info is found via help(par)
• Look them up! Here are some of the more commonly used:
  • Point and line characteristics: pch, col, lty, lwd
  • Multiframe layout: mfrow, mfcol
  • Axes: xlim, ylim, xaxt, yaxt, log
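Because par() changes persist for the life of the device, a common idiom is to save the old values and restore them afterwards. A minimal sketch, with arbitrary data and parameter values:

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

old_par <- par(mfrow = c(1, 2))      # par() returns the previous settings
mfrow_during <- par("mfrow")

x <- 1:20
plot(x, x^2, pch = 16, col = "blue") # parameters given per call
plot(x, x^2, type = "l", lty = 2, lwd = 2, log = "y", ylim = c(1, 1000))

par(old_par)                         # restore the saved settings
mfrow_after <- par("mfrow")

dev.off()
```

Parameters passed as arguments to plot() affect only that call, whereas those set through par() stay in force until changed or until the device is closed.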
Specific plots
• Histograms — hist(x)
• Density plots — plot(density(x))
• Boxplots — boxplot(x)
• Barplots — barplot(x) (x can be a matrix)
• Pies — pie()
• Matrix plots (multiple y columns) — matplot()
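Several of these can be tried at once on a built-in data set. The use of airquality here is an assumption for illustration; the demo that follows uses the intake data instead.

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
par(mfrow = c(2, 2))

temp <- airquality$Temp
h <- hist(temp)                          # also returns the bin counts
plot(density(temp))
boxplot(Temp ~ Month, data = airquality) # one box per month
barplot(table(airquality$Month))         # bar heights from a table

dev.off()
print(h$counts)
```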
Demo 8
data(intake)
par(mfrow=c(2,2))
matplot(intake)
matplot(t(intake))
matplot(t(intake),type="b")
matplot(t(intake),type="b",pch=1:11,col="black",
        lty="solid", xaxt="n")
axis(1,at=1:2,labels=names(intake))
Math on plots
• Sort of like TeX
• Works on unevaluated expressions (quote(alpha), expression(alpha))
• Special conventions: ^ and [] for superscript and subscript, special names alpha, sum, integral
• See help(plotmath)
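A small sketch of mathematical annotation; the particular formulas are arbitrary and serve only to show the subscript, superscript and special-name conventions.

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

x <- seq(-3, 3, length.out = 100)
lab <- expression(hat(beta)[1])            # unevaluated; drawn as a symbol
ttl <- expression(sum(x[i]^2, i == 1, n))  # sum with lower and upper limits

plot(x, dnorm(x), type = "l", xlab = lab, main = ttl)
text(0, 0.1, quote(alpha^2 + beta[0]))     # quote() gives a single expression

dev.off()
```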
Grid and Lattice graphics
• Standard R graphics allow graphs to be arranged in an m × n gridded layout.
• The grid package allows arbitrary viewports and creates graphical objects ("grobs") which can be modified before they are printed.
• The lattice package uses grid for a structural approach to multiframe graphs
• Model formulas, y~x|g1*g2*...
• Shingles: Partially overlapping intervals used for conditioning plots
• Panel functions — potentially user codable
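The two grid ideas, viewports and editable grobs, can be sketched in a few lines. The viewport position and the label text are arbitrary choices for illustration.

```r
library(grid)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
grid.newpage()

# A viewport occupying an arbitrary region of the page
pushViewport(viewport(x = 0.3, y = 0.5, width = 0.5, height = 0.8))
grid.rect()

# Create a grob, modify it, and only then draw it
g <- textGrob("draft label")
g <- editGrob(g, label = "final label", gp = gpar(fontsize = 16))
grid.draw(g)

popViewport()
dev.off()
```

Nothing is drawn for the text until grid.draw() is called, so the grob can be edited as often as needed beforehand.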
Demo 9
library(lattice)
data(airquality)
# white-background theme; older lattice versions used lset(theme = col.whitebg())
trellis.par.set(theme = col.whitebg())
myplot <- xyplot(log(Ozone) ~ Solar.R | equal.count(Temp),
                 groups = Month, data = airquality,
                 ylab = list(label = expression("log" * O[3]), cex = 2),
                 xlab = list(cex = 2))
myplot # Note: no plot until object is printed!
Ad-hoc programming
• This had better be brief . . .
• What does an R function look like
• Flow control
Simple functions
• logit <- function(p) log(p/(1-p))
• logit(0.5)
• Formal arguments
• Actual arguments
• Positional matching: plot(x,y)
• Keyword matching: t.test(x ~ g, mu=2, alternative="less")
• Partial matching: t.test(x ~ g, mu=2, alt="l")
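The three matching rules can be tried out on the logit function itself. The sketch below adds a second formal argument, base, purely for illustration; it is not in the original slide.

```r
# Formal arguments: p and base (base is a hypothetical addition with a default)
logit <- function(p, base = exp(1)) log(p / (1 - p), base = base)

a  <- logit(0.8)                # positional matching, default base
b  <- logit(p = 0.8)            # keyword matching
c1 <- logit(0.8, b = exp(1))    # partial matching: 'b' abbreviates 'base'
d  <- logit(base = 2, p = 0.8)  # keywords may be given in any order

# The formal arguments can be inspected directly
formal_names <- names(formals(logit))
```

Here 0.8 and exp(1) are the actual arguments. Partial matching works only while the abbreviation identifies a single formal argument, so it is convenient at the prompt but fragile in code that is meant to last.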
Flow control
• if/else
• ifelse()
• switch()
• for loops
• repeat, while
• break
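A minimal sketch of each construct in the list above. The function names sign_label and centre and all the values are invented for the illustration.

```r
# if/else: acts on a single condition
sign_label <- function(x) {
  if (x > 0) "positive" else if (x < 0) "negative" else "zero"
}

# ifelse(): vectorized, one result per element
x <- c(-2, 0, 3)
labels <- ifelse(x >= 0, "non-negative", "negative")

# switch(): choose a branch by name; the unnamed last entry is the fallback
centre <- function(v, type) {
  switch(type,
         mean   = mean(v),
         median = median(v),
         stop("unknown type: ", type))
}

# for loop: sum of 1..10
total <- 0
for (i in 1:10) total <- total + i

# while loop: first power of 2 that is at least 100
n <- 1
while (n < 100) n <- n * 2

# repeat with break: smallest k whose square exceeds 50
k <- 0
repeat {
  k <- k + 1
  if (k^2 > 50) break
}
```

Note the difference between if, which expects a single TRUE or FALSE, and ifelse(), which returns a whole vector of results.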
Apply-functions/loop avoidance
• lapply – list-apply
• sapply – simplifying apply
• tapply – tabulating apply
• apply, sweep – along slices of tables
• replicate – repeat expression
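One small call per function may make the list above more concrete. The inputs are made up for the illustration, apart from the built-in airquality data set.

```r
# lapply: apply a function to each element, always returns a list
lens <- lapply(list(a = 1:3, b = 1:5), length)

# sapply: same, but simplifies the result to a (named) vector when possible
means <- sapply(list(a = 1:3, b = 1:5), mean)

# tapply: one summary value per group
temp_by_month <- tapply(airquality$Temp, airquality$Month, mean)

# apply: along rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)
row_sums  <- apply(m, 1, sum)
col_means <- apply(m, 2, mean)

# sweep: subtract the column means from each column
centred <- sweep(m, 2, col_means)

# replicate: evaluate an expression repeatedly
set.seed(1)
sim_means <- replicate(5, mean(rnorm(10)))
```

None of these is necessarily faster than an explicit for loop; the gain is mostly in compactness and in not having to preallocate and index the result.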
Demo 10
# these examples all do the same thing
pval <- numeric(1000)
for (i in 1:1000)
  pval[i] <- t.test(rexp(25), mu=1)$p.value

f <- function(i) t.test(rexp(25), mu=1)$p.value
pval <- sapply(1:1000, f)

pval <- replicate(1000, t.test(rexp(25), mu=1)$p.value)

qqplot(ppoints(1000), pval, pch=".")
Summary
So what have we seen?
• R is a versatile working environment
• There is a very flexible toolkit for building graphics displays
• You can handle simple tasks quite easily
• Complicated tasks can be handled via ad hoc programming, often elegantly
• Extensions can be made to integrate seamlessly, and a large body of such extensions is available from CRAN