Page 1: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics, Centre for Mathematical Sciences

Henrik Bengtsson, hb@maths.lth.se

Mathematical Statistics, Centre for Mathematical Sciences
Lund University, Sweden

DSC-2003, Vienna. March 20-22, 2003

The R.oo Package

Object-Oriented Programming With References Using Standard R Code

2 of 22 For slides etc: http://www.maths.lth.se/help/R/

Outline

• Purpose and what the package is and is not.
• RCC: R Coding Conventions (draft).
• Reference variables.
• The root class Object.
• setMethodS3() & setConstructorS3().
• Rdoc comments.
• Static methods.
• Virtual fields.
• trycatch() - exception handling based on class.


Purposes

• End user (the most important person at the end of the day!)
  – Provide consistent object-oriented APIs across different packages, e.g. by having a well-defined naming convention for classes, methods, fields and variables.
  – Make class inheritance more explicit.
  – Provide a simpler API, e.g. fewer arguments.
  – More memory-efficient packages.

• Developer / programmer
  – Provide reference variables to reduce memory requirements and data redundancy.
  – R Coding Conventions, e.g. naming conventions.
  – Create generic functions automatically.
  – Make code cleaner and remove the need for tedious code repetition.
  – Minimize the risk of package conflicts.
  – More code checking when creating methods and classes, to catch errors early on.
  – Catch rare but "classical" bugs, e.g. using reserved words in method names.
  – Keep help pages up to date with the source code by allowing Rd documentation to be placed together with the code in the source files.


Real world example

    # Read all GenePix Result files
    gpr <- MicroarrayData$read(pattern="*.gpr")

    # Extract the foreground & background signals of the red and
    # the green channels. The slide layout is also included.
    raw <- as.RawData(gpr)

    # Get the background corrected signal as M=log(R/G) and A=log(RG)/2.
    ma <- getSignal(raw, bgSubtract=TRUE)

    normalizeWithinSlide(ma, method="p")   # print-tip normalization.

    knownGenes <- c(50,194,3433,5541,6384)
    plot(ma); highlight(ma, knownGenes)             # highlights the data points from the
    plotPrintorder(ma); highlight(ma, knownGenes)   # correct slide in the correct space.
    plotSpatial(ma); highlight(ma, knownGenes)
    plotSpatial3d(gpr, field="area", col=getColor(ma))

    # Write the normalized data to a tab-delimited file
    write(ma, "NormalizedExpressions.dat")


What the package is and isn't

• It is not supposed to replace S3 or S4, but
• it is an extra layer on top of S3 (eventually S4), to
• move the focus from S3 & S4 details to object-oriented design and implementation.

[Figure: R.oo drawn as a layer on top of the R environment (S3 and eventually S4).]

• It has been tested and verified for > 2 years!


RCC: R Coding Conventions (draft)

• Standardizes the coding style
  – Example of the naming conventions:
    • Variables, objects, fields and methods should start with a lower case letter, with methods named as verbs, e.g. shape$side and normalize().
    • Classes should be nouns starting with an upper case letter, e.g. MicroarrayData.
    • Constants should be in all upper case, e.g. Colors$RED.HUE.
    • Similar to Java.

• Standards
  – make the code (and the design) easier to read, share and maintain.
  – reduce the risk of bugs and misunderstandings.

http://www.maths.lth.se/help/R/RCC/


Reference variables

• Memory efficient.
• Minimizes the amount of redundant data.
• Very useful for some data structures, e.g. graphs.
• References in R.oo are implemented using the environment data type.
  – Collected by the R garbage collector.
• (More user-friendly method interfaces, since methods can "communicate" with each other by updating the state of the object.)
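
The difference between R's usual copy semantics and a reference built on an environment can be sketched in base R as follows. The function names addNodeByValue() and addNodeByReference() are made up for this illustration and are not part of R.oo.

```r
# A list is copied when it is modified inside a function (value semantics).
addNodeByValue <- function(graph, node) {
  graph$nodes <- c(graph$nodes, node)
  graph                       # the caller must reassign the result
}

# An environment is passed by reference: the update is visible to the caller.
addNodeByReference <- function(graph, node) {
  graph$nodes <- c(graph$nodes, node)
  invisible(NULL)
}

listGraph <- list(nodes = character(0))
addNodeByValue(listGraph, "a")          # result discarded, listGraph unchanged
length(listGraph$nodes)

envGraph <- new.env()
envGraph$nodes <- character(0)
addNodeByReference(envGraph, "a")       # envGraph itself is updated
length(envGraph$nodes)
```

Because only the handle to the environment is passed around, large fields are not duplicated on each call, which is the memory argument made above.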


A common root class: Object

1. All classes should have the common root class Object.
  – A similar idea exists in R today, e.g. print(), as.character() etc., but a common root class makes it more explicit.

Object
  $(name): ANY
  $<-(name, value)
  [[(name): ANY
  [[<-(name, value)
  as.character(): character
  attach(private=FALSE, pos=2)
  clone(): Object
  data.class(): character
  detach()
  equals(other): logical
  extend(this, ...className, ...): Object
  finalize()
  getFields(private=FALSE): character[]
  hashCode(): integer
  ll(...): data.frame
  static load(file): Object
  objectSize(): integer
  print()
  save(file=NULL, ...)


Object – the common root class

[Figure: class hierarchy rooted at Object, grouped by package. Labels shown in the figure:]
• R.oo: Object, Exception, RccViolationException
• com.braju.sma: MicroarrayData, GenePixData, ImaGeneData, QuantArrayData, ScanAlyzeData, SpotData, SpotFinderData, MAData, RawData, RGData, TMAData, Layout, GalLayout
• R.io: Reporter, HtmlReporter, LaTeXReporter, TextReporter, MultiReporter, File, FileFilter, RspEngine
• R.graphics: BitmapImage, MonochromeImage, GrayImage, RGBImage, Color, Device


A common root class: Object

1. All classes should have the common root class Object.
  – A similar idea exists in R today, e.g. print(), as.character() etc., but a common root class makes it more explicit.

2. Fields of an Object can be accessed as elements of a list, e.g.:
  – square$side and
  – square[["side"]] <- 23

3. Methods can also be called as
  – square$getArea()

4. The implementation of reference variables is taken care of within the Object class. Under the hood, we roughly have:

    "$.Object" <- function(object, name) {
      # The fields live in an environment attached to the object (assumed here
      # to be the attribute ".env"); object$env would call this method again.
      get(name, envir=attr(object, ".env"))
    }

    "$<-.Object" <- function(object, name, value) {
      assign(name, value, envir=attr(object, ".env"))
      invisible(object)   # a replacement function must return the object
    }
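
To make the idea concrete, here is a minimal runnable sketch in base R of a root class whose fields live in an environment. The class name RefObject and the attribute name ".env" are assumptions made for this illustration; this is not the R.oo implementation.

```r
# Hypothetical minimal root class; not the actual R.oo implementation.
RefObject <- function(...) {
  this <- structure(NA, class = "RefObject")
  attr(this, ".env") <- new.env()
  fields <- list(...)
  for (name in names(fields)) {
    assign(name, fields[[name]], envir = attr(this, ".env"))
  }
  this
}

# Field access reads from the attached environment.
"$.RefObject" <- function(x, name) {
  get(name, envir = attr(x, ".env"))
}

# Field assignment writes to the attached environment and returns the handle.
"$<-.RefObject" <- function(x, name, value) {
  assign(name, value, envir = attr(x, ".env"))
  invisible(x)
}

"[[.RefObject" <- function(x, name) {
  get(name, envir = attr(x, ".env"))
}

"[[<-.RefObject" <- function(x, name, value) {
  assign(name, value, envir = attr(x, ".env"))
  invisible(x)
}

square <- RefObject(side = 2)
alias <- square            # copies the handle, not the fields
alias[["side"]] <- 23
square$side                # both names refer to the same fields
```

The object itself is only a small handle; copying it copies the pointer to the environment, so an update through one name is seen through the other.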



setMethodS3()

• Defines a method of a class.
• Creates a generic function automatically if and only if it is missing.
• RCC:
  – Methods should start with a lower case letter.
  – Asserts that a correct method name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
• Does not require the Object class.

    setMethodS3("plotPrintorder", "MAData", function(object, ...) {
      ...
    })

    setMethodS3("next", "Iterator", function(object, ...) {
      ...
    })

    Error: [2003-03-18 16:28:00] RccViolationException: Method names must not be same as a reserved keyword in R: next, cf. http://www.maths.lth.se/help/R/RCC/
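
The kind of checking described on this slide can be approximated in base R. The sketch below uses a made-up helper, defineMethodSketch(), with a hand-written list of reserved words; it only illustrates the idea and is not how setMethodS3() is implemented.

```r
# Hypothetical stand-in for the checks described above; not R.oo's code.
defineMethodSketch <- function(name, class, fn, envir = parent.frame()) {
  reserved <- c("if", "else", "repeat", "while", "function", "for",
                "next", "break", "TRUE", "FALSE", "NULL", "NA", "NaN", "Inf")
  if (name %in% reserved) {
    stop("Method names must not be the same as a reserved keyword in R: ", name)
  }
  if (!grepl("^[a-z]", name)) {
    stop("Method names should start with a lower case letter: ", name)
  }
  # Create the generic function only if no function of that name exists.
  if (!exists(name, envir = envir, mode = "function")) {
    generic <- eval(substitute(function(...) UseMethod(NAME), list(NAME = name)))
    environment(generic) <- envir
    assign(name, generic, envir = envir)
  }
  # Register the method under the usual S3 name <generic>.<class>.
  assign(paste(name, class, sep = "."), fn, envir = envir)
  invisible(NULL)
}

defineMethodSketch("getArea", "SquareSketch", function(object, ...) object$side^2)
sq <- structure(list(side = 3), class = "SquareSketch")
getArea(sq)

# A reserved word is rejected before anything is defined.
result <- try(
  defineMethodSketch("next", "IteratorSketch", function(object, ...) NULL),
  silent = TRUE
)
inherits(result, "try-error")
```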


Problems with generic functions

• Hard to check if a function (generic or not) already exists.
• Ad hoc solutions for creating generic functions "automatically".
• Under the S3 schema, it is possible to create generic functions that are truly generic:

    normalize <- function(...) UseMethod("normalize")

  Note that the first argument is omitted. If not, it would be impossible to have default functions with no arguments, e.g. search().

• The R.oo package automatically creates generic functions as above.
• We are not aware of how to do the same in S4 (this is the main reason why R.oo is currently staying with S3).
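
A small base-R sketch of such a "truly generic" function follows. The names normalizeSketch and MADataSketch are invented for the illustration; the point is only that a generic taking nothing but '...' can dispatch on its first argument and still fall back to a default method when called without arguments.

```r
# A generic function that takes only '...', as on the slide.
normalizeSketch <- function(...) UseMethod("normalizeSketch")

# Default method that also works when called with no arguments at all.
normalizeSketch.default <- function(...) {
  if (length(list(...)) == 0) "default: no arguments" else "default"
}

# Class-specific method; dispatch uses the first argument passed via '...'.
normalizeSketch.MADataSketch <- function(object, ...) "MADataSketch method"

ma <- structure(list(), class = "MADataSketch")
normalizeSketch(ma)      # dispatches to the MADataSketch method
normalizeSketch(1:3)     # falls back to the default method
normalizeSketch()        # default method, called without arguments
```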


setConstructorS3()

• Defines the constructor method of a class, but also the class.
• RCC:
  – Asserts that a correct class name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
  – Class and constructor names should start with an UPPER CASE letter.
  – Constructors should be named the same as the class.
• Does not require the Object class.

    setConstructorS3("MAData", function(M, A, layout=NULL) {
      extend(MicroarrayData(layout=layout), "MAData",
        M = as.matrix(M),
        A = as.matrix(A)
      )
    })

Constructor/class definition hybrid: Creates an object of the super class, which is then "extended" into an MAData object with additional fields.
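
The "create the superclass object, then extend it" pattern can be mimicked in base R. In the sketch below, newObjectSketch(), extendSketch(), ShapeSketch() and SquareSketch() are hypothetical helpers written for this illustration, and storing fields in an ".env" attribute is an assumption rather than a description of R.oo internals.

```r
# Hypothetical helpers mimicking the idea; not the R.oo implementation.
newObjectSketch <- function() {
  this <- structure(NA, class = "ObjectSketch")
  attr(this, ".env") <- new.env()
  this
}

extendSketch <- function(this, className, ...) {
  fields <- list(...)
  for (name in names(fields)) {
    assign(name, fields[[name]], envir = attr(this, ".env"))
  }
  class(this) <- c(className, class(this))   # subclass first, then ancestors
  this
}

# Constructors are named after their classes; each extends its superclass.
ShapeSketch <- function() {
  extendSketch(newObjectSketch(), "ShapeSketch")
}

SquareSketch <- function(side = 0) {
  extendSketch(ShapeSketch(), "SquareSketch", side = side)
}

sq <- SquareSketch(side = 2)
class(sq)                                   # inheritance chain, most specific first
get("side", envir = attr(sq, ".env"))       # field added by the subclass
```

Prepending the new class name to the class attribute is what makes S3 dispatch try the subclass method first and then walk up the chain.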


Quick inspection of a class

• print(<class name>) or simply type the class name at the prompt and press ENTER, e.g.

    > MAData
    MAData extends MicroarrayData, Object {
      public A
      public layout
      public M
      ...
      normalizeWithinSlide(...)
      ...
      public plot(what="MvsA", ...)
      public plot3d(...)
      public plotPrintorder(what="M", ...)
      ...
      public print(...)
      public save(file=NULL, path=NULL, ...)
    }

[Figure: UML class diagram with the classes Object, MicroarrayData, MAData and Layout. Members shown in the figure:]
• MAData – fields: A: matrix, M: matrix; methods: as.RGData(): RGData, ..., normalizeWithinSlide(...), normalizeAcrossSlides(...), ...
• Further methods shown: ..., plot(...), plot3d(...), plotPrintorder(...), ...
• Layout – fields: ngrid.c: integer, ngrid.r: integer, nspot.c: integer, nspot.r: integer; methods: ..., getName(...): character, getId(...): character, ..., nbrOfSpots(): integer, nbrOfGrids(): integer, ...


Quick inspection of an object

• print(<object>) or simply <object> and ENTER at the prompt, which by default is equal to print(as.character(<object>)), e.g.

    > ma
    [1] "MAData: M (5184x4), A (5184x4), Layout: Grids: 4x4 (=16), spots in grids:18x18 (=324), total number of spots: 5184. Spot name's are specified. Spot id's are specified."

• ll(<object>) gives detailed information about the (public) fields, e.g.

    > ll(ma)
      member data.class     dimension object.size
    1 A      SpotSlideArray c(5184,4)      166008
    2 layout Layout         1                 428
    3 M      SpotSlideArray c(5184,4)      166008

    > ll(ma$layout)   # or ll(getLayout(ma))
       member       data.class dimension object.size
    1  geneGrps     NULL       0                   0
    2  geneSpotMap  NULL       0                   0
    3  id           character  5184            63868
    4  ngrid.c      numeric    1                  36
    ...
    11 printtipGrps NULL       0                   0
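
A listing in the spirit of ll() can be produced with base R alone. The helper llSketch() below is hypothetical and only loosely modelled on the output shown above; it lists the members of an environment with their class, dimension and size.

```r
# Hypothetical helper, loosely modelled on the output of ll() shown above.
llSketch <- function(env) {
  members <- ls(envir = env)
  values <- mget(members, envir = env)
  data.frame(
    member      = members,
    data.class  = vapply(values, data.class, character(1)),
    dimension   = vapply(values, function(v) {
      d <- dim(v)
      if (is.null(d)) as.character(length(v)) else paste(d, collapse = "x")
    }, character(1)),
    object.size = vapply(values, function(v) as.numeric(object.size(v)), numeric(1)),
    row.names   = NULL,
    stringsAsFactors = FALSE
  )
}

fields <- new.env()
fields$M <- matrix(0, nrow = 4, ncol = 2)
fields$A <- matrix(0, nrow = 4, ncol = 2)
fields$id <- c("a", "b", "c")
llSketch(fields)
```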


Rdoc: Source-to-Rd converter

• Rdoc comments are Rd documentation within the source files:
  – easy to generate complete Rd files from source files.
  – less risk of forgetting to update Rd files.
  – automatically generates class hierarchy and method lists.
  – extra tags to include external files, e.g. example code.
• Does not require the Object class.

    #####################################################################/**
    # @Class Matlab
    #
    # \title{Matlab client for remote or local Matlab access}
    #
    # \description{
    #  @include "Matlab.declaration.Rdoc"
    # }
    #
    # \usage{
    #   matlab <- Matlab(host="localhost", port=9999, remote=FALSE)
    # }
    #
    # \arguments{
    #   \item{host}{Name of host to connect to.
    #     Default value is \code{localhost}.}
    #   \item{port}{Port number on host to connect to.
    #     Default value is \code{9999}.}
    #   \item{remote}{If \code{TRUE}, all data to and from the Matlab server will
    #     be transferred through the socket connection, otherwise the data will
    #     be transferred via a temporary file. Default value is \code{FALSE}.}
    # }
    #
    # \section{Fields and Methods}{
    #  @include "Matlab.methods.Rdoc"
    #  @include "Matlab.inheritedMethods.Rdoc"
    # }
    #
    # \examples{\dontrun{@include "Matlab.Rex"}}
    #
    # \author{Henrik Bengtsson, \url{http://www.braju.com/R/}}
    #
    # \seealso{
    #   Stand-alone methods \code{\link{readMAT}()} and \code{\link{writeMAT}()}
    #   for reading and writing MAT file structures.
    # }
    #
    # @visibility public
    #*/######################################################################
    setConstructorS3("Matlab", function(host="localhost", port=9999, remote=FALSE) {
      extend(Object(), "Matlab",
        ...


Static methods

• Methods that are specific to a class and do not belong to a certain object.
• Keeps the focus on classes/objects, not methods.
  – For instance, static method names are easy to remember for the end user ("first class then method"), e.g.
    • MicroarrayData$read("slide1.gpr")
    • Sound$read("chime.wav")
    • Colors$getHeatColors(1:10)
    instead of
    • readMicroarrayData("slide1.gpr")
    • readSound("chime.wav")
    • getHeatColors(1:10)
    which might not even be unique!
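
The "first class then method" calling style can be imitated in base R by letting a class-level object answer to `$`. In this sketch, ColorsSketch, the class name ClassSketch and the ".static" attribute are all made up for the illustration; heat.colors() is the standard grDevices function.

```r
# Hypothetical "class object" whose static members live in an environment.
ColorsSketch <- structure(NA, class = "ClassSketch")
attr(ColorsSketch, ".static") <- list2env(list(
  RED.HUE = 0,
  getHeatColors = function(n) grDevices::heat.colors(n)
))

# '$' on the class object looks the member up among the static members.
"$.ClassSketch" <- function(x, name) {
  get(name, envir = attr(x, ".static"))
}

ColorsSketch$getHeatColors(3)   # static method: called on the class, not an object
ColorsSketch$RED.HUE            # static constant
```

Grouping functions under the class name keeps them out of the global namespace, which is the uniqueness concern raised above.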


Virtual fields

• Virtual fields are fields that do not exist but appear to do so because of existing methods get<Field>() and set<Field>().
  – Example 1: The virtual field area of the Square class is defined by defining getArea() and setArea():
    • square$area will call getArea(square), which will return the area (calculated from the field side or in some other way).
    • square$area <- -12 will call setArea(square, -12), which then throws an OutOfRangeException.
  – Example 2: Private fields, e.g. side, can be protected by defining setSide(), which throws a NoSuchFieldException.
  – Example 3: The constant field RED.HUE can be write protected by defining setRED.HUE(), which throws an AssignmentException.
  – Example 4: Provide cached fields that can be calculated from the other fields, but are cached in case they are accessed often and take a long time to calculate. The cache can be removed in case of low memory.
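
Example 1 can be sketched in base R by letting `$` and `$<-` look for get<Field>() and set<Field>() methods before touching the stored fields. The class VSquare and the ".env" attribute are assumptions for this illustration, and plain stop() is used in place of the package's exception classes.

```r
# Hypothetical class "VSquare"; fields are stored in an environment attribute.
VSquare <- function(side = 0) {
  this <- structure(NA, class = "VSquare")
  attr(this, ".env") <- new.env()
  assign("side", side, envir = attr(this, ".env"))
  this
}

# Build the accessor name, e.g. ("get", "area") gives "getArea.VSquare".
accessorName <- function(prefix, field) {
  paste0(prefix, toupper(substring(field, 1, 1)), substring(field, 2), ".VSquare")
}

# Look for get<Field>.VSquare() first, then fall back to the stored field.
"$.VSquare" <- function(x, name) {
  getter <- accessorName("get", name)
  if (exists(getter, mode = "function")) {
    get(getter, mode = "function")(x)
  } else {
    get(name, envir = attr(x, ".env"))
  }
}

# Look for set<Field>.VSquare() first, then fall back to plain assignment.
"$<-.VSquare" <- function(x, name, value) {
  setter <- accessorName("set", name)
  if (exists(setter, mode = "function")) {
    get(setter, mode = "function")(x, value)
  } else {
    assign(name, value, envir = attr(x, ".env"))
  }
  invisible(x)
}

# The virtual field 'area' exists only through these two methods.
getArea.VSquare <- function(this) this$side^2
setArea.VSquare <- function(this, value) {
  if (value < 0) stop("The area of a square must be zero or greater: ", value)
  this$side <- sqrt(value)
  invisible(this)
}

sq <- VSquare(side = 3)
sq$area          # computed by getArea.VSquare()
sq$area <- 16    # handled by setArea.VSquare(), which updates 'side'
sq$side
```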


Summary example

    setConstructorS3("Square", function(side=0) {
      # Creates an object of class Square. Square, whose fields are
      # defined at the same time, extends the class Shape.
      extend(Shape(), "Square",
        side = side   # 'side' is public
      )
    })

    setMethodS3("setSide", "Square", function(this, side) {
      # sq$side <- "a" will throw a NonNumericException
      if (!is.numeric(side))
        throw(NonNumericException("Trying to set the side of a square to a non-numeric value: ", side))

      # sq$side <- -12 will throw an OutOfRangeException
      if (side < 0)
        throw(OutOfRangeException("The side of a square must be zero or greater: ", side))

      this$side <- side   # Assignment remains also after returning!
    })


Extended exception handling

• Throw Exception objects, which can be caught (quietly) based on class, e.g.

    trycatch({
      # Calls setSide(), which throws an OutOfRangeException.
      sq$side <- -12
    }, NonNumericException = {
      cat("The side of a square must be a numeric value.\n")
    }, ANY = {
      # catches any other types of Exception (also try-error).
      print(Exception$getLastException())
    }, finally = {
      # always double the side whatever happens.
      sq$side <- 2*sq$side
    })

    Error: [2003-03-08 12:11:43] OutOfRangeException: The side of a square must be zero or greater: -12

• Does not require the Object class.

[Figure: class hierarchy with Object, Exception and RccViolationException (labelled R.oo), plus OutOfRangeException and NonNumericException.]

Exception
  static getLastException(): Exception
  getMessage(): character
  getWhen(): POSIX time
  throw()
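
For comparison, class-based catching can also be written with base R's tryCatch() and classed conditions. This is not the package's trycatch(); the constructor exceptionSketch() and the class names ending in "Sketch" are invented for the illustration.

```r
# Hypothetical constructor for a classed condition; names are illustrative.
exceptionSketch <- function(class, ...) {
  structure(
    class = c(class, "ExceptionSketch", "error", "condition"),
    list(message = paste0(...), call = NULL)
  )
}

setSideSketch <- function(side) {
  if (!is.numeric(side)) {
    stop(exceptionSketch("NonNumericExceptionSketch",
                         "The side of a square must be a numeric value: ", side))
  }
  if (side < 0) {
    stop(exceptionSketch("OutOfRangeExceptionSketch",
                         "The side of a square must be zero or greater: ", side))
  }
  side
}

events <- character(0)
result <- tryCatch({
  setSideSketch(-12)
}, NonNumericExceptionSketch = function(ex) {
  "non-numeric"
}, ExceptionSketch = function(ex) {
  # Catches any other exception derived from ExceptionSketch.
  conditionMessage(ex)
}, finally = {
  # Runs whether or not an exception was thrown.
  events <- c(events, "finally ran")
})
result
```

Handlers are tried in order against the class vector of the condition, so the more specific class is listed before the general one.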


Future

• Make the API (even) more similar to the S4 API
  – Makes transitions to and from R.oo (and S4) easier.
  – Less confusing for beginners.
• Make an S4 version of the package
  – When the problem "generic functions are too restricted on matching argument" is solved.
• Make it easier to declare private fields or constants.
• Implement the mechanisms for field access in native code.
• Publish R.oo on CRAN
  – Requires a stable API. After 2+ years it is indeed very stable, but any major changes after v1.0 will be annoying for the user.


Acknowledgments

• The R development team
• People on the r-help mailing list
• All users that have given feedback on the project

See http://www.maths.lth.se/help/R/ for RCC, more documentation, help, examples, and installation of
• R.classes bundle: R.audio, R.base, R.graphics, R.io, R.lang, R.matlab, R.oo, R.tcltk, R.ui
• cDNA microarray package: com.braju.sma

