Post on 25-Feb-2016
Henrik Bengtsson, hb@maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden
DSC-2003, Vienna. March 20-22, 2003
The R.oo Package –
Object-Oriented Programming With References Using Standard R Code
2 of 22 For slides etc: http://www.maths.lth.se/help/R/
Outline
• Purpose and what the package is and is not.
• RCC: R Coding Conventions (draft).
• Reference variables.
• The root class Object.
• setMethodS3() & setConstructorS3().
• Rdoc comments.
• Static methods.
• Virtual fields.
• trycatch() - exception handling based on class.
Purposes
• End user (the most important person at the end of the day!)
  – Provide consistent object-oriented APIs across different packages, e.g. by having a well-defined naming convention for classes, methods, fields and variables.
  – Make class inheritance more explicit.
  – Provide a simpler API, e.g. fewer arguments.
  – More memory-efficient packages.
• Developer / programmer
  – Provide reference variables to reduce memory requirements and data redundancy.
  – R Coding Conventions, e.g. naming conventions.
  – Create generic functions automatically.
  – Make code cleaner and remove the need for tedious code repetition.
  – Minimize the risk of package conflicts.
  – More code checking when creating methods and classes, to catch errors early on.
  – Catch rare but "classical" bugs, e.g. using reserved words in method names.
  – Keep help pages up to date with the source code by allowing Rd documentation to be placed together with the code in the source files.
Real world example
# Read all GenePix Result files
gpr <- MicroarrayData$read(pattern="*.gpr")

# Extract the foreground & background signals of the red and
# the green channels. The slide layout is also included.
raw <- as.RawData(gpr)

# Get the background corrected signal as M=log(R/G) and A=log(RG)/2.
ma <- getSignal(raw, bgSubtract=TRUE)

normalizeWithinSlide(ma, method="p")  # print-tip normalization.

knownGenes <- c(50,194,3433,5541,6384)
plot(ma); highlight(ma, knownGenes)            # highlights the data points from the
plotPrintorder(ma); highlight(ma, knownGenes)  # correct slide in the correct space.
plotSpatial(ma); highlight(ma, knownGenes)
plotSpatial3d(gpr, field="area", col=getColor(ma))

# Write the normalized data to a tab-delimited file
write(ma, "NormalizedExpressions.dat")
What the package is and isn't
• It is not supposed to replace S3 or S4, but
• it is an extra layer on top of S3 (eventually S4), to
• move the focus from S3 & S4 details to object-oriented design and implementation.
[Figure: R.oo drawn as a layer on top of the R environment (S3 and eventually S4).]
• It has been tested and verified for > 2 years!
RCC: R Coding Conventions (draft)
• Standardizes the coding style
  – Example of the naming conventions:
    • Variables, objects, fields and methods should be verbs starting with a lower case letter, e.g. shape$side and normalize().
    • Classes should be nouns starting with an upper case letter, e.g. MicroarrayData.
    • Constants should be in all upper case, e.g. Colors$RED.HUE.
    • Similar to Java.
• Standards
  – make the code (and the design) easier to read, share and maintain.
  – reduce the risk of bugs and misunderstandings.
http://www.maths.lth.se/help/R/RCC/
Reference variables
• Memory efficient.
• Minimizes the amount of redundant data.
• Very useful for some data structures, e.g. graphs.
• References in R.oo are implemented using the environment data type.
  – Collected by the R garbage collector.
• (More user-friendly method interfaces, since methods can "communicate" with each other by updating the state of the object.)
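The difference between ordinary R values and environment-based references can be sketched in base R. This is only an illustration of the underlying idea, not R.oo code; the function and variable names (growList, growEnv, counterList, counterEnv) are made up for the example.

```r
# Copy semantics: modifying a list inside a function changes a local copy only.
growList <- function(x) {
  x$count <- x$count + 1
  invisible(x)
}

# Reference semantics: an environment is not copied when it is passed around,
# so the update is visible to the caller.
growEnv <- function(env) {
  env$count <- env$count + 1
  invisible(env)
}

counterList <- list(count = 0)
counterEnv <- new.env()
counterEnv$count <- 0

growList(counterList)
growEnv(counterEnv)

print(counterList$count)  # 0: the caller's list is unchanged
print(counterEnv$count)   # 1: the shared environment was updated

# Two names can refer to the same environment, so no data is duplicated.
alias <- counterEnv
growEnv(alias)
print(counterEnv$count)   # 2
```

Because both names point at one environment, large data held in it exists only once in memory, which is the redundancy argument made above.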
A common root class: Object
1. All classes should have the common root class Object.
  – A similar idea exists in R today, e.g. print(), as.character() etc, but a common root class makes it more explicit.

Object
  $(name): ANY
  $<-(name, value)
  [[(name): ANY
  [[<-(name, value)
  as.character(): character
  attach(private=FALSE, pos=2)
  clone(): Object
  data.class(): character
  detach()
  equals(other): logical
  extend(this, ...className, ...): Object
  finalize()
  getFields(private=FALSE): character[]
  hashCode(): integer
  ll(...): data.frame
  static load(file): Object
  objectSize(): integer
  print()
  save(file=NULL, ...)
Object – the common root class
[Figure: class hierarchy rooted at Object, with classes grouped by package. Package labels shown: R.oo, com.braju.sma, R.io, R.graphics. Classes shown: Object, Exception, RccViolationException; MicroarrayData, GenePixData, ImaGeneData, QuantArrayData, ScanAlyzeData, SpotData, SpotFinderData, MAData, RawData, RGData, TMAData, Layout, GalLayout; Reporter, HtmlReporter, LaTeXReporter, TextReporter, MultiReporter, File, FileFilter, RspEngine; BitmapImage, MonochromeImage, GrayImage, RGBImage, Color, Device.]
A common root class: Object
1. All classes should have the common root class Object.
  – A similar idea exists in R today, e.g. print(), as.character() etc, but a common root class makes it more explicit.
2. Fields of an Object can be accessed as elements of a list, e.g.:
  – square$side and
  – square[["side"]] <- 23
3. Methods can also be called as
  – square$getArea()
4. The implementation of reference variables is taken care of within the Object class. Under the hood, we roughly have:

"$.Object" <- function(object, name) {
  # unclass() avoids dispatching to $.Object again when reading 'env'
  get(name, envir=unclass(object)$env)
}

"$<-.Object" <- function(object, name, value) {
  assign(name, value, envir=unclass(object)$env)
  # a replacement function must return the object
  invisible(object)
}
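A self-contained variant of the accessor idea, runnable in base R, might look as follows. It is a sketch under assumptions: the class name Ref, the constructor newRef() and the choice to keep the environment in an attribute called ".env" are illustrative and are not taken from the slides.

```r
# Hypothetical constructor: the object is a small handle whose fields live
# in an environment stored as an attribute.
newRef <- function(...) {
  env <- new.env()
  fields <- list(...)
  for (name in names(fields)) {
    assign(name, fields[[name]], envir = env)
  }
  structure(list(), .env = env, class = "Ref")
}

# Field read: look the name up in the object's environment only.
"$.Ref" <- function(object, name) {
  get(name, envir = attr(object, ".env"), inherits = FALSE)
}

# Field write: assign into the shared environment and return the handle.
"$<-.Ref" <- function(object, name, value) {
  assign(name, value, envir = attr(object, ".env"))
  object
}

square <- newRef(side = 2)

# The function updates the field through the handle; nothing is returned.
resize <- function(shape, side) {
  shape$side <- side
  invisible(NULL)
}

resize(square, 23)
print(square$side)  # 23: the change made inside resize() is visible here
```

Storing the environment in an attribute rather than in a list element means the accessor never has to call `$` on the object itself.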
setMethodS3()
• Defines a method of a class.
• Creates a generic function automatically if and only if it is missing.
• RCC:
  – Methods should start with a lower case letter.
  – Asserts that a correct method name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
• Does not require the Object class.

setMethodS3("plotPrintorder", "MAData", function(object, ...) {
  ...
})

setMethodS3("next", "Iterator", function(object, ...) {
  ...
})

Error: [2003-03-18 16:28:00] RccViolationException: Method names must not be same as a reserved keyword in R: next, cf. http://www.maths.lth.se/help/R/RCC/
Problems with generic functions
• Hard to check if a function (generic or not) already exists.
• Ad hoc solutions for creating generic functions "automatically".
• Under the S3 scheme, it is possible to create generic functions that are truly generic:

normalize <- function(...) UseMethod("normalize")

  Note that the first argument is omitted. If not, it would be impossible to have default functions with no arguments, e.g. search().
• The R.oo package automatically creates generic functions as above.
• We are not aware of how to do the same in S4 (this is the main reason why R.oo is currently staying with S3).
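A minimal base R sketch of "create the generic only if it is missing" is given here. The helper createGenericIfMissing() is hypothetical and much simpler than whatever R.oo does internally: it only checks a single environment, whereas a fuller check would also have to consider functions found along the search path.

```r
# Hypothetical helper: define 'name' as function(...) UseMethod(name) in
# 'envir', unless a function with that name already exists there.
createGenericIfMissing <- function(name, envir = parent.frame()) {
  if (exists(name, envir = envir, mode = "function", inherits = FALSE)) {
    return(invisible(FALSE))
  }
  generic <- function(...) NULL
  body(generic) <- call("UseMethod", name)
  environment(generic) <- envir
  assign(name, generic, envir = envir)
  invisible(TRUE)
}

createGenericIfMissing("normalize")

# Methods for the new generic.
normalize.default <- function(...) "default method"
normalize.numeric <- function(x, ...) x / max(abs(x))

print(normalize(c(1, 2, 4)))  # dispatches to normalize.numeric
print(normalize("text"))      # falls back to normalize.default
print(normalize())            # no arguments: the default method is used

# A second call leaves the existing generic untouched.
created <- createGenericIfMissing("normalize")
print(created)                # FALSE
```

Since the generic declares only `...`, dispatch uses the first argument actually supplied, and a call without arguments still reaches the default method.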
setConstructorS3()
• Defines the constructor method of a class, but also the class.
• RCC:
  – Asserts that a correct class name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
  – Class and constructor names should start with an UPPER CASE letter.
  – Constructors should be named the same as the class.
• Does not require the Object class.

setConstructorS3("MAData", function(M, A, layout=NULL) {
  extend(MicroarrayData(layout=layout), "MAData",
    M = as.matrix(M),
    A = as.matrix(A)
  )
})

Constructor/class definition hybrid: Creates an object of the super class, which is then "extended" into an MAData object with additional fields.
Quick inspection of a class
• print(<class name>) or simply type the class name at the prompt and press ENTER, e.g.

> MAData
MAData extends MicroarrayData, Object {
  public A
  public layout
  public M
  ...
  normalizeWithinSlide(...)
  ...
  public plot(what="MvsA", ...)
  public plot3d(...)
  public plotPrintorder(what="M", ...)
  ...
  public print(...)
  public save(file=NULL, path=NULL, ...)
}
[Figure: UML class diagram with the boxes MicroarrayData, MAData, Object and Layout. MAData: fields A: matrix, M: matrix; methods as.RGData(): RGData, ..., normalizeWithinSlide(...), normalizeAcrossSlides(...), ... Further methods shown: ..., plot(...), plot3d(...), plotPrintorder(...), ... Layout: fields ngrid.c: integer, ngrid.r: integer, nspot.c: integer, nspot.r: integer; methods ..., getName(...): character, getId(...): character, ..., nbrOfSpots(): integer, nbrOfGrids(): integer, ...]
Quick inspection of an object
• print(<object>) or simply <object> and ENTER at the prompt, which by default is equal to print(as.character(<object>)), e.g.

> ma
[1] "MAData: M (5184x4), A (5184x4), Layout: Grids: 4x4 (=16), spots in grids:18x18 (=324), total number of spots: 5184. Spot name's are specified. Spot id's are specified."

• ll(<object>) gives detailed information about the (public) fields, e.g.

> ll(ma)
  member     data.class dimension object.size
1      A SpotSlideArray c(5184,4)      166008
2 layout         Layout         1         428
3      M SpotSlideArray c(5184,4)      166008

> ll(ma$layout)   # or ll(getLayout(ma))
         member data.class dimension object.size
1      geneGrps       NULL         0           0
2   geneSpotMap       NULL         0           0
3            id  character      5184       63868
4       ngrid.c    numeric         1          36
...
11 printtipGrps       NULL         0           0
Rdoc: Source-to-Rd converter
• Rdoc comments are Rd documentation within the source files:
  – easy to generate complete Rd files from source files.
  – less risk of forgetting to update Rd files.
  – automatically generates class hierarchy and method lists.
  – extra tags to include external files, e.g. example code.
• Does not require the Object class.

#####################################################################/**
# @Class Matlab
#
# \title{Matlab client for remote or local Matlab access}
#
# \description{
#  @include "Matlab.declaration.Rdoc"
# }
#
# \usage{
#   matlab <- Matlab(host="localhost", port=9999, remote=FALSE)
# }
#
# \arguments{
#   \item{host}{Name of host to connect to.
#     Default value is \code{localhost}.}
#   \item{port}{Port number on host to connect to.
#     Default value is \code{9999}.}
#   \item{remote}{If \code{TRUE}, all data to and from the Matlab server will
#     be transferred through the socket connection, otherwise the data will
#     be transferred via a temporary file. Default value is \code{FALSE}.}
# }
#
# \section{Fields and Methods}{
#  @include "Matlab.methods.Rdoc"
#  @include "Matlab.inheritedMethods.Rdoc"
# }
#
# \examples{\dontrun{@include "Matlab.Rex"}}
#
# \author{Henrik Bengtsson, \url{http://www.braju.com/R/}}
#
# \seealso{
#   Stand-alone methods \code{\link{readMAT}()} and \code{\link{writeMAT}()}
#   for reading and writing MAT file structures.
# }
#
# @visibility public
#*/######################################################################
setConstructorS3("Matlab", function(host="localhost", port=9999, remote=FALSE) {
  extend(Object(), "Matlab",
  ...
Static methods
• Methods that are specific to a class and do not belong to a certain object.
• Keeps the focus on classes/objects, not methods.
  – For instance, static method names are easy to remember for the end user ("first class then method"), e.g.
    • MicroarrayData$read("slide1.gpr")
    • Sound$read("chime.wav")
    • Colors$getHeatColors(1:10)
    instead of
    • readMicroarrayData("slide1.gpr")
    • readSound("chime.wav")
    • getHeatColors(1:10)
    which might not even be unique!
Virtual fields
• Virtual fields are fields that do not exist, but appear to do so because of existing methods get<Field>() and set<Field>().
  – Example 1: The virtual field area of the Square class is defined by defining getArea() and setArea():
    • square$area will call getArea(square), which will return the area (calculated from the field side or in some other way).
    • square$area <- -12 will call setArea(square, -12), which then throws an OutOfRangeException.
  – Example 2: Private fields, e.g. side, can be protected by defining setSide(), which throws a NoSuchFieldException.
  – Example 3: The constant field RED.HUE can be write protected by defining setRED.HUE(), which throws an AssignmentException.
  – Example 4: Provide cached fields that can be calculated from the other fields, but are cached because they are accessed often and take a long time to calculate. The cache can be removed in case of low memory.
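How a virtual field such as area could be resolved can be sketched in base R. The lookup rule used here (try get<Field>() or set<Field>() first, otherwise use the stored field) is an assumption made for illustration, as are newSquare(), capitalize() and the ".env" attribute; a plain stop() stands in for the OutOfRangeException of Example 1.

```r
# Upper-case the first letter, e.g. "area" -> "Area".
capitalize <- function(s) paste0(toupper(substring(s, 1, 1)), substring(s, 2))

# Hypothetical constructor: the only stored field is 'side'.
newSquare <- function(side = 0) {
  env <- new.env()
  env$side <- side
  structure(list(), .env = env, class = "Square")
}

getArea <- function(this, ...) UseMethod("getArea")
getArea.Square <- function(this, ...) attr(this, ".env")$side^2

setArea <- function(this, value, ...) UseMethod("setArea")
setArea.Square <- function(this, value, ...) {
  if (value < 0) stop("The area of a square must be zero or greater: ", value)
  assign("side", sqrt(value), envir = attr(this, ".env"))
  invisible(this)
}

# Reading a field: prefer an accessor get<Field>() if one exists.
"$.Square" <- function(object, name) {
  getter <- paste0("get", capitalize(name))
  if (exists(getter, mode = "function")) {
    return(get(getter, mode = "function")(object))
  }
  get(name, envir = attr(object, ".env"), inherits = FALSE)
}

# Writing a field: prefer an accessor set<Field>() if one exists.
"$<-.Square" <- function(object, name, value) {
  setter <- paste0("set", capitalize(name))
  if (exists(setter, mode = "function")) {
    get(setter, mode = "function")(object, value)
  } else {
    assign(name, value, envir = attr(object, ".env"))
  }
  object
}

sq <- newSquare(side = 3)
print(sq$area)   # 9, computed by getArea(); no field 'area' is stored
sq$area <- 16    # handled by setArea(), which updates 'side'
print(sq$side)   # 4

# An invalid value is rejected by the setter and the state is unchanged.
result <- tryCatch({ sq$area <- -12; "assigned" },
                   error = function(e) conditionMessage(e))
print(result)
```

The same hook is where write protection (Examples 2 and 3) or caching (Example 4) could be added, since every read and write passes through an accessor when one is defined.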
Summary example
setConstructorS3("Square", function(side=0) {
  # Creates an object of class Square. Square, whose fields are
  # defined at the same time, extends the class Shape.
  extend(Shape(), "Square",
    side = side   # 'side' is public
  )
})

setMethodS3("setSide", "Square", function(this, side) {
  # sq$side <- "a" will throw a NonNumericException
  if (!is.numeric(side))
    throw(NonNumericException("Trying to set the side of a square ",
                              "to a non-numeric value: ", side))

  # sq$side <- -12 will throw an OutOfRangeException
  if (side < 0)
    throw(OutOfRangeException("The side of a square must be zero ",
                              "or greater: ", side))

  this$side <- side   # Assignment remains also after returning!
})
Extended exception handling
• Throw Exception objects, which can be caught (quietly) based on class, e.g.
• Does not require the Object class.

trycatch({
  # Calls setSide(), which throws an OutOfRangeException.
  sq$side <- -12
}, NonNumericException = {
  cat("The side of a square must be a numeric value.\n")
}, ANY = {
  # Catches any other types of Exception (also try-error).
  print(Exception$getLastException())
}, finally = {
  # Always double the side whatever happens.
  sq$side <- 2*sq$side
})

Error: [2003-03-08 12:11:43] OutOfRangeException: The side of a square must be zero or greater: -12

[Figure: class hierarchy showing Object, Exception and RccViolationException (labelled R.oo) together with OutOfRangeException and NonNumericException. UML box for Exception: static getLastException(): Exception, getMessage(): character, getWhen(): POSIX time, throw().]
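Catching by class can also be sketched with the condition system in base R, using tryCatch() as a stand-in for the trycatch() call on the slide. The helper newException() and the class vector it builds are assumptions made for this example; they are not R.oo's Exception implementation.

```r
# Hypothetical helper: build an error condition that carries a custom class.
newException <- function(class, message) {
  structure(
    class = c(class, "Exception", "error", "condition"),
    list(message = message, call = NULL, when = Sys.time())
  )
}

# Validates a side length and signals a classed condition on failure.
checkSide <- function(side) {
  if (!is.numeric(side)) {
    stop(newException("NonNumericException",
      paste0("Trying to set the side of a square to a non-numeric value: ", side)))
  }
  if (side < 0) {
    stop(newException("OutOfRangeException",
      paste0("The side of a square must be zero or greater: ", side)))
  }
  side
}

# Handlers are selected by the class of the signalled condition.
handle <- function(side) {
  tryCatch({
    checkSide(side)
    "ok"
  }, NonNumericException = function(ex) {
    "non-numeric"
  }, Exception = function(ex) {
    paste("other exception:", class(ex)[1])
  }, finally = {
    cat("checked one value\n")  # runs whatever happens
  })
}

print(handle(2))
print(handle("a"))
print(handle(-12))
```

The handler named Exception plays the role of the ANY branch for this small hierarchy: it catches every condition that inherits from Exception and was not matched by a more specific handler listed before it.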
Future
• Make the API (even) more similar to the S4 API
  – Makes transitions to and from R.oo (and S4) easier.
  – Less confusing for beginners.
• Make an S4 version of the package
  – When the problem "generic functions are too restricted on matching arguments" is solved.
• Make it easier to declare private fields or constants.
• Implement the mechanisms for field access in native code.
• Publish R.oo on CRAN
  – Requires a stable API. After 2+ years it is indeed very stable, but any major changes after v1.0 will be annoying for the user.
Acknowledgments
• The R development team
• People on the r-help mailing list
• All users that have given feedback on the project

See http://www.maths.lth.se/help/R/ for RCC, more documentation, help, examples, and installation of
• R.classes bundle: R.audio, R.base, R.graphics, R.io, R.lang, R.matlab, R.oo, R.tcltk, R.ui
• cDNA microarray package: com.braju.sma