
Di 2011 houston

Transcript
Page 1: Di 2011 houston

ggbio: Extending the Grammar of Graphics to Genomic Data

Tengfei Yin, Di Cook, Iowa State University
Michael Lawrence, Genentech

Interface 2012, Rice University

[Title slide figures: stacked short reads ("Stepping") above their "Coverage"; gene models on hg19::chrX (48245000 to 48270000) colored by strand; a karyogram of chromosomes 1 to 22, X and Y colored by gieStain; a circular view of chromosomes 1 to 22 showing interchromosomal and intrachromosomal rearrangements, with a tumreads legend.]

Page 2: Di 2011 houston

ggbio - Genomic Data Vis - Interface 2012, Rice University /31

Motivation

Lots of tools exist for displaying genomic data
Many different packages, many standalone, many different data standards

Pages 3-10: Di 2011 houston

Motivation: Circos

[Figures: Circos example plots.]

Need to construct a central configuration file, and many others, from scratch; the learning curve is very steep
Adding a legend is not easy
Cannot map aesthetics to certain variables

Pages 11-20: Di 2011 houston

Motivation: UCSC Genome Browser

[Figures: karyogram view, with associated data plotted; logical zoom, all we know about this genetic code.]

Very commonly used, very popular
Gives broadly applicable, generic, but narrow selection of plot choices
No operations on genomic ranges views to facilitate perception of structure

Pages 21-24: Di 2011 houston

Motivation: Gviz (Hahne et al)

[Figure: Gviz plot of Chromosome X, 65.93 mb to 65.97 mb, with data tracks, shown next to the following excerpt from the Gviz documentation.]

5 Composite plots for multiple chromosomes

As mentioned in the introduction section, a set of Gviz tracks has to share the same chromosome when plotted, i.e., only a single chromosome can be active during a given plotting operation. Consequently, we cannot directly create plots for multiple chromosomes in a single call to the plotTracks function. However, since the underlying graphical infrastructure of the Gviz package uses grid graphics, we can build our own composite plot using multiple consecutive plotTracks calls. All we need to take care of is an adequate layout structure to plot into, and we also need to tell plotTracks not to clear the graphics device before plotting, which can be achieved by setting the function’s add argument to TRUE. For details on how to create a layout structure in the grid graphics system, please see the help page at ?grid.layout.

We start by creating an AnnotationTrack object and a DataTrack object which both contain data for several chromosomes.

> chroms <- c("chr1", "chr2", "chr3", "chr4")
> maTrack <- AnnotationTrack(range = GRanges(seqnames = chroms,
+     ranges = IRanges(start = 1, width = c(100, 400,
+         200, 1000)), strand = c("+", "+", "-", "+")),
+     genome = "mm9", chromosome = "chr1", name = "foo")
> mdTrack <- DataTrack(range = GRanges(seqnames = rep(chroms,
+     c(10, 40, 20, 100)), ranges = IRanges(start = c(seq(1,

[The excerpt is cut off here on the slide.]

Pretty good!
Incorporated with R, and R data structures
Uses grid (low level) graphics, very flexible, but not leveraging tools like ggplot2

Page 25: Di 2011 houston

Outline

What is the grammar of graphics?
How it is extended for genomic data
Examples
Next steps: interactive graphics

Page 26: Di 2011 houston

Grammar

Grammar forms the foundation of a language. It is a set of structural rules that govern composition.
For graphics, it provides a way to construct a plot in a common form, and enables clarification of similarities and differences between plots.

Pages 27-34: Di 2011 houston

Grammar (ggplot2)

Bar chart
ggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)

Rose plot/Coxcomb (labelled "Pie chart" on the earlier build slides)
ggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()

[Figures: bar chart of count (0 to 80) by day (Thu, Fri, Sat, Sun), filled by day; the same chart in polar coordinates.]

Pages 35-37: Di 2011 houston

Grammar (ggplot2)

Stacked bar chart
ggplot(data=tips, aes(x="", fill=day)) + geom_bar(width=1)

Pie chart
ggplot(data=tips, aes(x="", fill=day)) + geom_bar(width=1) + coord_polar(theta="y")

[Figures: a single stacked bar of count (0 to 200) filled by day (Thu, Fri, Sat, Sun); the same bar in polar coordinates with theta = "y".]
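The four charts above differ only in the x mapping and the coordinate system. The sketch below puts them side by side so that difference is easy to see. It assumes ggplot2 is installed; `tips_demo` is a small made-up stand-in for the `tips` data on the slides, not the real dataset.

```r
library(ggplot2)

# Hypothetical stand-in for `tips`: one row per bill, only `day` filled in.
tips_demo <- data.frame(
  day = factor(rep(c("Thu", "Fri", "Sat", "Sun"), times = c(6, 2, 9, 7)),
               levels = c("Thu", "Fri", "Sat", "Sun"))
)

# Same data, same geom; only the x mapping and the coord change.
bar     <- ggplot(tips_demo, aes(x = day, fill = day)) + geom_bar(width = 1)
rose    <- bar + coord_polar()                 # angle = day, radius = count
stacked <- ggplot(tips_demo, aes(x = "", fill = day)) + geom_bar(width = 1)
pie     <- stacked + coord_polar(theta = "y")  # angle = count

# The counts that geom_bar() derives through its default stat
day_counts <- table(tips_demo$day)
```

Printing `bar`, `rose`, `stacked` and `pie` in turn draws the four charts. The data and the geom never change, which is the point the slides make about the grammar.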

Page 38: Di 2011 houston

Grammar Elements

DATA: What is to be plotted
STAT: Statistical operations to make on data, like binning
GEOM: Geometric object, elements used to display aspects of the data
SCALE: Map data to aesthetics to geom
COORD: Coordinate system to use, e.g. Cartesian
(FACET): Subset and display

Pages 39-52: Di 2011 houston

Example: MA plot

[Figure: MA plot, annotated one grammar element at a time.]

DATA = expression data frame, x = average intensity, y = fold change
GEOM = point
SCALE = x is logged
SCALE = color is mapped to statistical significance
COORD = default, Cartesian
FACET = none

qplot(baseMean, log2FoldChange, data = res, geom = "point",
      xlab = "Normalized mean", ylab = "log2 fold change",
      xlim = c(0, 10000), color = group) +
  scale_x_log10() +
  scale_color_manual(values = c("black", "red"))
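The same plot can be written with explicit layers, so that each grammar element from the slide appears as its own term. This is only a sketch: `res_demo` is simulated, and its `group` column is a placeholder rule (absolute fold change above 2), not the significance test behind the slide's `res`.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for `res`: simulated means and fold changes.
res_demo <- data.frame(
  baseMean       = 10^runif(500, min = 0, max = 4),
  log2FoldChange = rnorm(500, sd = 1)
)
# Placeholder grouping; the slide colors by statistical significance instead.
res_demo$group <- factor(
  ifelse(abs(res_demo$log2FoldChange) > 2, "flagged", "other"),
  levels = c("other", "flagged")
)

ma_plot <- ggplot(res_demo,                                  # DATA
                  aes(x = baseMean, y = log2FoldChange,
                      color = group)) +
  geom_point() +                                             # GEOM
  scale_x_log10() +                                          # SCALE: x is logged
  scale_color_manual(values = c("black", "red")) +           # SCALE: color
  labs(x = "Normalized mean", y = "log2 fold change")
# COORD is left at the Cartesian default and no FACET is added.
```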

Page 53: Di 2011 houston

What’s different?

Genomic data has interval context
Several common geoms used in standard plots, not in current grammar
Additional transformations common
Lining up of multiple data plots, especially against genome

Page 54: Di 2011 houston

What’s different?

DATA: Genomic ranges

No   seqnames  ranges                strand  tx_id  exon_id
1    chrX      [48242968, 48243005]  +       35775  132624
2    chrX      [48243475, 48243563]  +       35775  132625
3    chrX      [48244003, 48244117]  +       35775  132626
4    chrX      [48244794, 48244889]  +       35775  132627
5    chrX      [48246753, 48246802]  +       35775  132628
...  ...       ...                   ...     ...    ...
26   chrX      [48270193, 48270307]  -       35778  132637
27   chrX      [48269421, 48269516]  -       35778  132636
28   chrX      [48267508, 48267557]  -       35778  132635
29   chrX      [48262894, 48262998]  -       35778  132633
30   chrX      [48261524, 48262111]  -       35778  132632

Table 2: Typical biological data coerced into a data frame: a GRanges table representing the genes SSX4 and SSX4B. Each row represents one exon; seqnames gives the chromosome name, ranges the exon interval, and strand the direction; tx_id and exon_id are the internal IDs used for mapping across databases.
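A table like this can be built by hand as a GRanges object. The sketch below assumes the Bioconductor package GenomicRanges is installed and uses the first three rows of the table; the object name `exons` is illustrative.

```r
library(GenomicRanges)  # Bioconductor; also attaches IRanges

# First three exons from the table; seqnames and strand are recycled.
exons <- GRanges(
  seqnames = "chrX",
  ranges   = IRanges(start = c(48242968, 48243475, 48244003),
                     end   = c(48243005, 48243563, 48244117)),
  strand   = "+",
  tx_id    = rep(35775L, 3),
  exon_id  = c(132624L, 132625L, 132626L)
)

exon_widths <- width(exons)          # interval lengths in bases
exon_table  <- as.data.frame(exons)  # the data-frame view of the same ranges
```

The interval is the basic unit here. `start`, `end` and `width` come from the ranges, and `tx_id` and `exon_id` travel along as metadata columns.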

Page 55: Di 2011 houston

Extensions

[Diagram of the ggbio framework. Labels: data source(s); I/O packages in Bioconductor; abstract data (formal model); meta data; grammar of graphics with extensions (geom, stat, scale, coord, facet, layout); autoplot; plots; tracks.]

Page 56: Di 2011 houston

Extensions

[Excerpt from the ggbio paper, shown on the slide; it starts and ends mid-sentence.]

... current software. Development of new visualization tools should be independently factorized into components of the grammar. Table 1 describes the extensions developed in this work.

Comp      name            usage
geom      geom_rect       rectangle
          geom_segment    segment
          geom_chevron    chevron
          geom_arrow      arrow
          geom_arch       arches
          geom_bar        bar
          geom_alignment  alignment (gene)
stat      stat_coverage   coverage (of reads)
          stat_mismatch   mismatch pileup for alignments
          stat_aggregate  aggregate in sliding window
          stat_stepping   avoid overplotting
          stat_gene       consider gene structure
          stat_table      tabulate ranges
          stat_identity   no change
coord     linear          ggplot2 linear but facet by chromosome
          genome          put everything on genomic coordinates
          truncate gaps   compact view by shrinking gaps
layout    track           stacked tracks
          karyogram       karyogram display
          circle          circular
faceting  formula         facet by formula
          ranges          facet by ranges
scale     not extended    ggplot2 default

Table 1: Components of the basic grammar of graphics, with the extensions available in ggbio. [Icon column not shown.]

In comparison to regular data elements which might be mapped to the ggplot2 geoms of points, lines and polygons, genomic data has a basic currency of an interval. Intervals underlie exons, introns and locations on the genome, which form the reference frame for biological data. We have introduced several new geoms for representing intervals and connections between intervals: rectangle, segment, chevron, arch, arrow and arrow rectangle. The geom alignment and its variants are combinations of those geoms, which function as a unit. For example, the alignment geom might draw exons as arrow rectangles and introns as chevrons. Figure 2 shows the new geoms for interval data.

[Figure 1 diagram: see the framework diagram labels on the previous slide.]

Figure 1: Diagram of the framework for biological data visualization. It starts with a mapping from empirical data to an abstract data model, followed by a general and extended grammar of graphics that maps data elements to graphical elements. Orange boxes indicate the new components provided by ggbio, and the dashed frame indicates the body of the grammar of graphics, including the parts we extended with ggbio.

Several new types of statistical transformation (stat) are defined. In ggplot2, some common transformations are possible, for example binning to create a histogram, or smoothing to add a line representing a model fit to the data. For genomic data there are some commonly useful transformations that are incorporated in ggbio: coverage, i.e., feature stack depth, and mismatch summaries from read alignments.

Additional types of coordinate system (coord), layout and faceting methods are also available. These additional components are listed in Table 1, and they are described in more detail in the plot examples.

Let’s analyze the anatomy in a minimal example, illustrated in Figure 3, to get an impression of the components of the grammar. In this plot, we are showing a gene structure with four transcripts by using the alignment geom, stepping transformation and aesthetic attributes such as color mapped to strand, in the genomic coordinate system. It will become clear that almost all graphics found in existing genome browsers or visualization tools could be described by the different components introduced here. This is the strength of the grammar of graphics. While it may appear simplistic initially, once one gains a deeper understanding of the grammar, one may discover that seemingly complex data graphics can be abstracted as components from “pictures” and thus made tangible. This will aid in the design of future graphics.

The following sections of the paper describe the grammar extensions, and provide examples of their use.

2.1 Modeling Data

Data are the first component of the grammar, and data may be collected in different ways. Wilkinson makes a distinction between empirical data, abstract data and metadata [19]. Empirical data are collected from observations of the real world, while abstract data are defined by a formal mathematical model. Metadata are data about data, which might be empirical, abstract or metadata themselves.

Genomic data are often communicated in tabular text files, such as csv and tab-delimited files. Rows always represent observations and columns always represent a set of variables. Annotation tracks are usually stored according to specific formats, each with a fixed ...


Page 59: Di 2011 houston

Extensions: autoplot

Tries to recognize the data object to be plotted and how it should be displayed, and does a jolly good job of it.

Pages 60-68: Di 2011 houston

Example

[Figure: gene models on hg19::chrX, 48245000 to 48270000, drawn on stepping levels 1 and 2 and colored by strand.]

DATA = GRangesList object
GEOM = alignment, chevron
SCALE = stepping, color = strand
LAYOUT = linear

Page 69: Di 2011 houston

Examples

Examine short reads
Stack them (top)
Collapse into “density” (bottom)

[Figure: two stacked tracks labeled Stepping and Coverage; axis ticks 0 to 15 and 0 to 300]

p1 <- autoplot(gr)                      # reads stacked (top track)
p2 <- autoplot(gr, stat = "coverage")   # reads collapsed into coverage (bottom track)
tracks(p1, p2)
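What "stack" and "collapse" mean can be sketched in base R on a handful of made-up reads. The interval values are hypothetical, and the greedy row assignment is one simple way to stack intervals; it is not necessarily the algorithm ggbio uses.

```r
# Toy reads as closed integer intervals [start, end] (hypothetical values)
starts <- c(1, 3, 4, 8)
ends   <- c(5, 6, 9, 10)

# Stepping: put each read on the first row whose last read ends before it starts
row_end <- integer(0)
step <- integer(length(starts))
for (i in order(starts)) {
  free <- which(row_end < starts[i])
  if (length(free) == 0) {
    row_end <- c(row_end, ends[i])   # open a new row
    step[i] <- length(row_end)
  } else {
    row_end[free[1]] <- ends[i]      # reuse the lowest free row
    step[i] <- free[1]
  }
}

# Coverage: number of reads overlapping each position
positions <- seq(min(starts), max(ends))
coverage <- vapply(positions,
                   function(p) sum(starts <= p & p <= ends),
                   integer(1))
```

Stepping keeps every read visible at the cost of vertical space, while coverage summarizes depth in a single row.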

Page 70: Di 2011 houston

Examples

Compare transcripts
Reduce all to one

[Figure: transcripts uc010bzo.2, uc002dvx.3, uc002dvw.3, uc002dvz.3, uc002dwa.4, uc010veg.2 and uc002dwc.3 (each labeled 226) over positions 30065000 to 30080000]

p1 <- autoplot(txdb, which = genesymbol["A"])
p2 <- autoplot(txdb, which = genesymbol["A"], stat = "reduce")
tracks(p1, p2, heights = c(4, 1))
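"Reduce all to one" amounts to merging overlapping intervals into their union. A minimal base R sketch of that idea follows, with hypothetical exon coordinates. It merges only intervals that actually overlap, and is an illustration of the concept, not the code behind `stat = "reduce"`.

```r
# Hypothetical exon intervals pooled from several transcripts
starts <- c(10, 12, 30, 50, 55)
ends   <- c(20, 25, 40, 60, 58)

# Sort by start so overlaps can be resolved in one pass
ord <- order(starts)
s <- starts[ord]
e <- ends[ord]

merged_start <- s[1]
merged_end <- e[1]
for (i in seq_along(s)[-1]) {
  k <- length(merged_end)
  if (s[i] <= merged_end[k]) {
    # Overlaps the current merged interval: extend it
    merged_end[k] <- max(merged_end[k], e[i])
  } else {
    # Gap: start a new merged interval
    merged_start <- c(merged_start, s[i])
    merged_end <- c(merged_end, e[i])
  }
}
```

Here five input intervals collapse to three, which is why the reduced track on the slide needs far less height than the full transcript track.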

Page 71: Di 2011 houston

Examples

Focus on exons

[Figure: knownGene transcripts uc010bzo.2, uc002dvx.3, uc002dvw.3, uc002dvz.3, uc002dwa.4, uc010veg.2 and uc002dwc.3 (each labeled 226) on hg19::chr16, shown twice: once with axis ticks from 30065000 to 30080000, once with the axis labeled only at 30064411 and 30081741]

p1 <- autoplot(txdb, which = genesymbol["A"])
p2 <- autoplot(txdb, which = genesymbol["A"], truncate.gaps = TRUE)
tracks(p1, p2, heights = c(4, 4))

Page 72: Di 2011 houston

Examples

Manhattan plot: features plotted against genomic position

[Figure: two Manhattan plots of pvalue (0.5 to 3.0) for chromosomes 1 to 22, X and Y. In the first, each chromosome has its own position axis (5.0e+07 to 2.0e+08). In the second, all chromosomes lie along a single axis labeled genome.]

p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))
p2 <- plotGrandLinear(gr.snp, aes(y = pvalue))
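Placing all chromosomes on one axis requires converting each per-chromosome position into a genome-wide coordinate. One common way to do this is to add the total length of the preceding chromosomes, sketched below in base R. The chromosome lengths and SNP positions are made up, and this is an assumption about the general technique, not a description of what `plotGrandLinear` does internally.

```r
# Hypothetical chromosome lengths (not real hg19 values)
chr_len <- c(chr1 = 1000, chr2 = 800, chr3 = 600)

# Offset of each chromosome = total length of the chromosomes before it
offset <- c(0, cumsum(chr_len)[-length(chr_len)])
names(offset) <- names(chr_len)

# Hypothetical SNPs: chromosome and position within that chromosome
snp_chr <- c("chr1", "chr2", "chr2", "chr3")
snp_pos <- c(500, 100, 700, 50)

# Genome-wide coordinate used as the x value on a single axis
genome_pos <- unname(offset[snp_chr] + snp_pos)
```

With these coordinates a single scatterplot of pvalue against `genome_pos` gives the end-to-end layout, whereas the faceted version keeps the original positions and draws one panel per chromosome.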

Page 73: Di 2011 houston

Examples

Manhattan plot: features plotted against genomic position

[Figure: the two Manhattan plots of pvalue for chromosomes 1 to 22, X and Y, repeated from the previous slide]

p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))

p2 <- plotGrandLinear( gr.snp, aes(y = pvalue))

Facets by chromosome #

Page 74: Di 2011 houston

Examples

Manhattan plot: features plotted against genomic position

[Figure: Manhattan plot of pvalue (0.5 to 3.0) for chromosomes 1 to 22, X and Y, repeated from the previous slides]

l

l

l

l

l

l

ll

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

ll

l

l

lll

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

lll

l

l

l l

l

l

ll

l

l

lll

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

l

l

ll

l

lll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

ll

l

llll

l

l

l

l

l

ll

ll

l

l

l

l

lll

l

llll

l

l

l

l

ll

l

l

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

llll

lll

l

l

l

l

l

l

l

ll

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

llll

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

lll

ll

ll

lll

l

l

lll

l

l

l

l

l

l

lll

l

l

l

ll

ll

l l

ll

l

l

ll

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l l

l

l

lll

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

lll

ll

l

ll

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

lll

l

ll

l

ll

l

l

ll

ll

ll

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

lll

l

l

ll

l

l

l

l

ll

l

l

l

l

l

l

lll

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

ll

l

l

l

ll

l

ll

l

ll

l

lllll ll

l

l

l

l

l

ll

ll

lll

l

l

l

l

l

l

l

l

ll

l

l

l

l

ll

l

l

ll

l

l

l

l

l

llll

l

l

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

ll

ll

ll

l

l

ll

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

ll

l

l

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

l

ll

l

l

l

l

l

l

l l

l

l

l

ll

l

l

l

l ll

l

l

l

l

l

l

l

llll

l

ll

ll

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

ll

ll

l

lll

l

l

ll

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

lll

l

l

l

l

l

l

l

ll

ll

l

l

l

lll l l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

ll

l

ll

ll

l

l

l

l

ll

lll

l

l

l

l

l

l

l

l

lll l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

lll

l

l

l

l

l

5.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+081,10,11,12,13,14,15,16,17,18,19,2,20,21,22,3,4,5,6,7,8,9,X,Y

pvalue

0.5

1.0

1.5

2.0

2.5

3.0

l

l

l

l

l

l

l

llll

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

ll

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

lll

l

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

ll

l

ll

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

ll

l

l

ll

l

l

l

llll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

lll

l

lll

l

l

l

l

l

l

lll

l

l

l

l

ll

ll

l

l

l

llll

l

l

l

l

ll

l

l

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

ll

lll

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

lll

l

l

l

l

ll

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

lll

l

l

ll

l

l

l

ll

l

l

l

l

l

l

lll

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

l

ll

l

ll

l

ll

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

ll

l

l

l

l

ll

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

lll

l

l

l

l

l

ll

l

l

lll

ll

l

l

l

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

ll

l

ll

ll

l

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

ll

l

l

llll

l

l

l

l

l

l

l

l

l

lll

l

ll

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

ll

l

l

ll

l

l

l

ll

l

l

l

ll

ll

l

l

l

ll

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

lll

l

l

l

l

ll

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

ll

l

ll

l

ll

ll

l

l

l

l

lll

ll

llll

ll

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

ll

l

l

l

llll

l

l

l

l

l

l

l

l

l

ll

l

lll

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

llll

ll

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

lll

l

l

l

l

l

l

ll

ll

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

ll

ll

l

l

l

l

l

ll

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

ll

ll

ll

l

l

l

l

l

l

l

l

l

l

l

ll

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

lll

l

l

l

l

l

ll

l

l

l

l

l

l

ll

l

lll

l

l

ll

l

l

lll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

l

l

l

lll

l

lll

l

l

l

l

l

l

l

l

l

ll

l

ll

ll

l

l

l

l

l

l

lll

l

l

l

l

ll

ll

l

l

l

lll

l

l

l

l

ll

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

lll

l

l

l

l

ll

lll

l

l

l

l

l

l

l

l

ll

l

ll

l

l

ll

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

lll

l

l

l

l

l

ll

ll

l

l

l

l

ll

l

l

l

lll

lll

l

l

l

l

l

l

l

l

l

l

lll

l

l

lll

ll

l

l

l

l

ll

l

l

l

l

l

ll

l

ll

l

l

ll

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

llll

l

l

l

lll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

ll

l

l

l

l

ll

l

l

l

l

l

l

ll

l

ll

ll

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

ll

l

l

l

l

ll

l

l

l

l

l

ll

l

l

l

ll

l

l

l

l

l

l

ll

l

l

l

l

l

ll

ll

lll

l

l

ll

l

l

l

l

l

l

l

ll

lll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

lll

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

l

ll

l

l

l

l

lll

l

lll

l l

l

l

l

l

l

l

l

l

l

l

l

lll

l

lll

ll

l

l

ll

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

ll

ll

l

l

l

l

ll

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

lll

l

l

l

lll

l

ll

l

l

l

l

l

l

ll

l

l

llll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

ll

l

l

l

l

l

l

ll

l

l

l

ll

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

l

ll

l

lll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

llll

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

ll

l

l

l

llll

ll

l

l

l

l

l

lll

l

l

l

ll

l

ll

l

l

l

l

ll

ll

l

l

l

l

l

l

ll

lll

l

l

l

ll

ll

l

l

l

l

l

l

ll

lll

l

ll

l

l

l

l

l

ll

l

l

l

l

lll

l

l

lll

l

l

l

l

l

ll

l

l

ll

l

l

l

l

ll

l

l

llll

l

l

l

l

l

l

l

l

l

l

l

lll

l

l

lll

ll

l

l

l

l

ll

l

l

llll

l

ll

l

l

l

l

l

ll

ll

l

l

l

lll

ll

l

l

ll

l

l

ll

l

l

ll

ll

ll

l

l

l

l

l

l

l

l

l

l

l

l

l

l

l

ll

l

l

ll

l

l

l

l

lll

l

l

l

l

l

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Ygenome

y

p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))  # facets by chromosome

p2 <- plotGrandLinear(gr.snp, aes(y = pvalue))  # turns chromosome number into a numerical scale
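The "numerical scale" idea can be pictured with a small base-R sketch. This is only an illustration of the coordinate shift, not ggbio's actual implementation; the chromosome lengths and the `snp` data frame below are made-up values, and the names `chr_len`, `chr_offset` and `genome_pos` are hypothetical.

```r
# Hypothetical chromosome lengths (illustrative values, not real hg19 sizes)
chr_len <- c(chr1 = 1000, chr2 = 800, chr3 = 600)

# Offset of each chromosome = total length of all chromosomes before it
chr_offset <- c(0, cumsum(chr_len)[-length(chr_len)])
names(chr_offset) <- names(chr_len)

# Toy SNP table: chromosome, within-chromosome position, p-value
snp <- data.frame(
  chr    = c("chr1", "chr2", "chr2", "chr3"),
  pos    = c(100, 50, 700, 10),
  pvalue = c(0.5, 2.1, 1.3, 3.0),
  stringsAsFactors = FALSE
)

# Genome-wide x coordinate: within-chromosome position plus that chromosome's offset
snp$genome_pos <- unname(snp$pos + chr_offset[as.character(snp$chr)])
```

Plotting `pvalue` against `genome_pos` then puts every chromosome on one continuous x-axis, which is the effect the second plot on this slide shows.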

Examples

[Figure: karyogram of the human genome, chromosomes 1-22, with the x-axis in base pairs from 5.0e+07 to 2.0e+08.]

Karyogram, highlight locations corresponding to some data feature

autoplot(gr, layout = "karyogram", color = "blue")  # gr is a GRanges object

Examples

[Figure: circular genome layout of chromosomes 1-22 with position ticks from 0M to 200M; links colored by rearrangements (interchromosomal, intrachromosomal); points sized by tumreads (4 to 12).]

Circular layout of genome with associated data, and connections

ggplot() +
  layout_circle(gr1, geom = "link", linked.to = "to.gr",
                aes(color = rearrangement),
                trackWidth = 1, radius = 10) +
  layout_circle(gr2, geom = "point", ...) +
  ...
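The geometry behind a circular layout can be sketched in base R: a linear genome coordinate becomes an angle, and a radius places the point on a ring. This is an assumption-laden illustration, not how ggbio computes its layout (the starting angle and direction may differ); `genome_len`, `genome_pos` and `circle_xy` are hypothetical names with made-up values, and `radius` only mirrors the role of the argument shown on the slide.

```r
# Hypothetical total genome length and genome-wide positions (illustrative values)
genome_len <- 2400
genome_pos <- c(0, 600, 1200, 1800)
radius <- 10  # plays the same role as `radius` on the slide, by assumption

# Linear position -> angle in radians around the full circle
theta <- 2 * pi * genome_pos / genome_len

# Polar -> Cartesian coordinates for drawing on the ring
circle_xy <- data.frame(
  x = radius * cos(theta),
  y = radius * sin(theta)
)
```

Tracks at different radii would then form concentric rings, and a link between two positions is a curve joining their two points on the circle.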

Examples

Layout genome linearly, stack associated data plots, connections

[Figure: stacked tracks along a linear genome axis for chromosomes 1-22: arches colored by rearrangements (interchromosomal, intrachromosomal), points of score (4 to 12) sized by tumreads (4 to 12), and two Stepping tracks.]

p1 <- autoplot(gr1, geom = "arch", aes(color = rearrangement), coord = "genome")
p2 <- autoplot(gr2, geom = "point", aes(y = score, size = tumreads), color = "red", coord = "genome")
...
tracks(p1, p2, p3, p4, heights = c(2, 4, 1, 1))

Examples

Organize multiple circular layouts

library(gridExtra)
grid.arrange(square, gg, ncol = 2, widths = c(4/5, 1/5))

[Figure: nine circular layouts, panels CRC-1 to CRC-9, each showing chromosomes 1-22 with links colored by rearrangements (interchromosomal, intrachromosomal).]


Benefits

Flexibility in drawing genomic data

Aesthetics are changeable, color schemes for different purposes

Plots defined in a way to compare and contrast

Huge variety of displays is available in one location

Builds from a good data model and tools available in bioC.


Future Work

Clean up code, autoplot, consistency in usage, make circular layouts as elegant as Circos

Ideally integrate new grammar components better with the ggplot2 code (not trivial)

Build interactive graphics, using the qtbase, qtpaint primitives


Availability

ggbio is on www.bioconductor.org

Tengfei’s ggbio web page has tutorials and gallery of examples: http://tengfei.github.com/ggbio

Support by Genentech has been vital
