Flow Data Analysis Challenges
Deck from Amgen Attendees
Bioinformatics/Computational Biology: John Gosink, Hugh Rand, Mark Dalphin
Biostatistics: Cheng Su
Molecular Sciences: Katie Newhall, Bill Rees, Gary Means
Wednesday, September 20, 2006
For Internal Use Only. Amgen Confidential. 2
Sample and meta-data tracking can be complicated
[Diagram labels: misc. cytokines; misc. drugs; stimulation/inhibition combinations; blood samples; multiple cell types; FCS files; cell events]

Example cell events from an FCS file:
FSC-H  SSC-H  FL1-H  FL2-H  FL3-H  FL1-A  FL4-H  Time
   44     25     65     63      0      0     53     0
  196    143     90    110      0      0     74     1
  211    129     98     97      3      0     48     1
   72     74    109     25      0      2     20     2
   87     22    153     72      0     13     58     2
  173    144     94    139      0      0     72     2

Typical scale:
• 1,000 – 10,000 blood samples
• 10 – 100 stimulations/sample
• 5 – 10 cell types/mix
• 10,000 – 100,000 cell events/cell type
• 5 – 20 channels/cell

5,000 samples x 50 stims/sample x 7 cell-types/cocktail x 5 Mbytes/FCS file ≈ 10 terabytes
An FCS file: approx. the size of an Affymetrix Microarray .CEL file
Need a relational database and associated code infrastructure
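As a quick check of the storage estimate on this slide, the arithmetic can be written out directly. This is only a back-of-envelope sketch using the figures quoted above; the variable names are illustrative, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Back-of-envelope storage estimate using the figures quoted on the slide.
samples = 5_000
stims_per_sample = 50
cell_types_per_cocktail = 7
mbytes_per_fcs_file = 5

# One FCS file per sample x stimulation x cell type.
n_files = samples * stims_per_sample * cell_types_per_cocktail
total_mbytes = n_files * mbytes_per_fcs_file
total_tbytes = total_mbytes / 1_000_000  # assumes decimal units

print(f"{n_files:,} FCS files, about {total_tbytes:.2f} TB")
```

The product comes to a little under 9 TB, which is consistent with the slide's order-of-magnitude figure of 10 terabytes.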
John Gosink, Bioinformatics/Computational Biology, Amgen
Some meta-data that we need to capture, store, and index (let alone the actual FCS files/data)
• Sample meta-data
  – Sample ID
  – Sample to well mapping
  – Stimulation conditions
  – Dilutions
• Reagent meta-data
  – Reagent batches
  – Labeling scheme
• Machine meta-data (FCS format currently captures most)
  – measurement windows
  – PMT settings
  – compensation (and matrix)
  – transformation
• Gating parameters
  – coordinates
  – thresholds
  – gate hierarchy
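One way the meta-data categories above might map onto relational tables is sketched below with Python's built-in sqlite3 module. Every table name, column name, and inserted value is a hypothetical illustration, not a schema proposed in the deck; a production system would likely need more tables (reagent-to-sample links, compensation matrices, transformations).

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions
# chosen to mirror the meta-data categories listed on the slide.
SCHEMA = """
CREATE TABLE sample (
    sample_id   TEXT PRIMARY KEY,
    well        TEXT NOT NULL,   -- sample-to-well mapping
    stimulation TEXT,            -- stimulation condition
    dilution    REAL
);
CREATE TABLE reagent (
    reagent_id  INTEGER PRIMARY KEY,
    batch       TEXT NOT NULL,
    label       TEXT NOT NULL    -- labeling scheme
);
CREATE TABLE fcs_file (
    fcs_id       INTEGER PRIMARY KEY,
    sample_id    TEXT NOT NULL REFERENCES sample(sample_id),
    path         TEXT NOT NULL,
    pmt_settings TEXT,
    compensation TEXT
);
CREATE TABLE gate (
    gate_id     INTEGER PRIMARY KEY,
    fcs_id      INTEGER NOT NULL REFERENCES fcs_file(fcs_id),
    parent_id   INTEGER REFERENCES gate(gate_id),  -- gate hierarchy
    name        TEXT NOT NULL,
    coordinates TEXT
);
"""

def build_db(path=":memory:"):
    """Create the illustrative schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

conn = build_db()
# Made-up rows, for illustration only.
conn.execute("INSERT INTO sample VALUES ('S1', 'C01', 'stim_A', 1.0)")
conn.execute("INSERT INTO fcs_file (sample_id, path) "
             "VALUES ('S1', 'Specimen_001_C1_C01.fcs')")
conn.execute("INSERT INTO gate (fcs_id, parent_id, name) VALUES (1, NULL, 'Lymphocyte')")
conn.execute("INSERT INTO gate (fcs_id, parent_id, name) VALUES (1, 1, 'CD4+')")

# Each gate, with its well and parent gate, recovered by joins.
rows = conn.execute("""
    SELECT s.well, g.name, p.name
    FROM gate g
    JOIN fcs_file f ON f.fcs_id = g.fcs_id
    JOIN sample s ON s.sample_id = f.sample_id
    LEFT JOIN gate p ON p.gate_id = g.parent_id
    ORDER BY g.gate_id
""").fetchall()
```

Storing the parent gate as a self-reference keeps the gate hierarchy queryable alongside the sample and file meta-data.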
John Gosink, Bioinformatics/Computational Biology, Amgen
More interesting questions involve natural cell populations and their variation
• Catalog of all cell types
  – What are their distributions in all of flow parameter space
  – How to standardize between samples and runs
• What are fruitful approaches to characterizing these distributions
  – Baseline categorization
    - Number of “typical” cell volumes (archetypes)
    - Location of archetypes
    - Shapes of archetypes
    - Relationships of cell counts in the archetypes
  – Characterization of the “void”
    - How empty is the void
    - How smooth is the void
• Detection of novel (sub)populations and unforeseen changes
John Gosink, Bioinformatics/Computational Biology, Amgen
Separation of Overlapping Peaks
Question: How do we best quantitate multiple overlapping peaks?
One Approach: Fit peaks as a sum of a small number of basis set functions.
Issues: Basis set choice, sensitivity, accuracy, …
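As one concrete instance of fitting peaks as a sum of a few basis functions, the sketch below fits two Gaussian peaks to one-dimensional event values by expectation-maximization. The Gaussian basis, the two-peak limit, the starting guess, and the synthetic data are all assumptions made for illustration; the slide does not specify a basis set or fitting method.

```python
import math
import random

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_gaussians(values, n_iter=100):
    """Fit [weight, mean, sd] for two Gaussian peaks by expectation-maximization."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # Crude starting guess: peaks at the quarter points of the data range.
    params = [[0.5, lo + 0.25 * span, span / 4.0],
              [0.5, lo + 0.75 * span, span / 4.0]]
    for _ in range(n_iter):
        # E-step: probability that each event belongs to peak 0.
        resp = []
        for x in values:
            p0 = params[0][0] * gaussian_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * gaussian_pdf(x, params[1][1], params[1][2])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate each peak from its weighted events.
        for k in (0, 1):
            w = [r if k == 0 else 1.0 - r for r in resp]
            total = sum(w)
            mean = sum(wi * x for wi, x in zip(w, values)) / total
            var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, values)) / total
            params[k] = [total / len(values), mean, max(math.sqrt(var), 1e-6)]
    return params

# Synthetic events (not real data): two overlapping populations, 70% / 30%.
random.seed(0)
events = ([random.gauss(100.0, 10.0) for _ in range(1400)]
          + [random.gauss(150.0, 12.0) for _ in range(600)])
peaks = fit_two_gaussians(events)
for weight, mean, sd in peaks:
    print(f"weight={weight:.2f} mean={mean:.1f} sd={sd:.1f}")
```

The fitted weights give the relative size of each peak, which is the quantity of interest when peaks overlap too much to separate with a single threshold. The issues listed above still apply: real peaks need not be Gaussian, and small or heavily overlapped peaks may not be recovered reliably.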
Hugh Rand, Bioinformatics/Computational Biology, Amgen
Example histograms
[Figure: four histogram panels titled Noisy Overlap, Small Peaks, Shape, More Overlap]
Hugh Rand, Bioinformatics/Computational Biology, Amgen
Receptor Occupancy Assay and Analysis
[Figure: assay schematic. Labels on the slide:]
• Reagents: Unlabeled Drug Ab, Labeled Drug Ab, Labeled Recpt Ab, Labeled Isotype Ctrl Ab
• Unlabeled Ab @ 0 – sat’d dose
• Flow Cytometer: Labeled Ab, Labeled anti-recpt Ab, Labeled isotype ctrl
• Cell with specific and non-specific receptors
• Cell with specific and non-specific receptors. Ab induces more recpt.
• No Drug in Animal: FracBound = 1 − 3/3
• Some Drug in Animal: FracBound = 1 − 2/6
Mark Dalphin, Bioinformatics/Computational Biology, Amgen
Some math…
Simple form, without non-specific binding
[Equation: Occupancy, written with N and F terms carrying the indices R, Rd, D, Dd and 0]
Add non-specific binding and things are not so tidy
[Equations: Occupancy, written with P, f, N and F terms carrying the indices R, Rd, D, Dd, I, Id and 0]
Mark Dalphin, Bioinformatics/Computational Biology, Amgen
Problems with receptor occupancy assays
Even with 1:1 conjugates, MFI varies significantly from Ab to Ab against the same receptor
“Can’t see less than 1,000 receptors per cell”
Large variability from instrument to instrument and run to run
Why doesn’t this behave like a well-controlled physical experiment; why is it “semi-quantitative”?
I’d like to see:
– Easy loading of data-sets and meta-data
– Module to compute occupancy
– Some way to look at associated binding curves
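A module to compute occupancy could start from something as small as the sketch below. It does not reproduce the formulas from the preceding slide. It assumes a simple definition: isotype-control MFI is subtracted as background, the free-to-total receptor ratio is taken from the labeled drug Ab and labeled anti-receptor Ab signals, and that ratio is normalized to a no-drug baseline. The function names and MFI values are hypothetical.

```python
def specific_signal(mfi, isotype_mfi):
    """Subtract the isotype-control MFI as an estimate of non-specific signal."""
    return max(mfi - isotype_mfi, 0.0)

def occupancy(drug_mfi, recpt_mfi, isotype_mfi,
              base_drug_mfi, base_recpt_mfi, base_isotype_mfi):
    """Assumed definition: 1 - (free/total at dose) / (free/total at baseline).

    Raises ZeroDivisionError if a receptor signal does not exceed its
    isotype control, or if the baseline free signal is zero.
    """
    ratio = (specific_signal(drug_mfi, isotype_mfi)
             / specific_signal(recpt_mfi, isotype_mfi))
    base_ratio = (specific_signal(base_drug_mfi, base_isotype_mfi)
                  / specific_signal(base_recpt_mfi, base_isotype_mfi))
    return 1.0 - ratio / base_ratio

# Made-up MFI values for illustration only.
occ = occupancy(drug_mfi=1100.0, recpt_mfi=6100.0, isotype_mfi=100.0,
                base_drug_mfi=3100.0, base_recpt_mfi=3100.0, base_isotype_mfi=100.0)
print(f"occupancy = {occ:.2f}")
```

Normalizing to the baseline ratio is one way to absorb the Ab-to-Ab differences in MFI noted above. It does not address instrument-to-instrument or run-to-run variability.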
Mark Dalphin, Bioinformatics/Computational Biology, Amgen
Gating Sensitivity
If gates change slightly, will results change?
Reasons for considering gating sensitivity:
– Quantitative analysis of the responses
– Gating is done per individual sample
– Gating is somewhat subjective, even auto-gating
– Multiple gates used
– Subgroups of small size
Cheng Su, Biostatistics, Amgen
Gating Sensitivity Analysis
Sensitivity Analysis
– Get new gates by moving the boundary of gates
– Conduct analysis
– Compare the results
Challenges
– software/system: to import the gate boundary
– methodology: methods to automate gate movement and compare results
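The three steps above (move the gate boundary, re-run the analysis, compare) can be sketched for the simplest case, a rectangular gate on two parameters. The gate shape, the uniform outward shift, the gated fraction as the compared result, and the synthetic events are all assumptions made for illustration; real gates are often polygons imported from the acquisition software.

```python
import random

def fraction_in_gate(events, x_min, x_max, y_min, y_max):
    """Fraction of (x, y) events that fall inside a rectangular gate."""
    inside = sum(1 for x, y in events
                 if x_min <= x <= x_max and y_min <= y <= y_max)
    return inside / len(events)

def gate_sensitivity(events, gate, shifts):
    """Re-run the gate with every boundary moved outward by each shift.

    A negative shift shrinks the gate. Returns {shift: gated fraction}.
    """
    x_min, x_max, y_min, y_max = gate
    return {s: fraction_in_gate(events, x_min - s, x_max + s, y_min - s, y_max + s)
            for s in shifts}

# Synthetic two-parameter events (not real data): one compact population.
random.seed(1)
events = [(random.gauss(500, 60), random.gauss(300, 40)) for _ in range(5000)]
gate = (400, 600, 220, 380)  # hypothetical analyst-drawn rectangle
result = gate_sensitivity(events, gate, shifts=[-20, -10, 0, 10, 20])
for shift, frac in result.items():
    print(f"shift={shift:+d}  gated={100 * frac:.1f}%")
```

If the gated percentage changes substantially over shifts comparable to the analyst's drawing precision, downstream statistics on that population are sensitive to gating.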
Cheng Su, Biostatistics, Amgen
System Outline
[Diagram: Samples → LSRII → FCS and XML gating → gated populations (B Cells, T Cells, NK Cells) → Analysis (R, SAS, Java, …) → Result, with a “check against” step]
Cheng Su, Biostatistics, Amgen
How to move what we do in proprietary graphical tools into a more high-throughput environment?
Question: Are there applications available that can accommodate the size of FCS files that I generate, allow me to compare data across a plate, and provide data output in an acceptable format?
Problem: Currently using a 9-color, 12-parameter antibody panel in whole blood (and it’s only getting bigger!)
– FCS file size = 10,000 to 30,000 KB
– Analysis time = 8 hours for 32 samples/wells
– Export time = 20-30 minutes for 32 FCS files
– Output = at least 7 gated files for each FCS file
Katie Newhall, Molecular Sciences, Amgen
How to move what we do in proprietary graphical tools into a more high-throughput environment?
Potential solutions
– Analysis
  • Automated gating
  • Sample flagging
  • Comparison of samples across a plate
  • Output of histogram statistics in an Excel format
– Export time
  • Gating information and experimental metadata exported with FCS/TXT files
Katie Newhall, Molecular Sciences, Amgen
Immunophenotyping
Experiment:
– 80 clinical whole blood samples
– no ex vivo manipulation
– 4 dose cohorts
– 38 3-color, RBC lyse/no-wash stains
– 3280 6-parameter FCS files
What populations of events change in some way as a function of drug dose or disease state or changes in other populations?
Bill Rees, Molecular Sciences, Amgen
An immunophenotyping panel
Stain  FITC    PE        PerCP  APC   Gate
 1     CD4     CCR4      CD45   CCR5  L + G
 2     CD4     CCR7      CD45   CCR5  L + G
 3     CD4     CCR8      CD45   CCR5  L + G
 4     CD4     CXCR3     CD45   CCR5  L + G
 5     CD4     CD25      CD45   CCR5  L
 6     CD4     CD27      CD45   CCR5  L
 7     CD4     CD28      CD45   CCR5  L
 8     CD4     CD38      CD45   CCR5  L
 9     CD4     CD54      CD45   CCR5  L
10     CD4     KIR       CD45   CCR5  L
11     CD4     CD161     CD45   CCR5  L
12     CD4     CD212     CD45   CCR5  L
13     CD4     HLA-DR    CD45   CCR5  L
14     CD4     CCR6      CD45   CCR5  L
15     CD45RA  CD4       CD45   CCR5  L
16     CD244   CD4       CD45   CCR5  L
17     CD26    CD8       CD45   CCR5  L
18     CLA     CD8       CD45   CCR5  L
19     CD94    CD8       CD45   CCR5  L
20     CD8     CD161     CD45   CCR5  L
21     CD8     NKG2D     CD45   CCR5  L
22     CD8     4-1BB-L   CD45   CCR5  L
23     CD8     CD30      CD45   CCR5  L
24     CD8     CD70      CD45   CCR5  L
25     CD20    CD54      CD45   CCR5  L
26     CD20    CD69      CD45   CCR5  L
27     CD4     CD152     CD45   CCR5  L
28     CD45RA  CD152     CD45   CD56  L
29     CD8     CD152     CD45   CD3   L
30     IgD     CD152     CD45   CD19  L
31     IgM     CD152     CD45   CD19  L
32     CD27    CD152     CD45   CD19  L
33     CLA     CD152     CD45   CD19  L
34     CD16    CD56      CD45   CD19  L
35     CD14    CD80      CD45   CCR5  M
36     CD14    CD86      CD45   CCR5  M
37     CD14    HLA-DR    CD45   CCR5  M
38     CD16    CD11b     CD45   CCR5  G
Row groups labeled on the slide: T cells, B cells, NK cells, monocytes
Bill Rees, Molecular Sciences, Amgen
Immunophenotyping
I will not deal with this 2 dimensions at a time:
– time
– too many populations in each stain, only some of which I know to look for
– don’t know what I’m looking for with minimal biological insight
Issues:
– Definitions of terms
– Metrics, e.g. MFI and %CD45+ events, % responders
– Linking raw data to other study data/protocols and to analysis product
– Autogating with visual QC
– Can the identification of the major cell types (operationally defined by robust stains, e.g. CD3+ CD8+ CD56-) be automated to incrementally reduce the analysis time?
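An operational definition such as CD3+ CD8+ CD56- translates naturally into a rule over per-marker positivity calls, which is one simple route to automating the identification of major cell types. In the sketch below, the fixed cutoffs, the second pattern (CD3- CD56+), and the event values are hypothetical; in practice cutoffs would come from controls or an auto-gating step and would still need visual QC.

```python
# Hypothetical positivity cutoffs per marker (arbitrary fluorescence units).
THRESHOLDS = {"CD3": 1000.0, "CD8": 1000.0, "CD56": 1000.0}

# Operational definitions: marker -> required state (True = positive).
# The first pattern is the example from the slide; the second is made up.
PATTERNS = {
    "CD3+ CD8+ CD56-": {"CD3": True, "CD8": True, "CD56": False},
    "CD3- CD56+": {"CD3": False, "CD56": True},
}

def classify(event):
    """Return the first pattern the event matches, else 'unclassified'."""
    for name, pattern in PATTERNS.items():
        if all((event[marker] > THRESHOLDS[marker]) == positive
               for marker, positive in pattern.items()):
            return name
    return "unclassified"

# Made-up events for illustration only.
events = [
    {"CD3": 5200.0, "CD8": 8100.0, "CD56": 90.0},
    {"CD3": 40.0, "CD8": 300.0, "CD56": 4500.0},
    {"CD3": 4800.0, "CD8": 120.0, "CD56": 60.0},
]
labels = [classify(e) for e in events]
print(labels)
```

Events that match no pattern are kept as "unclassified" rather than forced into a population, so unexpected populations remain visible for review.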
Bill Rees, Molecular Sciences, Amgen
Whole blood stimulation assays where leukocytes are evaluated for phosphoprotein pathway activation/inhibition
[Figure: five gating plots from Specimen_001_C1_C01.fcs]
• CD8/CD33 vs. Parameter 2; gates: grans, CD33+
• CD3 vs. CD56; regions: CD3+, CD3+/CD56+, Bcells, CD56+; quadrant values: 70.52%, 25.82%, 0.21%, 3.45%
• CD8/CD33 vs. CD4; regions: DN, CD4+, CD8+; quadrant values: 66.04%, 0.31%, 5.12%, 28.54%
• CD45RO vs. CD4; region: CD4+ mem; quadrant values: 0.00%, 0.00%, 29.34%, 70.66%
• CD45RO vs. CD8/CD33; region: CD8+ mem; quadrant values: 86.13%, 13.87%, 0.00%, 0.00%
Gary Means, Molecular Sciences, Amgen
[Diagram: Whole blood → Stimulate → Process Cells → Labeling → Flow → Sample Data File → Software]
Gated populations: Lymphocyte, Granulocyte, Monocyte, T cell, B cell, NK, DN, CD4+, CD8+, CD4+/CD8+, CD4+ memory, CD8+ memory
Use bioinformatics tools to evaluate coordinate regulation of multiple different intracellular targets
11 gates x 4 targets x 96 wells
Problem? Each set of gated data must be independently exported and kept linked to the experimental process metadata
Gary Means, Molecular Sciences, Amgen
Solutions?
• Automatically export events with additional columns which contain all of the gating information associated with each event.
• Metadata must be inextricably associated with the experimental results.
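A minimal sketch of such an export is shown below: one row per event, carrying the raw channel values, the experimental metadata, and one 0/1 column per gate. The rectangular gate definitions, the gate names, the metadata fields, and the CSV layout are hypothetical; the two events reuse channel values from the example FCS listing earlier in the deck.

```python
import csv
import io

def in_rect(event, x, y, x_min, x_max, y_min, y_max):
    """True if the event lies inside a rectangle on channels x and y."""
    return x_min <= event[x] <= x_max and y_min <= event[y] <= y_max

# Hypothetical gates: name -> (x channel, y channel, x_min, x_max, y_min, y_max)
GATES = {
    "lymphocyte": ("FSC-H", "SSC-H", 50, 250, 0, 150),
    "FL1_pos": ("FL1-H", "FL2-H", 80, 1024, 0, 1024),
}

def export_with_gates(events, metadata, out):
    """Write one row per event: metadata, raw channels, and a 0/1 column per gate."""
    channels = list(events[0].keys())
    fields = list(metadata) + channels + [f"gate_{g}" for g in GATES]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for ev in events:
        row = dict(metadata)
        row.update(ev)
        for name, (x, y, *bounds) in GATES.items():
            row[f"gate_{name}"] = int(in_rect(ev, x, y, *bounds))
        writer.writerow(row)

events = [
    {"FSC-H": 44, "SSC-H": 25, "FL1-H": 65, "FL2-H": 63},
    {"FSC-H": 173, "SSC-H": 144, "FL1-H": 94, "FL2-H": 139},
]
metadata = {"well": "C01", "stimulation": "none"}  # made-up values
buffer = io.StringIO()
export_with_gates(events, metadata, buffer)
print(buffer.getvalue())
```

Because gate membership and metadata travel in the same row as the event, a single export per well replaces the separate per-gate files, and the link to the experimental conditions cannot be lost downstream.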
Gary Means, Molecular Sciences, Amgen