Post on 21-Dec-2015
1
Getting Started
• Download and install Haploview http://www.broad.mit.edu/mpg/haploview/index.php
2
i. Downloading dense genotype/resequencing data from HapMap and SeattleSNPs
ii. Choosing tagSNPs by using Haploview
Lab 1
Yu-Chun Jean Yen
yyen@hsph.harvard.edu
Bldg. 2, Rm. 200
3
• HapMap: http://www.hapmap.org/
• Seattle SNPs: http://pga.mbt.washington.edu/
• “Prettybase Purifier” tool: http://innateimmunity.net/IIPGA2/index_html (registration required)
• Haploview: http://www.broad.mit.edu/mpg/haploview/index.php
4
Click Here
5
Search IGF1
6
7
Zoom In
8
9
Using HapMap.Org : A Tutorial http://www.hapmap.org/downloads/presentations/hapmap.org.ppt
ASHG 2007 HapMap Tutorial http://www.hapmap.org/downloads/presentations/ASHG07_HapMapTutorial.ppt
10
Select “Download SNP genotype data”
Click Here
11
Click Here
Choose
12
Click Here
13
Click Here
14
Right click to save
15
079773 E008 G G
079773 E009 G G
079773 E010 C C
079773 E011 G G
079773 E012 G G
079773 E013 C C
079773 E014 C G
079773 E015 G G
079773 E016 G G
079773 E017 C G
079773 E018 C G
079773 E019 C G
079773 E020 G G
079773 E021 G G
079773 E022 C C
079773 E023 N N
080761 D001 aa aa
080761 D002 aa -
080761 D003 aa aa
080761 D004 aa aa
080761 D005 aa aa
080761 D006 aa aa
080761 D007 aa aa
080761 D008 aa aa
080761 D009 aa aa
080761 D010 aa -
080761 D011 aa aa
080761 D012 aa aa
080761 D013 aa aa
“Prettybase” format
SNP (relative pos)
Subject
Alleles
Contains insertion-deletion polymorphisms (INDELS)
This can be a problem for many software tools, which expect SNPs; indels also present genotyping difficulties
Can clean the file using the “prettybase purifier” tool at
http://innateimmunity.net/IIPGA2/index_html (registration required)
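To make the prettybase layout concrete, here is a minimal Python sketch that reads rows of the form shown on this slide and flags positions that look like indels. It is not the purifier tool. It assumes, based only on the sample rows, that SNP alleles are single bases, that N means a missing call, and that anything else (such as "aa" or "-") marks an indel. The function names are made up for illustration.

```python
from collections import namedtuple

# One prettybase row: SNP position, subject ID, and the two alleles.
Genotype = namedtuple("Genotype", "position subject allele1 allele2")

SNP_ALLELES = {"A", "C", "G", "T"}
MISSING = {"N"}  # assumption: N codes a missing call


def parse_prettybase(text):
    """Parse whitespace-separated rows: position, subject, allele1, allele2."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        rows.append(Genotype(*fields))
    return rows


def indel_positions(rows):
    """Return positions where any allele is neither a single base nor missing."""
    flagged = set()
    for g in rows:
        for allele in (g.allele1, g.allele2):
            if allele.upper() not in SNP_ALLELES | MISSING:
                flagged.add(g.position)
    return flagged


sample = """\
079773 E008 G G
079773 E014 C G
079773 E023 N N
080761 D001 aa aa
080761 D002 aa -
"""

rows = parse_prettybase(sample)
indels = indel_positions(rows)
snp_rows = [g for g in rows if g.position not in indels]
print(sorted(indels))   # positions flagged as indels
print(len(snp_rows))    # rows left after dropping them
```

On the five sample rows, position 080761 is flagged and the three rows at 079773 are kept.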
16
Click Here
17
Click Here
18
Can restrict subjects to a given ethnicity
Can restrict SNPs to those with MAF above a user-defined threshold
Can eliminate INDELS
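The MAF restriction can be sketched in a few lines of Python. This is only an illustration of the idea, not the purifier's own code. It assumes biallelic SNPs and N as the missing code, and the 0.05 threshold and the second SNP ("rare_example") are hypothetical values chosen for the example.

```python
from collections import Counter


def minor_allele_frequency(pairs, missing="N"):
    """MAF for one biallelic SNP, given (allele1, allele2) pairs per subject."""
    counts = Counter(a for pair in pairs for a in pair if a != missing)
    if len(counts) < 2:
        return 0.0  # monomorphic, or no called alleles at all
    return min(counts.values()) / sum(counts.values())


# Genotypes at position 079773 for subjects E008-E023, as on the previous slide.
slide_calls = "GG GG CC GG GG CC CG GG GG CG CG CG GG GG CC NN".split()

snps = {
    "079773": [tuple(call) for call in slide_calls],
    # Hypothetical rare SNP: one heterozygote among 20 subjects.
    "rare_example": [("A", "A")] * 19 + [("A", "T")],
}

threshold = 0.05  # user-defined MAF cutoff (assumed value)
maf = {pos: minor_allele_frequency(pairs) for pos, pairs in snps.items()}
kept = [pos for pos in snps if maf[pos] >= threshold]
print(maf)
print(kept)
```

For 079773 the called alleles are 20 G and 10 C, so the MAF is 1/3 and the SNP is kept. The hypothetical rare SNP has MAF 1/40 and is dropped.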
19
Many software tools don’t accept the prettybase format. Haploview, for example, wants a “pedigree” file (short and fat instead of long and skinny) and an “info” file (with SNP positions).
Can convert using makehv (R function) or %haploview (SAS Macro), available at http://www.hsph.harvard.edu/faculty/kraft/soft.htm.
[caveat emptor: this code is unsupported!]
20
ped.id subject.id dad.id mom.id gender affection.status snp1.allele1 snp1.allele2 etc.
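The long-to-wide conversion can be sketched in Python as follows. This is not makehv or %haploview, just an illustration of the reshaping. It assumes that subjects are unrelated (each is their own pedigree, parents coded 0), that gender is unknown and filled with a placeholder of 1, that affection status is 0, and that missing alleles (N in prettybase) are written as 0. The marker names in the info file and the second SNP in the sample are hypothetical.

```python
def long_to_ped(rows, missing_in="N", missing_out="0", gender="1"):
    """Turn (position, subject, allele1, allele2) rows into ped and info lines."""
    positions = sorted({r[0] for r in rows}, key=int)
    subjects = sorted({r[1] for r in rows})
    calls = {(r[1], r[0]): (r[2], r[3]) for r in rows}

    ped_lines = []
    for subj in subjects:
        # ped.id subject.id dad.id mom.id gender affection.status
        fields = [subj, subj, "0", "0", gender, "0"]
        for pos in positions:
            pair = calls.get((subj, pos), (missing_in, missing_in))
            fields += [missing_out if a == missing_in else a for a in pair]
        ped_lines.append(" ".join(fields))

    # Info file: one line per SNP, marker name then position.
    info_lines = [f"snp{int(pos)} {int(pos)}" for pos in positions]
    return ped_lines, info_lines


rows = [
    ("079773", "E008", "G", "G"),
    ("079773", "E014", "C", "G"),
    ("079773", "E023", "N", "N"),
    ("080100", "E008", "A", "T"),  # hypothetical second SNP
    ("080100", "E014", "A", "A"),
]

ped_lines, info_lines = long_to_ped(rows)
print("\n".join(ped_lines))
print("\n".join(info_lines))
```

Each subject becomes one row with two allele columns per SNP, in position order. Subject E023 has no call at the second SNP, so both alleles are written as missing.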
21
22
Click Here
23
24
Click Here
25
26
27
Click Here
28
Force Include: have to genotype this SNP [e.g. nsSNP]
Force Exclude: cannot genotype this SNP [e.g. known not to genotype well in your lab]
Can combine “force include” and “force exclude” to evaluate how well a given set of SNPs performs
Can decide what r² performance you are willing to live with, and whether you want to pursue “aggressive” tags
Click Here
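How the r² threshold, “force include” and “force exclude” interact can be sketched with a simplified greedy pairwise tagger in Python. This is not Haploview's Tagger implementation and does not cover “aggressive” (multi-marker) tagging. The r² values, SNP names, and the 0.8 default threshold are hypothetical, and pairs that are not listed are treated as r² = 0.

```python
def greedy_tags(r2, threshold=0.8, force_include=(), force_exclude=()):
    """Pick tags greedily; return (tags, SNPs not captured by any tag)."""
    snps = sorted(r2)

    def captures(tag):
        # A tag captures itself and every SNP with pairwise r2 >= threshold.
        return {s for s in snps if s == tag or r2[tag].get(s, 0.0) >= threshold}

    tags = list(force_include)
    captured = set()
    for tag in tags:
        captured |= captures(tag)

    candidates = [s for s in snps if s not in force_exclude and s not in tags]
    while candidates and len(captured) < len(snps):
        # Take the candidate that captures the most still-uncaptured SNPs.
        best = max(candidates, key=lambda s: len(captures(s) - captured))
        if not captures(best) - captured:
            break  # nothing left that any allowed candidate can capture
        tags.append(best)
        candidates.remove(best)
        captured |= captures(best)
    return tags, sorted(set(snps) - captured)


# Hypothetical pairwise r2 values among five SNPs.
pairs = {("A", "B"): 0.95, ("A", "C"): 0.90, ("B", "C"): 0.85, ("D", "E"): 0.50}
r2 = {s: {} for s in "ABCDE"}
for (x, y), value in pairs.items():
    r2[x][y] = value
    r2[y][x] = value

tags_default, missed_default = greedy_tags(r2)
tags_forced, missed_forced = greedy_tags(r2, force_include=["C"], force_exclude=["E"])
# Evaluate a fixed set: include it and exclude everything else.
tags_eval, missed_eval = greedy_tags(r2, force_include=["A"],
                                     force_exclude=["B", "C", "D", "E"])
print(tags_default, missed_default)
print(tags_forced, missed_forced)
print(tags_eval, missed_eval)
```

With these made-up values, the unconstrained run picks A, D and E. Forcing C in and E out gives tags C and D and leaves E uncaptured. Including only A and excluding the rest shows that A alone misses D and E.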
29
SNPs you should genotype, tests you should perform
How many SNPs does the highlighted marker “tag”?
30
Law of diminishing returns: you have to do a lot of genotyping to capture the last few stragglers
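The diminishing-returns pattern can be seen by tracking cumulative coverage as tags are added. The Python sketch below uses hypothetical capture sets (which SNPs each successive tag picks up) rather than real Haploview output.

```python
def cumulative_coverage(capture_sets, n_snps):
    """Fraction of SNPs captured after each successive tag is added."""
    covered = set()
    curve = []
    for captured_by_tag in capture_sets:
        covered |= captured_by_tag
        curve.append(len(covered) / n_snps)
    return curve


# Hypothetical region with 10 SNPs; tags are listed in the order chosen.
picks = [{1, 2, 3, 4, 5}, {6, 7, 8}, {9}, {10}]
curve = cumulative_coverage(picks, n_snps=10)
print(curve)
```

In this made-up example the first two tags capture 80% of the SNPs, while the last two tags each add only one more SNP.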
31
32
ped file
info file
33
Getting ready for next Lab session
• Request a Unix account from the IT Help Desk (LL-15), open 8:00 – 5:00 pm, at 617-432-HELP or helpdesk@hsph.harvard.edu.
• Unix computing guide for beginners: http://www.isites.harvard.edu/icb/icb.do?keyword=k2067&pageid=icb.page23341