
Assignment 9 - Washington University...

Date post: 09-Jul-2020
Assignment 9 Modified from Mayank’s notes from 2016
Transcript
Page 1: Assignment 9 - Washington University Geneticsgenetics.wustl.edu/bio5488/files/2017/03/Assignment-9.pdf · Records in a VCF file 5 SNV INS DEL DEL Profiling counts Class of genome

Assignment 9
Modified from Mayank’s notes from 2016


Extremely (re)productive F1s!


Anatomy of a VCF file

http://gatkforums.broadinstitute.org/gatk/discussion/1268/what-is-a-vcf-and-how-should-i-interpret-it


Header of a VCF file

http://gatkforums.broadinstitute.org/gatk/discussion/1268/what-is-a-vcf-and-how-should-i-interpret-it


Records in a VCF file

[Figure: example VCF records, annotated as SNV, INS, DEL, DEL]

Profiling counts

Class of genome variation    Count
SNVs                         …
indels                       …
DEL                          …
DUP                          …
INV                          …
MEIs                         …
BNDs                         …
Total GV                     …
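One way to fill in the count table is sketched below. This is a minimal sketch, not the assignment's reference solution. It assumes that structural variant classes (DEL, DUP, INV, BND, and possibly MEIs) are labelled with an `SVTYPE=` entry in the INFO column, that small variants have a single ALT allele, and that any small variant that is not a single-base substitution counts as an indel. The function names `classify_record` and `count_classes` are hypothetical, and the labels in the real input file may differ.

```python
from collections import Counter


def classify_record(line):
    """Return a variant class label for one tab-separated VCF data line.

    Assumption: structural variants carry SVTYPE=<class> in INFO (column 8);
    everything else is an SNV or an indel with a single ALT allele.
    """
    fields = line.rstrip("\n").split("\t")
    ref, alt, info = fields[3], fields[4], fields[7]
    for entry in info.split(";"):
        if entry.startswith("SVTYPE="):
            return entry.split("=", 1)[1]
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    return "indel"


def count_classes(lines):
    """Count variant classes, skipping header lines (starting with '#') and blanks."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        counts[classify_record(line)] += 1
    return counts


# Tiny made-up example; these records are illustrative, not taken from the input file.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tAT\tA\t50\tPASS\t.",
    "1\t300\t.\tN\t<DEL>\t50\tPASS\tSVTYPE=DEL;END=900",
]
counts = count_classes(example)
total_gv = sum(counts.values())
print(dict(counts), "Total GV:", total_gv)
```

Checking `SVTYPE` before the allele lengths matters because symbolic ALT alleles such as `<DEL>` would otherwise be miscounted as indels.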


Plot the size distributions

[Figure: size-distribution histograms, one panel each for indels, MEIs, and DEL]
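Before plotting, each variant needs a size. The sketch below shows one possible way to derive sizes and bin them, using only the standard library. It assumes that an indel's size is the absolute difference between the REF and ALT allele lengths (single ALT allele), and that a structural variant's size comes from a single-valued `SVLEN` INFO entry when present, or from `END` minus `POS` otherwise. All function names are hypothetical, and the real file may encode sizes differently.

```python
from collections import Counter


def indel_size(ref, alt):
    """Indel size as the absolute difference in allele lengths (assumes one ALT allele)."""
    return abs(len(alt) - len(ref))


def sv_size(pos, info):
    """Structural variant size from INFO: |SVLEN| if present, else END - POS, else None."""
    tags = dict(entry.split("=", 1) for entry in info.split(";") if "=" in entry)
    if "SVLEN" in tags:
        return abs(int(tags["SVLEN"]))
    if "END" in tags:
        return int(tags["END"]) - pos
    return None


def bin_sizes(sizes, bin_width):
    """Histogram counts keyed by the lower edge of each fixed-width bin."""
    return Counter((size // bin_width) * bin_width for size in sizes)


# Made-up values for illustration only.
sizes = [
    indel_size("AT", "A"),
    indel_size("A", "ACGT"),
    sv_size(300, "SVTYPE=DEL;END=900"),
    sv_size(1000, "SVTYPE=DEL;SVLEN=-250"),
]
histogram = bin_sizes(sizes, 100)
print(sizes, dict(histogram))
```

The resulting list of sizes can be handed to a plotting library to draw the histogram; with sizes spanning several orders of magnitude, a logarithmic axis may make the figure easier to read.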


Zygosity explained

[Figure: the reference genome compared with one pair of homologous chromosomes (haplotypes labelled ha and hb) for each genotype class]

GT field    Zygosity
0/0         Homozygous reference
1/1         Homozygous alternate
0/1         Heterozygous
./.         Missing
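The mapping from GT field to zygosity can be expressed as a small function. This is a minimal sketch assuming diploid genotypes; it accepts both the unphased `/` and phased `|` separators defined by the VCF format, and treats any genotype with two different alleles (for example `1/2`) as heterozygous. The name `classify_genotype` is hypothetical.

```python
def classify_genotype(gt):
    """Map a diploid GT string such as '0/1' to a zygosity label."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "missing"
    if alleles[0] != alleles[1]:
        return "heterozygous"
    if alleles[0] == "0":
        return "homozygous reference"
    return "homozygous alternate"


for gt in ("0/0", "1/1", "0/1", "./."):
    print(gt, classify_genotype(gt))
```

In a VCF record the GT value is the first colon-separated subfield of each sample column, so a sample entry is typically split on `:` before being passed to a function like this.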


Trio analysis to look for violations of Mendel’s Law of Segregation


http://commons.wikimedia.org/wiki/File:Autorecessive.svg
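Under Mendel's Law of Segregation, a child receives one allele from each parent, so a child's genotype is consistent only if it can be built from one paternal and one maternal allele. The sketch below checks that condition for a single site. It assumes diploid, autosomal genotypes and returns `None` when any of the three genotypes is missing; the assignment may define or filter violations differently (for example by GQ). The function names are hypothetical.

```python
from itertools import product


def split_alleles(gt):
    """Split a GT string such as '0/1' or '0|1' into its two alleles."""
    return gt.replace("|", "/").split("/")


def is_mendelian_violation(father_gt, mother_gt, child_gt):
    """Return True if the child's genotype cannot arise from the parents.

    Returns None if any genotype in the trio is missing.
    """
    trio = [split_alleles(gt) for gt in (father_gt, mother_gt, child_gt)]
    if any("." in alleles for alleles in trio):
        return None
    father, mother, child = trio
    # Every unordered pair made of one paternal and one maternal allele.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) not in possible


print(is_mendelian_violation("0/0", "0/0", "0/1"))
print(is_mendelian_violation("0/1", "0/1", "1/1"))
```

Sorting each allele pair makes the comparison independent of allele order, so `0/1` and `1/0` are treated as the same genotype.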


GQ—Genotype Quality

http://riddhitubes.com/images/quality-stamp.png

The following formula relates a given GQ value X to the probability that the genotype call is INCORRECT:

X = -10*log10(Probability(genotype call is incorrect)), or equivalently Probability(genotype call is incorrect) = 10^(-X/10)

For instance, a GQ value of 20 means that you are 99% sure your genotype call is correct, or there is a 1% chance your genotype call is incorrect.
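The GQ relationship can be checked numerically with a few lines of Python. This is a minimal sketch of the formula as stated above; the function names are hypothetical.

```python
import math


def gq_to_error_probability(gq):
    """Probability that the genotype call is incorrect, given a GQ value."""
    return 10 ** (-gq / 10)


def error_probability_to_gq(p_error):
    """GQ value corresponding to a given probability of an incorrect call."""
    return -10 * math.log10(p_error)


for gq in (10, 20, 30):
    p_error = gq_to_error_probability(gq)
    print(f"GQ {gq}: P(incorrect) = {p_error:.4f}, P(correct) = {1 - p_error:.4f}")
```

Each increase of 10 in GQ reduces the error probability tenfold, which is why a fixed GQ cutoff is a common way to keep only confident genotype calls.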


Assignment 9 requirements


• Input files located in /home/assignments/assignment9/

• Important: DO NOT copy the input data files to /work/. Instead, reference them by their full path, e.g. python3 count_gv.py /home/assignments/assignment9/sv.reclassed.filtered.vcf

• Your submission folder should contain:
  • A completed README.txt
  • Commented scripts:
    • count_gv.py
    • quantify_genotype.py
    • violate_MS.py
  • Figures appropriately scaled with labelled axes and informative titles:
    • histogram_indels.png
    • histogram_deletions.png
    • histogram_meis.png

• Due Wednesday (29th March ‘17) at 10:00 AM
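Because the input files must be referenced by full path rather than copied, each script needs to take the VCF path as a command-line argument. The sketch below shows one possible way to do that with `argparse`; the argument name `vcf_path` and the helper names are hypothetical, and the scripts may be structured differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use the real command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Count genome variation classes in a VCF file."
    )
    parser.add_argument("vcf_path", help="full path to the input VCF file")
    return parser.parse_args(argv)


def read_records(vcf_path):
    """Yield each data line of the VCF as a list of tab-separated fields."""
    with open(vcf_path) as handle:
        for line in handle:
            if not line.startswith("#"):
                yield line.rstrip("\n").split("\t")


# Passing an explicit list mimics the command line shown in the requirements.
args = parse_args(["/home/assignments/assignment9/sv.reclassed.filtered.vcf"])
print(args.vcf_path)
```

Reading the file in place through a generator avoids both copying the data and loading the whole file into memory.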

