Post on 17-Jul-2020
Fundamental Skills in DNA sequence analysis
Based on publicly available tools
Ning Jiang, Dept. of Horticulture
Outline for today
• Identifying gene structure: promoter, introns, open reading frame
• Editing DNA sequence and mapping restriction sites
• Primer design
[Figure: gene structure diagram labeling the transcription start site, donor and acceptor splice sites, and transcription stop site]
An example gene
AT1G09590.1 60S ribosomal protein L21
Promoter finding
• Find promoter (BDGP)
• http://www.fruitfly.org/seq_tools/promoter.html
>chr01 CHROMOSOME dumped from ADB: Jun/20/09 14:53; last updated: 2009-02-02CAGTATAATGTCACTAGGTGTTTGCATTGCTGTCTCTGGTTTCCAGGTACTCTACTTGAAGCAGTACTTTGAGAAGAAGAAGCTTATATAGAGATTCCCTGATATTCAGGTCCTTTAATGAGAGTTTCTTCTTAAAGAATAATATTCAATTCATCTCCTTTCTTTCCTCTTTACACGCCTCCTTTTTACTGACGCTGTAAAATTTTTGTTCATGGATTGTTTATGTAGCTTCTCTCTGATAAGAAGTCATTATTTCTGATTCTGAAATAATAATATTAACTTCTTGTCATCTCCAAAGTCCAAACGAAAGAGATTCTAGAACCATTGATATTGGGTTGTAACTGATACTTTATGTTAGTTTGGGGCTATAACACGTAAGTTTGACAGTACAAGGGTTAAGATTACAAGAAAAAACTAATCAAATGGGTATTACATACAAATCAGCCCAAATTTCTGACCCAACAGGCCCGTTAAGAGCAAACCCTAATTTCAAAGAGAATATATATAAACCCTAATCACATTTCGCAACCACCAAAGCGGAGGAAAAATGCCGGCTGGACATGGAGTTCGAGCGAGAACGAGGGATCTGTTCGCGAGGCCTTTCAGGAAGAAGGGTTACATTCCTCTATCGACTTACCTGAGGACCTTCAAGGTCGGCGATTACGTCGATGTGAAGGTGAATGGTGCGATCCACAAGGGTATGCCTCACAAGTTCTACCATGGTCGTACTGGTCGTATCTGGAACGTCACCAAGCGCGCCGTCGGTGTTGAAGTCAACAAACAGGTCTGATGATCTCTCACTTCCCTTAAACTTTTGTGTCCTTGAATTAGGATTTGAGCTTGTTGTGCTATTTGGATTGTTGATCTGTTGATTTGATGTTGAAATATGTGATTGAGTATTAGGTTTGTAGAATGTAAACCTTCTACTGTGCTCCAATTTAGCTCAGATTGTTAGTTTTGGCTCAAAGATCTTATCTTCTATTATGCAGTAGTGTGGTTCTTGATCCTGTTTCAGATTTGAAGTTTGATTGCTTAGGTCTCGTTTAGTCTTTAGCCTTAGCATGCATTGCGTTGTATCAATTTGGCCCTTTTTGCATCACAGTTCTAGCTAAGCATCTAGTTTTACTTCTCCATTGAACTGCTAGTATGTTTGGTTTCATGACTATCGCATTGCTCTAGTTTTGCTTTAAATGACCACCAACATTCGTATATTAATCTTGTTTCTATAAGTGTCTTAGAGATGTTATTGTGTAATGCTGATTAGCCTATGACCAGAACACAAACATTTTTGCTAGTACTACTTATATTGGGTATTATGACTGACATATGATGGTGGGGTTGTTTCAGATTGGGAACAGGATCATAAGGAAGAGGATTCATGTTCGTGTGGAGCATGTGCAGCAGTCAAGATGTGCCGAGGAGTTCAAGCTGAGGAAGAAGAAAAACGATGAGCTCAAGGCTGCAGCCAAAGCCAATGGTGAGACCATCAGCACCAAGAGACAGCCTAAAGGACCCAAACCAGGATTCATGGTCGAAGGAATGACCTTGGAGACTGTCACTCCAATCCCCTACGATGTCGTCAACGATCTCAAAGGAGGCTATTAGTTCTATCTTCTTGTGCTTTAGAACTCTTTTTCATTTGTTTTGTAGCAGACTTAAAACTAAGAATTATGATTTACTATATAACAGGAGTTTGCAATCCAGATTTTGTAAGGATGGTTTATTTTTATTATTGGTATTTTTGATCTTCCAACTATTGTTCACTTTTGTC
Promoter finding
• Find promoter (BDGP)
• http://www.fruitfly.org/seq_tools/promoter.html
• Transcription Start Site at 532 bp
Finding introns
• Finding intron (CBS)
• http://www.cbs.dtu.dk/index.shtml
• CBS prediction server
• NetPlantGene
Finding introns
• Intron position: 785-1347
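A quick sanity check on a predicted intron is to look at its boundary bases: canonical introns start with GT (donor site) and end with AG (acceptor site). The sketch below is only an illustration of that check, not how NetPlantGene makes its predictions. The function names and the toy sequence are hypothetical; coordinates are assumed to be 1-based and inclusive, as in the positions reported above.

```python
def intron_boundaries(seq, start, end):
    """Return the first two and last two bases of an intron.

    start and end are 1-based inclusive coordinates (assumed convention).
    """
    intron = seq[start - 1:end].upper()
    return intron[:2], intron[-2:]


def is_canonical_intron(seq, start, end):
    """True if the intron begins with GT and ends with AG."""
    donor, acceptor = intron_boundaries(seq, start, end)
    return donor == "GT" and acceptor == "AG"


# Toy sequence (hypothetical, not the gene above): exon + intron + exon.
toy = "ATGGCCAAG" + "GTAAGTTTTCTGCAG" + "GATTGA"
print(is_canonical_intron(toy, 10, 24))
```

For the real gene, the same call would take the full genomic sequence and the coordinates 785 and 1347.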
Editing DNA sequence
• http://www.genomatix.de/cgi-bin/tools/tools.pl (Genomatix)
• Extracting sequence
• Genomatix TOOLS form
• Remove intron and untranscribed region
Regions retained: 532-784, 1348-1771
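The same extraction can be done in a few lines of code. This is a minimal sketch, assuming the coordinates are 1-based and inclusive (which matches the lengths of the retained regions: 253 bp and 424 bp). The function names and the toy sequence are illustrative, not part of the Genomatix tool.

```python
def extract_regions(seq, regions):
    """Concatenate 1-based inclusive (start, end) regions of seq."""
    parts = []
    for start, end in regions:
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"region {start}-{end} is outside the sequence")
        parts.append(seq[start - 1:end])
    return "".join(parts)


def regions_length(regions):
    """Total number of bases covered by 1-based inclusive regions."""
    return sum(end - start + 1 for start, end in regions)


# The two regions retained above: 253 + 424 = 677 bases in total.
retained = [(532, 784), (1348, 1771)]
print(regions_length(retained))

# Toy example (hypothetical sequence): keep bases 1-3 and 7-9.
toy = "AAACCCGGGTTT"
print(extract_regions(toy, [(1, 3), (7, 9)]))
```

Calling `extract_regions(genomic, retained)` on the full genomic sequence would give the transcript without the intron and the untranscribed upstream region.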
>chr01 CHROMOSOME dumped from ADB: Jun/20/09 14:53; last updated: 2009-02-02CCAAAGCGGAGGAAAAATGCCGGCTGGACATGGAGTTCGAGCGAGAACGAGGGATCTGTTCGCGAGGCCT TTCAGGAAGAAGGGTTACATTCCTCTATCGACTTACCTGAGGACCTTCAAGGTCGGCGATTACGTCGATG TGAAGGTGAATGGTGCGATCCACAAGGGTATGCCTCACAAGTTCTACCATGGTCGTACTGGTCGTATCTG GAACGTCACCAAGCGCGCCGTCGGTGTTGAAGTCAACAAACAG ATTGGGAACAGGATCATAAGGAAGAGGATTCATGTTCGTGTGGAGCATGTGCAGCAGTCAAGATGTGCCG AGGAGTTCAAGCTGAGGAAGAAGAAAAACGATGAGCTCAAGGCTGCAGCCAAAGCCAATGGTGAGACCAT CAGCACCAAGAGACAGCCTAAAGGACCCAAACCAGGATTCATGGTCGAAGGAATGACCTTGGAGACTGTC ACTCCAATCCCCTACGATGTCGTCAACGATCTCAAAGGAGGCTATTAGTTCTATCTTCTTGTGCTTTAGA ACTCTTTTTCATTTGTTTTGTAGCAGACTTAAAACTAAGAATTATGATTTACTATATAACAGGAGTTTGC AATCCAGATTTTGTAAGGATGGTTTATTTTTATTATTGGTATTTTTGATCTTCCAACTATTGTTCACTTT TGTC
Editing DNA sequence
• Reverse-complement sequence: DNA has two strands
• Reformat sequence
• Sequence statistics
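These three operations are simple enough to sketch directly. The code below is an illustration only, assuming plain A/C/G/T input; the function names and the 70-column default are choices made for this example, not the behavior of any particular web tool.

```python
from collections import Counter

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def base_composition(seq):
    """Count A, C, G and T in the sequence."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reformat(seq, width=70):
    """Strip whitespace, uppercase, and wrap to fixed-width lines."""
    cleaned = "".join(seq.split()).upper()
    return "\n".join(cleaned[i:i + width] for i in range(0, len(cleaned), width))


print(reverse_complement("ATGC"))
print(base_composition("AACG"))
print(gc_content("ATGC"))
```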
Work on a single DNA sequence
• http://www.restrictionmapper.org/
• Restriction map (for recombinant DNA work)
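At its core, a restriction map is a search for enzyme recognition sequences. The sketch below scans for three common enzymes (EcoRI, HindIII, BamHI) and reports where each recognition site starts. It is a simplified illustration: it reports 1-based site positions rather than exact cut positions, and scanning one strand is enough only because these three sites are palindromic. The function name and the test sequence are hypothetical.

```python
# Recognition sequences of three common enzymes (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions of each recognition site]}."""
    seq = seq.upper()
    hits = {}
    for name, site in sites.items():
        positions = []
        i = seq.find(site)
        while i != -1:
            positions.append(i + 1)      # convert to 1-based
            i = seq.find(site, i + 1)    # allow overlapping matches
        hits[name] = positions
    return hits


# Toy sequence (hypothetical) with two EcoRI sites and one BamHI site.
print(find_sites("TTGAATTCAAGGATCCGAATTC"))
```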
Finding Open Reading Frame
• http://www.ncbi.nlm.nih.gov/
• ORF finder
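The idea behind an ORF search can be sketched as follows: in each reading frame, look for an ATG and read codons until the first stop codon. This minimal version scans only the three forward frames and assumes the standard stop codons; the NCBI tool also searches the reverse strand and offers more options. The function name, the `min_codons` parameter, and the toy sequence are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=10):
    """Find ATG-to-stop ORFs in the three forward frames.

    Returns (start, end, frame) tuples with 1-based inclusive coordinates;
    end includes the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None                   # look for the next ORF
    return orfs


# Toy sequence (hypothetical): ATG GCC AAA TAG in the third frame.
print(find_orfs("CCATGGCCAAATAGTT", min_codons=1))
```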
How to design primers for PCR
Annealing issues with primers
Stem loop, self-dimer, heterodimer
Primer issues
• Specificity check: is the primer unique in the genome?
• Stem loop
• Self-dimer
• Heterodimer
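A rough way to screen for self-dimers and heterodimers is to find the longest stretch of one primer that can base-pair with another primer (or with itself). Two strands pair antiparallel, so a stretch of primer A pairs with primer B when it matches the reverse complement of B. The sketch below is only a crude screen under that assumption: it counts contiguous base pairs and ignores thermodynamics, which dedicated oligo analysis tools take into account. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(primer_a, primer_b):
    """Longest contiguous stretch of primer_a that can pair with primer_b.

    Computed as the longest common substring of primer_a and the
    reverse complement of primer_b (dynamic programming).
    """
    a = primer_a.upper()
    target = reverse_complement(primer_b)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if a[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def self_dimer_run(primer):
    """Longest run of the primer that can pair with another copy of itself."""
    return longest_complementary_run(primer, primer)


print(self_dimer_run("GAATTC"))                    # palindromic, pairs with itself
print(longest_complementary_run("AAAA", "TTTT"))   # fully complementary pair
```

A long run, especially one involving the 3' end of a primer, is a reason to look at the primer more closely.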
Primer design
• http://www.ncbi.nlm.nih.gov/tools/primer-blast/
• Primer design
• Custom primer examined with IDT DNA site
• http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/
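Before submitting a custom primer to an online analyzer, a few basic properties can be checked by hand or in code: length, GC content, and an approximate melting temperature. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a rough estimate intended for short oligos. Online tools typically use more detailed models, so their values will differ. The function name and the example primer are hypothetical.

```python
def primer_summary(primer):
    """Return length, GC percentage and Wallace-rule Tm for a primer."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    if not p or at + gc != len(p):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return {
        "length": len(p),
        "gc_percent": 100.0 * gc / len(p),
        "tm_wallace": 2 * at + 4 * gc,   # degrees Celsius, rough estimate
    }


# Toy 20-mer (hypothetical, not designed against the gene above).
print(primer_summary("ATGCATGCATGCATGCATGC"))
```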