Date post: | 17-Dec-2015 |
Upload: | randall-harmon |
Preparation of Pindel input
• Alignment BAM file generated by BWA: (1) bam2pindel.pl with Adaptor.pm
• Alignment BAM file generated by other aligners: (2) sam2pindel.cpp
• Either route yields Pindel input with sample tag
• (3) FilterPindelReads.cpp yields filtered Pindel input with sample tag
• Merge Pindel input files for paired or population sequence data
(1) bam2pindel.pl
• Written by Keiran Raine at Sanger Institute
• This tool was designed for BWA-based BAM/SAM Illumina data
• You must prepare a name-sorted BAM file
• Set BAM_2_PINDEL_ADAPT:
    setenv BAM_2_PINDEL_ADAPT <path to Adaptor.pm>
• Arguments:
    -i|input: Input BAM file (req)
    -o|output: Output ready for Pindel
    -s|sample: Sample or label (sampA,sampB...) (req)
    -pi|insert: Required if BAM file does not have PI tag in header RG record
    -r|restrict: Restrict to chromosome xx
• Example:
    ./bam2pindel_bwa.pl -i NameSorted.bam -o output_prefix -s tumour -om -pi 300
(2) sam2pindel.cpp
• Written by Kai Ye at Leiden University Medical Center
• This tool was designed for all BAM/SAM Illumina data
• You must first compile the C++ source code:
    g++ sam2pindel.cpp -o sam2pindel -O3
• sam2pindel requires 5 arguments:
    1. Input SAM file
    2. Output for Pindel
    3. Insert size
    4. Tag
    5. Number of extra lines (not starting with @) at the beginning of the file
• If you start with a standard SAM file (Input.sam with insert size 300):
    ./sam2pindel Input.sam Output4Pindel.txt 300 tumour 0
• If you start with a BAM file:
    ./samtools view Input.bam | ./sam2pindel - Output4Pindel.txt 300 tumour 0
Running Pindel
1. Input: the reference genome sequences in FASTA format
2. Input: the unmapped reads in a modified FASTQ format
3. Output folder
4. Which chr/fragment
5. BreakDancer result. Format per line: ChrA LocA stringA ChrB LocB stringB others. If you don't have a BreakDancer result, please provide an empty file here.
Example:
    ./pindel hg19.fa pindel_input_chr1.txt Output_Folder chr1 empty
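The five inputs listed above can be assembled programmatically. The following is a minimal Python sketch, assuming only the positional command line shown on this slide. The helper name `build_pindel_command` and the placeholder file name `empty` are illustrative and not part of Pindel.

```python
from pathlib import Path
from typing import List, Optional


def build_pindel_command(reference: str, reads: str, out_dir: str,
                         chrom: str, breakdancer: Optional[str] = None,
                         pindel: str = "./pindel") -> List[str]:
    """Assemble the five positional arguments in the order listed on the slide."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    if breakdancer is None:
        # No BreakDancer result available: supply an empty file instead.
        # Placing it inside the output folder is a choice made for this sketch.
        placeholder = Path(out_dir) / "empty"
        placeholder.touch()
        breakdancer = str(placeholder)
    return [pindel, reference, reads, out_dir, chrom, breakdancer]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the Pindel binary and input files are in place.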
Input format of Pindel
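@9113
TGGGGACCGGTGGAATGCTTCCACTGGCTGGGGGGC
+ chr2 41149518 50 Tumor

• Line 1: read name
• Line 2: sequence of the unmapped read
• Line 3 (anchor): strand, chr, 3’ coordinate and mapping quality of the mapped read; sample tag
[Figure: the anchor read mapped to the reference (ref)]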
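A record in this format can be read back with a few lines of code. The following is a minimal Python sketch, assuming each record is exactly three lines (name, sequence, anchor) and that the anchor line carries the five whitespace-separated fields shown in the example above. The names `PindelRead` and `parse_pindel_input` are illustrative; files written by other converter versions may carry additional fields.

```python
from typing import Iterable, Iterator, NamedTuple


class PindelRead(NamedTuple):
    name: str
    sequence: str
    strand: str
    chrom: str
    position: int   # 3' coordinate of the mapped (anchor) read
    mapq: int       # mapping quality of the anchor read
    tag: str        # sample tag


def parse_pindel_input(lines: Iterable[str]) -> Iterator[PindelRead]:
    """Yield one PindelRead per three-line record."""
    rows = [line.strip() for line in lines if line.strip()]
    if len(rows) % 3 != 0:
        raise ValueError("expected a multiple of three non-empty lines")
    for i in range(0, len(rows), 3):
        header, sequence, anchor = rows[i:i + 3]
        if not header.startswith("@"):
            raise ValueError(f"record name must start with '@': {header!r}")
        # Assumed layout: strand, chr, coordinate, mapping quality, sample tag.
        strand, chrom, position, mapq, tag = anchor.split()
        yield PindelRead(header[1:], sequence, strand, chrom,
                         int(position), int(mapq), tag)
```

For the record shown above, `next(parse_pindel_input(record_lines)).chrom` gives `'chr2'`.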
Output format: deletions
• Deletion size: 1 base to 1 million bases
• Mismatches are allowed, to accommodate sequencing errors and SNPs
D 10 ChrID 13 BP 32913041 32913052
AAATCAACTAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCAaagaacctacTCTATTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAGT
GATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
CAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGA
CGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
CGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
TGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAG
GTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGAC
TAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAA
CCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAA
ACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGG
CGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTT
CCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATC
AACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
TGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACA
ACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAA
GATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
AACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
GAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTT
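In the example above, the header line gives the deletion size, chromosome and breakpoints, and the reference line marks the deleted bases in lowercase. The following is a minimal Python sketch that checks the two against each other, assuming the header layout `D <size> ChrID <chr> BP <left> <right>` exactly as shown. The function names are illustrative.

```python
import re
from typing import Dict, Union


def parse_deletion_header(line: str) -> Dict[str, Union[int, str]]:
    """Parse a line such as 'D 10 ChrID 13 BP 32913041 32913052'."""
    match = re.match(r"D\s+(\d+)\s+ChrID\s+(\S+)\s+BP\s+(\d+)\s+(\d+)", line)
    if match is None:
        raise ValueError(f"not a deletion header: {line!r}")
    size, chrom, left, right = match.groups()
    return {"size": int(size), "chrom": chrom,
            "left": int(left), "right": int(right)}


def deleted_bases(reference_line: str) -> str:
    """Return the single lowercase run in the reference line."""
    runs = re.findall(r"[acgtn]+", reference_line)
    if len(runs) != 1:
        raise ValueError("expected exactly one lowercase run")
    return runs[0]
```

For this example the lowercase run is `aagaacctac`, whose length matches the reported size of 10, and the two breakpoints differ by the size plus one.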
Inversions
[Figure: sample compared with the reference (ref) at an inversion]
Large insertions
Non-template sequence in deletions, inversions and tandem duplications
[Figure: sample compared with the reference (ref), showing non-template sequence]
Non-template sequence: deletion of 4 bases with 2 bases inserted
D 4 I 2 ChrID 3 BP 156978978 156978983 Supports 12 + 0 - 12 S1 13 SUM_MS 627 NumSupSamples 1 HCC1599a 12
CATGGCTGACTTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCACGTTGATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTTTATTGTC
TTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGCCTTGGGCAACTGCCAAA GATGCACT
ATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCAT
CTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCT
AGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCT
TTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCT
TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
TTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCT
CTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
CTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTC
AAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAG
CTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
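The longer header line of this example can be split into labelled fields. The following is a minimal Python sketch, assuming the token layout seen in the `D 4 I 2 ...` line above: label/value pairs, two values after `BP`, and one name/count pair per supporting sample after `NumSupSamples`. The slides do not define fields such as `S1` or `SUM_MS`, so the sketch keeps the labels as they appear without interpreting them. The function name is illustrative.

```python
from typing import Dict


def parse_event_header(line: str) -> Dict[str, object]:
    """Split a Pindel event header into labelled fields."""
    tokens = line.split()
    fields: Dict[str, object] = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        if key == "BP":
            # Two breakpoint coordinates follow the BP label.
            fields["BP"] = (int(tokens[i + 1]), int(tokens[i + 2]))
            i += 3
        elif key == "ChrID":
            fields["ChrID"] = tokens[i + 1]
            i += 2
        elif key == "NumSupSamples":
            count = int(tokens[i + 1])
            fields[key] = count
            rest = tokens[i + 2:]
            # Assumption: 'count' pairs of (sample name, supporting reads).
            fields["samples"] = {rest[2 * j]: int(rest[2 * j + 1])
                                 for j in range(count)}
            i = len(tokens)
        else:
            fields[key] = int(tokens[i + 1])
            i += 2
    return fields
```

Applied to the header above, the result has `D` equal to 4, `I` equal to 2, and `samples` equal to `{'HCC1599a': 12}`.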