Thesis Report Lucia De Moja' - Artifacts in qPCR

ARTIFACTS IN QPCR
Investigating unspecific products

Master Degree Project in Molecular Biology
One year Level 30 ECTS
Autumn term 2014

Lucia De Mojà
[email protected]

Supervisor: Robert

Abstract

Unspecific products are a common artifact in PCR and its variants. The artifacts include primer-dimers (PDs) and double or multiple peaks in the melt curve. The aim of this experiment was to analyze the sequences of the unspecific products in order to define their composition. With the method used in this experiment it was not possible to obtain sequences for the primer-dimers. Statistical analyses were performed in which the characteristics of the tested primer pairs were compared with those of a control group of primer pairs that do not produce unspecific products. The unpaired t test revealed that the differences between the group of primer pairs giving PDs and the control group were statistically significant (95% confidence) for Tm and self 3’ complementarity, with p-values of 0.0012 and 0.0477 respectively. This leads to the conclusion that the design criteria for these two characteristics (such as a Tm mismatch between Fw and Rv below 3-5 °C) might be too permissive: even when a primer pair is designed within the range intended to avoid PDs, PDs are still formed. To increase the reliability of these data, a further study with a higher number of samples should be performed, selecting as test samples only primer pairs that are already part of the experiment. As for the multiple peaks, the sequences of the PCR products were aligned against the expected sequence and the Fw primer used, which showed that an additional binding site was generated on the target gene during qPCR.

Summary

A gene is a segment of DNA that contains a specific sequence of double stranded nucleotides. To be able to amplify a gene it is necessary to know its sequence, so that a primer (a short sequence of nucleotides complementary to a certain segment of DNA) can be designed to bind to the DNA. During amplification, an enzyme called polymerase adds nucleotides to the short sequence until the whole target is duplicated. A fluorescent dye can be used as a reporter that keeps track of the amount of DNA that is duplicated during this process. Reagents involved in a qPCR reaction include the template DNA, the primers and the fluorescent dye. The dye binds to double stranded DNA and emits a fluorescent signal (a common dye used in qPCR is SYBR® Green), so the more double stranded DNA is present in the reaction mix, the more intense the fluorescent signal will be. If the intensity of the signal grows exponentially, the amplification curve will grow exponentially too, until a plateau is reached and the growth stops because there are no reagents left in the reaction mix. A melt curve is generated by reading the fluorescent signal as the temperature is slowly increased. When the DNA strands separate (melt), the dye is released from the DNA and loses its fluorescence. This can be translated into a graphical curve in which, depending on the length of the DNA fragments that were duplicated, each product gives a peak at a specific melting temperature (Tm). A good qPCR melt curve representing the amplification of one single target is supposed to present only one clear peak. Based on these concepts, unspecific products can be visible in the melt curve as additional peaks that make the curve difficult to read and compromise the quality of the results.

A good qPCR result is necessary to give accurate answers to questions such as the identification of a gene involved in cancer development. In a future perspective, eliminating artifacts of this nature is therefore essential in many different fields, including not only scientific research but also forensic science.

The present study was carried out in order to investigate the nature of the unspecific products that are formed during Quantitative Real-Time Polymerase Chain Reaction (also known as qPCR), a technology used in molecular biology to amplify DNA fragments. Unspecific products are not supposed to be amplified during qPCR since this is considered to be a specific reaction due to the specificity of the primer pair used. The amplification of any product requires material to be used and all the reagents present in the reaction mix are meant to be used for the amplification of the DNA target. In the presence of unspecific products, some of the reagents present in the reaction mix are used to amplify the wrong DNA fragment and the consequence is that there is a loss of reagent and less amplification of the desired target possibly giving errors in quantification, and a very unclear result.

The data analysis shows that even though the primer pairs were designed with Tm and self 3’ complementarity below the maximum values (5 °C of difference between Fw and Rv primer, and a complementarity score of less than 4), there is a significant difference in complementarity and Tm between the test group (giving PDs) and the control group (PD-free). In order to reduce the occurrence of the artifacts, stricter values (less than 2 °C of difference between Fw and Rv primer for the Tm, and a complementarity score of no more than 1) should be used for these two characteristics during the design process. Furthermore, in this experiment double peaks were found to be the result of the amplification of two different products in the same reaction, due to the presence of multiple binding sites on the PCR product (not on the original sequence of the target gene); this results in the amplification of two versions of the same gene with different lengths. The reason for the formation of the additional binding site is yet to be established in further experiments.

Table of contents

List of abbreviations
Introduction
qPCR
Artifacts
Primer-Dimers
Double Peaks
Aim
Materials and Methods
Criteria for the primer pairs
qPCR
Purification
Gel electrophoresis and extraction
Capillary gel electrophoresis
Sample preparation for sequencing
Statistical analysis
Results
Primer-dimers
Double peaks
Statistical results
Discussion
Primer-dimers
Double peaks
Conclusions
Future perspectives
Acknowledgements
References
Appendices
Appendix 1 - Kits and reagents
Appendix 2 - Figures
Appendix 3 - Sequences
Appendix 4 - Tables

List of abbreviations

bp - base pair
dNTPs - deoxynucleotide triphosphates
ddNTPs - dideoxynucleotide triphosphates
dsDNA - double stranded DNA
Fw - forward
NTC - no template control
PDs - primer-dimers
PPi - pyrophosphate
qPCR - Quantitative Real-Time Polymerase Chain Reaction
Rv - reverse
SD - standard deviation
ssDNA - single stranded DNA
Tm - melting temperature


Introduction

qPCR

PCR is a widely used method for the amplification of nucleic acids. The principle behind any PCR reaction is that nucleic acids can be amplified by an enzyme called polymerase, which reads the nucleotide sequence and drives a chain reaction in which the free nucleotides present in the reaction mix are paired with their complementary nucleotides on the nucleic acid strand to be amplified. For the reaction to be successful, the reaction mix must contain balanced amounts of primers, salts, deoxynucleotide triphosphates (dNTPs) and the targeted nucleic acid. The amplification takes place in a thermocycler, which is programmed to change the temperature in several steps: the genetic material is first denatured to provide single stranded nucleic acid, the primers then anneal to it at the appropriate annealing temperature, and finally the polymerase elongates the target. Each step is performed at a specific temperature that is maintained for several seconds and then automatically changed to proceed with the reaction (Wilson and Walker, 2010). When all the steps are completed, one cycle has been performed; a normal PCR reaction consists of around 40 cycles.

qPCR is a quantitative PCR method that measures in real time the amount of product generated in each cycle. It is based on the detection of a fluorescent signal generated either by fluorescent dyes such as SYBR® Green, EvaGreen® or BOXTO (which bind the DNA non-specifically within the double strand) or by probes (which bind specifically along the nucleotide sequence) (Reed, et al., 2013). The two approaches differ in that, in the probe based method, the fluorescent signal is free from interference since it is the direct result of the binding between the probe and the target sequence, while the dye based method gives a signal that is directly proportional to the quantity of double stranded DNA bound by the dye and therefore cannot distinguish between different sequences (Vandesompele, 2009, pp. 12-19). This implies that unspecific products created during the reaction will also contribute to the fluorescent signal emitted. The detector will wrongly assign this signal to the target, which introduces a margin of imprecision within the method itself. This kind of interference does not take place in the probe based method due to its specificity: unspecific products give no signal with this method, but this does not imply that they are not formed (Poritz and Ririe, 2014).

Artifacts

It is not unusual to encounter unexpected results during PCR and qPCR, especially when using SYBR® Green technology. Depending on the conditions, the artifacts produced range from multiple peaks in the same melt curve to the amplification of unspecific products such as primer-dimers in no template controls (NTC). This is because SYBR® Green binds to double stranded DNA regardless of whether that dsDNA is the target of the experiment or not (Tajadini, et al., 2014).


Primer-Dimers

PDs are thought to be the principal source of artifacts in PCR and its variants. They are stated to be the result of annealing between the forward (Fw) and reverse (Rv) primers present in the reaction mix during PCR, due to a certain level of complementarity between them. The theory that primer-dimers result from self-annealing within the primer pair is a common concept within the scientific community (Reed, et al., 2013). This concept leads to the logical conclusion that primer-dimer products should not exceed in length the sum of the nucleotide sequences of the two primers involved (SantaLucia, 2007).

Primer-dimer artifacts have actually been shown to be longer than that by a few nucleotides, as shown in figure 1, which calls their composition into question in contrast to the common conception (Brownie, et al., 1997; SantaLucia, 2007).
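The length argument above can be sketched in a few lines of Python. This is a simplified illustration only: it assumes that a dimer forms through a perfect overlap of the two 3’ ends and is then fully extended, and the primer sequences, the function names and the observed fragment length are hypothetical, not taken from this study.

```python
# Simplified model (assumption): the 3' ends of Fw and Rv anneal over k perfectly
# complementary bases and the polymerase extends both strands, giving a product
# of len(fw) + len(rv) - k base pairs. The upper bound is the sum of both primers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(fw, rv):
    """Longest k for which the last k bases of fw pair with the last k bases of rv."""
    for k in range(min(len(fw), len(rv)), 0, -1):
        if fw[-k:] == reverse_complement(rv[-k:]):
            return k
    return 0


def expected_dimer_length(fw, rv):
    """Product length predicted by the simple 3' overlap model."""
    return len(fw) + len(rv) - three_prime_overlap(fw, rv)


def longer_than_primer_sum(observed_bp, fw, rv):
    """True if an observed fragment cannot be explained by the two primers alone."""
    return observed_bp > len(fw) + len(rv)


# Hypothetical primer pair, for illustration only
fw_primer = "AGCTGACCTGAAGTCGCA"
rv_primer = "TTCAGGTCCAAGTCATGC"
print(three_prime_overlap(fw_primer, rv_primer))
print(expected_dimer_length(fw_primer, rv_primer))
print(longer_than_primer_sum(50, fw_primer, rv_primer))
```

A fragment for which `longer_than_primer_sum` is true must contain additional nucleotides, which is the situation reported by Brownie, et al. (1997).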

Figure 1: Image taken from a study conducted by Brownie et al. in 1997 (The elimination of primer-dimer accumulation in PCR. Nucleic Acids Research, 25(16)), in which they demonstrated that the sequence of the primer-dimers is longer than the sum of the primer pair because some additional nucleotides are inserted between the two primers. A, B, C, D, E and F are the primers and the Rv complements used to perform the alignment with the sequenced products. The red squares show the additional sequence that is bound to the last nucleotide of each primer.


Brownie, et al. (1997) focused their aim on producing primer dimers using different levels of complementarity between the primers, while Satterfield (2014) recently tried adding Rv and standard phosphoramidites to synthesize oligonucleotides without 5’ end to reduce the annealing of the primers between each other.

This experiment focused on the attempt to produce primer-dimers from known primer pairs that were supposed to have a low self-complementarity, under normal PCR conditions but with an increased total number of cycles, since primer-dimers are a so-called “late product” in PCR reactions. Once the primer-dimers were formed in NTCs, the product was used as template in the following qPCR experiments to increase the total amount of product available for further analysis.

Double Peaks

Double peaks are artifacts that are often displayed in the melt curve during PCR experiments and its variants that use SYBR® Green as reporter dye (Ririe, et al., 1996). Even though the literature is not rich in experiments that investigate the nature of multiple peaks in the melt curve, several companies such as Life Technologies and IDT® explain in their troubleshooting guidelines that, due to the presence of artifacts such as multiple peaks, melt curve analyses are not to be considered a diagnostic method but only an indicator (Downey, 2014). The common conception behind the presence of double peaks is that they are the result of different products amplified during the reaction (SantaLucia, 2007).

In this experiment, samples giving double peaks in the melt curve analysis were purified and sequenced in order to establish whether two different products were actually present or not in the same reaction.
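How single and double peaks are read from a melt curve can be illustrated with a minimal sketch. It assumes the common representation of a melt curve as the negative derivative of fluorescence with respect to temperature (-dF/dT); the simulated sigmoid data, the peak threshold and all function names are hypothetical and do not reproduce the software of the instruments used in this study.

```python
import math


def negative_derivative(temps, fluor):
    """Central-difference estimate of -dF/dT at the interior temperature points."""
    return [
        -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def melt_peaks(temps, fluor, min_fraction=0.2):
    """Temperatures of local maxima of -dF/dT above min_fraction of the tallest peak."""
    deriv = negative_derivative(temps, fluor)
    threshold = min_fraction * max(deriv)
    peaks = []
    for j in range(1, len(deriv) - 1):
        if deriv[j] >= threshold and deriv[j] > deriv[j - 1] and deriv[j] >= deriv[j + 1]:
            peaks.append(temps[j + 1])  # deriv[j] belongs to temps[j + 1]
    return peaks


def simulated_melt(temps, products, width=0.8):
    """Hypothetical fluorescence: one sigmoid drop per (Tm, amplitude) product."""
    return [
        sum(amp / (1 + math.exp((t - tm) / width)) for tm, amp in products)
        for t in temps
    ]


temps = [60 + 0.5 * i for i in range(71)]  # 60 to 95 °C in 0.5 °C steps
single_product = simulated_melt(temps, [(84.0, 1.0)])
two_products = simulated_melt(temps, [(76.0, 0.6), (84.0, 1.0)])
print(melt_peaks(temps, single_product))
print(melt_peaks(temps, two_products))
```

In this simulation two products with different Tm give two peaks, which is the common interpretation of double peaks that the sequencing in this study was meant to verify.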


Aim

The aim of this project was to produce and analyze unspecific products in qPCR to find out:

- Whether primer-dimers are the actual product of the dimerization of the primers or something else participates in their formation.

- Whether multiple peaks in the melt curves prove the presence of different products using SYBR® Green as reporter dye.

- Whether or not there is a pattern in the formation of unspecific products depending on the characteristics of the primers involved.

To achieve this goal, qPCR was performed using several primer pairs as NTCs (to establish the presence of PDs). Two samples were tested using serial dilutions to try to separate the multiple peaks obtained. Statistical tests, such as the unpaired t test, were performed in order to compare a test group (primer pairs prone to form PDs) with a control group (primer pairs that were negative for PDs as NTCs).

Since artifacts during qPCR cause unclear results and loss of reagents, as well as less amplified target, understanding any of the queries mentioned above could be helpful in order to have better performances and more reliable results.


Materials and Methods

Different primer pairs were tested in qPCR for the formation of unspecific products. They were collected based on their previous results, such as the number of peaks in the melt curve for a single target or the presence/absence of a melt peak in NTCs. The primer pairs showing unspecific amplification in the form of multiple peaks in the melt curve were further tested using five- or seven-point standard curves with 10-fold serial dilutions, while the primer pairs showing amplification in the NTCs were analyzed in qPCR without template to obtain a sample to purify and to get information about fragment length by capillary gel electrophoresis using the Fragment Analyzer™ (AATI, Ames, USA). Sequencing was then performed to analyze the sequence of the purified fragments.

The first set of experiments was performed running qPCR on seven primer pairs, of which five were NTCs and two were samples with human cDNA and primers designed to target IGFBP3 and CD44. The samples that showed amplification were then purified with the MinElute™ kit (Qiagen, Hilden, Germany) and analyzed by capillary gel electrophoresis in the Fragment Analyzer™ from Advanced Analytical. The purified products were sent for cycle sequencing to Eurofins Genomics (Ebersberg, Germany).

The second set of experiments was performed on 20 primer pairs that have been previously demonstrated to produce artifacts in qPCR, running NTCs. The products were purified with the MinElute™ kit (Qiagen), Oligo clean-up kit (Norgen) and gel electrophoresis was performed to extract the fragments using the QIAEX II® gel extraction kit (Qiagen). The purified fragments were sent for Sanger sequencing at GATC Biotech (Konstanz, Germany).

Criteria for the primer pairs

The criteria used to collect the primer pairs to be tested in qPCR focused on two characteristics: the ability of the primer pairs to self-anneal (verified by looking at the melt curves generated by NTCs in previous experiments), and the number of peaks in the melt curve for the samples that showed artifacts in previous experiments. All the primer pairs producing melt curves attributable to unspecific products were analyzed in NTCs in order to prove the presence of products even without any template in the reaction mix.

Upon confirmation of the presence of the product in qPCR, capillary gel electrophoresis was performed on the positive samples to get more information about the length of the fragments. The NTCs giving a product and the samples with multiple peaks in qPCR, were sent for sequencing to a third party. For the artifacts that were thought to be PDs, the length was chosen arbitrarily within a range that had the sum of the length of the two primers as a minimum and the maximal length obtained as maximum.

As control, primer pairs that did not produce artifacts in qPCR in previous experiments were used to perform statistical analysis to compare the characteristics of their sequences with the tested ones.

All the primer pairs were provided by IDT® - Integrated DNA Technologies (Coralville, USA) and the results of the previous experiments were taken from the primer library and the laboratory books of TATAA Biocenter (Gothenburg, Sweden).


qPCR

Each primer pair that gave unspecific products when analyzed without template in previous experiments performed at TATAA Biocenter was analyzed in qPCR without template to confirm the presence of the artifacts. The product was then diluted 1:108 and amplified again to be sure that the fragments obtained were the product of the primer pair used in the NTC. Each NTC was analyzed in quadruplicate to increase the chance of observing some unspecific product. Once the product was obtained, each sample was amplified again in triplicate to ensure repeatability. Primer pairs giving different Tm in different replicates were analyzed again separately.

The primer pairs giving multiple peaks in the melt curves in previous experiments were tested using human cDNA as template to confirm the presence of the artifacts. Serial dilutions were then performed to obtain a standard curve that made it possible to compare the relative amounts of the two products as a function of the template concentration.

The master mix used for all the samples was TATAA SYBR® Grandmaster® Mix (TATAA Biocenter). The amount of template used was 2 µl (1:108 diluted from the product of NTC) in 20 µl of total reaction volume. All the primers were used from stock solutions 100 µM and diluted to a concentration of 10 µM in a working solution with nuclease free water. The characteristics of the primers can be found attached in appendix 4 table 1a (control group) and 1b (Test group). Each qPCR was performed at the same conditions in all the tests as reported in table 2 after optimization of number of cycles and amount of time for each step.
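The dilution arithmetic behind the working solutions and the 10-fold serial dilutions can be sketched as follows. The relation C1·V1 = C2·V2 is standard; the final volume of 100 µl and the function names are assumptions chosen for illustration, not values from this study.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Solve C1*V1 = C2*V2 and return (stock volume, diluent volume)."""
    if target_conc > stock_conc:
        raise ValueError("target concentration cannot exceed stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


def serial_dilution(start_conc, fold, points):
    """Relative concentrations of a serial dilution with a constant dilution factor."""
    return [start_conc / fold ** i for i in range(points)]


# 100 µM primer stock diluted to a 10 µM working solution
# (the 100 µl final volume is a hypothetical example)
stock_ul, water_ul = dilution_volumes(100.0, 10.0, 100.0)
print(stock_ul, water_ul)

# Five-point standard curve with 10-fold steps, relative to the undiluted template
print(serial_dilution(1.0, 10, 5))
```

The same functions cover a seven-point curve by passing `points=7`.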

Table 2: Temperature and cycling program used for all the experiments.*

Program         Temperature (°C)   Time (s)   Cycles
Initial Hold    95                 60         1
Denaturation    95                 5          55
Annealing       60                 30         55
Extension       72                 15         55
Melt curve      95 – 60                       1

*LightCycler® Nano from Roche Life Science and QuantStudio® 12K Flex System from Life Technologies were used to perform the experiments. The number of cycles used was higher than usual to ensure that a sufficient amount of product was amplified, since unspecific products such as primer-dimers are shorter than normal qPCR fragments and only produce a detectable signal in the late phase of PCR.

Purification

For the first set of experiments, the products showing a clear peak in the melt curve were collected and purified using the MinElute™ PCR Purification Kit (50) from QIAGEN following the manufacturer’s instructions. One volume of PCR product was added to five volumes of PB buffer and then centrifuged through a purification column that retains the genetic material in the filter at the bottom, flushing away the buffer and the reaction mix used in qPCR. 750 µl of PE buffer were then added to each column to wash away residues in the filter that were not genetic material, and the column was centrifuged again. In the last step, the DNA was eluted in EB buffer (14 µl) with a final centrifugation. All the centrifugation steps were conducted at 13000 rpm for 1 minute.

The second set of experiments was performed using the Oligo Clean-Up and Concentration Kit (50) provided by Norgen Biotek Corporation, because the Qiagen kit only recovers fragments from 70 bp up to 4 kb. The Norgen kit was chosen since its purification range includes fragments from 10 bp, allowing the recovery of smaller fragments such as primer-dimers. The samples were brought to 50 µl with nuclease free water, and 150 µl of Binding solution (to bind the DNA) and 300 µl of isopropanol were then added. This procedure allows the DNA to bind to the filter of the column during centrifugation, while the residues are discarded along with the aqueous phase. To wash residual debris from the column, 400 µl of washing solution were added and the sample was centrifuged again. The washing step was repeated and, after discarding all of the residue, another centrifugation step was performed to make sure that the filter of the column was dry. To elute the DNA from the filter, 50 µl of Elution solution were added to the column and another centrifugation step was performed. All the centrifugation steps were conducted at 14000 rpm for 1 or 2 minutes.

The purity and concentration of the samples were measured using both a Nano Drop 1000 spectrophotometer (Thermo Scientific, Waltham, USA) and a Drop Sense 96 spectrophotometer (Trinean, Gentbrugge, Belgium).

Gel electrophoresis and extraction

As neither of the two kits mentioned above yielded a sufficient amount of pure product, gel extraction was used to recover the smaller fragments without compromising the purity of the samples and to make sure that the residual primers were not collected together with the target fragments. This procedure was performed only for the NTC samples. NuSieve™ 3:1 Agarose (LONZA, Basel, Switzerland) was used according to the instructions for casting a 6% agarose gel in TBE buffer. Concentrated qPCR products were loaded into the wells and run for one hour at 60 V and then 30 minutes at 75 V. The gel was stained with GelGreen™ (Biotium, Hayward, USA) in a water bath (at a final concentration of 3X in H2O) for 30 minutes under constant agitation at room temperature, and the bands were visualized with a UV lamp and excised with a scalpel.

To extract the DNA from the gel, the QIAEX II® Gel Extraction Kit from QIAGEN was used following the manufacturer’s instructions. This procedure requires no columns, but a water bath at 50 °C for 10 minutes was needed to extract the fragments from the gel. The tube was filled with the agarose gel (up to 250 mg) containing the DNA band, six volumes of buffer QX1 to wash and solubilize the gel, and 10 µl of QIAEX II solution to bind the DNA. Once the gel had melted, the samples were centrifuged for 30 seconds at 13000 rpm and the supernatant was discarded, leaving the DNA in the pellet. To remove residual agarose, salts and contaminants from the pellet, two washing steps with 500 µl of QX1 each were performed, resuspending the pellet in the buffer between centrifugations. The pellet containing the fragments was air dried at room temperature and then dissolved in 20 µl of water and centrifuged again, and the supernatant was collected in a clean tube.

The purified material was stored at -20 °C until further use for capillary gel electrophoresis.

Capillary gel electrophoresis


Capillary gel electrophoresis was performed using the Fragment Analyzer™ from Advanced Analytical to do quality control tests on the purified material. The Fragment Analyzer™ was used following the manufacturer’s instructions for DNA samples. This instrument uses capillary technology to run the samples in a gel and then discards the waste automatically showing a digital image of the bands.

The gel was prepared with an intercalating dye provided by AATI, and the sample plate was prepared with PCR product diluted 1:11. The marker solution used had a lower marker of 35 bp and an upper marker of 1500 bp. The same 100 bp ladder run in a previous experiment was used as ladder.

Sample preparation for sequencing

The samples collected were prepared following the sample submission guideline provided by Eurofins Genomics. To analyze Fw and Rv strands separately, the two primers were sent in two separate tubes at a concentration of 10 µM and the concentration of the purified samples was 2 ng/µl. Cycle sequencing was performed for both unspecific products obtained running NTCs and the samples with multiple peaks in the melt curve.

The fragments produced during the second set of experiments were sent to GATC Biotech following the sample submission guideline. Each sample was sent at a minimum concentration of 10 ng/µl, together with Fw and Rv primers at 10 µM. Each qPCR product was purified with the three different methods described above and all the purified samples were sent for Sanger sequencing. Only PD products obtained running NTCs were analyzed for their sequences.

Statistical analysis

Two groups of primer pairs were considered to investigate why certain primer pairs are more prone than others to give artifacts in qPCR. The test group consisted of primer pairs that had produced unspecific products known as primer-dimers in previous experiments, while the control group consisted of primer pairs shown not to produce PDs in NTC. The characteristics of the primers in each primer pair were obtained using the Primer-BLAST tool from NCBI, keeping the standard parameters except for the Tm fields, which were left empty. Average and SD were calculated for the following characteristics: Tm, GC% content, self 3’ complementarity and self-complementarity score, both between the two groups and within the same group between Fw and Rv primer. The unpaired t test was performed and t and p-value were calculated using the online tool QuickCalcs from GraphPad Software, to find out whether there was a statistically significant difference between the two groups for each of the considered parameters. The confidence level was set at 95%, corresponding to a significance level of 0.05.
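The comparison described above can be illustrated with a minimal sketch of an unpaired t test. It assumes the equal-variance (Student) form of the test and computes the two-tailed p-value by numerical integration; whether this matches the exact settings used in QuickCalcs is an assumption, and the input values and function names are hypothetical, not data from this study.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: integrate the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def unpaired_t_test(group_a, group_b):
    """Student's unpaired t test (equal variances assumed); returns (t, df, p)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, df, two_tailed_p(t, df)


# Hypothetical Tm differences (°C) between Fw and Rv primer, for illustration only
test_group = [3.1, 2.4, 4.0, 2.8, 3.6]
control_group = [1.2, 0.8, 1.9, 1.1, 1.5]
t, df, p = unpaired_t_test(test_group, control_group)
print(f"t = {t:.3f}, df = {df}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

A difference is reported as statistically significant when the p-value is below the 0.05 significance level stated above.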


Results

The aim of this project was to understand the nature of the unspecific products in PCR and the conditions under which they appear.

Unspecific products are a common artifact in qPCR. They include deviations in the shape of the melt curve, additional products other than the target of the experiment, and multiple peaks in the melt curve.

A common artifact that is often encountered during PCR and its variants is the formation of primer-dimers. In this work, we tried to intentionally produce primer-dimers using primer pairs known to have the tendency to produce such artifacts. Some of the primer pairs studied produced multiple peaks in the melt curve in the presence of DNA as template. The PD set of primer pairs was tested as NTC, and the double peaks set was tested in a serial dilution to see the trend of the peaks at different template concentrations. The results showed that primer-dimer formations are longer than the expected length of ~40 bp (50 to 99 bp according to the results in the Fragment Analyzer™).

This section is divided into two parts, since the artifacts that were analyzed turned out to be very different from each other. PDs were generated from 30 primer pair samples, while the longer unspecific products giving double peaks were encountered in only two samples.

Primer-dimers

A total of 30 different NTC primer pairs were selected to be part of the study as generators of unspecific products such as primer-dimers. The primer pairs were selected based on their results in previous experiments, so that only those showing amplification of unspecific products even without any template in the reaction mix were included. Primer-dimers are unspecific products that are generated randomly under all kinds of conditions during PCR. Because of this randomness, the results can be heterogeneous when the primer pairs are run in quadruplicate. Primer pair one (shown in figure 2), for example, presented four different peaks in the melt curve plot, suggesting that four different products were generated in the different replicates. The same pattern was visible in almost all of the analyzed primer pairs, with some exceptions (appendix 2 - figure 4), suggesting that the method per se is not repeatable even when following the same protocol under the same conditions.

Even though the melt curves are different between the replicates, a quality control test performed with the Fragment Analyzer™ using capillary gel electrophoresis confirmed that the length of the fragments was exactly the same (see appendix 2 - figure 3).

Figure 2: Melt curves of the NTC quadruplicates (2a) and of the three replicates of the second generation product (2b) for primer pair one. Figure 2a: The Tm of all the products appears to lie between 75 and 78°C. The peaks are clear and well defined for all the replicates except one (indicated by the red circle), which showed almost no amplification even though it was still visible at ~77°C. A noisy pattern is noticeable in the baseline of all the replicates. Figure 2b: The image shows the three of the four replicates of primer pair one that were analyzed in qPCR and gave different melt curves (Figure 1). Each of the three replicates was analyzed in triplicate, and two of the three replicates were negative (red rectangles). Only one of them (primer pair one, replicate three, in the red circle) gave some product, with a Tm of ~73°C, while the Tm of the 1st generation products was between 75 and 78°C.

Only 18 of the 30 primer pairs originally selected for this study presented artifacts attributable to PD formation during qPCR, despite having been selected for their capability to form primer-dimers. Once a reaction showed amplification of unspecific products, another qPCR was performed using the 1st generation product as template, to confirm that amplification still occurred and that the product was generated from the two primers used in the first qPCR.

The replicates giving different Tm values were amplified a second time, separately and in triplicate. Most of the time these 2nd generation products did not match expectations, giving either no amplification at all or a different Tm from the 1st generation product (figure 2b). In some cases, replicates that gave different Tm values in the 1st generation gave the same Tm when amplified a second time using the 1st generation product as template (appendix 2 - figure 5).

The samples that showed no amplification in the qPCR were then excluded from the statistical analysis and the sequencing.

The concentration obtained with the different purification methods was always very low for the samples giving PDs, ranging from a minimum of 2 ng/µl to a maximum of 13 ng/µl (MinElute™), 40 ng/µl (Oligo Clean-up) and 13 ng/µl (QIAEX II®). Even though the concentration was not optimal, the number of copies was very high (from a minimum of 3.3 x 10^10 to a maximum of 6.74 x 10^11) due to the short length of the fragments.
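The relation between mass concentration and copy number can be sketched as follows. This is a minimal illustration, not the calculation used in the study: the average mass of 650 g/mol per base pair, the 55 bp fragment length (taken from the Fragment Analyzer™ result for primer pair one) and the function name `copies_per_ul` are assumptions made here.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def copies_per_ul(conc_ng_per_ul, length_bp, bp_mass=BP_MASS_G_PER_MOL):
    """Estimate dsDNA copies per microlitre from a concentration in ng/µl."""
    grams_per_ul = conc_ng_per_ul * 1e-9              # ng -> g
    moles_per_ul = grams_per_ul / (length_bp * bp_mass)
    return moles_per_ul * AVOGADRO


# Lowest and highest concentrations reported for the PD samples,
# evaluated for an assumed 55 bp fragment.
for conc in (2, 40):
    print(f"{conc} ng/µl -> {copies_per_ul(conc, 55):.2e} copies/µl")
```

Under these assumptions the two concentrations give values of the same order as the copy numbers reported above, which shows why a short fragment yields a very high copy number even at a low mass concentration.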

For the fragments generated by running the primer pairs under study as NTCs, no results were obtained by sequencing the purified material with either Sanger sequencing or its variant, cycle sequencing. As shown in appendix 3 - figure 11, the chromatograms showed a noisy baseline pattern that made it impossible to identify a sequence, since no clear peak was generated.

Even though for certain samples a sequence was identified, the degree of certainty about the nature of the bases was so low that it was impossible to use the data for further analysis.

Double peaks

Two samples were considered for this part of the experiment. Human cDNA was used as template to amplify part of the IGFBP3 and CD44 genes using two different primer pairs.

The 1st generation product is visible in the melt curves shown in figures 6 and 7 in appendix 2. Both samples present two peaks, but in the case of CD44 the second one is not yet well defined (appendix 2 - figure 7). Multiple peaks in qPCR suggest the presence of two different products in the same reaction. In an attempt to separate the peaks and obtain one final sample with only one clear peak, the qPCR product was diluted 1:10^8 and used as template for a second reaction; the results are shown and described in appendix 2 - figure 8. Because for both the IGFBP3 and CD44 samples the peak with the higher Tm (attributable to the unspecific product) showed a higher level of fluorescence than the peak with the lower Tm (attributable to the expected product), the 1:10^8 dilution was repeated, leading to the amplification of the unspecific products as the major product of the reaction (figure 9a and 9b).

Figure 9: CD44 melt curves for the old product with double peaks and the newly diluted and amplified product with only one peak (9a), and the original product of the IGFBP3 sample against the one obtained by diluting 1:10^8 both the 1st and 2nd generation products of IGFBP3 (9b). Figure 9a: The original target had a Tm of ~82°C, and in the 1st generation product a small peak of unspecific product can be seen starting to emerge (red circle in the figure). The unspecific product isolated in the diluted sample in the 5th generation showed a Tm of ~86°C (black arrow). Figure 9b: The melt curve with only one peak, with a Tm of 83°C, represents the expected IGFBP3 cDNA amplified with the same primer pair. The melt curve with the double peak is the result of the dilution of the 2nd generation product and its subsequent amplification in qPCR. Since the Tm of the original IGFBP3 overlaps with the residual hump of the dilutions, they probably represent the same product.

After several cycles of dilution, qPCR and purification, the final single-peak products had a concentration of 229.70 ng/µl for IGFBP3, with absorbance ratios of 1.85 (A260/280) and 2.16 (A260/230), and of 217.40 ng/µl for CD44, with absorbance ratios of 1.84 (A260/280) and 1.70 (A260/230). Nucleic acids absorb most strongly at 260 nm, so purity is judged from these ratios: an A260/280 ratio of about 1.8 and an A260/230 ratio of about 2.0 or higher are generally taken to indicate pure DNA.

Even though the concentration and the purity were high for both IGFBP3 and CD44, the quality control run on the Fragment Analyzer™ was disappointing: no band stood out in the gel and no peak was visible in the electropherogram, so CD44 was classified as a sample that could not be sequenced (appendix 2 – figure 10).

Neither IGFBP3 nor CD44 passed the quality control on the Fragment Analyzer™, so CD44 was not sent for sequencing. IGFBP3 was nevertheless sent as a control, to see whether it could be sequenced, since the amplified fragment was expected to be roughly 100 bp long. As a result, both the 1st generation product containing the expected amplified target (IGFBP3 old) and the isolated 5th generation unspecific product (IGFBP3 new), shown in figure 9b, were successfully purified and sequenced using cycle sequencing (see appendix 3 figure 12). Despite being performed twice, the sequencing was unable to recover the whole sequence of IGFBP3 new; an alignment with IGFBP3 old and the IGFBP3 cDNA provided by GenBank was nevertheless performed using Clustal Omega from EMBL-EBI. Because the cDNA was over 2000 bp long, three extracts covering all the matching sequences or fragments were taken from the alignment results. In extract one (appendix 3), a green rectangle marks the part of the sequences that matched perfectly between IGFBP3 new, IGFBP3 old and the IGFBP3 cDNA. Because the two fragments analyzed differed greatly in length but matched almost perfectly in the alignment, the longest sequence was aligned with the sequence of the Fw primer, which revealed two matching sites on the same fragment, as shown in figure 13 appendix 3. To understand whether the primer had been designed in a way that allowed two different binding sites, the same primer was aligned with the original sequence of IGFBP3 provided by GenBank. This second alignment showed that the sequence obtained for IGFBP3 in this experiment did not completely match the original one: the Fw primer sequence was no longer present at two different sites, suggesting that some kind of severe deletion had occurred (figure 14 appendix 3).

Statistical results

The test group and the control group consisted of 18 and 19 primer pairs respectively. Two particular primer pairs from the test group either produced PDs or did not, depending on which of two equally concentrated working solutions was used. They were named primer pair α and primer pair β and were placed in both groups, to see whether their presence in one group or the other could affect the final results. Table 3a shows the standard deviations (SD) and averages of the groups of primer pairs; table 3b in appendix 4 shows the same data with and without primer pair α and primer pair β in the calculations.

Table 3a: SDs and averages between Fw and Rv primers for all the considered characteristics for each group. T – test, C – control.

Group            Length (b)    Tm (°C)      GC%           Self Comp     Self 3’ Comp
T group Average  20.02777778   58.4663888   52.1775       3.333333333   0.5555555556
T group SD       1.383290303   2.53090402   7.170243421   1.041976145   0.5555555556
C group Average  20.55263158   59.8855263   53.19289474   3.289473684   0.4131549501
C group SD       1.703690246   0.80530906   7.074501953   0.802290462   0.2105263158

Based on the averages and SDs calculated, an unpaired t test was performed; the results are shown in table 4. From the SD, the average and the number of samples of each group, the p-value, the t value and the statistical significance of the calculated differences were obtained.
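How a t value follows from the average, SD and number of samples of two groups can be sketched as below. This is only an illustration under stated assumptions: the text does not say whether a pooled-variance (Student) or unequal-variance (Welch) test was used, so the pooled form is assumed here, and the function name `pooled_t_statistic` is hypothetical. The sketch is not claimed to reproduce the values in table 4, and the p-value (which requires the t distribution) is left to a statistics package.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired t statistic from summary statistics, assuming equal variances."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Tm row of table 3a, with the group sizes given in the text (18 and 19)
t_value, df = pooled_t_statistic(58.4663888, 2.53090402, 18,
                                 59.8855263, 0.80530906, 19)
print(f"t = {t_value:.3f}, degrees of freedom = {df}")
```

The sign of the t value only reflects the order of the groups; a two-sided test uses its absolute value.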

The results show that the differences in Tm between the test group and the control group were statistically significant for all the combinations (with the α and β primer pairs included, excluded, or included in only one group or the other). The p-values are reported in table 4, and the highlighted ones are those considered statistically significant at the 95% confidence level.

Table 4: P-values calculated between the different groups.*

Characteristic   T and C   T αβ and C   T and C αβ   T αβ and C αβ
Length           0.1406    0.2375       0.3821       0.1497
Tm               0.0012    0.0305       0.0180       0.0015
GC%              0.5310    0.8478       0.5110       0.5645
Self Comp        0.8351    0.8932       0.6889       0.8624
Self 3’ Comp     0.0477    0.1774       0.1574       0.0530

* Test group and Control group (T and C); Test group + α and β primer pairs and Control group (T αβ and C); Test group and Control group + α and β primer pairs (T and C αβ); Test group + α and β primer pairs and Control group + α and β primer pairs (T αβ and C αβ). The confidence level was 95%, giving a significance level of 0.05. The statistically significant results are highlighted: all the Tm values, and the self 3’ complementarity of the T and C groups and of the T αβ and C αβ groups. As reported in the table, the statistical significance is very high when the groups compared both contain or both exclude the α and β primer pairs.

Discussion

The major purposes of the present study were to find the source of the additional sequence between the Fw and Rv primers in PDs, to determine whether SYBR® Green is involved in the appearance of artifacts as additional peaks in the melt curve, and to establish whether the characteristics of the primer pairs giving artifacts in qPCR differ from those of primer pairs that do not produce PDs. The methods were designed on the assumption that, by amplifying a product in qPCR, purifying the amplicons and sequencing them, it should be possible to analyze the sequence of the products and compare it with other sequences (Mardis, 2008).

Primer-dimers

Once the artifacts were produced, the products were sent to third parties for sequence analysis. Sequencing methods can differ depending on the characteristics of the fragments (such as their length). Sequencing is a technique that allows one to read the sequence of the fragments under study thanks to signaling molecules that emit different fluorescent signals (Gupta and Gupta, 2014). In the past, this technique used other kinds of molecules, such as radioactive agents, as markers for the nucleotides (Maxam and Gilbert, 1977), but in recent years fluorescent dyes have been introduced as an alternative (Reed, et al., 2013).

For this experiment, Sanger sequencing and its variant cycle sequencing were used to analyze the PDs and the unspecific products that gave multiple peaks in the melt curve. Even though the newer sequencing techniques are very precise and reliable, very short fragments such as primer-dimers are difficult to sequence, since the first part of the sequence is often lost during the process. In short fragments this means that a large portion of the information can be lost, impairing the collection of the data required to analyze the sequence. To overcome this problem, single-read sequencing can be performed in both directions of the fragment, so that the part lost at the beginning of the Fw strand is recovered from the complementary end of the Rv strand and vice versa (Wilson and Walker, 2010).
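The idea of recovering the lost start of one read from the read in the opposite direction can be sketched as follows. This is a simplified illustration, not the procedure applied by the sequencing provider: the 55-base fragment is invented, the loss of exactly 15 bases at the start of each read is an assumption, and the helper names `reverse_complement` and `merge_reads` are hypothetical. Real reads would also contain base-calling errors, which this exact-overlap merge does not handle.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def merge_reads(fw_read, rv_read, min_overlap=8):
    """Join two opposite-direction reads through their longest exact overlap."""
    left = reverse_complement(rv_read)  # covers the start of the fragment
    right = fw_read.upper()             # covers the end of the fragment
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None  # no sufficiently long overlap found


# Invented 55-base fragment; each read is assumed to lose its first 15 bases
fragment = "GATTACAGGCTCATGCCAGTTGACCTAGGTAACGCTTCGGATACTCGAGTCCATG"
fw_read = fragment[15:]                      # Fw read lacks the fragment start
rv_read = reverse_complement(fragment)[15:]  # Rv read lacks the fragment end
print(merge_reads(fw_read, rv_read) == fragment)
```

The shorter the fragment, the smaller the overlap between the two reads, which is consistent with the difficulty of sequencing products of only 50 to 60 bp.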

In the present study, both Sanger and cycle sequencing proved inadequate for the analysis of very short fragments purified with either column based methods or gel extraction. For this reason it was not possible to align the sequence of the primers with the sequence of their qPCR products to find out what kind of additional sequence could have been inserted between the primers. The failure to sequence the PDs is most likely due to the fact that one of the column based systems used to purify the products (MinElute™) was optimized for molecules longer than 70 bp, leading to a large loss of product during the process and therefore to too low a DNA concentration in the samples. The Oligo clean-up column based kit, in turn, collects molecules from 10 to 70 bp, so that the purified samples also included other molecules such as the primers and possibly other unspecific products, impairing the sequencing procedure and leading to the results shown in figure 11 – appendix 3.

Even though several purification methods were used, it is probable that small amounts of the fluorescent dye were still present in the samples sent for sequencing, because its high affinity for dsDNA and AT-rich fragments makes it bind strongly to the nucleic acid (Mao, et al., 2007), thereby interfering with the nucleotides’ signals. Giglio et al. (2003), in their study on the preferential binding of this dye to DNA with specific characteristics, claim instead that SYBR® Green shows a higher affinity for GC-rich fragments, in contrast with the statement made four years later by Mao et al. (2007).

Other sequencing methods such as NGS (Next Generation Sequencing) or pyrosequencing could be used to sequence short fragments (Weitschek et al., 2014); alternatively, the fragments could be cloned into vectors so that longer products can be sequenced (Brownie et al., 1997).

Another possible reason for this lack of results is that SYBR® Green is considered able to inhibit several reactions, including qPCR, if used at high enough concentrations (Nath, et al., 2000). It would not be unreasonable to think that, like EDTA, SYBR® Green could also inhibit other methods such as sequencing, especially when analyzing fragments as small as PDs (50 – 60 bp). This assumption is based on the fact that both PCR and sequencing are enzymatic reactions that use the emission of a signal as detection method, so a fluorescent dye could interfere with the signal, creating noise and unclear peaks (Leonard, et al., 1998).

A question that arose while observing the melt curves was why the same product often shows a different Tm in different experiments. An example is primer pair one, reported in figure 2a (1st generation product, different Tm for each replicate), figure 2b (2nd generation product with no amplification at all, even though the 1st generation product was used as template) and figure 5 – appendix 2 (again a 2nd generation product obtained from each replicate of the 1st generation as template, which this time produced identical melt curves with the same Tm). The belief that different Tm values could belong to the same product might be justified by the fact that, when the samples were run on the Fragment Analyzer for capillary gel electrophoresis, the results for the three different peaks were exactly the same, showing a size of 55 bp each time (figure 3 – appendix 2). On the other hand, several authors, such as Ririe, et al. (1996), Wittwer, et al. (2003), Pryor, et al. (2006) and others, have reported that melt curve analysis identifies PCR products with high reliability, which leads one to consider the possibility that a different fragment was actually generated every time. Unfortunately, since sequencing was unsuccessful, it was not possible in this study to explain this anomaly.

According to the literature and our statistical analysis, a relevant role in PD formation is played by the Tm of the primer pair and the self 3’ complementarity of each primer (Hsieh, et al., 2006; Chou, et al., 1992; Hongoh, et al., 2006; Kimura, et al., 2011). All the primer pairs were designed following the common rules for avoiding PD formation (Poritz and Ririe, 2014). The statistically significant difference found between the two groups of primer pairs in Tm and self 3’ complementarity suggests that the standard criterion (a Tm mismatch below 3-5°C) might need to be tightened. Comparing the SD of the test group with that of the control group in table 3a, it can be seen that, even though the average Tm of the two groups is not very different (58.47 °C for the test group and 59.88 °C for the control group), the difference between Fw and Rv primers within the same group is notably higher in the test group (SD of 2.5309) than in the control group (SD of 0.8053). As reported in the literature (Hyndman and Mitsuhashi, 2003; Yuryev, 2007), the optimal Tm of the primers should not exceed 59 °C in order to hinder the formation of PDs. In this study, the group giving more PDs was the one with the lower Tm but the higher SD, suggesting that what matters more than the Tm itself is that the difference between the Tm values of the two primers within the same primer pair should not be higher than 1°C. Online tools for primer design such as Primer-BLAST from NCBI and Primer3 suggest a Tm mismatch between the two primers of no more than 3-5 °C.
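The difference between the two criteria discussed above can be illustrated with a small check of the Fw/Rv Tm mismatch. This is a minimal sketch: the function names and the three example primer pairs are hypothetical, and the limits are simply the 1 °C value suggested by this study and the upper end (5 °C) of the 3-5 °C range suggested by the design tools.

```python
def tm_mismatch(tm_fw, tm_rv):
    """Absolute Tm difference (°C) between the Fw and Rv primers of a pair."""
    return abs(tm_fw - tm_rv)


def classify_pair(tm_fw, tm_rv, strict_limit=1.0, permissive_limit=5.0):
    """Classify a primer pair by its Tm mismatch against both limits."""
    delta = tm_mismatch(tm_fw, tm_rv)
    if delta <= strict_limit:
        return "within strict limit"
    if delta <= permissive_limit:
        return "within permissive limit only"
    return "outside both limits"


# Hypothetical primer pairs: (Fw Tm, Rv Tm) in °C
example_pairs = {"pair A": (59.8, 60.3),
                 "pair B": (57.1, 60.4),
                 "pair C": (54.0, 60.5)}
for name, (tm_fw, tm_rv) in example_pairs.items():
    print(name, round(tm_mismatch(tm_fw, tm_rv), 1), classify_pair(tm_fw, tm_rv))
```

A pair such as the hypothetical pair B would be accepted by the permissive criterion but rejected by the stricter one, which is the kind of pair this study suggests may still form PDs.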

Concerning the self 3’ complementarity, also known as the 3’-anchored global alignment score, the common recommendation is to keep it as low as possible (close to 0) (Markel and León, 2003). Our statistical analysis shows that even a difference of about 0.3 between the average alignment scores of the control group and the test group is statistically significant. This result is consistent with the literature (Markel and León, 2003; Poritz and Ririe, 2014).

None of the other characteristics tested in this experiment showed a statistically significant difference between the two groups, pointing to Tm and self 3’ complementarity as the two most discriminating values in primer design.

Double peaks

Because sequencing did not yield a sequence of acceptable length for IGFBP3, as was also the case for the PDs, there was not enough material to draw valid conclusions from the alignment of the fragments obtained from IGFBP3 new (the result of the serial dilutions and amplifications that led to the isolation of a peak of unspecific product). The only observation possible in the extracts in appendix 3 is that the stars in the green square in extract one indicate a perfect match between all the sequences, suggesting that the product separated in qPCR as IGFBP3 new (figure 8 - appendix 2 and figure 9b) actually has the same sequence as IGFBP3 old. This could be stated with confidence only if the fragments were of at least the same length and the match were close to 100%. Since IGFBP3 new was actually much shorter than IGFBP3 old, one possible conclusion is that, as for the PDs, unspecific products such as those giving additional peaks are much more difficult to sequence than fragments with a known sequence.

The further analysis of the sequences in search of alternative binding sites did not show two matching sites between the Fw primer and the original sequence of IGFBP3. Aligning the primer with the sequence obtained for IGFBP3 old, however, showed that an alternative binding site had been created during the PCR reactions, leading to the formation of a shorter version of IGFBP3 old, which we observed as IGFBP3 new. The results shown in figures 13 and 14 clearly demonstrate that the Fw primer matches completely at two different sites on the IGFBP3 old sequence. A possible explanation is that a deletion took place during the PCR reactions and led to the formation of the second binding site on IGFBP3, or that the human cDNA used as template was degraded and the Fw primer annealed to the wrong region of the gene during PCR.
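The search for alternative binding sites can be illustrated with a simple exact-match scan. This is a hedged sketch rather than the alignment performed in the study (which used an alignment tool): the primer and both sequences below are invented for illustration, the function name `find_binding_sites` is hypothetical, and only perfect matches on the given strand are reported.

```python
def find_binding_sites(primer, sequence):
    """Return the 0-based start positions of every exact primer match."""
    primer = primer.upper()
    sequence = sequence.upper()
    sites = []
    start = sequence.find(primer)
    while start != -1:
        sites.append(start)
        # Continue one base further so that overlapping matches are also found
        start = sequence.find(primer, start + 1)
    return sites


# Invented sequences: the primer occurs once in the "reference" sequence
# and twice in the "sequenced product".
fw_primer = "AGGCTCATGC"
reference = "TTAGGCTCATGCAACCGTTTGACCAGGA"
product = "TTAGGCTCATGCAACCGTAGGCTCATGCGGA"
print("reference:", find_binding_sites(fw_primer, reference))
print("product:  ", find_binding_sites(fw_primer, product))
```

A primer that matches at one position in the reference but at two positions in the sequenced product reproduces, in miniature, the situation described for the Fw primer and IGFBP3 old.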

These results allow us to say that, in this case, the double peak was actually the result of two different products, which arose from the double binding site on the 1st generation product and resulted in an altered product.

Conclusions

The present work reports the following conclusions:

The described method does not allow repeatability in the production of primer-dimers since they behave in an unpredictable way even though the parameters are the same between different experiments.

A column based method is not sufficient to purify unspecific products such as primer-dimers, since the concentration obtained is too low to proceed with further analysis.

It can be inferred from the statistical results that Tm and self 3’ complementarity should be kept within a stricter range of variability when designing primers.

Double peaks can be the result of two different products amplified during the PCR. Even though the primer pair was designed to bind at only one specific site on the original gene, for some reason (such as cDNA degradation) the same Fw primer bound to two different sites, giving two different versions of the same product.

Future perspectives

Unspecific products are a difficult and still unpredictable subject in qPCR, and the short amount of time available for the project did not allow us to fully observe and analyze all the possible troubleshooting alternatives.

Because Sanger sequencing did not succeed in analyzing PDs under the conditions presented, a future improvement could be to use a sequencing method more appropriate for short fragments. Pyrosequencing could be a more reliable solution, since it has been known for more than a decade that its step-by-step addition of each nucleotide is particularly suitable for sequencing short fragments (Ronaghi, et al., 1998; Vandenbrouke, et al., 2011).

As for multiple peaks as an artifact in qPCR, running control tests with different templates could help to understand whether the formation of an alternative binding site on the same PCR product is due to degradation of the genetic material or to something else.

The unpaired t test conducted on the average, SD and number of samples in this study showed that a difference higher than 1 °C between the Tm values of a primer pair, and even small differences in self 3’ complementarity, play a relevant role in PD formation. A larger number of primer pairs could make these findings more reliable, if they are still confirmed by statistical analysis.

Acknowledgements

The major contribution to the development of this project was provided by Anna Pfister, who supervised me during all my laboratory work and gave me good advice on how to overcome the enormous variety of problems that came up every day. Moral and financial support, fundamental for the healthy management of the project, was given by my parents and Giuseppe; I will never be able to thank them enough for the constant help they give me.

Special thanks go to the staff of TATAA, who were always there with advice when I needed it, as well as to my teachers in Skövde.

References

Brownie, J., Shawcross, S., Theaker, J., Whitcombe, D., Ferrie, R., Newton, C., Little, S., 1997. The elimination of primer-dimer accumulation in PCR. Nucleic Acids Research, 25(16).

Chou, Q., Russell, M., Birch, D.E., Raymond, J., Bloch, W., 1992. Prevention of pre-PCR mis-priming and primer dimerization improves low-copy-number amplifications. Nucleic Acids Research, 20(7), pp. 1717-1723.

Downey, N., 2014. Interpreting melt curves: an indicator, not a diagnosis. IDT® Integrated DNA Technologies. Core concepts, scientific fundamentals explained.

Dwight, Z., Palais, R., Wittwer, C.T., 2011. uMELT, prediction of high resolution melting curves and dynamic melting profiles of PCR products in a rich web application. Bioinformatics, February 7, 2011.

Giglio, S., Monis, P.T., Saint, C.P., 2003. Demonstration of preferential binding of SYBR Green I to specific DNA fragments in real-time multiplex PCR. Nucleic Acids Research, 31(22): e136.

Gupta, A.K. and Gupta, U.D, 2014. Next Generation Sequencing and Its Applications. Models in discovery and translation. Chapter 19, pp. 345-367.

Hongoh, Y., Yuzawa, H., Ohkuma, M., Kudo, T., 2006. Evaluation of primers and PCR conditions for the analysis of 16S rRNA genes from a natural environment. FEMS Microbiology Letters, 221(2), pp. 299-304.

Hsieh, M.H., Tsaih, R., Huang, C.Y., 2006. An intelligent primer design system for multiplex reverse transcription polymerase chain reaction and complementary DNA microarray. Expert Systems with Applications, 30(1), pp. 129-136.

Kimura, Y., de Hoon, M.J.L., Aoki, S., Ishizu, Y., Kawai, Y., Kogo, Y., Daub, C.O., Lezhava, A., Arner, E., Hayashizaki, Y., 2011. Optimization of turn-back primers in isothermal amplification. Nucleic Acids Research, 39(9): e59.

Kretz, K., Callen, W., Hedden, V., 2014. Cycle sequencing. Genome Research. Cold Spring Harbor Laboratory Press 1054-9805/94.

Leonard, J.T., Grace, M.B., Buzard, G.S., Mullen, M.J., Barbagallo, C.B., 1998. Preparation of qPCR product for DNA sequencing. BioTechniques, 24:314-317.

Mao, F., Leung, W.Y., Xin, X., 2007. Characterization of EvaGreen and the implication of its physicochemical properties for qPCR applications. BMC Biotechnology, 7:76.

Mardis, E.R., 2008. Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet., 2008; 9:387-402.

Markel, S., León, D., 2003. Sequence Analysis in a Nutshell. A guide to common tools and databases. O’Reilly & Associates, 2003; pp. 129-131.

Maxam, A.M. and Gilbert, W., 1977. A new method for sequencing DNA. Proceedings of the National Academy of Sciences, 74(2), pp. 560-564.

Nath, K., Sarosy, J.W., Hahn, J., Di Como, C.J., 2000. Effects of ethidium bromide and SYBR® Green I on different polymerase chain reaction systems. Journal of Biochemical and Biophysical Methods. 42(1-2), pp. 15-19.

Norgen Biotek Corporation, 2013. PCR Purification Kit. Product Insert.

Poritz, M.A. and Ririe K.M., 2014. Getting Things Backwards to Prevent Primer Dimers. The Journal of Molecular Diagnostics, 16(2).

Pryor, M.J. and Wittwer, C.T., 2006. Real-time polymerase chain reaction and melt curve analysis. Methods in Molecular Biology, 336:19-32.

Reed, R., Holmes, D., Weyers, J., Jones, A., 2013. Practical Skills in Biomolecular Sciences. 4th ed. Pearson Education.

Ririe, K.M., Rasmussen, R.P., Wittwer, C.T., 1996. Product Differentiation by Analysis of DNA Melting Curves during the Polymerase Chain Reaction. Analytical Biochemistry, 245(2), pp. 154-160.

Ronaghi, M., Uhlén, M., Nyrén, P., 1998. A sequencing method based on real time pyrophosphate. Science, 281 (5375): 363

Satterfield, B.C., 2014. Cooperative Primers: 2.5 Million–Fold Improvement in the Reduction of Nonspecific Amplification. The Journal of Molecular Diagnostics, 16(2).

SantaLucia, J.J., 2007. Physical Principles and Visual-OMP Software for Optimal PCR Design. Methods in Molecular Biology, 402: PCR Primer Design.

Tajadini, M., Panjehpour, M., Javanmard, S.H., 2014. Comparison of SYBR Green and TaqMan methods in quantitative real-time polymerase chain reaction analysis of four adenosine receptor subtypes. Advanced Biomedical Research, 3:85.

Vandesompele, J., 2009. qPCR guide. Eurogentec, pp.12-19.

Wilson, K., Walker, J.M., 2010. Principles and Techniques of Biochemistry and Molecular Biology, Seventh Edition. Cambridge University Press.

Wittwer, C.T., Reed, G.H., Gundry, C.N., Vandersteen, J.G., Pryor, R.J., 2003. High-resolution genotyping by amplicon melting analysis using LCGreen. Clinical Chemistry, 49, pp. 853-860.

Weitschek, E., Santoni, D., Fiscon, G., De Cola, M.C., Bertolazzi, P., Felici, G., 2014. Next generation sequencing reads comparison with an alignment-free distance. BioMed Central Research Notes 2014, 7:869.

Hyndman, D.L., Mitsuhashi, M., 2003. PCR Primer Design. Methods in Molecular Biology™, 226:81-88.

Yuryev, A., 2007. PCR Primer Design Using Statistical Modeling. Methods in Molecular Biology™, 402:93-103.

Vandenbrouke, I., Van Mark, H., Verhasselt, P., Thys, K., Mostmans, W., Dumont, S., Van Fygen, V., Coen, K., Tuefferd, M., Aerssen, J., 2011. Minor Variant Detection in Amplicons Using 454 Massive Parallel Pyrosequencing; Experiences and Considerations for Successful Applications. BioTechniques, 51(3), pp. 167-177.

Appendices

Appendix 1 – Kits and reagents

NORGEN Biotek Corporation Oligo Clean-Up and Concentration Kit (50)

QIAEX II Gel extraction kit (150)

QIAGEN MinElute™ PCR Purification Kit (50)

TATAA SYBR® GrandMaster® Mix

Appendix 2 – Figures

Figure 3: Electropherograms obtained by running the three replicates of primer pair one on the Fragment Analyzer™ for capillary gel electrophoresis. From top: replicate one, replicate two and replicate three. To the right of each electropherogram, the gel image shows two bands: the unspecific product, with a length of 55 bp, and the primers in the lowest band (~20 bp). A 35 – 1500 bp marker was run once and imported for all the other analyses. All the samples look the same, even though the melt curves were different and each peak had a different Tm.

Figure 4: Melt curves of the NTC quadruplicates for primer pair two. The Tm of 76°C suggests that all the replicates gave the same unspecific product, with more or less the same amount of product in each replicate. As could be seen in figure 1, primer pairs that do not have a homogeneous pattern within their replicates also have a melt curve that starts with a hump (red circle) corresponding approximately to the Tm of the primers.

Figure 5: The image shows the 2nd generation product of primer pair one, analyzed for the second time using the 1st generation product as template. The three replicates giving different Tm values in the 1st generation product were analyzed separately and in triplicate, and all 9 resulting samples are shown in the image with the same identical Tm (~71°C). A similar result was obtained with primer pair one replicate three in figure 2b.

Figure 6: The image shows the melt curve for the 1st generation product of IGFBP3 primer pair using human cDNA as template. Two peaks are clearly visible, one at 83°C and another one at 86°C. The expected product is the one included in the higher peak with the lower Tm.

Figure 7: The image shows the melt curve for human cDNA amplified using a CD44 primer pair. The red circle shows a hump that indicates the presence of a different kind of product that may be longer than the target.

Figure 8: Melt curves of the 2nd generation products of both IGFBP3 and CD44. Note the difference from the 1st generation products shown in figures 6 and 7, where the peaks with the lower Tm represented the expected products and the higher ones the unspecific product. The difference in the concentration of the unspecific products between the first and the second generation is evident, although the only parameter that was changed was the nature of the template.

Figure 10: Capillary gel electrophoresis of the CD44 sample purified with the MinElute™ kit. No peak is visible in the electropherogram and no band is visible on the gel to the left, even though the purity and concentration of the sample were quite high. Because of these results, the sample was not considered adequate for further analysis.


[Figure panel labels: IGFBP3, CD44]

Appendix 3 - Sequences

Figure 11: Chromatograms from Sanger sequencing. Primer pair 10 was purified with three different methods and all the purified samples were sequenced in both the Fw and Rv direction. From top: primer pair 10, Rv and Fw fragments purified with Qiagen’s MinElute™ kit; primer pair 10, Rv and Fw fragments purified with the Oligo clean-up and concentration kit from Norgen Biotek; primer pair 10, Rv and Fw fragments purified with the QIAEX II® gel extraction kit from Qiagen. None of the chromatograms presents a clear signal. As can be seen in the 4th and 5th rows, the Fw fragment purified with the Oligo clean-up kit and the Rv fragment purified with QIAEX II® give a series of peaks that the software assembles into an approximate series of nucleotides.


Figure 12: Chromatogram for the original IGFBP3 (old) sequence obtained with cycle sequencing from purified qPCR product. The sequence with the white background is the part that the software kept after clipping, as it was judged to be the only relevant part according to the reliability of each nucleotide shown in the figure. The FASTA sequence is reported in extract 1 (appendix 3), together with its alignment with the sequence obtained from the unspecific product generated in qPCR and with the IGFBP3 cDNA stored in GenBank.

Extract 1

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    AAAGGGCATGCTAAAGACAGCCAGCGCTACAAAGTTGACTACGAGTCTCAGAGCACAGAT
IGFBP3_old_F   --------------------------ATAAG----------CAGTTGTCGCTTCCAAGGC

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    ACCCAGAACTTCTCCTCCGAGTCCAAGCGGGAGACAGAATATGGTCCCTGCCGTAGAGAA
IGFBP3_old_F   AGGAAGCGGGGC------TTCTGCTGGTGTATGGATAATGTCATGCGTGCAGGTAGAGAA

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    ATGGAAGACACACTGAATCACCTGAAGTTCCTCAATGTGCTGAGTCCCAGGGGTGTACAC
IGFBP3_old_F   ATGGAAGACACACTGAATCACCTGAAGTTCCTCAATGTGCTGAGTCCCAGGGGTGTACAC

IGFBP3_new_F   ----------------------------------AGTAAAAAAATTGCGCCTTCCAAGGC
IGFBP3_new2_F  -------------------------------------ATACACGTGTCGCCTTCCAAGGC
IGFBP3_cDNA    ATTCCCAACTGTGACAAGAAGGGATTTTATAAGAAAAAGCAGTGTCGCCCTTCCAAAGGC
IGFBP3_old_F   ATTCTCAACTGTGACAAGAAGGGATTTTATAAGAAAAAGCAGTGTCGCCCTTCCAAAGGC
                                                    *      *  * * * * *****

IGFBP3_new_F   AGGAAGCGGGGCTTCTGCTGGTGTACGGATACATTCTGCTGTTCTACAGAGCTTCT----
IGFBP3_new2_F  AGGAAGCGGGGCTTCTGCTGGTGTACGGATCATTCTGCTGTTGTACACAAACTTCT----
IGFBP3_cDNA    AGGAAGCGGGGCTTCTGCTGGTGTGTGGATAAGTATGGGCAGCCTCTCCCAGGCTACACC
IGFBP3_old_F   AGGAAGCGGGGCTTCTGCTGGTGTATGGATAATTATGGGCAGTATAGAAATGGAAGACAC
               ************************  ****   *

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    ACCAAGGGGAAGGAGGACGTGCACTGCTACAGCATGCAGAGCAAGTAGACGCCTGCCGCA
IGFBP3_old_F   ACTGAATCACCTGAAATTCCTCAATGTGCTGAGTCCC-AGGGGTGTACACATT--C----

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    AGGTTAATGTGGAGCTCAAATATGCCTTATTTTGCACAAAAGAC-TGCC--AAGGACATG
IGFBP3_old_F   ----TCAACTGTGA---CAAGAAGGGATTTTATAAAAAAAAGCATTGTCTCCCTTCCAAA

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  ------------------------------------------------------------
IGFBP3_cDNA    ACCAGCAG--CTGGCTACAGCCTCGATTTATATTTCTGTTTGTGGTGAACTGATTTTTTT
IGFBP3_old_F   GGCAGGAAGCGGTGCTACTGCTGGTGTTTG--TTTTTTTTTG----GGAATG--------

Extract 2

IGFBP3_new_F   --------------GATGTACTGCTGT---------------------------------
IGFBP3_new2_F  --------------GATGTACTGCTGT---------------------------------
IGFBP3_cDNA    TAAACCAAAGTTTAGAAAGAGGTTTTTGAAATGCCTATGGTTTCTTTGAATGGTAAACTT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   --GCTGCTGGTGTCCTGTTCTTC-------------------------------------
IGFBP3_new2_F  --GCTGCTGCTGTCCTGTTCTTC-------------------------------------
IGFBP3_cDNA    TGTCTTCAAGTGACCTGTACTGCTTGGGGACTATTGGAGAAAATAAGGTGGAGTCCTACT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   -------------------TGCTGTTGTACTGCTCTGCTTTTGCAATTGTGACAAGAAGG
IGFBP3_new2_F  -------------------TGCTGTTGTACTGTGCTGCTTTTGCAATTGTGACAAGAAGG
IGFBP3_cDNA    TGTTTAAAAAATATGTATCTAAGAATGTTCTAGGGCACTCTGGGAACCTATAAAGGCAGG
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   GATTTTATTCTGCTGCTGTTCTGTCCCT-TACA---------------------------
IGFBP3_new2_F  GATTTTACTCTACTGCTGCTCTGGTCCTCTGCT---------------------------
IGFBP3_cDNA    -TATTTCGGGCCCTCCTCTTCAGGAATCTTCCTGAAGACATGGCCCAGTCGAAGGCCCAG
IGFBP3_old_F   ------------------------------------------------------------

Extract 3

IGFBP3_new_F   --------GCTGAGGAAGTGTGGTGCTGGTGGGGTGATGTGCTGCTGGGGGGCTGAGGT-
IGFBP3_new2_F  --------GCTGTGGAAGTGTGCTGTCGCTCGGGTGATGTGCTGCTGGGGGTCTGAGGT-
IGFBP3_cDNA    TGGCCATGACTGAGGAAAGGAGCTCACGCCCAGAGACTGGGCTGCTCTCCCGGAGGCCAA
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   ---GCTGCTGGGGGGCGGT--TCTGCTGGTGGGGGGCTCTG--------CTGCCCTTCCC
IGFBP3_new2_F  ---GGTGGTGGGCGGGGGG--GGTGGCGGTGGGCTGCGGGT--------CTTCTTTTCGG
IGFBP3_cDNA    ACCCAAGAAGGTCTGGCAAAGTCAGGCTCAGGGAGACTCTGCCCTGCTGCAGACCTCGGT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   ATCCACTCCTCCTC---------CCCCGCG----------------------------CT
IGFBP3_new2_F  GCCGGCACCCAAAC---------AACAAAAAAAAAACAAAAAACGCAATCCACCC---CC
IGFBP3_cDNA    GTGGACACACGCTGCATAGAGCTCTCCTTGAAAACAGAGGGGTCTCAAGACATTCTGCCT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   TGTTATCTTC---TTTTTTTGTTTTTTAT--ATCTAGGATTAAGGTGTTGGAGAGG----
IGFBP3_new2_F  AGCACTCGGA---GGTTTAATCTTGTGCT--ACAAAGGAATCCTGAGAGTCCTTTTATTT
IGFBP3_cDNA    ACCTATTAGCTTTTCTTTATTTTTTTAACTTTTTGGGGGGAAAAGTATTTTTGAGAAGTT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  TCGCTCCCCCCCCCATCATAATCATAAAATAAATCAAACA----TCAACCTATCTTTATA
IGFBP3_cDNA    TGTCTTGCAATGTATTTATAAATAGTAAATAAAGTTTTTACCATTAAAAAAATATCTTTC
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   ------------------------------------------------------------
IGFBP3_new2_F  TAATTTTTTTTAACCCAC------------------------------------------
IGFBP3_cDNA    CCTTTGTTATTGACCATCTCTGGGCTTTGTATCACTAATTATTTTATTTTATTATATAAT
IGFBP3_old_F   ------------------------------------------------------------

IGFBP3_new_F   ----------------------------------------------
IGFBP3_new2_F  -------------------AAACCCAAAACGACACAAC--------
IGFBP3_cDNA    AATTATTTTATTATAATAAAATCCTGAAAGGGGAAAATAAAAAAAA
IGFBP3_old_F   ----------------------------------------------

Figure 13: The figure shows the sequence of the IGFBP3 fragment obtained with cycle sequencing, aligned with the Fw primer to search for alternative binding sites. The same sequence is present at two different sites on the analyzed fragment (yellow mark on top and orange mark on bottom).
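
The search for alternative binding sites described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the alignment tool used in the thesis: the function name, the toy template and the toy primer are all hypothetical, only the forward strand is scanned, and only substitutions (no gaps) are tolerated.

```python
# Illustrative sketch: the sequences below are toy data, not the thesis sequences.

def find_binding_sites(template: str, primer: str, max_mismatches: int = 0) -> list[tuple[int, int]]:
    """Return (0-based start, mismatch count) for every window of the template
    that matches the primer with at most max_mismatches substitutions."""
    template, primer = template.upper(), primer.upper()
    sites = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        # Count positions where the window and the primer disagree.
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if mismatches <= max_mismatches:
            sites.append((start, mismatches))
    return sites


toy_template = "TTGACCGGAAGCTTACGTACCGGAAGCTAAG"  # hypothetical fragment
toy_primer = "CCGGAAGCT"                          # hypothetical Fw primer

for start, mismatches in find_binding_sites(toy_template, toy_primer):
    print(f"site at position {start} ({mismatches} mismatches)")
```

More than one reported site for a single primer would correspond to the situation marked in yellow and orange in the figure. Raising `max_mismatches` makes the search more permissive, at the cost of more spurious hits on short primers.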


Figure 14: The figure shows the alignment between the IGFBP3 sequence obtained in our experiment, the Fw primer and the original sequence of IGFBP3 from GenBank. The orange and yellow rectangles on top mark the sequence shared by the three. In the red rectangle at the bottom, IGFBP3first (1st generation product) shows a tract with severe deletions compared to the original sequence; these deletions produce a sequence identical to that of the Fw primer, creating an alternative binding site (see figure 13 for comparison).


Appendix 4 - Tables

Table 1a: Characteristics of the primers involved in the study as control group.

      Forward                                 Reverse
Name  Length  Tm     GC%    Comp  Self 3’ C   Length  Tm     GC%    Comp  Self 3’ C
1     24      60,36  46,23  4     0           23      60,37  43,48  3     0
2     20      60,46  60     4     0           21      60,12  52,38  3     0
3     20      59,09  60     2     0           20      59,03  50     4     0
4     16      60,57  75     3     0           19      60,08  63,12  4     0
5     22      59,57  50     2     0           21      59,08  52,38  3     0
6     20      59,53  55     4     0           20      61,35  55     4     0
7     19      59,47  53,03  2     0           20      59,39  55     4     0
8     19      61,17  58,29  2     0           21      59,24  48,02  2     0
9     24      58,23  37,05  4     0           19      59,07  53,03  4     0
10    20      59,02  55     2     1           23      59,06  39,13  3     0
11    20      60,08  55     3     0           20      61,08  60     2     0
12    20      61,02  55     3     1           20      60,15  55     2     1
13    23      60,05  48,23  4     0           20      60,37  50     4     0
14    21      60,48  52,38  4     0           20      61,05  55     3     1
15    24      60,12  42,07  3     0           21      60,19  52,38  4     0
16    23      59,22  43,48  4     0           20      60,07  50     4     0
17    18      58,41  61,11  4     0           21      59,08  52,38  3     1
18    19      61,15  63,16  4     1           20      59,45  55     4     1
19    20      59,06  55     3     1           20      60,36  55     4     0

The table reports the values of five characteristics for both the Fw and the Rv primer of each primer pair. Each primer pair is indicated by a number (Name). The primer length is expressed in bases, Tm in °C and GC content as a percentage; complementarity (Comp) and self 3’ complementarity (Self 3’ C) are expressed as alignment scores.
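
As a hedged illustration of how the Tm columns can be compared within each pair, the sketch below parses the decimal-comma values and computes the absolute Tm difference between the Fw and Rv primer for control pairs 1 to 3. The helper names and the 3.0 °C threshold are assumptions made for the example; the table itself does not state a threshold.

```python
# Illustrative sketch only: helper names and the 3.0 °C threshold are assumptions.

def parse_decimal_comma(text: str) -> float:
    """Convert a decimal-comma string such as '60,36' to a float."""
    return float(text.replace(",", "."))


def tm_mismatch(fw_tm: str, rv_tm: str) -> float:
    """Absolute Tm difference (°C) between the Fw and Rv primer of one pair."""
    return abs(parse_decimal_comma(fw_tm) - parse_decimal_comma(rv_tm))


# Tm values (Fw, Rv) of control primer pairs 1-3, copied from Table 1a.
control_tm = {
    "1": ("60,36", "60,37"),
    "2": ("60,46", "60,12"),
    "3": ("59,09", "59,03"),
}

MAX_MISMATCH_C = 3.0  # assumed design threshold, not a value taken from the table

for name, (fw, rv) in control_tm.items():
    diff = tm_mismatch(fw, rv)
    status = "within" if diff <= MAX_MISMATCH_C else "above"
    print(f"pair {name}: Tm difference = {diff:.2f} °C ({status} threshold)")
```

The same loop can be run over the test group in Table 1b to compare the spread of Tm differences between the two groups.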


Table 1b: Characteristics of the primers involved in the study as test group.

      Forward                                 Reverse
Name  Length  Tm     GC%    Comp  Self 3’ C   Length  Tm     GC%    Comp  Self 3’ C
1     17      57,04  65,11  4     0           15      56,53  73,33  2     0
2     20      58,42  55     2     0           20      61,22  55     3     0
3     20      57,34  50     4     0           18      59,26  55,56  3     0
4     21      62,15  57,14  3     1           20      64,14  65     2     1
5     20      61,26  60     2     0           20      62,02  60     4     0
6     22      59,02  50     3     1           20      59,12  55     2     0
7     20      57,38  50     4     0           20      59,26  55     5     1
8     21      59,38  52,38  4     1           22      60,17  45,45  3     0
9     23      59,44  43,48  2     0           21      59,25  52,38  3     0
10    20      55,12  45     4     0           20      56,29  50     3     0
11    20      54,54  45     4     0           20      56,21  45     4     0
12    21      55,22  43,26  5     1           18      55,27  50     2     2
13    20      55,19  45     5     1           20      54,59  45     3     2
14    20      59,16  50     2     0           20      60,54  55     4     0
15    21      60,34  52,38  4     0           19      60,45  58,38  4     0
16    20      59,25  43,26  2     1           21      60,19  48,02  4     1
17    21      54,26  43,26  5     2           20      55,09  45     5     5
18    20      60,39  55     2     0           20      60,29  60     3     0
α     18      60,44  67,07  4     1           20      59,06  55     2     0
β     20      60,25  55     2     0           20      60,25  55     2     0

The table reports the values of five characteristics for both the Fw and the Rv primer of each primer pair. Each primer pair is indicated by a number (Name). The primer length is expressed in bases, Tm in °C and GC content as a percentage; complementarity (Comp) and self 3’ complementarity (Self 3’ C) are expressed as alignment scores. The primer pairs α and β showed a behavior attributable to both groups in different tests.

Table 3b: SDs and averages between Fw and Rv primers for all the considered characteristics for the special groups α – β.*

Group            Length (b)    Tm (°C)     GC%          Self Comp     Self 3’ Comp
T α – β Average  19,975        58,61975    52,7615      3,25          0,525
T α – β SD       1,349026240   2,44876157  7,217282517  1,056117709   0,9604352646
C α – β Average  20,45238095   59,8964285  53,65238095  3,214285714   0,2142857143
C α – β SD       1,670437082   0,78468518  7,062958963  0,842056550   0,4152997322

*The special groups α – β contain two additional primer pairs (primer pair α and primer pair β). These were used in both groups because they behaved like the Test primer pairs in the first tests and gave results attributable to the Control primer pairs in the other tests, which used different working solutions from the same stock. Length is expressed in bases.
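
To make the link between these summary values and an unpaired t test concrete, the sketch below computes a pooled-variance (Student) t statistic from averages and SDs alone, using the Tm column as input. Several things are assumptions rather than facts from the thesis: the group sizes (40 and 42 primers, i.e. Fw and Rv values pooled over 20 test and 21 control pairs, as the tables suggest), the equal-variance form of the test, and the SDs being sample SDs. The sketch is not claimed to reproduce the p-values reported in the thesis.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Student's unpaired t statistic and degrees of freedom from summary
    statistics, assuming equal variances and sample (n - 1) standard deviations."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2


# Tm averages and SDs from Table 3b (decimal commas written as points).
# ASSUMPTION: n = 40 (test) and n = 42 (control) primers; the table does not state n.
t_value, dof = pooled_t_statistic(
    58.61975, 2.44876157, 40,   # T α – β: average, SD, assumed n
    59.8964285, 0.78468518, 42  # C α – β: average, SD, assumed n
)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

A p-value would then be read from the t distribution with the printed degrees of freedom, for example with a statistics package. Because the two SDs differ considerably, a Welch-type test that does not assume equal variances may be worth comparing.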
