Comparison of Middle Eastern Bedouin

Genotypes with Previously Studied Populations

Using Polymorphic Alu Insertions

Alison Patricia Pitt (BA, GDipForSci)

Centre for Forensic Science

University of Western Australia

This thesis is presented in partial fulfilment of the requirements for the

Master of Forensic Science

2008

I declare that the research presented in this 36 point thesis, as part of the 96 point

Master degree in Forensic Science, at the University of Western Australia, is my own

work. The results of the work have not been submitted for assessment, in full or part,

within any other tertiary institute, except where due acknowledgement has been made

in the text.

…………………………………………………

Acknowledgments

I am grateful to my supervisors, Dr. Guan Tay and Associate Professor Ian Dadour, for all the support and guidance they provided during my research, and for the opportunity to pursue the work described in this thesis.

I am also very grateful to Habiba Al-Safar for her collaboration and help during my visit to Dubai, United Arab Emirates (UAE), and to Dr. Kamal A. Khazanehdari for allowing me the use of his wonderful laboratory and equipment at the Central Veterinary Research Laboratory (CVRL), Molecular Biology and Genetics department, in Dubai, UAE.

My thanks go to all the staff at the CVRL for their patience with me in the lab while guiding me through this learning process, and for all their intellectual support. They have provided me with the knowledge to help me get further in this field of research.

My sincere gratitude to all my colleagues at the Centre for Forensic Science, Stephen Iaschi, Catherine Rinaldi, Rebecca Ford, Yvette Hitchen and Ha Nguyen, for providing me with beginner's tips and lending their knowledge so that my journey into forensic DNA could be an easy and enjoyable one.

Abbreviation and Symbols

μl – Microlitre

°C – Degrees Celsius

4AOH – Fourth Asia Oceanic Histocompatibility

7SL RNA – An abundant cytoplasmic RNA that functions in protein secretion

A – Adenine

Alu – Arthrobacter luteus restriction enzyme

bp – Base pair

C – Cytosine

CDSN – Corneodesmosin

DNA – Deoxyribonucleic acid

dNTPs – Deoxyribonucleotide triphosphates

G – Guanine

HGP – Human Genome Project

HLA – Human leucocyte antigen

HWE – Hardy-Weinberg Equilibrium

Indel – Insertion/deletion

kb – Kilobase

LD – Linkage disequilibrium

LINE – Long interspersed element

LTR – Long terminal repeat

MgCl2 – Magnesium chloride

MHC – Major histocompatibility complex

MIC – MHC Class I chain related

mM – Millimolar

MW – Molecular weight

Mya – Million years ago

ng – Nanogram

nmol – Nanomole

PCR – Polymerase chain reaction

POALIN – Polymorphic Alu insertion

RNA – Ribonucleic acid

SINE – Short interspersed element

SLP – Single locus probe

SNP – Single nucleotide polymorphism

T – Thymine

TFIIH – Transcription factor II H

Tris-HCl – Tris-Hydrochloride

tRNA – Transfer ribonucleic acid

Definitions

Allele:

One of two or more alternate forms of a gene that occupy the same locus in the genome.

Allele Frequency:

The frequency with which one form of a gene (an allele) occurs within a given population.

Alu:

A short transposable element that makes up more than 10% of the human genome. Alu elements are class I retroelements that do not encode protein and as such are nonautonomous elements.

Annealing:

Often used to describe the binding of a DNA probe, or the binding of a primer to a DNA

strand during a polymerase chain reaction (PCR).

Autonomous Transposable Elements:

A transposable element that encodes the protein necessary for its own transposition and for the transposition of nonautonomous elements of the same family.

Blastocyst:

The structure formed in early embryogenesis.

Denaturation:

The separation of the two strands of a DNA double helix, or the severe disruption of the structure of any complex molecule without breakage of the major bonds of its chains.

DNA:

Deoxyribonucleic Acid. A double chain of linked nucleotides; the fundamental substance of which genes are comprised.

Electrophoresis:

The process of separating charged molecules through a gel matrix by the application of

an electric field. The gel matrix allows molecules to be separated on the basis of size.

Embryogenesis:

The process by which the embryo is formed and develops.

Genome:

The entire collection of genetic information in an organism.

Genotype:

The inherited genetic information that is stored within an organism and contributes to the physical characteristics that determine its chances of survival and reproduction.

Genotype Frequency:

The proportion or frequency of any particular genotype among the individuals of a

population.

Haplotypes:

A genetic class described by a sequence of DNA or of genes that are together on the same

physical chromosome.

Hardy-Weinberg Equilibrium:

The stable frequency distribution of the genotypes A/A, A/a and a/a in the population, in the proportions p², 2pq and q², that is a consequence of random mating in the absence of mutation, migration, natural selection or random drift.

Heterozygous:

A gene pair having different alleles in the two chromosome sets of the diploid individual.

Homozygous:

Refers to the state of carrying a pair of identical alleles at the one locus.

Hybridisation:

The process in which two complementary nucleotides, a purine and a pyrimidine, bind through the hydrogen bonds that form between them.

Indel:

A mutation in which one or more nucleotide pairs are added or deleted.

Linkage Disequilibrium:

The non-random association of alleles at different loci. Occurs when the genotypes at the two loci are not independent of one another.

Long Interspersed Repeated DNA (LINE):

A type of class I transposable element that encodes a reverse transcriptase. LINEs are also called non-LTR retrotransposons.

Locus:

The position of a gene or chromosome segment on a chromosome. Alleles are located at

identical loci on homologous chromosomes.

Messenger RNA:

Carries information from DNA to structures called ribosomes. These ribosomes are made from proteins and ribosomal RNAs, which together can read messenger RNA and translate the information it carries into proteins.

Multi-Locus Probe (MLP):

A probe created by Alec Jeffreys that hybridises to a number of different sites in the

genome of an organism to compare selected sequences of single stranded DNA.

Nonautonomous Elements:

A transposable element that relies on the protein products of autonomous elements for its mobility.

Oocyte:

A female germ cell involved in reproduction.

Phenotype:

Any observable characteristic of an organism, such as its morphology, development, biochemical or physiological properties, or behaviour. Phenotypes result from the expression of an organism’s genes as well as the influence of environmental factors.

Polymerase Chain Reaction (PCR):

An in vitro method of amplifying a specific DNA segment that uses two primers that

hybridise to opposite ends of the segment in opposite polarity and, over successive cycles,

results in the replication of that segment only.

Polymorphism:

The simultaneous occurrence of two or more allelic forms within the population.

Retro Elements:

The general name for the class I transposable elements that move through an RNA

intermediate.

Retrotransposition:

A mechanism of transposition characterised by the reverse flow of information from

RNA to DNA.

Retrotransposons:

A transposable element that uses reverse transcriptase to transpose through an RNA

intermediate.

Retrovirus:

An RNA virus that replicates by first being converted into double-stranded DNA.

Reverse Transcriptase:

An enzyme that catalyses the synthesis of a DNA strand from an RNA template.

Ribonucleic Acid (RNA):

RNA is very similar to DNA, but differs in a few important structural details. In the cell RNA is usually single stranded and contains ribose rather than deoxyribose. RNA is transcribed from DNA by enzymes called RNA polymerases and is generally further processed by other enzymes. RNA is central to the synthesis of proteins.

Short Interspersed Nuclear Elements (SINE):

A type of Class I transposable element that does not encode reverse transcriptase but is thought to use the reverse transcriptase encoded by LINEs.

Short Tandem Repeats (STRs):

A class of polymorphisms that occurs when a pattern of two or more nucleotides is repeated and the repeated sequences are directly adjacent to each other.

Single-Locus Probe (SLP):

A DNA or RNA sequence that is able to hybridise with DNA from a specific restriction fragment on a Southern blot, depending on complementary base pairs and probe sequence. SLPs are usually tagged with radioactive labels for easy detection, and are chosen to detect one polymorphic genetic locus on a single chromosome.

Single Nucleotide Polymorphisms (SNPs):

A nucleotide-pair difference at a given location in the genomes of two or more naturally occurring individuals.

Translocation:

The relocation of a chromosomal segment to a different position in the genome.

Variable Number Tandem Repeats (VNTRs):

A location in a genome where a short nucleotide sequence is organised as a tandem

repeat. These can be found on many chromosomes, and often show variations in length

between individuals.
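As an illustration of the Allele Frequency and Hardy-Weinberg Equilibrium definitions above, the following is a minimal Python sketch of how allele frequencies can be derived from genotype counts at a biallelic locus, and how the expected p², 2pq and q² genotype proportions follow from them. The function names and the genotype counts are hypothetical and chosen only for illustration; they are not the data or the analysis software used in this thesis.

```python
def allele_frequencies(n_11, n_12, n_22):
    """Return (p, q) for a biallelic locus from genotype counts.

    n_11: individuals homozygous for allele 1
    n_12: heterozygous individuals
    n_22: individuals homozygous for allele 2
    """
    total_alleles = 2 * (n_11 + n_12 + n_22)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    # Each homozygote carries two copies of its allele, each heterozygote one.
    p = (2 * n_11 + n_12) / total_alleles
    q = (2 * n_22 + n_12) / total_alleles
    return p, q


def hardy_weinberg_expected(p, q):
    """Return the expected genotype proportions (p^2, 2pq, q^2)."""
    return p * p, 2 * p * q, q * q


# Hypothetical counts for 50 individuals (illustrative only).
p, q = allele_frequencies(n_11=40, n_12=9, n_22=1)
expected = hardy_weinberg_expected(p, q)
print("p =", round(p, 3), "q =", round(q, 3))
print("expected genotype proportions:", [round(x, 4) for x in expected])
```

Comparing these expected proportions with the observed genotype counts (for example with a chi-square test) is one common way of assessing whether a locus departs from Hardy-Weinberg Equilibrium.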

Table of Contents

Declaration
Acknowledgment
Abbreviation and Symbols
Definitions
Table of Contents
List of Tables
List of Figures
Abstract

CHAPTER 1 INTRODUCTION
1.1 Introduction
1.2 History of Genetics
1.3 Fundamentals of Genetic Structure
1.3.1 DNA Principles
1.3.2 DNA Structure
1.3.3 Base Pairing
1.4 Human Genome Project
1.5 DNA Profiling
1.6 Population Variation
1.7 Principles of Population Genetics
1.8 Gene Clusters
1.9 Major Histocompatibility Complex (MHC)
1.10 MHC Ancestral Haplotypes
1.11 SNPs vs Indels
1.12 Retroelements and Repeat Sequences
1.13 SINEs – Short Interspersed Nuclear Elements
1.14 Polymorphic Alu Insertions (POALINs)
1.15 Profiling Ethnicity in Forensic DNA
1.16 Previous Population Studies
1.17 Hardy-Weinberg Equilibrium
1.17 Bedouin Culture
1.18 Aims for this Thesis

CHAPTER 2 MATERIALS AND METHODS
2.1 Genomic DNA
2.2 POALIN PCR Assay
2.2.1 PCR Reaction
2.2.2 Cycling Conditions
2.2.3 Electrophoresis
2.3 Genotype and Phenotype Analysis

CHAPTER 3 RESULTS
3.1 Location of POALINs within the MHC Class I Region
3.2 Distribution of POALIN Allele Frequencies in Bedouin Population
3.3 Hardy-Weinberg Equilibrium

CHAPTER 4 DISCUSSION
4.1 MHC POALIN Associations
4.2 Population Comparison
4.3 Hardy-Weinberg Conditions
4.4 Conclusion

REFERENCES
APPENDIX A

List of Tables

Table 1.1: A synopsis on the history of genetics
Table 2.1: Primer sequence and product size of the four POALIN loci.
Table 3.1: Observed genotypes, allele frequencies, Hardy–Weinberg significance and heterozygosity for AluyMICB, AluyTF, AluyHJ and AluyHF in an Arab Bedouin Population
Table 3.2a: Allele frequency comparison for AluyMICB.
Table 3.2b: Allele frequency comparison for AluyTF.
Table 3.2c: Allele frequency comparison for AluyHJ.
Table 3.2d: Allele frequency comparison for AluyHF.

List of Figures

Figure 1.1: Mendelian Inherited traits as dominant and recessive phenotypes.
Figure 1.2: Illustration of hydrogen bonds during hybridisation between adenine and thymine, and guanine and cytosine.
Figure 1.3: Illustration showing the difference between a sequence polymorphism and a length polymorphism.
Figure 1.4: Illustration of the Major Histocompatibility Complex (MHC): Chromosome 6.
Figure 1.5: Illustration of a SNP.
Figure 1.6: Illustration of an insertion and a deletion at the chromosome level.
Figure 1.7: Structural comparison of a retrovirus to types of transposable elements in the human genome
Figure 2.1: Gel photographic presentation of the variation between the four MHC Class I POALINs and product size.
Figure 3.1: Location of the four POALINs within the short arm of chromosome 6 within the MHC.
Figure 3.2: Gel photographic presentation of AluyMICB.
Figure 3.3: Gel photographic presentation of AluyTF.
Figure 3.4: Gel photographic presentation of AluyHJ.
Figure 3.5: Gel photographic presentation of AluyHF.
Figure 3.6: Gel photographic presentation of all four MHC POALINs, verifying their ability to produce both product sizes and heterozygosity.
Figure 4.1: Population comparison for AluyMICB allele genotype frequencies among the populations.
Figure 4.2: Population comparison for AluyTF allele genotype frequencies among the populations.
Figure 4.3: Population comparison for AluyHJ allele genotype frequencies among the populations.
Figure 4.4: Population comparison for AluyHF allele genotype frequencies among the populations.

Appendix

Figure 3.7: Gel photographic presentation of the second electrophoresis run of AluyMICB.

Abstract

Polymorphic Alu insertions (POALINs) are known to contribute to the variation and

genetic diversity of the human genome. In this report specific POALINs of the Major

Histocompatibility Complex (MHC) were studied. Previous population studies on the

MHC POALINs have focused on individuals of African, European and Asian descent.

In this study, we expand the research by studying a new and previously

uncharacterised population, focusing on the Bedouin from the Middle East.

Specifically we report on the individual insertion frequencies of four POALINs

within the MHC class I region of this population.

POALINs are members of a young Alu subfamily that have only recently been inserted into the human genome. At any particular site a POALIN is either present or absent. Individuals that share the insertion (or its absence) inherited it from a common ancestor, making Alu alleles identical by descent. In population genetics, populations can therefore be compared by measuring the lengths of the PCR products in a series of unrelated individuals, which reveals polymorphism in the presence or absence of the Alu repeat.

As a direct result of their abundance and sequence identity, Alu elements promote genetic recombination events that are responsible for large-scale deletions, duplications and translocations. The deletions occur mostly in A-T rich regions and have been found to be unlikely to have arisen independently of the insertion of the Alu elements (Callinan et al, 2005). Because they are easy to genotype, POALINs have proven very valuable as lineage markers for the study of human population genetics, pedigrees and forensics, as well as genomic diversity and evolution. POALINs have been used in a range of applications, primarily the anthropological analysis of human populations. Given their ease of use and their utility as markers in human evolution studies, combining POALINs with other markers used in forensics could lead to improved identity testing in forensic science. More specifically, in combination with more traditional markers, race specific genotypes and haplotypes could be used for profiling crime scene samples.

This thesis reports on the frequencies of four POALINs: AluyMICB, AluyTF, AluyHJ and AluyHF. The simplicity of POALINs as markers comes from their being scored as the presence or absence of the insertion. Allele*1 is represented by the smaller band after electrophoresis and indicates the absence of the POALIN insertion. For all four POALINs, Allele*1 had the higher frequency in all populations studied. Allele*2 is represented by the larger band and indicates the presence of the POALIN insertion. In this study, the majority of the Bedouin population showed no sign of the presence of the four POALINs used.

Of the four POALINs, the highest frequency of the insertion allele (Allele*2) in the Middle Eastern Bedouin population was 0.100 for AluyMICB and the lowest was 0.056 for AluyTF. AluyHJ and AluyHF both showed no variation, possibly because the sample size used for the study was inadequate. Comparisons were made with other populations from previous studies. The Bedouin allele frequency for AluyMICB (0.100) showed a strong similarity to that of the North Eastern Thai (0.117) and was comparable to those of the Australian Caucasian (0.157) and the South African Sekele San (0.050). The Bedouin allele frequency for AluyTF (0.056) showed a strong similarity to those of the Malaysian Chinese (0.040) and the North Eastern Thai (0.086), and was also comparable with that of the South African Sekele San (0.034).

Anthropologically, this may suggest that the Middle Eastern Bedouin population diverged from the Asian population after the split from Africa. This is supported by previously reported molecular data using various types of genetic markers. In a study using six separate Alu insertions, Antunez-de-Mayolo et al were able to generate a phylogenetic tree in which the biogeographical groups followed a pattern. The tree started with African populations, which were found to relate closely to the hypothetical ancestral African population. The African populations were then followed, in order, by Southwest Asian populations and European populations, which include Middle Eastern groups (Antunez-de-Mayolo et al, 2002).

This study shows the similarities and differences between the allele frequencies of the Middle Eastern Bedouin and those of the other populations compared. Although the results were not conclusive, the information from the POALINs, together with information from other genetic markers, can lead to further research on the Bedouin population and to improvement of the forensic population database, so that the ethnic background of individual samples can be tested accurately.

Chapter 1 Background

1.1 Introduction

DNA profiling has created controversy in the courtroom since it was first introduced in the mid 1980s. Although it was used in many cases, the courts remained sceptical about the credibility of the techniques used and about how scientists reached their results. Over the years the methods were gradually replaced with new and more complex techniques, which used chromosome markers and subpopulation examination, but these could still be challenged in the courtroom. Today's techniques have taken another step forward and allow more accurate comparison among DNA samples. Subpopulation testing has also made its way back into DNA profiling methods and has proved to be fundamental in DNA comparison (Lynch, 2003).

In this thesis, we studied Polymorphic Alu Insertions (POALINs) within the human Major Histocompatibility Complex (MHC) to establish similarities and differences between the Middle Eastern Bedouin subpopulation and other populations previously studied with the same markers. The aim was to apply DNA profiling to a subpopulation, using Alu markers, for more accurate identification and analysis of DNA samples.

1.2 History of Genetics

Some of the first diversity studies were done over 150 years ago and were published by Charles Darwin in “On the Origin of Species” (Darwin, 1859). This publication introduced the Theory of Evolution and the concept of Natural Selection. But one of the chief difficulties for Darwin and other naturalists of his time was that there was no agreed upon model of heredity. The idea of heredity had not been completely separated conceptually from the idea of the development of the organism. Darwin himself saw variation and heredity as two essentially opposed forces, with heredity working to preserve the fixity of a type rather than acting as the agent of species variability.

As scientists debated the concept of variation and heredity, Darwin himself produced the theory of Pangenesis, which essentially was a model of 'blended' heredity. The effects of used and disused body parts from the parent were transmitted to the child, and the contributions from each parent were roughly equal (Darwin, 1859). But under this model, could a species still evolve over time?

It was in 1866 that genetics took a step forward, and it is generally held that it started with the work of Gregor Mendel and his pea plants. His theories on heritability in plant hybridisation were soon published and became known as Mendelian Inheritance (Figure 1.1). This basic principle was later applied to a wide variety of organisms and was developed by geneticists into the Mendelian-Chromosome Theory of Heredity (Griffiths et al, 2005). This principle was widely accepted by 1925, which brought forth the statistical framework of population genetics and its use in explaining evolution. With the basic pattern of inheritance now understood, biologists were able to focus on the physical nature of the gene. It was in 1953 that the first models of viruses, bacteria, and the double helical structure of DNA were described (Watson and Crick, 1953; Perutz, et al., 1969; Olby, 2003). In 1986 Kary Mullis brought genetics another step forward with the development of the polymerase chain reaction (PCR). PCR allowed researchers to amplify DNA more easily, which enabled DNA fingerprinting and gene therapy to develop in the 1990s (McGill, 2000), and provided the equipment needed for the Human Genome Project to be set in motion, leading to the era of molecular genetics.

The increase in technology and the introduction of automated sequencing technology in 1995 helped to further research and understanding of the genome. In 1996 at the Roslin Institute in Scotland, Ian Wilmut and his colleagues began using a technique known as somatic cell nuclear transfer, which allowed the cell nucleus from an adult cell to be transferred into an unfertilised oocyte that had had its nucleus removed. The hybrid cell was then stimulated to divide by an electric shock, and when it developed into a blastocyst it was implanted in a surrogate mother. It was this technique that allowed Wilmut and his colleagues to clone the first mammal from an adult cell, known to the world as Dolly. The creation of Dolly has led to continuing controversy over the use of stem cell technology and the possibility of further animal cloning and of human cloning.

1.3 Fundamentals of Genetic Structure

1.3.1 DNA Principles

To understand how our genetic and physical traits are passed on from our family, one

must understand how the genetic structure of human DNA takes form. An average

human is composed of approximately 100 trillion cells, all of which originated from a

single cell. Each cell contains individual genetic coding and within the cell is the

nucleus which is the control centre. Within the nucleus there is a chemical substance

known as Deoxyribonucleic acid (DNA), which is considered the genetic blueprint as

it stores the information necessary for passing down genetic attributes to future

generations (Watson and Crick, 1953; Benham and Mielke, 2005).

The encoded information within the DNA structure is passed from generation to generation, with one half of a person's DNA information coming from their mother and the other half from their father. DNA has two primary purposes: one is to make copies of itself so cells can divide and carry on the same information; the other is to carry instructions on how to make proteins so cells can build the machinery of life (Griffiths et al, 2005; Watson and Crick, 1953; Benham and Mielke, 2005). DNA accomplishes this by carrying the information for making all of the cell's proteins. These proteins implement all of the functions of a living organism and determine the organism's characteristics. To pass on the information the cell reproduces by making copies of itself; in doing so it passes all of the information it carries on to the daughter cells created. Before the cell can reproduce, it must first replicate its DNA.

Figure 1.1: Mendelian inherited traits as dominant and recessive phenotypes. The parental generation passes down both the dominant and recessive genes to the first generation. The dominant (green, G) and the recessive (blue, b) phenotypes all look the same in the first generation and show a 3:1 ratio in the second generation. The Punnett square in the figure contains the genotypes GG, Gb, Gb and bb.
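The 3:1 ratio described in the caption of Figure 1.1 can be reproduced with a short Python sketch that enumerates the Punnett square for a single-gene cross. The allele labels G (dominant) and b (recessive) follow the figure; the function names are hypothetical and serve only as an illustration of the principle.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Return genotype counts for a single-gene cross, e.g. cross("Gb", "Gb")."""
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Write the dominant (upper-case) allele first so "bG" and "Gb" are counted together.
        genotype = "".join(sorted(allele_a + allele_b, key=lambda a: (a.islower(), a)))
        offspring[genotype] += 1
    return offspring


def phenotype_ratio(offspring, dominant="G"):
    """Return (dominant phenotype count, recessive phenotype count)."""
    dominant_count = sum(n for g, n in offspring.items() if dominant in g)
    recessive_count = sum(offspring.values()) - dominant_count
    return dominant_count, recessive_count


first_generation = cross("GG", "bb")   # every offspring is Gb
second_generation = cross("Gb", "Gb")  # GG, Gb, Gb and bb
print(dict(first_generation), phenotype_ratio(first_generation))
print(dict(second_generation), phenotype_ratio(second_generation))
```

In this sketch all first-generation offspring carry the dominant phenotype, while the second generation shows three dominant phenotypes for every recessive one.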

Table 1.1: A synopsis on the history of genetics

Year    Discovery
1859    Charles Darwin published “On the Origin of Species”.
1866    Gregor Mendel's paper is published. Concept of inheritance in pairs; dominance and recessiveness.
1869    Nuclein, now known as DNA, identified by Friedrich Miescher.
1900    Mendel's theories were independently rediscovered and verified, marking the beginning of modern genetics.
1902    Walter Sutton pointed out the interrelationships between cytology and Mendelism, closing the gap between cell morphology and heredity. Proposes chromosome theory of heredity.
1905    Nettie Stevens and Edmund Wilson independently described the behaviour of sex chromosomes: XX determines female; XY determines male.
1908    Archibald Garrod proposed that some human diseases are due to "inborn errors of metabolism" that result from the lack of a specific enzyme.
1931    Harriet Creighton and Barbara McClintock determine that genetic recombination is caused by a physical exchange of chromosomal pieces.
1950    Erwin Chargaff discovered a one-to-one ratio of adenine to thymine and guanine to cytosine in DNA samples from a variety of organisms.
1953    James Watson and Francis Crick discover DNA is in the shape of a double helix.
1959    François Jacob and Jacques Monod discover that messenger RNA is the intermediate between DNA and protein.
1966    Genetic code is cracked by a number of researchers.
1977    Fred Sanger developed chain-termination DNA sequencing technology.
1985    Kary Mullis published a paper describing the polymerase chain reaction (PCR), the most sensitive assay for DNA yet devised.
1988    The Human Genome Project began with the goal of determining the entire sequence of DNA composing human chromosomes.
1989    Alec Jeffreys coined the term DNA fingerprinting and was the first to use DNA polymorphisms in paternity, immigration, and murder cases.
1990    Gene therapy and genetically modified foods were introduced.
1995    Automated sequencing was introduced.
1996    First cloning of a mammal performed by Ian Wilmut and colleagues.
2001    Sequence of the human genome released.
2007    Controversies continue over human and animal cloning using stem cell research.

1.3.2 DNA Structure

The structure of DNA lends itself easily to replication. Watson and Crick discovered that DNA has two strands, and that these strands are twisted into the shape of a double helix. The two strands run in opposite directions, allowing the structure to simply unzip when replication is to take place. The DNA structure itself is composed of nucleotide units, each made up of a nucleotide base, a sugar and a phosphate. There are four nucleotide bases, each represented by a single character: A (adenine), T (thymine), C (cytosine) and G (guanine) (Figure 1.2). The order of these bases provides the variation that yields the diverse biological differences among human beings and all living creatures. The strong backbone of the DNA molecule is formed by the sugar-phosphate portions of adjacent nucleotides bonded together: the phosphate of one nucleotide is covalently bonded to the sugar of the next nucleotide (Griffiths et al, 2005; Watson and Crick, 1953; Benham and Mielke, 2005).

1.3.3 Base Pairing

While the backbone of each DNA strand is made of the sugar and phosphate portions of the nucleotides, the middle part of the double helix is made up of the nitrogenous bases. Each nitrogenous base pairs with a base on the opposite strand, so that every pair is formed from two complementary nucleotides, a purine and a pyrimidine, bound together by a process known as hybridisation. The individual nucleotides pair up with their complementary base through the hydrogen bonds that form between the bases. Two hydrogen bonds form between adenine (a purine) and thymine (a pyrimidine), so adenine hybridises only with thymine, and three bonds form between guanine (a purine) and cytosine (a pyrimidine), so cytosine hybridises only with guanine. This makes G-C base pairs a little stronger than A-T base pairs and contributes to the stability of the double helix (Butler, 2005; Benham and Mielke, 2005).
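The pairing rules in this section can be sketched in a few lines of Python. The example below is only an illustration: the function names are hypothetical, and the bond counts are the two (A-T) and three (G-C) hydrogen bonds per base pair described in this section.

```python
# Watson-Crick pairing: each purine pairs with its complementary pyrimidine.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds formed by the base pair that each base takes part in.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement_strand(strand):
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bond_count(strand):
    """Return the number of hydrogen bonds holding the strand to its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


sequence = "ATGC"  # arbitrary example sequence
print(complement_strand(sequence))
print(hydrogen_bond_count(sequence))
```

Because each G-C pair contributes one more hydrogen bond than an A-T pair, a sequence with more G and C bases gives a higher bond count for the same length.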

Though hybridisation is a fundamental property of DNA, the hydrogen bonds may be broken by elevating the temperature or through chemical treatment, a process known as denaturation, resulting in single-stranded DNA. This process has allowed biologists to extract and further examine DNA, to understand DNA's stability and to replicate DNA using PCR technology.

Figure 1.2: Top, a GC base pair with three hydrogen bonds. Bottom, an AT base pair with two hydrogen bonds. Hydrogen bonds are shown as dashed lines. (Adapted from Alberts et al, 2002)

1.4 Human Genome Project

With genetics taking larger steps forward in the research world, an international research project began to take form in the early 1990s. The Human Genome Project (HGP) was initially headed by James D. Watson, and its primary goal was to determine the sequence of the bases that make up the human genetic code, DNA (Watson & Cook-Deegan, 1991). The introduction of new sequencing methods over the years allowed the HGP to progress further and faster. A working draft of the human genome was released in 2000 and the sequence was completed in 2003, with further analysis still being published.

The HGP has stimulated the development of advanced technology for characterising DNA and studying genes. The HGP has also had a profound impact on our understanding of health and disease, by enabling researchers to locate and study more than 10,000 genes that contain instructions for building a human being (Robbins, 1992). The genetic information can also be used to predict an individual's chance of inheriting a genetic disease. This deeper understanding of disease processes at the molecular level helps to determine new therapeutic procedures, improve population studies and provide more statistical power for drug trials among populations, establishing the importance of DNA in molecular biology (van Ommen, 2002).

The HGP has considerably improved the study of human genetic disease and animal model systems by allowing public access to all of its research and by shedding light on the relatedness of different populations. This has caused a transformation in medicine, as researchers have turned to DNA based determination of an individual's risk of future illness or adverse drug response, facilitating individualised preventive medicine (Collins, 2006). Further analysis of similarities between DNA sequences from different organisms has highlighted the existence of several novel genetic mechanisms, the impact of which could never have been conceived otherwise, such as genetic imprinting, and trinucleotide repeat expansion and anticipation (Patrinos & Drell, 1997; Joseph, 1995). In turn, the study of these processes has greatly deepened our fundamental insights into genetics and is also uncovering new answers in the study of the theory of evolution.

1.5 DNA Profiling

With research and interest in DNA developing, work began on techniques to increase

productivity and knowledge of DNA. The earliest technique developed by Sir Alec

Jeffreys and his colleagues was the multi-locus probe (MLP) technique, which used

restriction enzymes to cut DNA into fragments. Because a DNA

'fingerprint' was not a direct trace of a person's DNA, Jeffreys created the MLP

technique to visualise selected sequences of single-stranded DNA and compare

their sizes (Rand et al, 1991; Lynch, 2003). The MLP technique used markers that

bind to an indefinite number of chromosomal sites, resulting in a complex pattern of

bands (Lynch, 2003; Aronson, 2005).

Jeffreys and other supporters of the MLP technique believed that when two samples

were compared, it was virtually impossible, except with identical twins, that an entire

pattern of bands would match, although precise estimates could not be given for the

likelihood that any given band, or number of bands, would match (Lynch, 2003;

Aronson, 2007; McLay, 1996). But the main problem with the interpretation of MLP

band patterns was that the variation in the band intensity was generally independent

of fragment size but dependent on the DNA concentration (Rand et al, 1991).

MLP 'fingerprints' were quickly replaced in the late 1980s by the single-locus probe

(SLP) technique. This involved the isolation and marking of a limited number of

non-coding DNA regions known as variable number tandem repeat (VNTR)

sequences. These can be found on many chromosomes, and often show variations in

length between individuals. Selected VNTR sequences were shown to be

hypervariable in the human population, as each variant acted as an inherited allele. These

were marked by means of radioactive probes, allowing them to be used for personal

or parental identification (Jeffreys et al, 1985; Buffery et al, 1991; Schneider et al,

1991; Aronson, 2007). Though it provided less genetic information than the MLP

technique, the SLP technique became more commonly used in forensic cases as it

was found to be more useful than the MLP technique with degraded and mixed (victim and perpetrator) DNA

samples. Population studies also generated probability

measures for the frequency of SLP patterns in human populations and selected

'racial' subpopulations, such as Caucasian, Asian and African (Lynch, 2003; Buffery

et al, 1991).

Although the SLP technique enjoyed the advantages of greater control and more

precise quantification, it also became subject to heated disputes in the courts and

scientific literature. SLP results were presented in probabilistic form, but the

resulting estimates often seemed to predict near absolute identity (Aronson, 2007).

Estimates of the chance that two randomly chosen, unrelated individuals would

share the same combination of alleles in a DNA profile were sometimes lower

than one in hundreds of millions (Aronson, 2007; Lynch, 2003).

Most human identity testing nowadays is performed using the Combined DNA Index

System (CODIS). CODIS was designed to be a system of pointers to help public

US crime laboratories compare and exchange DNA profiles. It consists of two

indexes: the Convicted Offender Index and the Forensic Index. The Convicted

Offender Index contains profiles of individuals convicted of crimes eligible for

CODIS and the Forensic Index contains profiles developed from biological material

found at a crime scene. Using multiple core short tandem repeats (STRs), usually 10 to

14 (Westring et al, 2007; Opel et al, 2007), on the autosomal chromosomes, with sex

determination done with markers on the sex chromosomes, CODIS compares the 10 to

14 STR markers of the DNA found at the crime scene to those already within the

database.

Genes make up only approximately 5% of human genomic DNA (Butler, 2005;

Nusbaum et al, 2005), and so the markers used for human identity testing are found in the

non-coding regions either between or within genes and thus do not code for genetic

variation (Schneider, 1997). The STR markers used in CODIS are compared as

alleles at the same loci on each pair of homologous chromosomes within the genetic sample.

Loci at which both alleles are the same size are described as homozygous,

as the same allele resides at that locus

on each chromosome of the homologous pair. The alternative possibility is for the two

alleles at a genetic locus on homologous chromosomes to be different; such

loci are termed heterozygous (Griffiths et al, 2005; Starr, 2005). A genotype is a

characterisation of the alleles present at a genetic locus. If there are two alleles at a

locus, 1 and 2, then there are three possible genotypes: (1,1), (1,2) and (2,2), with (1,1) and (2,2)

being homozygous and (1,2) being heterozygous.
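
The counting in the paragraph above generalises: a locus with n alleles has n(n+1)/2 unordered genotypes. The following is a minimal Python sketch of that enumeration; the function name and the allele labels are illustrative assumptions and are not taken from the cited sources.

```python
from itertools import combinations_with_replacement

def enumerate_genotypes(alleles):
    """Return every unordered genotype at one locus, labelled by zygosity."""
    genotypes = []
    for a, b in combinations_with_replacement(sorted(alleles), 2):
        zygosity = "homozygous" if a == b else "heterozygous"
        genotypes.append(((a, b), zygosity))
    return genotypes

# The two-allele locus described in the text: alleles 1 and 2.
for genotype, zygosity in enumerate_genotypes([1, 2]):
    print(genotype, zygosity)
```

For the two-allele case this lists the three genotypes named in the text; a three-allele locus would give six.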

DNA profiling uses this process of determining the genotype present at specific

locations along the DNA molecule. Polymorphic markers that differ among

individuals can be found throughout the non-coding region of the human genome.

Multiple loci from these areas are typically examined in human identity testing to

reduce the possibility of a random match between unrelated individuals (Ania et al,

2002; Aronson, 2007; Buffery et al, 1991).
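
The benefit of examining multiple loci can be illustrated with the product rule: if the loci are independent, the probability that an unrelated individual matches at every locus is the product of the per-locus genotype frequencies. The sketch below is illustrative only. The function name is hypothetical and the genotype frequencies are invented values, not data from any population database.

```python
from math import prod

def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies, assuming independent loci."""
    if not all(0.0 < f <= 1.0 for f in genotype_frequencies):
        raise ValueError("each frequency must lie in (0, 1]")
    return prod(genotype_frequencies)

# Hypothetical frequency of the observed genotype at each locus.
one_locus = random_match_probability([0.1])
five_loci = random_match_probability([0.1] * 5)
print(one_locus, five_loci)
```

Each additional independent locus multiplies the match probability by a factor smaller than one, which is why profiles built from many loci are so discriminating.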

1.6 Population Variation

Genetic variation is one facet of the more general concept of phenotypic variation.

Phenotypic variation describes differences in the characteristics of individuals of a

population and is of interest to biologists because it is what natural selection acts upon;

different phenotypes may have different fitnesses, and selection results in fitter

phenotypes leaving more descendants. Phenotypic variation arises from either of two

sources: genetic variation and environmental variation. However, only differences

that arise from genetic variation can be passed on to future generations.

Despite the physical variation observed throughout humans worldwide, there is

surprisingly little difference in DNA content between humans. More than 99.7% of the

DNA sequence is shared between individuals of different ethnicities. Only a small fraction

of our DNA (0.3%) differs between populations, and an even smaller amount among

subpopulations (Butler, 2005; Romualdi, 2001; Mooser et al, 1994). This small fraction is evident

in the fact that, with the exception of identical twins, we all appear different from

each other. Hair colour, eye colour, height and shape all reflect alleles in our

genetic make-up. These variable regions of DNA provide the capability of using

DNA information for human identity purposes. DNA variation can be exhibited in

two different ways: either sequence polymorphism or length polymorphism (Butler,

2005) (Figure 1.3). Polymorphisms are the natural variation in a gene, DNA

sequence, or chromosome and usually occur with fairly high frequency within the

general population (Dawkins, 1999). The genetic variation in DNA sequence among

individuals occurring in a population would be considered a useful polymorphism for

genetic linkage analysis, giving researchers more DNA to be examined and a higher

chance that two unrelated individuals compared will have a greater number of

different genotypes (Schneider, 1997).

1.7 Principles of Population Genetics

As biologists identified more and more polymorphic markers, the question of how

populations are related made its way into genetic research. Population genetics studies

inherited variation and its modification over time. It was, and still is, an attempt to

quantify the variation observed within and between population

groups in terms of allelic and genotype frequencies (Hammer et al, 1997; Nei, 1972).

The simplest description of variation is the frequency distribution of genotypes. A

measure of this variation is the number of heterozygous individuals present in a

population. Variability within a locus has to be stable enough to accurately pass the

allele to the next generation, yet not so stable that only a few alleles would

exist over time, making the locus less informative as it loses variability

such as heterozygosity (Perna et al, 1992; Shen, Batzer & Deininger, 1991).
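
As a minimal sketch of how heterozygosity can be measured, the Python code below computes the observed heterozygosity (the fraction of heterozygous individuals) and the expected heterozygosity, using the standard gene diversity formula of one minus the sum of squared allele frequencies. The function names and the four-individual sample are hypothetical and are used only for illustration; this is not presented as the calculation used in the cited studies.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles at the locus differ."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2), with allele frequencies p_i counted from the sample."""
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(allele_counts.values())
    return 1.0 - sum((n / total) ** 2 for n in allele_counts.values())

# Hypothetical sample of four individuals typed at one bi-allelic locus.
sample = [(1, 1), (1, 2), (1, 2), (2, 2)]
print(observed_heterozygosity(sample), expected_heterozygosity(sample))
```

A locus that has drifted towards a single allele would score close to zero on both measures, which is the loss of informativeness described above.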

Figure 1.3: Illustration of a sequence polymorphism, in which alleles differ by single-base substitutions,

and a length polymorphism, in which alleles differ in the number of repeat units within a

sequence (adapted from Butler, 2005).

Sequence Polymorphism

ATCGCGTAGACGATTCGG

ATCGCGGAGAAGATTCGG

Length Polymorphism

ATCGCG(GGCT)(GGCT)-----------ATTCGG

ATCGCG(GGCT)(GGCT)(GGCT)ATTCGG
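
The two kinds of polymorphism in Figure 1.3 can be distinguished programmatically. The sketch below is illustrative only, with hypothetical function names: it lists the positions at which the two sequence-polymorphism examples differ, and counts the tandem GGCT repeats in the two length-polymorphism examples (written here without the brackets and dashes used in the figure).

```python
def substitution_sites(seq_a, seq_b):
    """1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b), start=1) if x != y]

def count_tandem_repeats(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        while sequence.startswith(unit, start + copies * len(unit)):
            copies += 1
        best = max(best, copies)
    return best

print(substitution_sites("ATCGCGTAGACGATTCGG", "ATCGCGGAGAAGATTCGG"))
print(count_tandem_repeats("ATCGCGGGCTGGCTATTCGG", "GGCT"))
print(count_tandem_repeats("ATCGCGGGCTGGCTGGCTATTCGG", "GGCT"))
```

As printed in the figure, the two sequence-polymorphism examples differ at positions 7 and 11, each a single-base substitution, while the length-polymorphism examples carry two and three copies of the repeat unit.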

Population genetic forces including mutation, gene flow, natural selection, and

random genetic drift all affect the frequency of alleles present in a population. This

can be seen over time in isolated populations. Once populations have diverged from one another,

each is smaller, so its members all descend from a small number of

individuals and therefore have limited genetic variation, losing genetic distinction

between each other (Arcos-Burgos, 2002). The gene pool is smaller in

isolated groups and therefore less shuffling of genes occurs (Middleton,

2000).

1.8 Gene Clusters

Gene clusters have also been found to be very useful in the study of population variation. A

gene cluster is a set of two or more genes that serve to encode for the same or similar

products. Gene clusters exist all over the genome in every organism, each playing

important roles in body development, body functionality and immunity (Singer and

Berg, 1997). A common and important gene cluster is the Hox cluster. The function of the Hox

genes is to determine where limbs and other body segments will grow in a

developing foetus or larva (Griffiths, 2005; Singer and Berg, 1997). Mutations in any

of the Hox genes can lead to the growth of extra, typically non-functioning body parts in

invertebrates, while in humans they usually cause deformation of the hands and feet or

may result in miscarriages (Goodman and Scambler, 2001).

The human alpha-globin and beta-globin gene clusters, found on the short arms of

chromosomes 16 and 11 respectively, are other important clusters in the genome. These genes have a role in

the formation of haemoglobin and allow haemoglobin to adjust its oxygen-binding

capacity according to the oxygen concentration of its environment (Efstratiadis et al,

1980). It is beta-globin that is altered in human sickle-cell anaemia, while without

sufficient normal alpha-globin proteins, individuals can develop alpha-thalassaemia, a

potentially life-threatening form of anaemia. However, in areas where malaria is

widespread, mutations in the alpha- and beta-globin clusters can be an advantage,

protecting the individual from severe infection.

The above mutations have been known to spread in some populations, but it is

unknown why one mutation may be more frequent in one population and not in

another. However, they are useful for tracing back recent evolutionary history, as

individuals with common ancestors tend to possess the same varieties of gene clusters and genetic

mutations (Schneider, 1997; Singer and Berg, 1997).

1.9 Major Histocompatibility Complex

A 'gold standard' for population genetics is found on the short arm of chromosome 6.

The major histocompatibility complex (MHC) gene cluster is the largest region or gene

family found in most vertebrates; in humans it is a complex collection of genes

clustered closely on chromosome 6 (Kulski et al, 2002) (Figure 1.4). One of the most

striking features of the MHC, particularly in humans, is the high gene density. “This

clustering is considered to be biologically and evolutionarily significant and has been

attributed to selection pressure, possibly supporting the co-ordinated expression

and/or matching of allelic forms in cis and the suppression of recombination” (Kulski

et al, 2002). The MHC is the most polymorphic region in the genome and

contains genes that are highly duplicated. This duplication is what is responsible for

much of the genetic diversity (Mungall, 2003; Dawkins et al, 1999; Anzai, 2003;

Takasu, 2007; Hedrick, 2002).

Population surveys of the other classical loci routinely find tens to a hundred alleles,

so these loci remain highly diverse. Perhaps even more remarkable is that many of these alleles are

quite ancient. It is often the case that an allele from a particular MHC gene is more

closely related to an allele found in chimpanzees than it is to another human allele

from the same gene (Gagneux & Varki, 2001; Muchmore, 2001).

The MHC is divided into 3 major sub-regions, Class II, Class III (central MHC) and

Class I, centromeric to telomeric. Strong linkage disequilibrium exists across the

MHC particularly among alleles of specific multilocus haplotypes and between

particular genes (Begovich et al, 1992; Rajsbaum, 2002; Dunn et al, 2005; Dunn et al,

2006; Mungall, 2003). A growing list of polymorphisms identified in both the

intragenic and intergenic regions of the MHC genomic region will permit rapid

identification of changes that can be localised to small segments (Dunn, 2005).

Figure 1.4: The human major histocompatibility complex (MHC). This group of genes resides on chromosome 6,

and includes genes encoding cell-surface antigen-presenting proteins, among many other genes.

[Figure title: Major Histocompatibility Complex: Human Chromosome 6. Labels shown: HLA-A, HLA-C, HLA-B, HLA-DR, HLA-DQ and HLA-DP; bands 21.32, 21.31 and 21.2 on the short (p) arm; centromere.]

Other markers have also been used in studying MHC haplotype variation, such as the

polymorphic MHC-related genes, microsatellites, SNPs, and polymorphic Alu

insertions (POALINs) (Walsh, 2003; Dunn, 2005); all are informative genetic

markers in lineage analysis, hitchhiking effects, population genetics and evolutionary

relationships, especially in studying the MHC genomic region (Begovich et al, 1992;

Dunn et al, 2005; Skaug, 2001; Wakeley, 2001; Leelayuwat et al, 1994).

1.10 MHC Ancestral Haplotypes

Studies within the MHC usually focused on groups of haplotypes referred to as

ancestral or extended haplotypes. The initial definition of a haplotype arose from the

recognition that serologic patterns segregated within families. Unless there was specific

evidence of recombination between the genes, allelic products of closely linked genes were

assumed to be inherited en bloc, as a unit (Degli-Esposti et al, 1992; Yunis et al, 2003).

It soon became obvious that haplotypes that were described in one

family study were similar or identical to those found in other families, suggesting the

possibility that there could be some remote ancestral relationship between different

families and that haplotypes had been maintained en bloc over many generations

(Degli-Esposti et al, 1992; Martins et al, 2007). These ancestral haplotypes are relatively population

specific and are believed to represent the original MHC haplotypes of our ancestors,

which are still segregating unchanged (Gaudieri et al, 1997).

The existence of ancestral haplotypes implies conservation of large chromosomal

segments (Degli-Esposti et al, 1992). Irrespective of the mechanisms involved in

preservation of ancestral haplotypes, it is clear that these haplotypes carry several

MHC genes other than the Human Leukocyte Antigen (HLA) genes, which may be relevant to

antigen presentation, autoimmune responses, and transplantation rejection

(Degli-Esposti et al, 1992; Marounger, 1999). The ability to recognise intact and

recombinant ancestral haplotypes enables an approach to mapping MHC genes

(Marounger, 1999).

1.11 SNPs vs Indels

Single nucleotide polymorphisms (SNPs) are the most common type of genetic

variation among people. Each SNP represents a difference in a single DNA building

block, called a nucleotide (Kamahori, 2002; Aoki et al, 2003). For example, a SNP

may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a certain

stretch of DNA (Figure 1.5).

SNPs occur normally throughout an individual's DNA. They occur once in every 300

nucleotides on average, which means there are roughly 10 million SNPs in the human

genome (Griffiths et al, 2005; Kamahori, 2002; Ting et al, 2006). Most commonly,

these variations are found in the DNA between genes. They can act as biological

markers, helping scientists locate genes that are associated with disease (Aoki et al,

2003; Kirk et al, 2002). When SNPs occur within a gene or in a regulatory region

near a gene, they may play a more direct role in disease by affecting the gene's

function.
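
The figure of roughly 10 million SNPs follows from simple arithmetic. The sketch below assumes a haploid genome of about 3 billion base pairs, an approximate value that is not stated in the text above; the constant and function names are illustrative.

```python
HAPLOID_GENOME_BP = 3_000_000_000  # assumed approximate haploid genome size
SNP_SPACING_BP = 300               # one SNP per 300 nucleotides, as stated above

def estimated_snp_count(genome_bp=HAPLOID_GENOME_BP, spacing_bp=SNP_SPACING_BP):
    """Approximate SNP count as genome length divided by average spacing."""
    return genome_bp // spacing_bp

print(f"{estimated_snp_count():,}")  # 10,000,000
```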

Another form of polymorphism is the bi-allelic insertion-deletion (indel)

polymorphism. An indel can be the insertion or deletion of a segment of DNA ranging from

one nucleotide to hundreds of nucleotides (Figure 1.6). The two alleles for bi-allelic

indels can simply be classified as 'short' and 'long' (Weber et al, 2002). James

Weber and colleagues at the Marshfield Medical Research Foundation recently

characterised over 2000 bi-allelic indels in the human genome (Weber et al, 2002). “A

total of 71% of these indels possessed 2, 3, or 4 nucleotide length differences with

only 4% having greater than a 16 nucleotide length difference” (Butler, 2005).

Short Interspersed Nuclear Elements (SINEs) are one type of indel and make up the

majority of the “short” indels found within the genome. Although SNPs can

provide much detail about evolutionary history, indels are, because of their size, easier

to type; they have already been found useful in genetic studies and have found

use in forensic identity testing (Weber et al, 2002; Ye et al, 2002).

Figure 1.5: Illustration of a SNP. DNA strand 1 differs from DNA strand 2 at a single base-pair

location. SNPs occur in members of the same group showing variation in their DNA

sequence. [Figure labels: strands 1 and 2; bases C, G, T and A at the SNP site.]

Figure 1.6: Illustration of an insertion and a deletion at the chromosome level. [Figure panels: before and after insertion, showing the inserted area; before and after deletion, showing the deleted area.]

1.12 Retroelements and Repeat Sequences

Indels can also arise from the RNA-mediated movement of genetic information

from one locus to another, which is known as retrotransposition, while the transposed

genetic information is termed a retroelement.

Retroelements comprise a substantial portion of the human genome, and can be

classified into two groups. Members of the first group are called retroposons or

retrosequences and include SINE elements and processed pseudogenes (Griffiths et al,

2005). The second group comprises retroelements that may themselves have the capacity to

transpose; these include nonviral elements such as LINEs, as well as endogenous retroviruses

and retroviral elements with structural analogies to infectious retroviruses

(Leib-Mösch, 1996; Smit, 1996).

Repeat sequences are assumed to influence the genomic stability and to generate hot

spots for recombination. Since closely related retroelements are dispersed in high

copy numbers throughout the human genome, it is conceivable that these sequences

could be involved in unequal crossovers between two related elements on different

chromosomal locations (Leib-Mösch, 1996; Smit, 1996), leading to DNA

rearrangements, such as deletions, inversions, duplications, and translocations.

1.13 SINEs – Short Interspersed Nuclear Elements

Almost half of the human genome is derived from transposable elements. The vast

majority of these transposable elements are SINEs or long interspersed nuclear

elements (LINEs) (Griffiths et al, 2005). LINEs move by retrotransposition with the

use of an element encoded reverse transcriptase, but lack some structural features of

retrovirus-like elements (Okada et al, 1991). SINEs can be best described as

nonautonomous LINEs, because they have the structural features of LINEs but do

not encode their own reverse transcriptase. Presumably, they are mobilised by reverse transcriptase enzymes that are encoded by

LINEs that reside in the genome.

SINEs in the human genome are sequences approximately 300 bp in length derived

from the 7SL RNA gene. SINEs are inserted into DNA at different locations by

retrotransposition, a mechanism in which an RNA transcript expressed from one of a

possible 100 Alu master copy sequences is reverse transcribed into complementary DNA,

which is then inserted into a new position in the genome (Jurka,

2002; Batzer, 1994).

1.14 Polymorphic Alu Insertions (POALINs)

Alu sequences are the largest family of SINEs in humans and other primates with

more than a million copies per haploid genome. Alu sequences were ancestrally

derived from the 7SL RNA gene and are thought to mobilise in a process termed

retroposition (Batzer et al, 1994; Leib-Mösch & Seifarth, 1996;

Kulski & Dunn, 2005). Once inserted at specific chromosomal locations, most Alu

elements do not appear to be subject to loss or rearrangements, with less than 0.5% of

the Alu elements reported to be polymorphic (Arcot et al, 1996; Batzer & Deininger,

2002; Comas et al, 2004; Dunn et al, 2007). Also, the generation of new Alu insertions

by retrotransposition is a rare event, making them stable genetic markers, and their

allele frequency distribution varies in geographically distinct human populations

(Antunez-de-Mayolo et al, 2002).

POALINs have several desirable properties for studying genetic variation in human

populations. The non-radioactive, PCR-based detection method for these

polymorphisms makes it feasible to rapidly screen large numbers of DNA samples (Batzer,

1994). The Alu insertions also appear to have a relatively stable integration into the

genome and are rarely deleted. Even when deletion of an Alu element occurs, the

deletion is not a precise excision of the Alu element, but rather it leaves behind a

signature of the original insertion event (Batzer, 1994; Dunn, 2005).

The rate of insertion and fixation of new Alu elements is about 100-200 per million

years, so the independent insertion of two different Alu elements at the same location

in the genome has essentially no chance of occurring (Arcot et al, 1996; Comas, 2001;

Kulski et al, 2002). Therefore, individuals who share POALINs inherited them from

a common ancestor, making POALINs identical by descent (Paabo et al, 2001; Batzer,

1994; Kass, 2006). Independent Alu insertions at different loci within

the same haplotype are extremely rare, so haplotypes with multiple POALIN

sites (two or more) have most probably arisen by recombination of haplotypes with

single but different polymorphic Alu elements (Figure 1.7) (Dunn, 2005; Perna, 1992).

POALINs clearly represent an ongoing evolutionary process in the human genome,

and the Alu family of repeats represent a unique source of genetic variation for

human population genetics and forensic identity testing (Batzer et al, 1994).

Although POALINs provide a strong source of genetic variation, their

reliance on large datasets can be a limitation for researchers: to

obtain an accurate interpretation of a large population, a large sample must be tested.

It is also hard to detect one new insertion among one million pre-existing elements in

the genome (Cordaux et al, 2007). The major disadvantage is that a variety of particular

POALINs are absent from all non-African populations, which reduces

the genetic diversity observed worldwide and excludes those POALINs from population

comparisons (Cordaux et al, 2007).

1.15 Profiling Ethnicity in Forensic DNA

The genetic differences between subpopulations are very important. Shared ancestry

can cause a defendant's DNA profile to be more common among individuals from the

same subpopulation, and this subpopulation will often include some, perhaps even

most, of the alternative possible culprits (Ayres et al, 2002; Triggs and Buckleton,

2002). If possible, population databases for use in forensic DNA testing should

contain unrelated individuals of known ethnicity. However, this may not be

completely possible in a practical sense, as many laboratories are required to use

samples that have been made anonymous prior to study (Butler, 2005).

Figure 1.7: Structural comparison of a retrovirus to types of transposable elements in the human genome.

[Figure columns: Element, Type, Transposition (autonomous or nonautonomous), Structure. Elements shown: retrovirus (LTR, gag, pol, env, LTR); LINEs (ORF1, ORF2 (pol), AAA); SINEs (Alu, with adenine-rich segments; AluJ, AluSx, AluSq, AluSp, AluSc, AluY, AluYa5, AluYa8 and AluYb8); DNA transposons (transposase).]

In addition, categories of ethnicity are often subjective and may be based on

perceived phenotype or cultural classification. Broad ethnic categories are usually

adequate for most forensic databases, unless an isolated population is of interest

(Ayres et al, 2002; Triggs and Buckleton, 2002). Sampled individuals may also have

more than one easily definable ethnic background and may prefer to be grouped

differently from a cultural standpoint than they might otherwise biologically be.

Finally, people who have been adopted or conceived through in vitro fertilisation may

not know their genetic heritage (Butler, 2005).

All of these can lead to potential bias against the defendant, but this can be overcome by

assuming that all alternative possible culprits have the same ethnic background and

are from the same subpopulation as the defendant (Ayres et al, 2002; Triggs and

Buckleton, 2002). Examination of allele frequencies observed with different sample

sets from around the world has shown small differences between individuals of the

same population, and a distinguishable difference between different populations,

providing as much accuracy to the forensic testing as possible. There is still a

potential difficulty with the 'same subpopulation' approach, however, in that allele frequency

estimates for the defendant's subpopulation may not be available (Ayres et al, 2002;

Triggs and Buckleton, 2002; Butler, 2005).

1.16 Previous Population Studies

A variety of studies have indicated that using statistical clustering

techniques to examine genetic information may allow for geographically based

grouping of individuals that tenuously map onto some conceptions of ethnicity

(Zyphur, 2006; Paabo, 2001). However, these studies have also indicated that the amount of

genetic variation within these groupings is significantly larger than the variation that

exists between them.

Many population studies divide the world into three primary ancestral groups: African,

Asian and Caucasian, roughly representing the populations around the world (Comas

et al; Antunez-de-Mayolo, 2002). These categories not only can be hard to

distinguish from 'race', but they also usually ignore the overlap between groups and

the continuous nature of the way people and genes spread today (Comas et al, 2004).

There is scope for further research into a wide variety of uncharted populations as well

as the subpopulations that lie within them.

1.17 Hardy-Weinberg Equilibrium

When it comes to population studies, researchers cannot visually examine the extent

to which genetic alleles tend to be inherited together. The simplest way to determine

the independence of alleles within a locus is to use the Hardy-Weinberg principle.

G.H. Hardy and W. Weinberg independently suggested a scheme whereby evolution

could be viewed as changes in the frequency of alleles in a population of

organisms. They argued that when certain conditions were met (a large breeding

population, random mating, no mutation, no migration and no natural selection), the

population's allele and genotype frequencies would remain constant from generation

to generation (Price, 1971; Aronson, 2007).

Checking for the Hardy-Weinberg Equilibrium (HWE) is performed by taking the

observed allele frequencies and calculating the expected genotype frequencies based

on those allele frequencies (Price, 1971). If the observed genotype frequencies are

close to the expected genotype frequencies calculated from the observed allele

frequencies, then the population is in Hardy-Weinberg Equilibrium and allele

combinations are assumed to be independent of one another (Aronson, 2007).
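
The check described above can be sketched for a bi-allelic locus, such as the presence or absence of an Alu insertion. With allele frequencies p and q = 1 - p estimated from the observed genotypes, the expected genotype frequencies are p², 2pq and q². The Python code below compares observed and expected counts with a chi-square statistic. The function name, the genotype counts and the choice of a chi-square test with one degree of freedom are assumptions made for illustration; this is not presented as the statistical procedure used in this thesis.

```python
def hwe_chi_square(n_ins_ins, n_ins_abs, n_abs_abs):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations at a bi-allelic locus."""
    n = n_ins_ins + n_ins_abs + n_abs_abs
    p = (2 * n_ins_ins + n_ins_abs) / (2 * n)   # frequency of the insertion allele
    q = 1.0 - p                                 # frequency of the absence allele
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_ins_ins, n_ins_abs, n_abs_abs)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi_square

# Critical value of chi-square at the 5% level with one degree of freedom.
CRITICAL_VALUE_1DF_05 = 3.841

# Hypothetical genotype counts: insertion/insertion, insertion/absence, absence/absence.
p, expected, chi_square = hwe_chi_square(10, 20, 10)
print(p, expected, chi_square, chi_square < CRITICAL_VALUE_1DF_05)
```

A statistic below the critical value gives no evidence of departure from Hardy-Weinberg Equilibrium; a sample with no heterozygotes at all, such as counts of 20, 0 and 20, would give a very large statistic.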

1.18 Bedouin Culture

Bedouin derives from the word badawi, meaning an inhabitant of the Arabian and

Syrian Desert. This isolated population has divided itself into numerous

tribes spread out across the deserts of the Arabian Peninsula (Losleben, 2002). These

nomads of the desert travel from oasis to oasis using the resources as they grow

naturally. Their origin is unknown, but many have thought that the Bedouin were

descended from nomads who herded cattle at a time when the climate was milder

(Losleben, 2002; Abu-Rabia, 2002). The Bedouin themselves, however, believe they are

the descendants of Shem, son of Noah. The Bedouin people are very tribe oriented

and even with strong westernisation they still maintain their cultural customs.

However, due to their deeply rooted custom of consanguineous marriage, the

Bedouins suffer from genetic diseases at a higher rate than an average population.

They do not carry more genetic mutations than the general population, but because

more than 50% of marriages in the population are consanguineous, with almost two thirds of the consanguineous

matings being between first cousins, they have a significantly higher chance of marrying

someone who carries the same mutations (Sheffield, 1998; Hsien, 2006). This

persists because many in the Bedouin culture still travel in a “goum”,

which generally consists of members of a single descent group.

1.19 Aims for this Thesis

Previous population studies have generally focused on three main population groups:

Caucasian, Asian and African. Subpopulations are then categorised into these three

main ethnic groups. To begin an understanding of subpopulations and their

connections I focused my research on the Bedouin population found in the deserts of

the United Arab Emirates. Arab populations are considered a subpopulation of

Caucasian ethnicity, but are they genetically similar to Caucasians?

A variety of markers have been used from STRs to microsatellites in order to

compare populations and subpopulations. In this experiment my work focused on the

study of the MHC Class I POALINs. POALINs have been emerging over the last few

years and have been very useful in previous population studies and have also been

found to be important polymorphic markers in ongoing population and disease studies.

The four POALINs used in this experiment were previously researched by Dr. David

Dunn for his PhD completed in 2005. My aim was to further characterise the

reliability of the four POALINs, to determine their frequency distribution within

the Bedouin population, and to compare these frequencies with those of other known

populations in order to understand the evolutionary history of the Bedouin population

and to determine the population category in which they fit.

Further research into subpopulations can also help to improve technology currently in

place to understand and identify the separate populations. In the area of forensic

science, databases are used in order to compare DNA found at a crime scene to those

previously tested. In order to get the most accurate results, 13 STR markers are used,

and the DNA is also compared to a particular population database. It is at this point

that the accuracy of the test falls, as placing an individual into a population

category is usually based on phenotype rather than genetics, and may result in an

inaccurate comparison if the individual belongs to a population other than the one

they are compared to. Researching further into the POALINs and the connection

between subpopulations may assist in enabling forensic testing to accurately

determine the population of the individual to be identified.

My aim is to assist in the continuing improvement of the forensic science DNA database by demonstrating the reliability of the POALINs in population research, and to begin research into the genetics of the Bedouin population, both to understand their gene flow and to lead to research into the particular diseases that affect their population.

Chapter 2 Materials and Method

2.1. Genomic DNA

Whole blood was drawn from 54 unrelated healthy Bedouin individuals following

standard procedures and after ethics approval from Dubai HE approval information.

The DNA was then extracted using the High Pure Viral Nucleic Acid Kit (Roche

Applied Science, Indianapolis, IN, USA). 300μl of whole blood from each sample was mixed with 200μl of binding buffer to lyse the cells and release the DNA, and 40μl of Proteinase K to digest the protein.

100μl of isopropanol was added to remove residual amounts of protein. 500μl of Inhibitor Removal Buffer (5M guanidine-HCl, 20mM Tris-HCl, pH 6.6) was then added to remove lipids from the mixture. The DNA was then washed with wash buffer (20mM NaCl, 2mM Tris-HCl, pH 7.5) and centrifuged twice. The DNA was then washed using cold 70% ethanol and centrifuged, and the supernatant was discarded, leaving purified template DNA that was diluted in TE Buffer (1mM EDTA, 10mM Tris-HCl, pH 7.5) to a concentration of approximately 20ng/μl. 4μl was used for each polymerase chain reaction (PCR) assay.

2.2. POALIN PCR Assay

2.2.1 PCR Reaction

The presence and absence of the Alu insertion at each of the four loci were distinguished from each other by the different sizes of the PCR product for each primer pair (Table 2.1). For primers AluyHJ, AluyHF and AluyMICB the PCR solution (20μl) contained 80ng of DNA template, 10pmol of each primer, 25nmol of each deoxyribonucleotide triphosphate (dNTP), 0.4 units of FastStart Taq Polymerase (Roche Applied Science, Indianapolis, IN, USA), 3mM MgCl2 and 2μl of 10x PCR Buffer (600 mM Tris-HCl pH 8.3, 250 mM KCl, 1% Triton X100, 100 mM β-mercaptoethanol). The AluyTF reaction contained 40ng of DNA template, 5pmol of each primer, 0.4μl of each dNTP, 0.5 units of FastStart Taq Polymerase, 1μl of MgCl2 and 1μl of 10x PCR Buffer (600 mM Tris-HCl pH 8.3, 250 mM KCl, 1% Triton X100, 100 mM β-mercaptoethanol).

2.2.2 Cycling Conditions

Each reaction was run on a DNA Engine Tetrad Thermal Cycler (Bio-Rad Laboratories, Hercules, CA, USA) with a hot start at 95°C for 10 mins (1 cycle) to activate the FastStart Taq, followed by 35 cycles of denaturation at 95°C for 30 secs, annealing at 59°C for AluyMICB and AluyHF, 55.1°C for AluyHJ or 56°C for AluyTF, and extension at 72°C for 45 secs. A final extension step at 72°C for 10 mins completed the program.

2.2.3 Electrophoresis

The reaction products were analysed by horizontal gel electrophoresis (Sub-Cell Model 192, Bio-Rad Laboratories, Hercules, CA, USA) in 1.5% agarose, using running buffer containing ethidium bromide. Fragments of different sizes were produced for the presence or absence of the POALINs (Figure 2.1): a single fragment, of a different size for each of the two homozygous genotypes, and two fragments for the heterozygous genotype. Two Caucasian DNA samples from the Busselton Research Foundation were used as positive controls, one homozygous for the absence and the other homozygous for the presence of the Alu insertions.
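The band-reading rule described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the laboratory, where genotypes were read by eye from the gel photographs. The fragment sizes are those listed in Table 2.1; the function name `call_genotype` and the matching window `tolerance` are hypothetical and were chosen only for the example.

```python
# Fragment sizes (bp) from Table 2.1: (absence allele*1, presence allele*2).
FRAGMENT_SIZES = {
    "AluyMICB": (502, 664),
    "AluyTF": (422, 710),
    "AluyHJ": (163, 501),
    "AluyHF": (458, 605),
}


def call_genotype(locus, band_sizes, tolerance=20):
    """Return '1,1', '1,2', '2,2' or None (no call) for one gel lane.

    band_sizes: estimated band sizes in bp read from the gel.
    tolerance:  hypothetical +/- window (bp) for matching a band to an allele.
    """
    absent_bp, present_bp = FRAGMENT_SIZES[locus]
    has_absent = any(abs(b - absent_bp) <= tolerance for b in band_sizes)
    has_present = any(abs(b - present_bp) <= tolerance for b in band_sizes)
    if has_absent and has_present:
        return "1,2"  # heterozygote: both bands visible
    if has_absent:
        return "1,1"  # homozygous for the absence of the insertion
    if has_present:
        return "2,2"  # homozygous for the presence of the insertion
    return None       # no band matched, e.g. a failed amplification


print(call_genotype("AluyMICB", [500, 660]))
print(call_genotype("AluyTF", [425]))
```

A lane with no matching band returns `None` rather than a genotype, which mirrors the samples reported as showing no result.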

2.3 Genotype and Phenotype Analysis

The observed allele frequencies were obtained by using the gene counting method (Ceppellini, Siniscalco et al. 1955). The method counts the copies of each of the two alleles, A (frequency p) and a (frequency q): every AA individual has 2 copies of A and every Aa individual has 1 copy of A. The count for each allele is then divided by the total number of alleles present in the sampled population (2 x number of individuals).
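As a minimal sketch of the gene counting calculation, the following Python function takes the three genotype counts for a biallelic locus and returns the two allele frequencies. The function name `allele_frequencies` is hypothetical; the counts in the example are the AluyMICB genotype counts listed in Table 3.1.

```python
def allele_frequencies(n_11, n_12, n_22):
    """Gene counting for a biallelic locus.

    n_11, n_12, n_22: numbers of individuals with genotypes 1,1 / 1,2 / 2,2.
    Returns (p, q), the frequencies of allele*1 and allele*2.
    """
    total_alleles = 2 * (n_11 + n_12 + n_22)
    if total_alleles == 0:
        raise ValueError("no individuals were genotyped")
    p = (2 * n_11 + n_12) / total_alleles  # each 1,1 carries two copies of allele*1
    q = (2 * n_22 + n_12) / total_alleles  # each 2,2 carries two copies of allele*2
    return p, q


# AluyMICB genotype counts as listed in Table 3.1 (29, 10 and 1 of 40 individuals).
p, q = allele_frequencies(29, 10, 1)
print(f"allele*1 = {p:.3f}, allele*2 = {q:.3f}")
```

With these counts the function gives 0.850 and 0.150, the allele frequencies reported for AluyMICB.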

To determine if a population is in Hardy-Weinberg equilibrium, an expected frequency must be determined and compared to the observed frequency of the population. The Hardy-Weinberg equilibrium equation ($p^2 + 2pq + q^2 = 1$) was applied to each of the POALINs used. If the population were in Hardy-Weinberg equilibrium, it would be expected that the frequencies calculated would be similar to the observed frequencies.

Table 2.1: The primer sequences and product size for the PCR amplification of the 4 POALIN loci.

Aluy  Primer      Primer Sequence (5' – 3')                Size  Accession  Position^b         Fragment size (bp)
Loci  Name                                                 (bp)  Number^a                      allele*1  allele*2
MICB  AluyMICB.F  GCC TTC CAA TGC CAT TCA CAG              21    AC006046   38,921-38,941      502       664
      AluyMICB.R  CTC AGC CCT GCT TTC CCA TCT              21    AC006046   38,277-38,297
TF    AluyTF.F    GTG CCT GGT AAA AAT TTA AGA GCT GTA      27    AC005530   7,150-7,177        422       710
      AluyTF.R    TGC ACC CGG CCT AAA ACC ACT GGT T        25    AC005530   7,836-7,859
HJ    AluyHLAJ.F  AAG AAA CCC ATA ACT CAC TTG              21    AP000519   11,430-11,450      163       501
      AluyHLAJ.R  TGT GTC CAG GTT AAA CTT CAG              21    AP000519   11,909-11,929
HF    AluyHF.F    GCC TCA TGG CCT GAA TCT GCC AGT GTC CTT  30    AP000521   124,367-124,396    458       605
      AluyHF.R    GTA ACT GAC GTG CCC TCT ATA GTA TAG TCT  30    AP000521   124,794-124,825

a, b The accession number and position can be found on the National Center for Biotechnology Information (NCBI) database.

Using the observed allele frequencies, the expected frequency of each homozygous genotype is calculated by squaring the corresponding allele frequency ($p^2$ or $q^2$). From the frequency calculated, the expected number of individuals with that genotype can be determined ($p^2$ or $q^2$ x number of individuals in the population). The expected frequency of heterozygous individuals is calculated using the equation $2pq$ and is then multiplied by the number of individuals in the population to determine the expected number of individuals with that genotype.

As a check that the genotype frequencies have been calculated correctly, $p^2 + 2pq + q^2$ should equal 1. This sum equals 1 whenever $p + q = 1$, so on its own it does not show equilibrium; the allele and genotype frequencies are consistent with Hardy-Weinberg equilibrium when the observed genotype frequencies are similar to the expected ones. In that case it can be expected that the allele frequencies will remain constant over time. However, this does not imply that a population meeting Hardy-Weinberg equilibrium is not evolving; it merely indicates that the particular locus being studied is not changing.
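The expected genotype frequencies and expected numbers of individuals can be sketched as follows. The function name `hardy_weinberg_expected` is hypothetical; the example uses the AluyMICB allele*1 frequency (0.850) and sample size (40) from Table 3.1.

```python
def hardy_weinberg_expected(p, n_individuals):
    """Expected genotype frequencies and counts under Hardy-Weinberg proportions.

    p: frequency of allele*1; the frequency of allele*2 is taken as q = 1 - p.
    """
    q = 1.0 - p
    frequencies = {"1,1": p * p, "1,2": 2 * p * q, "2,2": q * q}
    counts = {g: f * n_individuals for g, f in frequencies.items()}
    return frequencies, counts


frequencies, counts = hardy_weinberg_expected(0.850, 40)
for genotype in ("1,1", "1,2", "2,2"):
    print(genotype, round(frequencies[genotype], 4), round(counts[genotype], 1))
```

The three expected frequencies always sum to 1 because q is defined as 1 - p, so the sum is an arithmetic check only. Any assessment of equilibrium comes from comparing the expected counts with the observed counts.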

For statistical comparison, a chi-square ($\chi^2$) test on a 2x2 contingency table was performed using the online GraphPad chi-square calculator. The probability value (p-value) was then calculated from the chi-square result using an online p-value calculator from danielsoper.com. This allowed the Middle Eastern Bedouin to be compared with each of the populations previously studied using the same POALIN markers. The p-value measures how much evidence there is against the null hypothesis, a hypothesis that presumes no change or no effect; in this case, that the populations are identical. The general rule is that a small p-value is evidence against the null hypothesis, while a large p-value means little or no evidence against it. A large p-value should not, however, automatically be construed as evidence in support of the null hypothesis; a failure to reject the null hypothesis can be caused by an inadequate sample size.
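The two steps carried out with the online calculators can be sketched in Python using only the standard library. This is a sketch under stated assumptions, not a record of what the calculators do internally: `chi_square_2x2` is the ordinary Pearson statistic without continuity correction, and `p_value_df1` assumes 1 degree of freedom, which applies to a 2x2 table. Both function names are hypothetical. The comparison population counts in the example are invented for illustration; the Bedouin counts (68 and 12 alleles) follow from the AluyMICB genotype counts in Table 3.1.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Rows: populations; columns: counts of allele*1 and allele*2.
# Second row is a hypothetical comparison population, for illustration only.
chi2 = chi_square_2x2(68, 12, 160, 40)
print(round(chi2, 4), round(p_value_df1(chi2), 4))

# Chi-square values reported in Table 3.2a, converted to p-values.
for reported in (0.0015, 1.8850):
    print(reported, round(p_value_df1(reported), 4))
```

Applied to the chi-square values reported in Table 3.2a, the 1 degree of freedom conversion agrees with the tabulated p-values to within rounding (for example 0.0015 gives 0.9691).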

[Figure 2.1 gel images. Panels: A: AluyMICB (664bp, 502bp); B: AluyTF (710bp, 422bp); C: AluyHJ (501bp, 163bp); D: AluyHF (605bp, 458bp). Each panel shows a molecular weight marker (MW) lane with the 1000bp and 500bp bands marked, followed by numbered sample lanes.]

Figure 2.1: Gel photographic presentation of the MHC Class I POALINs. The PCR products for the presence and/or absence of the respective POALINs are visually distinguishable. A marker (MW) control with known sizes (sizes shown on the left for A through D) was used for each gel (A-D) and the columns represent individual PCR products. The larger PCR product size for each POALIN represents the presence and the smaller size represents the absence of the POALIN. (A) Products in lanes 1 and 7 represent individuals homozygous for the presence of AluyMICB, products in lanes 2 to 5 and 8 represent individuals homozygous for the absence of AluyMICB, and the product in lane 6 represents a heterozygous individual. (B) The product in lane 1 represents an individual homozygous for the presence of AluyTF. Products in lanes 2, 3 and 5 to 8 represent individuals homozygous for the absence of AluyTF, and the product in lane 4 represents a heterozygous individual carrying one band for the presence and a second for the absence of the Alu insertion. (C) Products 1 to 8 all represent individuals homozygous for the absence of AluyHJ. (D) Products 1 to 8 were also all representative of individuals homozygous for the absence of AluyHF. The insertion alleles of AluyHJ and AluyHF were not detected in the limited number of samples tested; the alternative alleles for both markers are shown in Figure 3.6.

Chapter 3 Results

3.1 Location of POALINs within the MHC Class I Region

A map of the location of the four POALINs within the MHC class I region is shown in Figure 3.1. AluyMICB is located within the first intron of the MICB gene in the beta block. AluyTF is located in the region between the beta and kappa blocks, close to the TFIIH and CDSN genes. The remaining two, AluyHJ and AluyHF, are located at the beginning and the end of the alpha block, close to the HLA-J and HLA-F genes respectively.

3.2 Distribution of POALIN Allele Frequencies in Bedouin Population

Figure 3.2 shows that the AluyMICB primer pair indicates the presence of the Alu insertion with a large band at 664bp and its absence with a small band at 502bp. AluyMICB was the most heterozygous of the four POALINs used, with 10 heterozygous individuals. To determine the frequency of heterozygotes, the gel photographs of AluyMICB (Figure 3.2) and AluyTF (Figure 3.3) were consulted. In Figure 3.2 the separation of the bands is very difficult to distinguish and an accurate determination of alleles could not be made. The products from the experiments completed in Dubai, UAE, were therefore run a second time by electrophoresis at the University of Western Australia laboratory (see Appendix). Due to unforeseen problems during transport little of the product remained, but it was enough to provide more information about the heterozygous bands that were indistinguishable in the first test.

Figure 3.3 shows that the AluyTF primer pair indicates the presence of the Alu insertion with a large band at 710bp and its absence with a small band at 422bp. Of the Bedouin individuals analysed, 6 were heterozygous and the rest showed only the small band for the absence of the insertion.

Figure 3.1: The human MHC is a 4 Mb region located on the short arm of chromosome 6 (6p21). It is composed of three subregions,

class I, class II, and class III. The class I region is located within a 2000 kilobase (kb) region constituting the telomeric

half of the human MHC. Above is the map of the location and distribution of the four polymorphic Alu insertions

(AluyMICB, AluyTF, AluyHJ and AluyHF), HLA class I loci and related genes within and between the beta, kappa and

alpha blocks of the MHC Class I region (adapted from Dunn, 2005).

[Figure 3.1 map labels. POALINs: AluyMICB, AluyTF, AluyHJ, AluyHF. Blocks: β block, κ block, α block, drawn from centromeric to telomeric. Genes labelled: BAT1, HLA-C, MICA, HLA-B, MICB, CDSN, DDR1, FLOT1, GNL1, HLA-E, MICC, HLA-30, HLA-92, TRIM26, TRIM31, HLA-J, MICD, HLA-A, MICF, HLA-G, MICG, MICE, HLA-F.]

The first well contained the Caucasian control sample for the large band, which shows that the primer pair is capable of amplifying that allele. The bands also get thicker further down the gel, which could be because the Bedouin DNA samples were extracted using a kit and diluted to an estimated 20ng/μl, or because too much sample (10μl) was loaded onto the gel.

The AluyHJ primer pair indicates the presence of the Alu insertion with a large band at 501bp and its absence with a small band at 163bp (Figure 3.4). The Bedouin individuals tested showed only the absence of the Alu insertion. The Caucasian control sample used for the large band, as well as two of the Bedouin samples, showed no result. To make sure that the AluyHJ primers were working, all four primer pairs were tested at the University of Western Australia laboratory, using Busselton Caucasian samples, to see whether they could amplify all three genotypes: large homozygous, small homozygous and heterozygous. Figure 3.6 shows that the AluyHJ primers were able to amplify all three genotypes.

The AluyHF primer pair indicates the presence of the Alu insertion with a band at 605bp and its absence with a band at 458bp. The Caucasian control in well 1 was thought to represent a large band sample, but showed a small band. During tests in Perth with the Caucasian samples, the AluyHF primers had been giving results, but were amplifying at a larger size than expected. In Dubai it was determined that the sequence of the primer had been incorrect, and the sample used as a control for the large band no longer amplified as having the Alu insertion. The sequence of the AluyHF primer was corrected and new primers were ordered. As with AluyHJ, all Bedouin individuals tested showed only a small band for the absence of the Alu insertion. Six individuals showed no result, so the primers were retested in Perth for accuracy. Figure 3.6 shows that the primers were able to amplify large homozygous, small homozygous and heterozygous individuals, confirming that the primers were working during the tests in Dubai.

[Figure 3.2 gel images: sample lanes 1 to 60 with molecular weight marker (MW) lanes; size labels 1000bp, 500bp, 664bp and 502bp.]

Figure 3.2: The gel photographic presentation of AluyMICB. AluyMICB is a biallelic primer that shows either the presence and/or absence of the Alu by a large or small band size. Products 2 and 3 were positive controls of Caucasian DNA. Product 8 represents the presence of the POALIN with a homozygous large (664bp) size. Samples 7, 11, 18, 31, 35 and 43 are heterozygous, having both the presence and absence of the POALINs, leaving the rest of the products to be homozygous for the absence of the POALIN with a homozygous small (502bp) size.

[Figure 3.3 gel images: sample lanes 1 to 80 with molecular weight marker (MW) lanes; size labels 1000bp, 500bp, 710bp and 422bp.]

Figure 3.3: The gel photographic presentation of AluyTF. AluyTF is a biallelic primer that shows either the presence and/or absence of the Alu by a large or small band size. Products 2 and 3 were positive controls of Caucasian DNA. Products 5, 33, 43, 53, 54 and 63 are heterozygous, having both the presence and absence of the POALINs, leaving the rest of the products to be homozygous for the absence of the POALIN with a homozygous small (422bp) size.

[Figure 3.4 gel images: sample lanes 1 to 80 with molecular weight marker (MW) lanes; size labels 1000bp, 500bp, 501bp and 163bp.]

Figure 3.4: The gel photographic presentation of AluyHJ. AluyHJ is a biallelic primer that shows either the presence and/or absence of the Alu by a large or small band size. Products 2 and 3 were positive controls of Caucasian DNA; product 2 did not amplify any result. All products amplified homozygous for the absence of the POALIN with a homozygous small (163bp) size.

[Figure 3.5 gel images: sample lanes 1 to 60 with molecular weight marker (MW) lanes; size labels 1000bp, 500bp, 605bp and 458bp.]

Figure 3.5: The gel photographic presentation of AluyHF. AluyHF is a biallelic primer that shows either the presence and/or absence of the Alu by a large or small band size. Products 2 and 3 were positive controls of Caucasian DNA. Product one did not amplify a large (605bp) size. All products amplified homozygous for the absence of the POALIN with a homozygous small (458bp) size.

[Figure 3.6 gel image: molecular weight marker (MW) lanes with the 1000bp and 500bp bands marked, and lanes for AluyMICB (664bp, 502bp, 664 + 502bp), AluyTF (710bp, 422bp, 710 + 422bp), AluyHJ (501bp, 163bp, 501 + 163bp) and AluyHF (605bp, 458bp, 605 + 458bp).]

Figure 3.6: Gel photographic presentation of the MHC Class I POALINs. The PCR products for the presence and/or absence of the respective POALINs are visually distinguishable. A marker (MW) control with known sizes (sizes shown on the right) was used for the gel and the columns represent individual PCR products. The larger PCR product size for each POALIN represents the presence and the smaller size represents the absence of the POALIN. Using Caucasian samples, large homozygous, small homozygous and heterozygous genotypes are represented for each POALIN, verifying the ability of each primer pair to produce the wanted products.

3.3 Hardy-Weinberg Equilibrium

The Hardy-Weinberg equilibrium predicts that under stable conditions, after a generation of random mating, genotype frequencies throughout a population at a specific gene locus become fixed at a specific equilibrium value. These values can be defined as a function of the allele frequencies at that locus. The entire principle is based on Mendelian genetics.

For a single locus with two alleles (A and a) that have allele frequencies of p and q, the frequency of genotype AA will be $p^2$, the frequency of genotype Aa will be $2pq$, and the frequency of genotype aa will be $q^2$. The Hardy-Weinberg model consists of two equations, one for the allele frequencies and one for the genotype frequencies, which are the foundation of population genetics: $p + q = 1$ and $p^2 + 2pq + q^2 = 1$. Each genotype has a genotypic frequency, and the sum of all genotypic frequencies in the population must add up to 1, which represents all the individuals in the specific population. Through this equation, a population can be examined as being at genetic equilibrium or not.

Using the raw data collected, the allele frequency for a gene locus is determined by observing the population. Each AA individual has two copies of the A allele, heterozygous individuals have one copy of each allele (A and a), and aa individuals have two copies of the a allele. The allelic frequency was calculated by dividing the number of A or a alleles by the total number of alleles in the population and ensuring that $p + q$ equalled 1. These are the observed frequencies.

If the population were in Hardy-Weinberg equilibrium, it would be expected that the genotype frequencies for AA, Aa and aa would be $p^2$, $2pq$ and $q^2$, and that the genotype frequencies add up to 1.

In calculating the Hardy-Weinberg equilibrium only the AluyMICB and AluyTF Bedouin frequencies could be taken into consideration, as both AluyHJ and AluyHF showed no variation in the observed population and were treated as deviating from Hardy-Weinberg equilibrium. Even though the POALINs are found to be in linkage disequilibrium, individual POALINs can still deviate from Hardy-Weinberg equilibrium. The lack of variation in this case can be attributed to the small sample that was collected for the study, but may also be due to the consanguineous marriages which are still part of the Bedouin culture.

For AluyMICB the observed frequencies were 0.850 for allele*1 and 0.150 for allele*2, with a heterozygosity of 0.255. Under Hardy-Weinberg equilibrium the expected values were 0.850 for allele*1, 0.150 for allele*2 and 0.255 for the heterozygotes.

For AluyTF the observed frequencies were 0.944 for allele*1 and 0.056 for allele*2, with a heterozygosity of 0.106. Under Hardy-Weinberg equilibrium the expected values were 0.945 for allele*1, 0.054 for allele*2 and 0.101 for the heterozygotes.

AluyHJ and AluyHF frequencies were not taken into consideration for the Hardy-Weinberg equilibrium as they showed no variation in genotype.
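One common way to compare observed and expected genotype counts is a chi-square goodness-of-fit test with 1 degree of freedom. The sketch below is offered as an illustration of that general approach and is not necessarily the calculation performed for this thesis. The function name `hw_goodness_of_fit` is hypothetical; the genotype counts are those listed in Table 3.1.

```python
import math


def hw_goodness_of_fit(n_11, n_12, n_22):
    """Chi-square goodness of fit to Hardy-Weinberg proportions (biallelic locus).

    Returns (p, expected_counts, chi2, p_value); the p-value assumes
    1 degree of freedom (3 genotype classes - 1 - 1 estimated allele frequency).
    """
    n = n_11 + n_12 + n_22
    p = (2 * n_11 + n_12) / (2 * n)  # allele*1 frequency by gene counting
    q = 1.0 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_11, n_12, n_22]
    # Classes with an expected count of zero (monomorphic loci) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, expected, chi2, p_value


for locus, genotype_counts in (("AluyMICB", (29, 10, 1)), ("AluyTF", (48, 6, 0))):
    p, expected, chi2, p_value = hw_goodness_of_fit(*genotype_counts)
    print(locus, round(p, 3), [round(e, 1) for e in expected], round(chi2, 4))
```

Two cautions apply. The expected count for the 2,2 genotype is below 1 at both loci, so the chi-square approximation is rough at this sample size. For a monomorphic locus such as AluyHJ or AluyHF, the statistic is zero by construction and the test carries no information.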

The observed genotypes and allele frequencies of the four POALINs are listed in Table 3.1. The most frequent insertion allele was AluyMICB*2 (0.150), followed by AluyTF*2 (0.056). AluyHJ and AluyHF both showed only a small band, representing the absence of the Alu insertion, and showed no other variation.

Table 3.1: Observed genotypes, allele frequencies, and heterozygosity for AluyMICB, AluyTF, AluyHJ and AluyHF in a Middle Eastern Bedouin Population

Aluy Loci  n    Genotypes^a        Allele Frequencies    Heterozygosity (H)
                1,1   1,2   2,2    allele*1  allele*2
AluyMICB   40   29    10    1      0.850     0.150       0.255
AluyTF     54   48    6     0      0.944     0.056       0.106
AluyHJ     48   48    0     0      1.000     N/A         N/A
AluyHF     42   42    0     0      1.000     N/A         N/A

a Genotypes: 1,1 homozygote absent; 1,2 heterozygote and 2,2 homozygote present

Table 3.2 a-d shows the insertion frequencies of the four MHC POALINs in the Middle Eastern Bedouin compared with other populations previously studied using the same 4 POALINs. The AluyMICB insertion frequency in the Middle Eastern Bedouin (0.150) is similar to the Australian Caucasian frequency (0.157); it is above the frequency for North Eastern Thai (0.117) and below that for the Malaysian Chinese population (0.170). At 0.150, the Middle Eastern Bedouin AluyMICB frequency is clearly separated from the lower AluyMICB frequencies of the African populations that have been previously studied. Similarly, the Middle Eastern Bedouin AluyTF frequency (0.056) lies between the Malaysian Chinese (0.040) and the North Eastern Thai (0.086) frequencies and is comparable to the South African Sekele San frequency (0.034) and the Australian Caucasian frequency (0.107).

AluyHJ and AluyHF showed no variation in the sample that was studied, with every individual's DNA bands representing the absence of the two POALINs. A lack of variation can be due to many factors. One is the size of the sample that was tested: a smaller sample gives the study less opportunity to capture variation. In the case of the Middle Eastern Bedouin, it may also be because consanguineous marriages are still a customary part of the culture and the lack of random mating has lowered the chance of variation in the particular POALINs studied. No comparison could be made for these two markers.

For AluyMICB the probability value (p-value) calculated for the comparison between the Australian Caucasian and the Middle Eastern Bedouin populations was 0.1697. Although the allele frequencies of the two populations are similar, this is the lowest p-value of the AluyMICB comparisons and suggests that the AluyMICB frequencies of the two populations are not identical, although 0.1697 is above the conventional 0.05 threshold for rejecting the null hypothesis. The p-values for the Malaysian Chinese comparison (0.9691) and the North Eastern Thai comparison (0.9436) show a close relationship to the Middle Eastern Bedouin. Of the African populations, the South African !Kung San (0.9099) and South African Sekele San (0.8136) also show a close relationship to the Middle Eastern Bedouin population.

Table 3.2a: Allele frequency comparison for AluyMICB

Marker: AluyMICB. Description: polymorphic insertion consisting of 2 alleles (502bp = AluyMICB*1 and 664bp = AluyMICB*2).

Race                               Frequency                    HW^b  Reference           χ²      P-Value
                                   allele*1  allele*2  H^a
Malaysian Chinese                  0.830     0.170     0.282    Yes   Dunn et al, (2007)  0.0015  0.9691
North Eastern Thai                 0.883     0.117     0.207    Yes   Dunn et al, (2005)  0.0050  0.9436
Mongolian Khalkh                   0.622     0.378     0.470    No    Dunn, (2005)        0.1338  0.7145
South African South Eastern Bantu  0.970     0.030     0.058    Yes   Dunn, (2005)        0.0879  0.7669
South African Sekele San           0.950     0.050     0.095    Yes   Dunn, (2005)        0.0556  0.8136
South African !Kung San            0.964     0.036     0.069    Yes   Dunn, (2005)        0.0128  0.9099
Australian Caucasian               0.843     0.157     0.265    Yes   Dunn et al, (2002)  1.8850  0.1697
Middle Eastern Bedouin             0.850     0.150     0.255    Yes   This Study          N/A     N/A

a Heterozygosity
b Hardy Weinberg Formula

Table 3.2b: Allele frequency comparison for AluyTF

Marker: AluyTF. Description: polymorphic insertion consisting of 2 alleles (422bp = AluyTF*1 and 710bp = AluyTF*2).

Race                               Frequency                    HW    Reference           χ²      P-Value
                                   allele*1  allele*2  H^a
Malaysian Chinese                  0.960     0.040     0.077    Yes   Dunn et al, (2007)  0.0028  0.9578
North Eastern Thai                 0.914     0.086     0.152    Yes   Dunn et al, (2005)  0.0068  0.9343
Mongolian Khalkh                   0.780     0.220     0.343    Yes   Dunn, (2005)        0.1130  0.7368
South African South Eastern Bantu  0.900     0.100     0.180    Yes   Dunn, (2005)        0.0135  0.9075
South African Sekele San           0.966     0.034     0.066    Yes   Dunn, (2005)        0.0056  0.9403
South African !Kung San            0.762     0.238     0.363    Yes   Dunn, (2005)        0.1705  0.6797
Australian Caucasian               0.893     0.107     0.198    Yes   Dunn et al, (2002)  0.0174  0.8951
Middle Eastern Bedouin             0.944     0.056     0.106    Yes   This Study          N/A     N/A

a Heterozygosity

Table 3.2c: Allele frequency comparison for AluyHJ

Marker: AluyHJ. Description: polymorphic insertion consisting of 2 alleles (163bp = AluyHJ*1 and 501bp = AluyHJ*2).

Race                               Frequency                    HW    Reference           χ²      P-Value
                                   allele*1  allele*2  H^a
Malaysian Chinese                  0.700     0.300     0.420    Yes   Dunn et al, (2007)  0.3530  0.5524
North Eastern Thai                 0.708     0.292     0.413    Yes   Dunn et al, (2005)  0.3420  0.5887
Mongolian Khalkh                   0.707     0.293     0.414    Yes   Dunn, (2005)        0.3430  0.5581
South African South Eastern Bantu  0.930     0.070     0.130    Yes   Dunn, (2005)        0.0730  0.7870
South African Sekele San           0.950     0.050     0.095    No    Dunn, (2005)        0.0510  0.8213
South African !Kung San            0.893     0.107     0.191    Yes   Dunn, (2005)        0.1130  0.7368
Australian Caucasian               0.927     0.073     0.358    Yes   Dunn et al, (2002)  0.0760  0.7828
Middle Eastern Bedouin             1.000     N/A       N/A      No    This Study          N/A     N/A

a Heterozygosity

Table 3.2d: Allele frequency comparison for AluyHF

Marker: AluyHF. Description: polymorphic insertion consisting of 2 alleles (458bp = AluyHF*1 and 605bp = AluyHF*2).

Race                               Frequency                    HW    Reference           χ²      P-Value
                                   allele*1  allele*2  H^a
Malaysian Chinese                  0.970     0.030     0.058    Yes   Dunn et al, (2007)  0.0305  0.8614
North Eastern Thai                 0.982     0.018     0.035    Yes   Dunn et al, (2005)  0.0182  0.8927
Mongolian Khalkh                   0.902     0.098     0.177    Yes   Dunn, (2005)        0.1030  0.7483
South African South Eastern Bantu  0.910     0.090     0.164    Yes   Dunn, (2005)        0.0942  0.7589
South African Sekele San           0.917     0.083     0.152    Yes   Dunn, (2005)        0.0866  0.7685
South African !Kung San            0.940     0.060     0.113    Yes   Dunn, (2005)        0.0618  0.8037
Australian Caucasian               0.962     0.038     0.03     Yes   Dunn et al, (2002)  0.0387  0.8440
Middle Eastern Bedouin             1.000     N/A       N/A      No    This Study          N/A     N/A

a Heterozygosity

With AluyTF the p-values for the Malaysian Chinese (0.9578) and the South African Sekele San (0.9403) again show a close relationship to the Middle Eastern Bedouin. There is also still a strong relationship to the North Eastern Thai (0.9343), as seen with AluyMICB. However, the comparison differs in that the South African South Eastern Bantu (0.9075) are more comparable than the Australian Caucasian population (0.8951), unlike with AluyMICB.

Both AluyHJ and AluyHF, due to the lack of variation in the population, were not taken into consideration for the calculation of p-values and the comparison between the populations.

References

1. Abu-Rabia, A. (2002). Bedouin Century: Education and Development among the

Negev Tribes in the Twentieth Century. Berghahn Books: Israel

2. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, and Walter P. (2002)

Molecular Biology of the Cell, 4th

Ed. Garland Science: London.

3. Ania L, Manson E, C.A. Jones (2002) Cell Biology and Genetics. Elsevier Health

Sciences: London.

4. Antunez-de-Mayolo G, Antunez-de-Mayola A, Antunez-de-Mayolo P, Papiha SS,

Hammer M, Yunis JJ, Yunis EJ, Damodaran C, Martinez de Pancrobo M,

Caeiro JL, Puzyrev VP, Herrera RJ (2002). Phylogenetics of Worldwide

Human Populations as Determined by Polymorphic Alu Insertions.

Electrophoresis, 23: 3346-3356.

5. Anzai T, Shiina T, Kimura N, Yanagiya K, Kohara S, Shigenari A, Yamagata T,

Kulski JK, Naruse TK, Fujimori Y, Fukuzumi Y, Yamazaki M, Tashiro H,

Iwamoto C, Umehara Y, Imanishi T, Meyer A, Ikeo K, Gojobori T, Bahram S,

Inoko H. (2003). Comparative Sequencing of Human and Chimpanzee MHC

class I Regions Unveils Insertions/ Deletions as the Major Path to Genomic

Divergence. Proceedings of the National Academy of Science, 100: 7708-

7713.

6. Aoki, T., Satoh, K., Imamura, T., Watabe, H. (2004). A New Method for

Detection Single Nucleotide Polymorphism Using GFP-Display. Journal of

Biochemistry and Biophysical Methods, 60: 61-67.

7. Arcos-Burgos, M., and Muenke, M. (2002). Genetics of Population Isolates.

Clinical Genetics, 61: 233-247.


8. Arcot SS, Adamson AW, Lamerdin JE, Kanagy B, Deininger PL, Carrano AV,

Batzer MA. (1996). Alu Fossil Relics – Distribution and Insertion

Polymorphism. Genome Research, 6: 1084-1092.

9. Aronson J.D. (2007) Genetic Witness: Science, Law and Controversy in the

Making of DNA Profiling. Rutgers University Press: London.

10. Bamshad MJ, Wooding S, Watkins WS, Ostler CT, Batzer MA, Jorde LB (2003).

Human Population Genetic Structure and Inference of Group Membership.

American Journal of Human Genetics, 72: 578-589.

11. Batzer MA, Deininger PL (2002). Alu Repeats and Human Genomic Diversity.

Nature Reviews Genetics, 3: 370-379.

12. Batzer MA, Stoneking, M, Alegria-Hartman M, Bazan H, Kass DH, Shaikh TH,

Novick GE, Ioannou PA, Scheer WD, Herrera RJ, Deininger PL. (1994).

African Origin of Human-Specific Polymorphic Alu Insertions. Proceedings

of the National Academy of Science, 91:12288-12292.

13. Batzer, M. A., S. S. Arcot, et al. (1996). Genetic variation of recent Alu insertions

in human populations. Journal of Molecular Evolution, 42(1): 22-9.

14. Begovich, AB., McClure, GR., Suraj, VC., Helmuth, RC., Fildes, N., Bugawan,

TL., Erlich, HA., Klitz, W. (1992). Polymorphism, Recombination, and

Linkage Disequilibrium within the HLA Class II Region. Journal of

Immunology, 148(1): 249-258.

15. Benham, C. J., and Mielke, S. P. (2005). DNA Mechanics. Annual Review of

Biomedical Engineering, 7: 21-53.

16. Buffery, C., Burridge, F., Greenhalgh, M., Jones, S., and Willott, G. (1991).

Allele Frequency Distributions of Four Variable Number Tandem Repeat

(VNTR) Loci in the London Area. Forensic Science International, 52: 53-64.

17. Butler, J. M. (2005). Forensic DNA Typing. Elsevier Academic Press: London.

18. Collins, F. S. (2006). No Longer Just Looking under the Lamppost. American

Journal of Human Genetics, 79(3): 421-426.

19. Comas D, Plaza S, Calafell F, Sajantila A, Bertranpetit J (2001). Recent Insertion

of an Alu Element within a Polymorphic Human-Specific Alu Insertion.

Molecular Biology and Evolution, 18(1): 85-88.

20. Comas D, Schmid H, Braeuer S, Flaiz C, Busquets A, Calafell F, Bertranpetit J,

Scheil HG, Huckenbeck W, Efremovska L, Schmidt H. (2004). Alu Insertion

Polymorphisms in the Balkans and the Origins of the Aromuns. Annals of

Human Genetics, 68:120-127.

21. Cordaux, R., Srikanta, D., Lee, J., Stoneking M., and Batzer, MA. (2007). In

Search of Polymorphic Alu Insertions with Restricted Geographic

Distribution. Genomics, 90(1): 154-158.

22. Dawkins R, Leelayuwat C, Gaudieri S, Tay G, Hui J, Cattley S, Martinez P,

Kulski JK (1999). Genomics of the Major Histocompatibility Complex:

Haplotypes, Duplication, Retroviruses and Disease. Immunological Reviews,

167: 275-304.

23. Degli-Esposti MA, Leelayuwat C, Daly CN, Carcassi C, Contu L, Versluis LF,

Tilanus MG, Dawkins RL. (1992). Ancestral Haplotypes: Conserved

Population MHC Haplotypes. Human Immunology, 34: 242-252.

24. Donaldson CS, Crapanzano JP, Watson JC, Levine EA, Batzer MA. (2002)

PROGINS Alu Insertions and Human Genomic Diversity. Mutation Research,

501: 137-141.

25. Dunn DS, Naruse T, Inoko H, Kulski JK. (2004). The Association Between HLA-

A Alleles and Young Alu Dimorphisms Near the HLA-J, -H, and -F Genes in

Workshop Cell Lines and Japanese and Australian Populations. Journal of

Molecular Evolution, 55 (6): 718-726.

26. Dunn DS, Romphruk AV, Leelayuwat C, Bellgard M, Kulski JK (2005).

Polymorphic Alu Insertions and Their Associations with MHC Class I Alleles

and Haplotypes in the Northeastern Thais. Annals of Human Genetics, 69:

364-372.

27. Dunn DS (2005). Studies on Polymorphic Alu Insertions and Genomic Diversity

within the Major Histocompatibility Complex. PhD Thesis: University of

Western Australia.

28. Dunn DS, Inoko H, Kulski JK (2006). The Association Between Non-Melanoma

Skin Cancer and a Young Dimorphic Alu Element within the Major

Histocompatibility Complex Class I Genomic Region. Tissue Antigens,

68:127-134.

29. Dunn DS, Choy MK, Phipps ME, Kulski JK. (2007). The Distribution of Major

Histocompatibility Complex Class I Polymorphic Alu Insertions and their

Associations with HLA Alleles in a Chinese Population from Malaysia.

Tissue Antigens, 70: 136-143.

30. Efstratiadis AA, Posakony JJ, et al. (1980). The Structure and Evolution of the

Human Beta-Globin Gene Family. Cell, 21(3): 653-668.

31. Gagneux P, Varki A (2001). Genetic Differences Between Humans and Great

Apes. Molecular Phylogenetics and Evolution, 18(1): 2-13

32. Gaudieri S, Leelayuwat C, Tay GK, Townend DC, Mullberg J, Cosman D,

Dawkins RL (1997). Allelic and Interlocus Comparison of the PERB11

Multigene Family in the MHC. Immunogenetics, 45: 209-216.

33. Goodman F.R., Scambler P.J. (2001) Human Hox Gene Mutations. Clinical

Genetics, 59: 1-11.

34. Griffiths, et al (2005). Introduction to Genetic Analysis. W. H. Freeman and

Company: New York.

35. Hsien, L. (2006). A Heartbreaking Story about Genetics. The New York

Times: Science News.

36. Hammer MF, Spurdle AB, Karafet T, Bonner MR, Wood ET, Novelletto A,

Malaspina D, Mitchell RJ, Horai S, Jenkins T, Zegura SL (1997). The

Geographic Distribution of Human Y Chromosome Variation. Genetics, 145:

787-805.

37. Joseph, D. M. D. M. (1995). The Human Genome Project and Biology Education.

Bioscience, 45(11): 786-791.

38. Jurka, J, Krnjajic M, Kapitonov VV, Stenger JE, Kokhanyy O (2002). Active Alu

Elements Are Passed Primarily through Paternal Germlines. Theoretical

Population Biology, 61: 519-530.

39. Kamahori M, Harada K and Kambara H. (2002) A New Single Nucleotide

Polymorphisms Typing Method and Device by Bioluminometric Assay

Coupled with a Photodiode Array. Measurement Science and Technology, 13:

1779-1785.

40. Kass, D., Jamison, N., Mayberry, MM., and Tecle, E. (2006). Identification of a

unique Alu-based Polymorphism and its use in Human Population Studies.

Gene, 390(1-2): 146-152.

41. Kirk, BW., Feinsod, M., Favis, R., Kliman, RM., and Barany, F. (2002). Single

Nucleotide Polymorphism Seeking Long Term Association with Complex

Disease. Nucleic Acids Research, 30(15): 3295-3311.

42. Kulski JK, Shiina T, Anzai T, Kohara S, Inoko H. (2002). Comparative Genomic

Analysis of the MHC: The Evolution of Class I Duplication Blocks, Diversity

and Complexity from Shark to Man. Immunological Reviews, 190: 95-122.

43. Kulski JK, Dunn DS (2005) Polymorphic Alu Insertions within the Major

Histocompatibility Complex Class I Region: A Brief Review. Cytogenetic and

Genome Research, 110: 193-202.

44. Leelayuwat, C. et al (1994). A New Polymorphic and Multicopy MHC Gene

Family Related to Nonmammalian Class I. Immunogenetics, 40: 339-351

45. Leib-Mösch C and Seifarth, W (1996) Evolution and Biological Significance of

Human Retroelements. Virus Genes, 11: 133-145.

46. Losleben, B. (2003). The Bedouin of the Middle East. Lerner Publication:

London.

47. Martins, Sandra; Calafell, Francesc; Gaspar, Claudia; Wong, Virginia C. N.;

Silveira, Isabel; Nicholson, Garth A.; Brunt, Ewout R.; Tranebjaerg, Lisbeth;

Stevanin, Giovanni; Hsieh, Mingli; Soong, Bing-wen; Loureiro, Leal; Dürr,

Alexandra; Tsuji, Shoji; Watanabe, Mitsunori; Jardim, Laura B.; Giunti,

Paola; Riess, Olaf; Ranum, Laura P. W.; Brice, Alexis; Rouleau, Guy A.;

Coutinho, Paula; Amorim, António; Sequeiros, Jorge. (2007). Asian Origin

for the Worldwide-Spread Mutational Event in Machado-Joseph Disease.

Archives of Neurology, 64(10): 1502-1508.

48. Marin MLC, Savioli CR, Yamamoto JH, Jorge K, Goldberg AC. (2004). MICA

Polymorphism in a sample of the Sao Paulo Population, Brazil. European

Journal of Immunogenetics, 31(2): 63-71.

49. Mavoungou, E., Sall, A., Poaty-Mavoungou, V., Toure, FS., Yaba, P., Delicat, A.,

and Lansoud-Soukate, J. (1999). Alloreactivity and Association of Human

Natural Killer Cells with the Major Histocompatibility Complex. Clinical and

Diagnostic Laboratory Immunology, 6(2): 254-259.

50. Middleton DD, Williams FF, Meenagh A, Daar AS, Gorodezky C, Hammond M,

Nascimento E, Briceno I, Perez MP. (2000). Analysis of the Distribution of

HLA-A Alleles in Populations from Five Continents. Human Immunology,

61(10): 1048-1052.

51. Mooser V, Mancini F.P., Bopp S, Pethö-Schramm A, Guerra P, Boerwinkle E,

Müller H.J, and Hobbs H.H. (1994). Sequence Polymorphisms in the Apo(a)

Gene Associated with Specific Levels of Lp(a) in Plasmas. Human Molecular

Genetics, 4(2): 173-181.

52. Muchmore, E.A. (2001). Chimpanzee Models for Human Disease and

Immunobiology. Immunological Reviews, 183: 86-93.

53. Mungall AJ, Palmer SA, Sims SK, Edwards CA, Ashurst JL, Wilming L, Jones

MC, Horton R, Hunt SE, Scott CE, Gilbert JG, Clamp ME, Bethel G, Milne S,

Ainscough R, Almeida JP, Ambrose TD, Ashwell RI, Babbage AK,

Bagguley CI, et al (2003). The DNA Sequence and Analysis of Human

Chromosome 6. Nature, 425: 805-811.

54. Nasidze I, Risch GM, Robichaux M, Sherry ST, Batzer MA, Stoneking M (2001).

Alu insertion Polymorphism and the Genetic Structure of Human Population

from the Caucasus. European Journal of Human Genetics, 9: 267-292

55. Nusbaum, C., Zody, M. C., et al. (2005). DNA Sequence and Analysis of

Human Chromosome 18. Nature, 437(7058): 551.

56. Nei M (1972). Genetic Distance Between Populations. American Naturalist, 106:

283-292.

57. Okada N (1991). SINEs. Current Opinion in Genetics and Development, 1: 498-504.

58. Okada N, Hamada M, Ogiwara I, Ohshima K (1997). SINEs and LINEs Share

Common 3’ Sequence: A review. Gene, 205: 229-243

59. Olby, R. (2003). Quiet Debut for the Double Helix. Nature, 421: 402-405.

60. Paabo, S. (2001). Genomics and Society. The Human Genome and Our View of

Ourselves. Science, 291(5507): 1219-1220.

61. Patrinos A, Drell DW (1997) Introducing the Human Genome Project. Its

Relevance, Triumphs, and Challenges. The Judges Journal,36 (3).

62. Perna NT, Batzer MA, Deininger PL, Stoneking M (1992). Alu Insertion

Polymorphism: A New Type of marker for Human Population Studies.

Human Biology, 164: 641-648.

63. Perutz, MF., Randall, JT., Thomson, L., Wilkins, MH., and Watson, JD. (1969).

DNA Helix. Science, 164: 1537-1539.

64. Price GR (1971) Extension of the Hardy-Weinberg Law to Assortative Mating.

Annals of Human Genetics, 34: 455-458.

65. Rajsbaum R, Fici D, Boggs DA, Fraser PA, Flores-Villanueva PO, Awdeh ZL

(2002). Linkage Disequilibrium Between HLA-DPB1 Alleles and Retinoid X

Receptor β Haplotypes. Human Immunology, 63(9): 771-778.

66. Robbins, R. R. J. (1992). Challenges in the human genome project. IEEE

Engineering in Medicine and Biology Magazine, 11(1): 25-34.

67. Romualdi, C. et al. (2002). Patterns of human diversity, within and among

continents, inferred from biallelic DNA polymorphisms. Genome Research.

12: 602−612.

68. Schneider, PM. (1997). Basic Issues in Forensic DNA typing. Forensic Science

International, 88(1): 17-22.

69. Sheffield VC, Stone EM, Carmi R. (1998). Use of Isolated Inbred Human

Populations for Identification of Disease Genes. Trends in Genetics, 14: 391-

396.

70. Shen MR, Batzer MA, Deininger PL (1991). Evolution of the Master Alu Genes.

Journal of Molecular Evolution, 33: 311-320.

71. Singer M, Berg P. (1997). Exploring Genetic Mechanisms. University Science

Books: New York.

72. Skaug HJ. (2001). Allele-Sharing Methods for Estimation of Population Size.

Biometrics, 57: 750-756.

73. Smit AF (1996). The Origin of Interspersed Repeats in the Human Genome.

Current Opinion in Genetics and Development, 6: 743-748.

74. Starr, C (2005). Biology: Concepts and Applications. Thomson Books: New

York.

75. Stumpf MP (2002). Haplotype Diversity and the Block Structure of Linkage

Disequilibrium. Trends in Genetics, 18: 226-228.

76. Takasu M, Hayashi R, Maruya E, Ota M, Imura K, Kougo K, Kobayashi C, Saji

H, Ishikawa Y, Asai T, Tokunaga K (2007). Deletion of Entire HLA-A Gene

Accompanied by an Insertion of a Retrotransposon. Tissue Antigens, 70: 144-

150.

77. Ting, JC., Ye, Y., Thomas, GH., Ruczinski, I., and Pevsner, J. (2006). Analysis

and Visualisation of Chromosomal Abnormalities in SNP Data with SNPscan.

BMC Bioinformatics, 7: 25.

78. Triggs, CM., and Buckleton, JS. (2002). Logical Implications of Applying the

Principles of Population Genetics to the Interpretation of DNA Profiling

Evidence. Forensic Science International, 128(3): 108-114.

79. Van Ommen, G-J.B. (2002). The Human Genome Project and the Future of

Diagnostics, Treatment and Prevention. Journal of Inherited Metabolic

Disease, 25: 183-188.

80. Wakeley J. (2001). The Discovery of Single-Nucleotide Polymorphisms: And

Inferences about Human Demographic History. American Journal of Human

Genetics, 69(6): 1332-1347.

81. Walsh, EC, Mather KA, Schaffner SF, Farwell L, Daly MJ, Patterson N, Cullen

M, Carrington M, Bugawan TL, Erlich H, Campbell J, Barrett J, Miller K,

Thomson G, Lander ES, Rioux JD (2003). American Journal of Human

Genetics, 73: 580-590.

82. Watson JD, Cook-Deegan RM (1991). Origins of the Human Genome Project.

FASEB Journal, 5: 8-11.

83. Weber, JL., David, D., Heil, J., Fan, Y., Zhao, C., and Marth, G. (2002). Human

Diallelic Insertion/Deletion Polymorphisms. American Journal of Human

Genetics, 71: 854-862.

84. Xiao FX, Yang JF, Cassiman JJ, Decorte R (2002). Diversity at Eight

Polymorphic Alu Insertion Loci in Chinese Population Shows Evidence for

European Admixture in an Ethnic Minority Population from Northwest China.

Human Biology, 74: 555-568.

85. Ye, J., Parra, J.E., Sosnoski, D.M., Hiester, K., Underhill, P.A. and Shriver, M.D.

Melting curve SNP (McSNP) genotyping: a useful approach for diallelic

genotyping in forensic science. Journal of Forensic Sciences, 47: 593-600.

86. Yunis EJ, Larsen CE, Fernandez-Vina M, Awdeh ZL, Romero T, Hansen JA,

Alper CA. (2003). Inheritable Variation Sizes of DNA Stretches in the

Human MHC: Conserved Extended Haplotype and their Fragments or Blocks.

Tissue Antigens, 62: 1-20.

87. Zyphur, MJ. (2006). On the Complexity of Race. American Psychologist: 179-

180.

Appendix A

Figure 3.7: The gel photographic representation of the second electrophoresis run of the

AluyMICB product. Run in order to check for the accurate number of heterozygous alleles.

[Gel image not reproduced. Labels recovered from the image: lanes 1 to 60; molecular

weight (MW) marker bands at 500bp and 1000bp; product bands at 502bp and 664bp.]
