Human Molecular Genetics – Analysis of Mitochondrial DNA Sequence Data
Genetics with Laboratory
April 5, 2020
Abstract
This experiment extracted mtDNA from a human sample and then isolated and amplified hypervariable region 1 of the D-loop by PCR. The amplified mtDNA was sequenced and analyzed by comparing nucleotide substitutions between humans and chimpanzees, humans and Neanderthals, and humans and Denisovans. The average pairwise sequence divergence among the humans in this sample was KHH=0.01681 for the D-loop region. The displacement model of human evolution is the most consistent with the data, because the sampled humans were estimated to share a common mitochondrial ancestor about 465,991.9 years ago and because modern humans are more closely related to one another than to Neanderthals (KHH=0.01681 vs KHN=0.06378).
Introduction
The mitochondrial genome has been widely used as a tool for analyzing molecular differences between species in order to construct phylogenies (Bronstein et al., 2018). The double-stranded circular genome of the mitochondria has been inherited maternally throughout millions of years of animal evolution, and since the mitochondria provide energy to cells, mitochondrial DNA (mtDNA) is found in virtually every cell type (Sharma et al., 2005). Mitochondrial DNA is useful in research because of its high mutation rate, which results from a deficient mismatch repair system, a lack of histone proteins, and damage from oxygen radicals produced during ATP synthesis (Sharma et al., 2005). This experiment used mitochondrial DNA from hypervariable region 1 of the D-loop of modern humans to construct a phylogeny among humans, Neanderthals, Denisovans, and the most recent common ancestor of modern humans. The D-loop, or displacement loop, is of particular interest because it can be used as a genetic marker in molecular analysis techniques (Bronstein et al., 2018). Since the D-loop is the non-coding region of mtDNA (Sharma et al., 2005), mutations that occur here are much less likely to have a deleterious effect that reduces the fitness of the organism. Thus, these mutations are more likely to persist and be inherited across generations.
In order to analyze the molecular differences between human samples and related hominids, the mtDNA must first be extracted, and then the target sequence must be isolated and amplified by PCR. The polymerase chain reaction (PCR) is an enzymatic reaction that targets a specific fragment of a DNA sequence and amplifies it for use in electrophoresis or DNA sequencing (Garibyan and Avashia, 2014). PCR occurs in three steps within a thermal cycler: denaturation, annealing, and extension (Garibyan and Avashia, 2014). The DNA is denatured by raising the temperature, which separates the double helix into two complementary single strands (Garibyan and Avashia, 2014). The temperature is then lowered so that annealing can occur, in which specific primers bind complementary and antiparallel to the DNA sequences flanking the region targeted for amplification (Garibyan and Avashia, 2014). It is important to select the correct primers so that the entire region is amplified on both strands of DNA. To accomplish this, both a forward and a reverse primer must be chosen, and each should bind at the 3’ end of the target sequence on one of the two complementary strands. Next, the temperature of the thermal cycler is raised again so that extension of the primers can occur (Garibyan and Avashia, 2014). Extension proceeds in the 5’ to 3’ direction from the primers as DNA polymerase adds nucleotides complementary to the target DNA sequence. Each time these three steps occur, the number of copies of the target sequence approximately doubles, so the cycle is usually repeated about 30 times to produce a very high yield (Garibyan and Avashia, 2014). The amplified target can then be sequenced with fluorescent labeling or with a thermocycler for cycle sequencing (Carlini, 2020).
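The doubling described above can be illustrated with a short calculation. This is a minimal sketch, not part of the lab protocol: the function name pcr_copies and the efficiency parameter are illustrative assumptions, with an efficiency of 1.0 standing for ideal doubling in every cycle.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle
    (perfect doubling). This parameter is a hypothetical illustration,
    not a value taken from the lab protocol.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Ideal doubling from a single template over 30 cycles.
    print(f"{pcr_copies(1, 30):.3e}")
    # Same run with an assumed 90% per-cycle efficiency.
    print(f"{pcr_copies(1, 30, efficiency=0.9):.3e}")
```

Under ideal doubling, a single template yields 2^30 (roughly one billion) copies after 30 cycles, which is why about 30 cycles are usually sufficient. Real reactions fall short of this as reagents are depleted.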
The instruments used for DNA sequencing are often capable of some forms of analysis, such as determining the number of nucleotide differences per site among a group of DNA samples. These values give the average proportional difference (Pd), but Pd does not account for the possibility that multiple substitutions took place at the same nucleotide site. The proportional difference is simply the number of differing sites divided by the total number of sites compared, so it counts only the observable nucleotide differences. As a result, Pd tends to underestimate the true amount of substitution, and it must be corrected to obtain the true proportional difference (Kn), which provides a better estimate of the actual number of nucleotide substitutions per site (Carlini, 2020). This value is extremely useful because it can be used to find the mutation rate (nucleotide substitutions per site per year), which in turn allows divergence times to be estimated from a calibrated reference point.
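The relationship between Pd and the corrected value can be sketched in code. The text does not name the correction that was applied; the sketch below assumes the Jukes-Cantor formula, K = -(3/4) ln(1 - (4/3) Pd), which reproduces the human versus Neanderthal value reported later in this report from its Pd. The function names and the toy sequences in the usage note are hypothetical, and gaps or ambiguous bases are not handled.

```python
import math


def proportional_difference(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which two sequences differ (Pd)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    mismatches = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return mismatches / len(seq_a)


def jukes_cantor(pd: float) -> float:
    """Corrected substitutions per site under the Jukes-Cantor model.

    Assumption: the source does not state which correction was used;
    Jukes-Cantor is chosen here because it matches the reported values.
    """
    if not 0.0 <= pd < 0.75:
        raise ValueError("Pd must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * pd)
```

For example, proportional_difference("ACGTACGTAC", "ACGTTCGTAC") gives 0.1 (one mismatch in ten sites), and jukes_cantor(0.06114) gives about 0.0638, in agreement with the reported KHN of 0.06378. The correction is small for closely related sequences and grows as Pd increases.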
Mutations occur naturally as a result of metabolic processes and repair errors, and they can also be induced by environmental factors such as electromagnetic radiation or exposure to carcinogenic chemicals. If the rate of mutation is known, a molecular clock can be assigned to an organism, meaning the amount of time it takes for a certain number of mutations to accumulate (Carlini, 2020). There are currently two competing models of human evolution: the multiregional model and the displacement model (Carlini, 2020). The multiregional model predicts that humans diverged from their most recent common ancestor 1.5 to 3 million years ago and that modern humans evolved from different hominid species (e.g., Europeans evolved from Neanderthals), which would mean that some human vs Neanderthal sequence comparisons are more similar than some human vs human comparisons (Carlini, 2020). The displacement model, on the other hand, predicts that humans diverged from their most recent common ancestor much more recently, only 200,000 to 500,000 years ago, and that modern humans are more similar to one another than to extinct hominid species such as Neanderthals or Denisovans (Carlini, 2020).
Materials & Methods
Genomic DNA was obtained from the human subjects by having each rinse their mouth with a saline solution for thirty seconds (Carlini, 2020). Then 1.5 ml of the rinse was transferred to a tube and centrifuged at maximum speed for five minutes so that the supernatant could be removed and discarded (Carlini, 2020). The remaining pellet was resuspended in a small amount of saline solution and transferred to a fresh microcentrifuge tube containing 100 µl of 10% Chelex solution (Carlini, 2020). After vortexing, the tubes were heated at 100 degrees Celsius for 10 minutes in a thermocycler and then spun in a microcentrifuge for one minute so that 30 µl of the DNA-containing supernatant could be transferred to a fresh tube (Carlini, 2020).
Next, the DNA sample was prepared for PCR amplification (Carlini, 2020). The primer/ddH2O mix was added to a tube containing a Ready-to-Go PCR Bead, which supplies Taq DNA polymerase, Tris-HCl, KCl, MgCl2, and all four dNTPs. The sequences of the forward and reverse PCR primers used to amplify the HVI region of the mitochondrial DNA were (Carlini, 2020):
HVIF15971: 5'-TTAACTCCACCATTAGCACC-3' [Forward Primer]
HVIR16410: 5'-GAGGATGGTGGTCAAGGGAC-3' [Reverse Primer]
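As a quick sanity check on the two primers listed above, their GC content and an approximate melting temperature can be computed. This is a minimal sketch and not part of the lab protocol: it uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligonucleotides, and the function names are hypothetical.

```python
PRIMERS = {
    "HVIF15971": "TTAACTCCACCATTAGCACC",  # forward primer from the protocol
    "HVIR16410": "GAGGATGGTGGTCAAGGGAC",  # reverse primer from the protocol
}


def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule.

    Assumption: this rule of thumb is used for illustration only; the
    protocol does not say how the annealing temperature was chosen.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"{gc_fraction(seq):.2f}", wallace_tm(seq))
```

By this estimate the forward primer is 20 nucleotides long with 45% GC and a Tm of about 58 degrees Celsius, and the reverse primer is 20 nucleotides long with 60% GC and a Tm of about 64 degrees Celsius.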
Then 2.5 µl of the prepared DNA sample was combined with the primer mix and PCR bead (Carlini, 2020). The sample was initially denatured in the thermal cycler for two minutes at 94 degrees Celsius and then run for 31 cycles at the following settings: denature for 30 seconds at 94 degrees Celsius, anneal for 30 seconds at 58 degrees Celsius, and extend for 30 seconds at [temperature not given] degrees Celsius. After the final cycle, the sample underwent a final extension for 6 minutes at 72 degrees Celsius (Carlini, 2020).
Once PCR was complete, the sample was loaded into an agarose gel along with a 1 Kb Plus DNA ladder (Carlini, 2020). Electrophoresis was run until the markers and DNA had traveled far enough down the gel for the stained bands to be visualized under UV light (Carlini, 2020).
The purified PCR product was used as the template for DNA sequencing (Carlini, 2020). The PCR product was added to a DNA sequencing master mix containing dNTPs, primers, thermostable DNA polymerase, ddH2O, and a reaction buffer (Carlini, 2020). This solution was mixed by vortexing and briefly centrifuged, and then 1.8 µl was added to each of four separate tubes containing 0.9 µl of ddATP, ddCTP, ddGTP, or ddTTP (Carlini, 2020). Each mixture of dideoxynucleotide, template DNA, and master mix was spun in a centrifuge and placed into a thermocycler for two minutes at 92 degrees Celsius, followed by 30 cycles of 92 degrees Celsius for 30 seconds, 53 degrees Celsius for 30 seconds, and 70 degrees Celsius for one minute (Carlini, 2020). The reaction products were then run on a capillary electrophoresis DNA sequencing instrument, which analyzed and sequenced the mitochondrial control region of the DNA samples (Carlini, 2020).
The DNA sequencing results gave the pairwise sequence divergences in the mitochondrial control region for the entire student sample (Carlini, 2020). The data also included pairwise comparisons between each student mtDNA sequence and those of Neanderthals, Denisovans, and chimpanzees. The individual divergences were summed and divided by the number of comparisons to calculate the average proportional divergence (Pd) for modern humans versus Neanderthals, modern humans versus Denisovans, modern humans versus chimpanzees, and humans versus humans. The average proportional divergences were then converted to the corrected true proportional difference (Kn), which gives the average number of nucleotide substitutions per site (Carlini, 2020).
Divergence estimates were then made using two calibration references. The first estimate, the time to the most recent common ancestor of living modern humans (TMRCA), was calibrated using the human-chimp divergence time of 6,000,000 years (Carlini, 2020). The Kn value for the human-chimp comparison (KHC) was divided by 6,000,000 years to obtain the rate of nucleotide substitution in substitutions per site per year (Carlini, 2020). The average number of substitutions per site for human versus human (KHH) was then divided by this nucleotide substitution rate to estimate the divergence time in years (Carlini, 2020). Next, the divergence time between Neanderthals and humans was calculated using a reference calibration of 200,000 years for the date of modern human divergence (Carlini, 2020). The rate of substitution was obtained by dividing KHH by 200,000 years, and KHN was then divided by this rate to obtain the divergence estimate in years (Carlini, 2020).
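The two calibrations described above reduce to two divisions each, which can be sketched as follows. This is an illustrative sketch rather than the worksheet used in the lab: the function names are hypothetical, and the K values are the rounded averages reported in this report (Table 2), so the outputs match the reported divergence times only to within rounding.

```python
def substitution_rate(k_calibration: float, calibration_years: float) -> float:
    """Substitutions per site per year, from a pair with a known divergence time."""
    return k_calibration / calibration_years


def divergence_time(k_pair: float, rate: float) -> float:
    """Estimated divergence time in years for a pair with corrected distance k_pair."""
    return k_pair / rate


# Rounded average substitutions per site reported in this report.
K_HH, K_HN, K_HD, K_HC = 0.01681, 0.06378, 0.06777, 0.21640

# Calibration 1: human-chimp divergence at 6,000,000 years.
rate_chimp = substitution_rate(K_HC, 6_000_000)
tmrca_humans = divergence_time(K_HH, rate_chimp)

# Calibration 2: modern human divergence at 200,000 years.
rate_human = substitution_rate(K_HH, 200_000)
t_neanderthal = divergence_time(K_HN, rate_human)
t_denisovan = divergence_time(K_HD, rate_human)

if __name__ == "__main__":
    print(f"rate (chimp calibration): {rate_chimp:.4e}")
    print(f"TMRCA of sampled humans:  {tmrca_humans:,.0f} years")
    print(f"rate (human calibration): {rate_human:.4e}")
    print(f"human-Neanderthal:        {t_neanderthal:,.0f} years")
    print(f"human-Denisovan:          {t_denisovan:,.0f} years")
```

With the rounded inputs this gives roughly 466,000 years for the human TMRCA, 759,000 years for the human-Neanderthal split, and 806,000 years for the human-Denisovan split. The small differences from the reported figures arise because the report's calculations used unrounded K values.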
Finally, one human mtDNA sample was further analyzed with the Nucleotide BLAST tool of the National Center for Biotechnology Information (NCBI) and compared to similar sequences in GenBank (Carlini, 2020).
Results
The D-loop mtDNA sequence from sample Friday-12 was submitted to Nucleotide BLAST (Basic Local Alignment Search Tool) to search for similar sequences. The search was limited to Homo sapiens (taxid:9606) and optimized for somewhat similar sequences. The program returned the nucleotide sequences most similar to the sample sequence. The first match (Accession Number: MK059721) was mitochondrial DNA isolated from a Homo sapiens individual in the United Kingdom belonging to mitochondrial haplogroup K1d (see Table 1 below). This match had an E value of 0, and all 400 nucleotide positions in the sample sequence were identical to the corresponding 400 nucleotides of the database match, so it had 100% identity. The second match (Accession Number: KY671098) was mitochondrial DNA isolated from a Homo sapiens individual in the Belgorod Region of Russia, also belonging to mitochondrial haplogroup K1d (see Table 1 below). This match also had an E value of 0 and 100% identity across all 400 nucleotide positions.
Table 1: Nucleotide BLAST Results for the D-loop mtDNA Sequence of Sample Friday-12
As shown below in Table 2, the average proportional divergence of Modern Humans vs Neanderthals (Pd=0.06114) was calculated from the pairwise mtDNA comparisons and then converted to the corrected true proportional difference (KHN=0.06378). KHN gives the average number of nucleotide substitutions per site in the sequence. This process was repeated for Modern Human vs Denisovan, which gave a KHD value of 0.06777, and for Modern Human vs Chimp, which gave a KHC value of 0.21640. Finally, the average number of nucleotide substitutions per site was calculated for the pairwise student mtDNA sequences, which gave a Human vs Human KHH value of 0.01681. The greatest average number of nucleotide substitutions per site was observed for the Modern Human vs Chimp comparison, and the smallest was observed for the Human vs Human comparison. The averages for the Neanderthal and Denisovan comparisons were similar, with the Human vs Denisovan value slightly greater.
Table 2: Pairwise Comparison of mtDNA Sequences
Divergence estimates were made from the average numbers of nucleotide substitutions per site calculated in Table 2. To estimate the mutation rate, the KHC value was divided by a calibration point (the Human-Chimp divergence time of 6,000,000 years). The mutation rate was determined to be 3.6067 × 10^-8 nucleotide substitutions per site per year (shown in Table 3 below), and this value was used to estimate that the time to the most recent common ancestor of modern humans (TMRCA) was 465,991.9 years. Similarly, the KHH value was divided by a second calibration point (the Modern Human divergence time of 200,000 years) to estimate the mutation rate. This mutation rate was determined to be 8.4035 × 10^-8 nucleotide substitutions per site per year (shown in Table 3 below), and it was used to estimate that modern living humans diverged from Neanderthals 758,981.8 years ago and from Denisovans 806,435.7 years ago.
Table 3: Computations and Presentation of Divergence Time Estimates
Discussion
The results of the Nucleotide BLAST for sample Friday-12 show that even Modern Humans have been geographically and reproductively isolated from one another for long enough to create distinct, identifiable mitochondrial haplogroups (Table 1). The ability to track this information and predict where a haplogroup originated is due to the low recombination rate of mitochondrial DNA and to its strictly maternal inheritance (Bronstein et al., 2018).
As shown in Table 2, the greatest number of nucleotide substitutions was observed for the Human vs Chimp mtDNA comparison (KHC=0.21640). This makes sense because hominids are estimated to have diverged from chimpanzees 5 to 6 million years ago (Carlini, 2020). When this value is compared to the nucleotide substitutions for Human vs Neanderthal (KHN=0.06378) and Human vs Denisovan (KHD=0.06777), it becomes clear that modern humans are more closely related to Denisovans and Neanderthals than they are to chimpanzees. More time has passed since the Human-Chimp divergence than since the Human-Hominid divergence, which means there has been more time for mutations to accumulate.
The results of this genomic analysis support the displacement model for modern humans. If the multiregional model were true, there would be more similarity between some human and Neanderthal mtDNA sequences than between some human sequences, because this model says that modern humans evolved in different isolated populations (Carlini, 2020). This prediction was not observed in the sample population. Instead, the displacement model was supported, because the human vs human comparison showed far fewer nucleotide substitutions (KHH=0.01681) than the human vs Neanderthal comparison (KHN=0.06378) (Table 2). Because the number of nucleotide substitutions is roughly proportional to the amount of time that populations have spent reproductively isolated from one another, the sampled humans are more closely related to one another than to Neanderthals or Denisovans. Furthermore, the multiregional model says that modern humans diverged from the most recent common ancestor (MRCA) 1.5 to 3 million years ago, while the displacement model predicts that modern humans diverged from the MRCA only 200,000 to 500,000 years ago. Again, the results of the analysis are consistent with the predictions of the displacement model, because modern humans were estimated to have diverged from their most recent common ancestor 465,991.9 years ago (shown in Table 3).
However, this 465,991.9-year estimate is much higher than the 200,000-year divergence time estimated by population geneticists using both mitochondrial and chromosomal DNA (Carlini, 2020). Two main factors could explain the difference. First, this analysis used a rather small sample, so the sampled individuals may show a higher nucleotide substitution rate than the global population as a whole, which is plausible because all of them live in the same country. The sample may therefore be a poor representation of the species. The estimated substitution rate might be lower with a larger sample of human mtDNA, which would translate to a lower divergence time. Second, this analysis used only mitochondrial DNA, while the population geneticists used mitochondrial and chromosomal DNA. This is potentially problematic because mitochondrial DNA has a much higher mutation rate than chromosomal DNA (Bronstein et al., 2018). Mitochondrial DNA has such a high nucleotide substitution rate for three main reasons: its mismatch repair process is impaired; it lacks the protective histone proteins of nuclear DNA, which makes it more vulnerable to alterations; and, because the mitochondria produce ATP through oxidative phosphorylation, it is prone to mutagenic damage from oxygen free radicals (Sharma et al., 2005). Using mtDNA therefore has drawbacks, and relying on it to determine phylogenetic relationships can result in miscalculations. It is also worth noting that some studies have identified several regions within the D-loop that can negatively affect the results of PCR and DNA sequencing (Bronstein et al., 2018).
The high mutation rate of mtDNA can also be detrimental to the fitness and health of a species. Mutations in the mitochondrial genome have been associated with many conditions, including several forms of cancer (such as cervical cancer), HPV infection, metabolic diseases, and neurological conditions (Sharma et al., 2005).
The established 200,000-year estimate of when modern humans diverged from the most recent common ancestor (MRCA) was used to calculate human divergence from Neanderthals and Denisovans because it provides a better estimate of the mutation rate. From this value, it was determined that modern living humans diverged from Neanderthals 758,981.8 years ago and from Denisovans 806,435.7 years ago (Table 3). This suggests that Modern Humans shared an ancestor with Neanderthals about 760,000 years ago and with Denisovans about 800,000 years ago but have since been reproductively isolated. It also indicates that humans are only slightly more closely related to Neanderthals than to Denisovans, and that the ancestor of modern humans evolved alongside these two species. The genetic similarities suggest the possibility of interbreeding among these three species at some point in time. Additionally, this information, combined with the 200,000-year estimate of divergence from the MRCA, suggests that humans are more closely related to some undiscovered extinct hominid species than to either Denisovans or Neanderthals.
References
Carlini, D. (2020) Week 5: Genomic DNA Isolation & PCR Amplification of D-loop Region. BIO-356 Genetics with Laboratory. American University Department of Biology: Washington, D.C.
Carlini, D. (2020) Week 7: Agarose Gel Electrophoresis. BIO-356 Genetics with Laboratory. American University Department of Biology: Washington, D.C.
Carlini, D. (2020) Week 8: Sequencing Gel-Purified D-loop PCR Products. BIO-356 Genetics with Laboratory. American University Department of Biology: Washington, D.C.
Carlini, D. (2020) Week 11 Lab: Using the NCBI Databases. BIO-356 Genetics with Laboratory. American University Department of Biology: Washington, D.C.
Carlini, D. (2020) Human Mitochondrial DNA Lab Exercises. BIO-356 Genetics with Laboratory. American University Department of Biology: Washington, D.C.
Bronstein, O., A. Kroh, and E. Haring. (2018). Mind the gap! The mitochondrial control region and its power as a phylogenetic marker in echinoids. BMC Evolutionary Biology, 18(80).
Sharma, H., A. Singh, C. Sharma, S.K. Jain, and N. Singh. (2005). Mutations in the mitochondrial DNA D-loop region are frequent in cervical cancer. Cancer Cell Int., 5(34).