It has finally happened. After sequencing mitochondrial genes of hundreds of scorpions over the years, I have finally sequenced a couple of dreaded pseudogenes, also known as NUMTs, short for nuclear mitochondrial DNA (pronounced “new mights” or “nummits”). Uggg!
Commonly encountered in many other organisms, including humans, NUMTs are fragments of the mitochondrial genome that have transferred to the nucleus via one of several postulated pathways (see figure) and have been incorporated into the nuclear genome. If they go undetected, NUMT sequences can seriously mislead phylogenetic interpretations and are often responsible for artificially inflating genetic diversity in phylogenetic research and species diversity in DNA barcoding efforts. Fortunately, these potential sources of error are usually easy to detect.
Once in the nucleus, NUMTs are not thought to be transcribed, instead behaving as non-coding ‘junk’ DNA. They can then accumulate random mutations as well as indels (insertions and deletions) at any codon position without consequence. Thus, NUMTs can be detected by more mutations at the 1st and 2nd codon positions than expected for coding genes, by stop codons or indels within a reading frame, and by heterogeneous positions (double peaks) in chromatograms.
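For anyone who wants to screen their own sequences, here is a rough sketch of the first two checks in Python. It is only an illustration, not the procedure I actually used: the function names are made up, the stop codons are those of the invertebrate mitochondrial code (NCBI translation table 5, where only TAA and TAG are stops), and the codon-position count assumes the two sequences are already aligned, the same length, and that the reference has no gaps. Double peaks have to be checked by eye in the chromatograms, so they are not covered here.

```python
import re

# Stop codons in the invertebrate mitochondrial genetic code
# (NCBI translation table 5). TGA codes for tryptophan here, so it is not a stop.
STOP_CODONS = {"TAA", "TAG"}


def find_stop_codons(seq, frame=0):
    """Return 0-based start positions of in-frame stop codons in an ungapped sequence."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def frameshift_gaps(aligned_seq):
    """Return lengths of gap runs ('-') that are not multiples of three.

    A gap whose length is not divisible by three shifts the reading frame,
    which a functional protein-coding gene would rarely tolerate.
    """
    return [len(m.group()) for m in re.finditer(r"-+", aligned_seq)
            if len(m.group()) % 3 != 0]


def diffs_by_codon_position(ref, query, frame=0):
    """Count mismatches at codon positions 1, 2 and 3.

    Assumes ref and query are aligned, of equal length, and that ref is gap-free,
    so that the alignment column gives the codon position directly.
    Columns containing a gap or an ambiguous 'N' are skipped.
    """
    counts = {1: 0, 2: 0, 3: 0}
    for i, (a, b) in enumerate(zip(ref.upper(), query.upper())):
        if i < frame or a in "-N" or b in "-N":
            continue
        if a != b:
            counts[(i - frame) % 3 + 1] += 1
    return counts
```

In a real coding gene, most differences between close relatives should pile up at the 3rd codon position. A sequence with in-frame stop codons, frame-shifting gaps, or roughly equal counts at all three positions deserves a closer look.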
Yesterday I realized that I had clearly sequenced NUMTs when several sequences from Giant Hairy Scorpions, which were clean, contained a several hundred bp deletion replaced by a 30 bp insertion. As frustrating as this was, especially in one of the very last populations left to sequence for my biggest research project, I was also somewhat fascinated. Interestingly, it appears that I may have sequenced two separate NUMTs that may have incorporated themselves into the nuclear genome of common ancestors at different times in the deep history of these scorpions… the oldest NUMT might even have been transferred in the common ancestor of all Giant Hairy Scorpions before the group had diversified into the species found today!
Once in the nuclear genome, NUMTs still acquire mutations, but are expected to do so much more slowly than their mitochondrial counterparts due to the larger effective population size of the nuclear genome. This means that NUMTs should be more similar to the original gene sequence of the common ancestor, making them a sort of window into the past, very similar to sequencing ancient DNA! In theory, NUMTs should therefore be useful in rooting or polarizing phylogenetic trees.
Although I don’t think that I will use these NUMTs in my own research, please contact me if you have an interest in investigating these bizarre sequences of DNA. As far as I am aware, this is the first time that pseudogenes have been discovered in scorpions.
Image reference: Hazkani-Covo E., R.M. Zeller, W. Martin. 2010. Molecular poltergeists: mitochondrial DNA copies (numts) in sequenced nuclear genomes. PLoS Genetics 6:e1000834.