Wednesday, January 18, 2017

Unusual Primate Mutations in Some Ketchum mtDNA Samples


“Shouldn’t we not throw the baby out with the bathwater?…Is there anything we can salvage from the Ketchum study?”     A Facebook Post
 


In my past work I have tried to treat each Ketchum et al. DNA sequence independently, and not to generalize without evidence.  Sometimes a tentative overall conclusion is suggested.  Here we look at some mtDNA sequences that share mutations which are rare in humans but much more prevalent in other primates.  From everything I have learned, these several samples hold the most promise for actually originating from an unknown human-like primate.  This is a preliminary report, mostly just observations and suggestions for further study. Your comments are especially welcome here.
   
Comparisons between mtDNA extra mutations (i.e. those not found in the nearest haplogroup) revealed some unlikely coincidences.  Table 1 shows these.  Sample 26 has been shown to be a black bear [2a,b] from its nuclear DNA sequence taken from the original Ketchum paper [1], but the mtDNA results were much closer to human, though outside the normal range of number of extra mutations. [3]  Although the Ketchum et al. conclusion was that sasquatch is a hybrid of an unknown primate male and a modern human female, human mtDNA in this sample has been widely attributed to contamination. [4, 7]   ES-2 was discovered unlabeled below and to the right of other entries in the Ketchum Supplementary Data 2 and is not listed in the sample Table 1 there [1].  Sample 24 failed to produce human Amel X and Y STRs and had a very low 63% of 2.5 M human SNPs. Samples 29 and 138 are not included in any nDNA analyses in the Ketchum paper. Sample 28 appears in Table 5 of  ref. [1], where it shows human STRs in all but one locus (D3S1358) of sixteen total microsatellite loci.  It also matched nine of ten human SNP sites, with a heteroplasmic mutation at the tenth (478RHC) on the MC1R gene (Table 6, ref. [1]).  However, this sample failed to sequence at Amel X and at AmelY exons 1, 3, and 8, while showing human sequences at exons 2 and 4/5 (Table 4, ref. [1]).  Overall Sample 28 is the most human-like of those samples which were put through all of these nDNA tests.
  
Referring to Table 1, samples S1 and ES-2 have identical mtDNA sequences and the same two extra mutations.  C152T is common in humans, but G7332C is not found among the nearly 20,000 human mtDNA sequences in the Nucleotide database.  Further, it was not found in any other primates, though G7332A had a few examples.  Since we know nothing of the provenance of ES-2 and because it is identical to S1, it would be of great interest if Ketchum et al. would identify this sample.  However, the most significant similarity among the samples in Table 1 is the occurrence of 7852A, 9083C, and 13209T in samples S24, S26 (less 7852A), S28, S29, and S138.  These mutations are rare in both the NCBI Nucleotide database and the Phylotree of mtDNA mutations [6].  In fact, none of the nearly 20,000 complete human mtDNA genomes in the database have more than one of these mutations, i.e. none have two or three.  These five samples were collected by different teams in three widely separated locations, CA, NM, and BC.  Ketchum et al. [1] maintain that all sample collectors and laboratory personnel were excluded from possible contamination based on their individual mtDNA profiles.
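The co-occurrence check described above (how many database sequences carry one, two, or all three of the rare mutations) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_by_hits` and the toy records are hypothetical, and the real tally would have to be run on mutation lists derived from the actual database sequences.

```python
# Rare mutations discussed in the text.
TARGETS = {"7852A", "9083C", "13209T"}

def count_by_hits(records, targets=TARGETS):
    """Map 'number of target mutations carried' -> 'number of sequences'."""
    tally = {}
    for mutations in records.values():
        hits = len(targets & set(mutations))
        tally[hits] = tally.get(hits, 0) + 1
    return tally

# Made-up records for illustration only; the keys are hypothetical labels,
# not real accessions, and the mutation lists are not real data.
toy = {
    "seqA": ["C152T", "7852A"],
    "seqB": ["9083C"],
    "seqC": ["C152T"],
    "seqD": ["7852A", "9083C", "13209T"],
}
print(count_by_hits(toy))
```

In the toy data one sequence carries all three mutations. The observation in the text is that no such sequence, and none with even two of them, turned up among the human database entries.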


Fig. 1 shows percentages of each primate group possessing each of these rare human mutations.  Percentages are based on all database entries (number in parentheses) for each group in the GenBank(R) Nucleotide database, without regard for species duplicates.  The groups collectively include all living primates: chimpanzees (two species), gorillas (several subspecies), orangutans (two species), gibbons, tarsiers, Old World (OW) monkeys, New World (NW) monkeys, lemurs, Lorisiformes (lorises and galagos), and Chiromyiformes (aye-ayes).  None of these extra mutations occurred in any Neanderthal (9 database examples), Heidelberg (1), or Denisovan (2) sequences, so these groups are not plotted in Fig. 1.  However, these small numbers of database examples may not be representative of these populations.  Full taxonomy of these families is found in ref. [5].  Fig. 1 indicates that 7852A, 9083C, and 13209T, plus C9195T, are much more common in nonhuman primates, increasing generally with increasing taxonomic/genetic distance from human [5].  C6571T was only found in Old World monkeys.  C3626T was not found in other primates.


Nothing definite can be concluded from these limited data, especially when considering the relatively low number of database entries for some of the primate families in Fig. 1.  Some intriguing questions might be explored further by collection of more DNA samples from the same geographical areas of CA, NM, and BC, along with photographic or video documentation:


1.  Could S26 (the black bear) be contaminated with sasquatch mtDNA?  If so, it might be the result of a fight between bears over a sasquatch carcass. [7]


2.  Why does S26 have so many (16) extra mutations? (so many that a haplogroup cannot be uniquely determined). [3] Phylotrees produced by two different methods did not agree. [4]
   
3.  Could S24, S28, S29, and S138 actually be sasquatch samples with a vestige of nonhuman primate mutations, either through parallel or reverse evolution?  The random occurrence of three rare mutations (7852A, 9083C, and 13209T) in all of these samples is a statistically improbable coincidence. [See NOTE]  They must be related somehow.  Importantly, all four samples differ by only a few, mostly heteroplasmic, mutations.  A common human contamination is unlikely, given the rarity of these mutations in the human genome (Table 1, Fig. 1), especially in combination.  Even two human contaminants with these combined mutations seem unlikely (no GenBank(R) sequences had even two of these mutations).  Even more remarkable are the identical sequences of S29 and S138, which have, in addition to these mutations, three additional rare heteroplasmic mutations (C3626Y, C6571Y, and C9195Y) in common.  Sequencing errors would not likely be at the same three positions in two independent samples.  Both of these samples are from British Columbia and were collected by the same people (ref. [1], Table 1), whose mtDNA was found not to be in these samples [1].  These two samples may be from a single sasquatch or closely related individuals.


I openly encourage more field work and laboratory analysis of samples from these regions of OK, CA, NM, and BC. 
 
       
NOTE

 
The occurrences of 7852A, 9083C, and 13209T are 27, 20, and 7, respectively, in 19988 human database entries.  Assuming the three mutations occur independently of one another, the probability of all three occurring in the same individual is:

(27/19988) × (20/19988) × (7/19988) = 4.73 × 10^-10.

The probability of all three mutations occurring in each of four randomly selected individuals is:

(4.73 × 10^-10)^4 = 5.02 × 10^-38.
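
The arithmetic in this NOTE can be rechecked with a short Python sketch. It uses the counts quoted above and makes the same assumption of independence between the three mutations; the function name `joint_probability` is just an illustrative choice.

```python
from fractions import Fraction

TOTAL = 19988  # human database entries quoted in the NOTE
COUNTS = {"7852A": 27, "9083C": 20, "13209T": 7}  # occurrences quoted in the NOTE

def joint_probability(counts, total):
    """Product of the per-mutation frequencies (assumes independence)."""
    p = Fraction(1)
    for n in counts.values():
        p *= Fraction(n, total)
    return p

p_one = joint_probability(COUNTS, TOTAL)   # all three in one individual
p_four = p_one ** 4                        # all three in each of four individuals

print(f"one individual:   {float(p_one):.2e}")
print(f"four individuals: {float(p_four):.2e}")
```

Exact fractions are used so that rounding only happens when the results are printed. Note that if the four samples come from related individuals they are not independent draws, so the second figure is best read as a lower bound on how surprising the observation is under the "random humans" hypothesis, not as an exact probability.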

REFERENCES

[1]  Ketchum, M. S. et al.,  Novel North American Hominins: Next Generation Sequencing of Three Whole Genomes and Associated Studies. DeNovo, 2013, 1:1, Online only: http://www.sasquatchgenomeproject.org/view-dna-study/

[2a]  Hart, H. V.,  Methodology and New Metrics for Distinguishing Related Species from Incomplete nuDNA.,  Paper 1 at right (and related blogs): http://www.bigfootclaims.blogspot.com

[2b]  Hart, H. V., Not Finding Bigfoot in DNA.  Journal of Cryptozoology 4: 39-51.

[3]  Hart, H. V.,  “But the mtDNA Sequences are all Human…”  Really?,  Paper 2 at right: http://www.bigfootclaims.blogspot.com

[4]  Hart, H. V., More Inconsistencies and Evidence for Contamination in Ketchum et al. Supplementary Figures, June 17, 2016, this blogsite.

[5]  National Center for Biotechnology,  http://www.ncbi.nlm.nih.gov/guide/taxonomy/

[6]  van Oven,  M., Revision of the mtDNA Tree and Corresponding Haplogroup Nomenclature. Proc. Natl. Acad. Sci. USA, 2010, 107(11), E38-E39.   http://dx.doi.org/10.1073/pnas.0915120107

[7]  Hart, H. V., Ketchum Sample 26, The Smeja Kill: Independent Lab Reports, November 26, 2014, this blogsite. http://www.bigfootclaims.blogspot.com/2014/11/sample-26-smeja-kill-independent-lab.html
 



 




FIG. 1.  Numbers in parentheses for each group are the number of database entries in GenBank(R), which include duplicate entries for some species.


Tuesday, January 17, 2017

Two New Peer-reviewed Papers are at Odds with Ketchum et al. Conclusions

“The above commonly reported traits, as well as other scientific evidence lending credence to the existence of Sasquatch, have been thoroughly researched and documented in both books and in peer reviewed manuscripts. refs.4-13”  (Ketchum et al., 2013)

Peer review is the process by which journal articles are reviewed by experts prior to publication.  Reviewers are selected by the editor of the journal for their proven expertise in the relevant area, and their identity is kept anonymous. Their criticisms and suggestions are forwarded to the author for consideration and possible revision of the manuscript.  It’s not a perfect system, but it greatly reduces the number of scurrilous publications, honest errors, and unclear narratives.  Recall that Ketchum et al. (2013) failed peer review in two journals before being self-published.

Recently I published two peer-reviewed papers (Hart, 2016a, 2016b).  “Not Finding Bigfoot in DNA” made it to Volume 4 (pp. 39-51) of The Journal of Cryptozoology, a publication of the Center for Fortean Zoology, edited by Dr. Karl Shuker.  Zoologist Shuker is well known for his many books and blog articles on cryptozoology – the study of animals not (yet) proven to exist by science.  My paper addresses the three nuclear DNA sequences published by Ketchum et al. (2013), said by them to be “a novel mosaic pattern of nuclear DNA comprising novel sequences that are related to primates interspersed with sequences that are closely homologous to humans.”  As proven in my previous blogs, this claim is false:  the sequences are from a bear (S26), a human (S31), and a dog (S140), with no significant traces of other primates.  I review the five different approaches to producing realistic phylotrees* of S26 and S140 that clearly show the samples are related to bears and dogs, respectively.   In contrast, Ketchum phylotrees for S31 (their Supp. Fig. 5) and S140 (their Supp. Fig. 6) showed homology to mice, chicken, and fish, not at all in support of their conclusion above.   Volume 4 of the Journal of Cryptozoology is available from Amazon.com.
 
My second paper, “DNA as Evidence for the Existence of Relict Hominoids,” is a review of known publications related to DNA of purported cryptid hominoids, and can be downloaded free from: 
http://www2.isu.edu/rhi/research-papers.shtml, the Relict Hominoid Inquiry website of Prof. Jeff Meldrum of Idaho State.  Meldrum’s specialty is anatomy, especially primate bipedal motion, and he has studied hundreds of purported sasquatch footprint casts in great detail, the great majority of which he sees as evidence for the existence of a large North American primate.

Finally, one of the criteria for publication in peer-reviewed journals is that the references are authentic and support the statements in the text to which they are appended.  Some journals require that the author certify this.  Padding of references to give the appearance of command of the literature is explicitly discouraged.  References 5 (Milinkovitch et al., 2004) and 6 (Coltman and Davis, 2006) of Ketchum et al. (2013), both with titles in apparent support of the Ketchum et al. (2013) claim quoted at the beginning of this blog, are anything but supportive, if one bothers to read them.  Both references are reviewed in my second paper.

The Milinkovitch et al. (2004) paper has an admitted April Fool’s joke title.  The Himalayan hair sample was found to be from a horse, nothing close to a primate.  Their phylotree shows this clearly.

Coltman and Davis (2006) matched the DNA of a Yukon hair to the American bison exactly, in spite of their tongue-in-cheek title.   Their phylotree of related ungulates supports their conclusion.

Both of these papers are good examples of how an unknown DNA sample should be analyzed without bias.  Read your references, Melba.  You might learn something. 

 
 *  A phylotree is a DNA-based evolutionary tree of life with a topology determined by degree of match (distance – or % of matching base pairs in homologous DNA sequences) between the various species in the branches.  A phylotree is produced from the results of a DNA search, for example using BLAST® as both Ketchum et al. and I did.


REFERENCES  (Unpadded)

Coltman, D. and Davis, C. (2006) “Molecular cryptozoology meets the Sasquatch.” TRENDS in Ecology and Evolution 21(2): 60–61.

Hart, H. V. (2016a)  “Not Finding Bigfoot in DNA.”  Journal of Cryptozoology 4: 39-51.

Hart, H. V. (2016b)  “DNA as Evidence for the Existence of Relict Hominoids” Relict Hominoid Inquiry 5: 8-31.  http://www2.isu.edu/rhi/research-papers.shtml

Ketchum, M. S. et al. (2013) “Novel North American Hominins: Next Generation Sequencing of Three Whole Genomes and Associated Studies.” DeNovo,  1 (1):  Online only:  http://sasquatchgenomeproject.org/sasquatch_genome_project_002.htm

Milinkovitch, M. C. et al. (2004) “Molecular phylogenetic analyses indicate extensive morphological convergence between the ‘yeti’  and primates.” Molecular Phylogenetics and Evolution 31: 1–3.

Friday, June 17, 2016

More Inconsistencies and Evidence for Contamination in Ketchum et al. Supplementary Figures

ABSTRACT

Scientific results should be consistent, otherwise more experiments are needed to clarify discrepancies.  The Ketchum et al. (2013) Supplementary Figures 1, 2, and 3, which are mitochondrial DNA phylotrees for samples 26, 31, and 140, respectively, are inconsistent with Supplementary Figures 7, 8, and 9, respectively.  Haplogroups of the closest relatives do not agree: Supp. Fig. 1 with 7 (S26), 2 with 8 (S31), or 3 with 9 (S140). The most likely explanation is contamination by at least one human in each case.

INTRODUCTION 

Ketchum et al. (2013) mitochondrial DNA results have been previously reviewed (Paper 2 at right), and the Ketchum claim that they are all 100% human was found to be an overstatement.  In fact, it was found that eight of 18 samples with complete mitochondrial sequences had too many mutations from the closest haplogroup to be statistically probable (less than 1% probability).  Also, eight of 11 samples for which only HVR-1 (hypervariable region 1) mutations were listed were phylogenetically ambiguous, i.e., alternate haplogroups were equally likely.  From these new results, it was concluded that either the samples were contaminated and/or degraded, or that any possible hybridization events would have to have been followed by subsequent mutations along nonhuman evolutionary lines and on a different time scale.  The results in the current paper prove that S31 has human contamination by an individual with a different haplogroup than previously reported for that sample.  Samples 26 and 140 have previously been shown to be from a black bear and a dog, respectively, from nuclear DNA matches (See Paper 1 at right).  They are now shown here to be contaminated by two humans of different haplogroups.
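
The "less than 1% probability" criterion mentioned above comes from a Poisson model of extra mutations (described in Paper 2). As a rough illustration of that kind of test, here is a minimal Python sketch of a Poisson upper-tail probability. The mean used below is a hypothetical placeholder chosen only to make the example run; it is not the value used in Paper 2.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via 1 minus the sum of lower terms."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# ASSUMPTION: hypothetical expected number of extra mutations per sample.
# This is NOT a figure from Paper 2; substitute the real mean before use.
HYPOTHETICAL_MEAN = 5.0

p = poisson_tail(16, HYPOTHETICAL_MEAN)  # 16 = extra mutations reported for S26
print(f"P(X >= 16) = {p:.2e}; below 1% threshold: {p < 0.01}")
```

With any mean this small, 16 extra mutations falls far out in the tail, which is the sense in which such a sample is "statistically improbable" as a modern human sequence.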

METHODS

This study involves extracting data from six circular phylotrees, Supp. Figs. 1, 2, 3, 7, 8, and 9 from Ketchum et al. (2013) for comparisons.  Phylotrees of this kind are generated from the query results (hits) in BLAST (TM) through the "Distance tree of results" option.  The goal was to determine the haplogroup [1] of the nearest match to each query: S26, S31, or S140 in Supp. Figs. 1 and 7, 2 and 8, and 3 and 9, respectively.  Phylotrees in Supp. Figs. 1, 2, and 3 were generated from complete mitochondrial sequences produced by Family Tree DNA.  Phylotrees in Supp. Figs. 7, 8, and 9 [2] were generated from supercontigs, but the details of which mitochondrial genes were employed were not stated.

The title of the nearest phylotree branch tip was searched in GenBank(R) (the NCBI databases), and the accession number retrieved.  A BLAST(TM) [3] alignment of this accession with rCRS (Revised Cambridge Reference Sequence, GenBank accession NC_012920.1) produced a set of rCRS-based mutations, as seen in the "Graphics" option of the results page.  From these mutations a haplogroup was determined using the programs FASTmtDNA and mtDNAable as previously described (Paper 2 at right).
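
To make the "set of rCRS-based mutations" step concrete, here is a minimal Python sketch that lists substitutions between an aligned reference and an aligned query in the usual notation (reference base, reference position, query base, e.g. C152T). It is not the BLAST(TM) procedure itself: it assumes the two sequences have already been aligned to equal length with "-" for gaps, it ignores insertions and deletions, and the function name and toy sequences are invented for illustration.

```python
def list_substitutions(ref_aligned, query_aligned):
    """Return substitutions such as 'C152T' from two gapped, equal-length strings.

    Positions are counted along the reference; columns containing a gap
    in either sequence are skipped (indels are not reported in this sketch).
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    ref_pos = 0
    for r, q in zip(ref_aligned.upper(), query_aligned.upper()):
        if r != "-":
            ref_pos += 1  # only reference bases advance the position
        if r == "-" or q == "-":
            continue
        if r != q:
            mutations.append(f"{r}{ref_pos}{q}")
    return mutations

# Toy alignment: made-up sequences, not real rCRS data.
ref = "ACGT-ACGTA"
query = "ACTTGAC-TG"
print(list_substitutions(ref, query))  # ['G3T', 'A9G']
```

A list produced this way from a real alignment against rCRS is the kind of input a haplogroup-assignment program works from.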

RESULTS


Sample 26

Table 1 presents the results for S26. The nearest haplogroup from Supp. Fig. 1 (H1) closely matches that determined by Family Tree DNA (H1a).  However, the Supp. Fig. 7 result, T2b, is far removed.  Interestingly, this is the haplogroup of the human contamination determined by two independent studies (Cassidy, 2013; Khan and White, 2012, the Tyler Huggins Report at right).  Also to be noted is that S26 is one of the samples previously found to have too many extra mutations (16) to be called "modern human," according to the accepted mtDNA phylotree (van Oven, 2010) and a Poisson Distribution of mutations (Paper 2 at right).  Given that the nuDNA of this sample matched a black bear (Paper 1 at right; Cassidy, 2013;  Khan and White, 2012; Sykes et al., 2014 - See Tyler Huggins Report and Sykes Paper at right), it can be concluded that there are two sources of human contamination in this sample, with haplogroups H1a/H5e and T2b.

Table 1.  S26

| Nearest Matches to S26 mtDNA in Ketchum phylotrees | Supp. Fig. | Accession | Mis. vs. S26 | Hap. |
|---|---|---|---|---|
| Homo sapiens clone 3760 mitochondrion, complete genome | 1 | JQ703795.1 | 16 | H1 |
| Homo sapiens isolate NEC20 mitochondrion complete genome | 7 | JQ664540.1 | 22 | T2b |
| From Ketchum Supp. Data 2: | | | | H1a |
| S26 from mtDNAable: | | | | H5e |

Columns left to right: Accession title, Ketchum et al. Supplementary Figure number, Accession number (GenBank), Mismatches vs. S26, Haplogroup.
 
 


Sample 31

Table 2 presents the results for S31.  The nearest haplogroup from Supp. Fig. 2 (L1a1) is close to that determined by Family Tree DNA (L0d2a). However, the Supp. Fig. 8 results, T2b and T2b8, are far removed. The nuDNA of this sample matches modern human (Paper 1 at right).  Sample 31 is contaminated by another human of T2b haplogroup.




Table 2.  S31

| Nearest Matches to S31 mtDNA in Ketchum phylotrees | Supp. Fig. | Accession | Mis. vs. S31 | Hap. |
|---|---|---|---|---|
| Homo sapiens haplotype A10L1A2 mitochondrion complete genome | 2 | AY195777.1 | 2 | L0d2a1 (A10L1A2)* |
| Homo sapiens isolate 157 T2i Tor354 mitochondrion complete genome | 8 | JQ798131.1 | 100 | T2 or T2b-16362C (T2i)* |
| Homo sapiens isolate 13T mitochondrion complete genome | 8 | JX081995.1 | 104 | T2b8 |
| From Ketchum Supp. Data 2: | | | | L0d2a |
| S31 from mtDNAable: | | | | L0d2a1 |

*(Haplogroup) taken from Accession




Sample 140

Table 3 presents the results for S140.  The nearest haplogroup from Supp. Fig. 3 (D4b2b1) matches that determined by Family Tree DNA for HVR-1 only (D). However, the Supp. Fig. 9 results, both R2'JT, are far removed. The nuDNA of this sample matches a dog (Paper 1 at right).  This sample is contaminated by two humans with haplogroups D and R2.





Table 3.  S140

| Nearest Matches to S140 mtDNA in phylotrees | Supp. Fig. | Accession | Mis. vs. S140 | Hap. |
|---|---|---|---|---|
| Homo sapiens mitochondrial DNA complete genome isolate PDsq0023 | 3 | AP008361.1 | No complete sequence available** | D4b2b1 |
| Homo sapiens isolate R1 mitochondrion, complete genome | 9 | JX155264.1 | | R2'JT (R2a1)* |
| Homo sapiens isolate R2 mitochondrion complete genome | 9 | JX155265.1 | | R2'JT (R2a1)* |
| From Ketchum Supp. Data 2: | | | | D (HVR-1) |
| From Behar, et al. (2012): | | | | D (HVR-1) |

*  (Haplogroup) taken from Accession

**  Oddly, Supp. Figs. 3 and 9 require a full sequence, but Supp. Data 2 contains only HVR-1 mutations.


CONCLUSION


Over all three samples, using supercontigs resulted in phylotrees with haplogroups which were inconsistent with full sequence derived haplogroups. 

Samples 26 and 140 are contaminated by two modern humans.  Sample 31 is contaminated by one additional human.

Insistence by Dr. Melba Ketchum, DVM, that her samples were not contaminated when analyzed is not warranted.  Very likely some additional Ketchum et al. anomalous mtDNA samples are so because of contamination (Paper 2 at right).

NOTES

[1]  Haplogroups are unique human mtDNA sequences, represented by their mutations from a standard, either rCRS (revised Cambridge Reference Sequence) or RSRS (Reconstructed Sapiens Reference Sequence).  All known haplogroups of modern humans are represented in the phylotree of van Oven (2010) at www.phylotree.org.  This tree stems from the root called "Mitochondrial Eve", the most recent common maternal ancestor (MRCA) of all humans.  A haplotype is a particular allele (combination of SNPs-mutations) within a haplogroup and is designated by a preceding letter and number.

[2]  Supp. Figs. 7, 8, and 9 are erroneously referred to in the Ketchum et al. (2013) text as Supp. Figs. 4, 5, and 6 in the last paragraph of the "Next Generation Whole Genome Sequencing" section.  Supp. Figs. 4, 5, 6 are actually nuDNA-based phylotrees. See my blog "Melba Ketchum's Experts and Their Mistakes: What's in a Phylotree."    

[3]  BLAST (TM) is a search/match program which utilizes the National Center for Biotechnology Information (NCBI) GenBank databases. (Altschul et al.,1990; Madden, 2003).  Its application has been described extensively on this blogsite. See under BLAST Search and Ketchum DNA Study Tabs above.

REFERENCES 



  Altschul, S. F.; Gish, W.; Miller, W.; Myers, E. W.; Lipman, D. J. (1990). Basic local alignment search tool.  Journal of Molecular Biology, 215 (no.3): 403-410.

  Behar D.M.; van Oven, M.; Rosset, S.; Metspalu, M.; Loogväli, E.-L.; Silva, N. M.; Kivisild, T.; Torroni, A.; Villems, R. (2012)  A “Copernican" reassessment of the human mitochondrial dna tree from its root.  American Journal of Human Genetics, 90 (no.4): 675-684. http://dx.doi.org/10.1016/j.ajhg.2012.03.002

  Cassidy, B. G. (2013).  Technical Examination Report DNAS Case Number: 2012-006524.  DNA Solutions, Inc. (Oklahoma City).

  Ketchum, M. S. et al. (2013).  Novel north american hominins: next generation sequencing of three whole genomes and associated studies, DeNovo, 1:1.  Online only: http://sasquatchgenomeproject.org/view-dna-study/
 

  Khan, T. and White, B.  (2012)  Final report on the analysis of samples submitted by Tyler Huggins. Wildlife Forensic DNA Laboratory Case File 12-019, Trent University Oshawa (Peterborough, Ontario, Canada). http://www.bigfootbuzz.net/bart-cutino-tyler-huggins-release-sierra-kills-sample-dna-results/

  Madden, T.  (2003). The BLAST sequence analysis tool.  The NCBI Handbook; McEntyre, J; Ostell, J., Eds.; National Center for Biotechnology Information (Bethesda, MD). http://www.ncbi.nlm.nih.gov/books/NBK21097/.

  Sykes, B. C.; Mullis, R. A.;  Hagenmuller, C.; Melton, T. W.; Sartori, M. (2014) Genetic analysis of hair samples attributed to yeti, bigfoot and other anomalous primates.  Proceedings of the Royal Society B, 281: 20140161.


  van Oven, M. (2010).  Revision of the mtDNA tree and corresponding haplogroup nomenclature. Proceedings of the National Academy of  Sciences USA, 107 (no. 11): E38-E39.   http://dx.doi.org/10.1073/pnas.0915120107