Monday, November 3, 2014

The Ketchum Peer Reviews




"At least now everything I have said can be authenticated including the ridiculous and biased nature of the reviews. Now people will know the truth and that we did pass peer review."  (Melba Ketchum)
 

 

"Extraordinary claims require extraordinary evidence."  (Anonymous Ketchum Reviewer)

"Extraordinary evidence requires extraordinary review."  (Haskell Hart)

"Hindsight is 20/20."  (Anonymous)

Dr. Melba Ketchum first submitted her famous paper (link at right) to Nature, which rejected it on November 1, 2012, based on the comments of four reviewers and after she had tried to satisfy their requests in a second round.  She then submitted the paper to Journal of Advanced Multidisciplinary Exploration in Zoology (JAMEZ), which she claims accepted it after some revisions.

Much has been made of the history of the defunct Journal of Advanced Multidisciplinary Exploration in Zoology (JAMEZ), sometimes redundantly called Journal of Advanced Zoological Exploration in Zoology by Scott Carpenter (see link to his blog at right), and of its successor DeNovo, and of whether Dr. Melba Ketchum in fact purchased the former to acquire the rights to her "peer reviews" so she could tout them in her own publication DeNovo, where her paper appeared.  That debate is not about science and will be left to others.  Instead we focus here on the content of the peer reviews, taken as scientific criticism and evaluated at face value in the light of current knowledge of the Ketchum study, based on my in-depth analyses in three papers (see at right).  In so doing we hope to answer the question of whether her work was fairly evaluated.

Nature had four reviewers, one of whom admitted he was not a geneticist (but then neither are Melba or her coauthors).  JAMEZ had only two reviewers, which is not enough for a paper of this import that is likely to be controversial.  All the comments are combined and categorized here.  There were 32 separate numbered comments in all, which addressed 15 separate issues or points of concern about the paper by my tally; one comment was positive (number 12 below).  This is a lot for one paper.

Others might categorize them slightly differently, but I believe they would capture substantially the same criticisms of the reviewers.  Keep in mind that the JAMEZ reviewers likely received a paper that was revised based on the comments of the Nature reviewers.  The tally of issues looks like this to me.

The Issues

In order of decreasing reviewer consensus.  Numbers in brackets are the number of reviewers (of the six total) who concurred.

(1)  Inadequately substantiated thesis.  Results do not support conclusions. [6]

(2)  Results of genomic analysis superficially treated.  Results are not adequately documented.  More stepwise in depth treatment needed.  Need more information on analysis of whole genomes. Analytical credibility could be improved by fully leveraging information from next generation sequencing. Sequences should be available.  Bioinformatics should include reference sequences from expected contaminants.  [4]

(3)  Hominin not proven.  Primate a better term. [4]

(4)  Phylogenetic trees not clear, inadequate, or give mixed results. [4]

(5)  Concern about monitoring submitter contamination. Results indicate contamination. [4]  Did avoid contamination. [1]

(6)  S26 provenance murky at best. [2]

(7)  Selectively aggregated sequences to support a favorable placement among primates. Reference sequence bias in developing consensus sequence. [2]

(8)  Poor agreement between mtDNA and nDNA results across samples.  S31 does not align with S26 and S140. [2]

(9)  Change "unknown" hair morphology to "novel." Should cross reference human morphology.  Expected mistakes in hair identification made. Needs statistical analysis. [2]

(10)  Quality control is difficult with contract labs. [1]

(11)  S26 - ethics of shooting. [1]

(12)  Q30 scores important. Seemed to justify publication. [1] 

(13)  Need better photographic evidence.  Stick structures inappropriate. [1]

(14)  mtDNA is all "poorly" human.  Would expect some other lineages. [1]

(15)  Unknown sequences more likely from unknown microorganism. [1]

(16)  Electron microscopy suggests DNA damage. [1]
 


Where to begin?  Number 1 is the bottom line and will fall out of the other criticisms.  I believe the key issues are numbers (2), (4), and (7), which are related.  Here’s what the reviewers said about these issues:

1. “The bioinformatics should include gene sequences from expected outlier species that may also be capable of contributing contaminating nucleic acids.”

2. “The molecular genetics in this manuscript are the most important and it would be important to include information regarding the analysis of the whole genomes.”

3. “I would suggest the authors ….build phylogenetic trees with all possible mammal mtDNA genomes and nuclear data available at genbank.”

 
4. “To make a compelling case I need seeing mtDNA genomes and large numbers of nuDNA sequences that points(sic) in direction of a new hominin species i.e. ape or human like without being identical to know(sic) species.”

 
5.  “…. that several phylogenetic/gene trees have been included, they desperately need redesigning, since the text on the branches is so small that it can't be read without exceptional magnification. As such, the trees are essentially useless.”

6. “Sequencing data should be freely available to the scientific community after publishing (even better, before).”  (NOTE:  Sequencing data was unavailable to the Nature reviewers.  When added, one JAMEZ reviewer made the comment no. 7) 

7. “The bioinformatics should include gene sequences from expected outlier species that may also be capable of contributing contaminating nucleic acids. For example, a BLASTN search using Sample 26 does turn up some exceptionally strong homology with a gene from Ursus americanus (DQ240386.1). This would support the idea that the consensus sequence may have been affected by contaminant sequences.” (NOTE:  Ursus americanus is the American black bear.)


Just how did the Ketchum Team compare their sequences to those of known species?  Or did they?  Statements such as “Because the global BLASTn demonstrated statistically significant alignment across the Primate order; a Primate ‘Drill Down’ utilizing BLASTn with inclusive Primate organism taxids was analyzed,” are not nearly discriminating enough.  “Statistically significant alignment” needs numbers and comparisons to back it up.  Nowhere does the paper describe a comprehensive set of BLAST™ searches against databases of other animals such as I did in my first paper.  Everything is referenced to human chromosome 11 with no indication of how well it aligns.  Furthermore restricting the search to ”primate organism taxids” only results in tunnel vision and loss of perspective.  Such a procedure should be preceded by more general searches and justified by hard numbers from these.  It was not.
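To make "numbers and comparisons" concrete, here is a minimal sketch of how BLAST hits might be tabulated so that the best match in each taxonomic group can be compared side by side.  It assumes the standard 12-column tabular output of BLAST+ (-outfmt 6).  The subject IDs, the group assignments, and every value in the example rows are hypothetical, invented purely for illustration; they are not from the Ketchum data or from my papers.

```python
# Standard 12-column layout of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(text):
    """Turn tab-separated BLAST rows into dictionaries with numeric fields."""
    hits = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(FIELDS, line.split("\t")))
        row["pident"] = float(row["pident"])
        row["length"] = int(row["length"])
        row["bitscore"] = float(row["bitscore"])
        hits.append(row)
    return hits


def best_hit_per_group(hits, group_of):
    """Return (group, best hit) pairs, highest bit score first."""
    best = {}
    for hit in hits:
        group = group_of.get(hit["sseqid"], "unassigned")
        if group not in best or hit["bitscore"] > best[group]["bitscore"]:
            best[group] = hit
    return sorted(best.items(), key=lambda item: item[1]["bitscore"], reverse=True)


# HYPOTHETICAL rows and group labels, for illustration only.
EXAMPLE = "\n".join("\t".join(row) for row in [
    ("query1", "subjA", "99.50", "400", "2", "0", "1", "400", "11", "410", "0.0", "720"),
    ("query1", "subjB", "91.00", "400", "36", "0", "1", "400", "5", "404", "1e-150", "540"),
    ("query1", "subjC", "98.00", "150", "3", "0", "1", "150", "1", "150", "2e-70", "260"),
])
GROUPS = {"subjA": "carnivores", "subjB": "primates", "subjC": "carnivores"}

for group, hit in best_hit_per_group(parse_hits(EXAMPLE), GROUPS):
    print(group, hit["sseqid"], hit["pident"], hit["length"], hit["bitscore"])
```

A table like this, built from a general search before any restriction to primate taxids, is the kind of hard-number comparison the paper lacks.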


The Ketchum paper states: “As shown in Supplementary Figures 4, 5 and 6, the Sasquatch consensus that showed homology to human chromosome 11 reference sequence is related to primate lineages including Homo sapiens, Otolemur garnettii, Pan troglodytes (Chimpanzee), Macaca mulatta (Rhesus Monkey), Nomascus leukogenys (White cheeked Gibbon) and Callithrix jacchus (Common Marmoset) and other primate species.”  This statement is false.  Only Supp. Fig. 4 shows this relationship.  As I pointed out previously (see my blog “Otolemur garnettii is No Lemur and We're Not a Fish, a Chicken, or a Mouse”), Supp. Fig. 5 shows a phylotree with a chicken, a mouse, and 29 species of FISH as nearest relatives.  Supp. Fig. 6 shows only the mouse.  The paper does not say which figure goes with which sample, but Supp. Fig. 4 is likely Sample 31, the human.  I doubt the Ketchum team even translated the Latin (scientific) species names in Supp. Figs. 5 and 6, or they would not have published them without explanation.  These figures do not support the statement above.

Table 3 in my first paper shows exactly the same result that was found in comment no. 7.  At first, I didn’t think much of it because the matching sequence was relatively short compared to others I had matched (my Tables 1, 4, and 5).  Later I realized that there is relatively little black bear data in the databases and that the giant panda and the polar bear are much better represented.  Matches to these bears were much better than any primate (including human) matches for Sample 26 (see Table 1 in my first paper).


Melba’s response to the no. 7 comment was:  “…DQ240386 is statistically significantly aligned with primates and carnivores.  In fact, BLASTing DQ240386- ring tailed cats of the raccoon family and seal have as much alignment as Ursus americanus. The maximum score for raccoon and seal are about 850. Maximum score for Ursus americanus is about 900. Max score for Sample 26 is 538. This shows contamination bias.”

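In her response Melba in effect demonstrates the taxonomy of the suborder Caniformia, order Carnivora, which includes, among other families, the bears (Ursidae), the raccoons, ringtails, and coatis (Procyonidae), and the earless seals (Phocidae).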


The gene (BDNF) contained in the sequence mentioned by Melba is apparently highly conserved, so matches to a variety of Caniformia can be expected.  HOWEVER, “The devil’s in the details.”  Table 1 shows that the top matches of DQ240386 in the nucleotide database are all bears. 

 

Table 1.  Matches to DQ240386.1 - Black Bear

ACCESSION        %ID     SPECIES                   SAME       MIS   GAPS   SCORE
Sample 26        100     black bear                species    0     0      538
XM008686880.1    100     polar bear                genus      0     0      889
DQ093584.1       100     Asiatic black bear        genus      0     0      889
AY011500.1       100     brown bear                genus      0     0      889
AF002239.1       99.79   brown bear                genus      1     0      883
AF002240.1       99.17   Malayan sun bear          family     4     0      867
DQ240387.1       98.77   giant panda               family     6     0      870
GU931015.2       98.75   giant panda               family     6     0      856
GU931011.1       98.54   ringtail                  suborder   7     0      850
DQ660197.1       98.54   ringtail                  suborder   7     0      850
U56638.1         98.54   giant panda               family     7     0      850
GU931013.1       97.92   North American raccoon    suborder   10    0      833
GU931012.1       97.92   ring-tailed coati         suborder   10    0      833
GU167726.1       97.92   leopard seal              suborder   10    0      833
DQ660203.1       97.92   North American raccoon    suborder   10    0      833
DQ660202.1       97.92   crab-eating raccoon       suborder   10    0      833
GU174616.1       97.71   ringed seal               suborder   11    0      830

SAME = lowest taxonomic rank shared with the black bear reference.  MIS = mismatches (bases).  GAPS = gaps in alignment (bases).

Notice that the %ID correlates with the taxonomic relationship (same species, genus, family, or suborder) and that Melba’s statements based on scores are too broad and lack sufficient discrimination.  This is a game of single percentages or tenths of a percent.  The Ursidae (bears) clearly match the reference black bear sequence DQ240386.1 BEST.  Other matches, while good, are clearly not AS good.  Sample 26 has a lower score because the sequence length is shorter; apparently the entire gene (BDNF) was not sequenced in this sample.  
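The arithmetic behind %ID can be sketched in a few lines.  Percent identity is the share of aligned positions that are identical, so each additional mismatch moves it by a fraction of a percent.  The aligned length of 480 bases used below is my own inference from the mismatch counts and %ID values in Table 1, not a figure reported anywhere in the post, and one giant panda row (98.77% with 6 mismatches) implies a slightly longer alignment, so treat it as an assumption.

```python
def percent_identity(aligned_length, mismatches, gaps=0):
    """Percent of aligned positions that are identical between query and subject."""
    identical = aligned_length - mismatches - gaps
    return 100.0 * identical / aligned_length


# ASSUMPTION: an aligned length of 480 bases, inferred from Table 1 rather
# than stated in the source.
ASSUMED_LENGTH = 480

for mismatches in (0, 1, 4, 7, 10, 11):
    print(mismatches, round(percent_identity(ASSUMED_LENGTH, mismatches), 2))
```

Under that assumption, the gap between the bears and the ringed seal is eleven mismatched bases, which is why differences of tenths of a percent matter here.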

No, there is no “contamination bias” here.  In fact, THE SAMPLE IS FROM A BEAR, as proven by two independent laboratory analyses (Huggins and Sykes, at right) and by my interpretation of Ketchum's nDNA sequence in my paper 1.  This fact was missed by the reviewer and by Melba, who said “…your example of bear contamination can be completely ruled out considering none of the laboratories handling the samples have bear samples,” a red herring, or possibly worse: a misleading diversion.  And 97.71%ID is a genetic mile from 100%, another fact not recognized by Melba.  So conclude issues (2), (4), and (7), which are all related.


Three more issues are highly significant: (14), (15), and (16).  On issue (14), the mtDNA is “poorly” human.  This is what I found (Paper 2) for eight of the 18 samples for which a complete mitochondrial genome was determined.  Specifically, these samples contained too many extra mutations to fit neatly into the mtDNA Phylotree (see at right) for human haplogroups.  These eight samples each had less than a 1% chance of being in the human population based on their numbers of extra mutations, which ranged from 7 to 17.  The average for the haplogroup H1a is only 2.37.  Additionally, eight of the remaining 11 samples, for which only the HV1 region was sequenced, had one extra mutation each.  So what are they?  They’re still much closer to human than anything else (>99.9%ID).  It cannot be determined whether they are of human origin but further evolved since a hybridization event, according to the Ketchum theory, or whether they simply contain sequencing errors due to contamination.  In any case it’s incorrect to say that they are “completely human.”  At best, they might be human mutants.
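As a rough illustration of why 7 to 17 extra mutations stand out against an average of 2.37, the sketch below computes an upper-tail probability.  The Poisson model is an assumption chosen here for simplicity; it is not necessarily the method used in Paper 2, so its output need not reproduce the 1% figure quoted above.

```python
import math


def poisson_tail(k, mean):
    """P(X >= k) for a Poisson-distributed count with the given mean."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


# 2.37 is the H1a average quoted in the post; the Poisson model itself is
# an ASSUMPTION made only for this illustration.
H1A_MEAN = 2.37

for extra_mutations in (3, 7, 12, 17):
    print(extra_mutations, poisson_tail(extra_mutations, H1A_MEAN))
```

Whatever distribution is used, the point is the same: the probability falls off steeply as the count of extra mutations climbs past the haplogroup average.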


The best (by score) sequence match to Sample 31 is a fungus and the second best appeared to be a bacterium, so degradation as suggested in comment (15) is real.  Melba acknowledged this to me privately, yet publicly and in her responses to the reviewers she vigorously denies any degradation or contamination.  This is related to issue (16): the electron microscopy showing single-stranded DNA segments.  I asked the microscopist coauthor, Prof. Andreas Holzenburg of Texas A&M University, about possible explanations for this, and he responded that he only takes pictures and does not engage in the paranormal, which disappointed me (his name's on the paper).  Melba called the microscopy results “supporting evidence for the unusual behavior of the amplified DNA.”  But the only organisms known to carry their genomes as single-stranded DNA are certain viruses; in cellular organisms single-stranded DNA occurs only transiently, for example during replication, and persistent single-stranded segments would disrupt the known process of DNA replication.


In spite of Melba’s claims that standard forensic techniques were employed, issues (6) and (13) point out that there is nothing tying any of these samples to a specific identifiable animal.  No wonder the results are so mixed.  New species identification needs a holotype or very convincing photos or videos to accompany a DNA sample.  For example, Sample 26 was collected under two feet of snow weeks after the reported shooting of a sasquatch in the general vicinity.  Who can say that it was related to the shooting since no whole body was found, only this small patch of fur and tissue?

Issue (8) also merits discussion.  The three nuclear DNA sequences align as follows with one another:

S31 vs. S26: Query Cover (alignment) 8% of S31.

S140 vs. S26: Query Cover 62% of S140.

S31 vs. S140: Query Cover 8% of S31.

I pointed this out to Melba in April, 2013.  These cannot be the same species.  Sample 31 is especially far removed from the other two.  This made sense when I found out that S26 was a bear, S31 a human, and S140 a dog.   Notice that the bear and dog align better, being more closely related. 
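For readers unfamiliar with the term, Query Cover is the percentage of the query sequence that falls inside at least one aligned segment.  The sketch below shows one way to compute it from a list of aligned intervals, merging overlaps so that no base is counted twice.  The function name and the interval coordinates in the example are hypothetical; they are not taken from the S26, S31, or S140 alignments.

```python
def query_cover(query_length, segments):
    """Percent of query positions covered by at least one aligned segment.

    segments: iterable of (start, end) positions on the query,
    1-based and inclusive, in either orientation.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in segments):
        if current_end is None or start > current_end + 1:
            # A new, non-overlapping block begins; bank the previous one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent segment: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / query_length


# HYPOTHETICAL coordinates: two overlapping segments on a 1,000-base query.
print(query_cover(1000, [(1, 50), (40, 80)]))
```

A low query cover means that most of the query found no counterpart at all in the other sequence, which is not what two samples of the same species should show.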

I pointed out to Melba that hair diameter comparisons require statistical analysis (issue (9)).  One cannot simply compare two samples and say with certainty whether they belong to the same population.  A statistical distribution of the suspected population is needed, and preferably also of the target organism.  Any two hair samples may be outliers.  Standard statistical tests can determine the probability that a sample or collection of samples belongs to a specific population.  She didn’t do this. 
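One of the simplest such tests can be sketched as follows: a two-sided z-test of a sample's mean hair diameter against a reference population.  Every number below is hypothetical and chosen only to show the mechanics; the test also assumes the reference population's mean and standard deviation are known and that diameters are roughly normally distributed, which would have to be checked against real measurements.

```python
import math
from statistics import NormalDist, mean


def z_test(sample, population_mean, population_sd):
    """Two-sided z-test of a sample mean against a known reference population.

    Returns (z, p), where p is the probability of a sample mean at least
    this far from the population mean if the sample came from that population.
    """
    n = len(sample)
    standard_error = population_sd / math.sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# HYPOTHETICAL values in micrometers, for illustration only.
REFERENCE_MEAN = 80.0
REFERENCE_SD = 15.0
questioned_hairs = [95.0, 102.0, 88.0, 110.0, 97.0]

z, p = z_test(questioned_hairs, REFERENCE_MEAN, REFERENCE_SD)
print(z, p)
```

With only a handful of hairs and no reference distribution, no such probability can be stated at all, which is the reviewers' point.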

Lastly, issue (5), contamination, is important but hard to prove, especially when some samples were misidentified.  My third paper demonstrates multiple species in some samples.  What is the “origin” and what is “contamination” can be difficult to answer, but in any case no new species was proven to exist.  A number of specific gene sequences did not align with any species in several databases.  These may be examples of issue (15), unknown microorganisms, and of mispriming, as pointed out by one reviewer.

No wonder all six reviewers agreed on issue (1): inadequately substantiated thesis.  The results do not support the conclusions.  This is precisely what I found in my papers: flawed interpretation of results.

Were these reviews fair and unbiased?  Based on the results in my three papers, they were on target, although having the DNA sequences available from the start would have made possible better substantiated reviewer criticism in some cases, especially for the Nature reviewers.  The paper didn’t provide the usual kinds of evidence for a new species, so the reviewers did not recommend publication in its final form.  In some cases Melba was able to satisfy relatively minor requests for semantic and other changes, but more importantly the DNA sequences, when carefully interpreted, do not point to a new species.  This does not disprove the existence of sasquatch; it shows only that the Ketchum study does not prove its existence.

I leave you with the following quotation from Scott Carpenter (see his blog site at right):

“I make a challenge to any one that has access to the BLAST application. Take the supplemental attachments in Dr. Ketchum's study that contain the raw DNA sequence data, convert those PDF files into the correct format and then compare the sequences to the GenBank database and see if there are any matches. We already know the data is good, it has been produced by the University of Texas. Have the courage to run the BLAST search and publish the results! Because you know they will show no match, that the species represented in the data is a NEW species, a NEW homo sapiens species  a BIGFOOT..... “

Thanks, Scott, I did just that.  Sample 26 is a bear.  Sample 31 is human.  Sample 140 is a dog.  Nothing new here.  See my first paper.  Now, Scott, please take your own challenge and follow my procedures as outlined in my three-part blog on using BLAST™.  Then show us YOUR results.  Until you do so, you have no credibility.  Walk the talk, please.  Your University of Texas experts blew it, big time.  Ask them to review my papers as I did theirs and tell us what they said.  I asked Melba to do this and she refused.  I sent coauthor Dr. Fan Zhang of the University of North Texas my work, requested a response, and received none.  These are not examples of the open-mindedness and unbiasedness which Melba demands of others.

And no, Scott and Melba, “Accept for publication with revisions” (which you refused to make) does not "pass peer review."