Another pre-publication review of: Reagent and laboratory contamination can critically impact sequence-based microbiome analyses


This is a pre-publication review of the manuscript entitled “Reagent and laboratory contamination can critically impact sequence-based microbiome analyses” by Salter et al., 2014. Another pre-publication review of the same manuscript can be found at

Reviewer's report


The manuscript by Salter et al. describes an analysis of the contaminants that can be introduced by the use of common DNA purification kits. The topic of this manuscript is particularly relevant as illustrated by a number of recent publications where likely contaminants (soil- or plant-associated bacteria) have, most likely erroneously, been linked to disease in humans.


I feel that the manuscript by Salter et al. substantially expands on a number of recent publications that have highlighted the dangers of contaminating DNA in microbiome studies. The authors have performed a thorough, quantitative analysis of the effects of contaminants in 16S rRNA profiling and metagenomic sequencing and convincingly show that studies of low-biomass samples can be strongly affected by contamination. I have found no serious deficiencies in this manuscript, but have outlined suggestions for improvements and corrections below.


The authors mostly focus on discussing their data in the context of human microbiome studies. Perhaps the authors could expand on how they feel that their findings and proposed guidelines should be used in non-human settings, particularly in studies of soil and water microbiomes, where some of the common contaminants may be naturally present in the samples.


I do not know the specific contents of the kits that were used in this study. Do they contain all necessary buffers or are self-prepared buffers or water from the laboratory also used during DNA purification? If the kits provide all materials, please state this specifically as this would rule out an important possibility for contamination originating from the laboratory.


l. 43. The list of contaminant genera that were detected in sequenced negative “blank” controls (Table 1) is interesting and important for future reference, but it is unclear how these data were collected. Information should be provided to clarify how this list has been compiled.


DNA of the S. bongori cultures was isolated using different batches of the same kit in the laboratories participating in this study. This means that any contamination that is detected may also have been introduced during sample preparation in these laboratories. While the authors briefly mention that this experiment cannot distinguish between contamination of the kit and contamination that was introduced during handling (line 91), I feel this limitation should be made clearer in the text. Indeed, the observation that contaminants originate from the kit is more convincingly shown in the qPCR analysis of bacterial biomass eluted from a DNA extraction kit (Fig. S2) and in the metagenomic sequencing section, where DNA was isolated with different kits in the same laboratory.


l. 99 – 105. I would prefer to add Fig. S2 to the main text of the manuscript as it is very informative about the level of contamination in the eluate from a DNA extraction kit. However, it is unclear from the text which kit was used for this experiment: this information should be added to the manuscript.


Fig. 2a. Levels of contaminants in the PSP kit appear to be extremely high, as DNA purification from an undiluted S. bongori culture results in >80% contaminating DNA sequences. The contamination of this kit should therefore be equivalent to approximately 4 x 10^9 cells/kit! Perhaps the authors can carefully check their data: possibly some type of contaminant has been introduced during handling of the samples? Or does this kit perform exceedingly poorly for the isolation of DNA from Gram-negative bacteria?
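The arithmetic behind this estimate can be sketched as follows. This is only a rough illustration, not a calculation taken from the manuscript: it assumes that read fractions are proportional to cell numbers (equal 16S rRNA gene copy number, extraction efficiency and amplification efficiency for S. bongori and the contaminants), and it assumes an input of about 10^9 S. bongori cells in the undiluted culture, a value chosen here because it reproduces the figure of approximately 4 x 10^9. The function name and the input cell number are hypothetical.

```python
def contaminant_cell_equivalents(contaminant_fraction, input_cells):
    """Estimate contaminant load in cell equivalents.

    Assumes read fractions are proportional to cell numbers, so that
    contaminants / target = fraction / (1 - fraction).
    """
    if not 0 <= contaminant_fraction < 1:
        raise ValueError("contaminant_fraction must be in [0, 1)")
    return input_cells * contaminant_fraction / (1 - contaminant_fraction)


# ASSUMPTION: about 1e9 S. bongori cells in the undiluted sample.
# This number is not stated in the review; it is illustrative only.
ASSUMED_INPUT_CELLS = 1e9

# 80% contaminant reads is the lower bound read from Fig. 2a.
estimate = contaminant_cell_equivalents(0.80, ASSUMED_INPUT_CELLS)
print(f"Estimated contaminant load: {estimate:.1e} cell equivalents")
```

Because the contaminant fraction in Fig. 2a is a lower bound (>80%), the result under these assumptions is also a lower bound; a smaller input cell number would scale the estimate down proportionally.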


Fig. 2b. I believe this figure may be clearer if the data are presented at the ‘Order’ or ‘Class’ level, as the colours, particularly the greens and blues, are very difficult to distinguish. Perhaps the current Figure 2b can be moved to the supplementary data.


l. 133. What do the authors mean by the ‘paucity of reads’? Why is the number of reads in the MB samples low?


l. 334. New England Biolabs produces several different types of Q5 DNA polymerases. Please specify the one that was used in this study.


Data should be made available in appropriate sequence data repositories.


Level of interest: 5. Outstanding general interest


Quality of written English: Yes


Statistical review: No


What next?: Accept with discretionary revisions


This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.