Correcting for batch effects in case-control microbiome studies.
| Authors | |
| Keywords | |
| Abstract | High-throughput data generation platforms, like mass spectrometry, microarrays, and second-generation sequencing, are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure where features (i.e., bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study prior to pooling data across studies. We look at how this percentile-normalization method compares to traditional meta-analysis methods for combining independent p-values and to limma and ComBat, widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile-normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses. |
| Year of Publication | 2018 |
| Journal | PLoS Comput Biol |
| Volume | 14 |
| Issue | 4 |
| Pages | e1006102 |
| Date Published | 2018 04 |
| ISSN | 1553-7358 |
| DOI | 10.1371/journal.pcbi.1006102 |
| PubMed ID | 29684016 |
| PubMed Central ID | PMC5940237 |
| Links | |
| Grant list | P30 DK043351 / DK / NIDDK NIH HHS / United States |
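
The percentile-normalization procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `percentile_of_score` and `percentile_normalize`, the samples-by-features list layout, and the tie-handling rule (averaging strict and weak ranks) are assumptions made for illustration. The sketch also maps control samples onto their own distribution so that cases and controls share a scale after pooling, which is likewise an assumption.

```python
from bisect import bisect_left, bisect_right


def percentile_of_score(reference, value):
    """Percentile (0-100) of `value` within `reference`.

    Ties are handled by averaging the count of reference values strictly
    below `value` and the count at or below it (an assumed convention).
    """
    ref = sorted(reference)
    below = bisect_left(ref, value)         # reference values < value
    at_or_below = bisect_right(ref, value)  # reference values <= value
    return 100.0 * (below + at_or_below) / (2 * len(ref))


def percentile_normalize(abundances, is_case):
    """Convert each feature to a percentile of that feature in controls.

    abundances: list of samples, each a list of feature abundances
                (hypothetical layout: samples x features, one study only).
    is_case:    list of booleans, True for case samples.
    """
    controls = [row for row, case in zip(abundances, is_case) if not case]
    if not controls:
        raise ValueError("need at least one control sample")
    n_features = len(abundances[0])
    # Per-feature control distributions, computed within this study only.
    control_columns = [[row[j] for row in controls] for j in range(n_features)]
    return [
        [percentile_of_score(control_columns[j], row[j]) for j in range(n_features)]
        for row in abundances
    ]


# Toy study: three controls and one case, two taxa (made-up numbers).
study = [[0.1, 5.0], [0.2, 3.0], [0.3, 1.0], [0.25, 4.0]]
labels = [False, False, False, True]
normalized = percentile_normalize(study, labels)
```

Under these assumptions, each study would be normalized separately against its own controls, and the resulting percentile tables concatenated across studies before running a pooled case-control test.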