Intraspecies associations from strain-rich metagenome samples.

Cell Reports
Abstract

Genetically distinct strains of a species can vary widely in phenotype, reducing the utility of species-resolved microbiome measurements for detecting associations with health or disease. While metagenomics theoretically provides information on all strains in a sample, current strain-resolved analysis methods face a tradeoff: de novo genotyping approaches can detect novel strains but struggle when applied to strain-rich or low-coverage samples, while reference database methods work robustly across sample types but are insensitive to novel diversity. We present PHLAME, a method that bridges this divide by combining the advantages of reference database approaches with novelty awareness. PHLAME explicitly defines clades at multiple phylogenetic levels and introduces a probabilistic, mutation-based framework to quantify novelty from the nearest reference. By applying PHLAME to publicly available human skin and vaginal metagenomes, we find clade associations with coexisting species, geography, and host age. The ability to characterize intraspecies associations and dynamics in previously inaccessible environments will enable strain-level insights from accumulating metagenomic data.

Year of Publication
2025
Journal
Cell Reports
Volume
44
Issue
8
Pages
116134
Date Published
08/2025
ISSN
2211-1247
DOI
10.1016/j.celrep.2025.116134
PubMed ID
40811063