Abstract:
Microbiome-host interactions affect host health; shotgun metagenomics sequences microbiome samples, allowing us to analyze their taxonomic and metabolic potential. Reconstructing metagenome fragments into genomes (called metagenome-assembled genomes) facilitates linking function to taxa within microbial symbionts. Genome reconstruction sorts assembled sequences into bins, each characteristic of a genome. However, the microbial community composition, including taxonomic and phylogenetic diversity, may influence genome reconstruction. We determined the optimal reconstruction method for four microbiome projects that vary in sequencing platform, diversity, and environment, using a set of parameters to select the optimal assembly and binning tools. We evaluated three assemblers (IDBA, MetaVelvet, and SPAdes) and two binning tools (GroopM and MetaBat) across the four projects (105 metagenomes). We found that SPAdes assembled more contigs (143,718 ± 124) of longer length (N50 = 1632 ± 108 bp), incorporated the most sequences (19.65%), and produced low chimera levels (microbial richness and evenness were maintained across assembly). SPAdes assembly was responsive to biological and technological variation within the projects. The MetaBat binning tool produced bins characteristic of a genome, with less GC variation (standard deviation 1.49), low species richness (4.91 ± 0.66), and higher genome completeness (40.92 ± 1.75). MetaBat extracted 115 bins, of which 66 were identified as quality metagenome-assembled genomes with genus-specific sequences. In conclusion, we present a set of biologically relevant parameters for selecting optimal assembly and binning tools. SPAdes and MetaBat reconstructed quality metagenome-assembled genomes for the four projects included in this study.
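For readers unfamiliar with two of the parameters reported above, the following is a minimal Python sketch of how contig N50 and within-bin GC variation are commonly computed. It is not the pipeline used in this study: the function names (`n50`, `gc_percent`, `gc_spread`) are hypothetical, and the choice of a population standard deviation over per-contig GC percentages is an assumption, since the abstract does not state how GC variation was derived.

```python
from statistics import pstdev


def n50(lengths):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def gc_percent(sequence):
    """GC content of one contig as a percentage of called bases (A, C, G, T)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no called bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_spread(contigs):
    """Population standard deviation of GC% across the contigs of one bin.

    Assumption: the study may have used a different estimator.
    """
    return pstdev([gc_percent(seq) for seq in contigs])
```

A larger N50 indicates that more of the assembly sits in long contigs, and a small GC spread within a bin is consistent with its contigs originating from a single genome.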