Benchmarking bacterial taxonomic classification using nanopore metagenomics data of several mock communities

dc.contributor.author: Van Uffelen, Alexander
dc.contributor.author: Posadas, Andres
dc.contributor.author: Roosens, Nancy H.C.
dc.contributor.author: Marchal, Kathleen
dc.contributor.author: De Keersmaecker, Sigrid C. J.
dc.contributor.author: Vanneste, Kevin
dc.date.accessioned: 2024-09-18T05:47:26Z
dc.date.available: 2024-09-18T05:47:26Z
dc.date.issued: 2024-08
dc.description: DATA AVAILABILITY: The datasets presented in this study originate from other studies and can be found under the run accessions in Table 1. The output reports with all metrics and plots are available on Zenodo (https://zenodo.org/doi/10.5281/zenodo.11371848). [en_US]
dc.description: CODE AVAILABILITY: The source code to perform the analysis and generate the output reports is publicly available on GitHub (https://github.com/BioinformaticsPlatformWIV-ISP/BenchmarkingClassifiers) accompanied by an example dataset showcasing the expected output structure and final output file. [en_US]
dc.description.abstract: Taxonomic classification is crucial in identifying organisms within diverse microbial communities when using metagenomics shotgun sequencing. While second-generation Illumina sequencing still dominates, third-generation nanopore sequencing promises improved classification through longer reads. However, extensive benchmarking studies on nanopore data are lacking. We systematically evaluated performance of bacterial taxonomic classification for metagenomics nanopore sequencing data for several commonly used classifiers, using standardized reference sequence databases, on the largest collection of publicly available data for defined mock communities thus far (nine samples), representing different research domains and application scopes. Our results categorize classifiers into three categories: low precision/high recall; medium precision/medium recall, and high precision/medium recall. Most fall into the first group, although precision can be improved without excessively penalizing recall with suitable abundance filtering. No definitive ‘best’ classifier emerges, and classifier selection depends on application scope and practical requirements. Although few classifiers designed for long reads exist, they generally exhibit better performance. Our comprehensive benchmarking provides concrete recommendations, supported by publicly available code for reassessment and fine-tuning by other scientists. [en_US]
dc.description.department: Genetics [en_US]
dc.description.librarian: hj2024 [en_US]
dc.description.sdg: SDG-15: Life on land [en_US]
dc.description.sponsorship: Sciensano, Belgium. [en_US]
dc.description.uri: http://www.nature.com/sdata/ [en_US]
dc.identifier.citation: Van Uffelen, A., Posadas, A., Roosens, N.H.C. et al. Benchmarking bacterial taxonomic classification using nanopore metagenomics data of several mock communities. Scientific Data 11, 864 (2024). https://doi.org/10.1038/s41597-024-03672-8. [en_US]
dc.identifier.issn: 2052-4463 (online)
dc.identifier.other: 10.1038/s41597-024-03672-8
dc.identifier.uri: http://hdl.handle.net/2263/98287
dc.language.iso: en [en_US]
dc.publisher: Nature Research [en_US]
dc.rights: © The Author(s) 2024. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License. [en_US]
dc.subject: Classification and taxonomy [en_US]
dc.subject: Metagenomics [en_US]
dc.subject: SDG-15: Life on land [en_US]
dc.title: Benchmarking bacterial taxonomic classification using nanopore metagenomics data of several mock communities [en_US]
dc.type: Article [en_US]
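
The abstract notes that abundance filtering can raise precision without excessively penalizing recall. The following is a minimal sketch of that idea, not the study's own code: the taxon names, read counts, and thresholds are invented for illustration, and the function names `precision_recall` and `filter_by_abundance` are hypothetical. The metric definitions used in the published benchmark may differ.

```python
def precision_recall(predicted, expected):
    """Return (precision, recall) for two collections of taxon names."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


def filter_by_abundance(read_counts, min_fraction):
    """Keep taxa whose share of all classified reads is at least min_fraction."""
    total = sum(read_counts.values())
    if total == 0:
        return set()
    return {taxon for taxon, n in read_counts.items() if n / total >= min_fraction}


# Hypothetical mock community (ground truth) and classifier output.
expected = {"Escherichia coli", "Bacillus subtilis", "Listeria monocytogenes"}
reported = {
    "Escherichia coli": 5000,
    "Bacillus subtilis": 3000,
    "Listeria monocytogenes": 1900,
    "Shigella flexneri": 60,   # assumed false positive at low abundance
    "Bacillus cereus": 40,     # assumed false positive at low abundance
}

# Illustrative thresholds: none, 1 % of reads, 20 % of reads.
for threshold in (0.0, 0.01, 0.2):
    kept = filter_by_abundance(reported, threshold)
    p, r = precision_recall(kept, expected)
    print(f"min_fraction={threshold}: precision={p:.2f}, recall={r:.2f}")
```

In this toy data, a modest threshold removes the low-abundance false positives while keeping every expected taxon, whereas an overly strict threshold also drops a true member of the community and lowers recall.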

Files

Original bundle

Name: VanUffelen_Benchmarking_2024.pdf
Size: 1.88 MB
Format: Adobe Portable Document Format
Description: Article

Name: VanUffelen_BenchmarkingSuppl_2024.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format
Description: Supplementary Material

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission