Question about de novo multi-species Nocardiopsis database #339

Open
@ERBringHorvath

I have a question about a draft Nocardiopsis spp. model I've built with PopPUNK. I work in the natural products space, primarily with the phylum Actinomycetota, a group of bacteria that can be incredibly genetically diverse even at the species level. I am working with a novel Nocardiopsis species and am using PopPUNK to determine its taxonomic position relative to the Nocardiopsis genomes available from NCBI. Because my strain is a new species (confirmed chemotaxonomically), I need a tool suited to mixed-species taxonomic characterization. I started with PopPUNK because the 2019 publication highlighted its utility for classifying multi-species cohorts of bacterial pathogens, and because I'd like to use core + accessory genomes to build a phylogeny.

My question is this: given the results below, is PopPUNK appropriate for a genus as diverse as Nocardiopsis?

I've attached all files generated during database creation as a .tar.gz file.

Reference Nocardiopsis genomes were downloaded from NCBI using Datasets; 157 genomes (complete or draft) were used.

The following workflow was used to generate this first database:
Sketch:
poppunk --create-db --output nocardiopsis_ref --r-files ref.txt --min-k 13 --max-k 29
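
For anyone reproducing this step, a list like `ref.txt` can be generated with a few lines of shell. This is only a sketch: PopPUNK expects `--r-files` to be a headerless, tab-separated file with the sample name in the first column and the assembly path after it, but the directory layout, the `.fna` extension, and the `make_rfile` helper name are assumptions here, not necessarily what was used for this database.

```shell
# Build a PopPUNK --r-files list: one line per genome, "<sample name><TAB><path>".
# Assumption: assemblies are FASTA files ending in .fna somewhere under the
# given directory (hypothetical layout; adjust the pattern to your download).
make_rfile() {
    find "$1" -type f -name '*.fna' | sort | while IFS= read -r path; do
        # Use the file name without its extension as the sample name.
        name="$(basename "$path" .fna)"
        printf '%s\t%s\n' "$name" "$path"
    done
}

# Usage (directory name is illustrative):
#   make_rfile ncbi_dataset/data > ref.txt
```

Sample names derived this way must be unique, so check for duplicates if several assemblies share a file name.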

QC database:
poppunk --qc-db --ref-db nocardiopsis_ref --qc-keep

This resulted in 128 of the 157 genomes failing QC for various reasons, mostly the distance check.
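
To put that number in context, and to see which QC reasons dominate, something like the following can help. It is a rough sketch: the report is assumed to be a tab-separated file with the sample name in the first column and the failure reason in the second, and both helper names are made up for illustration. Check the actual QC report written by your PopPUNK version before relying on it.

```shell
# Percentage of genomes that failed QC, given <failed> and <total>.
pct_failed() {
    awk -v f="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * f / t }'
}

# Count how often each failure reason occurs, most frequent first.
# Assumption: column 1 is the sample name, column 2 onward is the reason.
tally_qc_reasons() {
    cut -f2- "$1" | sort | uniq -c | sort -rn
}

# 128 of the 157 genomes failed QC in this run.
pct_failed 128 157
```

With 128 of 157 genomes failing, roughly four out of five references are dropped before the model is fit.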

Fit model:
poppunk --fit-model bgmm --ref-db nocardiopsis_ref --output nocardiopsis_ref

Fit summary:
Avg. entropy of assignment 0.0160
Number of components used 2

Scaled component means:
[0.69171908 0.07866528]
[0.36643846 0.3914161 ]

Network summary:
Components 1
Density 0.1921
Transitivity 0.5733
Mean betweenness 0.1924
Weighted-mean betweenness 0.1924
Score 0.4631
Score (w/ betweenness) 0.3740
Score (w/ weighted-betweenness) 0.3740
Removing 136 sequences
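
As a quick check on the fit summary above, the distance of each scaled component mean from the origin can be computed directly. This matters only under the assumption that the component nearest the origin is the one treated as within-strain, which is how I understand the PopPUNK documentation; the `nearest_origin` helper is hypothetical.

```shell
# Read "x y" pairs (one component mean per line) and print the 1-based index
# of the mean closest to the origin, followed by its Euclidean distance.
nearest_origin() {
    awk '{
        d = sqrt($1 * $1 + $2 * $2)
        if (NR == 1 || d < best) { best = d; idx = NR }
    } END { printf "%d\t%.2f\n", idx, best }'
}

# Scaled component means copied from the fit summary.
printf '0.69171908 0.07866528\n0.36643846 0.3914161\n' | nearest_origin
```

Neither mean sits close to the origin (the distances are about 0.70 and 0.54), which is worth keeping in mind when reading the two-component fit.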

Thanks!

nocardiopsis_ref.tar.gz

Labels: enhancement
