Tetraploid species #201

Homap · 2025-05-19T10:33:17Z

Hello,

Thank you so much for the great software and the time you put into helping us with our questions.

I am having trouble understanding my smudgeplot. I used the following commands to generate it:

FastK -v -t4 -k31 -M16 -T4 $fastq_filtered_dir/P18758_173_S83_L004_R1_001.filtered.fastq.gz $fastq_filtered_dir/P18758_173_S83_L004_R2_001.filtered.fastq.gz -NBAT4x_FastK_db
Histex -G BAT4x_FastK_db > BAT4x_Kmer31.hist
genomescope2 -i BAT4x_Kmer31.hist -k 31 -p 4 -o BAT4x -n BAT4x
smudgeplot.py hetmers -L 5 -t 4 -o BAT4x --verbose ../FastK_db/BAT4x_FastK_db
smudgeplot.py plot -o BAT4x_smudgeplot_2D BAT4x_smudgeplot_masked_errors_smu.txt BAT4x_smudgeplot_smudge_sizes.txt 13.2

and it looks like this:

BAT4x_smudgeplot_2D_10thtry_smudgeplot_log10.pdf

I already know the ploidy from flow cytometry data (tetraploid). The histograms make sense, I think. I was wondering whether this is indicative of autopolyploidy or allopolyploidy. I was also wondering about the smudgeplot: do you think it's worth pursuing this analysis given my low haploid coverage of about 13x?

Our main goal here is to differentiate between the modes of polyploidy, auto- or allo-. I'd appreciate your help very much with this.

Thank you in advance,
Homa


Homap · 2025-05-19T13:48:22Z

Sorry, another question. I installed smudgeplot using conda. However, I cannot get the top and right-side histograms. I tried to add these myself, but I haven't been fully successful yet. I see in other smudgeplot examples that these histograms are also produced. I also tried copying and pasting the code from GitHub into the conda installation, but it still didn't work. Thanks so much for your help again!

KamilSJaron · 2025-05-20T08:13:39Z

Hi, this indeed looks like a tetraploid. It's one of those funny cases that are hard to make anything out of. What is the species? Is it sexual?

Hannes Becher developed some explicit expectations for auto- and allo- tetraploid k-mer spectra. In his model, it always needs to be auto- when the first peak is the tallest: https://www.cell.com/plant-communications/fulltext/S2590-3462(20)30133-4?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS2590346220301334%3Fshowall%3Dtrue

However, there are two caveats. First, his model is a symmetrical allo-, so AABB-style genomes. If you had a hybrid that was AAAB, the expectation would break down (e.g. in the case of root-knot nematodes). Second, you need to trust that you know where the 1n peak is: is your genome size estimate in the right ballpark? Is the heterozygosity sensible? What is the species?
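
As a rough aid for checking where the 1n peak sits, the higher peaks of the k-mer spectrum are expected at integer multiples of the haploid coverage. The sketch below assumes that the 13.2 passed to `smudgeplot.py plot` earlier in this thread is the 1n coverage and that the sample is tetraploid. The function name `expected_peaks` is hypothetical and is not part of smudgeplot or GenomeScope2.

```python
def expected_peaks(haploid_cov, ploidy):
    """Return the expected k-mer coverage of the 1n..(ploidy)n peaks."""
    # Each peak sits at an integer multiple of the 1n (haploid) coverage.
    return [round(haploid_cov * n, 1) for n in range(1, ploidy + 1)]


# Assumption: 13.2 is the 1n coverage used in the plot command above.
peaks = expected_peaks(13.2, 4)
for n, cov in enumerate(peaks, start=1):
    print(f"{n}n peak expected near {cov}x")
```

If the peaks in the histogram do not line up with these multiples, the 1n coverage (and therefore the genome size estimate) may be off by a factor of two.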

Homap · 2025-05-20T08:27:53Z

Hi Kamil, thank you so much for your quick responses and the time you put into this. The organism is a flowering plant called Lithophragma bolanderi. Its tetraploid nature was determined using flow cytometry. We are puzzled by it because the ITS sequences, originally obtained by Sanger sequencing, indicated an allopolyploid origin. Excited, we sequenced the parental diploids and the polyploids, but the genome-wide data fail to show any sign of allopolyploidy. In the PCA, the polyploids cluster with only one of the parents, and STRUCTURE shows the same. The reviewers asked for the k-mer analysis using GenomeScope2. The genome size reported based on flowcytometry is about 470 Mb, similar to the one reported by GenomeScope2; however, our assembly size is about 780 Mb. Sorry, it has all been a bit confusing. I'd appreciate any advice you may have. Thank you so much!

KamilSJaron · 2025-05-20T12:26:52Z

"The genome size reported based on flowcytometry is about 470 Mb"

What is this number? 1C? 2C? Or do you divide it by ploidy?

Well the reviewer is quite right that you need to be cautious about what you are actually looking at - if you have uncollapsed haplotypes and you call variants and do STRUCTURE, that will be a disaster.

Your coverage / genome model / genome assembly MUST make sense together.
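
One way to see why these numbers need to agree is to compare the assembly size with the monoploid genome size implied by each reading of the flow cytometry value. This is only a sketch using the 470 Mb and 780 Mb figures quoted in this thread. It assumes the usual convention that, for a tetraploid, 2C spans four monoploid genomes and 1C spans two; the variable names are hypothetical.

```python
FLOW_CYTOMETRY_MB = 470  # value quoted in the thread; its meaning is unclear
ASSEMBLY_MB = 780        # assembly size quoted in the thread

# Assumed number of monoploid genomes covered by the 470 Mb figure
# under each possible reading of that value (tetraploid sample).
monoploid_copies = {
    "already divided by ploidy (1Cx)": 1,
    "1C": 2,
    "2C": 4,
}

ratios = {}
for reading, copies in monoploid_copies.items():
    monoploid_mb = FLOW_CYTOMETRY_MB / copies
    # Ratio near 1: collapsed assembly; near 4: fully uncollapsed haplotypes.
    ratios[reading] = ASSEMBLY_MB / monoploid_mb
    print(f"{reading}: monoploid ~{monoploid_mb:.1f} Mb, "
          f"assembly is {ratios[reading]:.2f}x that size")
```

A ratio between 1 and 4 would be compatible with partially uncollapsed haplotypes in the assembly, which is exactly the situation where variant calling and STRUCTURE become unreliable. A ratio above 4 would suggest that the reading of the flow cytometry value is wrong.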

Did you read through that wiki? Give it a few days of playing around with the models, and perhaps look at the BGA tutorial about GenomeScope too. I am sorry, I don't have the capacity to help with this more right now... I am preparing for a k-mer course we are running June 1-6...

Homap · 2025-05-20T12:42:09Z

This is already great! Thank you so much and good luck with the course!

KamilSJaron added the labels "smudgeplot included" (if smudgeplot was posted with the question / problem) and "genomescope included" on May 20, 2025
