Should a larger K value be chosen? #133

GLking123 · 2024-06-25T13:29:13Z

Hello,

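When I used k-mer = 21, the GenomeScope graph was as follows:

image.png: https://github.com/schatzlab/genomescope/assets/71629239/e82442a7-bf98-42dc-a045-a7be405e5de9

When I used k-mer = 27, the GenomeScope graph was as follows:

image.png: https://github.com/schatzlab/genomescope/assets/71629239/055b224c-c15f-4bb3-9701-0506ae48f405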

Flow cytometry estimated the genome size at around 50 Gb, and the assembled genome is also around 50 Gb. However, this differs significantly from the GenomeScope graphs above. Should I increase the k-mer size, for example to 31?

Could you provide some debugging suggestions for this problem? Thank you for your valuable time and assistance. I sincerely look forward to your response!

mschatz · 2024-07-28T22:04:27Z

GenomeScope reports the haploid genome size, but it looks like you have a tetraploid, so the estimated genome size here would need to be multiplied by 4 to compute the total DNA content. For example, for a human sample it will report 3 Gbp for the genome size, but this needs to be multiplied by 2 to reach the total content. Otherwise, you may need to adjust the k-mer counting to account for the very high-frequency k-mers. The histogram often gets truncated at 1,000x or 10,000x, but you will need to push this out to 100,000x or higher to capture the most abundant repeats.

Good luck!
Mike
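The two checks described above can be sketched in a few lines of Python. This is only an illustration, not part of GenomeScope: the function names `total_dna_content` and `last_bin_share` are hypothetical, the histogram is assumed to be plain text with two whitespace-separated columns (coverage, number of distinct k-mers), and it is assumed that the k-mer counter lumps everything above its cutoff into the final row. If that final row holds a large share of all k-mer occurrences, the histogram was probably truncated too early.

```python
from typing import Iterable


def total_dna_content(haploid_size_bp: float, ploidy: int) -> float:
    """Scale a haploid genome size estimate by ploidy to get total DNA content."""
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    return haploid_size_bp * ploidy


def last_bin_share(histo_lines: Iterable[str]) -> float:
    """Fraction of all k-mer occurrences (coverage * count) held by the last row.

    Assumes each line is "<coverage> <number of distinct k-mers>" and that the
    last row aggregates every k-mer above the counter's cutoff.
    """
    rows = []
    for line in histo_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rows.append((int(parts[0]), int(parts[1])))
    if not rows:
        raise ValueError("histogram is empty")
    total = sum(cov * n for cov, n in rows)
    last_cov, last_n = rows[-1]
    return (last_cov * last_n) / total


# Human example from the reply above: 3 Gbp haploid, diploid organism.
human_total = total_dna_content(3e9, 2)

# Toy histogram (made-up numbers) where the final row dominates.
toy_histo = ["1 100", "2 50", "1001 10"]
toy_share = last_bin_share(toy_histo)
print(human_total, round(toy_share, 3))
```

For a tetraploid, the same scaling would be `total_dna_content(haploid_estimate, 4)`, with `haploid_estimate` taken from the GenomeScope report.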


Surbhigrewal · 2025-01-17T10:00:33Z

Hi @mschatz, I read a recent paper reporting that a larger k-mer size gave the correct genome size. The authors estimated the genome size using different algorithms with different k-mer sizes; please see the figures below. I am working on the same species and facing the same problem with small k-mer sizes (21-31). Interestingly, when I tried a k-mer size of 171, I couldn't replicate their result. Are there any other parameters that need to be considered when counting with a larger k-mer size?

Screenshot.2025-01-17.at.09.34.10.png: https://github.com/user-attachments/assets/941668be-fe95-45c4-9adb-63cc3bb1fa18

Screenshot.2025-01-17.at.10.00.21.png: https://github.com/user-attachments/assets/0dc3aa97-86c0-4b2d-914e-d70955413789

mschatz · 2025-01-21T01:24:05Z

Hi. I cannot recommend going to these larger k-mer lengths, as they will be more sensitive to sequencing errors. With a k-mer this long, many (or most) of the k-mers will intersect a sequencing error, leading to undercounting of k-mers that should be part of the homozygous or heterozygous peaks.

Good luck,
Mike
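The effect of k-mer length on error sensitivity can be made concrete with a back-of-the-envelope model. The sketch below assumes sequencing errors are independent and uniform with a per-base rate `per_base_error`, so a k-mer is error-free with probability (1 - e)^k. Both the model and the 1% rate used in the example are illustrative assumptions, not values from this thread, and `error_free_fraction` is a hypothetical helper name.

```python
def error_free_fraction(k: int, per_base_error: float) -> float:
    """Probability that a k-mer contains no sequencing error.

    Assumes independent, uniformly distributed errors at rate per_base_error.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must be between 0 and 1")
    return (1.0 - per_base_error) ** k


# Compare the k-mer sizes mentioned in the thread at an assumed 1% error rate.
for k in (21, 31, 171):
    print(k, round(error_free_fraction(k, 0.01), 3))
```

Under these assumptions, roughly 81% of 21-mers but only about 18% of 171-mers would be error-free, which is consistent with the undercounting described above. With a much lower error rate the gap narrows, so the conclusion depends on the sequencing technology.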

