Should a larger K value be chosen? #133
GenomeScope reports the haploid genome size, but it looks like you have a
tetraploid, so the estimated genome size here needs to be multiplied by 4
to get the total DNA content. For example, for a human sample it will
report 3 Gbp for the genome size, but this needs to be multiplied by 2 to
reach the total content. Beyond that, you may need to adjust the kmer counting
to account for the very high frequency kmers. The histogram often gets
truncated at 1,000x or 10,000x, but you will need to push this out to
100,000x or higher to capture the most abundant repeats.
Good luck!
Mike
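To make the two points in this reply concrete, here is a minimal Python sketch. It is not GenomeScope's model fit: it uses the crude approximation "total k-mer occurrences divided by the haploid coverage peak", the histogram values are invented toy numbers, and the function names `total_dna_content` and `genome_size_from_histogram` are hypothetical helpers introduced only for illustration. Truncation is mimicked by simply dropping every entry above a cutoff.

```python
def total_dna_content(haploid_size_bp, ploidy):
    """Scale a haploid genome size estimate to total DNA content."""
    return haploid_size_bp * ploidy


def genome_size_from_histogram(histogram, haploid_peak, max_count=None):
    """Crude haploid size estimate: total k-mer occurrences / haploid peak.

    histogram maps k-mer multiplicity -> number of distinct k-mers.
    Entries with multiplicity above max_count are dropped, which mimics
    a histogram that was truncated during k-mer counting.
    """
    total = sum(
        multiplicity * n_distinct
        for multiplicity, n_distinct in histogram.items()
        if max_count is None or multiplicity <= max_count
    )
    return total / haploid_peak


# Toy histogram (invented numbers): a haploid peak at 25x plus
# high-copy repeats at 5,000x and 50,000x.
toy_histogram = {25: 1_000_000, 50: 200_000, 5_000: 1_000, 50_000: 100}

for cutoff in (1_000, 10_000, None):
    size = genome_size_from_histogram(toy_histogram, haploid_peak=25,
                                      max_count=cutoff)
    print(f"cutoff={cutoff}: estimated haploid size = {size:,.0f} bp")

# The human example from the reply: 3 Gbp haploid, diploid sample.
print(total_dna_content(3_000_000_000, ploidy=2))
```

In this toy case the estimate grows each time the cutoff is raised, because the repeat-derived k-mers above the cutoff are otherwise left out of the total. How the cutoff is actually set depends on the k-mer counter being used, so check its documentation rather than relying on this sketch.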
On Tue, Jun 25, 2024 at 9:29 AM GLking123 ***@***.***> wrote:
Hello,
When I used Kmer=21, the genomescope graph is as follows:
image.png (view on web)
<https://github.com/schatzlab/genomescope/assets/71629239/e82442a7-bf98-42dc-a045-a7be405e5de9>
When I used Kmer=27, the genomescope graph is as follows:
image.png (view on web)
<https://github.com/schatzlab/genomescope/assets/71629239/055b224c-c15f-4bb3-9701-0506ae48f405>
Flow cytometry estimated the genome size to be around 50G, and the
assembled genome is also around 50G. However, this is significantly
different from the above genomescope graph. Should I increase the kmer
value? For example, to 31?
For the above question, could you provide some debugging suggestions?
Thank you for your valuable time and assistance. I sincerely look forward
to your response!
Hi @mschatz, I read a recent paper reporting that a higher k-mer size gave the correct genome size. They estimated the genome size using different algorithms with different k-mer sizes. Please see the figure below. I am working on the same species and facing the same problem with low k-mer sizes (21-31). Interestingly, when I tried a k-mer size of 171, I couldn't replicate their result. Is there any other parameter that needs considering when counting with a higher k-mer size?
Hi. I cannot recommend going to these larger kmer lengths, as they will be
more sensitive to sequencing errors. If you are using a kmer this long,
many (or most) of the kmers will intersect a sequencing error, leading to
undercounting of kmers that should be part of the homozygous or
heterozygous peaks.
Good luck
Mike
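The sensitivity to sequencing errors can be sketched with a simple back-of-the-envelope model. The assumptions here are not from the thread: errors are treated as independent and uniform across bases, the 1% per-base error rate and the 150 bp and 250 bp read lengths are illustrative values, and the helper names are hypothetical. Under that model a k-mer is error-free with probability (1 - e)^k, and a read of length L contains at most L - k + 1 k-mers.

```python
def error_free_fraction(k, per_base_error_rate):
    """Probability that a k-mer contains no sequencing error,
    assuming independent, uniform per-base errors."""
    return (1.0 - per_base_error_rate) ** k


def kmers_per_read(read_length, k):
    """Number of k-mers a single read can contribute (0 if k > read length)."""
    return max(0, read_length - k + 1)


ERROR_RATE = 0.01  # assumed per-base error rate, for illustration only

for k in (21, 31, 171):
    clean = error_free_fraction(k, ERROR_RATE)
    print(
        f"k={k}: error-free fraction ~ {clean:.2f}, "
        f"k-mers per 150 bp read = {kmers_per_read(150, k)}, "
        f"per 250 bp read = {kmers_per_read(250, k)}"
    )
```

Under these assumptions most k-mers at k=171 overlap at least one error, and reads shorter than k contribute no k-mers at all, so the read length is worth checking before counting at a large k.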
On Fri, Jan 17, 2025 at 5:00 AM Surbhigrewal ***@***.***> wrote:
Hi @mschatz <https://github.com/mschatz> I read a recent paper where they
reported that by using higher kmer size they were able to get the correct
genome size. They estimated the genome size using different algorithms with
different K-mer sizes. Please see figure below. I am working on the same
species and facing the same problem with a lower k-mer (21-31). But
interestingly when I tried k-mer 171 I couldn't replicate their result. Is
there any other parameter that needs considering when counting using a
higher k-mer size?
Screenshot.2025-01-17.at.09.34.10.png (view on web)
<https://github.com/user-attachments/assets/941668be-fe95-45c4-9adb-63cc3bb1fa18>
Screenshot.2025-01-17.at.10.00.21.png (view on web)
<https://github.com/user-attachments/assets/0dc3aa97-86c0-4b2d-914e-d70955413789>