different results from ska2 0.3.2 and 0.3.6 #69

Closed
danrlu opened this issue Mar 7, 2024 · 3 comments

danrlu commented Mar 7, 2024

I ran the latest version, 0.3.6 (installed today with conda install -c bioconda ska2), on our data, and the results differ from 0.3.2. The commands are the same for all of the following analyses:

ska build -o seqs_ska2_strict --min-count 4 --min-qual 20 --threads 4 -k 31 --qual-filter strict -f ska2_input.tsv
ska distance --filter-ambiguous seqs_ska2_strict.skf > distances_ska2_strict.txt
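
For a side-by-side rerun, the two commands above could be wrapped so that each ska2 version runs from its own conda environment. This is only a sketch: the environment names `ska032` and `ska036`, the per-version output names, and the helper functions are hypothetical, and it assumes each environment already has the corresponding ska2 version installed. The ska options themselves are copied unchanged from the commands above.

```python
import subprocess

# Hypothetical conda environment names, one per ska2 version (assumption).
ENVS = {"0.3.2": "ska032", "0.3.6": "ska036"}


def ska_commands(env, prefix):
    """Return the build and distance commands from this issue, wrapped in `conda run`."""
    build = [
        "conda", "run", "-n", env,
        "ska", "build", "-o", prefix,
        "--min-count", "4", "--min-qual", "20", "--threads", "4",
        "-k", "31", "--qual-filter", "strict", "-f", "ska2_input.tsv",
    ]
    distance = [
        "conda", "run", "-n", env,
        "ska", "distance", "--filter-ambiguous", f"{prefix}.skf",
    ]
    return build, distance


def run_version(version):
    """Build the .skf file and write the distance table for one ska2 version."""
    env = ENVS[version]
    prefix = f"seqs_ska2_strict_{version}"  # hypothetical per-version output prefix
    build, distance = ska_commands(env, prefix)
    subprocess.run(build, check=True)
    # `ska distance` prints to stdout, so capture it in a per-version file.
    with open(f"distances_ska2_strict_{version}.txt", "w") as out:
        subprocess.run(distance, check=True, stdout=out)
```

Calling `run_version("0.3.2")` and then `run_version("0.3.6")` would leave two distance files built from the same input list, which makes the per-pair comparison easier.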

We have 40 samples:
0.3.2:
image

0.3.6:
image

To help debug, I took 5 samples from the cluster in the bottom-right corner and subsampled them to 1/5 of the read count so the files are smaller. You can find them here: https://github.com/danrlu/debug_data/tree/main/ska

With 5 samples (see ska2_input_more.tsv)
0.3.2:

Sample1	Sample2	Distance	Mismatches
pt59	pt60	6.00	0.19603
pt59	pt61	5.00	0.21242
pt59	pt74	7.00	0.21048
pt59	pt75	7.00	0.19414
pt60	pt61	7.00	0.21246
pt60	pt74	7.00	0.21003
pt60	pt75	7.00	0.19471
pt61	pt74	2.00	0.22424
pt61	pt75	6.00	0.21081
pt74	pt75	7.00	0.21033

0.3.6:

Sample1	Sample2	Distance	Mismatches
pt59	pt60	20.00	0.19603
pt59	pt61	21.17	0.21242
pt59	pt74	23.17	0.21048
pt59	pt75	24.00	0.19414
pt60	pt61	21.50	0.21246
pt60	pt74	20.00	0.21003
pt60	pt75	19.50	0.19471
pt61	pt74	16.17	0.22424
pt61	pt75	20.67	0.21081
pt74	pt75	19.67	0.21033
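
To make the difference between the two 5-sample tables easier to see, here is a minimal Python sketch that parses both and prints the per-pair change. The values are copied from the tables above; the variable and function names are just for illustration. In these tables the Mismatches column is identical between versions, and only the Distance column changes.

```python
# Distance tables for the 5-sample run, copied from the output above.
V032 = """Sample1\tSample2\tDistance\tMismatches
pt59\tpt60\t6.00\t0.19603
pt59\tpt61\t5.00\t0.21242
pt59\tpt74\t7.00\t0.21048
pt59\tpt75\t7.00\t0.19414
pt60\tpt61\t7.00\t0.21246
pt60\tpt74\t7.00\t0.21003
pt60\tpt75\t7.00\t0.19471
pt61\tpt74\t2.00\t0.22424
pt61\tpt75\t6.00\t0.21081
pt74\tpt75\t7.00\t0.21033
"""

V036 = """Sample1\tSample2\tDistance\tMismatches
pt59\tpt60\t20.00\t0.19603
pt59\tpt61\t21.17\t0.21242
pt59\tpt74\t23.17\t0.21048
pt59\tpt75\t24.00\t0.19414
pt60\tpt61\t21.50\t0.21246
pt60\tpt74\t20.00\t0.21003
pt60\tpt75\t19.50\t0.19471
pt61\tpt74\t16.17\t0.22424
pt61\tpt75\t20.67\t0.21081
pt74\tpt75\t19.67\t0.21033
"""


def parse(text):
    """Map (Sample1, Sample2) -> (Distance, Mismatches), skipping the header row."""
    rows = [line.split("\t") for line in text.strip().splitlines()]
    return {(a, b): (float(d), float(m)) for a, b, d, m in rows[1:]}


old = parse(V032)
new = parse(V036)

for pair, (d_old, m_old) in old.items():
    d_new, m_new = new[pair]
    print(f"{pair[0]}-{pair[1]}: distance {d_old:.2f} -> {d_new:.2f} "
          f"(change {d_new - d_old:+.2f}), mismatches equal: {m_old == m_new}")
```

The same `parse` function should work on the files written by `ska distance`, assuming they keep the tab-separated layout shown here.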

With 2 samples (see ska2_input.tsv), both 0.3.2 and 0.3.6 gave the same results:

Sample1	Sample2	Distance	Mismatches
pt60	pt61	12.00	0.21246

I checked the documentation and didn't see any changes to the settings for the options used in these commands. Let me know what else I should try. Thanks!

Member

johnlees commented Mar 8, 2024

The following have changed between these versions:

So some filtering defaults have changed. Could these have affected your results?

Author

danrlu commented Mar 8, 2024

Let me look through them. Thanks!

Member

johnlees commented Aug 5, 2024

Closing due to lack of update

@johnlees johnlees closed this as completed Aug 5, 2024
@johnlees johnlees closed this as not planned Won't fix, can't repro, duplicate, stale Aug 5, 2024