__gen_read routine does not reverse insertions on reverse strand #3
Comments
Thanks for finding this problem. Another obvious fix, in a few lines (though neither the neatest nor the most performant), is to generate both reads in the forward direction and reverse-complement tmp_seq[1] afterwards:
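A minimal sketch of that idea, not the commenter's original patch. It assumes the numeric base encoding A=0, C=1, G=2, T=3, N=4, and the helper names `comp_base` and `revcomp_inplace` are hypothetical. Under those assumptions it would be called on `tmp_seq[1]` once both reads have been filled in forward orientation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: A=0, C=1, G=2, T=3, anything else (N)=4. */
static uint8_t comp_base(uint8_t c)
{
    return c < 4 ? (uint8_t)(3 - c) : 4;
}

/* Reverse-complement seq[0..len) in place (hypothetical helper). */
static void revcomp_inplace(uint8_t *seq, size_t len)
{
    size_t i, j;
    assert(seq != NULL || len == 0);
    if (len == 0) return;
    /* Swap the complemented ends, walking towards the middle. */
    for (i = 0, j = len - 1; i < j; ++i, --j) {
        uint8_t t = comp_base(seq[i]);
        seq[i] = comp_base(seq[j]);
        seq[j] = t;
    }
    /* Odd length: the middle base stays put but must still be complemented. */
    if (i == j) seq[i] = comp_base(seq[i]);
}
```

Because insertions are already laid out in forward order when the read is built, reversing the whole buffer afterwards reverses the inserted bases along with everything else.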
Addition to the fix proposed above to make sure ext_coor is correct:
There is a subtle problem (really two problems) with bredeson's original fix as it stands: reverse reads that terminate at or within insertions are incorrect. A reverse read that terminates exactly at the location of an insertion will have a last base equal to the reference base rather than the first base of the reverse complement (r.c.) of the insertion, and a reverse read that terminates in the middle of an insertion will have a truncated insertion.
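To make the expected behaviour concrete, here is a small sketch under stated assumptions. It is not code from wgsim.c: the `hap_pos_t` record, `MAX_INS`, and both function names are hypothetical, and bases are assumed to be encoded as A=0, C=1, G=2, T=3, N=4. The haplotype is first expanded into a flat forward-strand buffer with insertions written out, so a reverse read is simply the reverse complement of a window of that buffer, even when the window starts inside an insertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one reference base plus up to MAX_INS inserted
 * bases that follow it on the forward strand. */
#define MAX_INS 4
typedef struct {
    uint8_t base;
    uint8_t n_ins;
    uint8_t ins[MAX_INS];
} hap_pos_t;

/* Expand hap[0..n) into flat forward-strand bases.
 * Returns the number of bases written (never more than cap). */
static size_t expand_haplotype(const hap_pos_t *hap, size_t n,
                               uint8_t *out, size_t cap)
{
    size_t i, j, k = 0;
    for (i = 0; i < n; ++i) {
        if (k < cap) out[k++] = hap[i].base;
        for (j = 0; j < hap[i].n_ins && k < cap; ++j)
            out[k++] = hap[i].ins[j];
    }
    return k;
}

/* Reverse read of length len whose leftmost forward-strand base is
 * flat[start]; the caller guarantees start + len is within the buffer. */
static void reverse_read(const uint8_t *flat, size_t start, size_t len,
                         uint8_t *read)
{
    size_t i;
    assert(flat != NULL && read != NULL);
    for (i = 0; i < len; ++i) {
        uint8_t c = flat[start + len - 1 - i];
        read[i] = c < 4 ? (uint8_t)(3 - c) : 4; /* complement, N stays N */
    }
}
```

With the haplotype A, C+ins(GT), A, A the flat buffer is ACGTAA. A 3-base reverse read starting at flat index 3 (inside the insertion) covers TAA and should read TTA, ending on the complement of the inserted T rather than on a reference base.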
I'm not a C programmer (though Perl is similar enough), so I produced a 'git diff' of the changes I made to fix the issue. There may be a better way of doing this (insertion size is limited to the length of the longest end read), but I tested it and it works:
diff --git a/wgsim.c b/wgsim.c
index 5c82192..faf31c1 100644
--- a/wgsim.c
+++ b/wgsim.c
@@ -312,11 +312,20 @@ void wgsim_core(FILE *fpout1, FILE *fpout2, const char *fn, int is_hap, uint64_t
tmp_seq[x][k++] = c & 0xf;
if (mut_type == SUBSTITUTE) ++n_sub[x];
} else { \