Accumulation issues during counting #56

Closed
donaldcampbelljr opened this issue Dec 12, 2024 · 5 comments

Comments

@donaldcampbelljr
Member

I'm currently seeing counts accumulate in some counting functions when running gtars uniwig within the PEPATAC pipeline. It seems to be caused by a couple of issues.

Example of what this looks like in IGV when comparing bw files:
[image: IGV comparison of the bw files]

This issue appears to be rooted in the variable_shifted_bam_to_bw function and can be caused by two different events.

  1. A step size greater than 1
  2. A smoothsize less than ~2

For the 2nd issue: because variable_shifted_bam_to_bw takes either the start or the end of a read and uses it as the adjusted_start_site (or cut_size), occasionally the newest coordinate and its corresponding endsite are both less than the current coordinate position and current endsite. This can cause a count to accumulate that is never decremented. You can attempt to add this new, lower endsite to the endsite queue, but the coordinate position has already been passed. Since we push to these count vectors instead of accessing them by index, trying to count previously passed coordinates causes problems. A simple solution is to just skip these counts. Another would be to refactor so that we access elements in the count array via index.
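
As a rough illustration of the index-based alternative, here is a minimal Python sketch. It is not the gtars Rust code: `index_based_counts`, its parameters, and the half-open window `[site - smoothsize, site + smoothsize)` are assumptions made for the example. Because each window is written by index into a difference array, a window that starts behind the most recently processed coordinate is still counted and still closed at its end.

```python
def index_based_counts(cut_sites, smoothsize, region_start, region_end):
    """Per-base counts over [region_start, region_end) via a difference array.

    Hypothetical helper for illustration only. Each site is assumed to
    contribute +1 over the half-open window [site - smoothsize, site + smoothsize).
    """
    length = region_end - region_start
    diff = [0] * (length + 1)
    for site in cut_sites:
        # Clamp each window to the region so indexing stays in bounds.
        start = max(site - smoothsize, region_start)
        end = min(site + smoothsize, region_end)
        if start >= end:
            continue
        diff[start - region_start] += 1  # window opens here
        diff[end - region_start] -= 1    # and is always closed here
    counts = []
    running = 0
    for delta in diff[:length]:
        running += delta
        counts.append(running)
    return counts


# Sites arrive out of order (20 before 12), as can happen after shifting.
counts = index_based_counts([20, 12], smoothsize=2, region_start=10, region_end=25)
print(counts)
```

The trade-off is memory: the array spans the whole region rather than only the bases emitted so far, which may or may not be acceptable for whole-chromosome runs.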

Example debug output for this case:

coordinate_position< adjusted_start_site: 16601267 < 16601268 . here is current endsite: 16601273 
Here is read.reference_start 16601249 and read.reference_end 16601291
here is shifted_pos -> 16601253
adjusted start site for new coord: 16601251
new endsite for new coord: 16601256
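
Plugging the logged values into a quick check shows why the count gets stranded: the new window ends before the position the sweep has already reached. The variable names are made up for this snippet; only the numbers come from the output above. It assumes, as described earlier, that a count is decremented only when the sweep reaches its endsite.

```python
# Values copied from the debug output above.
coordinate_position = 16601267  # position the sweep has already reached
new_adjusted_start = 16601251   # adjusted start site for the new coordinate
new_endsite = 16601256          # endsite for the new coordinate

# The whole new window lies behind the sweep position, so a decrement that
# fires only when the sweep reaches new_endsite would never fire.
starts_behind = new_adjusted_start < coordinate_position
ends_behind = new_endsite < coordinate_position
print(starts_behind, ends_behind)  # True True
```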
@donaldcampbelljr
Member Author

I believe these commits have fixed the accumulation issue:
d960854
8a12cd6

@nleroy917
Member

The latest updates on dev seem to cause a few issues now. uniwig ran fine (albeit a bit slower than anticipated), but converting to bigWig using the Kent utilities yielded an error:

There's more than one value for chr1 base 10073 (in coordinates that start with 1).

@nleroy917
Member

Indeed, there are multiples:

cat chr1_core.wig | grep 10073
fixedStep chrom=chr1 start=10073 step=1
fixedStep chrom=chr1 start=10073 step=1
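
For a check that covers the whole file rather than one coordinate, something like the following Python sketch could work. `duplicate_fixedstep_headers` is a hypothetical helper, not part of gtars or the Kent utilities, and it only catches exactly repeated `fixedStep` declarations, not overlapping blocks with different starts.

```python
from collections import Counter


def duplicate_fixedstep_headers(lines):
    """Return {header: occurrences} for fixedStep lines seen more than once."""
    headers = Counter(
        line.strip() for line in lines if line.startswith("fixedStep")
    )
    return {header: n for header, n in headers.items() if n > 1}


# The header matches the grep output above; the data values are made up.
sample = [
    "fixedStep chrom=chr1 start=10073 step=1",
    "1",
    "2",
    "fixedStep chrom=chr1 start=10073 step=1",
    "1",
]
dups = duplicate_fixedstep_headers(sample)
print(dups)
```

Any iterable of lines works, so an open file handle for `chr1_core.wig` can be passed in directly.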

@donaldcampbelljr
Member Author

Did you clear the output directory before re-running? uniwig will not overwrite files that are already present; it appends to them.
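
One way to guard against this before a re-run, sketched in Python: delete leftover wig files from the output directory first. `clear_stale_wigs` is a hypothetical helper, and it assumes the outputs of interest sit directly in that directory with a `.wig` extension.

```python
import tempfile
from pathlib import Path


def clear_stale_wigs(output_dir):
    """Delete *.wig files directly inside output_dir; return the names removed."""
    removed = []
    for path in sorted(Path(output_dir).glob("*.wig")):
        path.unlink()
        removed.append(path.name)
    return removed


# Demonstration in a throwaway directory, not a real pipeline output folder.
with tempfile.TemporaryDirectory() as tmp:
    stale = Path(tmp) / "chr1_core.wig"
    stale.write_text("fixedStep chrom=chr1 start=10073 step=1\n")
    removed = clear_stale_wigs(tmp)
    remaining = list(Path(tmp).iterdir())
print(removed, remaining)
```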

@donaldcampbelljr
Member Author

Closing with the 0.2.0 release.
