Errors out at peak calling #122
I haven't seen such an error before. Would you mind providing a minimal dataset to reproduce it?
Sure, we are having some issues with a server upgrade at the moment. I'll see if I can reproduce it, then forward whatever I have along.
I think this issue is being caused by a memory access error. I'm loading a rather large genome index into shared memory for alignments, and this problem only seems to happen when I don't purge it, or don't have enough additional swap, before commencing the next steps in the pipeline. So I'm thinking a heap is being assigned or written to by numpy that's still associated with the aligner's shared memory object, or something funky like that.

However, I now get a new error, still at peak calling, where the output is "slocal can't be smaller than d". As I understand it, slocal is the window for estimating local 1 kb signal bias, and d is the fragment length. When I look at the fragment length estimated by macs2, it's something ridiculous like 12 kb, even when I manually stipulate the shift and extsize values. I've also tried playing around with the --llocal option, with similar results. I think it may have to do with the BAM files I'm using, so I'm testing a few different parameters on a new set of data which previous analyses indicate is good. I'll let you know when that finishes. If you have any suggestions in the meantime, specifically on how to troubleshoot this slocal error, I would be very grateful. Thanks,
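For anyone else hitting this assertion: the relationship it enforces can be sketched outside MACS2. Below is a minimal pre-flight check, not MACS2's actual code; the helper name is made up, the slocal/llocal defaults are MACS2's documented defaults, and only the slocal message text comes from the traceback in this thread (the llocal check is my assumption by symmetry).

```python
def check_local_bias_windows(d, slocal=1000, llocal=10000):
    """Pre-flight check mirroring the assertion MACS2 raises at peak calling.

    d: fragment length (estimated by the shifting model, or forced via
       --nomodel --extsize).
    slocal/llocal: small/large local-bias windows (--slocal/--llocal);
       defaults shown are MACS2's documented defaults.
    """
    if slocal < d:
        # This is what aborts the run: a huge estimated d (e.g. 12000)
        # trips it against the default slocal of 1000.
        raise AssertionError("slocal can't be smaller than d!")
    if llocal < d:
        # Assumed symmetric check; not shown in this thread's traceback.
        raise AssertionError("llocal can't be smaller than d!")


# A bogus 12 kb fragment-length estimate fails with the default windows,
# while a sane d (e.g. from --nomodel --extsize 200) passes.
try:
    check_local_bias_windows(d=12000)
except AssertionError as e:
    print(e)
check_local_bias_windows(d=200)
```

So if the model estimates an absurd d, either the BAM is suspect or you can bypass the model with --nomodel --extsize, or enlarge --slocal so the window still contains d.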
OK, so an analysis with identical parameters as before, but with the new BAM files, completed without error, so it has to be the BAM file I was using previously. Whew! Thought I had broken macs2 for a second. :D
Dear Tao Liu, we are experiencing the very same error since we updated from the 2014 to the 2015 version. The current version we are trying is 2.1.1.20160309, with no luck. We tried different versions of numpy (1.9, 1.10), with no luck. The Python experts at our facility think that this line may be causing some trouble, but I am not fully sure about that (I am no numeric Python expert): Our analyses have been very limited for several weeks. I noticed that not every BAM causes this issue, and it is sort of non-deterministic (it seems to happen more easily on larger BAMs). Here is a public data BAM to be tested: Thank you in advance for your help,
@mpallocc Sorry, I just saw your comment! I will test it and let you know. |
@mpallocc I haven't seen errors while MACS2 calls peaks from the file1.align.bam you provided, with default parameters. I got 38K peaks, and the header of the xls file looks like:
My guess is that perhaps in some cases you included many non-standard chromosomes in your BAM file, and if the data on a certain chromosome such as chr19_gl000209_random are too sparse, MACS2 may have trouble handling it. I will try to fix this potential problem. But in the meantime, a quick fix for your data analysis is to filter out all non-standard chromosomes.
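One way to do that filtering is to restrict a samtools view call to the canonical chromosomes. The sketch below only builds the command list (the helper name and the default chromosome set are my own choices; samtools view does accept region arguments after the input file, but that requires the BAM to be coordinate-sorted and indexed):

```python
def samtools_filter_cmd(in_bam, out_bam, chroms=None):
    """Build a samtools command keeping only the given chromosomes.

    Listing chr1..chr22, chrX, chrY as region arguments drops
    chrUn_* and *_random contigs. The input BAM must be indexed
    (samtools index in_bam) for region queries to work.
    """
    if chroms is None:
        chroms = ["chr%d" % i for i in range(1, 23)] + ["chrX", "chrY"]
    return ["samtools", "view", "-b", "-o", out_bam, in_bam] + list(chroms)


# e.g. subprocess.run(samtools_filter_cmd("file1.align.bam", "filtered.bam"))
```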
Dear @taoliu, thank you so much for your help. I am trying what you suggested right now. I was so convinced that it was a bug on the code side that I didn't check the issues in my data deeply, since it was a bit non-deterministic. I'll keep you updated on this.
Dear Tao, thank you! I will try to test this on the cluster as soon as possible. In the meantime, unfortunately, removing the chrUn* contigs from my sample didn't help :(
For alignment, can you try using a non-patched primary assembly, then?
I believe the 2bit hg assembly files from UCSC are also unpatched.
@mpallocc Have you tested the new code? If you have any questions, please let me know. Since I can't reproduce the error you saw, I depend entirely on feedback from users like you :)
Dear Tao, I just finished my tests, thank you for your patience. Finally, macs2 doesn't crash anymore! This is a massive relief for us :) During the peak-calling step, a warning is raised: INFO @ Mon, 23 May 2016 21:50:53: #3 Call peaks for each chromosome... But the code does not crash, and comparing the peak sets with those from previous analyses, I get an overlap from 95% (transcription factor) to 100% (histone modifications), which is likely due to other MACS2 updates from the end of 2015. I don't know whether we should worry about the numpy exception or not... anyway, THANK YOU!
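For reference, the numpy exception seen earlier in this thread ("cannot resize this array: it does not own its data") is easy to trigger in isolation: ndarray.resize refuses to operate on a view, i.e. an array that borrows another array's buffer. A minimal reproduction, independent of MACS2:

```python
import numpy as np

a = np.arange(10)
view = a[:5]                    # a view: shares a's buffer, OWNDATA is False
assert not view.flags["OWNDATA"]

try:
    view.resize(3)              # would have to reallocate a borrowed buffer
except ValueError as e:
    print(e)                    # cannot resize this array: it does not own its data
```

This is why the warning points at some array in the pileup code being a view rather than a freshly allocated array; it says nothing about disk or memory limits by itself.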
@blikzen Thanks for your feedback! Good to know! It seems that I have to do more checks to eliminate such warnings... If you have a small bam file that I can reproduce such error/warning, please share with me. |
Reopening, since a user reported the 'owndata' warning still exists.
@taoliu No problem. Unfortunately, I no longer have the BAM file that was causing the issue. I can try to replicate it, as I think it was either an issue with patched alignments or unsorted BAM files.
Dear Tao and @blikzen, I think it's really hard to reproduce the problem, since with the very same BAM files I had no problem on my own PC, while it crashed on the cluster with the previous versions. I think it may be related to multiple numpy processes/threads and memory usage across several nodes. I still think it's not input-specific; that's why I wrote "kind of non-deterministic".
@taoliu I am experiencing the same errors with MACS2/2.1.1.20160309, and it does not depend on the input BAM files, as different samples will crash on each run on the HPC cluster (python/2.7.11, numpy/1.6.2). As mpallocc noted, the problem seems related to resource sharing, since it occurs only when multiple MACS2 processes run on the same node. I've switched back to MACS2/2.1.0.20140616 and this error does not happen.
Figured it out. The issue is caused by root file permissions on the default temp directory (/tmp/) used by MACS2. You can either mess around with the ownership and permissions of the root temp directory or, preferably, set the temp directory to a path owned by the user launching the MACS2 process, using the --tempdir [path] command-line option. Just make sure that your resulting temp location has enough space... I'm blikzen, by the way.
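That advice can be wrapped in a small pre-flight helper. The sketch below is my own (the function name and the free-space threshold are arbitrary); --tempdir itself is a real macs2 callpeak option, and the code only builds the command list rather than running macs2:

```python
import os
import shutil


def with_user_tempdir(macs2_cmd, tempdir, min_free_gb=10.0):
    """Append --tempdir to a macs2 command, after sanity checks.

    Creates a user-owned temp directory (sidestepping root-owned /tmp
    permission problems) and verifies it has some free space, since
    macs2 spills intermediate data there.
    """
    os.makedirs(tempdir, exist_ok=True)
    free_gb = shutil.disk_usage(tempdir).free / 1e9
    if free_gb < min_free_gb:
        raise RuntimeError("%s: only %.1f GB free" % (tempdir, free_gb))
    return list(macs2_cmd) + ["--tempdir", tempdir]


# e.g.:
# cmd = with_user_tempdir(
#     ["macs2", "callpeak", "-t", "chip.bam", "-c", "input.bam", "-n", "chip"],
#     os.path.expanduser("~/macs2_tmp"))
# subprocess.run(cmd, check=True)
```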
@lajoimat, @taoliu I also get this error on some files with MACS2/2.1.1.20160309 on an HPC cluster (python 2.7.13, numpy 1.12.1). I haven't tried MACS2/2.1.0.20140616 yet. It cannot be related to the temp dir; I force a custom temp dir using
Update: @taoliu Library built from current master works ok. |
macs2 has a weird, hidden constraint: it needs enough space in its temp directory; see macs3-project/MACS#122
Have you guys run into this before? Know what's causing the issue, or how to work around it?
The command I'm using:
macs2 callpeak -t $chipfile -c $controlfile -B --nomodel --shift $shift --SPMR -g hs -q 0.01 --broad -f BAMPE -n $chipfile.macs2
INFO @ Wed, 16 Mar 2016 22:01:13: #3 Call peaks...
Traceback (most recent call last):
File "/usr/local/bin/macs2", line 5, in <module>
pkg_resources.run_script('MACS2==2.1.1.20160309', 'macs2')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 617, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 57, in main
run( args )
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "PeakDetect.pyx", line 105, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1704)
File "PeakDetect.pyx", line 177, in MACS2.PeakDetect.PeakDetect.__call_peaks_w_control (MACS2/PeakDetect.c:2216)
AssertionError: slocal can't be smaller than d!
Traceback (most recent call last):
File "/usr/local/bin/macs2", line 5, in <module>
pkg_resources.run_script('MACS2==2.1.1.20160309', 'macs2')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 617, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 57, in main
run( args )
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "PeakDetect.pyx", line 105, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1704)
File "PeakDetect.pyx", line 259, in MACS2.PeakDetect.PeakDetect.__call_peaks_w_control (MACS2/PeakDetect.c:3344)
File "CallPeakUnit.pyx", line 1431, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_broadpeaks (MACS2/IO/CallPeakUnit.c:18287)
File "CallPeakUnit.pyx", line 1488, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_broadpeaks (MACS2/IO/CallPeakUnit.c:17664)
File "CallPeakUnit.pyx", line 1567, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__chrom_call_broadpeak_using_certain_criteria (MACS2/IO/CallPeakUnit.c:18413)
File "CallPeakUnit.pyx", line 475, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__pileup_treat_ctrl_a_chromosome (MACS2/IO/CallPeakUnit.c:6465)
ValueError: cannot resize this array: it does not own its data