Errors out at peak calling #122

Open
blikzen opened this issue Mar 16, 2016 · 21 comments

@blikzen

blikzen commented Mar 16, 2016

Have you guys run into this before? Do you know what's causing the issue, or how to work around it?

The command I'm using:

macs2 callpeak -t $chipfile -c $controlfile -B --nomodel --shift $shift --SPMR -g hs -q 0.01 --broad -f BAMPE -n $chipfile.macs2

INFO @ Wed, 16 Mar 2016 22:01:13: #3 Call peaks...
Traceback (most recent call last):
File "/usr/local/bin/macs2", line 5, in
pkg_resources.run_script('MACS2==2.1.1.20160309', 'macs2')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 617, in
main()
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 57, in main
run( args )
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "PeakDetect.pyx", line 105, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1704)
File "PeakDetect.pyx", line 177, in MACS2.PeakDetect.PeakDetect.__call_peaks_w_control (MACS2/PeakDetect.c:2216)
AssertionError: slocal can't be smaller than d!
Traceback (most recent call last):
File "/usr/local/bin/macs2", line 5, in
pkg_resources.run_script('MACS2==2.1.1.20160309', 'macs2')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 617, in
main()
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/EGG-INFO/scripts/macs2", line 57, in main
run( args )
File "/usr/local/lib/python2.7/dist-packages/MACS2-2.1.1.20160309-py2.7-linux-x86_64.egg/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "PeakDetect.pyx", line 105, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1704)
File "PeakDetect.pyx", line 259, in MACS2.PeakDetect.PeakDetect.__call_peaks_w_control (MACS2/PeakDetect.c:3344)
File "CallPeakUnit.pyx", line 1431, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_broadpeaks (MACS2/IO/CallPeakUnit.c:18287)
File "CallPeakUnit.pyx", line 1488, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_broadpeaks (MACS2/IO/CallPeakUnit.c:17664)
File "CallPeakUnit.pyx", line 1567, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__chrom_call_broadpeak_using_certain_criteria (MACS2/IO/CallPeakUnit.c:18413)
File "CallPeakUnit.pyx", line 475, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__pileup_treat_ctrl_a_chromosome (MACS2/IO/CallPeakUnit.c:6465)
ValueError: cannot resize this array: it does not own its data
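
For context, numpy's in-place ndarray.resize only works on arrays that own their buffer; calling it on a view (e.g., a slice of another array) raises exactly this ValueError. A minimal numpy-only sketch of the failure mode, unrelated to MACS2's actual buffers:

```python
import numpy as np

a = np.arange(10)       # freshly allocated; owns its buffer
v = a[2:8]              # a slice is a view and does NOT own its data
print(v.flags.owndata)  # False

try:
    v.resize(3)         # in-place resize requires ownership
except ValueError as e:
    print(e)            # "cannot resize this array: it does not own its data"
```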

@taoliu
Contributor

taoliu commented Mar 17, 2016

I haven't seen this error before. Would you mind providing a minimal dataset so I can reproduce it?

@blikzen
Author

blikzen commented Mar 17, 2016

Sure, we are having some issues with a server upgrade at the moment. I'll see if I can reproduce it, then forward whatever I have along.

@blikzen
Author

blikzen commented Mar 18, 2016

I think this issue is being caused by a memory access error. I'm loading a rather large genome index into shared memory for alignments, and the problem only seems to happen when I don't purge it, or don't have enough additional swap, before commencing the next steps in the pipeline. So I'm thinking a heap is being assigned or written to by numpy that's still associated with the aligner's shared-memory object, or something funky like that.

However, I now get a new error, still at peak calling: "slocal can't be smaller than d". As I understand it, slocal is an estimate of the local signal bias over a 1 kb window, and d is the fragment length. When I look at the fragment-length estimate calculated by macs2, it's something ridiculous like 12 kb, even when I manually stipulate the shift and extsize values. I've also tried playing around with the --llocal option, with similar results. I think it may have to do with the BAM files I'm using, so I'm testing a few different parameters on a new set of data that previous analyses indicate is good; I'll let you know when that finishes. If you have any suggestions in the meantime, specifically on how to troubleshoot this slocal error, I would be very grateful (a workaround sketch follows this comment).

Thanks,
Marty
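
A hedged workaround sketch for the slocal assertion, assuming the absurd ~12 kb fragment-length estimate is the trigger (the file names, the 15000 bp window, and the 200 bp extension below are all placeholders). MACS2 requires slocal >= d when computing the local lambda, so one can either enlarge --slocal past the estimated d, or pin d explicitly; note that in BAMPE mode d comes from the observed insert sizes, which would explain why --shift/--extsize appeared to have no effect:

```python
import subprocess

chip, ctrl = "chip.bam", "control.bam"  # placeholder inputs

# Option 1: keep paired-end mode but push --slocal above the estimated d,
# so the "slocal can't be smaller than d" assertion cannot fire.
subprocess.run([
    "macs2", "callpeak",
    "-t", chip, "-c", ctrl,
    "-f", "BAMPE", "-g", "hs", "-q", "0.01", "--broad",
    "--slocal", "15000",             # > the ~12 kb d reported in the logs
    "-n", "chip_big_slocal",
], check=True)

# Option 2: fall back to single-end mode, where --nomodel/--extsize pin d
# directly instead of deriving it from insert sizes.
subprocess.run([
    "macs2", "callpeak",
    "-t", chip, "-c", ctrl,
    "-f", "BAM", "-g", "hs", "-q", "0.01", "--broad",
    "--nomodel", "--extsize", "200",
    "-n", "chip_fixed_d",
], check=True)
```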

@blikzen
Author

blikzen commented Mar 18, 2016

OK, so an analysis with parameters identical to before but with the new BAM files completed without error, so it has to be the BAM file I was using previously. Whew! Thought I had broken macs2 for a second. :D

@blikzen blikzen closed this as completed Mar 18, 2016
@mpallocc

mpallocc commented May 8, 2016

Dear Tao Liu,

we have been experiencing the very same error since we updated from the 2014 version to the 2015 one. The current version we are trying is 2.1.1.20160309, with no luck.

We have also tried different versions of NumPy (1.9, 1.10), without success.

The Python experts at our facility think this line may be causing the trouble, but I am not fully sure about that (I am no numeric Python expert):
https://github.com/taoliu/MACS/blame/master/MACS2/IO/CallPeakUnit.pyx#L475

Our analyses have been very limited for several weeks. I noticed that not every BAM causes this issue, and it is somewhat nondeterministic (it seems to happen more readily on larger BAMs).

Here is a public data BAM to be tested:
https://bioinformatics.cineca.it//cast/files/2/1926/5757//file1.align.bam

Thank you in advance for your help,
Matteo

@taoliu
Contributor

taoliu commented May 18, 2016

@mpallocc Sorry, I just saw your comment! I will test it and let you know.

@taoliu
Contributor

taoliu commented May 18, 2016

@mpallocc I haven't seen errors while MACS2 calls peaks from the file1.align.bam you provided, with default parameters. I got 38K peaks, and the header of the xls file looks like:

# This file is generated by MACS version 2.1.1.20160309
# Command line: callpeak -t file1.align.bam
# ARGUMENTS LIST:
# name = NA
# format = AUTO
# ChIP-seq file = ['file1.align.bam']
# control file = None
# effective genome size = 2.70e+09
# band width = 300
# model fold = [5, 50]
# qvalue cutoff = 5.00e-02
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 10000 bps
# Broad region calling is off
# Paired-End mode is off

# tag size is determined as 42 bps
# total tags in treatment: 34410327
# tags after filtering in treatment: 28451060
# maximum duplicate tags at the same position in treatment = 1
# Redundant rate in treatment: 0.17
# d = 178
# alternative fragment length(s) may be 178 bps

My guess is that in some cases you included many non-standard chromosomes in your BAM file, and when a chromosome such as chr19_gl000209_random carries too few reads, MACS2 may have trouble handling it. I will try to fix this potential problem. In the meantime, a quick fix for your analysis is to filter out all non-standard chromosomes, e.g., with a script like the sketch below.
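
A minimal sketch of that workaround using pysam (chromosome names assume a UCSC-style hg19 reference; adjust the set to your own header). The same result can be had with samtools view on an indexed BAM by listing the wanted chromosomes as regions:

```python
import pysam

# Canonical human chromosomes, UCSC naming; everything else is dropped.
CANONICAL = {"chr%d" % i for i in range(1, 23)} | {"chrX", "chrY"}

with pysam.AlignmentFile("file1.align.bam", "rb") as bam, \
     pysam.AlignmentFile("file1.canonical.bam", "wb", template=bam) as out:
    for read in bam:
        if not read.is_unmapped and read.reference_name in CANONICAL:
            out.write(read)

pysam.index("file1.canonical.bam")  # optional; MACS2 reads the BAM sequentially
```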

@mpallocc

Dear @taoliu ,

thank you so much for your help. I am trying what you suggested right now. I was so convinced it was a bug on the code side that I didn't check my data deeply, since the failure was a bit non-deterministic. I'll keep you updated on this.

taoliu added a commit that referenced this issue May 19, 2016
@taoliu
Copy link
Contributor

taoliu commented May 19, 2016

@mpallocc Hope commit 24a1eab solves the issue.

@mpallocc

Dear Tao,

thank you! I will try to test this on the cluster as soon as possible. In the meantime, unfortunately, removing the chrUn* contigs from my sample didn't help :(

@blikzen
Author

blikzen commented May 19, 2016

For alignment, can you try using a non-patched primary assembly and then proceed with peak calling? e.g.,
ftp://ftp.ensembl.org/pub/release-84/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz


@blikzen
Author

blikzen commented May 19, 2016

I believe the 2bit hg assembly files from UCSC are also unpatched, if you prefer those.


@taoliu
Contributor

taoliu commented May 23, 2016

@mpallocc Have you tested the new code? If you have any questions, please let me know. Since I can't reproduce the error you saw, I depend entirely on feedback from users like you :)

@mpallocc

mpallocc commented May 24, 2016

Dear Tao,

I just finished my tests, thank you for your patience.

Finally, macs2 doesn't crash anymore! This is a massive relief for us :)

During execution, at the peak-calling step, a warning is raised:

INFO @ Mon, 23 May 2016 21:50:53: #3 Call peaks for each chromosome...
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
Exception ValueError: 'cannot resize this array: it does not own its data' in 'MACS2.IO.CallPeakUnit.clean_up_ndarray' ignored
INFO @ Mon, 23 May 2016 22:00:13: #4 Write output xls file... file1.peaks_macs2.macs2_peaks.xls

But the code does not crash, and comparing the peak sets with those from previous analyses I see an overlap from 95% (transcription factors) to 100% (histone modifications); the difference is likely due to other MACS2 updates from late 2015.

I don't know whether we should worry about the numpy exception... anyway, THANK YOU!

@taoliu
Contributor

taoliu commented May 25, 2016

@blikzen Thanks for your feedback! Good to know! It seems that I have to do more checking to eliminate such warnings... If you have a small BAM file with which I can reproduce the error/warning, please share it with me.

@taoliu
Contributor

taoliu commented May 25, 2016

Reopening, since a user reported that the 'owndata' warning still exists.

@taoliu taoliu reopened this May 25, 2016
@blikzen
Author

blikzen commented May 25, 2016

@taoliu No problem. Unfortunately, I no longer have the BAM file that was causing the issue. I can try to replicate it; I think it was either an issue with patched alignments or with unsorted BAM files.

@mpallocc

Dear Tao and Bliz,

I think it's really hard to reproduce the problem: with the very same BAM files I had no problem on my own PC, while it crashed on the cluster with the previous versions. I suspect it is related to multiple numpy processes/threads and memory usage across several nodes. I still think it's not input-specific; that's why I wrote "kind of non-deterministic".

@lajoimat

@taoliu I am experiencing the same errors with MACS2/2.1.1.20160309, and it does not depend on the input BAM files, as different samples crash on each run on the HPC cluster (python/2.7.11, numpy/1.6.2). As mpallocc noted, the problem seems related to resource sharing, since it occurs only when multiple MACS2 processes run on the same node. I've switched back to MACS2/2.1.0.20140616 and the error does not happen there.

@mforde84

mforde84 commented Sep 1, 2016

Figured it out. The issue is caused by root file permissions on the default temp directory (/tmp/) that MACS2 uses. You can either mess around with the ownership and permissions of the root temp directory or, preferably, set the temp directory to a path owned by the user launching the MACS2 process via the --tempdir [path] command-line option. Just make sure that your resulting temp location has enough space; a sketch follows below.

I'm blikzen by the way.
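
A small sketch of that advice (the directory, the 10 GB threshold, and the input names are placeholders): pick a user-owned temp directory, check that it has headroom, and hand it to MACS2 via --tempdir:

```python
import os
import shutil
import subprocess

# Placeholder: a temp directory owned by the user running MACS2.
tmpdir = os.path.expanduser("~/macs2_tmp")
os.makedirs(tmpdir, exist_ok=True)

# MACS2 spills intermediate data to its temp directory, so check for
# headroom first (the 10 GB threshold here is a guess).
free_gb = shutil.disk_usage(tmpdir).free / 1e9
if free_gb < 10:
    raise SystemExit("only %.1f GB free in %s; pick a roomier path" % (free_gb, tmpdir))

subprocess.run([
    "macs2", "callpeak",
    "--tempdir", tmpdir,
    "-t", "chip.bam", "-c", "control.bam",  # placeholder inputs
    "-f", "BAM", "-g", "hs", "-n", "example",
], check=True)
```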

@iromeo

iromeo commented Sep 28, 2017

@lajoimat, @taoliu I also get this error on some files with MACS2/2.1.1.20160309 on an HPC cluster (python 2.7.13, numpy 1.12.1). I haven't tried MACS2/2.1.0.20140616 yet. It cannot be related to the tmp dir, though: I force a custom one using the --tempdir option:

macs2 callpeak --tempdir /tmp/3444241.mgt2.cluster -t GSM1057025_hg19.bam -c GSM1057030_hg19_input.bam -f BAM -g hs -n GSM1057025_hg19_q0.1 -q 0.1 -B

Update: @taoliu A library built from the current master works OK.

@JohnUrban JohnUrban mentioned this issue Sep 7, 2021
pveber added a commit to pveber/bistro that referenced this issue Jul 4, 2022
macs2 has a weird and hidden constraint on having enough space in tmp space, see macs3-project/MACS#122
pveber added a commit to pveber/bistro that referenced this issue Feb 6, 2023
macs2 has a weird and hidden constraint on having enough space in tmp space, see macs3-project/MACS#122