missing tmp file during bamCoverage #546

blaiseli · 2017-06-30T12:32:02Z

I'm trying to obtain bigwig files for some RNA-seq data, and it takes a lot of time and ends up crashing on a FileNotFoundError for a file in /tmp

I noticed that the process used quite a lot of RAM: it seemed to be using almost all what was available (around 10 GB), but I experienced no swap symptoms. I did not observe lot of IO going on with iotop.

Is it possible that my workstation has not enough RAM for this and ends up loosing files in /tmp. Is it something that can happen?

I updated to deeptools-2.5.1, and the error is still there.

$ bamCoverage -b mapping_gonad_met2set25/hisat2/mapped_C_elegans/met2set25_2_on_C_elegans_sorted.bam --filterRNAstrand reverse -of=bigwig -bs 10 -p=1 --scaleFactor ${scale} -o mapping_gonad_met2set25/hisat2/mapped_C_elegans/met2set25_2/met2set25_2_on_C_elegans_norm_fwd.bw 
bamFilesList: ['mapping_gonad_met2set25/hisat2/mapped_C_elegans/met2set25_2_on_C_elegans_sorted.bam']
binLength: 10
numberOfSamples: None
blackListFileName: None
defaultFragmentLength: read length
numberOfProcessors: 1
verbose: False
region: None
bedFile: None
minMappingQuality: None
ignoreDuplicates: False
chrsToSkip: []
stepSize: 10
center_read: False
samFlag_include: None
samFlag_exclude: None
minFragmentLength: 0
maxFragmentLength: 0
zerosToNans: False
smoothLength: None
save_data: False
out_file_for_raw_data: None
maxPairedFragmentLength: 1000
filter_strand: reverse
Offset: [1, -1]
Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 296, in bedGraphToBigWig
    f = open("{}.sorted".format(_file.name))
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpf444sdye.sorted'

(${scale} is a number close to 1, if that matters)

The text was updated successfully, but these errors were encountered:

dpryan79 · 2017-06-30T12:59:32Z

How much space is free in /tmp?

blaiseli · 2017-06-30T14:21:08Z

I'm not sure this is the correct way of checking:

$ df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       458G  343G   92G  79% /

This is before and at the beginning of a run of the bamCoverage command shown earlier, when bamCoverage still does not use a lot of RAM.

After some time, the available RAM is almost entirely used, no swapping observed, bamCoverage is running at 100% of one CPU. df -h /tmp still shows 92G available.

I spot the following lines in the output of du -hs /tmp/*(which took about 1 or 2 minutes to run)

1.7M	/tmp/_deeptools_84vs5do_.bg
1.9M	/tmp/_deeptools_8g4a75ls.bg
1004K	/tmp/_deeptools__kr8qupd.bg
1.9M	/tmp/_deeptools_tzai631l.bg
1008K	/tmp/_deeptools_y2k375qo.bg

There are also 2 other recent files in /tmp:

$ ls -lhrt /tmp/ | tail -7
-rw------- 1 bli  Mhe  1007K Jun 30 16:02 _deeptools_y2k375qo.bg
-rw------- 1 bli  Mhe   1.7M Jun 30 16:04 _deeptools_84vs5do_.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:07 _deeptools_8g4a75ls.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:09 _deeptools_tzai631l.bg
-rw------- 1 bli  Mhe  1001K Jun 30 16:10 _deeptools__kr8qupd.bg
-rw------- 1 bli  Mhe   513K Jun 30 16:21 ycmd_53944_stderr_szm0drdd.log
-rw------- 1 bli  Mhe   1.8K Jun 30 16:22 ycmd_32835_stderr_nhebjixr.log

The bam file is 1.5G.

blaiseli · 2017-06-30T14:36:27Z

To be noted: I see a /tmp/tmpf444sdye file dating from the begining of this afternoon. This file name pattern corresponds to the missing *.sorted file reported in my first message:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpf444sdye.sorted'

I now see more /tmp/_deeptools_*.bg files, but still no /tmp/tmp?????

bamCoverage still running at 100% CPU and using a little less RAM now.

blaiseli · 2017-06-30T15:11:46Z

Update: About an hour after starting, bamCoverage crashed:

Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 296, in bedGraphToBigWig
    f = open("{}.sorted".format(_file.name))
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpwqppp77v.sorted'

The only recent files left in /tmp are the following:

-rw------- 1 bli  Mhe    82 Jun 30 17:01 tmpwqppp77v
-rw------- 1 bli  Mhe  513K Jun 30 17:01 ycmd_53944_stderr_szm0drdd.log
-rw------- 1 bli  Mhe  2.3K Jun 30 17:02 ycmd_32835_stderr_nhebjixr.log

The /tmp/ycmd* files are actually completely unrelated: from their content I guess that they are generated by my text editor.

I couldn't monitor precisely what happened during the last minute, but the last time I checked the recent files in /tmp, I could see the following:

-rw------- 1 bli  Mhe  1007K Jun 30 16:02 _deeptools_y2k375qo.bg
-rw------- 1 bli  Mhe   1.7M Jun 30 16:04 _deeptools_84vs5do_.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:07 _deeptools_8g4a75ls.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:09 _deeptools_tzai631l.bg
-rw------- 1 bli  Mhe  1001K Jun 30 16:10 _deeptools__kr8qupd.bg
-rw------- 1 bli  Mhe   989K Jun 30 16:27 _deeptools_fnqqxtbv.bg
-rw------- 1 bli  Mhe   670K Jun 30 16:27 _deeptools_iw298w3l.bg
-rw------- 1 bli  Mhe   898K Jun 30 16:28 _deeptools_4uvy5ayn.bg
-rw------- 1 bli  Mhe   1.5M Jun 30 16:30 _deeptools_czrjci6r.bg
-rw------- 1 bli  Mhe   1.5M Jun 30 16:32 _deeptools_3di3imub.bg
-rw------- 1 bli  Mhe   1.4M Jun 30 16:33 _deeptools_x1woahhq.bg
-rw------- 1 bli  Mhe   844K Jun 30 16:33 _deeptools_tutsgh05.bg
-rw------- 1 bli  Mhe   943K Jun 30 16:34 _deeptools_4r2xgyv9.bg
-rw------- 1 bli  Mhe   1.8M Jun 30 16:36 _deeptools_u9dbl_il.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:39 _deeptools_naeopo4g.bg
-rw------- 1 bli  Mhe   1.9M Jun 30 16:41 _deeptools_0j5hswrk.bg
-rw------- 1 bli  Mhe   1.4M Jun 30 16:41 _deeptools_bq70ao25.bg
-rw------- 1 bli  Mhe   380K Jun 30 16:42 _deeptools_0p8j_t2o.bg
-rw------- 1 bli  Mhe   880K Jun 30 16:42 _deeptools_u5e07_45.bg
-rw------- 1 bli  Mhe   898K Jun 30 16:43 _deeptools_xuwreca3.bg
-rw------- 1 bli  Mhe   1.1M Jun 30 16:45 _deeptools_hqp9_jbw.bg
-rw------- 1 bli  Mhe   1.4M Jun 30 16:46 _deeptools_6c237bz0.bg
-rw------- 1 bli  Mhe   1.5M Jun 30 16:48 _deeptools_qtz3rhmy.bg
-rw------- 1 bli  Mhe   655K Jun 30 16:49 _deeptools_xzcpjdsp.bg
-rw------- 1 bli  Mhe   287K Jun 30 16:49 _deeptools_ouplxyub.bg
-rw------- 1 bli  Mhe   513K Jun 30 16:51 ycmd_53944_stderr_szm0drdd.log
-rw------- 1 bli  Mhe   2.2K Jun 30 16:52 ycmd_32835_stderr_nhebjixr.log
-rw------- 1 bli  Mhe    31K Jun 30 16:53 _deeptools_6dbt4pph.bg
-rw------- 1 bli  Mhe   454K Jun 30 16:54 _deeptools_cf3fdlmc.bg
-rw------- 1 bli  Mhe   502K Jun 30 16:54 _deeptools_9r0zw06_.bg
-rw------- 1 bli  Mhe   810K Jun 30 16:55 _deeptools_kqctrui_.bg
-rw------- 1 bli  Mhe   944K Jun 30 16:56 _deeptools_n_ub8g40.bg
-rw------- 1 bli  Mhe   1.1M Jun 30 16:58 _deeptools_9pxg9986.bg
-rw------- 1 bli  Mhe   852K Jun 30 16:59 _deeptools_k94qzeev.bg
-rw------- 1 bli  Mhe   450K Jun 30 16:59 _deeptools_fztil8ie.bg
-rw------- 1 bli  Mhe   561K Jun 30 16:59 _deeptools_zfkm1sqh.bg
-rw------- 1 bli  Mhe   552K Jun 30 17:00 _deeptools_y09_v87f.bg
-rw------- 1 bli  Mhe   662K Jun 30 17:00 _deeptools_6w5vcn4f.bg
-rw------- 1 bli  Mhe   536K Jun 30 17:00 _deeptools_r_o780de.bg
-rw------- 1 bli  Mhe   588K Jun 30 17:00 _deeptools__30md89q.bg
-rw------- 1 bli  Mhe   609K Jun 30 17:00 _deeptools_7g75wayw.bg

The /tmp/tmpwqppp77v file has therefore been created only during the last minute over about one hour of run.

blaiseli · 2017-06-30T15:22:48Z

I looked at the code where the error occurs:

sort_cmd = cfg.config.get('external_tools', 'sort')
    _file = NamedTemporaryFile(delete=False)
    for chrom, size in chromSizes:
        _file.write(toBytes("{}\t{}\n".format(chrom, size)))
    _file.close()
    system("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))

I'm able to "manually" create the .sorted file in python, but I don't know what cfg.config.get('external_tools', 'sort') actually returns:

system("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format("sort", "/tmp/tmpwqppp77v", "/tmp/tmpwqppp77v"))

And I now have /tmp/tmpwqppp77v.sorted

blaiseli · 2017-06-30T16:32:02Z

I tried to edit the code as follows:

    ret_code = system("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))
    if ret_code != 0:
        raise OSError()

The following error shows that the sort command fails:

Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 296, in bedGraphToBigWig
    raise OSError()
OSError

I should have taken the opportunity to insert more debugging information...

blaiseli · 2017-07-03T08:30:34Z

With more debugging info:

    sort_cmd = cfg.config.get('external_tools', 'sort')
    _file = NamedTemporaryFile(delete=False)
    for chrom, size in chromSizes:
        _file.write(toBytes("{}\t{}\n".format(chrom, size)))
    _file.close()
    print(sort_cmd)
    print(_file.name)
    print("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))
    print(', '.join(listdir("/tmp")))
    ret_code = system("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))
    if ret_code != 0:
        raise OSError()

sort
/tmp/tmpubs_xfmx
LC_ALL=C sort -k1,1 -k2,2n /tmp/tmpubs_xfmx > /tmp/tmpubs_xfmx.sorted
.read_FCS.py.swp, jedihttp_58492_stderr_D_bomT.log, ycmd_32835_stderr_nhebjixr.log, .XIM-unix, .X1-lock, ycmd_32835_stdout_5yzzor1l.log, tmpf444sdye, .Test-unix, tmpeherjtxt, config-err-CNkHGE, systemd-private-4106110627ad405f89ae62a1d7527b8e-rtkit-daemon.service-tZbpvy, ycmd_53944_stderr_szm0drdd.log, .xfsm-ICE-CWCQ2Y, tmpk8b79_xe, systemd-private-4106110627ad405f89ae62a1d7527b8e-colord.service-h2iEnX, orbit-bli, .X11-unix, mozilla_bli0, jedihttp_40612_stderr_Dh_RYP.log, read_FCS.py, tmpubs_xfmx, ycm_od7ubot_.log, jedihttp_45110_stdout_0rabAW.log, .font-unix, tmpz73tm66i, ycm_132e0kih.log, tmpwqppp77v, thunderbird_bli, jedihttp_58492_stdout_ReROSs.log, ycmd_53944_stdout_shccctv6.log, jedihttp_45110_stderr_06xJAv.log, .X0-lock, tmp7clyo100, jedihttp_40612_stdout_ZLZn1O.log, tmp2535m6j6, .ICE-unix, ycmd_54594_stderr_oklmw42x.log, ycmd_54594_stdout_ln2ofpsx.log, tmpwqppp77v.sorted, ycm_xdwiz7xs.log, firefox_bli
Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 300, in bedGraphToBigWig
    raise OSError()
OSError

The file /tmp/tmpubs_xfmx is present in the os.listdir generated list, and if I manually run LC_ALL=C sort -k1,1 -k2,2n /tmp/tmpubs_xfmx > /tmp/tmpubs_xfmx.sorted, there's no error and a corresponding .sorted file is generated.

blaiseli · 2017-07-03T08:38:04Z

Side question: What's the point of this piece of code?

What I see is that the NamedTemporaryFile is used to store chromosome names and lengths, then this is unix-sorted, and the sorted file is read into a list of (chrom, int(chromLen)) pairs. Then both files are removed.

Isn't it possible to achieve the same sort within python?

dpryan79 · 2017-07-03T08:57:44Z

Possibly. This is pretty old code and predates me, but my understanding is that the goal was to produce consistent sorting across machines with different locale settings. That's possible with forcing LC_ALL=C on the command line, but I don't know if it can be done internally in python. My personal preference would be to just use the same order that the input BAM file(s) have and forget ordering. I'd also write only BigWig files as temp files to cut down on space requirements and then use some custom merging code.

dpryan79 · 2017-07-03T08:59:17Z

Out of curiousity, what is the ret_code? That could be telling as to what's going wrong.

blaiseli · 2017-07-03T09:33:41Z

According to https://stackoverflow.com/questions/8776807/is-pythons-sort-function-the-same-as-linuxs-sort-with-lc-all-c, the correct sorting behaviour can be enforced the following way:

import locale
locale.setlocale(locale.LC_ALL, "C")

And then use cmp=locale.strcoll in the sort command.

blaiseli · 2017-07-03T10:28:49Z

I modified the code as follows in order to report the exit code and try using python sort:

    print("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))
    ret_code = system("LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name))
    if ret_code != 0:
        print("command returned with %d exit code\nTrying python sort" % ret_code)
        cl = sorted([(chrom, int(chromLen)) for chrom, chromLen in chromSizes])
    else:
        cl = []
        f = open("{}.sorted".format(_file.name))
        for line in f:
            chrom, chromLen = line.split()
            cl.append((chrom, int(chromLen)))
        f.close()
        remove("{}.sorted".format(_file.name))
    remove(_file.name)

Here is the output:

LC_ALL=C sort -k1,1 -k2,2n /tmp/tmpixq7m0g6 > /tmp/tmpixq7m0g6.sorted
command returned with -1 exit code
Trying python sort
[bwClose] There was an error while finishing writing a bigWig file! The output is likely truncated.

dpryan79 · 2017-07-03T11:01:50Z

What does cl look like after it's been sorted?

blaiseli · 2017-07-03T12:27:28Z

I tried to use the locale.strcoll trick, but one apparently cannot use cmp in sorted in python 3. There is a functools.cmp_to_key utility to convert this to a sorting key. It's currently running.

blaiseli · 2017-07-03T14:45:16Z

I had to do it differently because TypeError: strcoll() argument 1 must be str, not tuple:

    import locale
    locale.setlocale(locale.LC_ALL, "C")
    from functools import cmp_to_key
    def str_int_cmp(pair1, pair2):
        str1, int1 = pair1
        str2, int2 = pair2
        if locale.strcoll(str1, str2) < 0:
            return -1
        elif locale.strcoll(str1, str2) > 0:
            return 1
        else:
            return int1 - int2


    # Make a list of tuples for the bigWig header, this MUST be sorted identically to the bedGraph file
    sort_cmd = cfg.config.get('external_tools', 'sort')
    _file = NamedTemporaryFile(delete=False)
    for chrom, size in chromSizes:
        _file.write(toBytes("{}\t{}\n".format(chrom, size)))
    _file.close()
    unix_sort = "LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name)
    ret_code = system(unix_sort)
    if ret_code != 0:
        import sys
        sys.stderr.write("command \"%s\" returned with %d exit code\nTrying python sort\n" % (unix_sort, ret_code))
        cl = sorted([(chrom, int(chromLen)) for chrom, chromLen in chromSizes], key=cmp_to_key(str_int_cmp))
        sys.stderr.write("\n".join(["%s\t%d" % (chr, chr_len) for chr, chr_len in cl]))
    else:
        cl = []
        f = open("{}.sorted".format(_file.name))
        for line in f:
            chrom, chromLen = line.split()
            cl.append((chrom, int(chromLen)))
        f.close()
        remove("{}.sorted".format(_file.name))
    remove(_file.name)

I think the comparison function can be simplified as follows:

def str_int_cmp(pair1, pair2):
    str1, int1 = pair1
    str2, int2 = pair2
    return locale.strcoll(str1, str2) or (int1 - int2)

But this looks a bit "hackish".

Probably better:

def str_int_key(pair):
    return (locale.strxfrm(pair[0]), pair[1])

And use this directly as key in sorted.

blaiseli · 2017-07-03T14:46:55Z

When I run the code using sorted as described above (with the first version of str_int_cmp), I get the following:

command "LC_ALL=C sort -k1,1 -k2,2n /tmp/tmpilsh0syc > /tmp/tmpilsh0syc.sorted" returned with -1 exit code
Trying python sort
I	15072434
II	15279421
III	13783801
IV	17493829
MtDNA	13794
V	20924180
X	17718942[bwClose] There was an error while finishing writing a bigWig file! The output is likely truncated.

fidelram · 2017-07-03T15:12:55Z

The sorting issue appeared when the bigwig files were generated using a system call, as bedGraphToBigWig required an specific sorting. I wonder if this is now redundant. Here are the messages that reported the problem: https://groups.google.com/forum/#!searchin/deeptools/LC_ALL%7Csort:relevance/deeptools/BUBRjFV7hKc/pbW82duSFQAJ

…

On Mon, Jul 3, 2017 at 4:46 PM, blaiseli ***@***.***> wrote: When I run the code using sorted as described above (with the first version of str_int_cmp), I get the following: command "LC_ALL=C sort -k1,1 -k2,2n /tmp/tmpilsh0syc > /tmp/tmpilsh0syc.sorted" returned with -1 exit code Trying python sort I 15072434 II 15279421 III 13783801 IV 17493829 MtDNA 13794 V 20924180 X 17718942[bwClose] There was an error while finishing writing a bigWig file! The output is likely truncated. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#546 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AEu_1ebAOn4pOc4yL6L4KLxgGb0CEp36ks5sKP7fgaJpZM4OKhhC> .

-- Fidel Ramirez

blaiseli · 2017-07-03T15:14:09Z

I just noticed this in my dmesg output:

[346800.176174] INFO: task bamCoverage:28788 blocked for more than 120 seconds.
[346800.176181]       Not tainted 4.4.0-81-generic #104-Ubuntu
[346800.176183] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[346800.176186] bamCoverage     D ffff8801056ffc68     0 28788   2823 0x00000000
[346800.176192]  ffff8801056ffc68 ffff8801056ffc60 ffff88041ce90e00 ffff880347721c00
[346800.176197]  ffff880105700000 ffff88041b4d5984 ffff880347721c00 00000000ffffffff
[346800.176201]  ffff88041b4d5988 ffff8801056ffc80 ffffffff8183c955 ffff88041b4d5980
[346800.176204] Call Trace:
[346800.176216]  [<ffffffff8183c955>] schedule+0x35/0x80
[346800.176220]  [<ffffffff8183cbfe>] schedule_preempt_disabled+0xe/0x10
[346800.176224]  [<ffffffff8183e839>] __mutex_lock_slowpath+0xb9/0x130
[346800.176228]  [<ffffffff8183e8cf>] mutex_lock+0x1f/0x30
[346800.176236]  [<ffffffff8121df5c>] path_openat+0x4bc/0x1330
[346800.176239]  [<ffffffff8121ffc1>] do_filp_open+0x91/0x100
[346800.176245]  [<ffffffff81226954>] ? dput+0x34/0x220
[346800.176249]  [<ffffffff8122d906>] ? __alloc_fd+0x46/0x190
[346800.176255]  [<ffffffff8120e468>] do_sys_open+0x138/0x2a0
[346800.176260]  [<ffffffff8120e5ee>] SyS_open+0x1e/0x20
[346800.176264]  [<ffffffff81840a72>] entry_SYSCALL_64_fastpath+0x16/0x71
[359980.976545] bamCoverage[3827]: segfault at 0 ip 00007f2797af7ed9 sp 00007ffe622b78e8 error 6 in libc-2.23.so[7f27979aa000+1c0000]

Could be related to the [bwClose] There was an error while finishing writing a bigWig file! The output is likely truncated. message.

blaiseli · 2017-07-03T17:57:08Z

I further edited writeBedGraph.py to catch possible system errors:

    if sort:
        # temporary file to store sorted bedgraph file
        _file = NamedTemporaryFile(delete=False)
        tempfilename1 = _file.name
        unix_sort = "LC_ALL=C {} -k1,1 -k2,2n {} > {}".format(sort_cmd, bedGraphPath, tempfilename1)
        ret_code = system(unix_sort)
        if ret_code != 0:
            raise OSError("Command \"%s\" failed with %d exit code." % (unix_sort, ret_code))
        bedGraphPath = tempfilename1

And it appears that this sort is actually also failing:

Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 345, in bedGraphToBigWig
    raise OSError("Command \"%s\" failed with %d exit code." % (unix_sort, ret_code))
OSError: Command "LC_ALL=C sort -k1,1 -k2,2n mapping_gonad_met2set25/hisat2/mapped_C_elegans/met2set25_2/met2set25_2_on_C_elegans_norm_fwd.bw.bg > /tmp/tmpgmymc1c6" failed with -1 exit code.

I don't know what a -1 error code is supposed to mean in this context. I suspect it is not the direct error code returned by sort.

I modified the code to have better diagnostics:

    unix_sort = "LC_ALL=C {} -k1,1 -k2,2n {} > {}.sorted".format(sort_cmd, _file.name, _file.name)
    p = Popen(
        [unix_sort],
        stdout=PIPE,
        stderr=PIPE,
        shell=True)
    result, err = p.communicate()
    if p.returncode != 0:
        import sys
        #raise IOError("The following error occurred:\n%s\nThe command was:\n%s\nThe return code was %d" % (err, unix_sort, p.returncode))
        sys.stderr.write("The following error occurred:\n%s\nThe command was:\n%s\nThe return code was %d\n" % (err, unix_sort, p.returncode))
    #ret_code = system(unix_sort)
    #if ret_code != 0:
        #sys.stderr.write("Command \"%s\" returned with %d exit code.\nTrying python sort\n" % (unix_sort, ret_code))
        sys.stderr.write("Trying python sort.\n")
        cl = sorted([(chrom, int(chromLen)) for chrom, chromLen in chromSizes], key=str_int_key)
        sys.stderr.write("\n".join(["%s\t%d" % (chr, chr_len) for chr, chr_len in cl]) + "\n")
    else:
        cl = []
        f = open("{}.sorted".format(_file.name))
        for line in f:
            chrom, chromLen = line.split()
            cl.append((chrom, int(chromLen)))
        f.close()
        remove("{}.sorted".format(_file.name))
    remove(_file.name)

And I obtain the following result:

Traceback (most recent call last):
  File "/home/bli/.local/bin/bamCoverage", line 11, in <module>
    main(args)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/bamCoverage.py", line 238, in main
    format=args.outFileFormat, smoothLength=args.smoothLength)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 164, in run
    chrom_names_and_size, bedgraph_file, out_file_name, True)
  File "/home/bli/.local/lib/python3.6/site-packages/deeptools/writeBedGraph.py", line 319, in bedGraphToBigWig
    shell=True)
  File "/home/bli/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/home/bli/lib/python3.6/subprocess.py", line 1260, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

This is strange considering that I experience no swapping symptoms, so the RAM is not saturated.

steffenheyne · 2017-07-04T07:52:33Z

can you try
df -h
to see what size tmpfs really has and where this is mounted!? Because tmpfs can be mounted somewhere else, eg under /run/ although /tmp/ is used as well and has enough space. I'm not a linux system expert, but maybe some piping is using tmpfs (ie. swap, or?) ...

ulimit -a can be helpful as well...

steffenheyne · 2017-07-04T08:07:32Z

you can also try to set a fixed tmpdir in the deeptools code for sort, eg. sort -T /enough/space/ -k1,1 -k2,2n to rule out any strange default dir usage

blaiseli · 2017-07-04T08:52:13Z

Here are the output of df -h:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.8G     0  7.8G   0% /dev
tmpfs           1.6G   38M  1.6G   3% /run
/dev/sda1       458G  344G   92G  79% /
tmpfs           7.8G  1.5M  7.8G   1% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/sdb1       1.8T   97G  1.7T   6% /extra
/dev/sdc1       1.8T  972G  769G  56% /extra2
tmpfs           1.6G   44K  1.6G   1% /run/user/1000

and ulimit -a

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63715
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63715
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I'm suspecting the issue may be related to me having set /proc/sys/vm/overcommit_memory to 2 about a year ago, after having experienced heavy swapping issues and read things such as https://www.etalabs.net/overcommit.html.

I set back /proc/sys/vm/overcommit_memory to 0, and I'll see it this has visible effects.

blaiseli · 2017-07-04T14:27:20Z

Some interesting reading that reinforces my suspicion about overcommit_memory: https://stackoverflow.com/a/13329386/1878788

Memory-wise, we already know that subprocess.Popen uses fork/clone under the hood, meaning that every time you call it you're requesting once more as much memory as Python is already eating up, i.e. in the hundreds of additional MB, all in order to then exec a puny 10kB executable such as free or ps. In the case of an unfavourable overcommit policy, you'll soon see ENOMEM

I'm currently running the thing under valgrind --tool=massif (hoping that it can be useful understanding what is happening), after having set overcommit_memory back to 0.
It has been more than 6 and a half hours now and the program has not crashed yet.

It is quite surprizing that it takes so much time. Maybe this is valgrind massively slowing things [1]. There are no signs of swapping.

[1](It does: "Massif runs programs about 20x slower than normal.")

blaiseli · 2017-07-05T08:20:36Z

No crashes this time.

The memory usage graph obtained using ms_print is as follows:

    GB
7.286^           ::  :     :  :   :                                           
     |          #: ::::::@@::::::::::                                         
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                                          
     |          #: : ::: @ :: ::: ::                             :::::        
     |          #: : ::: @ :: ::: ::                             :: :         
     |          #: : ::: @ :: ::: ::            ::::             :: :         
     |   :::::::#: : ::: @ :: ::: ::     ::: :::::: ::      :::  :: :         
     | @:: : : :#: : ::: @ :: ::: ::   ::::  :::::: ::   :::::   :: :  :::::: 
   0 +----------------------------------------------------------------------->Ti
     0                                                                   12.20

So it actually never exceeds the available memory (around 10 GB), and the problem was likely this overcommit restriction I had set. I'll rerun the program without valgrind and back with os.system, just to be sure.

That said, I believe it would be useful to use subprocess, check for failures and report the command's error message. If you think it is a good idea, I can do a pull request.

dpryan79 · 2017-07-05T10:15:17Z

Thanks for the update. Don't worry about making a PR, I plan to just rewrite that bit of code completely for 2.6.0 (don't close this issue, I'm using it as the reminder to do that!).

dpryan79 · 2017-07-20T08:55:58Z

I've implemented the bigWig creation changes in the develop branch and they're slated for inclusion in the 2.6.0 release.

dpryan79 mentioned this issue Jul 19, 2017

File names with spaces throw an error #558

Closed

dpryan79 added this to the 2.6.0 milestone Jul 19, 2017

dpryan79 added a commit that referenced this issue Jul 19, 2017

Don't write and sort large bedGraph files needlessly. Fix #546

dac5a87

dpryan79 closed this as completed Jul 20, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

missing tmp file during bamCoverage #546

missing tmp file during bamCoverage #546

blaiseli commented Jun 30, 2017

dpryan79 commented Jun 30, 2017

blaiseli commented Jun 30, 2017 •

edited

Loading

blaiseli commented Jun 30, 2017 •

edited

Loading

blaiseli commented Jun 30, 2017

blaiseli commented Jun 30, 2017

blaiseli commented Jun 30, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

fidelram commented Jul 3, 2017 via email

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

steffenheyne commented Jul 4, 2017 •

edited

Loading

steffenheyne commented Jul 4, 2017 •

edited

Loading

blaiseli commented Jul 4, 2017

blaiseli commented Jul 4, 2017 •

edited

Loading

blaiseli commented Jul 5, 2017

dpryan79 commented Jul 5, 2017

dpryan79 commented Jul 20, 2017

missing tmp file during bamCoverage #546

missing tmp file during bamCoverage #546

Comments

blaiseli commented Jun 30, 2017

dpryan79 commented Jun 30, 2017

blaiseli commented Jun 30, 2017 • edited Loading

blaiseli commented Jun 30, 2017 • edited Loading

blaiseli commented Jun 30, 2017

blaiseli commented Jun 30, 2017

blaiseli commented Jun 30, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

dpryan79 commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

fidelram commented Jul 3, 2017 via email

blaiseli commented Jul 3, 2017

blaiseli commented Jul 3, 2017

steffenheyne commented Jul 4, 2017 • edited Loading

steffenheyne commented Jul 4, 2017 • edited Loading

blaiseli commented Jul 4, 2017

blaiseli commented Jul 4, 2017 • edited Loading

blaiseli commented Jul 5, 2017

dpryan79 commented Jul 5, 2017

dpryan79 commented Jul 20, 2017

blaiseli commented Jun 30, 2017 •

edited

Loading

blaiseli commented Jun 30, 2017 •

edited

Loading

steffenheyne commented Jul 4, 2017 •

edited

Loading

steffenheyne commented Jul 4, 2017 •

edited

Loading

blaiseli commented Jul 4, 2017 •

edited

Loading