pipeline_testing - #341

Charlie-George · 2017-06-27T18:29:03Z

I'm testing the peakcalling pipeline with the py3 environment. Each pipeline now seems to run individually and I've pushed those changes, but I get the following error when it comes to the checksums. Has anyone else come across it before? @sebastian-luna-valero @AndreasHeger
```
2017-06-27 19:14:02,518 INFO running statement:
cat test_peakcallingSEbroad.stats | cgat csv2db --retry --database-backend=sqlite --database-name=csvdb --database-host= --database-user= --database-password= --database-port=3306 --add-index=file --table=test_peakcallingSEbroad_results > test_peakcallingSEbroad_results.load
2017-06-27 19:14:11,261 ERROR 1 tasks with errors, please see summary below:
2017-06-27 19:14:11,261 WARNING could not get task information for compareCheckSums, no message sent
2017-06-27 19:14:11,262 ERROR 0: Task=compareCheckSums Error=io.UnsupportedOperation Job=[[test_peakcallingPEnarrow.stats,test_peakcallingPEnarrowIDR.stats,test_peakcallingPEnarrowIDRoracle.stats,test_peakcallingSEIDR.stats,test_peakcallingSEbroad.stats]->md5_compare.tsv]: (can't do nonzero end-relative seeks)
2017-06-27 19:14:11,262 ERROR full traceback is in pipeline.log

Traceback (most recent call last):
  File "/ifs/devel/charlotteg/py35-v1/CGATPipelines/CGATPipelines/Pipeline/Control.py", line 943, in main
    checksum_level=options.ruffus_checksums_level,
  File "/ifs/devel/charlotteg/py35-v1/conda/lib/python3.5/site-packages/ruffus/task.py", line 5938, in pipeline_run
    raise job_errors
ruffus.ruffus_exceptions.RethrownJobError:

Original exception:

Exception #1
  'io.UnsupportedOperation(can't do nonzero end-relative seeks)' raised in ...
   Task = def compareCheckSums(...):
   Job  = [[test_peakcallingPEnarrow.stats, test_peakcallingPEnarrowIDR.stats, test_peakcallingPEnarrowIDRoracle.stats, test_peakcallingSEIDR.stats, test_peakcallingSEbroad.stats] -> md5_compare.tsv]

Traceback (most recent call last):
  File "/ifs/devel/charlotteg/py35-v1/conda/lib/python3.5/site-packages/ruffus/task.py", line 751, in run_pooled_job_without_exceptions
    register_cleanup, touch_files_only)
  File "/ifs/devel/charlotteg/py35-v1/conda/lib/python3.5/site-packages/ruffus/task.py", line 567, in job_wrapper_io_files
    ret_val = user_defined_work_func(*params)
  File "/ifs/devel/charlotteg/py35-v1/CGATPipelines/CGATPipelines/pipeline_testing.py", line 467, in compareCheckSums
    is_complete = IOTools.isComplete(logfile)
  File "/ifs/devel/charlotteg/py35-v1/cgat/CGAT/IOTools.py", line 181, in isComplete
    lastline = getLastLine(filename)
  File "/ifs/devel/charlotteg/py35-v1/cgat/CGAT/IOTools.py", line 103, in getLastLine
    f.seek(-1 * offset, 2)
io.UnsupportedOperation: can't do nonzero end-relative seeks

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ifs/devel/charlotteg/py35-v1/CGATPipelines/CGATPipelines/pipeline_testing.py", line 656, in
    sys.exit(P.main(sys.argv))
  File "/ifs/devel/charlotteg/py35-v1/CGATPipelines/CGATPipelines/Pipeline/Control.py", line 1028, in main
    "pipeline failed with %i errors" % len(value.args))
ValueError: pipeline failed with 1 errors
```


AndreasHeger · 2017-06-27T18:34:36Z

This is a py3 issue, I have a fix for this that I need to push.

AndreasHeger · 2017-06-27T19:49:22Z

... actually already pushed, could you please git pull --rebase?
Hopefully this will be fixed.

Charlie-George · 2017-06-28T09:22:42Z

hmm I've done that but it says I'm up to date,
I guess there has been some confusion when we merged with master?
Should I roll back? If so, to which commit? I'm a bit confused by the history and at what point the fixes disappeared. Thanks

sebastian-luna-valero · 2017-06-28T09:31:19Z

Hi Charlie,

I agree, I could not see Andreas' fixes in the Py3-migration branches:
https://github.com/CGATOxford/cgat/commits/Py3-migration
https://github.com/CGATOxford/CGATPipelines/commits/Py3-migration

I found the same problem with Jenkins. I think the issue is with pipeline_testing.py trying to access a file (test_name.log) while the pipeline itself is writing to it, and therefore you get an IO error.

However, I might be wrong and Andreas can explain better.

Best regards,
Sebastian

AndreasHeger · 2017-06-28T11:04:50Z

Hi, sorry about that. If I recall, the next() needs to be replaced by readline(). The issue was that in py3 the file is an IOBuffer or similar and that does not have a next() method. Best wishes, Andreas
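
For anyone hitting the same traceback: in Python 3 a file opened in text mode refuses a nonzero seek relative to the end of the file, which is exactly the `io.UnsupportedOperation` reported above, while a file opened in binary mode allows it. Below is a minimal sketch of reading the last line that way, using `readline()`-free chunked reads. It is not the actual change pushed to `IOTools.py`; the names `get_last_line`, `text_mode_end_seek_fails` and the `read_size` parameter are assumptions made for illustration.

```python
import io
import os
import tempfile


def text_mode_end_seek_fails(filename):
    """Return True if a nonzero end-relative seek fails in text mode."""
    with open(filename, "r") as f:
        try:
            f.seek(-1, os.SEEK_END)
        except io.UnsupportedOperation:
            return True
    return False


def get_last_line(filename, read_size=1024):
    """Return the last non-empty line of a file (hypothetical helper).

    The file is opened in binary mode, where seeking backwards from
    the end is permitted, and the result is decoded afterwards.
    """
    with open(filename, "rb") as f:
        f.seek(0, os.SEEK_END)
        file_size = f.tell()
        if file_size == 0:
            return ""
        offset = 0
        while True:
            # grow the window read from the end of the file
            offset = min(file_size, offset + read_size)
            f.seek(-offset, os.SEEK_END)
            chunk = f.read(offset)
            lines = chunk.rstrip(b"\n").split(b"\n")
            # more than one piece means the last line is complete;
            # otherwise stop only once the whole file has been read
            if len(lines) > 1 or offset == file_size:
                return lines[-1].decode("utf-8", errors="replace")


if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "w") as handle:
        handle.write("# job started\n# job finished in 3 seconds\n")
    print(text_mode_end_seek_fails(path))
    print(get_last_line(path))
    os.remove(path)
```

Decoding after the read keeps the byte offsets simple; the assumption here is that the log files are UTF-8 or plain ASCII.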


sebastian-luna-valero · 2017-06-29T13:26:34Z

Thanks, Andreas.

I think there is an additional issue. The isComplete function will check whether the last line of both test_name.log and test_name/test_name.log starts with # job finished. However, in the case of test_name.log that will never be the case in the compareCheckSums task of pipeline_testing.py since the (meta-)pipeline has not finished yet. Instead, you should be checking the test_name/test_name.log file only, which is the log file for the pipeline being tested.

Best regards,
Sebastian
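
To make the point above concrete, here is a rough sketch of the kind of check being discussed: a log counts as complete only if its last non-empty line starts with `# job finished`. This is not the real `IOTools.isComplete`; the function name `is_complete` and the constant `FINISHED_PREFIX` are assumptions for illustration, and only the marker text comes from this thread.

```python
import os
import tempfile

# marker quoted in this thread; the exact full line format is not assumed
FINISHED_PREFIX = "# job finished"


def is_complete(logfile, prefix=FINISHED_PREFIX):
    """Return True if the last non-empty line of logfile starts with prefix."""
    last = ""
    with open(logfile, "r") as f:
        for line in f:
            if line.strip():
                last = line
    return last.startswith(prefix)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    finished = os.path.join(workdir, "finished.log")
    running = os.path.join(workdir, "running.log")
    with open(finished, "w") as f:
        f.write("# job started\n# job finished in 10 seconds\n")
    with open(running, "w") as f:
        # a log that is still being written has no closing marker yet
        f.write("# job started\nINFO running statement\n")
    print(is_complete(finished), is_complete(running))
```

A log belonging to a pipeline that is still running behaves like `running.log` here, which is why the check on test_name.log cannot succeed from inside compareCheckSums.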

AndreasHeger · 2017-06-29T20:46:10Z

Hi @sebastian-luna-valero, it might be a bug, but note that I want to test ./test_name_.log instead of test_name/pipeline.log, as the latter will also contain the log of the report building.

There is also the issue to test several logs if there are multiple targets to be tested in a pipeline, see for example pipeline_annotations.
Hopefully I pushed this correctly, I have the following snippet in my repository:

    logfiles = glob.glob(track + "*.log")
    job_finished = True
    for logfile in logfiles:
        is_complete = IOTools.isComplete(logfile)
        E.debug("logcheck: {} = {}".format(logfile, is_complete))
        job_finished = job_finished and is_complete

sebastian-luna-valero · 2017-06-30T08:51:10Z

Hi @AndreasHeger

Strange, I don't see new commits in the Py3-migration branches yet.

The statement logfiles = glob.glob(track + "*.log") will return ['test_annotations.log', 'test_annotations.tgz.log'], so you're right that it won't pick up test_annotations.dir/pipeline.log. I find it necessary to check that file as well, since pipeline_testing.py may finish silently while the pipeline under test fails, leaving exceptions in test_annotations.dir/pipeline.log.

Moreover, I think you can't expect # job finished to be present in test_name.log while the compareCheckSums task of pipeline_testing.py is still running.
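
As a small illustration of the glob behaviour described above: `*` does not cross directory separators, so the pattern only matches logs sitting next to the test directory. The sketch below also adds the nested pipeline.log explicitly. The helper name `collect_logs` and the `workdir` parameter are hypothetical, and whether the nested log should be checked at all is still open in this thread.

```python
import glob
import os
import tempfile


def collect_logs(track, workdir="."):
    """Return top-level logs for a track plus its nested pipeline.log.

    Hypothetical helper: the first part mirrors glob.glob(track + "*.log"),
    the second part adds <track>.dir/pipeline.log if it exists.
    """
    logs = sorted(glob.glob(os.path.join(workdir, track + "*.log")))
    nested = os.path.join(workdir, track + ".dir", "pipeline.log")
    if os.path.exists(nested):
        logs.append(nested)
    return logs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        os.mkdir(os.path.join(workdir, "test_annotations.dir"))
        for name in ("test_annotations.log",
                     "test_annotations.tgz.log",
                     os.path.join("test_annotations.dir", "pipeline.log")):
            open(os.path.join(workdir, name), "w").close()
        for path in collect_logs("test_annotations", workdir):
            print(os.path.relpath(path, workdir))
```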

AndreasHeger · 2017-06-30T08:58:13Z

Thanks, let us talk on Monday. I think I saw the changes on github, but maybe I put it in the wrong branch? Best wishes, Andreas


AndreasHeger · 2017-07-03T10:11:57Z

Apologies, I forgot to push the changes to the CGAT repository and had only pushed to CGATPipelines. Just pushed!


sebastian-luna-valero · 2017-07-03T15:29:23Z

Thanks for fixing @AndreasHeger

sebastian-luna-valero closed this as completed Jul 3, 2017
