This repository has been archived by the owner on Jan 31, 2020. It is now read-only.

Somatic Validation Build Fails on 1/10 Percent Downsampled Data #145

Closed
ghost opened this issue May 15, 2014 · 6 comments

@ghost

ghost commented May 15, 2014

The somatic validation build fails when processing the 1/10 percent downsampled WGS data. This has been an issue since the 1/10 percent downsampled BAMs were name-sorted a couple of days ago.

The steps to recreate the issue are:
$ ./setup/prime-system.pl --data=hcc1395_1tenth_percent --sync=tarball
$ genome model build start 2891325873
$ genome model build start 2891325882
# wait for those builds to complete
$ genome model build start 77ae30e13e154ddb918b8903fb02ff8d
# this build will fail

The failure occurs in the 'sv_breakdancer_1.3_#7 novo-realign v1 #1' step with the error message "It is impossible for somatic case to have only 1 per lib rmdup bam". This step depends on the output of an earlier step named 'sv breakdancer 1.3 #7', which nominally succeeds but produces only one pair of fastq files that contain data. The novo-realign step detects that there are not enough fastq files with data and fails.
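
To illustrate the failure mode described above, here is a minimal Python sketch. It is not the pipeline's code (the pipeline is written in Perl); the function names, the file-size test, and the minimum of two inputs are assumptions inferred from the error message.

```python
import os


def nonempty_files(paths):
    """Return the paths that exist and contain at least one byte."""
    return [p for p in paths if os.path.isfile(p) and os.path.getsize(p) > 0]


def check_realign_inputs(paths, minimum=2):
    """Hypothetical strict check: raise if too few inputs contain data.

    Mirrors the behaviour reported in this issue, where a step that
    receives a single usable input aborts the whole build.
    """
    usable = nonempty_files(paths)
    if len(usable) < minimum:
        raise ValueError(
            f"only {len(usable)} input file(s) with data; need at least {minimum}"
        )
    return usable
```

With heavily downsampled data, most of the candidate files can end up empty, so a check of this shape rejects an input set that the upstream step still reported as a success.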

I'm also testing the exome somatic validation build and will update this issue with the result.

@ghost
Author

ghost commented May 15, 2014

I spoke with @tabbott today. To summarize the conversation, we see two options:

  1. We can modify the somatic validation pipeline so that it does not fail when there is too little data.
  2. We can increase the amount of data that we are using.

@malachig
Collaborator

So this is somatic-variation and not somatic-validation, right?

The test run is already a bit long and we may have to make the data set much larger to avoid this problem. So I am not a fan of option (2).

For option (1), we are actually getting an error and a crash now, right? Not just a warning. Perhaps giving a warning but allowing the build to progress would be the best option...

@ghost
Author

ghost commented May 15, 2014

Yes, somatic-variation is what I meant. We are getting an error and a crash, not just a warning.

Do you have a preference for how a warning would be delivered to the user?

@malachig
Collaborator

Just a warning message during the build process that gets stored in the usual log files seems fine to me...

i.e. $self->warning_message('text of warning')
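
The warn-and-continue idea can be sketched as follows. This is a hedged illustration in Python rather than the pipeline's Perl, and it is not the actual change; `should_run_realign`, the `strict` flag, the logger name, and the minimum of two inputs are all hypothetical.

```python
import logging

logger = logging.getLogger("somatic_variation_sketch")  # hypothetical logger name


def should_run_realign(n_inputs_with_data, minimum=2, strict=False):
    """Decide whether a realignment step should run.

    strict=True reproduces the current behaviour (raise and stop the build).
    strict=False logs a warning and returns False so the caller can skip
    the step and let the rest of the build continue.
    """
    if n_inputs_with_data >= minimum:
        return True
    message = (
        f"only {n_inputs_with_data} input(s) with data, "
        f"expected at least {minimum}; skipping realignment"
    )
    if strict:
        raise RuntimeError(message)
    logger.warning(message)
    return False
```

The design choice here is that the caller, not the check, decides what skipping means, so the warning ends up in the normal log output while the build carries on.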

@ghost
Author

ghost commented May 16, 2014

@tmooney and I wrote a patch yesterday which fixes this problem. One of the previously failing models for auto-imported data now builds successfully. I'll continue testing other models.

@ghost
Author

ghost commented May 19, 2014

genome/gms-core@72dc7371aa20d6f5f11fd61f8e44a921c28e3415 is the relevant patch. The WGS somatic variation model now succeeds on both auto-imported and manually imported BAMs, even when they are name-sorted.

@ghost ghost closed this as completed Jun 25, 2014