Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding error in LiftoverVcf if reference dictionary does not exist (Issue #1157) #1189

Merged
merged 5 commits into from
Jul 6, 2018

Conversation

kachulis
Copy link
Contributor

Description


Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on Travis

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests

Adding error in LiftoverVcf if reference dictionary does not exist

Adding error in LiftoverVcf if reference dictionary does not exist

Adding error in LiftoverVcf if reference dictionary does not exist
@coveralls
Copy link

coveralls commented Jun 22, 2018

Coverage Status

Coverage increased (+0.3%) to 80.338% when pulling 702e383 on ck_dict_fasta_requirement_LiftoverVcf_1157 into b02e42e on master.

"CHAIN=" + CHAIN_FILE,
"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
};
Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final and use another variable for the moved dictionary

dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict.tmp"));
try {
Assert.assertEquals(runPicardCommandLine(args), 1);
} finally {//return dictionary name to previous
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space between // and comment for readability

"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
};
Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict"));
dictionary.toFile().renameTo(new File(dictionary.toString().replace("dict", "dict.tmp")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might fail (e.g. if read only). check return code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grab new File(dictionary.toString().replace("dict", "dict.tmp") in a Path variable and use it here and below in 1196. the code will be clearer that way.

@@ -284,6 +284,11 @@ protected int doWork() {
log.info("Loading up the target reference genome.");
final ReferenceSequenceFileWalker walker = new ReferenceSequenceFileWalker(REFERENCE_SEQUENCE);
final Map<String, ReferenceSequence> refSeqs = new HashMap<>();
//check if sequence dictionary exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space between // and comment for readability

"CHAIN=" + CHAIN_FILE,
"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
};
Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider the case that the file has fasta in the part before the extension...safer to use replaceAll with a .fasta$ and replace it with .dict to be sure that only the extension is swapped.

dictionary.toFile().setReadable(false);
try {
runPicardCommandLine(args);
} finally {//return dictionary to readable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

"CHAIN=" + CHAIN_FILE,
"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
};
final Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see replaceAll comment from above

@@ -1211,11 +1208,11 @@ public void testUnreadableDictionary() throws IOException {
"CHAIN=" + CHAIN_FILE,
"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
};
final Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replace("fasta", "dict"));
final Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replaceAll("fasta$", "dict"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the solution you did above (of copying the data you need) much better than modifying the test files and then changing them back in a finally clause...could you also do it here?

@kachulis
Copy link
Contributor Author

kachulis commented Jul 5, 2018 via email

@@ -1201,19 +1201,21 @@ public void testUnreadableDictionary() throws IOException {
final Path rejectOutput = Files.createTempFile("tmpreject", ".vcf");
rejectOutput.toFile().deleteOnExit();
final Path input = TEST_DATA_PATH.toPath().resolve("testLiftoverBiallelicIndels.vcf");
final Path referenceCopy = Files.createTempFile("ref", ".fasta");
referenceCopy.toFile().deleteOnExit();
Files.copy(REFERENCE_FILE.toPath(), referenceCopy, StandardCopyOption.REPLACE_EXISTING);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overwriting a temporary file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, once Files.createTempFile is called the file exists and Files.copy will need to overwrite. I can't really find a good way to get a temp path for the reference copy without creating the file; I could create a helper temp file and then resolveSibling on the helper to get the reference copy, or use Paths.get on an absolute path into the java tmp directory, but neither of those options seems better than just allowing the copy to overwrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you want to do is use Files.createTempDirectory() to create the temp directory and then Path.resolve() on it to create your new temp file path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, that's better, thanks.

Files.copy(REFERENCE_FILE.toPath(), referenceCopy, StandardCopyOption.REPLACE_EXISTING);
final Path dictCopy = referenceCopy.resolveSibling(referenceCopy.toFile().getName().replaceAll("fasta$", "dict"));
final Path dictionary = REFERENCE_FILE.toPath().resolveSibling(REFERENCE_FILE.getName().replaceAll("fasta$", "dict"));
Files.copy(dictionary, dictCopy, StandardCopyOption.REPLACE_EXISTING);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to fail if the copy is about to overwrite.

Assert.assertEquals(runPicardCommandLine(args), 1);
}

@Test(expectedExceptions = SAMException.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the most specific exception that you can catch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as class, yes, this actually gets thrown out of the AbstractFastaSequenceFile constructor, calling IOUtil.assertFileIsReadable, which throws a SAMException. but I can add an expectedExceptionMessageRegExp to the test.

@@ -284,6 +284,11 @@ protected int doWork() {
log.info("Loading up the target reference genome.");
final ReferenceSequenceFileWalker walker = new ReferenceSequenceFileWalker(REFERENCE_SEQUENCE);
final Map<String, ReferenceSequence> refSeqs = new HashMap<>();
// check if sequence dictionary exists
if (walker.getSequenceDictionary() == null) {
log.error("Reference " + REFERENCE_SEQUENCE.getAbsolutePath() + " must have associated Dictionary .dict file in same directory");
Copy link
Member

@pshapiro4broad pshapiro4broad Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well write this using a complete sentence, e.g. Reference x must have an associated Dictionary .dict file in the same directory.

referenceCopy.toFile().deleteOnExit();
Files.copy(REFERENCE_FILE.toPath(), referenceCopy, StandardCopyOption.REPLACE_EXISTING);
final String[] args = new String[]{
"INPUT=" + input.toString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all calls to toString() here and below are redundant and can be removed


@Test()
public void testNoDictionary() throws IOException {
final Path liftOutput = Files.createTempFile("tmpouput", ".vcf");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with all these deleteOnExit, I wonder if it's cleaner to have a
try{
....
} finally {
TestUtil.recursiveDelete(...);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not....nevermind.

Copy link
Contributor

@yfarjoun yfarjoun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this...

@yfarjoun yfarjoun merged commit e0bb690 into master Jul 6, 2018
@yfarjoun
Copy link
Contributor

yfarjoun commented Jul 6, 2018

thanks @kachulis. you can delete your branch now.

@kachulis kachulis deleted the ck_dict_fasta_requirement_LiftoverVcf_1157 branch July 6, 2018 17:58
@sooheelee
Copy link
Contributor

sooheelee commented Sep 21, 2018

Hi @kachulis, would it be possible to improve the error messaging for when the dictionary is missing? We have a researcher who pointed this out at https://gatkforums.broadinstitute.org/gatk/discussion/comment/52324#Comment_52324. Currently the error message is uninformative: java.lang.NullPointerException.

@kachulis
Copy link
Contributor Author

Hi @sooheelee, the messaging is already fixed in Picard, but wasn't yet part of gatk until broadinstitute/gatk#5173 (upgrading the picard version from 2.18.7 to 2.18.13). It looks like the user was using v4.0.8.1 of gatk, this fix didn't make it into a gatk release until yesterday with v4.0.9.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants