Docpatch #5047

Merged
merged 4 commits into from
Aug 28, 2018
Changes from 3 commits
@@ -86,6 +86,7 @@
* -V data/gvcfs/father.g.vcf.gz \
* -V data/gvcfs/son.g.vcf.gz \
* --genomicsdb-workspace-path my_database \
* --TMP-DIR=path/to/other/tmp \
Collaborator

We recently changed the name of this argument to `tmp-dir`, so the doc here and below should be changed to reflect the new name. Also, I'd suggest using an example path that includes the word "large", to reinforce the reason for specifying this:

--tmp-dir=path/to/large/tmp \

* -L 20
* </pre>
*
@@ -98,6 +99,7 @@
* --batch-size 50 \
* -L chr1:1000-10000 \
* --sample-name-map cohort.sample_map \
* --TMP-DIR=path/to/other/tmp \
* --reader-threads 5
* </pre>
*
@@ -117,6 +119,7 @@
* <li>At least one interval must be provided</li>
* <li>Input GVCFs cannot contain multiple entries for a single genomic position</li>
* <li>The --genomicsdb-workspace-path must point to a non-existent or empty directory.</li>
* <li>GenomicsDBImport makes greedy use of /tmp by default, if your /tmp space is limited this will cause errors once /tmp has filled up. This happens quickly if the tool is running many intervals. It is therefore recommended to specify a different `--TMP-DIR`.</li>
Collaborator

Suggested wording for this:

GenomicsDBImport uses temporary disk storage during import. The amount of temporary storage required can exceed the space available, especially when using a large number of intervals. The command line argument --tmp-dir can be used to specify an alternate temporary storage location with sufficient space.
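
As a rough illustration of what "sufficient space" could mean in practice, here is a minimal sketch that checks a candidate directory before passing it as --tmp-dir. The helper name `has_free_kib`, the `TMP_CANDIDATE` and `NEED_KIB` variables, and the 1 GiB threshold are all assumptions made up for this example; they are not part of GATK, and the real space requirement depends on the inputs and the number of intervals.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK): succeed only if directory $1
# has at least $2 KiB of free space.
has_free_kib() {
    dir=$1
    need_kib=$2
    # df -P prints one POSIX-format line per filesystem; with -k the
    # fourth column is the available space in 1024-byte blocks.
    avail_kib=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    # An empty value means df could not inspect the directory.
    [ -n "$avail_kib" ] && [ "$avail_kib" -ge "$need_kib" ]
}

# Assumed values for illustration only; adjust for the actual workload.
TMP_CANDIDATE=${TMP_CANDIDATE:-/tmp}
NEED_KIB=${NEED_KIB:-1048576}   # 1 GiB, an arbitrary threshold

if has_free_kib "$TMP_CANDIDATE" "$NEED_KIB"; then
    # Print the argument in the form used in the doc example.
    echo "--tmp-dir=$TMP_CANDIDATE"
else
    echo "not enough free space in $TMP_CANDIDATE" >&2
fi
```

The check only reports free space at the moment it runs, so it cannot guarantee that an import will fit; it is meant to catch an obviously undersized /tmp early.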

* </ul>
*
* <h3>Developer Note</h3>
@@ -61,7 +61,8 @@
* gatk --java-options "-Xmx4g" GenotypeGVCFs \
* -R Homo_sapiens_assembly38.fasta \
* -V gendb://my_database \
* -O output.vcf.gz
* -O output.vcf.gz \
* --TMP-DIR=path/to/other/tmp
Collaborator

Same comment as above.

* </pre>
*
* <h3>Caveats</h3>
@@ -70,6 +71,7 @@
* programs produce files that they call GVCFs but those lack some important information (accurate genotype likelihoods
* for every position) that GenotypeGVCFs requires for its operation.</li>
* <li>Cannot take multiple GVCF files in one command.</li>
* <li>Reading from a GenomicsDB workspace can fill up /tmp by default, causing confusing errors when scattering across many intervals. It is recommended to specify a `--TMP-DIR` if running this tool in combination with GenomicsDBImport.</li>
Collaborator

I'd suggest similar wording here as above.

* </ul>
*
* <h3>Special note on ploidy</h3>