Look into eliminating the --optimize_for_large_inputs flag #380

Open
arostamianfar opened this issue Oct 16, 2018 · 0 comments

arostamianfar commented Oct 16, 2018

We can likely detect whether the input is "large" at initialization and set this flag automatically. We already try to do some of this (e.g. https://github.com/googlegenomics/gcp-variant-transforms/blob/master/gcp_variant_transforms/vcf_to_bq_common.py#L69), but the check needs to be smarter and also take the size of the files into account.

This would improve the user experience, since users would no longer need to worry about passing an additional flag.

Note: we should make sure that the initialization itself does not take too long for very large inputs (e.g. add a cap to the number of files being processed).
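
A rough sketch of what such a check could look like, assuming we can lazily list the input files together with their sizes. Everything here is hypothetical and for illustration only: `FileInfo`, `looks_like_large_input`, and both thresholds are made-up names and values, not existing code in this repo. The cap on inspected files doubles as the "many files" threshold, so initialization never reads more than `max_files` entries.

```python
import itertools
from typing import Iterable, NamedTuple


class FileInfo(NamedTuple):
    """Hypothetical record for one input file (not an existing class)."""
    path: str
    size_bytes: int


# Hypothetical thresholds; real values would need tuning.
MAX_FILES_TO_INSPECT = 1000
LARGE_TOTAL_BYTES = 100 * 1024**3  # 100 GiB


def looks_like_large_input(files: Iterable[FileInfo],
                           max_files: int = MAX_FILES_TO_INSPECT,
                           large_total_bytes: int = LARGE_TOTAL_BYTES) -> bool:
    """Returns True if the input should be treated as "large".

    The input counts as large when the cumulative size of the inspected
    files reaches large_total_bytes, or when the cap of max_files is hit
    (in which case there may be many more files we did not look at).
    """
    total_bytes = 0
    # islice stops after max_files items, so a lazy listing is never
    # consumed beyond the cap.
    capped = itertools.islice(files, max_files)
    for count, info in enumerate(capped, start=1):
        total_bytes += info.size_bytes
        if total_bytes >= large_total_bytes or count >= max_files:
            return True
    return False
```

With something like this, the pipeline could fall back to the large-input code path whenever the function returns True, and the flag would only be needed as an explicit override (if at all).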

