Improve UniProtKB downloads #202

Merged
merged 3 commits into from
Jan 2, 2024

gaurav
Collaborator

@gaurav gaurav commented Nov 3, 2023

UniProtKB downloads have gotten really slow lately. Rather than relying on the built-in `pull_via_urllib()` method, this PR switches to `wget --continue`, so that we get progress updates and can resume incomplete downloads. I've written a `pull_via_wget()` method that calls wget().
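For readers who want a picture of the approach, here is a minimal sketch of a wget-based download helper. It is not the code from this PR: the signature, the `build_wget_command()` helper, and the `download_dir` parameter are assumptions made for illustration, and only the use of `wget --continue` comes from the description above. It also assumes `wget` is installed and on the `PATH`.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def build_wget_command(url, download_dir):
    """Build the argument list for a resumable wget download (hypothetical helper)."""
    return [
        "wget",
        "--continue",            # resume a partially downloaded file if one exists
        "--progress=dot:giga",   # periodic progress output suited to very large files
        "--directory-prefix", str(download_dir),  # save under this directory
        url,
    ]


def pull_via_wget(url, download_dir):
    """Download url into download_dir with wget and return the expected local path.

    Illustrative sketch only; this is not the signature used in the PR.
    """
    download_dir = Path(download_dir)
    download_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises subprocess.CalledProcessError if wget exits non-zero.
    subprocess.run(build_wget_command(url, download_dir), check=True)
    # wget names the file after the last component of the URL path.
    return download_dir / Path(urlparse(url).path).name
```

Because `--continue` picks up from whatever is already on disk, rerunning the same call after an interrupted download should resume it rather than start over, provided the server supports range requests.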

@gaurav gaurav mentioned this pull request Dec 9, 2023
@gaurav gaurav marked this pull request as ready for review December 9, 2023 21:03
@gaurav gaurav requested a review from cbizon December 9, 2023 21:03
Contributor

@cbizon cbizon left a comment

I'm surprised that there isn't a wget module for python, but as far as I can tell there ain't.

@gaurav gaurav merged commit f7ed8f0 into master Jan 2, 2024
@gaurav gaurav deleted the improve-uniprotkb-downloads branch January 2, 2024 14:52
gaurav added a commit that referenced this pull request Jan 24, 2024
This PR includes several minor fixes needed to build Babel:
- Upgrade:
  - UMLS to 2023AB
  - RxNorm to 01022024
  - Biolink Model to 3.6.0
- Reduced the log level of the empty synonym list message from warning to debug to shrink the overall log.
- I accidentally merged #206 into this branch instead of `master`, so it has a bunch of changes, including:
  - Adding `genefamily_outputs`, `umls_outputs` and `macromolecularcomplex_outputs`.
  - Adding the KGX export to the `all` target.
  - Expanding `babel_outputs` to 500Gi to accommodate KGX files.
- A bug fix for #197
- KGX export now produces gzipped files.
- #207
- #202
- #218
- Includes some code from PR #217, but only produces a warning instead of skipping mappings.