-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make fsspec_open_kwargs
, query_string_secrets
, & is_opendap
attributes of FilePattern
#167
Conversation
This from recipe import instantiate_recipe
api_token = "my-secret-token"
authenticated_recipe = instantiate_recipe(api_params=api_token)
# etc. Automating this would presumably require an update to the recipes:
- callable: "recipe:instantiate_recipe"
args: "api_params:{{ pointer to GitHub Secrets }}" AFAIK, the "recipe secrets" discussed in #79 have mostly been thought of as usernames and passwords which can be passed as |
Just curious if it would also work if you put the token in the HTTP Authorization header, rather than in a query string. That is a very common way to do token-based API access. If so, we could use |
As far as I can tell, this doesn't work for this particular source (NCAR Climate Data Gateway). I made a few attempts to pass this token to This does not seem to match the usual pattern described in the MDN Web Docs, where:
|
If we do want to move forward with a version of this feature, the implementation could alternatively be as an |
Let's talk this through at today's upcoming call. |
From my point of view, it seems much simpler to define the FilePattern without these secrets in the query string, and then figure out a way to pass extra secrets (key-value pairs) to be encoded into the URL. |
Agreed. I think a subclass of |
An alternative would be to put this in the pangeo-forge-recipes/pangeo_forge_recipes/storage.py Lines 172 to 178 in e1ef575
via an extra argument like All remote file opening goes through that function, so it would make sense. |
This is wrong. We also open files here: pangeo-forge-recipes/pangeo_forge_recipes/storage.py Lines 136 to 139 in e1ef575
|
query_string_secrets
to fname
if provided
e18c797
to
1175af4
Compare
@rabernat, this PR is ready for your re-review. Following reflection on your #167 (comment), I believe this feature can be implemented with the very light touch represented here. AFAICT, declaration of This PR required some small changes to the test suite, to get I do wonder if there may be a way to bring some of the new names into closer alignment with the spirit of "Call things the same thing", as I have ended up with
...but maybe these are sufficient justifications for the naming heterogeneity. And a final note: defining an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great progress.
A few comments about the implementation and test logic.
Thanks for these thoughtful suggestions. For clarity, I've converted this back to a draft, and will re-mark it as 'Ready for review' once I've incorporated this round of feedback. |
query_string_secrets
to fname
if providedfsspec_open_kwargs
, query_string_secrets
, & is_opendap
attributes of FilePattern
I believe all of the items from #167 (comment) are now complete. A few notes:
Edit: These questions resolved in #167 (comment)
I'm testing this within
☑️ here
☑️ here
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great Charles! We are super close on this very complicated and much needed test refactor.
I just have one more round of comments and then I'm sure we'll be ready to merge. Thanks for your patience and perseverance here.
Co-authored-by: Ryan Abernathey <[email protected]>
Optimize tests following #167 slowdown
Proposed feature description
Users may source data from APIs which authenticate via parameters (e.g., token, username, etc.) appended to a url path. This PR provides
DropAPIParamsCache
, a subclass ofCacheFSSpecTarget
that removes any of these (likely secret) API parameters from the cache filenames and logs generated at execution time.If we want to move forward with this feature, the following TODO items remain (assuming we preserve all existing logging):
storage.file_opener
would need a conditionalpublic_fname
assignment; something like thisrecipes.xarray_zarr.open_input
also needs apublic_fname
assignment before logging hereMotivating recipe
This feature draft is motivated by @paigem's recipe pangeo-forge/staged-recipes#56, which as described in the linked issue here pangeo-forge/staged-recipes#46, sources files from NCAR's Climate Data Gateway via an HTTP API.
Once signing into the web GUI with valid credentials, the user is given an API token to append to the data url. For example, the first source url for the above-linked recipe is:
https://tds.ucar.edu/thredds/fileServer/datazone/campaign/cesm/collections/ASD/v5_rel04_BC5_ne30_g16/ocn/proc/tseries/daily/v5_rel04_BC5_ne30_g16.pop.h.nday1.HMXL_2.00010101-01661231.nc?api-token=<TOKEN>
where
<TOKEN>
is a secret.