Adding EOOffshore CCMP v0.2.1.NRT recipe #145
🎉 New recipe runs created for the following recipes at sha
|
/run recipe-test recipe_run_id=885 |
When I tried to import your recipe module, I encountered this error
Please correct your recipe module so that it's importable. |
@derekocallaghan - thanks so much for this contribution and sorry for the slow response here! (Some of our team members have been out sick.) I'm confused about the metpy error, since metpy is clearly part of the pangeo docker image (https://github.com/pangeo-data/pangeo-docker-images/blob/master/pangeo-notebook/environment.yml#L53). @cisaacstern - any thoughts on this?
This is a design flaw with the |
Fixing this (major) design oversight is on our roadmap. For Pangeo Forge organization members, a detailed tracking issue is available here: https://github.com/pangeo-forge/registrar/issues/43. |
Hi @cisaacstern and @rabernat, thanks for the update. Sorry for only getting back now, I'm currently on vacation. I could exclude the |
Yes, our plan is to support metpy in the coming months. Thanks for your patience, and please feel free to check back with me on this thread at any point. I will plan to circle back to this once metpy is supported here.
Thanks @cisaacstern, btw I'm happy to help with resolving the issue if that suits |
…ersion 0.9.1 Runs are postponed for now, see `blocked:requires-dind` label.
pre-commit.ci autofix |
import dask
import dask.array as da
from datetime import datetime
from metpy.calc import wind_direction, wind_speed
@andersy005, despite my earlier comments on this PR, and the blocked:... label I added, I actually think if we move this `metpy` import inside the body of the `ics_wind_speed_direction` function below, this will work.
The reason being that the Dataflow worker image does have `metpy`, so the only issue is actually that the FastAPI application doesn't have `metpy` at recipe parse time. But if that import is within the function body (where IIUC it is used anyway), then it will not be imported at parse time.
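To illustrate the parse-time versus run-time distinction, here is a minimal sketch. The module name `package_missing_at_parse_time` is a made-up stand-in for a dependency (such as metpy) that exists only in the worker environment; it is not a real package. Defining the function succeeds even though the package is absent, and the import error only surfaces once the function body actually runs.

```python
def process_input(ds, fname):
    # Deferred import: resolved only when the function is called.
    # `package_missing_at_parse_time` is hypothetical and intentionally not installed.
    from package_missing_at_parse_time import transform

    return transform(ds)


# Defining the function above did not touch the missing package.
# Only calling it triggers the import, and therefore the error.
try:
    process_input(None, "example.nc")
    call_error = None
except ImportError as exc:
    call_error = exc
```

In an environment where the dependency is installed, the same call would simply import it and proceed.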
Great...where would we define `metpy` as a run dependency or is it already installed in the environment used to run the recipe?
It's already in the environment used to run the recipe, so I think if the import is moved into the function body, it should just work.
Hi @cisaacstern and @andersy005, the latest recipe commit now has the `metpy` import inside `ics_wind_speed_direction()`. I've tested it locally with version 0.9.1.
…d_speed_direction()`
Thanks @derekocallaghan! Something I've been learning recently is that in the Apache Beam distributed computing model we're using on the backend, it turns out any imports used within a function need to be declared inside the function body. So I've moved the xarray import into the function body as well. I'll accept those changes myself now, and then trigger a test run of this recipe.
import pandas as pd
from pangeo_forge_recipes.patterns import FilePattern, ConcatDim
from pangeo_forge_recipes.recipes import XarrayZarrRecipe
import xarray as xr
import xarray as xr |
direction for the u and v components in the specified product. Dask arrays are
created for delayed execution.
"""
from metpy.calc import wind_direction, wind_speed
from metpy.calc import wind_direction, wind_speed
import xarray as xr
from metpy.calc import wind_direction, wind_speed
@derekocallaghan for some reason I don't see the option to accept these changes myself. Could you move the xarray import into the function body as well, as proposed above? Thanks! |
…into the function body.
Hi @cisaacstern, no problem, I've just moved all required imports into the function body. Local test was successful. |
pre-commit.ci autofix |
Oh I see why I couldn't accept changes, same reason pre-commit autofix didn't work:
No problem, that's just stylistic stuff. I'll run the test now. |
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
The test failed, but I'm sure we can find out why! Pangeo Forge maintainers are working diligently to provide public logs for contributors. |
Re: this failure, see my comment in #169 (comment). I aim to have a way for other admins to query these logs today. I'll check back here once that's ready. |
Thanks, sorry about that, it was made from an organization. |
I've been looking into reproducing the issue locally using
In [29]: bconfig
Out[29]:
{'Bake': {'bakery_class': 'pangeo_forge_runner.bakery.local.LocalDirectBakery',
'recipe_id': 'eooffshore_ics_ccmp_v02_1_nrt_wind',
'feedstock_subdir': 'recipes/eooffshore_ics_ccmp_v02_1_nrt_wind',
'repo': 'https://github.com/eooffshore/staged-recipes',
'ref': '663f30c95c406b9efe012b9bae66fa1f386b539b',
'job_name': 'CCMP',
'prune': True},
'LocalDirectBakery': {'num_workers': 1},
'TargetStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
'root_path': './ccmp.zarr'},
'InputCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
'root_path': './input-cache/'},
'MetadataCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
'root_path': './metadata-cache/'}}
In [30]: Bake(config=Config(bconfig)).start()
[Bake] Target Storage is FSSpecTarget(LocalFileSystem(, root_path="./ccmp.zarr")
[Bake] Input Cache Storage is CacheFSSpecTarget(LocalFileSystem(, root_path="./input-cache/")
[Bake] Metadata Cache Storage is MetadataTarget(LocalFileSystem(, root_path="./metadata-cache/")
[Bake] Picked Git content provider.
[Bake] Cloning into '/tmp/tmpub5ds75f'...
[Bake] HEAD is now at 663f30c Added datetime import to ics_wind_speed_direction()
[Bake] Parsing recipes...
[Bake] Baking only recipe_id='eooffshore_ics_ccmp_v02_1_nrt_wind'
[Bake] Running job for recipe eooffshore_ics_ccmp_v02_1_nrt_wind
WARNING:apache_beam.runners.portability.local_job_service:Worker: severity: WARN timestamp { seconds: 1666188444 nanos: 309692382 } message: "Discarding unparseable args: [\'--pipeline_type_check\', \'--direct_runner_use_stacked_bundle\']" log_location: "/data/anaconda/anaconda3/envs/forgerunner/lib/python3.9/site-packages/apache_beam/options/pipeline_options.py:339" thread: "MainThread"
path: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150116_V02.1_L3.0_RSS.nc, full path: ./input-cache/d3408891d9718ce472f03bb2d4427fe7-https_data.remss.com_ccmp_v02.1.nrt_y2015_m01_ccmp_rt_wind_analysis_20150116_v02.1_l3.0_rss.nc
path: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150117_V02.1_L3.0_RSS.nc, full path: ./input-cache/0bc1db53b59b2db2912c364bb76e0070-https_data.remss.com_ccmp_v02.1.nrt_y2015_m01_ccmp_rt_wind_analysis_20150117_v02.1_l3.0_rss.nc
/data/anaconda/anaconda3/envs/forgerunner/lib/python3.9/site-packages/pangeo_forge_recipes/chunk_grid.py:51: UserWarning: chunksize (8000) > dimsize (8). Decreasing chunksize to 8
warnings.warn(
In [31]: ds = xr.open_zarr('./ccmp.zarr/')
In [32]: ds
Out[32]:
<xarray.Dataset>
Dimensions: (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
* height (height) int64 10
* latitude (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
* longitude (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
* time (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
nobs (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
uwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
vwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
wind_direction (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
wind_speed (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/38)
Conventions: CF-1.6
comment: none
contact: Remote Sensing Systems, support@remss.com
contributor_name: Carl Mears, Joel Scott, Frank Wentz, Ross...
contributor_role: Co-Investigator, Software Engineer, Proje...
creator_email: support@remss.com
... ...
publisher_email: support@remss.com
publisher_name: Remote Sensing Systems
publisher_url: http://www.remss.com/
references: Mears et al., Journal of Geophysical Rese...
summary: CCMP_RT V2.1 has been created using the s...
title: RSS CCMP_RT V2.1 derived surface winds (L...
In [33]: ds.time
Out[33]:
<xarray.DataArray 'time' (time: 8)>
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
'2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
'2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
'2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Attributes:
_CoordinateAxisType: Time
_Fillvalue: -9999.0
axis: T
delta_t: 0000-00-00 06:00:00
long_name: Time of analysis
standard_name: time
valid_max: 245826.0
valid_min: 245808.0 |
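For readers unfamiliar with the derived `wind_speed` and `wind_direction` variables in the dataset above, the following is a plain-Python sketch of the textbook relations between the eastward (`u`) and northward (`v`) components and the speed and meteorological "from" direction. It is an illustration only, not MetPy's implementation (MetPy also handles units), and the function names here are chosen just for this example.

```python
import math


def wind_speed(u, v):
    """Magnitude of the horizontal wind from its eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def wind_direction_from(u, v):
    """Direction the wind blows from, in degrees clockwise from north (0 <= result < 360)."""
    return (270.0 - math.degrees(math.atan2(v, u))) % 360.0


# A westerly wind (blowing towards the east) has u > 0 and v = 0.
westerly = wind_direction_from(5.0, 0.0)
# A southerly wind (blowing towards the north) has u = 0 and v > 0.
southerly = wind_direction_from(0.0, 5.0)
# Components of 3 and 4 m/s combine to a speed of 5 m/s.
speed = wind_speed(3.0, 4.0)
```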
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
The test failed, but I'm sure we can find out why! Pangeo Forge maintainers are working diligently to provide public logs for contributors. |
@derekocallaghan, after digging into the logs, I found the following traceback, which seems to point at connectivity issues.
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 391, in _info
await _file_info(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 768, in _file_info
r = await session.get(url, allow_redirects=ar, **kwargs)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/client.py", line 536, in _request
conn = await self._connector.connect(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 540, in connect
proto = await self._create_connection(req, traces, timeout)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 901, in _create_connection
_, proto = await self._create_direct_connection(req, traces, timeout)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 1206, in _create_direct_connection
raise last_exc
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 1175, in _create_direct_connection
transp, proto = await self._wrap_create_connection(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 988, in _wrap_create_connection
raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host data.remss.com:443 ssl:default [Connect call failed ('157.131.67.86', 443)]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 624, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1956, in <lambda>
File "/usr/local/lib/python3.9/dist-packages/pangeo_forge_recipes/executors/beam.py", line 40, in exec_stage
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 156, in cache_input
config.storage_config.cache.cache_file(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/storage.py", line 164, in cache_file
remote_size = _get_url_size(fname, secrets, **open_kwargs)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/storage.py", line 31, in _get_url_size
with _get_opener(fname, secrets, **open_kwargs) as of:
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/core.py", line 103, in __enter__
f = self.fs.open(self.path, mode=mode)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/spec.py", line 1034, in open
f = self._open(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 340, in _open
size = size or self.info(path, **kwargs)["size"]
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 111, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 96, in sync
raise return_result
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
result[0] = await coro
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 404, in _info
raise FileNotFoundError(url) from exc
FileNotFoundError: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150117_V02.1_L3.0_RSS.nc
@rabernat / @yuvipanda, does the dataflow runner possibly not have internet access? I am able to access the URLs in question (e.g. |
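One thing worth noting in the traceback above: the final `FileNotFoundError` is raised `from` the `ClientConnectorError`, so the file may well exist and the underlying failure is the connection. Below is a minimal standard-library sketch of that chaining pattern. The `fetch_size` and `info` helpers and the `example.invalid` URL are hypothetical stand-ins for illustration, not fsspec's code.

```python
def fetch_size(url):
    """Stand-in for a network call that cannot reach the host."""
    raise ConnectionError(f"Cannot connect to host for {url}")


def info(url):
    try:
        return fetch_size(url)
    except ConnectionError as exc:
        # Same chaining pattern as in the traceback: the original error becomes __cause__.
        raise FileNotFoundError(url) from exc


try:
    info("https://example.invalid/data.nc")
    root_cause = None
except FileNotFoundError as err:
    # The connection failure is still reachable from the surfaced exception.
    root_cause = err.__cause__
```

When reading logs like these, checking `__cause__` (or the "direct cause" section of the traceback) distinguishes a missing file from an unreachable host.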
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
The test failed, but I'm sure we can find out why! Pangeo Forge maintainers are working diligently to provide public logs for contributors. |
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
The test failed, but I'm sure we can find out why! Pangeo Forge maintainers are working diligently to provide public logs for contributors. |
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
The test failed, but I'm sure we can find out why! Pangeo Forge maintainers are working diligently to provide public logs for contributors. |
Not sure what's going on, but the failure seems to be related to a serialization issue I observed in other feedstocks:
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
raise OSError('source code not available')
OSError: source code not available [while running 'Start|cache_input|Reshuffle_000|prepare_target|Reshuffle_001|store_chunk|Reshuffle_002|finalize_target|Reshuffle_003/prepare_target-ptransform-56']
,
a6170692e70616e67656f2d66-10201802-cyzc-harness-sbr9
Root cause: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1877, in <lambda>
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/beam.py", line 14, in _no_arg_stage
fun(config=config)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 587, in prepare_target
for k, v in config.get_execution_context().items():
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 59, in get_execution_context
recipe_hash=self.sha256().hex(),
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 53, in sha256
return dataclass_sha256(self, ignore_keys=self._hash_exclude_)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 73, in dataclass_sha256
return dict_to_sha256(d)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 34, in dict_to_sha256
b = dumps(
File "/srv/conda/envs/notebook/lib/python3.9/json/__init__.py", line 234, in dumps
return cls(
File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 22, in either_encode_or_hash
return inspect.getsource(obj)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1024, in getsource
lines, lnum = getsourcelines(object)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1006, in getsourcelines
lines, lnum = findsource(object)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
raise OSError('source code not available')
OSError: source code not available
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 284, in _execute
response = task()
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 357, in <lambda>
lambda: self.create_worker().do_instruction(request), request)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 597, in do_instruction
return getattr(self, request_type)(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 635, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1003, in process_bundle
input_op_by_transform_id[element.transform_id].process_encoded(
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/bundle_processor.py", line 227, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 526, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 528, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 237, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1491, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 623, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1581, in apache_beam.runners.common._OutputHandler.handle_process_outputs
File "apache_beam/runners/common.py", line 1694, in apache_beam.runners.common._OutputHandler._write_value_to_tag
File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1491, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 623, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1581, in apache_beam.runners.common._OutputHandler.handle_process_outputs
File "apache_beam/runners/common.py", line 1694, in apache_beam.runners.common._OutputHandler._write_value_to_tag
File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1507, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1877, in <lambda>
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/beam.py", line 14, in _no_arg_stage
fun(config=config)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 587, in prepare_target
for k, v in config.get_execution_context().items():
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 59, in get_execution_context
recipe_hash=self.sha256().hex(),
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 53, in sha256
return dataclass_sha256(self, ignore_keys=self._hash_exclude_)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 73, in dataclass_sha256
return dict_to_sha256(d)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 34, in dict_to_sha256
b = dumps(
File "/srv/conda/envs/notebook/lib/python3.9/json/__init__.py", line 234, in dumps
return cls(
File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 22, in either_encode_or_hash
return inspect.getsource(obj)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1024, in getsource
lines, lnum = getsourcelines(object)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1006, in getsourcelines
lines, lnum = findsource(object)
File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
raise OSError('source code not available')
OSError: source code not available [while running 'Start|cache_input|Reshuffle_000|prepare_target|Reshuffle_001|store_chunk|Reshuffle_002|finalize_target|Reshuffle_003/prepare_target-ptransform-56'] |
Hi @andersy005, thanks for the test runs and the update. I guess the good news is that the
I had a quick look at the recipes that failed yesterday (this one - EOOffshore CCMP, AGDC, LMR...) vs the recipe that ran successfully (eNATL60), and the latter is the only one that doesn't define either a
I had a look at reproducing the scenario with a combination of code snippets similar to what's used in
In [43]: from fsspec.implementations.local import LocalFileSystem
...: import inspect
...: from pangeo_forge_recipes.storage import CacheFSSpecTarget, FSSpecTarget, MetadataTarget, StorageConfig, temporary_storage_config
...: from pangeo_forge_runner import Feedstock
...: from pathlib import Path
...:
...: feedstock = Feedstock(Path(".../staged-recipes/recipes/eooffshore_ics_ccmp_v02_1_nrt_wind"))
...:
...: recipes = feedstock.parse_recipes()
...:
...: recipes = {k: r.copy_pruned() for k, r in recipes.items()}
...:
...: recipe = recipes['eooffshore_ics_ccmp_v02_1_nrt_wind']
...:
...: storage_config = temporary_storage_config()
...: storage_config.target = FSSpecTarget(LocalFileSystem(), f'./ccmp.zarr')
...: storage_config.cache = CacheFSSpecTarget(LocalFileSystem(), './input-cache')
...: recipe.storage_config = storage_config
...:
In [44]: recipe
Out[44]: XarrayZarrRecipe(file_pattern=<FilePattern {'time': 2}>, storage_config=StorageConfig(target=FSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='./ccmp.zarr'), cache=CacheFSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='./input-cache'), metadata=MetadataTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='/tmp/tmps62mr0gl/urkdQjLc')), inputs_per_chunk=2000, target_chunks={'time': 8000, 'latitude': -1, 'longitude': -1}, cache_inputs=True, copy_input_to_local_file=False, consolidate_zarr=True, consolidate_dimension_coordinates=True, xarray_open_kwargs={}, xarray_concat_kwargs={}, delete_input_encoding=True, process_input=<function ics_wind_speed_direction at 0x7fbf19f7df70>, process_chunk=None, lock_timeout=None, subset_inputs={}, open_input_with_kerchunk=False)
In [45]: recipe.sha256()
Out[45]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'
In [46]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_)
Out[46]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'
In [47]: either_encode_or_hash(recipe.process_input)
Out[47]: 'def ics_wind_speed_direction(ds, fname):\n """\n Selects a subset for the Irish Continental Shelf (ICS) region, and computes wind speed and\n direction for the u and v components in the specified product. Dask arrays are\n created for delayed execution.\n """\n import dask\n import dask.array as da\n from datetime import datetime\n from metpy.calc import wind_direction, wind_speed\n import xarray as xr\n\n @dask.delayed\n def delayed_metpy_fn(fn, u, v):\n return fn(u, v).values\n\n # ICS grid\n geospatial_lat_min = 45.75\n geospatial_lat_max = 58.25\n geospatial_lon_min = 333.85\n geospatial_lon_max = 355.35\n icds = ds.sel(\n latitude=slice(geospatial_lat_min, geospatial_lat_max),\n longitude=slice(geospatial_lon_min, geospatial_lon_max),\n )\n\n # Remove subset of original attrs as they\'re no longer relevant\n for attr in ["base_date", "date_created", "history"]:\n del icds.attrs[attr]\n\n # Update the grid attributes\n icds.attrs.update(\n {\n "geospatial_lat_min": geospatial_lat_min,\n "geospatial_lat_max": geospatial_lat_max,\n "geospatial_lon_min": geospatial_lon_min,\n "geospatial_lon_max": geospatial_lon_max,\n }\n )\n u = icds.uwnd\n v = icds.vwnd\n # Original wind speed \'units\': \'m s-1\' attribute not accepted by MetPy,\n # use the unit contained in ERA5 data\n ccmp_wind_speed_units = u.units\n era5_wind_speed_units = "m s**-1"\n u.attrs["units"] = era5_wind_speed_units\n v.attrs["units"] = era5_wind_speed_units\n\n variables = [\n {\n "name": "wind_speed",\n "metpy_fn": wind_speed,\n "attrs": {"long_name": "Wind speed", "units": ccmp_wind_speed_units},\n },\n {\n "name": "wind_direction",\n "metpy_fn": wind_direction,\n "attrs": {"long_name": "Wind direction", "units": "degree"},\n },\n ]\n\n # CCMP provides u/v at a single height, 10m\n for variable in variables:\n icds[variable["name"]] = (\n xr.DataArray(\n da.from_delayed(\n delayed_metpy_fn(variable["metpy_fn"], u, v), u.shape, dtype=u.dtype\n ),\n coords=u.coords,\n dims=u.dims,\n )\n 
.assign_coords(height=10)\n .expand_dims(["height"])\n )\n icds[variable["name"]].attrs.update(variable["attrs"])\n\n icds.height.attrs.update(\n {\n "long_name": "Height above the surface",\n "standard_name": "height",\n "units": "m",\n }\n )\n # Restore units\n for variable in ["uwnd", "vwnd"]:\n icds[variable].attrs["units"] = ccmp_wind_speed_units\n\n icds.attrs["eooffshore_zarr_creation_time"] = datetime.strftime(\n datetime.now(), "%Y-%m-%dT%H:%M:%SZ"\n )\n icds.attrs[\n "eooffshore_zarr_details"\n ] = "EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com."\n return icds\n'
In [48]: inspect.isfunction(recipe.process_input)
Out[48]: True
In [49]: inspect.getsource(recipe.process_input)
Out[49]: 'def ics_wind_speed_direction(ds, fname):\n """\n Selects a subset for the Irish Continental Shelf (ICS) region, and computes wind speed and\n direction for the u and v components in the specified product. Dask arrays are\n created for delayed execution.\n """\n import dask\n import dask.array as da\n from datetime import datetime\n from metpy.calc import wind_direction, wind_speed\n import xarray as xr\n\n @dask.delayed\n def delayed_metpy_fn(fn, u, v):\n return fn(u, v).values\n\n # ICS grid\n geospatial_lat_min = 45.75\n geospatial_lat_max = 58.25\n geospatial_lon_min = 333.85\n geospatial_lon_max = 355.35\n icds = ds.sel(\n latitude=slice(geospatial_lat_min, geospatial_lat_max),\n longitude=slice(geospatial_lon_min, geospatial_lon_max),\n )\n\n # Remove subset of original attrs as they\'re no longer relevant\n for attr in ["base_date", "date_created", "history"]:\n del icds.attrs[attr]\n\n # Update the grid attributes\n icds.attrs.update(\n {\n "geospatial_lat_min": geospatial_lat_min,\n "geospatial_lat_max": geospatial_lat_max,\n "geospatial_lon_min": geospatial_lon_min,\n "geospatial_lon_max": geospatial_lon_max,\n }\n )\n u = icds.uwnd\n v = icds.vwnd\n # Original wind speed \'units\': \'m s-1\' attribute not accepted by MetPy,\n # use the unit contained in ERA5 data\n ccmp_wind_speed_units = u.units\n era5_wind_speed_units = "m s**-1"\n u.attrs["units"] = era5_wind_speed_units\n v.attrs["units"] = era5_wind_speed_units\n\n variables = [\n {\n "name": "wind_speed",\n "metpy_fn": wind_speed,\n "attrs": {"long_name": "Wind speed", "units": ccmp_wind_speed_units},\n },\n {\n "name": "wind_direction",\n "metpy_fn": wind_direction,\n "attrs": {"long_name": "Wind direction", "units": "degree"},\n },\n ]\n\n # CCMP provides u/v at a single height, 10m\n for variable in variables:\n icds[variable["name"]] = (\n xr.DataArray(\n da.from_delayed(\n delayed_metpy_fn(variable["metpy_fn"], u, v), u.shape, dtype=u.dtype\n ),\n coords=u.coords,\n dims=u.dims,\n )\n 
.assign_coords(height=10)\n .expand_dims(["height"])\n )\n icds[variable["name"]].attrs.update(variable["attrs"])\n\n icds.height.attrs.update(\n {\n "long_name": "Height above the surface",\n "standard_name": "height",\n "units": "m",\n }\n )\n # Restore units\n for variable in ["uwnd", "vwnd"]:\n icds[variable].attrs["units"] = ccmp_wind_speed_units\n\n icds.attrs["eooffshore_zarr_creation_time"] = datetime.strftime(\n datetime.now(), "%Y-%m-%dT%H:%M:%SZ"\n )\n icds.attrs[\n "eooffshore_zarr_details"\n ] = "EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com."\n return icds\n'
In [50]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_)
Out[50]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'
I guess this doesn't necessarily help determine why the above succeeds locally but fails in the test runs. However, as we're excluding
In [51]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_ + ['process_input'])
Out[51]: b'\x14w\xd5\xcbH\x15At:\x1bY;\xbb/P\x90%t\xf3\xd9T\\P\x9a\xf4\xf5\xa18\x0b\xa7M\xa8'
I.e. use the following in
_hash_exclude_ = ["process_chunk", "process_input", "storage_config"]
and in
if k in d:
    del d[k]
Afaict, apart from tests, |
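To make the effect of the extra exclusion concrete, here is a minimal sketch using only the standard library. `ToyRecipe` and `sketch_sha256` are hypothetical names invented for illustration; this is not the pangeo-forge `dataclass_sha256` implementation, only the same "delete ignored keys, then hash" pattern shown above. It assumes the field that differs between environments is the serialized form of the `process_input` callable, which the thread does not confirm.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class ToyRecipe:
    # Hypothetical stand-in for a recipe; not a pangeo-forge class.
    target_chunks: dict
    process_input: str  # stands in for an environment-dependent callable repr
    storage_config: str


def sketch_sha256(obj, ignore_keys=()):
    """Hash a dataclass's fields, skipping any field named in ignore_keys."""
    d = asdict(obj)
    for k in ignore_keys:
        if k in d:
            del d[k]
    # Sort keys so the digest does not depend on field ordering.
    payload = json.dumps(d, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).digest()


local = ToyRecipe({"time": 8}, "<function at 0x7f00>", "local-cache")
remote = ToyRecipe({"time": 8}, "<function at 0x7f99>", "remote-cache")

# With process_input hashed, the two environments disagree.
differs = sketch_sha256(local, ["storage_config"]) != sketch_sha256(
    remote, ["storage_config"]
)
# Once process_input is excluded as well, the digests match.
exclude = ["process_input", "storage_config"]
matches = sketch_sha256(local, exclude) == sketch_sha256(remote, exclude)
```

The trade-off is that a changed `process_input` no longer changes the hash, so anything keyed on that hash would not notice an edited preprocessing function.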
Thank you for the informative insights, @derekocallaghan... @rabernat, does this ring any bells for you? |
/run eooffshore_ics_ccmp_v02_1_nrt_wind |
🎉 The test run of
import xarray as xr
store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1255/eooffshore_ics_ccmp_v02_1_nrt_wind.zarr"
ds = xr.open_dataset(store, engine='zarr', chunks={})
ds |
@derekocallaghan, the latest run seems to have succeeded: https://pangeo-forge.org/dashboard/recipe-run/1255?feedstock_id=1 Let me know if this is ready, and I'll merge it. |
Hi @andersy005, thanks for retrying the run and for all of the previous runs. I've just tried a short check of this run's store, compared to my corresponding local store, and it looks good:
In [1]: import xarray as xr
In [2]: store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1255/eooffshore_ics_ccmp_v02_1_nrt_wind.zarr"
...: ds = xr.open_dataset(store, engine='zarr', chunks={})
In [3]: ds
Out[3]:
<xarray.Dataset>
Dimensions: (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
* height (height) int64 10
* latitude (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
* longitude (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
* time (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
nobs (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
uwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
vwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
wind_direction (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
wind_speed (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/38)
Conventions: CF-1.6
comment: none
contact: Remote Sensing Systems, support@remss.com
contributor_name: Carl Mears, Joel Scott, Frank Wentz, Ross...
contributor_role: Co-Investigator, Software Engineer, Proje...
creator_email: support@remss.com
... ...
publisher_email: support@remss.com
publisher_name: Remote Sensing Systems
publisher_url: http://www.remss.com/
references: Mears et al., Journal of Geophysical Rese...
summary: CCMP_RT V2.1 has been created using the s...
title: RSS CCMP_RT V2.1 derived surface winds (L...
In [4]: ds.time.values
Out[4]:
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
'2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
'2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
'2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
dtype='datetime64[ns]')
In [5]: ds.resample(time='D').mean().wind_speed.isel(latitude=0,longitude=0).compute()
Out[5]:
<xarray.DataArray 'wind_speed' (time: 2, height: 1)>
array([[10.636068],
[14.07321 ]], dtype=float32)
Coordinates:
* height (height) int64 10
latitude float32 45.88
longitude float32 333.9
* time (time) datetime64[ns] 2015-01-16 2015-01-17
In [6]: ds.eooffshore_zarr_details
Out[6]: 'EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com.'
In [7]: dslocal = xr.open_zarr('./eooffshore_ics_ccmp_v02_1_nrt_wind.zarr')
In [8]: dslocal = dslocal.isel(time=slice(0,8))
In [9]: dslocal
Out[9]:
<xarray.Dataset>
Dimensions: (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
* height (height) int64 10
* latitude (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
* longitude (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
* time (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
nobs (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
uwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
vwnd (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
wind_direction (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
wind_speed (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/35)
Conventions: CF-1.6
comment: none
contact: Remote Sensing Systems, support@remss.com
contributor_name: Carl Mears, Joel Scott, Frank Wentz, Ross...
contributor_role: Co-Investigator, Software Engineer, Proje...
creator_email: support@remss.com
... ...
publisher_email: support@remss.com
publisher_name: Remote Sensing Systems
publisher_url: http://www.remss.com/
references: Mears et al., Journal of Geophysical Rese...
summary: CCMP_RT V2.1 has been created using the s...
title: RSS CCMP_RT V2.1 derived surface winds (L...
In [10]: dslocal.time.values
Out[10]:
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
'2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
'2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
'2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
dtype='datetime64[ns]')
In [11]: dslocal.resample(time='D').mean().wind_speed.isel(latitude=0,longitude=0).compute()
Out[11]:
<xarray.DataArray 'wind_speed' (time: 2, height: 1)>
array([[10.63607],
[14.07321]], dtype=float32)
Coordinates:
* height (height) int64 10
latitude float32 45.88
longitude float32 333.9
* time (time) datetime64[ns] 2015-01-16 2015-01-17
In [12]: dslocal.eooffshore_zarr_details
Out[12]: 'EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com.' |
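The two daily means print with a different number of digits (10.636068 for the test-run store, 10.63607 for the local one). A quick tolerance check, sketched here with only the standard library, suggests they agree to within float32 precision. The values are copied from the outputs above; the tolerance is an assumption, not something stated in this thread.

```python
import math

# Daily-mean wind speeds (m/s) as printed for the two stores above.
remote_means = [10.636068, 14.07321]
local_means = [10.63607, 14.07321]

# Assumed tolerance: float32 carries roughly 7 significant decimal digits.
REL_TOL = 1e-6

agree = all(
    math.isclose(r, l, rel_tol=REL_TOL)
    for r, l in zip(remote_means, local_means)
)
print(agree)
```

A fuller comparison would check every variable rather than a single grid cell, for example with an array-aware closeness test over the whole dataset.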
Thank you, @derekocallaghan! Just merged this. For any subsequent issues/discussions, let's have them over in https://github.com/pangeo-forge/eooffshore_ics_ccmp_v02_1_nrt_wind-feedstock |
This recipe will create a data set containing 2015 - 2021 Cross-Calibrated Multi-Platform (CCMP) v0.2.1.NRT 6-hourly wind products for the Irish Continental Shelf region, where wind speed and direction are calculated from the `uwnd` and `vwnd` variables. The source data products are generated by Remote Sensing Systems (RSS).

The recipe will recreate the CCMP data set used in the EOOffshore project (https://eooffshore.github.io), whose outputs were presented (Scalable Offshore Wind Analysis With Pangeo) at the Meeting Exascale Computing Challenges with Compression and Pangeo 2022 EGU General Assembly session.
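For readers unfamiliar with the derivation, the following is a minimal scalar sketch of how speed and direction follow from the eastward (`uwnd`) and northward (`vwnd`) components. It uses only the standard library and the meteorological "direction the wind blows from" convention; `speed_and_direction` is a hypothetical helper, not the recipe's code. The recipe itself calls MetPy's `wind_speed` and `wind_direction`, whose handling of calm and due-north winds may differ from this sketch.

```python
import math


def speed_and_direction(u, v):
    """Return (speed, direction_from_degrees) for eastward u and northward v."""
    speed = math.hypot(u, v)
    # Meteorological convention: the direction the wind blows FROM,
    # measured clockwise from north and wrapped into the 0-360 range.
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


# u = -10 m/s means air moving westward, i.e. an easterly wind.
easterly = speed_and_direction(-10.0, 0.0)
# u = 10 m/s means air moving eastward, i.e. a westerly wind.
westerly = speed_and_direction(10.0, 0.0)
# A 3-4-5 triangle gives a speed of 5 m/s.
diagonal = speed_and_direction(3.0, 4.0)
```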
Example usage of the CCMP data set in EOOffshore:
Note: `pangeo_notebook_version: "2022.05.02"` isn't in the current sandbox `meta.yaml` template; I've included it as it seems to be in recently contributed recipes. This may need to be excluded or the version changed (I couldn't determine the latter).