Adding EOOffshore CCMP v0.2.1.NRT recipe #145

Merged


derekocallaghan
Contributor

This recipe will create a data set containing 2015 - 2021 Cross-Calibrated Multi-Platform (CCMP) v0.2.1.NRT 6-hourly wind products for the Irish Continental Shelf region, where wind speed and direction are calculated from the uwnd and vwnd variables. The source data products are generated by Remote Sensing Systems (RSS).
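For readers unfamiliar with the derivation mentioned above, here is a minimal sketch of how speed and meteorological direction follow from the u (eastward) and v (northward) components. It uses only the standard library rather than MetPy, and the function name `speed_and_direction` is hypothetical; the recipe itself calls `metpy.calc.wind_speed` and `metpy.calc.wind_direction`, whose handling of units and calm winds may differ from this sketch.

```python
import math


def speed_and_direction(u, v):
    """Return (speed, direction) for wind components u (eastward) and v (northward).

    Direction follows the meteorological convention: degrees clockwise from
    north, naming the direction the wind blows FROM.
    """
    speed = math.hypot(u, v)
    direction = (270.0 - math.degrees(math.atan2(v, u))) % 360.0
    return speed, direction


# A westerly wind (blowing from the west towards the east) has u > 0 and v == 0.
westerly = speed_and_direction(10.0, 0.0)
# A southerly wind (blowing from the south towards the north) has u == 0 and v > 0.
southerly = speed_and_direction(0.0, 5.0)

print(westerly, southerly)
```

The speed keeps the units of the inputs (m s-1 in the CCMP products), and the direction is in degrees.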

The recipe will recreate the CCMP data set used in the EOOffshore project (https://eooffshore.github.io), whose outputs were presented ("Scalable Offshore Wind Analysis With Pangeo") at the "Meeting Exascale Computing Challenges with Compression and Pangeo" session of the 2022 EGU General Assembly.

Example usage of the CCMP data set in EOOffshore:

Note:

  • A pruned version of the recipe has been successfully tested in the sandbox today.
  • Although pangeo_notebook_version: "2022.05.02" isn't in the current sandbox meta.yaml template, I've included it as it seems to be in recently contributed recipes. This may need to be excluded or the version changed (I couldn't determine the latter).
  • Although I couldn't find the license details on the RSS site, this NCAR/UCAR Research Data Archive page states that it's CC-BY-4.0. A separate CCMP data set has been previously used in the NASA CCMP Winds Pangeo Gallery notebook.

@pangeo-forge-bot

🎉 New recipe runs created for the following recipes at sha 6a58178849a44f2fdad99c5c1b11e1a7ea0a1cf0:

@rabernat
Contributor

/run recipe-test recipe_run_id=885

@pangeo-forge-bot

When I tried to import your recipe module, I encountered this error

            line 4, in <module>
        from metpy.calc import wind_direction, wind_speed
    ModuleNotFoundError: No module named 'metpy'

Please correct your recipe module so that it's importable.

@rabernat
Contributor

@derekocallaghan - thanks so much for this contribution and sorry for the slow response here! (Some of our team members have been out sick.)

I'm confused about the metpy error, since metpy is clearly part of the pangeo docker image (https://github.com/pangeo-data/pangeo-docker-images/blob/master/pangeo-notebook/environment.yml#L53). @cisaacstern - any thoughts on this?

@cisaacstern
Member

> ModuleNotFoundError: No module named 'metpy'

This is a design flaw with the registrar as it's currently designed. (Which is my fault.) ... we don't have the full worker environment available in the registrar's Python session. 🤔

@cisaacstern
Member

Fixing this (major) design oversight is on our roadmap. For Pangeo Forge organization members, a detailed tracking issue is available here: https://github.com/pangeo-forge/registrar/issues/43.

@derekocallaghan
Contributor Author

Hi @cisaacstern and @rabernat, thanks for the update. Sorry for only getting back now, I'm currently on vacation. I could exclude the metpy usage but I was hoping to reproduce the same data set used for the EGU presentation. I can hold off for now in case the registrar issue you mention will be resolved in the coming months.

@cisaacstern
Member

Yes, our plan is to support metpy in the coming months. Thanks for your patience, and please feel free to check back with me on this thread at any point. I plan to circle back to this once metpy is supported here.

@derekocallaghan
Contributor Author

Thanks @cisaacstern. By the way, I'm happy to help with resolving the issue if that suits.

@cisaacstern added the `blocked:requires-dind` label (Requires Docker in Docker on backend to deploy, due to some import at recipe top level) on Sep 8, 2022
…ersion 0.9.1

Runs are postponed for now, see `blocked:requires-dind` label.
@andersy005
Member

pre-commit.ci autofix

import dask
import dask.array as da
from datetime import datetime
from metpy.calc import wind_direction, wind_speed
Member

@andersy005, despite my earlier comments on this PR, and the blocked:... label I added, I actually think if we move this metpy import inside the body of the ics_wind_speed_direction function below, this will work.

Member

The reason being that the Dataflow worker image does have metpy, so the only issue is actually that the FastAPI application doesn't have metpy at recipe parse time. But if that import is within the function body (where IIUC it is used anyway), then it will not be imported at parse time.
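To illustrate the point about parse time, here is a minimal, self-contained sketch. The module name `hypothetical_missing_dependency` is made up so that the import is guaranteed to fail; it stands in for a package that is present on the worker image but absent where the recipe module is first imported.

```python
def process_input_sketch(u, v):
    """Stand-in for a recipe preprocessing function (name is hypothetical)."""
    # Deferred import: this line only runs when the function is called,
    # not when the surrounding module is imported or parsed.
    from hypothetical_missing_dependency import wind_speed

    return wind_speed(u, v)


# Defining the function above succeeded even though the dependency is missing.
definition_ok = callable(process_input_sketch)

# The import error only surfaces once the function body actually executes.
try:
    process_input_sketch(1.0, 2.0)
    failure = None
except ModuleNotFoundError as err:
    failure = str(err)

print(definition_ok, failure)
```

Had the same import been at module top level, importing the module itself would have raised the error, which matches the bot message earlier in this thread.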

Member

Great...where would we define metpy as a run dependency or is it already installed in the environment used to run the recipe?

Member

It's already in the environment used to run the recipe, so I think if the import is moved into the function body, it should just work.

Contributor Author

Hi @cisaacstern and @andersy005, the latest recipe commit now has the metpy import inside ics_wind_speed_direction(). I've tested it locally with version 0.9.1.

Member

@cisaacstern left a comment

Thanks @derekocallaghan! Something I've been learning recently is that in the Apache Beam distributed computing model we're using on the backend, it turns out any imports used within a function need to be declared inside the function body. So I've moved the xarray import into the function body as well. I'll accept those changes myself now, and then trigger a test run of this recipe.

import pandas as pd
from pangeo_forge_recipes.patterns import FilePattern, ConcatDim
from pangeo_forge_recipes.recipes import XarrayZarrRecipe
import xarray as xr
Member

Suggested change
- import xarray as xr

direction for the u and v components in the specified product. Dask arrays are
created for delayed execution.
"""
from metpy.calc import wind_direction, wind_speed
Member

Suggested change
- from metpy.calc import wind_direction, wind_speed
+ import xarray as xr
+ from metpy.calc import wind_direction, wind_speed

@cisaacstern
Member

@derekocallaghan for some reason I don't see the option to accept these changes myself. Could you move the xarray import into the function body as well, as proposed above? Thanks!

@derekocallaghan
Contributor Author

> @derekocallaghan for some reason I don't see the option to accept these changes myself. Could you move the xarray import into the function body as well, as proposed above? Thanks!

Hi @cisaacstern, no problem, I've just moved all required imports into the function body. Local test was successful.

@cisaacstern
Member

pre-commit.ci autofix

@cisaacstern
Member

Oh I see why I couldn't accept changes, same reason pre-commit autofix didn't work:

push (skipped)
push permission was denied -- perhaps the PR was made from an organization?

No problem, that's just stylistic stuff. I'll run the test now.

@cisaacstern
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Sep 29, 2022

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors.
That feature is not quite ready yet, however, so please reach out on this thread to a
maintainer, and they'll help you diagnose the problem.

@cisaacstern
Member

Re: this failure, see my comment in #169 (comment). I aim to have a way for other admins to query these logs today. I'll check back here once that's ready.

@derekocallaghan
Contributor Author

> Oh I see why I couldn't accept changes, same reason pre-commit autofix didn't work:
>
> push (skipped)
> push permission was denied -- perhaps the PR was made from an organization?
>
> No problem, that's just stylistic stuff. I'll run the test now.

Thanks, sorry about that, it was made from an organization.

@derekocallaghan
Contributor Author

I've been looking into reproducing the issue locally using pangeo_forge_runner.commands.bake.Bake, and I think the issues are only happening when running the recipe with Beam. I'm looking at generating a more useful test case, but in the meantime, here's the latest recipe commit being run with Bake, which generates the expected Zarr:

In [29]: bconfig
Out[29]: 
{'Bake': {'bakery_class': 'pangeo_forge_runner.bakery.local.LocalDirectBakery',
  'recipe_id': 'eooffshore_ics_ccmp_v02_1_nrt_wind',
  'feedstock_subdir': 'recipes/eooffshore_ics_ccmp_v02_1_nrt_wind',
  'repo': 'https://github.com/eooffshore/staged-recipes',
  'ref': '663f30c95c406b9efe012b9bae66fa1f386b539b',
  'job_name': 'CCMP',
  'prune': True},
 'LocalDirectBakery': {'num_workers': 1},
 'TargetStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './ccmp.zarr'},
 'InputCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './input-cache/'},
 'MetadataCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './metadata-cache/'}}

In [30]: Bake(config=Config(bconfig)).start()
[Bake] Target Storage is FSSpecTarget(LocalFileSystem(, root_path="./ccmp.zarr")

[Bake] Input Cache Storage is CacheFSSpecTarget(LocalFileSystem(, root_path="./input-cache/")

[Bake] Metadata Cache Storage is MetadataTarget(LocalFileSystem(, root_path="./metadata-cache/")

[Bake] Picked Git content provider.

[Bake] Cloning into '/tmp/tmpub5ds75f'...

[Bake] HEAD is now at 663f30c Added datetime import to ics_wind_speed_direction()

[Bake] Parsing recipes...
[Bake] Baking only recipe_id='eooffshore_ics_ccmp_v02_1_nrt_wind'
[Bake] Running job for recipe eooffshore_ics_ccmp_v02_1_nrt_wind

WARNING:apache_beam.runners.portability.local_job_service:Worker: severity: WARN timestamp {   seconds: 1666188444   nanos: 309692382 } message: "Discarding unparseable args: [\'--pipeline_type_check\', \'--direct_runner_use_stacked_bundle\']" log_location: "/data/anaconda/anaconda3/envs/forgerunner/lib/python3.9/site-packages/apache_beam/options/pipeline_options.py:339" thread: "MainThread" 
path: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150116_V02.1_L3.0_RSS.nc, full path: ./input-cache/d3408891d9718ce472f03bb2d4427fe7-https_data.remss.com_ccmp_v02.1.nrt_y2015_m01_ccmp_rt_wind_analysis_20150116_v02.1_l3.0_rss.nc
path: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150117_V02.1_L3.0_RSS.nc, full path: ./input-cache/0bc1db53b59b2db2912c364bb76e0070-https_data.remss.com_ccmp_v02.1.nrt_y2015_m01_ccmp_rt_wind_analysis_20150117_v02.1_l3.0_rss.nc
/data/anaconda/anaconda3/envs/forgerunner/lib/python3.9/site-packages/pangeo_forge_recipes/chunk_grid.py:51: UserWarning: chunksize (8000) > dimsize (8). Decreasing chunksize to 8
  warnings.warn(

In [31]: ds = xr.open_zarr('./ccmp.zarr/')

In [32]: ds
Out[32]: 
<xarray.Dataset>
Dimensions:         (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
  * height          (height) int64 10
  * latitude        (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
  * longitude       (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
  * time            (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
    nobs            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    uwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    vwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    wind_direction  (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
    wind_speed      (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/38)
    Conventions:                    CF-1.6
    comment:                        none
    contact:                        Remote Sensing Systems, support@remss.com
    contributor_name:               Carl Mears, Joel Scott, Frank Wentz, Ross...
    contributor_role:               Co-Investigator, Software Engineer, Proje...
    creator_email:                  support@remss.com
    ...                             ...
    publisher_email:                support@remss.com
    publisher_name:                 Remote Sensing Systems
    publisher_url:                  http://www.remss.com/
    references:                     Mears et al., Journal of Geophysical Rese...
    summary:                        CCMP_RT V2.1 has been created using the s...
    title:                          RSS CCMP_RT V2.1 derived surface winds (L...

In [33]: ds.time
Out[33]: 
<xarray.DataArray 'time' (time: 8)>
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
       '2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
       '2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
       '2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Attributes:
    _CoordinateAxisType:  Time
    _Fillvalue:           -9999.0
    axis:                 T
    delta_t:              0000-00-00 06:00:00
    long_name:            Time of analysis
    standard_name:        time
    valid_max:            245826.0
    valid_min:            245808.0
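
As a quick sanity check on the pruned output above, two daily source files at the 6-hourly cadence described in the `delta_t` attribute should give eight time steps. The sketch below rebuilds that axis with the standard library only; the assumption that each daily file holds four analyses starting at 00:00 is inferred from the `ds.time` values shown, not from RSS documentation.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=6)  # cadence taken from the delta_t attribute above
STEPS_PER_FILE = 4         # assumption: four analyses per daily file

# The two daily inputs kept by the pruned recipe run
first_days = [datetime(2015, 1, 16), datetime(2015, 1, 17)]

# One timestamp per analysis, in file order
expected_times = [day + i * STEP for day in first_days for i in range(STEPS_PER_FILE)]

print(len(expected_times), expected_times[0].isoformat(), expected_times[-1].isoformat())
```

The first and last values can be compared against the `time` coordinate of the Zarr store opened above.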

@andersy005
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Oct 19, 2022

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors.
That feature is not quite ready yet, however, so please reach out on this thread to a
maintainer, and they'll help you diagnose the problem.

@andersy005
Member

@derekocallaghan, after digging into the logs, I found the following traceback, which seems to point to connectivity issues.

  Traceback (most recent call last):
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 391, in _info
      await _file_info(
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 768, in _file_info
      r = await session.get(url, allow_redirects=ar, **kwargs)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/client.py", line 536, in _request
      conn = await self._connector.connect(
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 540, in connect
      proto = await self._create_connection(req, traces, timeout)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 901, in _create_connection
      _, proto = await self._create_direct_connection(req, traces, timeout)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 1206, in _create_direct_connection
      raise last_exc
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 1175, in _create_direct_connection
      transp, proto = await self._wrap_create_connection(
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/aiohttp/connector.py", line 988, in _wrap_create_connection
      raise client_error(req.connection_key, exc) from exc
  aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host data.remss.com:443 ssl:default [Connect call failed ('157.131.67.86', 443)]

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
    File "apache_beam/runners/common.py", line 624, in apache_beam.runners.common.SimpleInvoker.invoke_process
    File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1956, in <lambda>
    File "/usr/local/lib/python3.9/dist-packages/pangeo_forge_recipes/executors/beam.py", line 40, in exec_stage
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 156, in cache_input
      config.storage_config.cache.cache_file(
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/storage.py", line 164, in cache_file
      remote_size = _get_url_size(fname, secrets, **open_kwargs)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/storage.py", line 31, in _get_url_size
      with _get_opener(fname, secrets, **open_kwargs) as of:
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/core.py", line 103, in __enter__
      f = self.fs.open(self.path, mode=mode)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/spec.py", line 1034, in open
      f = self._open(
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 340, in _open
      size = size or self.info(path, **kwargs)["size"]
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 111, in wrapper
      return sync(self.loop, func, *args, **kwargs)
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 96, in sync
      raise return_result
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
      result[0] = await coro
    File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/http.py", line 404, in _info
      raise FileNotFoundError(url) from exc
  FileNotFoundError: https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150117_V02.1_L3.0_RSS.nc

@rabernat / @yuvipanda, does the dataflow runner possibly not have internet access? I am able to access the URLs in question (e.g. https://data.remss.com/ccmp/v02.1.NRT/Y2015/M01/CCMP_RT_Wind_Analysis_20150117_V02.1_L3.0_RSS.nc) from my local computer.
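
One detail worth noting when reading the traceback above: the `FileNotFoundError` is raised `from` the underlying connection error, so the file may well exist and the real problem is reachability. The sketch below reproduces that chaining pattern without any network access and walks `__cause__` back to the root; `fetch_size` and `root_cause` are hypothetical helpers, not fsspec or aiohttp APIs.

```python
def root_cause(exc):
    """Follow explicit `raise ... from ...` links back to the first exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def fetch_size(url):
    """Stand-in for a remote size lookup; no network access is attempted."""
    try:
        raise ConnectionError(f"Cannot connect to host for {url}")
    except ConnectionError as exc:
        # Same pattern as in the traceback: a connection failure is
        # re-raised as FileNotFoundError, keeping the original as __cause__.
        raise FileNotFoundError(url) from exc


try:
    fetch_size("https://example.invalid/file.nc")
except FileNotFoundError as err:
    surface = err
    cause = root_cause(err)

print(type(surface).__name__, "caused by", type(cause).__name__)
```

Inspecting the root cause in this way helps distinguish a missing file from a host that the worker cannot reach.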

@andersy005
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Oct 20, 2022

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors.
That feature is not quite ready yet, however, so please reach out on this thread to a
maintainer, and they'll help you diagnose the problem.

@andersy005
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Oct 20, 2022

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors.
That feature is not quite ready yet, however, so please reach out on this thread to a
maintainer, and they'll help you diagnose the problem.

@andersy005
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Oct 21, 2022

The test failed, but I'm sure we can find out why!

Pangeo Forge maintainers are working diligently to provide public logs for contributors.
That feature is not quite ready yet, however, so please reach out on this thread to a
maintainer, and they'll help you diagnose the problem.

@andersy005
Member

andersy005 commented Oct 21, 2022

Not sure what's going on, but the failure seems to be related to a serialization issue I observed in other feedstocks:

  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
    raise OSError('source code not available')
OSError: source code not available [while running 'Start|cache_input|Reshuffle_000|prepare_target|Reshuffle_001|store_chunk|Reshuffle_002|finalize_target|Reshuffle_003/prepare_target-ptransform-56']
,

  a6170692e70616e67656f2d66-10201802-cyzc-harness-sbr9
      Root cause: Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1877, in <lambda>
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/beam.py", line 14, in _no_arg_stage
    fun(config=config)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 587, in prepare_target
    for k, v in config.get_execution_context().items():
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 59, in get_execution_context
    recipe_hash=self.sha256().hex(),
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 53, in sha256
    return dataclass_sha256(self, ignore_keys=self._hash_exclude_)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 73, in dataclass_sha256
    return dict_to_sha256(d)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 34, in dict_to_sha256
    b = dumps(
  File "/srv/conda/envs/notebook/lib/python3.9/json/__init__.py", line 234, in dumps
    return cls(
  File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 22, in either_encode_or_hash
    return inspect.getsource(obj)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1024, in getsource
    lines, lnum = getsourcelines(object)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1006, in getsourcelines
    lines, lnum = findsource(object)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
    raise OSError('source code not available')
OSError: source code not available

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 284, in _execute
    response = task()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 357, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 597, in do_instruction
    return getattr(self, request_type)(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/sdk_worker.py", line 635, in process_bundle
    bundle_processor.process_bundle(instruction_id))
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1003, in process_bundle
    input_op_by_transform_id[element.transform_id].process_encoded(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/apache_beam/runners/worker/bundle_processor.py", line 227, in process_encoded
    self.output(decoded_value)
  File "apache_beam/runners/worker/operations.py", line 526, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 528, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 237, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1491, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 623, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1581, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1694, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1491, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 623, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1581, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1694, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 240, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 907, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 908, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1419, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1507, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "/usr/local/lib/python3.9/dist-packages/apache_beam/transforms/core.py", line 1877, in <lambda>
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/beam.py", line 14, in _no_arg_stage
    fun(config=config)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 587, in prepare_target
    for k, v in config.get_execution_context().items():
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 59, in get_execution_context
    recipe_hash=self.sha256().hex(),
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/base.py", line 53, in sha256
    return dataclass_sha256(self, ignore_keys=self._hash_exclude_)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 73, in dataclass_sha256
    return dict_to_sha256(d)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 34, in dict_to_sha256
    b = dumps(
  File "/srv/conda/envs/notebook/lib/python3.9/json/__init__.py", line 234, in dumps
    return cls(
  File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/srv/conda/envs/notebook/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/serialization.py", line 22, in either_encode_or_hash
    return inspect.getsource(obj)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1024, in getsource
    lines, lnum = getsourcelines(object)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 1006, in getsourcelines
    lines, lnum = findsource(object)
  File "/srv/conda/envs/notebook/lib/python3.9/inspect.py", line 827, in findsource
    raise OSError('source code not available')
OSError: source code not available [while running 'Start|cache_input|Reshuffle_000|prepare_target|Reshuffle_001|store_chunk|Reshuffle_002|finalize_target|Reshuffle_003/prepare_target-ptransform-56']

@derekocallaghan
Contributor Author

derekocallaghan commented Oct 21, 2022

Hi @andersy005, thanks for the test runs and the update. I guess the good news is that the cache_input stage has been performed so the URL files are being retrieved successfully.

I had a quick look at the recipes that failed yesterday (this one - EOOffshore CCMP, AGDC, LMR...) vs the recipe that ran successfully (eNATL60), and the latter is the only one that doesn't define either a process_input or process_chunk function. The stack trace above appears to be related to the inspect.getsource(obj) call where obj is either recipe.process_input or recipe.process_chunk.
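
A minimal sketch of how `inspect.getsource` can fail for a function whose source file is not on disk, which is one plausible explanation for the worker-side error (an assumption, not something confirmed in the logs). The filename `nonexistent_recipe_sketch.py` and the function body are made up for illustration.

```python
import inspect

SOURCE = "def process_input(ds, fname):\n    return ds\n"

# Compile under a filename that does not exist on disk, mimicking a function
# that arrives on a worker without its original recipe file.
code = compile(SOURCE, "nonexistent_recipe_sketch.py", "exec")
namespace = {}
exec(code, namespace)
func = namespace["process_input"]

# The function itself is perfectly usable...
result = func("dataset", "filename")

# ...but its source text cannot be recovered, so hashing it via
# inspect.getsource raises OSError.
try:
    inspect.getsource(func)
    outcome = "source found"
except OSError as err:
    outcome = f"OSError: {err}"

print(result, "|", outcome)
```

A recipe without `process_input` or `process_chunk` would never reach the `inspect.getsource` call, which is consistent with the observation above that only such a recipe ran successfully.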

I had a look at reproducing the scenario with a combination of code snippets similar to what's used in pangeo_forge_recipes (XarrayZarrRecipe, BaseRecipe etc) and pangeo_forge_runner:

In [43]: from fsspec.implementations.local import LocalFileSystem
    ...: import inspect
    ...: from pangeo_forge_recipes.storage import CacheFSSpecTarget, FSSpecTarget, MetadataTarget, StorageConfig, temporary_storage_config
    ...: from pangeo_forge_runner import Feedstock
    ...: from pathlib import Path
    ...: 
    ...: feedstock = Feedstock(Path(".../staged-recipes/recipes/eooffshore_ics_ccmp_v02_1_nrt_wind"))
    ...: 
    ...: recipes = feedstock.parse_recipes()
    ...: 
    ...: recipes = {k: r.copy_pruned() for k, r in recipes.items()}
    ...: 
    ...: recipe = recipes['eooffshore_ics_ccmp_v02_1_nrt_wind']
    ...: 
    ...: storage_config = temporary_storage_config()
    ...: storage_config.target = FSSpecTarget(LocalFileSystem(), f'./ccmp.zarr')
    ...: storage_config.cache = CacheFSSpecTarget(LocalFileSystem(), './input-cache')
    ...: recipe.storage_config = storage_config
    ...: 

In [44]: recipe
Out[44]: XarrayZarrRecipe(file_pattern=<FilePattern {'time': 2}>, storage_config=StorageConfig(target=FSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='./ccmp.zarr'), cache=CacheFSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='./input-cache'), metadata=MetadataTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x7fbf1b443370>, root_path='/tmp/tmps62mr0gl/urkdQjLc')), inputs_per_chunk=2000, target_chunks={'time': 8000, 'latitude': -1, 'longitude': -1}, cache_inputs=True, copy_input_to_local_file=False, consolidate_zarr=True, consolidate_dimension_coordinates=True, xarray_open_kwargs={}, xarray_concat_kwargs={}, delete_input_encoding=True, process_input=<function ics_wind_speed_direction at 0x7fbf19f7df70>, process_chunk=None, lock_timeout=None, subset_inputs={}, open_input_with_kerchunk=False)

In [45]: recipe.sha256()
Out[45]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'

In [46]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_)
Out[46]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'

In [47]: either_encode_or_hash(recipe.process_input)
Out[47]: 'def ics_wind_speed_direction(ds, fname):\n    """\n    Selects a subset for the Irish Continental Shelf (ICS) region, and computes wind speed and\n    direction for the u and v components in the specified product. Dask arrays are\n    created for delayed execution.\n    """\n    import dask\n    import dask.array as da\n    from datetime import datetime\n    from metpy.calc import wind_direction, wind_speed\n    import xarray as xr\n\n    @dask.delayed\n    def delayed_metpy_fn(fn, u, v):\n        return fn(u, v).values\n\n    # ICS grid\n    geospatial_lat_min = 45.75\n    geospatial_lat_max = 58.25\n    geospatial_lon_min = 333.85\n    geospatial_lon_max = 355.35\n    icds = ds.sel(\n        latitude=slice(geospatial_lat_min, geospatial_lat_max),\n        longitude=slice(geospatial_lon_min, geospatial_lon_max),\n    )\n\n    # Remove subset of original attrs as they\'re no longer relevant\n    for attr in ["base_date", "date_created", "history"]:\n        del icds.attrs[attr]\n\n    # Update the grid attributes\n    icds.attrs.update(\n        {\n            "geospatial_lat_min": geospatial_lat_min,\n            "geospatial_lat_max": geospatial_lat_max,\n            "geospatial_lon_min": geospatial_lon_min,\n            "geospatial_lon_max": geospatial_lon_max,\n        }\n    )\n    u = icds.uwnd\n    v = icds.vwnd\n    # Original wind speed \'units\': \'m s-1\' attribute not accepted by MetPy,\n    # use the unit contained in ERA5 data\n    ccmp_wind_speed_units = u.units\n    era5_wind_speed_units = "m s**-1"\n    u.attrs["units"] = era5_wind_speed_units\n    v.attrs["units"] = era5_wind_speed_units\n\n    variables = [\n        {\n            "name": "wind_speed",\n            "metpy_fn": wind_speed,\n            "attrs": {"long_name": "Wind speed", "units": ccmp_wind_speed_units},\n        },\n        {\n            "name": "wind_direction",\n            "metpy_fn": wind_direction,\n            "attrs": {"long_name": "Wind direction", "units": 
"degree"},\n        },\n    ]\n\n    # CCMP provides u/v at a single height, 10m\n    for variable in variables:\n        icds[variable["name"]] = (\n            xr.DataArray(\n                da.from_delayed(\n                    delayed_metpy_fn(variable["metpy_fn"], u, v), u.shape, dtype=u.dtype\n                ),\n                coords=u.coords,\n                dims=u.dims,\n            )\n            .assign_coords(height=10)\n            .expand_dims(["height"])\n        )\n        icds[variable["name"]].attrs.update(variable["attrs"])\n\n    icds.height.attrs.update(\n        {\n            "long_name": "Height above the surface",\n            "standard_name": "height",\n            "units": "m",\n        }\n    )\n    # Restore units\n    for variable in ["uwnd", "vwnd"]:\n        icds[variable].attrs["units"] = ccmp_wind_speed_units\n\n    icds.attrs["eooffshore_zarr_creation_time"] = datetime.strftime(\n        datetime.now(), "%Y-%m-%dT%H:%M:%SZ"\n    )\n    icds.attrs[\n        "eooffshore_zarr_details"\n    ] = "EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com."\n    return icds\n'

In [48]: inspect.isfunction(recipe.process_input)
Out[48]: True

In [49]: inspect.getsource(recipe.process_input)
Out[49]: 'def ics_wind_speed_direction(ds, fname):\n    """\n    Selects a subset for the Irish Continental Shelf (ICS) region, and computes wind speed and\n    direction for the u and v components in the specified product. Dask arrays are\n    created for delayed execution.\n    """\n    import dask\n    import dask.array as da\n    from datetime import datetime\n    from metpy.calc import wind_direction, wind_speed\n    import xarray as xr\n\n    @dask.delayed\n    def delayed_metpy_fn(fn, u, v):\n        return fn(u, v).values\n\n    # ICS grid\n    geospatial_lat_min = 45.75\n    geospatial_lat_max = 58.25\n    geospatial_lon_min = 333.85\n    geospatial_lon_max = 355.35\n    icds = ds.sel(\n        latitude=slice(geospatial_lat_min, geospatial_lat_max),\n        longitude=slice(geospatial_lon_min, geospatial_lon_max),\n    )\n\n    # Remove subset of original attrs as they\'re no longer relevant\n    for attr in ["base_date", "date_created", "history"]:\n        del icds.attrs[attr]\n\n    # Update the grid attributes\n    icds.attrs.update(\n        {\n            "geospatial_lat_min": geospatial_lat_min,\n            "geospatial_lat_max": geospatial_lat_max,\n            "geospatial_lon_min": geospatial_lon_min,\n            "geospatial_lon_max": geospatial_lon_max,\n        }\n    )\n    u = icds.uwnd\n    v = icds.vwnd\n    # Original wind speed \'units\': \'m s-1\' attribute not accepted by MetPy,\n    # use the unit contained in ERA5 data\n    ccmp_wind_speed_units = u.units\n    era5_wind_speed_units = "m s**-1"\n    u.attrs["units"] = era5_wind_speed_units\n    v.attrs["units"] = era5_wind_speed_units\n\n    variables = [\n        {\n            "name": "wind_speed",\n            "metpy_fn": wind_speed,\n            "attrs": {"long_name": "Wind speed", "units": ccmp_wind_speed_units},\n        },\n        {\n            "name": "wind_direction",\n            "metpy_fn": wind_direction,\n            "attrs": {"long_name": "Wind direction", "units": 
"degree"},\n        },\n    ]\n\n    # CCMP provides u/v at a single height, 10m\n    for variable in variables:\n        icds[variable["name"]] = (\n            xr.DataArray(\n                da.from_delayed(\n                    delayed_metpy_fn(variable["metpy_fn"], u, v), u.shape, dtype=u.dtype\n                ),\n                coords=u.coords,\n                dims=u.dims,\n            )\n            .assign_coords(height=10)\n            .expand_dims(["height"])\n        )\n        icds[variable["name"]].attrs.update(variable["attrs"])\n\n    icds.height.attrs.update(\n        {\n            "long_name": "Height above the surface",\n            "standard_name": "height",\n            "units": "m",\n        }\n    )\n    # Restore units\n    for variable in ["uwnd", "vwnd"]:\n        icds[variable].attrs["units"] = ccmp_wind_speed_units\n\n    icds.attrs["eooffshore_zarr_creation_time"] = datetime.strftime(\n        datetime.now(), "%Y-%m-%dT%H:%M:%SZ"\n    )\n    icds.attrs[\n        "eooffshore_zarr_details"\n    ] = "EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com."\n    return icds\n'

In [50]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_)
Out[50]: b'\xe6\x85\xd6\xfa\xb0\xe7\x8a\x80\xf7ST\xa1M\xed\xae\x8e\x9e\xbe\xe8!d\xb2\xc6)\n\x8f3}}\x85\x7f\xf5'

I guess this doesn't necessarily help to determine why the above succeeds locally but fails in the test runs. However, as we already exclude storage_config when generating the hash, could we also exclude process_input and process_chunk? This won't fix the underlying problem, but it may be a short-term workaround:

In [51]: dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_ + ['process_input'])
Out[51]: b'\x14w\xd5\xcbH\x15At:\x1bY;\xbb/P\x90%t\xf3\xd9T\\P\x9a\xf4\xf5\xa18\x0b\xa7M\xa8'

I.e. use the following in pangeo_forge_recipes.recipes.base.BaseRecipe

_hash_exclude_ = ["process_chunk", "process_input", "storage_config"]

and in pangeo_forge_recipes.serialization.dataclass_sha256()

if k in d:
    del d[k]

Afaict, apart from tests, BaseRecipe.sha256() is currently called in two places: from BaseRecipe.get_execution_context(), which itself is called from XarrayZarrRecipe.prepare_target() (this scenario), and when generating a default job name in pangeo_forge_runner.commands.bake.Bake.
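To illustrate the effect of the proposed exclusion, here is a minimal, self-contained sketch. `ToyRecipe`, `encode` and `toy_dataclass_sha256` are simplified stand-ins invented for this example. They are not the pangeo_forge_recipes implementations, and encoding callables by their source text is only modelled on the `either_encode_or_hash` output shown above.

```python
import hashlib
import inspect
import json
from dataclasses import dataclass, fields
from typing import Callable, Optional, Sequence


@dataclass
class ToyRecipe:
    inputs_per_chunk: int = 1
    process_input: Optional[Callable] = None
    storage_config: Optional[str] = None
    # Not annotated, so this is a class attribute rather than a dataclass field.
    _hash_exclude_ = ["storage_config"]


def encode(value):
    """Encode functions by their source text (the step that can fail)."""
    if inspect.isfunction(value):
        return inspect.getsource(value)
    return value


def toy_dataclass_sha256(obj, ignore_keys: Sequence[str]) -> bytes:
    """Hash the dataclass fields, skipping any key listed in ignore_keys."""
    d = {f.name: getattr(obj, f.name) for f in fields(obj)}
    for k in ignore_keys:
        if k in d:
            del d[k]
    payload = json.dumps({k: encode(v) for k, v in d.items()}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).digest()


# A function built with exec has no retrievable source (illustrative assumption).
namespace = {"__name__": "hypothetical_module"}
exec("def process_input(ds, fname):\n    return ds\n", namespace)
recipe = ToyRecipe(inputs_per_chunk=2000, process_input=namespace["process_input"])

try:
    toy_dataclass_sha256(recipe, ignore_keys=recipe._hash_exclude_)
except OSError as err:
    print("hash failed:", err)

# With process_input excluded, getsource is never called and the hash succeeds.
workaround_hash = toy_dataclass_sha256(
    recipe, ignore_keys=recipe._hash_exclude_ + ["process_input"]
)
print(workaround_hash.hex())
```

The trade-off is that two recipes differing only in their `process_input` function would then produce the same hash, which is why this is only suggested as a short-term workaround.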

@andersy005
Member

I had a quick look at the recipes that failed yesterday (this one - EOOffshore CCMP, AGDC, LMR...) vs the recipe that ran successfully (eNATL60), and the latter is the only one that doesn't define either a process_input or process_chunk function. The stack trace above appears to be related to the inspect.getsource(obj) call where obj is either recipe.process_input or recipe.process_chunk.

Thank you for the informative insights, @derekocallaghan... @rabernat, does this ring any bells for you?

@andersy005
Member

/run eooffshore_ics_ccmp_v02_1_nrt_wind

@pangeo-forge
Contributor

pangeo-forge bot commented Oct 27, 2022

🎉 The test run of eooffshore_ics_ccmp_v02_1_nrt_wind at 663f30c succeeded!

import xarray as xr

store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1255/eooffshore_ics_ccmp_v02_1_nrt_wind.zarr"
ds = xr.open_dataset(store, engine='zarr', chunks={})
ds

@andersy005
Member

@derekocallaghan, the latest run seems to have succeeded: https://pangeo-forge.org/dashboard/recipe-run/1255?feedstock_id=1

Let me know if this is ready, and I'll merge it.

@derekocallaghan
Contributor Author

Hi @andersy005, thanks for retrying the run and for all of the previous runs. I've just run a quick check of this run's store against my corresponding local store, and it looks good:

In [1]: import xarray as xr

In [2]: store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1255/eooffshore_ics_ccmp_v02_1_nrt_wind.zarr"
   ...: ds = xr.open_dataset(store, engine='zarr', chunks={})

In [3]: ds
Out[3]: 
<xarray.Dataset>
Dimensions:         (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
  * height          (height) int64 10
  * latitude        (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
  * longitude       (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
  * time            (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
    nobs            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    uwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    vwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    wind_direction  (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
    wind_speed      (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/38)
    Conventions:                    CF-1.6
    comment:                        none
    contact:                        Remote Sensing Systems, support@remss.com
    contributor_name:               Carl Mears, Joel Scott, Frank Wentz, Ross...
    contributor_role:               Co-Investigator, Software Engineer, Proje...
    creator_email:                  support@remss.com
    ...                             ...
    publisher_email:                support@remss.com
    publisher_name:                 Remote Sensing Systems
    publisher_url:                  http://www.remss.com/
    references:                     Mears et al., Journal of Geophysical Rese...
    summary:                        CCMP_RT V2.1 has been created using the s...
    title:                          RSS CCMP_RT V2.1 derived surface winds (L...

In [4]: ds.time.values
Out[4]: 
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
       '2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
       '2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
       '2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
      dtype='datetime64[ns]')

In [5]: ds.resample(time='D').mean().wind_speed.isel(latitude=0,longitude=0).compute()
Out[5]: 
<xarray.DataArray 'wind_speed' (time: 2, height: 1)>
array([[10.636068],
       [14.07321 ]], dtype=float32)
Coordinates:
  * height     (height) int64 10
    latitude   float32 45.88
    longitude  float32 333.9
  * time       (time) datetime64[ns] 2015-01-16 2015-01-17

In [6]: ds.eooffshore_zarr_details
Out[6]: 'EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com.'

In [7]: dslocal = xr.open_zarr('./eooffshore_ics_ccmp_v02_1_nrt_wind.zarr')

In [8]: dslocal = dslocal.isel(time=slice(0,8))

In [9]: dslocal
Out[9]: 
<xarray.Dataset>
Dimensions:         (height: 1, latitude: 50, longitude: 86, time: 8)
Coordinates:
  * height          (height) int64 10
  * latitude        (latitude) float32 45.88 46.12 46.38 ... 57.62 57.88 58.12
  * longitude       (longitude) float32 333.9 334.1 334.4 ... 354.6 354.9 355.1
  * time            (time) datetime64[ns] 2015-01-16 ... 2015-01-17T18:00:00
Data variables:
    nobs            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    uwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    vwnd            (time, latitude, longitude) float32 dask.array<chunksize=(8, 50, 86), meta=np.ndarray>
    wind_direction  (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
    wind_speed      (height, time, latitude, longitude) float32 dask.array<chunksize=(1, 8, 50, 86), meta=np.ndarray>
Attributes: (12/35)
    Conventions:                    CF-1.6
    comment:                        none
    contact:                        Remote Sensing Systems, support@remss.com
    contributor_name:               Carl Mears, Joel Scott, Frank Wentz, Ross...
    contributor_role:               Co-Investigator, Software Engineer, Proje...
    creator_email:                  support@remss.com
    ...                             ...
    publisher_email:                support@remss.com
    publisher_name:                 Remote Sensing Systems
    publisher_url:                  http://www.remss.com/
    references:                     Mears et al., Journal of Geophysical Rese...
    summary:                        CCMP_RT V2.1 has been created using the s...
    title:                          RSS CCMP_RT V2.1 derived surface winds (L...

In [10]: dslocal.time.values
Out[10]: 
array(['2015-01-16T00:00:00.000000000', '2015-01-16T06:00:00.000000000',
       '2015-01-16T12:00:00.000000000', '2015-01-16T18:00:00.000000000',
       '2015-01-17T00:00:00.000000000', '2015-01-17T06:00:00.000000000',
       '2015-01-17T12:00:00.000000000', '2015-01-17T18:00:00.000000000'],
      dtype='datetime64[ns]')

In [11]: dslocal.resample(time='D').mean().wind_speed.isel(latitude=0,longitude=0).compute()
Out[11]: 
<xarray.DataArray 'wind_speed' (time: 2, height: 1)>
array([[10.63607],
       [14.07321]], dtype=float32)
Coordinates:
  * height     (height) int64 10
    latitude   float32 45.88
    longitude  float32 333.9
  * time       (time) datetime64[ns] 2015-01-16 2015-01-17

In [12]: dslocal.eooffshore_zarr_details
Out[12]: 'EOOffshore Project: Concatenated CCMP v0.2.1.NRT 6-hourly wind products provided by Remote Sensing Systems (RSS), for Irish Continental Shelf. Wind speed and direction have been calculated from the uwnd and vwnd variables. CCMP Version-2 vector wind analyses are produced by Remote Sensing Systems. Data are available at www.remss.com.'
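The two reprs above report different global attribute counts (38 for the test-run store, 35 for the local one). A small helper along the following lines could list which attributes differ. `diff_attrs` is a hypothetical name, and the dictionaries below are made-up placeholders rather than the real attributes. In the session above it would be called as `diff_attrs(ds.attrs, dslocal.attrs)`.

```python
def diff_attrs(remote, local):
    """Summarise differences between two attribute mappings.

    Assumes attribute values are scalars or strings, so that != yields a bool.
    """
    remote_keys, local_keys = set(remote), set(local)
    return {
        "only_remote": sorted(remote_keys - local_keys),
        "only_local": sorted(local_keys - remote_keys),
        "changed": sorted(
            k for k in remote_keys & local_keys if remote[k] != local[k]
        ),
    }


# Placeholder attributes for illustration only.
remote_attrs = {"Conventions": "CF-1.6", "comment": "none", "extra_key": "x"}
local_attrs = {"Conventions": "CF-1.6", "comment": "changed"}
print(diff_attrs(remote_attrs, local_attrs))
```

A time-dependent attribute such as `eooffshore_zarr_creation_time`, which the recipe sets from `datetime.now()`, would be expected to appear under `changed` in a comparison like this.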

andersy005 merged commit e113bf9 into pangeo-forge:master Oct 31, 2022

@andersy005
Member

Thank you, @derekocallaghan! Just merged this. For any subsequent issues/discussions, let's have them over in https://github.com/pangeo-forge/eooffshore_ics_ccmp_v02_1_nrt_wind-feedstock

Labels
blocked:requires-dind Requires Docker in Docker on backend to deploy, due to some import at recipe top level.