This repository provides various utility scripts for downloading Daymet datasets [1, 2]. Originally, this project was intended to prepare Daymet surface data for basins of the CAMELS-US dataset [3]. However, the scripts provided in this repo can be used for any other geospatial objects with polygon geometry.
Daymet data contain gridded estimates of daily weather and climatology parameters on a 1 km x 1 km raster for North America, Hawaii, and Puerto Rico. Daymet Version 3 and Version 4 data are provided by ORNL DAAC and can be accessed via ORNL DAAC's Thematic Real-time Environmental Distributed Data Services (THREDDS).
To install the Daymet PyProcessing package, just use pip. To install the latest version of the package, run:
python3 -m pip install git+https://github.com/SebaDro/daymet-pyprocessing.git
It is also possible to clone this repository to your local machine and install the package from your local copy:
python3 -m pip install -e .
For development purposes, you can use either Conda or pip to install all required dependencies. To this end, this repo comes with an environment.yml and a requirements.txt, respectively.
Note that this project depends on GeoPandas, which may not install all required dependencies on some operating systems. In this case, you'll find installation instructions in the GeoPandas documentation.
In order to run the download_daymet script, you have to provide a configuration file which controls the download process. You'll find exemplary config files inside ./config which you can use as a starting point. The download script supports two modes: download for multiple areas based on a geo file and download for a fixed bounding box.
Prepare a config file as stated above and run the download_daymet script with the path to the config file as the only positional argument:
download_daymet ./config/download-config.yml
The script will download Daymet datasets via the NetCDF Subset Service (NCSS) for each geospatial object present in the provided geo file and indicated by the ids in the config file. To do so, the bounding box of each geospatial object as well as the specified variable and timeframe will be used as request parameters.
This mode takes the polygonal geometries of different basins or other geospatial features from a geo file and downloads Daymet data for each geometry based on its bounding box. Daymet files will be downloaded for the specified variable.
Config parameter | Description |
---|---|
loggingConfig | Path to a logging configuration file. This must be a YAML file according to the Python logging dictionary schema. |
geo.file | Path to a file that contains geospatial data. The file must be in a data format that can be read by GeoPandas and should contain polygon geometries with WGS84 coordinates, which will be used for requesting Daymet data. |
geo.idCol | Name of the column that contains unique identifiers for the geospatial objects. |
geo.ids | IDs of the geospatial objects used for requesting Daymet data. If None, all geospatial objects from the geo.file will be considered. |
readTimeout | Sets a read timeout for the download. |
singleFileStorage | For true, the downloaded yearly Daymet datasets will be concatenated along the time dimension and stored within a single file for each geospatial object. For false, the downloaded yearly Daymet datasets will be stored within separate files for each object and year. |
timeFrame | startTime and endTime in UTC time for requesting Daymet data. |
outputDir | Path to the output directory. Downloaded datasets will be stored here. |
variable | Data variable that should be included in the downloaded Daymet datasets. |
version | Version of the Daymet dataset to be downloaded. |
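To make the parameters concrete, a minimal config sketch for this mode might look as follows. The parameter names are taken from the table above, but all values (paths, IDs, timestamps) as well as the exact YAML nesting (the dot notation in the table read as nested keys, e.g. startTime/endTime under timeFrame) are illustrative assumptions; compare with the exemplary files in ./config:

```yaml
# Hypothetical download config for the geo-file mode; all values are placeholders.
loggingConfig: ./config/logging.yml   # YAML file following the Python logging dict schema
geo:
  file: ./data/basins.geojson         # polygon geometries with WGS84 coordinates
  idCol: gauge_id                     # column holding unique object identifiers
  ids: ["01013500", "01022500"]       # subset of objects; omit to consider all
readTimeout: 120                      # read timeout for the download (assumed to be seconds)
singleFileStorage: true               # concatenate yearly datasets along the time dimension
timeFrame:
  startTime: "1980-01-01T00:00:00"    # UTC
  endTime: "1980-12-31T23:59:59"      # UTC
outputDir: ./output/daymet
variable: prcp                        # a Daymet variable, e.g. prcp, tmax, tmin, srad, vp, swe, dayl
version: 4                            # Daymet version to download
```

Such a file would then be passed to the download_daymet script as shown above.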
This mode takes a bbox parameter that will be directly used for downloading Daymet files for the specified variable.
Config parameter | Description |
---|---|
loggingConfig | Path to a logging configuration file. This must be a YAML file according to the Python logging dictionary schema. |
bbox | A static bbox used for downloading Daymet files. Format: [minLon, minLat, maxLon, maxLat] (e.g. [-73.73, 40.93, -73.72, 40.94]) |
readTimeout | Sets a read timeout for the download. |
singleFileStorage | For true, the downloaded yearly Daymet datasets will be concatenated along the time dimension and stored within a single file for each geospatial object. For false, the downloaded yearly Daymet datasets will be stored within separate files for each object and year. |
timeFrame | startTime and endTime in UTC time for requesting Daymet data. |
outputDir | Path to the output directory. Downloaded datasets will be stored here. |
variable | Data variable that should be included in the downloaded Daymet datasets. |
version | Version of the Daymet dataset to be downloaded. |
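Analogously, a sketch for the fixed bounding box mode; again, all values are placeholders:

```yaml
# Hypothetical download config for the fixed bounding box mode; values are placeholders.
loggingConfig: ./config/logging.yml
bbox: [-73.73, 40.93, -73.72, 40.94]  # [minLon, minLat, maxLon, maxLat]
readTimeout: 120                      # assumed to be seconds
singleFileStorage: true
timeFrame:
  startTime: "1980-01-01T00:00:00"    # UTC
  endTime: "1980-12-31T23:59:59"      # UTC
outputDir: ./output/daymet
variable: tmax
version: 4
```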
This repo also comes with some processing routines. Up to now, it supports combining, clipping and aggregating Daymet NetCDF files. You can control the processing of Daymet data via the process_daymet script by providing a configuration file. You'll find different exemplary files inside ./config which you can use as a starting point.
Prepare a config file as stated above and run the process_daymet script with the operation that should be applied to the Daymet files as the first positional argument, followed by the path to the config file:
process_daymet {operation} ./config/processing-config.yml
The combine operation discovers multiple Daymet NetCDF files which have been downloaded with the download_daymet script and merges those files that refer to the same basin. NetCDF files with the same basin ID as file name prefix will be handled as related files and merged.
In order to discover all relevant files, folder structure and file naming must follow the conventions mentioned below:
{data_dir}/{variable}/{id}/{id}_daymet_v4_daily_na_{variable}_*.nc
{data_dir}/{variable}/{id}/{id}_daymet_v3_{variable}_*_na.nc4
These patterns follow the naming style of single downloaded files produced by the download_daymet script.
Config parameter | Description |
---|---|
dataDir | Path of the data directory which contains the Daymet NetCDF files. Only files which are stored according to a certain folder structure (see the patterns above) within this directory will be considered for processing. |
loggingConfig | Path to a logging configuration file. This must be a YAML file according to the Python logging dictionary schema. |
outputDir | Path to the output directory. Processing results will be stored here. |
ids | Identifiers used to determine which Daymet files should be considered for processing. Leave empty if all Daymet files inside the dataDir should be considered. |
outputFormat | Format for storing the results. Supported: netcdf, zarr |
version | Version of the Daymet datasets. |
operationParameters.variables | Only those Daymet datasets that contain these variables will be considered for processing. |
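As a concrete illustration, a processing config for the combine operation might look like the following sketch. The parameter names follow the table above, while all paths, IDs, and values are hypothetical:

```yaml
# Hypothetical processing config for the combine operation; values are placeholders.
dataDir: ./output/daymet              # directory holding the downloaded NetCDF files
loggingConfig: ./config/logging.yml
outputDir: ./output/combined
ids: ["01013500"]                     # leave empty to consider all files in dataDir
outputFormat: netcdf                  # netcdf or zarr
version: 4
operationParameters:
  variables: [prcp, tmax, tmin]       # only datasets for these variables are combined
```

Such a file would be passed as, e.g., process_daymet combine ./config/processing-config.yml.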
The clip operation clips Daymet data for given polygonal geometries stored in a geo file.
In order to discover all relevant files, folder structure and file naming must follow the conventions mentioned below:
{data_dir}/{id}_daymet_v4_daily_na.nc
{data_dir}/{id}_daymet_v3_na.nc4
These patterns follow the naming style for stored results of the combine operation.
Config parameter | Description |
---|---|
dataDir | Path of the data directory which contains the Daymet NetCDF files. Only files which are stored according to a certain folder structure (see the patterns above) within this directory will be considered for processing. |
loggingConfig | Path to a logging configuration file. This must be a YAML file according to the Python logging dictionary schema. |
outputDir | Path to the output directory. Processing results will be stored here. |
ids | Identifiers used to determine which Daymet files should be considered for processing. Leave empty if all Daymet files inside the dataDir should be considered. |
outputFormat | Format for storing the results. Supported: netcdf, zarr |
version | Version of the Daymet datasets. |
operationParameters.geomPath | Path to the file that contains polygonal geometries. |
operationParameters.idCol | Name of the ID column within the geo file. |
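A corresponding sketch for the clip operation; only the operationParameters block differs from the combine example, and all paths and column names are again hypothetical:

```yaml
# Hypothetical processing config for the clip operation; values are placeholders.
dataDir: ./output/combined            # NetCDF files produced by the combine operation
loggingConfig: ./config/logging.yml
outputDir: ./output/clipped
ids: []                               # empty: consider all files in dataDir
outputFormat: netcdf
version: 4
operationParameters:
  geomPath: ./data/basins.geojson     # file containing the polygonal geometries
  idCol: gauge_id                     # ID column within the geo file
```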
The aggregate operation calculates the mean, min or max of Daymet data across the combined 'x' and 'y' dimensions.
In order to discover all relevant files, folder structure and file naming must follow the conventions mentioned below:
{data_dir}/{id}_daymet_v4_daily_na.nc
{data_dir}/{id}_daymet_v3_na.nc4
These patterns follow the naming style for stored results of the combine operation.
Config parameter | Description |
---|---|
dataDir | Path of the data directory which contains the Daymet NetCDF files. Only files which are stored according to a certain folder structure (see the patterns above) within this directory will be considered for processing. |
loggingConfig | Path to a logging configuration file. This must be a YAML file according to the Python logging dictionary schema. |
outputDir | Path to the output directory. Processing results will be stored here. |
ids | Identifiers used to determine which Daymet files should be considered for processing. Leave empty if all Daymet files inside the dataDir should be considered. |
outputFormat | Format for storing the results. Supported: netcdf, zarr |
version | Version of the Daymet datasets. |
operationParameters.aggregationMode | Defines which aggregation operation should be performed. Supported: mean, min, max |
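Finally, a sketch for the aggregate operation, where aggregationMode is the only operation-specific parameter; values are again placeholders:

```yaml
# Hypothetical processing config for the aggregate operation; values are placeholders.
dataDir: ./output/clipped
loggingConfig: ./config/logging.yml
outputDir: ./output/aggregated
ids: []
outputFormat: netcdf
version: 4
operationParameters:
  aggregationMode: mean               # supported: mean, min, max
```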
[1] Thornton, P.E., M.M. Thornton, B.W. Mayer, Y. Wei, R. Devarakonda, R.S. Vose, and R.B. Cook. 2016. Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Version 3. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1328
[2] Thornton, M.M., R. Shrestha, Y. Wei, P.E. Thornton, S. Kao, and B.E. Wilson. 2020. Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Version 4. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1840
[3] Newman, A., Sampson, K., Clark, M. P., Bock, A., Viger, R. J., Blodgett, D. (2014). A large-sample watershed-scale hydrometeorological dataset for the contiguous USA. Boulder, CO: UCAR/NCAR. https://dx.doi.org/10.5065/D6MW2F4D