From bbafee31320a8a5c4b9d0ba1eac76c45002dcbc0 Mon Sep 17 00:00:00 2001
From: Maaike
Date: Fri, 10 Jan 2025 13:26:14 +0100
Subject: [PATCH] add/update csv2bufr plugin documentation (#839)

---
 .../running/data-pipeline-plugins.rst         | 18 ++++--
 docs/source/user/data-ingest.rst              | 57 ++++++++-----------
 2 files changed, 35 insertions(+), 40 deletions(-)

diff --git a/docs/source/reference/running/data-pipeline-plugins.rst b/docs/source/reference/running/data-pipeline-plugins.rst
index 3376957f..5bae6bb0 100644
--- a/docs/source/reference/running/data-pipeline-plugins.rst
+++ b/docs/source/reference/running/data-pipeline-plugins.rst
@@ -18,13 +18,16 @@ Default pipeline plugins
 wis2box provides a number of data pipeline plugins by default, which users can be used "out of the box". The list below
 describes each plugin and provides an example data mappings configuration.
 
+.. _csv2bufr-plugin:
+
 ``wis2box.data.csv2bufr.ObservationDataCSV2BUFR``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-This plugin converts CSV observation data into BUFR using ``csv2bufr``. A csv2bufr template
-can be configured to process the data accordingly. In addition, ``file-pattern`` can be used
-to filter on incoming data based on a regular expression. Consult the `csv2bufr`_ documentation
-for more information on configuration and templating.
+This plugin converts CSV observation data into BUFR using `csv2bufr`_.
+
+A `template` is required to convert the CSV columns to BUFR encoded values. See `csv2bufr-examples`_ on how to create a template or use one of the built-in templates.
+
+A ``file-pattern`` is used to filter on incoming data based on a regular expression.
 
 A typical csv2bufr plugin workflow definition would by defined as follows:
 
@@ -38,7 +41,7 @@ A typical csv2bufr plugin workflow definition would by defined as follows:
 
 The default templates are defined by the `csv2bufr-templates`_ repository.
 
-In the case the user wants to use a custom template, the template should be located in the ``$WIS2BOX_HOST_DATADIR/mappings`` directory.
+To use a custom template, the template should be located in the ``$WIS2BOX_HOST_DATADIR/mappings`` directory and the `wis2box.env` file should include `CSV2BUFR_TEMPLATES=${WIS2BOX_DATADIR}/mappings`.
 
 The plugin configuration would then be defined as follows:
 
@@ -50,6 +53,7 @@ The plugin configuration would then be defined as follows:
                     notify: true # trigger GeoJSON publishing for API and UI
                     file-pattern: '^.*\.csv$'
 
+Environment variables can be set in `wis2box.env` to customize the behavior of the csv2bufr-plugin within the wis2box, see `csv2bufr-environment-variables`_ for the full list of environment variables.
 
 ``wis2box.data.bufr4.ObservationDataBUFR2GeoJSON``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -165,7 +169,9 @@ For example, to publish GRIB2 data matching the file-pattern ``^.*_(\d{8})\d{2}.
 
 See :ref:`data-mappings` for a full example data mapping configuration.
 
-.. _`csv2bufr`: https://csv2bufr.readthedocs.io
+.. _`csv2bufr`: https://csv2bufr.readthedocs.io/en/v0.8.5/
+.. _`csv2bufr-examples`: https://csv2bufr.readthedocs.io/en/v0.8.5/example.html
+.. _`csv2bufr-environment-variables`: https://csv2bufr.readthedocs.io/en/v0.8.5/installation.html#environment-variables
 .. _`csv2bufr-templates`: https://github.com/wmo-im/csv2bufr-templates
 .. _`bufr2geojson`: https://github.com/wmo-im/bufr2geojson
 .. _`synop2bufr`: https://synop2bufr.readthedocs.io
diff --git a/docs/source/user/data-ingest.rst b/docs/source/user/data-ingest.rst
index fdf88dc2..f6f5a9f4 100644
--- a/docs/source/user/data-ingest.rst
+++ b/docs/source/user/data-ingest.rst
@@ -10,29 +10,8 @@ The wis2box storage is provided using a `MinIO`_ container that provides S3-comp
 Any file received in the ``wis2box-incoming`` storage bucket will trigger an action to process the file.
 What action to take is determined by the data mappings that were setup in the previous section.
 
-wis2box-webapp
---------------
-
-The wis2box-webapp is a web application that includes the following forms for data validation and ingestion:
-
-* user interface to ingest `FM-12 SYNOP data `_
-* user interface to ingest CSV data using the :ref:`AWS template`
-
-The wis2box-webapp is available on your host at `http:///wis2box-webapp`.
-
-Interactive data ingestion requires an execution token, which can be generated using the ``wis2box auth add-token`` command inside the wis2box-management container:
-
-.. code-block:: bash
-
-    python3 wis2box-ctl.py login
-    wis2box auth add-token --path processes/wis2box
-
-.. note::
-
-    Be sure to record the token value, as it will not be shown again. If you lose the token, you can generate a new one.
-
 data mappings plugins
----------------------
+^^^^^^^^^^^^^^^^^^^^^
 
 The plugins you have configured for your dataset mappings will determine the actions taken when data is received in the MinIO storage bucket.
 
@@ -40,22 +19,11 @@ The wis2box provides 3 types of built-in plugins to publish data in BUFR format:
 
 * `bufr2bufr` : the input is received in BUFR format and split by subset, where each subset is published as a separate bufr message
 * `synop2bufr` : the input is received in `FM-12 SYNOP format `_ and converted to BUFR format. The year and month are extracted from the file pattern
-* `csv2bufr` : the input is received in CSV format and converted to BUFR format, a mapping template is used to convert the CSV columns to BUFR encoded values. Custom mapping templates need to be placed in the ``$WIS2BOX_HOST_DATADIR/mappings`` directory. See :ref:`csv2bufr-templates` for examples of mapping templates
+* `csv2bufr` : the input is received in CSV format and converted to BUFR format, a mapping template is required to convert the CSV columns to BUFR encoded values. See :ref:`csv2bufr-plugin` for information on how to configure the csv2bufr plugin.
 
 To publish data for other data formats you can use the 'Universal' plugin, which will pass through the data without any conversion.
 Please note that you will need to ensure that the date timestamp can be extracted from the file pattern when using this plugin.
 
-.. _aws-template:
-
-The AWS template in csv2bufr plugin
------------------------------------
-
-When using the csv2bufr plugin, the columns are mapped to BUFR encoded values using a template as defined in the repository `csv2bufr-templates`_.
-
-An example of a CSV file that can be ingested using the 'AWS' mappings template can be downloaded here :download:`AWS-example <../_static/aws-example.csv>`
-
-The CSV columns description of the AWS template can be downloaded here :download:`AWS-reference <../_static/aws-minimal.csv>`
-
 MinIO user interface
 --------------------
 
@@ -194,6 +162,27 @@ For example using the command line from the host running wis2box:
     put /path/to/your/datafile.csv wis2box-incoming/urn:wmo:md:it-meteoam:surface-weather-observations.synop
     EOF
 
+wis2box-webapp
+--------------
+
+The wis2box-webapp is a web application that includes the following forms for data validation and ingestion:
+
+* user interface to ingest `FM-12 SYNOP data `_
+* user interface to ingest CSV data using the csv2bufr-plugin and using the predefined "AWS-template" mapping.
+
+The wis2box-webapp is available on your host at `http:///wis2box-webapp`.
+
+Interactive data ingestion requires an execution token, which can be generated using the ``wis2box auth add-token`` command inside the wis2box-management container:
+
+.. code-block:: bash
+
+    python3 wis2box-ctl.py login
+    wis2box auth add-token --path processes/wis2box
+
+.. note::
+
+    Be sure to record the token value, as it will not be shown again. If you lose the token, you can generate a new one.
+
 wis2box-data-subscriber
 -----------------------
 