Aviation Message Archiver

An application that reads aviation message files, parses some basic properties and stores messages along with parsed metadata in a database.

Overview

Aviation Message Archiver is a Spring Boot service application. It monitors configured input directories for aviation message files. Whenever new files appear, it scans them for messages and parses some basic properties, such as the message (issue) time, the validity period (when applicable) and the station / location indicator. It stores each message in a PostGIS (PostgreSQL) database.

Feature overview

  • Configurable archival process.
  • Automatic message type recognition.
  • Supports TAC (experimental) and IWXXM message formats.
  • Supported file formats:
    • WMO GTS meteorological message as specified by WMO Doc. 386.
      • Multiple meteorological messages may be included in a file.
      • A COLLECT document is also supported as message content.
    • COLLECT 1.2 document
    • A meteorological bulletin starting with a WMO GTS abbreviated heading line, followed by messages or a COLLECT document
    • A single message (not recommended for TAC messages)
  • Supported database engines:
    • PostGIS (PostgreSQL)
    • H2 (in-memory)
  • Traceable logging. Support for structured JSON logging.
  • Built on Spring Boot and Spring Integration.

Supported message types and formats

Supported message types and formats are listed in the table below. Generally, this application supports all

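| Message type | TAC | IWXXM 2.1 | IWXXM 3.0 | IWXXM 2023-1 |
|---|---|---|---|---|
| METAR | 1) | + | + | 1) |
| SPECI | 1) | + | + | 1) |
| TAF | 1) | + | + | - |
| SIGMET | - | + | + | + |
| AIRMET | - | + | + | + |
| Volcanic Ash Advisory | 1) | n/a | + | - |
| Tropical Cyclone Advisory | - | n/a | + | - |
| Space Weather Advisory | 1) | n/a | + | - |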

+ Complete support
- Unsupported
1) Experimental

Getting started

The following steps guide you through testing the application with the example configuration, using either the H2 (in-memory) or the PostGIS database engine.

  1. After cloning the code repository, build the application with Maven.

    mvn package

  2. Set up the database engine.

    • H2: The database is set up automatically at application startup; no action is needed.
    • PostGIS: The database is easily set up with Docker or Podman. Use the credentials specified by the spring.datasource.* properties in the application.yml configuration for profile local & postgresql & !openshift.
      docker run \
        -p 127.0.0.1:5432:5432 \
        --env POSTGRES_USER=avidb_agent \
        --env POSTGRES_PASSWORD=secret \
        --env POSTGRES_DB=avidb \
        --name avidb \
        docker.io/postgis/postgis:14-3.2

    In the local mode used in this guide, the application will automatically initialize the schema at startup.

  3. Prepare an SQL script to populate the avidb_stations table. This is optional for testing the application, but all messages will be rejected without a matching location indicator in the icao column of the avidb_stations table.

  4. Start the application. Replace

    • $AVIDB_STATIONS_SQL with a path to the file created in the previous step, or omit the spring.sql.init.data-locations property.
    • $DB_ENGINE with h2 or postgresql.
    java \
      -Dspring.profiles.active="local,example,$DB_ENGINE" \
      -Dspring.sql.init.data-locations="\${example.spring.sql.init.data-locations.$DB_ENGINE},file://$AVIDB_STATIONS_SQL" \
      -jar target/aviation-message-archiver-1.1.1-SNAPSHOT-bundle.jar

  5. Check some actuator endpoints to see that the application is running and healthy.

  6. Copy some message files into the input directories specified by the production-line.products[n].input-dir properties in the application.yml configuration file.

  7. After an input file is processed, the application moves it to one of the target directories specified by the production-line.products[n].archive-dir and production-line.products[n].fail-dir properties in the application.yml configuration file. The processing identifier is appended to the file name.

  8. Connect to the database.

    • H2: You can access the H2 database console at http://localhost:8080/h2-console/login.jsp with default connection settings and credentials.
    • PostGIS: Use psql or any appropriate client with connection information provided in the database setup step.

    Look at the archived and rejected message tables in the database for any messages. E.g.

SELECT *
FROM avidb_messages AS messages
       LEFT JOIN avidb_message_iwxxm_details AS iwxxm ON iwxxm.message_id = messages.message_id
ORDER BY message_time DESC;

SELECT *
FROM avidb_rejected_messages AS r_messages
       LEFT JOIN avidb_rejected_message_iwxxm_details AS r_iwxxm
                 ON r_iwxxm.rejected_message_id = r_messages.rejected_message_id
ORDER BY message_time DESC;

Logging

Application logging is designed to enable tracking of file processing and possible errors during the process. This section describes implemented mechanisms supporting this goal.

Processing identifier and phase

Processing identifier is a unique string within a single Java virtual machine that identifies a single processing of a single input file. It connects all separate log entries of the file processing. Processing identifier 0 or null denotes that a processing identifier is not available or applicable on the log entry. See FileProcessingIdentifier class for details.

Processing phase describes the phase in the processing chain. Processing phase nul or null denotes that the processing phase information is not available or applicable on the log entry. See ProcessingPhase enum for list of phases.

When using unstructured logging, the processing identifier and phase are logged before logging level in format:

[<processing-id>:<phase>] <level> ...

Processing context

Most log entries contain processing context information, referring to the file being processed, the bulletin within the file and the message within the bulletin. The format of the processing context in a log message is:

<fileReference:bulletinReference:messageReference>

which expands to:

<productId/fileName:bulletinIndex(bulletinHeading)@bulletinCharIndex:messageIndex(messageExcerpt)>

where

  • productId: identifier of the product specifying the file under processing
  • fileName: name of the file under processing
  • bulletinIndex: index of bulletin within file, starting from 0
  • bulletinHeading: heading (GTS or collect identifier) of the bulletin under processing
  • bulletinCharIndex: character index of bulletin within file, starting from 0
  • messageIndex: index of message within bulletin, starting from 0
  • messageExcerpt: excerpt from beginning of a TAC message, or IWXXM message gml:id

Unavailable or inapplicable fields and separators are omitted.

See ReadableLoggingContext interface for more information.

Processing statistics

When file processing is finished, the application logs statistics of the processing in the following format:

M{X:n,...,T:n} B{X:n,...,T:n} F{X}

where

  • M{} contains statistics counted as messages within the file.
  • B{} contains statistics counted as bulletins within the file.
  • F{} contains the overall result of the file processing.
  • X:n contains the count of items having the specified processing result. Any processing results with zero (0) count are omitted.
  • T:n contains the total count n of processed items.

Aggregated results denote the worst processing result of a message within the aggregation context. E.g. a bulletin is considered rejected when it contains at least one rejected message, but no failed messages.

See ReadableFileProcessingStatistics interface for more information.

Structured logging

Structured JSON logging can be enabled by activating the logstash runtime profile in the startup command. E.g.

java -Dspring.profiles.active=<other profiles...>,logstash ...

Application configuration

Application configuration properties are collected in a YAML file called application.yml. The provided configuration file is a base configuration that also acts as an example, and you can use it as a starting point for your own application configuration file. Because the provided base configuration file is loaded as well, your custom configuration file only needs to contain the properties you add or override. See External Application Properties in the Spring Boot reference documentation for instructions on how to apply your custom configuration file.
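As an illustration, a custom configuration file that overrides only the database connection might look like the sketch below. The spring.datasource.* keys are standard Spring Boot properties. The values are assumptions that mirror the example PostGIS container from the Getting started section and need to be adapted to your environment.

```yaml
# Sketch of a custom configuration file: only overridden properties are listed,
# everything else is assumed to come from the provided base application.yml.
spring:
  datasource:
    # Assumed connection values, matching the example PostGIS container.
    url: jdbc:postgresql://localhost:5432/avidb
    username: avidb_agent
    password: secret
```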

Runtime behavior is controlled using Spring profiles which are activated by the application launch command. Profiles declared in the provided configuration are described in the application.yml file.

The most relevant part of the configuration file is the production line configuration under production-line property. Its contents are described below.

Note: Invalid configuration does not necessarily raise an error on startup, but may be silently ignored.

Products

The aviation product model describes basic information about a product, such as its input and output directories and the accepted file name patterns. This application can be configured for one or more products. Template of the products configuration:

production-line:
  products:
    - id: <product id>
      route: <route name>
      input-dir: <input directory>
      archive-dir: <archived files directory>
      fail-dir: <directory for failed files>
      files:
        - pattern: <file name regex pattern>
          name-time-zone: <zone of timestamp in file name>
          format: <message format>
        - ...
    - ...

See application.yml file for a documented configuration example. More detailed documentation on individual properties can be found in the AviationProduct model class.
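To make the template more concrete, a filled-in product could look roughly like the sketch below. All values are hypothetical: the product id, route name, directories and file name patterns are made up for illustration, and the route name is assumed to be one that is mapped under route-ids. Whether a pattern must also capture timestamp fields from the file name is not covered here; consult the AviationProduct model class and the provided application.yml for the actual requirements.

```yaml
production-line:
  products:
    - id: example1                        # hypothetical product id
      route: DEFAULT                      # hypothetical route name, assumed to be mapped in route-ids
      input-dir: /var/avidb/example1/input      # hypothetical directories
      archive-dir: /var/avidb/example1/archive
      fail-dir: /var/avidb/example1/failed
      files:
        # Hypothetical patterns: text files carry TAC, XML files carry IWXXM.
        - pattern: '^.*\.txt$'
          name-time-zone: Z
          format: TAC
        - pattern: '^.*\.xml$'
          name-time-zone: Z
          format: IWXXM
```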

Message populators

Message populators are small classes that are responsible for populating the archive data object with message data parsed from the input file. Each message populator focuses on a single responsibility, and together all configured populators construct the complete message entity to be archived. Message populators also decide whether a message is considered

  • eligible for archival, thus will be stored in the message database table after all populators are executed,
  • rejected, thus will be stored in the rejected message database table after all populators are executed,
  • discarded, thus will be ignored immediately and logged at info level,
  • failed, thus will be ignored immediately and logged at error level.

Message populator configuration specifies which populators are executed and in which order. A message populator executed later in the execution chain may then override values set by previously executed populators. The configuration applies to all products. Any populator may be configured to execute conditionally.

Template of message populators configuration:

production-line:
  message-populators:
    - # Name of message populator to execute (mandatory)
      name: <populator name>
      # Optional activation conditions. The specified populator is executed only when all of
      # provided activation conditions are satisfied. Omit to execute unconditionally.
      activate-on:
        <activation property>:
          <activation operator>: <activation operand>
          ...
        ...
      # Map of populator-specific configuration options. Some of them may be mandatory.
      # May be omitted, when no configuration options are given.
      config:
        <config property name>: <config property value>
        ...
    - ...

By convention, the name of a message populator is generally the same as the simple name of its class. For example, the name of the fi.fmi.avi.archiver.message.populator.MessageDataPopulator class is MessageDataPopulator in the configuration.

A base configuration is provided in the application.yml file as an example.
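As a rough sketch of how execution order and conditional activation work together, the fragment below chains three of the bundled populators described in the following sections. It only uses configuration options documented in this README. The chosen order and the METAR fallback are assumptions made for illustration, not the contents of the provided application.yml.

```yaml
production-line:
  message-populators:
    # Executed first: populate values parsed from the bulletin heading.
    - name: BulletinHeadingDataPopulator
      config:
        bulletin-heading-sources:
          - GTS_BULLETIN_HEADING
          - COLLECT_IDENTIFIER
    # Executed second: values parsed from message content may override
    # values set by the previous populator.
    - name: MessageDataPopulator
    # Executed last, and only when no message type has been populated so far.
    - name: FixedTypePopulator
      activate-on:
        type:
          presence: EMPTY
      config:
        type: METAR
```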

Bundled message populators

This application comes with a handful of bundled message populators. Some of them, like MessageDataPopulator, play an essential role in the archival process. The provided application.yml has an example configuration of these. Others, like FixedProcessingResultPopulator or StationIcaoCodeReplacer, are provided for customized message handling, along with the possibility for conditional activation. One message populator, StationIdPopulator, cannot be configured, but is implicitly active.

Available populators are listed below (in alphabetical order by name). These are declared for use in the MessagePopulatorFactoryConfig class.

BulletinHeadingDataPopulator

Populate properties parsed from input bulletin heading.

  • name: BulletinHeadingDataPopulator
  • config:
    • bulletin-heading-sources (optional) - List of bulletin heading sources in preferred order.
      Available values are specified in BulletinHeadingSource enum.
      Example:
      bulletin-heading-sources:
        - GTS_BULLETIN_HEADING
        - COLLECT_IDENTIFIER

FileMetadataPopulator

Populate properties available in input file metadata.

FileNameDataPopulator

Populate properties parsed from file name.

FixedDurationValidityPeriodPopulator

Set validity period to a fixed duration period starting from message time.

FixedProcessingResultPopulator

Set processing result to specified value.

FixedRoutePopulator

Set route to specified value.

FixedTypePopulator

Set message type to specified value.

  • name: FixedTypePopulator
  • config:
    • type (mandatory) - Message type name.
      Available values are specified in the production-line.type-ids identifier mapping property.
      Example:
      type: METAR

MessageContentTrimmer

Trim whitespace around message content.

MessageDataPopulator

Populate properties parsed from message content.

  • name: MessageDataPopulator
  • config:
    • message-type-location-indicator-types (optional) - Message type-specific list of location indicator types in order of preference for reading the station ICAO code.
      Available message types are specified in the map property production-line.type-ids. Available location indicator types are specified in GenericAviationWeatherMessage.LocationIndicatorType enum.
      Example:
      message-type-location-indicator-types:
        - AIRMET:
            - ISSUING_AIR_TRAFFIC_SERVICES_REGION
        - METAR:
            - AERODROME
        - SIGMET:
            - ISSUING_AIR_TRAFFIC_SERVICES_REGION
        - SPECI:
            - AERODROME
        - SPACE_WEATHER_ADVISORY: [ ]
        - TAF:
            - AERODROME
        - TROPICAL_CYCLONE_ADVISORY: [ ]
        - VOLCANIC_ASH_ADVISORY: [ ]
    • default-location-indicator-types (optional) - Default list of location indicator types in order of preference for reading the station ICAO code.
      Only used when the message type-specific list is not configured. Available location indicator types are specified in GenericAviationWeatherMessage.LocationIndicatorType enum.
      Example:
      default-location-indicator-types:
        - AERODROME
        - ISSUING_AIR_TRAFFIC_SERVICES_REGION

MessageDiscarder

Discard message.

MessageFutureTimeValidator

Reject message if message time is too far in the future.

MessageMaximumAgeValidator

Reject message if message time is too far in the past.

ProductMessageTypesValidator

Reject message if message type is not one of the valid types configured for the product.

  • name: ProductMessageTypesValidator
  • config:
    • product-message-types (mandatory) - Product-specific list of valid message types.
      Available products are specified under list property production-line.products. Available message types are specified in map property production-line.type-ids.
      Example:
      product-message-types:
        example1:
          - METAR
          - SPECI
        example2:
          - TAF
          - SIGMET
          - AIRMET
          - TROPICAL_CYCLONE_ADVISORY
          - VOLCANIC_ASH_ADVISORY
          - SPACE_WEATHER_ADVISORY

StationIcaoCodeReplacer

Replace all occurrences of regular expression pattern in the station ICAO code with the provided replacement.

StationIdPopulator

Set the numeric station id matching station ICAO code, and reject the message if such ICAO code cannot be found in the database stations table.

StationIdPopulator is implicitly added at the end of the message populator execution chain. It cannot be omitted, nor can it be placed in the middle of the execution chain.

  • name: ~
  • config: ~

Conditional message populator activation

A message populator configuration may include an activation condition, in which case the populator is executed only when the condition is satisfied. An activation condition consists of one or more activation expressions. Each expression consists of an activation property and one or more pairs of activation operator and activation operand. When multiple activation expressions or operator-operand pairs are specified, they are combined with the AND operator, so all of them must be satisfied to activate the message populator.

Template for activation condition configuration:

activate-on:
  <activation property>:
    <activation operator>: <activation operand>
    ...
  ...

Conditional execution is provided by ConditionalMessagePopulator class.

Activation property

The activation property is the name of the aviation message data property whose value is used as the first operand of the activation operator.

The following activation properties are declared in MessagePopulatorConditionPropertyReaderConfig:

  • format: populated message format name. See format-ids in Identifier mappings.
  • productId: identifier of the product this file belongs to.
  • route: populated route name. See route-ids in Identifier mappings.
  • station: populated station ICAO code.
  • type: populated message type name. See type-ids in Identifier mappings.
  • <heading>-data-designator: input bulletin heading data designators (TTAAii).
  • <heading>-originator: input bulletin heading location indicator / originator (CCCC).

The <heading> specifies whether to read GTS or COLLECT heading. It may be one of:

  • gts: GTS heading
  • collect: collect identifier
  • gts-or-collect: GTS heading, or if not present, collect identifier
  • collect-or-gts: collect identifier, or if not present, GTS heading

E.g. gts-or-collect-data-designator.
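For instance, the heading-specific properties can be used to tell apart where the heading information came from. The sketch below is an assumed use case rather than a configuration taken from the provided application.yml. It should be satisfied only for bulletins that have a collect identifier but no GTS heading, using the presence operator described in the next section.

```yaml
activate-on:
  # No data designators available from a GTS heading ...
  gts-data-designator:
    presence: EMPTY
  # ... but data designators are available from a collect identifier.
  collect-data-designator:
    presence: PRESENT
```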

Activation operator and operand

In an activation expression, the activation operator is applied to the activation property value (as first operand for the operator) and specified activation operand (as second operand for the operator). Activation operand type and possible values depend on the activation operator and activation property.

Activation operator may be one of:

  • presence: test for presence of activation property.
    Activation operand is one of:

    • PRESENT: activation property value must be present. This is the default when presence is omitted.
      For example, omitting presence is equivalent to:
      activate-on:
        type:
          presence: PRESENT
    • EMPTY: activation property value must not be present
    • OPTIONAL: activation property value may or may not be present, that is, the presence condition is always satisfied
  • is: test whether activation property is equal to activation operand.
    This is mutually exclusive with is-any-of.

    For example, the following activation condition is satisfied when the message format is IWXXM.

    activate-on:
      format:
        is: IWXXM
  • is-any-of: test whether activation property is equal to any of activation operand list.
    This is mutually exclusive with is.

    For example, the following activation condition is satisfied when the message type is either METAR or SPECI.

    activate-on:
      type:
        is-any-of:
          - METAR
          - SPECI
  • is-not: test whether activation property is not equal to activation operand.
    This is mutually exclusive with is-none-of.

    For example, the following activation condition is satisfied when the message format is not IWXXM.

    activate-on:
      format:
        is-not: IWXXM
  • is-none-of: test whether activation property is not equal to any of activation operand list.
    This is mutually exclusive with is-not.

    For example, the following activation condition is satisfied when the message type is neither METAR nor SPECI.

    activate-on:
      type:
        is-none-of:
          - METAR
          - SPECI
  • matches: test whether activation property matches regular expression provided as activation operand.

    For example, the following activation condition is satisfied when the originator parsed from the GTS bulletin heading, or from the collect identifier when the GTS heading is not present, starts with 'XX':

    activate-on:
      gts-or-collect-originator:
        matches: 'XX[A-Z]{2}'
  • does-not-match: test whether activation property does not match regular expression provided as activation operand.

    For example, the following activation condition is satisfied when the originator parsed from the GTS bulletin heading, or from the collect identifier when the GTS heading is not present, does not start with 'XX':

    activate-on:
      gts-or-collect-originator:
        does-not-match: 'XX[A-Z]{2}'

Activation operators are provided by GeneralPropertyPredicate class.
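To illustrate how expressions and operator-operand pairs combine, the following sketch puts several of the operators above into one activation condition. It is a constructed example built from the properties and operators documented in this section, not a condition taken from the provided configuration. Because everything is combined with AND, the populator would be activated only when all three expressions hold.

```yaml
activate-on:
  # Expression 1: the message format must be IWXXM ...
  format:
    is: IWXXM
  # ... AND expression 2: the message type must be METAR or SPECI ...
  type:
    is-any-of:
      - METAR
      - SPECI
  # ... AND expression 3, which has two operator-operand pairs on one property:
  # the originator must be present and match the regular expression.
  gts-or-collect-originator:
    presence: PRESENT
    matches: 'XX[A-Z]{2}'
```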

Identifier mappings

Internally this application uses static values to represent message type and format. These internal values differ from the identifier of equivalent database entity, and are not bound to the name in the database. Therefore, internal values need to be mapped to database values in the application configuration. In addition to message type and format, message route is a similar property, but it has no internal semantics. However, it is mapped in the configuration for consistency with other similar properties.

The following mappings must exist under production-line application configuration property:

  • route-ids: Map route name, preferably same as database column avidb_message_routes.name, to database column avidb_message_routes.route_id.
  • format-ids: Map GenericAviationWeatherMessage.Format.name() to database column avidb_message_format.format_id.
  • type-ids: Map MessageType.name() to database column avidb_message_types.type_id.

See the provided application.yml for an example.
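As a minimal sketch of the shape of these mappings, the fragment below maps a few names to numeric database identifiers. The numeric values and the DEFAULT route name are hypothetical and must match the rows that actually exist in your database tables. The format and type names are the ones used in the examples elsewhere in this document.

```yaml
production-line:
  route-ids:
    DEFAULT: 1        # hypothetical route name and route_id
  format-ids:
    TAC: 1            # hypothetical format_id values
    IWXXM: 2
  type-ids:
    METAR: 1          # hypothetical type_id values
    SPECI: 2
    TAF: 3
```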

Spring Boot configuration properties

Many of the properties in the application.yml configuration file control the behavior of Spring Boot features. See the Spring Boot Reference Documentation for more information on these.

License

MIT License. See LICENSE.