Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement bracketed unit parsing in forcing CSV headers #511

Merged
merged 9 commits into from
May 26, 2023

Conversation

program--
Copy link
Contributor

@program-- program-- commented Apr 13, 2023

Resolves #510 by implementing unit parsing within brackets or parentheses appended to column names in forcing CSV headers.

Additions

  • Adds direct implementation in CsvPerFeatureForcingProvider::read_csv.
  • Adds unit tests to check for header name resolution (i.e. removing unit declaration and whitespace), correct unit conversions, and correct parsing with non-well-known columns.
  • Adds new test data file from cat-27115 with unit declarations, and new non-well-known column.

Testing

  • All existing and new unit tests pass locally on Linux with GCC version 12.2.1.

Todos

  • Add documentation for unit declaration syntax.

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Target Environment support

  • Linux

@program-- program-- marked this pull request as ready for review April 13, 2023 21:09
Copy link
Contributor

@hellkite500 hellkite500 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than testing the case mentioned, the implementation looks reasonable to me.

@@ -164,3 +176,63 @@ TEST_F(CsvPerFeatureForcingProviderTest, TestGetAvailableForcingOutputs)

}

///Test CSV Units Parsing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to add a test for extra characters after the unit brackets, e.g. var (unit) extra, ...

if (var_name_open != std::string::npos) {
// found matching opening bracket/parenth

units = var_name.substr(var_name_open + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is what happens with extra characters after the closing bracket/paren.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we expecting anything after a unit declaration? My understanding was that unit declarations would be required to be at the end, or would not be parsed at all (see the outer if construct)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not expecting it, but should be prepared...a warning, error, or something... Would not catching that case cause some weird behavior in the units that we don't expect or can't track down easily?

Copy link
Contributor Author

@program-- program-- Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we add a check after the well-known checks to check if units is still empty, and throw a warning then? This would then tell us/the user that the column that threw the warning has no unit mapping (i.e. like how UnitsHelper::get_converted_value throws a warning when a mapping doesn't exist).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue was originally written to treat this case as brackets that do not represent a unit.... in order for the unit to be recognized it would have to be the last thing in the string. We can revisit if we want.

The thought was that there might be (for some reason??) column names with another set of brackets representing something else. In that case, if they were not at the end, then the brackets and their contents would be ignored by the unit parser. If they were at the end, then the workaround could be to add an extra set of different brackets to the end, which would either have valid units or be empty and picked up as an empty units string (didn't cover this case in the issue though, probably).

@mattw-nws mattw-nws self-requested a review May 26, 2023 13:33
@mattw-nws mattw-nws merged commit 4d89814 into NOAA-OWP:master May 26, 2023
@program-- program-- deleted the csv-header-units branch May 29, 2024 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support bracketed units in CsvPerFeatureForcingProvider
3 participants