Implement bracketed unit parsing in forcing CSV headers #511

Merged: 9 commits, May 26, 2023
18 changes: 17 additions & 1 deletion include/forcing/CsvPerFeatureForcingProvider.hpp
@@ -307,7 +307,23 @@ class CsvPerFeatureForcingProvider : public data_access::GenericDataProvider
std::string var_name = col_head;
std::string units = "";

//TODO: parse units in parens and/or square brackets?
boost::trim(var_name); // remove leading/trailing ws
const auto var_name_close = var_name.back();
if (var_name_close == ']' || var_name_close == ')') {
// found closing bracket/parenthesis

const bool is_bracket = var_name_close == ']';
const size_t var_name_open = is_bracket ? var_name.rfind('[') : var_name.rfind('(');
if (var_name_open != std::string::npos) {
// found matching opening bracket/parenthesis

units = var_name.substr(var_name_open + 1);
Contributor

My only concern is what happens with extra characters after the closing bracket/paren.

Contributor Author

Are we expecting anything after a unit declaration? My understanding was that a unit declaration must be at the end of the header, or it is not parsed at all (see the outer if construct).

Contributor

Not expecting it, but we should be prepared: a warning, an error, or something. Would failing to catch that case cause odd behavior in the units that we don't expect or can't easily track down?

Contributor Author (@program--), Apr 14, 2023

Maybe we add a check after the well-known field lookup to see whether units is still empty, and emit a warning then? That would tell us and the user that the column has no unit mapping (similar to how UnitsHelper::get_converted_value warns when a mapping doesn't exist).

Contributor

The issue was originally written to treat this case as brackets that do not represent a unit: for the unit to be recognized, it has to be the last thing in the string. We can revisit if we want.

The thought was that there might be (for some reason??) column names with another set of brackets representing something else. In that case, if they were not at the end, then the brackets and their contents would be ignored by the unit parser. If they were at the end, then the workaround could be to add an extra set of different brackets to the end, which would either have valid units or be empty and picked up as an empty units string (didn't cover this case in the issue though, probably).

units.pop_back(); // remove closing bracket

var_name = var_name.substr(0, var_name_open);
boost::trim(var_name); // trim again in case of ws between name and units
}
}
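
To make the behavior discussed in this thread concrete, here is a minimal standalone sketch of the header-splitting logic in this hunk. It is an illustration under stated assumptions, not the merged implementation: `split_header` and `trim_ws` are hypothetical names (the PR works inline and uses `boost::trim`), and the empty-string guard is an addition that the diff does not contain.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for boost::trim: strips ASCII whitespace at both ends.
static std::string trim_ws(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns {variable name, units}.
// Units are extracted only when the trimmed header ends with ']' or ')'
// and a matching opener of the same kind exists somewhere before it.
static std::pair<std::string, std::string> split_header(const std::string& col_head) {
    std::string name = trim_ws(col_head);
    std::string units;
    // Assumption: guard added for this sketch; back() on an empty string is undefined.
    if (name.empty()) return {name, units};

    const char close = name.back();
    if (close == ']' || close == ')') {
        const char open = (close == ']') ? '[' : '(';
        const auto open_pos = name.rfind(open);
        if (open_pos != std::string::npos) {
            // Text between the last opener and the final closer.
            units = name.substr(open_pos + 1, name.size() - open_pos - 2);
            // Trim again in case of whitespace between name and units.
            name = trim_ws(name.substr(0, open_pos));
        }
    }
    return {name, units};
}

// Small self-check using headers from the test CSV in this PR.
static void check_examples() {
    assert(split_header("T2D [K]") == std::make_pair(std::string("T2D"), std::string("K")));
    assert(split_header("PSFC[Pa)").second.empty());   // mismatched pair: left unparsed
    assert(split_header("LWDOWN [alt][]").first == "LWDOWN [alt]");
}
```

Under these assumptions, `T2D [K]` splits into the name `T2D` and the units `K`, the mismatched `PSFC[Pa)` keeps its full header as the name with empty units, and a header such as `var (unit) extra` is also left whole, because its last character is not a closing bracket or parenthesis.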

auto wkf = data_access::WellKnownFields.find(var_name);
if(wkf != data_access::WellKnownFields.end()){
100 changes: 100 additions & 0 deletions test/data/forcing/cat-27115-nwm-aorc-variant-derived-format-units.csv
@@ -0,0 +1,100 @@
Time,RAINRATE[mm s^-1],T2D [K],Q2D [kg kg-1],U2D(m s-1) ,V2D (m/s),TEST [kg],PSFC[Pa),SWDOWN(W m-2],LWDOWN [alt][]
2015-12-01 00:00:00,0.00000000,265.30,0.00203,-0.779, 2.303, 0.369072640744103, 97498.0, 0.00,196.40
2015-12-01 01:00:00,0.00000000,264.11,0.00182,-1.791, 2.916, 0.483859626703707, 97362.1, 0.00,194.07
2015-12-01 02:00:00,0.00000000,262.93,0.00163,-2.803, 3.533, 0.366017865267406, 97223.5, 0.00,191.79
2015-12-01 03:00:00,0.00000000,261.74,0.00146,-3.814, 4.149, 0.457391463874851, 97084.9, 0.00,227.35
2015-12-01 04:00:00,0.00000000,262.64,0.00154,-3.133, 5.178, 0.737631060558781, 97026.7, 0.00,228.79
2015-12-01 05:00:00,0.00000000,263.53,0.00162,-2.450, 6.200, 0.602648271858213, 96963.4, 0.00,230.15
2015-12-01 06:00:00,0.00000000,264.44,0.00171,-1.770, 7.229, 0.346901047841881, 96903.1, 0.00,284.62
2015-12-01 07:00:00,0.00006820,265.11,0.00181,-2.024, 8.217, 0.509759386830487, 96779.8, 0.00,285.84
2015-12-01 08:00:00,0.00032685,265.77,0.00192,-2.281, 9.210, 0.560660270540399, 96655.6, 0.00,287.03
2015-12-01 09:00:00,0.00059252,266.43,0.00204,-2.539,10.197, 0.460096988544622, 96532.1, 0.00,301.32
2015-12-01 10:00:00,0.00091235,267.47,0.00232,-1.572,10.595, 0.725090390683898, 96422.8, 0.00,303.71
2015-12-01 11:00:00,0.00030843,268.49,0.00261,-0.615,10.997, 0.258452138273704, 96307.0, 0.00,306.07
2015-12-01 12:00:00,0.00047892,269.52,0.00293, 0.344,11.397, 0.540497664150327, 96198.0, 0.00,307.59
2015-12-01 13:00:00,0.00000000,270.40,0.00308, 0.169,11.380, 0.52704762808526, 96200.2, 35.00,309.33
2015-12-01 14:00:00,0.00000000,271.28,0.00323,-0.008,11.373, 0.491141328397042, 96204.7,100.11,311.06
2015-12-01 15:00:00,0.00000000,272.17,0.00340,-0.182,11.359, 0.653268705082186, 96205.1,147.50,311.84
2015-12-01 16:00:00,0.00000566,272.56,0.00351,-0.216,10.681, 0.471999167382976, 96096.9,182.38,312.52
2015-12-01 17:00:00,0.00000434,272.94,0.00363,-0.257, 9.995, 0.592657926238533, 95987.2,192.87,313.20
2015-12-01 18:00:00,0.00000618,273.33,0.00375,-0.293, 9.315, 0.678191486712355, 95879.4,214.32,316.47
2015-12-01 19:00:00,0.00000217,273.28,0.00377,-0.279, 8.213, 0.481847842009306, 95807.3,167.33,316.40
2015-12-01 20:00:00,0.00000618,273.24,0.00380,-0.273, 7.108, 0.39729302731704, 95732.9,100.99,316.34
2015-12-01 21:00:00,0.00000618,273.19,0.00382,-0.258, 6.006, 0.748501693643781, 95657.2, 13.28,316.49
2015-12-01 22:00:00,0.00000618,273.05,0.00380, 0.248, 4.816, 0.455377375702623, 95699.5, 0.00,315.98
2015-12-01 23:00:00,0.00000618,272.90,0.00378, 0.754, 3.635, 0.677327794276216, 95739.8, 0.00,315.45
2015-12-02 00:00:00,0.00000620,272.76,0.00375, 1.262, 2.443, 0.510890292682041, 95781.6, 0.00,310.76
2015-12-02 01:00:00,0.00000722,272.79,0.00376, 1.281, 2.366, 0.699172523787197, 95712.4, 0.00,310.98
2015-12-02 02:00:00,0.00000869,272.81,0.00377, 1.297, 2.290, 0.514190631008857, 95642.7, 0.00,311.26
2015-12-02 03:00:00,0.00000312,272.84,0.00378, 1.307, 2.210, 0.563774670715194, 95570.3, 0.00,308.03
2015-12-02 04:00:00,0.00000096,272.78,0.00379, 2.144, 1.794, 0.372323497975563, 95529.8, 0.00,307.92
2015-12-02 05:00:00,0.00000000,272.72,0.00380, 2.986, 1.374, 0.440564194370743, 95488.2, 0.00,307.81
2015-12-02 06:00:00,0.00000000,272.67,0.00381, 3.820, 0.953, 0.68830343963261, 95447.5, 0.00,300.77
2015-12-02 07:00:00,0.00001087,272.70,0.00379, 4.946, 0.065, 0.497993216299713, 95437.4, 0.00,300.95
2015-12-02 08:00:00,0.00002215,272.73,0.00377, 6.073,-0.826, 0.715992327783042, 95424.8, 0.00,301.06
2015-12-02 09:00:00,0.00012696,272.77,0.00375, 7.201,-1.713, 0.382735115508423, 95416.2, 0.00,298.31
2015-12-02 10:00:00,0.00013539,272.21,0.00360, 8.238,-2.945, 0.432333783396072, 95541.9, 0.00,298.57
2015-12-02 11:00:00,0.00004202,271.66,0.00344, 9.269,-4.182, 0.594375846546082, 95665.2, 0.00,298.84
2015-12-02 12:00:00,0.00002027,271.10,0.00329,10.301,-5.413, 0.754148767470999, 95796.0, 0.00,283.54
2015-12-02 13:00:00,0.00000000,270.64,0.00310,10.108,-5.133, 0.272142205609158, 95985.0, 30.67,283.54
2015-12-02 14:00:00,0.00000000,270.18,0.00290, 9.927,-4.846, 0.436670077599158, 96177.7, 87.80,283.56
2015-12-02 15:00:00,0.00000000,269.72,0.00271, 9.735,-4.568, 0.422978723522929, 96369.5,196.75,256.32
2015-12-02 16:00:00,0.00000000,269.46,0.00264, 9.311,-3.989, 0.641331891895699, 96433.1,243.38,256.09
2015-12-02 17:00:00,0.00000000,269.20,0.00257, 8.882,-3.412, 0.355176890251846, 96493.9,257.21,255.83
2015-12-02 18:00:00,0.00000000,268.94,0.00251, 8.457,-2.833, 0.476734886599881, 96557.1,271.36,244.24
2015-12-02 19:00:00,0.00000000,268.65,0.00244, 8.068,-2.593, 0.508846493626766, 96619.3,212.36,243.83
2015-12-02 20:00:00,0.00000000,268.36,0.00237, 7.673,-2.351, 0.386762672426179, 96682.5,128.88,243.40
2015-12-02 21:00:00,0.00000000,268.06,0.00230, 7.284,-2.112, 0.631101244829658, 96742.3, 14.14,213.43
2015-12-02 22:00:00,0.00000000,267.91,0.00232, 6.841,-1.527, 0.480850713581088, 96813.4, 0.00,213.52
2015-12-02 23:00:00,0.00000000,267.75,0.00233, 6.406,-0.937, 0.585107670448437, 96883.9, 0.00,213.45
2015-12-03 00:00:00,0.00000000,267.60,0.00234, 5.963,-0.352, 0.35966494373066, 96955.4, 0.00,201.10
2015-12-03 01:00:00,0.00000000,267.49,0.00233, 5.813, 0.053, 0.574430736701533, 96991.9, 0.00,201.04
2015-12-03 02:00:00,0.00000000,267.38,0.00231, 5.657, 0.452, 0.62250074609481, 97029.0, 0.00,200.96
2015-12-03 03:00:00,0.00000000,267.26,0.00230, 5.509, 0.855, 0.652115104685093, 97067.8, 0.00,199.08
2015-12-03 04:00:00,0.00000000,267.20,0.00231, 5.068, 1.485, 0.614155929342996, 97053.4, 0.00,198.99
2015-12-03 05:00:00,0.00000000,267.13,0.00232, 4.636, 2.118, 0.582053280495819, 97039.5, 0.00,198.91
2015-12-03 06:00:00,0.00000000,267.06,0.00233, 4.193, 2.748, 0.506784003105991, 97028.4, 0.00,197.47
2015-12-03 07:00:00,0.00000000,267.19,0.00234, 3.994, 2.927, 0.486036737181291, 97039.5, 0.00,197.42
2015-12-03 08:00:00,0.00000000,267.32,0.00236, 3.787, 3.106, 0.400532583883899, 97052.7, 0.00,197.39
2015-12-03 09:00:00,0.00000000,267.43,0.00237, 3.589, 3.283, 0.522485446689281, 97066.7, 0.00,208.99
2015-12-03 10:00:00,0.00000000,267.51,0.00238, 3.511, 3.668, 0.341305270607806, 97087.1, 0.00,209.14
2015-12-03 11:00:00,0.00000000,267.59,0.00239, 3.440, 4.052, 0.705054267791722, 97109.0, 0.00,209.25
2015-12-03 12:00:00,0.00000000,267.66,0.00240, 3.364, 4.434, 0.515074232915239, 97127.7, 0.00,203.31
2015-12-03 13:00:00,0.00000000,268.50,0.00253, 3.042, 4.841, 0.396927395108321, 97044.9, 65.56,203.66
2015-12-03 14:00:00,0.00000000,269.36,0.00267, 2.705, 5.252, 0.533825233062325, 96965.1,187.70,204.04
2015-12-03 15:00:00,0.00000000,270.20,0.00282, 2.386, 5.654, 0.480495505208419, 96887.0,318.65,202.98
2015-12-03 16:00:00,0.00000000,271.16,0.00293, 3.066, 5.679, 0.428273437906239, 96775.8,394.48,203.35
2015-12-03 17:00:00,0.00000000,272.10,0.00306, 3.747, 5.700, 0.747743416505291, 96671.4,417.70,203.76
2015-12-03 18:00:00,0.00000000,273.04,0.00318, 4.423, 5.719, 0.678330595019967, 96560.1,357.78,211.80
2015-12-03 19:00:00,0.00000000,272.81,0.00315, 4.380, 5.235, 0.465247048719914, 96580.0,280.66,211.58
2015-12-03 20:00:00,0.00000000,272.58,0.00313, 4.345, 4.747, 0.610208739984751, 96601.9,171.29,211.39
2015-12-03 21:00:00,0.00000000,272.35,0.00310, 4.303, 4.263, 0.538381349516313, 96621.4, 15.48,247.83
2015-12-03 22:00:00,0.00000000,272.59,0.00318, 4.486, 3.930, 0.217003409111572, 96615.8, 0.00,249.69
2015-12-03 23:00:00,0.00000000,272.83,0.00325, 4.674, 3.591, 0.61348000324151, 96608.8, 0.00,251.55
2015-12-04 00:00:00,0.00000000,273.07,0.00333, 4.863, 3.260, 0.800860775428882, 96603.5, 0.00,229.79
2015-12-04 01:00:00,0.00000000,271.80,0.00291, 4.779, 2.981, 0.357629445143273, 96603.3, 0.00,225.71
2015-12-04 02:00:00,0.00000000,270.51,0.00254, 4.700, 2.699, 0.29870809379768, 96604.2, 0.00,221.72
2015-12-04 03:00:00,0.00000000,269.24,0.00221, 4.616, 2.420, 0.35364714261436, 96600.8, 0.00,222.69
2015-12-04 04:00:00,0.00000000,268.12,0.00207, 3.866, 2.395, 0.430365857711972, 96581.0, 0.00,219.50
2015-12-04 05:00:00,0.00000000,267.01,0.00195, 3.117, 2.365, 0.393065348170171, 96558.0, 0.00,216.36
2015-12-04 06:00:00,0.00000000,265.89,0.00183, 2.368, 2.336, 0.771422087726575, 96532.6, 0.00,221.24
2015-12-04 07:00:00,0.00000000,265.80,0.00182, 2.129, 2.530, 0.344679285854667, 96515.1, 0.00,221.09
2015-12-04 08:00:00,0.00000000,265.70,0.00181, 1.877, 2.735, 0.394115096146419, 96496.9, 0.00,220.99
2015-12-04 09:00:00,0.00000000,265.61,0.00180, 1.640, 2.943, 0.483713892634595, 96478.5, 0.00,207.95
2015-12-04 10:00:00,0.00000000,265.78,0.00182, 1.503, 3.241, 0.416181034658139, 96536.7, 0.00,208.52
2015-12-04 11:00:00,0.00000000,265.95,0.00184, 1.364, 3.545, 0.668649158323891, 96588.8, 0.00,209.06
2015-12-04 12:00:00,0.00000000,266.11,0.00185, 1.230, 3.853, 0.511886949669646, 96641.2, 0.00,205.60
2015-12-04 13:00:00,0.00000000,268.16,0.00211, 1.182, 4.297, 0.648779175810115, 96672.2, 63.50,209.18
2015-12-04 14:00:00,0.00000000,270.21,0.00239, 1.136, 4.747, 0.566687062318829, 96700.3,181.65,212.82
2015-12-04 15:00:00,0.00000000,272.25,0.00271, 1.086, 5.197, 0.540462915464216, 96729.5,304.19,231.99
2015-12-04 16:00:00,0.00000000,274.54,0.00308, 1.355, 5.139, 0.371895232554147, 96612.8,376.68,236.44
2015-12-04 17:00:00,0.00000000,276.82,0.00350, 1.630, 5.082, 0.621510025518188, 96500.2,399.19,240.93
2015-12-04 18:00:00,0.00000000,279.10,0.00395, 1.902, 5.026, 0.663715574451496, 96384.9,351.57,240.49
2015-12-04 19:00:00,0.00000000,277.99,0.00384, 2.026, 4.666, 0.709559194057522, 96401.6,276.33,238.72
2015-12-04 20:00:00,0.00000000,276.89,0.00374, 2.158, 4.297, 0.59649321880213, 96416.2,169.66,236.95
2015-12-04 21:00:00,0.00000000,275.78,0.00363, 2.282, 3.930, 0.518605866458076, 96434.7, 18.40,238.85
2015-12-04 22:00:00,0.00000000,275.18,0.00358, 2.113, 3.874, 0.523043806397619, 96404.4, 0.00,237.89
2015-12-04 23:00:00,0.00000000,274.57,0.00353, 1.934, 3.813, 0.624582869802129, 96375.4, 0.00,236.88
2015-12-05 00:00:00,0.00000000,273.97,0.00348, 1.761, 3.746, 0.315559274966311, 96346.9, 0.00,256.27
2015-12-05 01:00:00,0.00000000,272.48,0.00314, 1.583, 3.954, 0.288652333228552, 96334.4, 0.00,251.20
2015-12-05 02:00:00,0.00000000,270.99,0.00283, 1.395, 4.153, 0.571172703828173, 96324.3, 0.00,246.13
58 changes: 58 additions & 0 deletions test/forcing/CsvPerFeatureForcingProvider_Test.cpp
@@ -23,6 +23,7 @@ class CsvPerFeatureForcingProviderTest : public ::testing::Test {

std::shared_ptr<CsvPerFeatureForcingProvider> Forcing_Object;
std::shared_ptr<CsvPerFeatureForcingProvider> Forcing_Object_2;
std::shared_ptr<CsvPerFeatureForcingProvider> Forcing_Object_3; // explicit units

typedef struct tm time_type;

@@ -68,6 +69,17 @@ void CsvPerFeatureForcingProviderTest::setupForcing()
forcing_params forcing_p_2(forcing_file_name, "CsvPerFeature", "2015-12-01 00:00:00", "2015-12-05 02:00:00");

Forcing_Object_2 = std::make_shared<CsvPerFeatureForcingProvider>(forcing_p_2);

forcing_file_names = {
"test/data/forcing/cat-27115-nwm-aorc-variant-derived-format-units.csv",
"../test/data/forcing/cat-27115-nwm-aorc-variant-derived-format-units.csv",
"../../test/data/forcing/cat-27115-nwm-aorc-variant-derived-format-units.csv"
};
forcing_file_name = utils::FileChecker::find_first_readable(forcing_file_names);

forcing_params forcing_p_3(forcing_file_name, "CsvPerFeature", "2015-12-01 00:00:00", "2015-12-05 02:00:00");

Forcing_Object_3 = std::make_shared<CsvPerFeatureForcingProvider>(forcing_p_3);
}

///Test AORC Forcing Object
@@ -164,3 +176,49 @@ TEST_F(CsvPerFeatureForcingProviderTest, TestGetAvailableForcingOutputs)

}

///Test CSV Units Parsing
Contributor

Might be good to add a test for extra characters after the unit brackets, e.g. var (unit) extra, ...
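
One way to surface the "extra characters after the unit brackets" case, rather than silently treating the whole header as a variable name, is a small pre-check that a warning could hang off. The sketch below is only an assumption about how such a check might look; `has_text_after_units` is a hypothetical name and nothing like it appears in this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical check, not part of the PR: returns true when the header contains
// a closing ']' or ')' that is followed by non-whitespace text, e.g. "var (unit) extra".
// Such headers are not parsed for units by the logic in this PR, so a caller
// could use this to emit a warning.
static bool has_text_after_units(const std::string& col_head) {
    const auto close_pos = col_head.find_last_of("])");
    if (close_pos == std::string::npos) return false;  // no closing delimiter at all
    // Look for anything other than whitespace after the last closing delimiter.
    const auto next = col_head.find_first_not_of(" \t\r\n", close_pos + 1);
    return next != std::string::npos;
}

// Small self-check with the reviewer's example and two headers from the test CSV.
static void check_trailing_examples() {
    assert(has_text_after_units("var (unit) extra"));
    assert(!has_text_after_units("T2D [K]"));
    assert(!has_text_after_units("U2D(m s-1) "));  // trailing whitespace only
}
```

A test along the lines the reviewer suggests could then assert that a `var (unit) extra` column keeps its full header as the variable name and that this check flags it.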

TEST_F(CsvPerFeatureForcingProviderTest, TestForcingUnitHeaderParsing)
{
const time_t t = Forcing_Object_3->get_data_start_time() + (8 * 3600);
const auto& varnames = this->Forcing_Object_3->get_avaliable_variable_names();
const std::vector<std::tuple<std::string, std::string, std::string>>& expected = {
{"RAINRATE", "mm s^-1", "cm min^-1"},
{"T2D", "K", "degC"},
{"Q2D", "kg kg-1", "g kg-1"},
{"U2D", "m s-1", "cm s-1"},
{"V2D", "m/s", "cm s-1"},
{"TEST", "kg", "g"},
{"PSFC[Pa)", "Pa", "bar"},
{"SWDOWN(W m-2]", "W m-2", "langley"},
{"LWDOWN [alt]", "W m-2", "langley"}
};

for (auto ite = expected.begin(); ite != expected.end(); ite++) {
const auto expected_name = std::get<0>(*ite);
const auto expected_in_units = std::get<1>(*ite);
const auto expected_out_units = std::get<2>(*ite);

const double in_value = this->Forcing_Object_3->get_value(
CSVDataSelector(expected_name, t, 3600, expected_in_units),
data_access::SUM
);

const double out_value = this->Forcing_Object_3->get_value(
CSVDataSelector(expected_name, t, 3600, expected_out_units),
data_access::SUM
);

// make sure each expected column name is within varnames
EXPECT_NE(std::find(varnames.begin(), varnames.end(), expected_name), varnames.end());

// make sure units are correctly mapped
if (ite - expected.begin() < 6) {
// conversion expected
EXPECT_NE(in_value, out_value);
} else {
// conversion is not expected, since there is no mapping
EXPECT_NEAR(in_value, out_value, 0.00001);
}

}
}