Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove deprecated Forcing class dependency from legacy formulations #432

Merged
merged 7 commits into from
Sep 21, 2022

Conversation

stcui007
Copy link
Contributor

Removing the old legacy forcing

Additions

Removals

Changes

include/realizations/catchment/Catchment_Formulation.hpp
include/realizations/catchment/Formulation_Constructors.hpp
include/realizations/catchment/Simple_Lumped_Model_Realization.hpp
include/realizations/catchment/Tshirt_C_Realization.hpp
include/realizations/catchment/Tshirt_Realization.hpp
src/realizations/catchment/Simple_Lumped_Model_Realization.cpp
src/realizations/catchment/Tshirt_C_Realization.cpp
src/realizations/catchment/Tshirt_Realization.cpp
test/realizations/catchments/Bmi_C_Cfe_IT.cpp
test/realizations/catchments/Bmi_C_Formulation_Test.cpp
test/realizations/catchments/Bmi_Cpp_Formulation_Test.cpp
test/realizations/catchments/Bmi_Fortran_Formulation_Test.cpp
test/realizations/catchments/Bmi_Py_Formulation_Test.cpp
test/realizations/catchments/LSTM_Realization_Test.cpp

Testing

Screenshots

Notes

Todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist (automated report can be put here)

Target Environment support

  • Linux

Copy link
Contributor

@mattw-nws mattw-nws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an intitial pass... I haven't checked into the logic for doing the epoch time differentials, for instance. But a few changes I already see.

Also, generally, go ahead and delete commented-out references to Forcing.h and _link_legacy_forcing(), we can see what it was previously via Git.

@@ -7,7 +7,8 @@
#include "Et_Accountable.hpp"
#include <HY_CatchmentArea.hpp>

#include <Forcing.h> // Remove when _link_legacy_forcing() is removed!
//#include <Forcing.h> // Remove when _link_legacy_forcing() is removed!
#include "CsvPerFeatureForcingProvider.hpp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be here. Not sure if anything should go in it's place, but if anything maybe DataProvider.hpp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the time functions, get_data_start_time() and several others, for example are only provided in the following header files, so we need to provide CsvPerFeatureForcingProvider some where. I just tried to use GenericDataProvider.hpp and DataProvider.hpp in its place, both gave build errors.

include/forcing/CsvPerFeatureForcingProvider.hpp:    long get_data_start_time() override {
include/forcing/DataProvider.hpp:        virtual long get_data_start_time() = 0;
include/forcing/WrappedForcingProvider.hpp:            return wrapped_provider->get_data_start_time();
include/forcing/WrappedForcingProvider.hpp:        long get_data_start_time() override {
include/forcing/WrappedForcingProvider.hpp:            return wrapped_provider->get_data_start_time();
include/forcing/Forcing.h:    long get_data_start_time() override
include/forcing/NetCDFPerFeatureDataProvider.hpp:        long get_data_start_time()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused on this, since as you show above, get_data_start_time() is defined in both, and it's not even templated... will try this myself and have a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me the get_data_start_time() is only accessible through Forcing, CsvPerFeatureForcingProvider, WrappedForcingProvider, or NetCDFPerFeatureDataProvider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You maybe right. Probably need some code reorganization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here doesn't seem to have anything to do with get_data_start_time(). See the comment in Formulation_Constructors.hpp.

@@ -72,7 +72,8 @@ namespace realization {
std::shared_ptr<data_access::GenericDataProvider> fp;
if (formulation_type == "tshirt" || formulation_type == "tshirt_c" || formulation_type == "lstm" // These formulations are still using the legacy interface!
){
fp = std::make_shared<Forcing>(forcing_config);
//fp = std::make_shared<Forcing>(forcing_config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if block above here is an exception that prevented the use of CsvPerFeatureForcingProvider below. Rather than this change, this conditional block should be reduced or removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to reduce it. fp will still need to have a value though, as in the netCDF forcing provider a couple lines below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Exactly. What I mean is, remove tshirt_c and tshirt from the conditional above... they should fall through to the other blocks, depending on what the forcing_config says to use. They shouldn't be locked to CsvPerFeature, which is what the code does now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing tshirt and tshirt_c appears to cause the Formulation_Manager_Test to fail. There are a bunch of tshirt and tshirt_c examples there using the old form of forcing explicitly. Do we still need the Formulation_Manager_Test.cpp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the provider string in Formulation_Manager_Test.cpp:

          "\"forcing\": { "
              "\"provider\": \"legacy\","
              "\"path\": \"./data/forcing/cat-67_2015-12-01 00_00_00_2015-12-30 23_00_00.csv\" "

@@ -37,13 +39,12 @@ Tshirt_C_Realization::Tshirt_C_Realization(forcing_params forcing_config,
std::vector<double> giuh_ordinates,
tshirt::tshirt_params params,
const vector<double> &nash_storage)
: Catchment_Formulation(catchment_id, std::move(std::make_unique<Forcing>(forcing_config)), output_stream), catchment_id(std::move(catchment_id)),
//: Catchment_Formulation(catchment_id, std::move(std::make_unique<Forcing>(forcing_config)), output_stream), catchment_id(std::move(catchment_id)),
: Catchment_Formulation(catchment_id, std::move(std::make_unique<CsvPerFeatureForcingProvider>(forcing_config)), output_stream), catchment_id(std::move(catchment_id)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm betting this should not be specific to the CSV reader... where is this constructor used? Maybe this is only in test code and that's okay, but not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the class is use by a few testing codes. Not sure which form of the constructors.

) : Catchment_Formulation(std::move(id), std::move(std::make_unique<Forcing>(forcing_config)), output_stream) {
_link_legacy_forcing();
//) : Catchment_Formulation(std::move(id), std::move(std::make_unique<Forcing>(forcing_config)), output_stream) {
) : Catchment_Formulation(std::move(id), std::move(std::make_unique<CsvPerFeatureForcingProvider>(forcing_config)), output_stream) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above... where is this constructor used, and does this need to be non-specific to CsvPerFeature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would tend to agree...

if (t_index < 0) {
throw std::invalid_argument("Getting response of negative time step in Tshirt C Realization is not allowed.");
}
time_t start_time = this->forcing->get_data_start_time();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While time_t is probably reasonable here, I know in some other places we have used long because we may deal with dates < 1970. Is this based on code in another module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time_t is used all over the places in our codes. In CsvPerFeatureForcingProvider.hpp, time_t is used for epoch time. This is used here for consistency. If you think that might not be enough, we can change that.

@@ -3,6 +3,8 @@
#include "TshirtErrorCodes.h"
#include "Catchment_Formulation.hpp"
using namespace realization;
using data_access::MEAN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one isn't actually used, right? And for SUM, I would skip the using and just use the namespaced name below in the one call it appears in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mean is not used here. Will make the suggested change for SUM.

@@ -9,6 +9,8 @@
#include <boost/algorithm/string/join.hpp>

using namespace realization;
using data_access::MEAN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments in Tshirt_Realization.cpp on these

@@ -2,7 +2,7 @@

#include "gtest/gtest.h"
#include "realizations/catchment/LSTM_Realization.hpp"
#include "forcing/Forcing.h"
//#include "forcing/Forcing.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these tests are almost certainly not actually being run because of the guard at line 1...I am betting that if we can run these tests they won't succeed with only this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, Ngen framework does not seem to set up to handle LSTM module. So there is actually no test code ever built because of the guard lines.

@@ -14,6 +14,18 @@
#define CSDMS_STD_NAME_WIND_V_Y "land_surface_wind__y_component_of_velocity"
#define NGEN_STD_NAME_SPECIFIC_HUMIDITY "atmosphere_air_water~vapor__relative_saturation" // This is not present in standard names, use this for now... may change!

// Recognized Forcing Value Names (in particular for use when configuring BMI input variables)
// TODO: perhaps create way to configure a mapping of these to something different
#define AORC_FIELD_NAME_PRECIP_RATE "precip_rate"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly... these defines appear to only be used in test code... though the magic strings appear elsewhere (in this file even)... no action necessary here now, but this is probably something that should be refactored, some way, at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, need to keep in mind.

@@ -18,7 +18,7 @@
#include "Bmi_Multi_Formulation.hpp"
#include "Bmi_Py_Formulation.hpp"
#include <GenericDataProvider.hpp>
#include "CsvPerFeatureForcingProvider.hpp"
//#include "CsvPerFeatureForcingProvider.hpp"
Copy link
Contributor

@mattw-nws mattw-nws Aug 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is actually what is causing your problem in Catchment_Formulation.hpp. This is the one place that CsvPerFeatureForcingProvider.hpp is needed, because the instances of the providers are instantiated in this class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need #include "CsvPerFeatureForcingProvider.hpp" in files where CsvPerFeatureForcingProvider is explicitly constructed... for the moment that means the two tshirt files also (though ideally those dependencies can be removed, per the discussion there).

Copy link
Contributor

@mattw-nws mattw-nws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only problem left has to do with LSTM_Realization... either it needs to be fixed, dropped, or this will have to keep Forcing.h for that one realization... hold until decided.

if (formulation_type == "tshirt" || formulation_type == "tshirt_c" || formulation_type == "lstm" // These formulations are still using the legacy interface!
){
//fp = std::make_shared<Forcing>(forcing_config);
if (formulation_type == "lstm") {
fp = std::make_shared<CsvPerFeatureForcingProvider>(forcing_config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unfortunately should not stay this way. This will break LSTM_Realization it seems, so this will either need to go back to Forcing or both LSTM_Realization and this whole block should be dropped. Posted a question in Slack on which to do.

time_t stop_time = this->forcing->get_data_stop_time();
time_t t_current = start_time + t_index * t_delta_s;
//Ensure model run does not exceed the end time of forcing
if (t_current > stop_time) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, the BMI formulations allow this... but code suggests it was never a feature for other realizations... so this is probably fine.

double precip = this->legacy_forcing->get_next_hourly_precipitation_meters_per_second();
time_t t_delta = this->forcing->record_duration();
if (t_delta != t_delta_s) { //Checking the time step used is consistent with that provided in forcing data
throw std::invalid_argument("Getting response using insonsistent time step with provided forcing data");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely legal for BMI realizations... however, apparently it wasn't under the old API (they had to be in sync), so this is fine for any current use cases.

@stcui007
Copy link
Contributor Author

stcui007 commented Aug 24, 2022 via email

@mattw-nws mattw-nws changed the title Use new forcing Remove deprecated Forcing class dependency from legacy formulations Aug 26, 2022
}
//Negative t_index is not allowed
if (t_index < 0) {
throw std::invalid_argument("Getting response of negative time step in Tshirt C Realization is not allowed.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is LSTM_Realization, not Tshirt C Realization

}
time_t start_time = this->forcing->get_data_start_time();
time_t stop_time = this->forcing->get_data_stop_time();
time_t t_current = start_time + t_index * t_delta_s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it guaranteed that data start time will align with simulation start time? Also, there is no guarantee that t_delta_s is the SAME delta at each call to get response.

And I'm not sure I'm tracking time here correctly...

t_index should be the current_time relative to the simulation driver (or whatever is calling get_response. t_delta_s is the amount of seconds the call wants to advance relative to t_index, or current_time. Is this just trying to convert that into time into the forcing provider's time reference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as this code is for the generic provider the forcing time step could be anything, it does not necessarily match the model time step or is covenant multiple of the model time step. At least for all existing providers it is not a dynamic value.

Copy link
Contributor

@mattw-nws mattw-nws Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it guaranteed that data start time will align with simulation start time?

DP interface does support retrieving arbitrary time slices...however, this legacy code was already fixed to the forcing timestep by virtue of the fact that the old Forcing object kept the timestep and time state--that has shifted out of the DataProvider interface, but this didn't add any new restrictions.

Copy link
Contributor

@mattw-nws mattw-nws Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there is no guarantee that t_delta_s is the SAME delta at each call to get response.

Perhaps not via the interface, but the rest of the framework is currently locked at 3600. That's going to require way more complex changes than you'll find here to change.

) : Catchment_Formulation(std::move(id), std::move(std::make_unique<Forcing>(forcing_config)), output_stream) {
_link_legacy_forcing();
//) : Catchment_Formulation(std::move(id), std::move(std::make_unique<Forcing>(forcing_config)), output_stream) {
) : Catchment_Formulation(std::move(id), std::move(std::make_unique<CsvPerFeatureForcingProvider>(forcing_config)), output_stream) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would tend to agree...

}
time_t start_time = this->forcing->get_data_start_time();
time_t stop_time = this->forcing->get_data_stop_time();
time_t t_current = start_time + t_index * t_delta_s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, the relationship between start_time, t_index, and t_delta_s is not super clear

* @param longitude
* @param area_square_km
*/
LSTM_Realization::LSTM_Realization(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any reason this constructor got removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not used anywhere. It could be re-added with forcing_params changed to a DataProvider, but it would just be zombie code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of it was to ensure that if someone wanted to use the realization for something they could create it directly with the paths (it is a convenience constructor). It wasn't exactly meant to be used in our own code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess it it was using a forcing config object, not sure why it was removed an not refactored though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat subjective, but it's unused in the application... this isn't a general purpose library (at least not yet)... so it's arguably goldplating. Just more code to maintain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nuke it if you want...we have moved away in large part from the development kit style of implementing realizations, relying instead on BMI for the moment. I would, however, suggest that that deprecation/removal be done in its own isolated commit so that a history/blame on the file in the future would show a clean way to revive that constructor.

Copy link
Contributor

@donaldwj donaldwj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nel's concerns in the LSTM code need to be addressed, otherwise no issues that I see

@hellkite500 hellkite500 merged commit aca6450 into NOAA-OWP:master Sep 21, 2022
@stcui007
Copy link
Contributor Author

stcui007 commented Oct 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants