Forcings Engine Data Provider Interface and Lumped Forcings Implementation #720

Closed
wants to merge 45 commits into from
Changes from 44 commits
6417ee9
feat: initial forcings engine implementation
program-- Jan 30, 2024
33e2e47
fix: add include guards; rm unused header [no ci]
program-- Jan 30, 2024
30dbb56
feat: correct time indexing [no ci]
program-- Feb 1, 2024
999e325
feat: forcings engine resampling [no ci]
program-- Feb 6, 2024
fe98a70
cpp: ForcingEngine -> ForcingsEngine
program-- Feb 12, 2024
3306017
feat: allow forcings engine provider to set MPI communicator [no ci]
program-- Feb 12, 2024
c43bcf6
cmake: handle forcings engine data provider if python is not enabled
program-- Feb 12, 2024
f2ef0c5
cpp: separate NullForcingProvider into source file; add more forcings…
program-- Feb 14, 2024
8d7e209
refactor: separate forcings engine instance from data provider [no ci]
program-- Feb 23, 2024
b9266e4
feat: add runtime checks for forcings engine provider; update tests […
program-- Feb 26, 2024
7bb3802
try and fix MPI error
program-- Mar 1, 2024
717d930
tests: rename forcings engine test [no ci]
program-- Mar 1, 2024
a219491
tests: docs; call forcings engine finalize on instance destruction
program-- Mar 14, 2024
50c0ca0
rev: move forcing engine impl out of data provider source
program-- Mar 18, 2024
fcadedb
rev: change trailing return types to traditional; further separate fo…
program-- Mar 18, 2024
0a05e1a
feat: add finalization function for Forcings Engine
program-- Mar 18, 2024
823f985
rev: more use of std::chrono, less of time_t; address rev comments
program-- Mar 25, 2024
37e66df
rev: remove extraneous semicolons
program-- Mar 26, 2024
6ab9405
rev: impl more review comments
program-- Mar 26, 2024
c1f6a42
rev: refactor get_value() for ForcingsEngineDataProvider
program-- Mar 28, 2024
3ae3cde
rev: separate out value storage op into helper function
program-- Apr 2, 2024
4d7b9da
rev: use qualified name for ReSampleMethod parameter in public interface
program-- Apr 2, 2024
d86e162
rev: assertions for forcings engine instance retrieval
program-- Apr 2, 2024
b1eebfb
rev: correctly construct time_points for assertion comparisons
program-- Apr 2, 2024
b141369
rev: break up long line containing duration_cast
program-- Apr 2, 2024
916dadf
rev: further clarity on get_value accumulation statement
program-- Apr 2, 2024
ed58d71
rev: refactor caching mechanism to only hold current and previous tim…
program-- Apr 8, 2024
145be68
docs: add docs for ForcingsEngine::next()
program-- Apr 8, 2024
0a5c4a5
docs: small documentation changes
program-- Apr 29, 2024
0c59a62
refactor: template forcings engine; impl lumped provider [no ci]
program-- May 3, 2024
24464b1
tests: translate tests to new impl; rev: address quick suggestions
program-- May 8, 2024
4329acc
refactor: adjust construction/access patterns
program-- May 8, 2024
5d8820c
refactor: modify divide storage to map instead of vector
program-- May 9, 2024
4b55886
rev: convert GenericDataProvider to type alias
program-- May 9, 2024
fc609d4
rev: remove set_instance swap in favor of move assignment
program-- May 9, 2024
1796ec9
rev: update tests for new impl
program-- May 10, 2024
a6179d4
rev: correctly use duration relative to start for resampling mean
program-- May 10, 2024
471ef26
rev: assert count of output variables is equal to count of expected v…
program-- May 10, 2024
5d6c399
rev: add forcings engine assertion to base constructor
program-- May 10, 2024
632d742
rev: adjust logic of variable checks to correctly check existence
program-- May 10, 2024
0bc1396
rev: optimize indexing; use int overload for ForcingsEngineLumpedData…
program-- May 10, 2024
48e724c
rev: check uniqueness of domain divide IDs using unordered_map
program-- May 14, 2024
8e8fa99
rev: remove assumption that divide ID is prefixed by 'cat-'
program-- May 14, 2024
283282a
fix: correct function prototypes for ForcingsEngineDataProvider (doub…
program-- May 16, 2024
c0d44f0
rev: remove previous timestep caching in lumped forcings engine provider
program-- May 21, 2024
171 changes: 171 additions & 0 deletions include/forcing/ForcingsEngineDataProvider.hpp
@@ -0,0 +1,171 @@
#pragma once

#include <memory>
#include <unordered_map>
#include <chrono>

#include "DataProvider.hpp"
#include "bmi/Bmi_Py_Adapter.hpp"

namespace data_access {

static constexpr auto forcings_engine_python_module = "NextGen_Forcings_Engine";
static constexpr auto forcings_engine_python_class = "NWMv3_Forcing_Engine_BMI_model";
static constexpr auto forcings_engine_python_classpath = "NextGen_Forcings_Engine.NWMv3_Forcing_Engine_BMI_model";
static constexpr auto default_time_format = "%Y-%m-%d %H:%M:%S";

//! Parse a time string according to the given format.
//! Utility function for the ForcingsEngineLumpedDataProvider constructor.
time_t parse_time(const std::string& time, const std::string& fmt);

/**
* Check that requirements for running the forcings engine
* are available at runtime. If requirements are not available,
* then this function throws.
*/
void assert_forcings_engine_requirements();

template<typename DataType, typename SelectionType>
struct ForcingsEngineDataProvider
: public DataProvider<DataType, SelectionType>
{
using data_type = DataType;
using selection_type = SelectionType;
using clock_type = std::chrono::system_clock;

~ForcingsEngineDataProvider() = default;

boost::span<const std::string> get_available_variable_names() override
{
return var_output_names_;
}

long get_data_start_time() override
{
return clock_type::to_time_t(time_begin_);
}

long get_data_stop_time() override
{
return clock_type::to_time_t(time_end_);
}

long record_duration() override
{
return std::chrono::duration_cast<std::chrono::seconds>(time_step_).count();
}

size_t get_ts_index_for_time(const time_t& epoch_time) override
{
const auto epoch = clock_type::from_time_t(epoch_time);

if (epoch < time_begin_ || epoch > time_end_) {
throw std::out_of_range{
"epoch " + std::to_string(epoch.time_since_epoch().count()) +
" out of range of " + std::to_string(time_begin_.time_since_epoch().count()) + ", "
+ std::to_string(time_end_.time_since_epoch().count())
};
}

return (epoch - time_begin_) / time_step_;
}

// Temporary (?) function to clear out instances of this type.
static void finalize_all() {
instances_.clear();
}

/* Remaining virtual member functions from DataProvider must be implemented
by derived classes. */

data_type get_value(const selection_type& selector, data_access::ReSampleMethod m) override = 0;

std::vector<data_type> get_values(const selection_type& selector, data_access::ReSampleMethod m) override = 0;


/* Static instance accessor */
static ForcingsEngineDataProvider* instance(
const std::string& init,
const std::string& time_begin,
const std::string& time_end,
const std::string& time_fmt = default_time_format
)
{
auto& inst = instances_.at(init);
if (inst != nullptr) {
assert(inst->time_begin_.time_since_epoch() == std::chrono::seconds{parse_time(time_begin, time_fmt)});
assert(inst->time_end_.time_since_epoch() == std::chrono::seconds{parse_time(time_end, time_fmt)});
}

return inst.get();
}

protected:

// TODO: It may make more sense to have time_begin_seconds and time_end_seconds coalesced into
// a single argument: `clock_type::duration time_duration`, since the forcings engine
// manages time via a duration rather than time points. !! Need to double check
ForcingsEngineDataProvider(
const std::string& init,
std::size_t time_begin_seconds,
std::size_t time_end_seconds
)
: time_begin_(std::chrono::seconds{time_begin_seconds})
, time_end_(std::chrono::seconds{time_end_seconds})
{

assert_forcings_engine_requirements();

bmi_ = std::make_unique<models::bmi::Bmi_Py_Adapter>(
"ForcingsEngine",
init,
forcings_engine_python_classpath,
/*allow_exceed_end=*/true,
/*has_fixed_time_step=*/true,
utils::getStdOut()
);

time_step_ = std::chrono::seconds{static_cast<int64_t>(bmi_->GetTimeStep())};
var_output_names_ = bmi_->GetOutputVarNames();
}

static ForcingsEngineDataProvider* set_instance(
const std::string& init,
std::unique_ptr<ForcingsEngineDataProvider>&& instance
)
{
instances_[init] = std::move(instance);
return instances_[init].get();
}

//! Instance map
//! @note this map will exist for each of the
//! 3 instance types (lumped, gridded, mesh).
static std::unordered_map<
std::string,
std::unique_ptr<ForcingsEngineDataProvider>
> instances_;

// TODO: this, or just push the scope on time members up?
void increment_time()
{
time_current_index_++;
}

//! Forcings Engine instance
std::unique_ptr<models::bmi::Bmi_Py_Adapter> bmi_ = nullptr;

//! Output variable names
std::vector<std::string> var_output_names_{};

private:
//! Initialization config file path
std::string init_;

clock_type::time_point time_begin_{};
clock_type::time_point time_end_{};
clock_type::duration time_step_{};
std::size_t time_current_index_{};
};

} // namespace data_access
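
The index calculation in `get_ts_index_for_time` can be looked at in isolation. The following is a minimal sketch, not the PR's code: `ts_index_for_time` is a hypothetical free function, and the window bounds and step are passed in as parameters instead of being read from members. It shows that dividing one `std::chrono` duration by another yields a plain integer count, which truncates toward zero.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ctime>
#include <stdexcept>
#include <string>

using clock_type = std::chrono::system_clock;

// Hypothetical free-function version of the time-step index lookup.
// `begin`, `end`, and `step` stand in for the provider's private members.
std::size_t ts_index_for_time(
    std::time_t epoch_time,
    clock_type::time_point begin,
    clock_type::time_point end,
    clock_type::duration step
)
{
    const auto epoch = clock_type::from_time_t(epoch_time);

    // Reject times outside the closed window [begin, end].
    if (epoch < begin || epoch > end) {
        throw std::out_of_range{
            "epoch " + std::to_string(epoch_time) + " is outside the forcing window"
        };
    }

    // duration / duration produces an integer count of whole steps.
    return static_cast<std::size_t>((epoch - begin) / step);
}
```

With a window from 0 s to 7200 s and a one-hour step, 3599 s falls in step 0 and 3600 s in step 1. One difference from the header above: this sketch reports the offending time in seconds, whereas `time_since_epoch().count()` reports ticks of `system_clock::duration`, whose unit is implementation-defined.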
113 changes: 113 additions & 0 deletions include/forcing/ForcingsEngineLumpedDataProvider.hpp
@@ -0,0 +1,113 @@
#pragma once

#include "DataProviderSelectors.hpp"
#include "ForcingsEngineDataProvider.hpp"

#include <utilities/mdarray.hpp>

namespace data_access {

struct ForcingsEngineLumpedDataProvider
: public ForcingsEngineDataProvider<double, CatchmentAggrDataSelector>
{
static constexpr auto bad_index = static_cast<std::size_t>(-1);

~ForcingsEngineLumpedDataProvider() override = default;

double get_value(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override;

std::vector<double> get_values(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override;

/**
* @brief Get the index in `CAT-ID` for a given divide in the instance cache.
* @note The `CAT-ID` output variable uses integer values instead of strings.
*
* @param divide_id A hydrofabric divide ID, e.g. "cat-*"
* @return std::size_t
*/
std::size_t divide_index(const std::string& divide_id) noexcept;

/**
* @brief Get the index of a variable in the instance cache.
*
* @param variable Name of the forcings variable to look up
* @return std::size_t
*/
std::size_t variable_index(const std::string& variable) noexcept;

static ForcingsEngineDataProvider* lumped_instance(
const std::string& init,
const std::string& time_start,
const std::string& time_end,
const std::string& time_fmt = default_time_format
);

private:
ForcingsEngineLumpedDataProvider(
const std::string& init,
std::size_t time_begin_seconds,
std::size_t time_end_seconds
);

/**
* @brief Update to next timestep.
*
* @return true
* @return false
*/
bool next();

/**
* @brief Get a forcing value from the instance
*
* @param divide_id Divide ID to index at
* @param variable Forcings variable to get
* @param previous If true, return the previous timestep values.
* @return double
*/
double at(
const std::string& divide_id,
const std::string& variable,
bool previous = false
);

/**
* @brief Get a forcing value from the instance
*
* @param divide_idx Divide index
* @param variable_idx Variable index
* @param previous If true, return the previous timestep values.
* @return double
*/
double at(
std::size_t divide_idx,
std::size_t variable_idx,
bool previous = false
);

/**
* @brief Update the value storage.
*/
void update_value_storage_();

//! Divide index map
//! (Divide ID) -> (Divide ID Array Index)
std::unordered_map<int, int> var_divides_{};

/**
* Values are stored indexed on (2, divide_id, variable),
* such that the structure can be visualized as:
*
* Divide ID : || D0 | D1 || D0 ...
* Variable : || V1 V2 V3 | V1 V2 V3 || V1 ...
* Value : || 9 11 31 | 3 4 5 || 10 ...
*
* Some notes for future reference:
* - Time complexity to update is approximately O(2*V*D) = O(V*D),
* where V is the number of variables and D is the number of divides.
* In general, D will dominate, and V will be a small constant.
*/
ngen::mdarray<double> var_cache_{};
};

} // namespace data_access
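
The `(2, divide_id, variable)` layout described in the `var_cache_` comment can be illustrated with a flat buffer. This is a sketch under stated assumptions: `ForcingCache` is a hypothetical stand-in backed by `std::vector`, not `ngen::mdarray`, and it assumes row-major ordering with the time slot as the outermost dimension, which matches the diagram but is not confirmed by the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the (2, divide, variable) value cache.
// The PR itself stores values in ngen::mdarray<double>.
struct ForcingCache {
    std::size_t divides;
    std::size_t variables;
    std::vector<double> values;

    ForcingCache(std::size_t divide_count, std::size_t variable_count)
      : divides(divide_count)
      , variables(variable_count)
      , values(2 * divide_count * variable_count, 0.0)
    {}

    // Row-major offset: time slot outermost, variable innermost.
    std::size_t flat_index(std::size_t slot, std::size_t divide, std::size_t variable) const
    {
        return (slot * divides + divide) * variables + variable;
    }

    // Bounds-checked element access.
    double& at(std::size_t slot, std::size_t divide, std::size_t variable)
    {
        return values.at(flat_index(slot, divide, variable));
    }
};
```

For two divides and three variables the buffer holds 12 values; `(slot 1, divide 0, variable 1)` lands at offset 7. Refreshing both slots touches every element once, which is the O(V*D) update cost the comment mentions.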
9 changes: 2 additions & 7 deletions include/forcing/GenericDataProvider.hpp
@@ -6,12 +6,7 @@

 namespace data_access
 {
-class GenericDataProvider : public DataProvider<double, CatchmentAggrDataSelector>
-{
-public:
-
-private:
-};
+using GenericDataProvider = DataProvider<double, CatchmentAggrDataSelector>;
 }

-#endif
+#endif
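
One plausible reason for replacing the empty class with a type alias is that an alias names the same type as the template instantiation, while an empty subclass is a distinct type. The sketch below uses simplified, hypothetical stand-ins (`Selector`, `GenericProviderClass`, `GenericProviderAlias`, `ConstantProvider`) for the real ngen types to show the difference.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-in for the ngen DataProvider interface.
template<typename DataType, typename SelectionType>
struct DataProvider {
    virtual ~DataProvider() = default;
    virtual DataType get_value(const SelectionType& selector) = 0;
};

struct Selector {};  // hypothetical selector type

// Before: an empty subclass, which is a new, distinct type.
struct GenericProviderClass : DataProvider<double, Selector> {};

// After: an alias, which is the very same type as the instantiation.
using GenericProviderAlias = DataProvider<double, Selector>;

// A provider that derives directly from the template instantiation.
struct ConstantProvider : DataProvider<double, Selector> {
    double get_value(const Selector&) override { return 1.5; }
};

static_assert(std::is_same<GenericProviderAlias, DataProvider<double, Selector>>::value,
              "the alias is the template instantiation itself");
static_assert(std::is_base_of<GenericProviderAlias, ConstantProvider>::value,
              "providers derived from the template are usable through the alias");
static_assert(!std::is_base_of<GenericProviderClass, ConstantProvider>::value,
              "the empty subclass is unrelated to such providers");

// Accepts any provider derived from DataProvider<double, Selector>.
double read(GenericProviderAlias& provider)
{
    return provider.get_value(Selector{});
}
```

Under this reading, a provider such as `ForcingsEngineDataProvider<double, CatchmentAggrDataSelector>` can be passed wherever a `GenericDataProvider` is expected without also inheriting from a separate class.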
38 changes: 9 additions & 29 deletions include/forcing/NullForcingProvider.hpp
@@ -3,8 +3,6 @@

 #include <vector>
 #include <string>
-#include <stdexcept>
-#include <limits>
 #include "GenericDataProvider.hpp"

 /**
@@ -14,43 +12,25 @@ class NullForcingProvider : public data_access::GenericDataProvider
 {
 public:

-NullForcingProvider(){}
+NullForcingProvider();

 // BEGIN DataProvider interface methods

-long get_data_start_time() override {
-return 0;
-}
+long get_data_start_time() override;

-long get_data_stop_time() override {
-return LONG_MAX;
-}
+long get_data_stop_time() override;

-long record_duration() override {
-return 1;
-}
+long record_duration() override;

-size_t get_ts_index_for_time(const time_t &epoch_time) override {
-return 0;
-}
+size_t get_ts_index_for_time(const time_t &epoch_time) override;

-double get_value(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override
-{
-throw std::runtime_error("Called get_value function in NullDataProvider");
-}
+double get_value(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override;

-virtual std::vector<double> get_values(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override
-{
-throw std::runtime_error("Called get_values function in NullDataProvider");
-}
+std::vector<double> get_values(const CatchmentAggrDataSelector& selector, data_access::ReSampleMethod m) override;

-inline bool is_property_sum_over_time_step(const std::string& name) override {
-throw std::runtime_error("Got request for variable " + name + " but no such variable is provided by NullForcingProvider." + SOURCE_LOC);
-}
+inline bool is_property_sum_over_time_step(const std::string& name) override;

-boost::span<const std::string> get_available_variable_names() override {
-return {};
-}
+boost::span<const std::string> get_available_variable_names() override;
 };

 #endif // NGEN_NULLFORCING_H
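
The change to this header replaces inline bodies with declarations, presumably moving the bodies into a source file that is not shown here. A minimal sketch of that split follows, using a hypothetical `NullProvider` class with no base class. The return values mirror the removed inline bodies, and the declaration and definition sections are shown in one listing only so the example is self-contained.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>
#include <string>

// --- would live in the header: declarations only ---
// Hypothetical, simplified stand-in for NullForcingProvider.
class NullProvider {
public:
    long get_data_start_time();
    long get_data_stop_time();
    long record_duration();
    double get_value(const std::string& name);
};

// --- would live in the source file: out-of-line definitions ---
long NullProvider::get_data_start_time() { return 0; }

long NullProvider::get_data_stop_time() { return LONG_MAX; }

long NullProvider::record_duration() { return 1; }

// A null provider has no data, so any value request is an error.
double NullProvider::get_value(const std::string& name)
{
    throw std::runtime_error("Called get_value for '" + name + "' in NullProvider");
}
```

Keeping only declarations in the header means it no longer needs `<stdexcept>` for its own bodies, which is consistent with the include removals in the first hunk.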