Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bmi array support #405

Merged
merged 17 commits into from
Jul 11, 2022
Merged

Bmi array support #405

merged 17 commits into from
Jul 11, 2022

Conversation

hellkite500
Copy link
Member

This PR allows a BMI module to provide an array of values as a single 1D output which can then be passed/consumed by another BMI module in a Multi BMI configuration.

Additions

  • get_values function added to ForcingProvider, with a default implementation
  • BMI Module Formulation custom override of ForcingProvider::get_values
  • get_converted_values function added to unit helper for unit conversion of an array of values
  • BMI utilities module with function to cast and copy values based on string type tags
  • Test module for testing setting/getting array of values across two C++ BMI modules

Removals

  • Deprecated templated GetValue( std::string & ) overload from BMI adapters. Functionality moved to BMI utilities free function.

Changes

  • check_internal_providers now returns a vector of values instead of a single value
  • add new array variable to c++ bmi test model
  • updated implementation set_model_inputs_prior_to_update which checks for array like values and handles them accordingly.

Testing

  1. New functionality tested, all existing tests passing locally

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Target Environment support

  • Linux
  • MacOS

@hellkite500 hellkite500 force-pushed the bmi-array-support branch 2 times, most recently from 69f1a76 to 462ec6c Compare April 5, 2022 18:51
Copy link
Contributor

@donaldwj donaldwj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit conversion code will leak memory on exceptions. As we don't always treat that as fatal error this needs to be addressed. The other comments are just places where I would like design choices explained.

.github/workflows/test_and_validate.yml Show resolved Hide resolved

private:
static double* get_converted_values(const std::string &in_units, double* values, const std::string &out_units, const size_t & count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to work on raw arrays. As defined this will change random memory if 'count' is not correctly set to the size of the array. Also returning a value is misleading as we are mutating the call argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is designed to work with BMI array inputs and outputs, I imagine a version to work on raw arrays is wanted. We face the same risk of count being incorrect in all those BMI calls, so there's little lost here.

However... I didn't realize we were mutating the input... so if we used this on the result of get_value_ptr()...that could be bad. Ideally I'd say this should not mutate, or that another function (e.g. static void convert_values_in_place(...)) would be better to have for the use case where we explicitly want to do that. Unsure what the mem mgmt semantics would be in C style if we return a copy... malloc and assume/document that the caller is now responsible for freeing? Require the caller to allocate and pass a new buffer as an additional parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a thin wrapper around the UDUNITS function that works on arrays cv_convert_doubles(conv, values, count, values);

double* UnitsHelper::get_converted_values(const std::string &in_units, double* values, const std::string &out_units, const size_t& count)
{
    if(in_units == out_units){
        return values; // Early-out optimization
    }
    std::call_once(unit_system_inited, init_unit_system);
    ut_unit* to = NULL;
    ut_unit* from = NULL;
    cv_converter* conv = get_converter(in_units, out_units, to, from);

    cv_convert_doubles(conv, values, count, values);
    ut_free(from);
    ut_free(to);
    cv_free(conv);
    return values;
}

And returning the value (reference to the mutated input variable) is just syntactic convenience and follows the potentially expected pattern as the regular get_converted_value which returns the converted value. It could be said that NOT returning it is misleading in that context. I implemented it this way to be flexible, and it is a somewhat common pattern when working with mutable args behind a pointer.

if (availableData.empty() || availableData.find(output_name) == availableData.end()) {
throw runtime_error(get_formulation_type() + " cannot get output values for unknown " + output_name + SOURCE_LOC);
}
return availableData[output_name]->get_values(CatchmentAggrDataSelector("",output_name, init_time, duration_s, output_units), m);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new selector, this appears identical to the input selector?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question... I think this is essentially copy-pasted from the get_value implementation above, which does the same... possibly an artifact of an original intent to ensure that a BmiDataSelector was used, but as discussed there's a slight performance benefit to doing it this way. Not sure what's "best" for both these cases... in theory, an actual CatchmentAggrDataSelector could get passed in with a non-empty catchment id... this code ensures that there is not one downstream....that is one difference, but I'm not sure if that's a good thing or bad thing now, on reflection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattw-nws any thoughts, I believe this was one of your additions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that would have been the result of least change policy when adding data selectors, In any case there is not reason to do that.

std::string type = model.get_analogous_cxx_type(model.GetVarType(name), item_size);

//C++ form of malloc
void* data = ::operator new(total_mem);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe attach smart pointer to this to make sure that memory is not leaked in the case of an exception. We have in general avoided new/delete malloc/free as much as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is partly due to getting a smart pointer of the correct size which also properly deletes the correct amount of memory. A void* of the correct size is required by BMI to populate the array in GetValue. I recall trying to do this with a smart pointer and having issues with it, but I can't remember off the top of my head exactly what the complication was. It very well may have been getting it typed as void* with the correct number of bytes allocated.

The use of manual allocation/cleanup here is at least isolated to this function, and we can fix the potential leak in the exception throwing. If we can do this sufficiently with a smart pointer, we can change this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. Maybe you ran into a problem using unique_ptr<void>? Maybe I'll play with this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the memory link anytime there was an error, which was my only concern with the original request.

" as " + boost::typeindex::type_id<T>().pretty_name() + ": no logic for converting variable type " + type);
}
//clean up the temporary data
::operator delete(data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this position we will get a memory leak on the above exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this is a fatal error at this point, but in case it is caught elsewhere, we could delete before we throw as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add the delete[] before the throw, no reason to leave a potential leak, even on something assumed to be fatal, if we dont need to.

Copy link
Contributor

@mattw-nws mattw-nws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with it at this point. @donaldwj see if your change requests have been addressed.

std::string type = model.get_analogous_cxx_type(model.GetVarType(name), item_size);

//C++ form of malloc
void* data = ::operator new(total_mem);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the memory link anytime there was an error, which was my only concern with the original request.

@mattw-nws mattw-nws merged commit 24d1658 into NOAA-OWP:master Jul 11, 2022
@mattw-nws mattw-nws mentioned this pull request Jun 5, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants