Adds function for checking if model uses rng in transform data #868
Summary
This adds a function to the model for checking whether the model uses any `_rng(..)` functions inside of transformed data.
Description
For the parallel cmdstan PR (running multiple chains in one cmdstan program) there is currently one model that is shared across all of the threads. This is nice whenever we are doing sampling etc., but when doing SBC, where the data is created with rngs in transformed data, it means that even though you're running N chains you're really just using one set of data, so there's not much point in using more than 1 chain at all. I think that's counterintuitive from a user's perspective. So when we make the model we need some way to know "Is an rng used in transformed data?". If so, then we need to make a model for each thread so that each thread's model uses a different rng/seed and different transformed data.
So to do that I added a new function to the model that returns a boolean for whether or not an rng function is used in transformed data.
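To make the intended flow concrete, here is a minimal C++ sketch of "build one model, ask it, then build more only if needed". Everything except the name `is_rng_in_transform_data()` is an assumption for illustration: `model_base`, `example_model`, `make_models`, and the `seed + i` seeding scheme are hypothetical stand-ins, not the actual Stan or cmdstan API.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the Stan model base class.
struct model_base {
  virtual ~model_base() = default;
  // True when transformed data calls at least one _rng function.
  virtual bool is_rng_in_transform_data() const = 0;
};

// Hypothetical generated model. In a real model the flag would be
// emitted by the compiler rather than passed in by hand.
struct example_model : model_base {
  example_model(unsigned int seed, bool uses_rng)
      : seed_(seed), uses_rng_(uses_rng) {}
  bool is_rng_in_transform_data() const override { return uses_rng_; }
  unsigned int seed_;
  bool uses_rng_;
};

// Build the models for n_chains chains. When transformed data is
// deterministic every chain shares the first model; otherwise each
// additional chain gets its own model with a different seed.
std::vector<std::shared_ptr<model_base>> make_models(std::size_t n_chains,
                                                     unsigned int seed,
                                                     bool uses_rng) {
  std::vector<std::shared_ptr<model_base>> models;
  if (n_chains == 0) {
    return models;
  }
  auto first = std::make_shared<example_model>(seed, uses_rng);
  models.push_back(first);
  for (std::size_t i = 1; i < n_chains; ++i) {
    if (first->is_rng_in_transform_data()) {
      // Assumed seeding scheme: offset the base seed by the chain index.
      models.push_back(std::make_shared<example_model>(
          static_cast<unsigned int>(seed + i), uses_rng));
    } else {
      models.push_back(first);
    }
  }
  return models;
}
```

With this shape, `make_models(4, 1u, false)` would hand back four pointers to the same model, while `make_models(4, 1u, true)` would hand back four distinct models, so the extra construction cost is only paid when transformed data actually depends on an rng.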
We can make a virtual function for this on the Stan C++ side. Then when running the program we will construct one model, check `is_rng_in_transform_data()`, and if it returns true make (N - 1) new models so that each thread has its own.
Right now the function returns either `0` for no rngs or a `static_cast<bool>({# of rng functions})`, which is just nice for testing and sanity checking.
To find the rngs I added `query_stmt_functions` and `query_expr_functions`. These return a list of optional types that we then count over to get the number of rngs. It's not terribly efficient, but I thought they might be more extensible for other uses if they returned a list instead of just a single value from something like `find_map`. Though if these seem useful we could add an optional parameter that lets the user define the map function.
tbh these functions kind of feel like I brute forced the problem, like in the programming memes. But testing them on a sample Stan program, they do seem to work thoroughly.
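The "return a list of optionals, then count" idea can be sketched as follows. The real functions live in the compiler and their signatures are not shown in this description, so this is only an illustration in C++: the `expr` type, `query_expr_calls`, `keep_rng`, and `count_rngs` are all hypothetical names, and matching on an `_rng` name suffix is an assumed way of spotting rng calls.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical miniature expression tree: a node is either a function
// call (non-empty fun_name) with arguments, or a leaf such as a literal.
struct expr {
  std::string fun_name;    // empty for non-call nodes
  std::vector<expr> args;  // sub-expressions
};

using query = std::function<std::optional<std::string>(const std::string&)>;

// Apply f to every function call in e, depth first, and append each
// result to out. Calls that do not match leave a std::nullopt entry.
void query_expr_calls(const expr& e, const query& f,
                      std::vector<std::optional<std::string>>& out) {
  if (!e.fun_name.empty()) {
    out.push_back(f(e.fun_name));
  }
  for (const expr& arg : e.args) {
    query_expr_calls(arg, f, out);
  }
}

// Query that keeps only function names ending in "_rng".
std::optional<std::string> keep_rng(const std::string& name) {
  const std::string suffix = "_rng";
  if (name.size() >= suffix.size() &&
      name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return name;
  }
  return std::nullopt;
}

// Count the rng calls by counting the non-empty entries of the list.
std::size_t count_rngs(const expr& e) {
  std::vector<std::optional<std::string>> found;
  query_expr_calls(e, keep_rng, found);
  std::size_t n = 0;
  for (const auto& item : found) {
    if (item.has_value()) {
      ++n;
    }
  }
  return n;
}
```

Because the traversal takes the query as a parameter, the same walk could be reused with a different predicate, which is the extensibility trade-off described above: the full list costs more than stopping at the first match, but the caller decides what to do with it.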
@rok-cesnovar These might also be useful for the new matrix type and opencl since they let us query whether a function in an expr or statement satisfies some condition (like being in a table of supported functions).
Testing
One issue is that idk how to write expect tests for these functions. Is there an example somewhere of taking in `Expr.Fixed()` types and printing out something to show it's correct? I think I'm having the most trouble with the printing part. Instead of expect tests I wrote a Stan model that uses a bunch of rngs in weird places to try to capture edge cases. The number of rngs reported by the function seems to match the number in transformed data, so I think I got everything.
Release notes
Adds function for checking if model uses rng in transform data
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)
Steve Bronder