Adds function for checking if model uses rng in transform data #868

SteveBronder · 2021-04-02T23:42:33Z

Summary

This adds a function to the model for checking whether the model uses any _rng(..) functions inside of transformed data.

Description

For the parallel cmdstan PR (running multiple chains in one cmdstan program) there is currently one model that is shared across all of the threads. This is nice whenever we are doing sampling etc., but when doing SBC, where the data is created with rngs in transformed data, it means that even though you're running N chains you're really just using one set of data, so there's not much point in using more than one chain. I think that's counterintuitive from a user's perspective. So when we make the model we need some way to know "Is an rng used in transformed data?". If so, then we need to make a model for each thread so that each thread's model uses a different rng/seed and different transformed data.

So to do that I added a new function to the model that returns a boolean for whether or not an rng function is used in transformed data.

  inline bool is_rng_in_transform_data() const noexcept {
    return static_cast<bool>({# of RNGs in TD});
  } // is_rng_in_transform_data()

We can make a virtual function for this on the Stan C++ side. Then when running the program we will construct one model, check whether is_rng_in_transform_data() returns true, and if so make (N - 1) new models so that each thread has its own.

Right now the function returns either 0 when there are no rngs or static_cast<bool>({# of rng functions}) otherwise, which is just nice for testing and sanity checking.
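A minimal sketch of how the calling side might use this check, assuming a stand-in model type. Here toy_model, its seed field, and make_models are hypothetical names used only for illustration; they are not the generated model class or the cmdstan code, and the base_seed + i seeding scheme is an assumption.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a generated Stan model. Only the query
// added in this PR is modeled; the fields are illustrative.
struct toy_model {
  unsigned int seed;    // seed used when building transformed data
  bool uses_rng_in_td;  // what the generated query would report

  toy_model(unsigned int seed_, bool uses_rng)
      : seed(seed_), uses_rng_in_td(uses_rng) {}

  inline bool is_rng_in_transform_data() const noexcept {
    return uses_rng_in_td;
  }
};

// Build the models for n_chains chains: a single shared model when
// transformed data is deterministic, otherwise one model per chain,
// each with a distinct seed (base_seed + chain index, an assumption).
inline std::vector<std::unique_ptr<toy_model>> make_models(
    std::size_t n_chains, unsigned int base_seed, bool uses_rng) {
  std::vector<std::unique_ptr<toy_model>> models;
  if (n_chains == 0) {
    return models;
  }
  // Always construct the first model, then ask it the question.
  models.push_back(std::make_unique<toy_model>(base_seed, uses_rng));
  if (models.front()->is_rng_in_transform_data()) {
    // Make the remaining (N - 1) models so every chain has its own.
    for (std::size_t i = 1; i < n_chains; ++i) {
      models.push_back(std::make_unique<toy_model>(
          base_seed + static_cast<unsigned int>(i), uses_rng));
    }
  }
  return models;
}
```

With four chains, make_models(4, 1, false) holds one shared model, while make_models(4, 1, true) holds four models with different seeds.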

To find the rngs I added query_stmt_functions and query_expr_functions, which each take in

A functor to select a subset of a function's types,
A functor to check whether the function satisfies a given condition (like having "_rng" in the name),
A Fixed Stmt or Expr, respectively

These return a list of optional types that we then count over to get the number of rngs. It's not terribly efficient, but I thought they might be more extensible for other uses if they returned a list instead of just a single value from something like find_map. Though if these seem useful we could add an optional parameter that lets the user define the map function.
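To make the shape of these query functions concrete, here is a rough C++ analogy on a toy expression tree. It is not the stanc3 implementation (which is OCaml): the expr struct, fn_kind enum, and count_rngs helper are hypothetical, a null pointer stands in for an empty optional, and treating "_rng" as a name suffix rather than a substring is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical function kinds the selector can filter on.
enum class fn_kind { stan_lib, user_defined };

// Toy expression node: a function application with nested arguments.
struct expr {
  fn_kind kind;
  std::string name;
  std::vector<expr> args;
};

using select_fn = std::function<bool(fn_kind)>;
using check_fn = std::function<bool(const std::string&)>;

// Walk the tree and append one entry per node: a pointer to the node
// if it is selected and passes the check, nullptr otherwise.
inline void query_expr_functions(const select_fn& select,
                                 const check_fn& check, const expr& e,
                                 std::vector<const expr*>& out) {
  out.push_back(select(e.kind) && check(e.name) ? &e : nullptr);
  for (const expr& arg : e.args) {
    query_expr_functions(select, check, arg, out);
  }
}

// True when the name ends in "_rng".
inline bool ends_with_rng(const std::string& name) {
  const std::string suffix = "_rng";
  return name.size() >= suffix.size()
         && name.compare(name.size() - suffix.size(), suffix.size(),
                         suffix) == 0;
}

// Count the non-empty entries, i.e. the number of rng calls found.
inline std::size_t count_rngs(const expr& e) {
  std::vector<const expr*> hits;
  query_expr_functions([](fn_kind) { return true; }, ends_with_rng, e,
                       hits);
  std::size_t n = 0;
  for (const expr* h : hits) {
    if (h != nullptr) {
      ++n;
    }
  }
  return n;
}
```

Returning every entry and counting afterwards mirrors the trade-off described above: it does more work than stopping at the first match, but the same list could serve other queries.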

tbh these functions kind of feel like I brute forced the problem, like the programming memes where you see someone write

bool is_even(int x) {
  if (x == 0) return true;
  if (x == 1) return false;
  if (x == 2) return true;
  // ....
}

But testing them on a sample Stan program, they do seem to work.

@rok-cesnovar These might also be useful for the new matrix type and opencl, since they let us query whether a function in an expr or statement satisfies some condition (like being in a table of supported functions).

Testing

One issue is that idk how to write expect tests for these functions. Is there an example somewhere of taking in Expr.Fixed() types and printing out something to show it's correct? I think I'm having the most trouble with the printing part. Instead of expect tests I wrote a Stan model that uses a bunch of rngs in weird places to try to capture edge cases. The number of rngs reported by the function seems to match the number in transformed data, so I think I got everything.

Release notes

Adds function for checking if model uses rng in transform data

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)
Steve Bronder

SteveBronder · 2021-04-05T12:31:09Z

We ended up not needing this

SteveBronder added 3 commits April 2, 2021 18:37

Adds function to model that returns whether any rng functions were used in the transformed data block

0b3f6b5

tidy up

2ed0373

cleanup the docs

7e06677

SteveBronder mentioned this pull request Apr 2, 2021

Running multiple chains in one Stan program stan-dev/cmdstan#987

Merged

2 tasks

SteveBronder added 5 commits April 2, 2021 19:55

remove some lets

d11c3cd

dune format

3edd26c

remove subset_function since it was redundant

b2ff216

Merge remote-tracking branch 'upstream/master' into feature/count-rng-in-td

7189bf9

fix wrt changes for funapp in master

6681e2f

SteveBronder closed this Apr 5, 2021
