feat!: add parent symbol tables #680

niklasdewally · 2025-02-17T11:34:06Z

Change SymbolTable to hold a pointer to a parent symbol table.

This required a large-scale refactor of the symbol-table API, and its
calling code due to the use of Rc and RefCell. See SYMBOL TABLE.
These changes broke the de/serialisation of the model using serde;
therefore, this commit also adds de/serialisation helpers for
Rc<RefCell<>> types. See DE/SERIALISATION.

This commit is the first in a series adding lexical scoping to
Conjure-Oxide. Planned commits are as follows:

Add Submodel. This will represent a particular lexical scope,
containing the local symbol table for that scope and its
constraints. Model will become a wrapper over Submodel, allowing
it to act as a one while also containing global model information
(such as dominance information).
Add the Scope expression type to represent sub-models inside a
model.
Make the rewriter keep track of the sub-model it is in and the
symbol table for that model. This will allow rules to run inside a
scope, acting on the local variables defined in its symbol table.

SYMBOL TABLE

Using the parent pointer, a symbol table for a scope can lookup all
declarations in its enclosing scopes.

A symbol table first looks at its declarations to see if the given name
is defined in itself. If not, As the most-local declaration for a name
is returned, variable shadowing works as expected.

As parent symbol tables must be pointed to by all of their children,
symbol-tables are now contained in Rc<RefCell<SymbolTable>> pointers.

Due to the borrow and RAII semantics of Rc and RefCell, this made an
API that returns values as borrows no longer possible. Things that used
to return borrows now return Refs or Rcs. The new intermediate type,
Declaration was added primarily to help with borrow checking and Rc
RefCell related issues.

DE/SERIALISATION

Serde does not support Rc types by default. There is a feature flag,
rc to enable support, however this does not retain shared pointers:

Serializing a data structure containing reference-counted pointers
will serialize a copy of the inner value of the pointer each time a
pointer is referenced within the data structure. Serialization will
not attempt to deduplicate these repeated data.

Deserializing a data structure containing reference-counted pointers
will not attempt to deduplicate references to the same data. Every
deserialized pointer will end up with a strong count of 1.

-- https://serde.rs/feature-flags.html

Having parent pointers point to the same symbol table is important in
our case, so this is inadequate.

This commit adds two de/serialisation helpers, RcRefCellAsId and
RcRefCellAsInner. These can be used for Rc<RefCell<>> fields using
serde_as.

Types that use these helpers must implement the HasId and
DefaultFromId traits.

HasId requires that a type has a function id() that returns a unique
id for every distinct instance of that type in memory.
DefaultFromId allows creating an instance of a type by just it's
id.

For symbol tables, we generate these using a static atomic. We also add
a custom implementation of clone so that cloning a symbol table changes
its id. Thus, two symbol tables have the same ids iff they have the same
memory address.

The helpers do the following:

RcRefCellAsInner works similarly to serde's rc flag, storing a copy
of the data inside the RcRefCell<<>>, and deserialising it without
deduplication.
RcRefCellAsId stores just the id on serialisation, and on
de-serialisation creates a dummy object the id using DefaultFromId.

For symbol tables, we plan to store a full version of each in its
submodel using RcRefCellAsInner, and store the parent pointers as ids
using RcRefCellAsId.

Post-deserialisation, the shared pointers can be restored by
deduplicating pointers to objects with a given id with one to a shared
object. Note that it is up to the caller code to do this
post-deserialization step, otherwise this works identically to serdes
rc flag.

To enforce this post-deserialisation step in Conjure Oxide, this commit
adds a new type SerdeModel. Only a SerdeModel is de/serialisable,
not a Model. To use a SerdeModel, it must be converted to a Model
by calling SerdeModel::initialize(), which will do this
post-deserialisation step. This will be implemented fully in a later
commit, once multiple submodels are representable inside the model.

This initialisation step also fixes the setting context pointer in
deserialised models.

Change `SymbolTable` to hold a pointer to a parent symbol table. + This required a large-scale refactor of the symbol-table API, and its calling code due to the use of `Rc` and `RefCell`. See SYMBOL TABLE. + These changes broke the de/serialisation of the model using `serde`; therefore, this commit also adds de/serialisation helpers for `Rc<RefCell<>>` types. See DE/SERIALISATION. This commit is the first in a series adding lexical scoping to Conjure-Oxide. Planned commits are as follows: + Add `Submodel`. This will represent a particular lexical scope, containing the local symbol table for that scope and its constraints. `Model` will become a wrapper over `Submodel`, allowing it to act as a one while also containing global model information (such as dominance information). + Add the `Scope` expression type to represent sub-models inside a model. + Make the rewriter keep track of the sub-model it is in and the symbol table for that model. This will allow rules to run inside a scope, acting on the local variables defined in its symbol table. SYMBOL TABLE Using the parent pointer, a symbol table for a scope can lookup all declarations in its enclosing scopes. A symbol table first looks at its declarations to see if the given name is defined in itself. If not, As the most-local declaration for a name is returned, variable shadowing works as expected. As parent symbol tables must be pointed to by all of their children, symbol-tables are now contained in `Rc<RefCell<SymbolTable>>` pointers. Due to the borrow and RAII semantics of `Rc` and `RefCell`, this made an API that returns values as borrows no longer possible. Things that used to return borrows now return `Ref`s or `Rc`s. The new intermediate type, `Declaration` was added primarily to help with borrow checking and `Rc` `RefCell` related issues. DE/SERIALISATION Serde does not support `Rc` types by default. There is a feature flag, `rc` to enable support, however this does not retain shared pointers: > Serializing a data structure containing reference-counted pointers > will serialize a copy of the inner value of the pointer each time a > pointer is referenced within the data structure. Serialization will > not attempt to deduplicate these repeated data. > > Deserializing a data structure containing reference-counted pointers > will not attempt to deduplicate references to the same data. Every > deserialized pointer will end up with a strong count of 1. -- https://serde.rs/feature-flags.html Having parent pointers point to the same symbol table is important in our case, so this is inadequate. This commit adds two de/serialisation helpers, `RcRefCellAsId` and `RcRefCellAsInner`. These can be used for `Rc<RefCell<>>` fields using `serde_as`. Types that use these helpers must implement the `HasId` and `DefaultFromId` traits. + `HasId` requires that a type has a function `id()` that returns a unique id for every distinct instance of that type in memory. + `DefaultFromId` allows creating an instance of a type by just it's id. For symbol tables, we generate these using a static atomic. We also add a custom implementation of clone so that cloning a symbol table changes its id. Thus, two symbol tables have the same ids iff they have the same memory address. The helpers do the following: + `RcRefCellAsInner` works similarly to serde's `rc` flag, storing a copy of the data inside the `RcRefCell<<>>`, and deserialising it without deduplication. + `RcRefCellAsId` stores just the id on serialisation, and on de-serialisation creates a dummy object the id using `DefaultFromId`. For symbol tables, we plan to store a full version of each in its submodel using `RcRefCellAsInner`, and store the parent pointers as ids using `RcRefCellAsId`. Post-deserialisation, the shared pointers can be restored by deduplicating pointers to objects with a given id with one to a shared object. Note that it is up to the caller code to do this post-deserialization step, otherwise this works identically to serdes `rc` flag. To enforce this post-deserialisation step in Conjure Oxide, this commit adds a new type `SerdeModel`. Only a `SerdeModel` is de/serialisable, not a `Model`. To use a `SerdeModel`, it must be converted to a `Model` by calling `SerdeModel::initialize()`, which will do this post-deserialisation step. This will be implemented fully in a later commit, once multiple submodels are representable inside the model. This initialisation step also fixes the setting context pointer in deserialised models.

github-actions · 2025-02-17T11:44:12Z

Code and Documentation Coverage Report

Documentation Coverage

This PR:

conjure_core: 3% with examples, 55% documented -- 8/144/271
conjure_oxide: 0% with examples, 19% documented -- 0/4/36
minion_rs: 30% with examples, 82% documented -- 2/23/112
tree-morph: 42% with examples, 100% documented -- 8/22/22

Main:

conjure_core: 4% with examples, 51% documented -- 8/119/239
conjure_oxide: 0% with examples, 19% documented -- 0/4/36
minion_rs: 30% with examples, 82% documented -- 2/23/112
tree-morph: 42% with examples, 100% documented -- 8/22/22

View full documentation coverage for main, this PR

Code Coverage Summary

This PR: Detailed Report

  functions..: 65.8% (644 of 979 functions)
  branches...: no data found

Main: Detailed Report

  functions..: 66.9% (628 of 939 functions)
  branches...: no data found

Coverage Main & PR Coverage Change

Functions coverage changed by -1.10% and covered lines changed by 16
Branches... coverage: No comparison data available

ozgurakgun · 2025-02-17T11:45:48Z

LGTM modulo the hygiene check.

cc @lixitrixi @gskorokhod @YehorBoiar @Soph1514 for info

Add `Submodel`, and refactor `Model` to be a wrapper over `Submodel`. This commit is part of on-going work to add lexical scope to Conjure Oxide and is a follow up to 8636432 (feat!: add parent symbol tables (#680), 2025-02-17). DETAILS A `Submodel` represents a particular scope and holds the symbol-table and constraints tree for that scope. This commit refactors `Model` to be a wrapper over `Submodel`, and removes methods operating on constraints and the symbol table from `Model`, placing them in `Submodel` instead. A `Model` can be borrowed as a `Submodel` using `as_submodel() and `as_submodel_mut()`. The language semantics of a top level model and a sub-model are identical, so treating it as `Submodel` in most cases is valid. `Model` is a separate type than `Submodel` for the following reasons: + It will hold global-only information in the future, such as dominance constraints. + It holds a pointer to the context. + We need special initialisation and de-serialisation logic for the top level model that we do not want for `Submodel`. See `SerdeModel`.

Add `Submodel`, and refactor `Model` to be a wrapper over `Submodel`. This commit is part of on-going work to add lexical scope to Conjure Oxide and is a follow up to 8636432 (feat!: add parent symbol tables (#680), 2025-02-17). DETAILS A `Submodel` represents a particular scope and holds the symbol-table and constraints tree for that scope. This commit refactors `Model` to be a wrapper over `Submodel`, and removes methods operating on constraints and the symbol table from `Model`, placing them in `Submodel` instead. A `Model` can be borrowed as a `Submodel` using `as_submodel()` and `as_submodel_mut()`. The language semantics of a top level model and a sub-model are identical, so treating it as `Submodel` in most cases is valid. `Model` is a separate type than `Submodel` for the following reasons: + It will hold global-only information in the future, such as dominance constraints. + It holds a pointer to the context. + We need special initialisation and de-serialisation logic for the top level model that we do not want for `Submodel`. See `SerdeModel`.

niklasdewally requested a review from ozgurakgun February 17, 2025 11:34

niklasdewally self-assigned this Feb 17, 2025

fixup! feat!: add parent symbol tables

5e6fcac

ozgurakgun approved these changes Feb 17, 2025

View reviewed changes

niklasdewally merged commit 8636432 into main Feb 17, 2025
13 of 14 checks passed

niklasdewally deleted the nik/scoping/refactor-symbol-table branch February 17, 2025 11:53

niklasdewally mentioned this pull request Feb 17, 2025

feat!: add Submodel #682

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat!: add parent symbol tables #680

feat!: add parent symbol tables #680

niklasdewally commented Feb 17, 2025

github-actions bot commented Feb 17, 2025

ozgurakgun commented Feb 17, 2025

feat!: add parent symbol tables #680

feat!: add parent symbol tables #680

Conversation

niklasdewally commented Feb 17, 2025

github-actions bot commented Feb 17, 2025

Code and Documentation Coverage Report

Documentation Coverage

Code Coverage Summary

Coverage Main & PR Coverage Change

ozgurakgun commented Feb 17, 2025