Tidy3dBaseModel mutability #377
Replies: 1 comment 3 replies
-
IntroThanks for writing this up, it is good to have an ongoing discussion on this topic. Of course, we should always make the best design decision given our needs. I think there are good arguments for taking a more immutable approach. However, I feel like there are many good reasons for keeping mutability, beyond “saving a few lines”. There are reasons dicts, lists, sets, numpy arrays, etc. are mutable objects. It tends to make them much easier to work with. This is also a very major change, before we make such a big change, we should continue discussing and make sure we fully understand the ramifications. A few facts (as far as I understand them) to agree on before moving onto the details:
Hash vs. Private Attributes
Backwards compatibility
I guess we could revisit this issue. A few things:
class MyModel(Tidy3dBaseModel):
old_field = Field() # the one we are phasing out
new_field = Field() # the one we'll use exclusively soon
@pydantic.validator("old_field")
def warn_and_convert(cls, val, values):
"""set the values['new_field']"""
log.warning('old_field will not be supported')
new_val = some_conversion(val)
values['new_val'] = new_val
return val # or None Bugs
I agree that there are some potential issues. But isn’t this true in python generally? Both of your example seem like things that could be done in any python context. Why do we want to protect the user from the normal workings of the language? The first example seems like it will be very easy to discover and fix. The second example seems quite unlikely to occur. I’m far more worried about us (the developers) making some mistake by assuming the Simulations are immutable when they have been changed. For example, using a cached property when some of the data has been modified (ie a polygon was dilated). These bugs seem much harder to identify. I guess there’s always going to be a tradeoff between performance and correctness. My original design decision (models are mutable, use properties often) was to make sure everything was always “correct”, even when the model was changed. The immutability move scares me because it seems impossible to fully enforce, and means we are introducing the potential for bugs that we have no need to worry about currently. Benefits of Mutability
I feel like this is a bit of an oversimplification of the benefits. As just one example, usually a more convenient but also more robust way to construct variants of the same simulation. Consider: sim = Simulation(...)
# many cells / lines later, I need to make one or more copies of the same simulation with a different run_time or some other fields.
# copy and paste the init method, what if I add`subpixel` argument to the call above? I need to remember to propagate this to all Simulation() calls.
sim2 = Simulation(...)
# wrap everything in a function, this is cumbersome and all the call signatures need to be propagated if, say, a kwarg is added.
sim2 = make_sim(run_time)
# much more convenient and always works, even if original sim is modifie.
sim2 = sim.copy()
sim2.run_time = ... This pattern shows up all the time: optimization loops, adjoint gradient, resolution / parameter scanning. Enforcing immutability
Yea this also worries me the most as I’ve probably made clear. I dont even know how to begin with the numpy stuff. There is a way to set them as unwritable, but it's still not robust because any view into the data in an unwritable array can still allow mutation. We’d likely need to cast all numpy arrays to tuples and then cast them back to numpy arrays in the validators and methods, which will be annoying to work with and potentially have its own performance issues when dealing with large data (custom sources?). Wrapping upAnyway, I do recognize the advantages of taking an immutable approach.
However, there are almost mirroring advantages to mutability.
I’m probably still team “mutability” unless the following conditions can be met:
As a side note: maybe we can compromise a bit and take a stab at a subclassed |
Beta Was this translation helpful? Give feedback.
-
I think we should have another serious discussion on whether we want to keep models mutable. I can offer three things I think would work better if they were immutable.
_hash
to try to speed things up when caching properties may not be as successful as we were hoping in the case of large models, where the hash function itself can be quite costly. Furthermore there are still issues with making it recursively work with all models. If models were immutable instead, we could just computeself._cached_property
once upon init and then just always return that uponself.property
.A more contrived but still possible scenario:
As far as I can see there are two main arguments to keep mutability.
This same principle can be applied to numpy arrays, data arrays, doesn't matter. One problem I can envision is that the documentation generation function may have to be modified a bit. Another one is that I'm not sure how the validators work exactly in this case, but I'm sure plenty of people have made immutable pydantic models and there will be plenty of information how to make it work.
Beta Was this translation helpful? Give feedback.
All reactions