Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Dictionary::get unsoundness and add a few convenience functions #748

Merged
merged 1 commit into from
Jun 15, 2021

Conversation

Waridley
Copy link
Contributor

@Waridley Waridley commented Jun 9, 2021

For some strange reason godot_dictionary_get calls Dictionary::operator[] instead of Dictionary::get, which ends up inserting Nil if the key doesn't already exist in the dictionary.
https://github.com/godotengine/godot/blame/3.3/modules/gdnative/gdnative/dictionary.cpp#L126
This means it is unsound to call godot_dictionary_get on a Dictionary<Shared>. This PR re-implements godot-rust's Dictionary::get in terms of godot_dictionary_get_with_default (the behavior originally intended and likely expected by most Rust users), and adds Dictionary::get_or_insert_nil to provide the original behavior if it is indeed desired, but only on Dictionary<T: LocalThreadAccess> and an unsafe counterpart for Dictionary<Shared>, like the other mutating functions on Dictionary.

I also updated the documentation of get_ref and get_mut_ref since they call operator[]. Perhaps they should also be moved to non-shared access, but I figured that's less important since they're unsafe anyway?

I also added try_get and get_non_nil while I was at it, to simplify checking if a key exists and checking that its corresponding value is not Nil respectively.

@Bromeon
Copy link
Member

Bromeon commented Jun 9, 2021

There are three methods which are very similar:

fn get_or<K, D>(&self, default: D, key: K) -> Variant;
fn try_get<K>(&self, key: K) -> Option<Variant>;
fn get_non_nil<K>(&self, key: K) -> Option<Variant>;

The difference between try_get() and get_non_nil() are likely not clear to the user. And v.get_or(default) can be written as v.try_get().unwrap_or(default). Furthermore, it seems weird that the default is the first parameter.

What I'm not exactly sure about, is whether Godot dictionaries allow storing nil values, or they treat it like absent values. In the former case, we would need to decide if we support both.


I would argue that it is worthwhile to make the API closer to Rust's HashMap type, and delegate the default mapping to Option. Also, I would keep the API small; many very similar methods tend to cause more chaos than they help.

What about this?

// same signature and semantics as HashMap::get(), except for returning a copy
fn get<K>(&self, key: K) -> Option<Variant>;

// equivalent to get(key).unwrap_or(Variant::new()), if someone actively uses the nil type
fn get_or_nil<K>(&self, key: K) -> Variant;

@ghost
Copy link

ghost commented Jun 9, 2021

Regarding this particular point:

What I'm not exactly sure about, is whether Godot dictionaries allow storing nil values, or they treat it like absent values. In the former case, we would need to decide if we support both.

I think there is a way to disambiguate between the two with what we have: by abusing referential equality with a sentinel object. We'll pass a new object as the default value to godot_dictionary_get_with_default instead of Nil, and check equality against that to decide whether we want to return None or Some(Nil). This, of course, comes at some run-time cost, but does preserve the semantics best.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

@Bromeon Yeah, Dictionaries do allow storing Nil values and keys, so it's effectively like every Dictionary is a HashMap<Option<T> Option<T>. Thus in Rust, get would return Option<Option<T>>. Compounded by the fact that the only way to find out whether an entry exists is apparently to call contains, it means there's not a super clean mapping of the Dictionary API to a more idiomatic Rust API.

And another thing to consider is in GDScript Dictionaries behave differently. Trying to call dictionary["nonexistent key"] fails with an error instead of returning null, and dictionary.get("nonexistent key") apparently returns Nil without inserting it into the map. Although I guess this is a bit more similar to how Rust's Index trait panics if the entry doesn't exist.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

@toasteater So you're saying pass in Variant::from_object instead of Variant::new? Interesting, hadn't thought of that. I had thought of passing a null pointer, but the engine returns a new Variant wrapping it so that doesn't work.

@Bromeon
Copy link
Member

Bromeon commented Jun 9, 2021

@Waridley

Yeah, Dictionaries do allow storing Nil values and keys, so it's effectively like every Dictionary is a HashMap<Option<T> Option<T>. Thus in Rust, get would return Option<Option<T>>. Compounded by the fact that the only way to find out whether an entry exists is apparently to call contains, it means there's not a super clean mapping of the Dictionary API to a more idiomatic Rust API.

That's annoying indeed. To be honest, I would make it as user-friendly as possible. How many times do people use nil keys and values, especially if they're even hard to detect in GDScript? I would personally be OK with not supporting them (and adding such a feature if demand arises), or provide an "advanced" API which is a bit harder to use, but leaves the most common case simple to use.

And another thing to consider is in GDScript Dictionaries behave differently. Trying to call dictionary["nonexistent key"] fails with an error instead of returning null, and dictionary.get("nonexistent key") apparently returns Nil without inserting it into the map. Although I guess this is a bit more similar to how Rust's Index trait panics if the entry doesn't exist.

Yeah, I think like in the other issue, we need to make a call if it's worth more to stay true to the Godot way of things, even if things are counter-intuitive, or to idiomatic Rust, even if it differs slightly from Godot. It's hard to meet a general decision, in this case I could imagine the advantages of an API closer to HashMap would outweigh the drawbacks, but this might need to be explored more.

@toasteater

I think there is a way to disambiguate between the two with what we have: by abusing referential equality with a sentinel object. We'll pass a new object as the default value to godot_dictionary_get_with_default instead of Nil, and check equality against that to decide whether we want to return None or Some(Nil). This, of course, comes at some run-time cost, but does preserve the semantics best.

This seems to be a bit far-fetched though, not sure if a user which is not familiar with the internals would come across this. But if some arcane way of achieving nil key/value helps to keep the more common use case clean, I'm not against it.

@ghost
Copy link

ghost commented Jun 10, 2021

@Bromeon I was just considering the feasibility, not the desirability, but I don't think null values are very useful in GDScript because, as Waridley said, the GDScript API provides no way to programmatically distinguish a None from a Some(Nil). I think it should be fair to just treat Nils as Nones in get.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 10, 2021

I think my personal preference would be to keep get returning a Variant and either let the user call dict.contains(&key).then(|| dict.get(key)) themselves (it's not that long anyway) or leave the try_get function as a shorthand for it (which is both shorter and indicates that it is really the best way we have to check if a key exists before getting its value). Creating a new Object every time to pass to get_with_default would I think require 2 extra API calls to make the object and then convert it to a variant, unless there's some other function I don't know about. contains only requires one extra call. And making a lazy static object would still require checking if it's initialized every time or somehow setting it up in init.

@ghost
Copy link

ghost commented Jun 10, 2021

The problem I see with get() -> Variant is that it's not obvious from the type signature what would happen if the key isn't there. Rust collections return Options most of the time, so a user might assume that it would panic and do an unnecessary contains if they only really need to fall back to Nil. That and it's inconsistent with the rest of the ecosystem. If we're providing a shorthand for the contains/then approach after all, I'd personally prefer to reverse the names:

pub fn get<K>(&self, key: &K) -> Option<Variant> {
    self.contains(key).then(|| self.get_or_nil(key))
}

pub fn get_or<K, D>(&self, key: &K, default: D) -> Variant {
    /* FFI stuff */
}

pub fn get_or_nil<K>(&self, key: &K) -> Variant {
    self.get_or(key, Variant::new())
}

...so that get has a familiar signature, and the closest mapping to the native API exists if anyone wants it.

Waridley added a commit to Waridley/godot-rust that referenced this pull request Jun 13, 2021
rebased on godot-rust#746 & godot-rust#748 and squashed. Still writing more tests & awaiting final status on those PR's
@Waridley
Copy link
Contributor Author

Was too eager to push and go to bed, need to fix other uses of Dictionary::get in the library now that its signature has changed

@Waridley Waridley force-pushed the dictionary_api branch 2 times, most recently from 99f90bd to e152522 Compare June 13, 2021 06:37
Waridley added a commit to Waridley/godot-rust that referenced this pull request Jun 13, 2021
rebased on godot-rust#746 & godot-rust#748 and squashed. Still writing more tests & awaiting final status on those PR's
@Waridley
Copy link
Contributor Author

Should be good now

Copy link
Member

@Bromeon Bromeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! 🙂

Still not 100% sure if the proposed API is actually the one that satisfies common use cases. It mostly mirrors the Godot API.

For example, when do we really want get_or_insert_nil()? Above, we concluded that having null keys/values is not that useful, as the differentiation from absent values is hard -- and now we provide a method that inserts exactly that?

Typically, a user would insert his own custom value if there is no value yet, and for this, separate get() + update() calls are good enough. I think that a frequent usage pattern might be:

let value = dict.get_or(key, default);
let updated_value = do_sth(value);
dict.update(key, updated_value);


/// Returns a copy of the value corresponding to the key, or `default` if it doesn't exist
#[inline]
pub fn get_or<K, D>(&self, default: D, key: K) -> Variant
Copy link
Member

@Bromeon Bromeon Jun 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is default the first parameter? Having key as first parameter for all 3 get* methods is more intuitive, and consistent with GDScript.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of it in terms of Option::map_or/Option::map_or_else... Which admittedly aren't necessarily intuitive; I still have to think about them a bit when I use them. And I suppose this is a case where staying closer to GDScript is probably a good thing.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 13, 2021

For example, when do we really want get_or_insert_nil()? Above, we concluded that having null keys/values is not that useful, as the differentiation from absent values is hard -- and now we provide a method that inserts exactly that?

While Nil keys are rarely useful (though I still wouldn't say never useful), I think having Nil values is much more commonly useful. It's just frustrating that Godot doesn't make it easier to tell the difference between a Nil value and a missing one. Though as I'm writing this I realize my docs don't necessarily make it clear that it inserts a Nil value for the given key. And while I don't personally see a use for this method, I guess I was too afraid that maybe someone out there was relying on this behavior for some reason from get... And it is a method provided by the GDNative API that we're otherwise not using...

However, I realize, 1) There is an alternative way to achieve this, and 2) As you mentioned in Discord, the main reason C++'s std::map::operator[] inserts null if the entry doesn't exist is to allow assignment like map["new key"] = value... Which actually is possible currently via get_mut_ref, if for some reason the multiple FFI calls to check if the dictionary has the key and then insert it if not are too costly. And an alternative way to just insert Nil and return a copy if it without calling contains first is via get_ref and then cloning the result.

let value = dict.get_or(key, default);
let updated_value = do_sth(value);
dict.update(key, updated_value);

This example would actually panic on update if the key doesn't exist, but it works with insert instead, but requires either unsafe or a Dictionary<LocalThreadAccess>, (which is the biggest necessary change in this PR to fix the unsoundness with get).

We could also probably add Index/IndexMut impls for Dictionary and VariantArray, but if that would cause too much bikeshedding on how they should behave, it should probably be saved for another PR... I'd say they should be implemented in terms of get_ref/get_mut_ref for Unique/ThreadLocal dictionaries, but either a lack of impls for Shared dictionaries, or different implementations for them would probably both be surprising...
Edit: now that I think about it, since get_ref/get_mut_ref require unsafe, Index and IndexMut should definitely not be implemented in terms of them. And since they're mostly for convenience anyway, perhaps just an Index implementation that returns a copy is fine.
Edit2: Although this is possibly another good argument for something like Ref/TRef for collections. IndexMut could be implemented on TRef because the reference could only live for as long as you assumed it was safe to use the collection.

@ghost
Copy link

ghost commented Jun 13, 2021

just an Index implementation that returns a copy

You can't do that since the reference is built into the Index trait: https://doc.rust-lang.org/std/ops/trait.Index.html#tymethod.index

It's one of my pet peeves too but we'll just have to forget about Index.

@Waridley
Copy link
Contributor Author

@toasteater Ah, of course, duh.

So the final verdict is remove get_or_insert_nil, and keep the other functions except swap the argument order on get_or?

@Bromeon
Copy link
Member

Bromeon commented Jun 14, 2021

And while I don't personally see a use for this method, I guess I was too afraid that maybe someone out there was relying on this behavior for some reason from get... And it is a method provided by the GDNative API that we're otherwise not using...

I see. I would say for such cases it's better if we guide the users towards "idiomatic" usage patterns. There will be a few users that might need to adapt their code, but a lot of new users who don't wonder about the strange combination of API functions. If people really need something like this, we can always add it in the future.

This example would actually panic on update if the key doesn't exist, but it works with insert instead, but requires either unsafe or a Dictionary<LocalThreadAccess>, (which is the biggest necessary change in this PR to fix the unsoundness with get).

That's also another thing I wanted to look at, but let's not broaden the scope of this PR too much. The naming and semantics of insert()/update()/erase() etc. deviate slightly from HashMap, and it's not clear why and how. Let's discuss those things another time 🙂

So the final verdict is remove get_or_insert_nil, and keep the other functions except swap the argument order on get_or?

For me yes, anything else @toasteater?

@ghost
Copy link

ghost commented Jun 14, 2021

So the final verdict is remove get_or_insert_nil, and keep the other functions except swap the argument order on get_or?

It's a yes for me as well. Please go ahead!

@Waridley
Copy link
Contributor Author

Waridley commented Jun 14, 2021

I thought of one more thing that could be added:
The self.get_or_nil(key) in get should never return the default. If it does, it probably means the user erased the entry through another reference on another thread in between the contains call and this one. Highly unlikely, but the "sentinel object" idea toasteater had could be used to detect this and provide a helpful error message to the user. It seems extremely unlikely that this exact thing would happen even if a user is unsoundly mutating the dictionary from multiple threads, though. And the fact that the object would have to either be created on every call to get or be stored in a lazy static or something like that, means it might not be worth the effort.

pub fn get_or<K, D>(&self, key: K, default: D) -> Variant
where
K: ToVariant + ToVariantEq,
D: ToVariant + ToVariantEq,
Copy link

@ghost ghost Jun 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value does not need to be ToVariantEq. The ToVariantEq trait indicates that a type's Eq implementation has the same behavior as its Godot counterpart. It's only useful for dictionary keys.

It might also be useful to relax the bound to OwnedToVariant, as is done for insert: https://github.com/godot-rust/godot-rust/blob/c02bd27ebee3df3035f7ae3294d205e00141b8e5/gdnative-core/src/core_types/dictionary.rs#L268

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I somehow didn't even see the line about relaxing the bound to OwnedToVariant. However, it's probably best to track it as a separate issue as you have done anyway, as I'm not sure I fully understand the safety implications and when ToVariant will still be required.

gdnative-core/src/core_types/dictionary.rs Outdated Show resolved Hide resolved
@Waridley
Copy link
Contributor Author

The #[derive(ToVariant, FromVariant)] implementation might be wrong now. A struct with a None field is getting converted to Some instead.

@Waridley
Copy link
Contributor Author

The #[derive(ToVariant, FromVariant)] implementation might be wrong now. A struct with a None field is getting converted to Some instead.

Nevermind, it was just that both of the types I tried to make it an option of are impossible to distinguish from None: Option<Variant>::None is the same as Some(Nil), and Option<()> is represented as Nil no matter what 😅

But on that note, why is () treated as Null, when a unit struct represented as an empty dictionary? To me it seems more like an empty tuple, thus VariantArray::new().

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

Conversation about Option<()> is continued in the serde PR.

I've created issue #752 to track trait bound inconsistency for Dictionary methods.

bors r+

bors bot added a commit that referenced this pull request Jun 15, 2021
748: Fix `Dictionary::get` unsoundness and add a few convenience functions r=toasteater a=Waridley

For some strange reason `godot_dictionary_get` calls `Dictionary::operator[]` instead of `Dictionary::get`, which ends up inserting `Nil` if the key doesn't already exist in the dictionary.
https://github.com/godotengine/godot/blame/3.3/modules/gdnative/gdnative/dictionary.cpp#L126
This means it is unsound to call `godot_dictionary_get` on a `Dictionary<Shared>`. This PR re-implements godot-rust's `Dictionary::get` in terms of `godot_dictionary_get_with_default` (the behavior originally intended and likely expected by most Rust users), and adds `Dictionary::get_or_insert_nil` to provide the original behavior if it is indeed desired, but only on `Dictionary<T: LocalThreadAccess>` and an `unsafe` counterpart for `Dictionary<Shared>`, like the other mutating functions on `Dictionary`. 

I also updated the documentation of `get_ref` and `get_mut_ref` since they call `operator[]`. Perhaps they should also be moved to non-shared access, but I figured that's less important since they're unsafe anyway?

I also added `try_get` and `get_non_nil` while I was at it, to simplify checking if a key exists and checking that its corresponding value is not `Nil` respectively.

Co-authored-by: Waridley <[email protected]>
@bors
Copy link
Contributor

bors bot commented Jun 15, 2021

Build failed:

@Waridley
Copy link
Contributor Author

Oh, wow, didn't realize bool::then was stablized so recently. I'll fix it in a little while.

Also adds `get_or` and `get_or_nil` convenience functions.
@Bromeon
Copy link
Member

Bromeon commented Jun 15, 2021

bors r+

@bors
Copy link
Contributor

bors bot commented Jun 15, 2021

Build succeeded:

@bors bors bot merged commit 50f90d5 into godot-rust:master Jun 15, 2021
@Waridley Waridley deleted the dictionary_api branch September 15, 2021 16:49
@Bromeon Bromeon mentioned this pull request Jan 9, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants