Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tests with Godot 3.3 #746

Merged
merged 2 commits into from
Jun 14, 2021
Merged

Fix tests with Godot 3.3 #746

merged 2 commits into from
Jun 14, 2021

Conversation

Waridley
Copy link
Contributor

@Waridley Waridley commented Jun 9, 2021

The most important change here is the first commit making the .gdnlib file not-reloadable. Godot 3.3 apparently added hot-reloading support for gdnative, but did so by terminating the library when no NativeScripts reference it: https://github.com/godotengine/godot/blame/3.3/modules/gdnative/nativescript/nativescript.cpp#L1550

This probably makes sense for most users, but for the specific case of keeping a GDNative object alive while NativeScripts are created and freed, this causes the library to be terminated out from under the GDNative object This is actually supposed to be prevented, but is not due to a bug in the engine. However even fixing that bug upstream would still cause panics in the tests due to the godot-rust class registry thinking all classes have already been registered when godot_nativescript_init is called again. Therefore, I also included a commit that refactors the init code a bit to only panic if a class is registered with the same init handle more than once, just in case users have a use case where making the library not-relodable is not an option for some reason. This adds a lifetime parameter to InitHandle in order to keep it Copy, but that does not seem to break any of my code or the tests.

There was also a superfluous error in the test_vararray_return_crash test due to the camera not being in the world. This may not matter for the test, but I figured it was somewhat confusing that it was not included in the "expected error messages" list.

These changes were kept as separate commits so you can let me know if you want to forego or modify any of them.

Waridley added 2 commits June 8, 2021 19:22
Fixes crashes due to hot-reloading changes in godot 3.3 that cause reloadable libraries to be terminated when no NativeScripts remain referencing them.
@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

This should probably be included for #733 if we're planning for godot-rust 0.9.4 to support Godot 3.3

@Waridley Waridley changed the title Fix test with Godot 3.3 Fix tests with Godot 3.3 Jun 9, 2021
@ghost
Copy link

ghost commented Jun 9, 2021

Don't panic if godot_nativescript_init is called again

If this happens, the .gdnlib should probably be marked as not-reloadable for other reasons, but we should at least avoid panicking here.

Could you elaborate? It's part of the contract for Godot to at least call terminate before calling init again. If it indeed does, then this is only hiding the real problem that something is missing in cleanup.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

@toasteater Godot must call godot_gdnative_terminate before calling godot_gdnative_init again, but apparently it can call godot_nativescript_init any number of times (if that bug gets fixed), and there's no godot_nativescript_terminate counterpart. It's a terrible design, but it's definitely intended from looking at the code and comments in GDNative::terminate.

@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

The bug ironically means that godot_gdnative_terminate does always get called when the last NativeScript object referencing it is dropped, but it does so even if more GDNative objects are referencing it, which is obviously a problem and was never intended. The original author actually did think about this possibility, but just forgot a couple lines of code in GDNative::initialize

@ghost
Copy link

ghost commented Jun 9, 2021

but it's definitely intended from looking at the code and comments in GDNative::terminate.

I'm not sure how the code and comments there are related to calling godot_nativescript_init multiple times? This wasn't the behavior with 3.2 either, and 3.x is supposed to stay compatible with 3.2. Seems like this should be treated as an engine bug instead?

Of course there are some engine bugs we just have to work around due to compatibility, like the Vector <-> VariantArray ones, but this one I feel could be fixed on their side.

The original author actually did think about this possibility, but just forgot a couple lines of code in GDNative::initialize

What about making a PR to the engine with those couple lines of code, maybe?

@Waridley
Copy link
Contributor Author

Waridley commented Jun 9, 2021

I'm not sure how the code and comments there are related to calling godot_nativescript_init multiple times? This wasn't the behavior with 3.2 either, and 3.x is supposed to stay compatible with 3.2. Seems like this should be treated as an engine bug instead?

Sorry, yeah, this wasn't a super clear way to explain it. Basically, it wasn't relevant to NativeScripts until hot-reloading was added and suddenly scripts could be responsible for terminating the library. Previously this corner-case could only happen with multiple GDNative objects referencing the library, and then calling terminate() on one of them while still using the other.

What about making a PR to the engine with those couple lines of code, maybe?

Just did this: godotengine/godot#49467
Basically I'm saying, the "Don't panic if godot_nativescript_init is called again" commit will only be relevant if that PR is merged. Otherwise, I don't think it matters either way.

I could drop that commit from this PR and make a separate PR when we get a response on godotengine/godot#49467 if you want. Again, the main thing needed here is to make the gdnlib not-reloadable.

@ghost
Copy link

ghost commented Jun 9, 2021

@Waridley I think it's best to drop that commit for now. I'm yet to spend enough time to look at the situation but frankly, it looks like kludge. We want to find a way to at least recognize it when the init function is called for a second time and do proper cleanup. Hiding the error in the cheapest way possible would not help.

@ghost ghost mentioned this pull request Jun 11, 2021
Waridley added a commit to Waridley/godot-rust that referenced this pull request Jun 13, 2021
rebased on godot-rust#746 & godot-rust#748 and squashed. Still writing more tests & awaiting final status on those PR's
Waridley added a commit to Waridley/godot-rust that referenced this pull request Jun 13, 2021
rebased on godot-rust#746 & godot-rust#748 and squashed. Still writing more tests & awaiting final status on those PR's
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks!

bors r+

bors bot added a commit that referenced this pull request Jun 14, 2021
746: Fix tests with Godot 3.3 r=toasteater a=Waridley

The most important change here is the first commit making the `.gdnlib` file not-reloadable. Godot 3.3 apparently added hot-reloading support for gdnative, but did so by terminating the library when no `NativeScript`s reference it: https://github.com/godotengine/godot/blame/3.3/modules/gdnative/nativescript/nativescript.cpp#L1550

This probably makes sense for most users, but for the specific case of keeping a `GDNative` object alive while `NativeScript`s are created and freed, this causes the library to be terminated out from under the `GDNative` object This is actually supposed to be prevented, but is not due to a bug in the engine. However even fixing that bug upstream would still cause panics in the tests due to the godot-rust class registry thinking all classes have already been registered when `godot_nativescript_init` is called again. Therefore, I also included a commit that refactors the `init` code a bit to only panic if a class is registered *with the same init handle* more than once, just in case users have a use case where making the library not-relodable is not an option for some reason. This adds a lifetime parameter to `InitHandle` in order to keep it `Copy`, but that does not seem to break any of my code or the tests. 

There was also a superfluous error in the `test_vararray_return_crash` test due to the camera not being in the world. This may not matter for the test, but I figured it was somewhat confusing that it was not included in the "expected error messages" list.

These changes were kept as separate commits so you can let me know if you want to forego or modify any of them.

Co-authored-by: Waridley <[email protected]>
@bors
Copy link
Contributor

bors bot commented Jun 14, 2021

Build failed:

@ghost
Copy link

ghost commented Jun 14, 2021

@Waridley
Copy link
Contributor Author

@toasteater Weird... It worked on my machine. I'll try to figure out why it failed, and whether there's a more robust way to avoid the error, or if it really even matters. Part of me worries that it returning a default value might obscure the possibility of a crash, but I may not be fully understanding the test.

But on that note, could it make sense for most tests and the godot_test! macro to print out the error if the test does panic, instead of just calling .is_ok()?

@ghost
Copy link

ghost commented Jun 14, 2021

Part of me worries that it returning a default value might obscure the possibility of a crash, but I may not be fully understanding the test.

@Waridley The original crash was caused by an inconsistency in the GDNative glue code, where for some C++ types mapping to VariantArray, the glue code will require the return pointer to point to an initialized instance. This is worked around by avoiding ptrcall for methods returning VariantArray. It doesn't matter what the return value is, only that the C++ type is Vector<T>. See #422 and related engine issue.

But on that note, could it make sense for most tests and the godot_test! macro to print out the error if the test does panic, instead of just calling .is_ok()?

The panic is always printed in the log, just not immediately above the stacktrace, for some reason:

thread '<unnamed>' panicked at 'missing current scene', test/src/test_vararray_return.rs:35:18
stack backtrace:
   -- test_float32_array_debug
   -- test_color_array_access
   -- test_color_array_debug
...
 -- test_variant_call_args
 -- test_variant_ops
 -- test_vararray_return_crash
   0: rust_begin_unwind
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt

@Waridley
Copy link
Contributor Author

It doesn't matter what the return value is, only that the C++ type is Vector<T>.

Alright, I think I'll drop this one as well for now. Though I wish I knew why it panicked.

The panic is always printed in the log, just not immediately above the stacktrace, for some reason:

Oh! I see. Weird.

@ghost
Copy link

ghost commented Jun 14, 2021

bors retry

@bors
Copy link
Contributor

bors bot commented Jun 14, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant