Usage of EGL API while faking GLX - the GLX context gets lost #220
Probably the best that VGL could do is as follows:
Would that be sufficient to fix the problem from Gazebo's point of view?
Never mind. I see now that that wouldn't fix the problem, and I unfortunately can't see any way to fix it. The application seems to be relying on a clean separation of OpenGL states between the two APIs, but I'm not sure if that's a valid assumption in general or whether it's implementation-specific. In any case, VirtualGL has no way to implement that separation when using the EGL back end.
Also, I don't see why disabling the EGL API while a GLX context is current would be the right approach either, since that would cause the aforementioned EGL Pbuffer test to fail artificially.
Thank you for your ideas. I was not sure whether what I'm requesting is sane. Gazebo will always use either GLX or EGL, never both. However, at startup it runs the probing that scrambles the already detected GLX context (as GLX is probed first). So for Gazebo, returning no EGL support when GLX is already being interposed would make sense. However, I'm not sure how mixing the two works in general: whether it is ever valid, or always nonsense. Do you know of an example where both GLX and EGL would be used in a single app (other than Gazebo)?
I can't think of many reasons why an application would want to use both GLX and EGL, and I can't think of any reasons why an application would want to bind both types of contexts simultaneously. I'm not sure if that behavior is even explicitly defined, particularly if the EGL API is bound to the desktop OpenGL API. Only one OpenGL context can be current in the same thread at the same time, so after the call to When using the EGL back end, what happens in the example is that, because I think I can make VGL behave as Gazebo expects by:
Thus, if Here is a patch that attempts to accomplish this. However, before I am comfortable committing the patch, I need to understand a couple of things:
I would also need to perform my own testing to make sure that the patch has no unforeseen negative consequences. That won't happen until I get answers to the two questions above, and because of the holidays, it probably won't happen this week in any case.
Thank you for the patch. I verified all combinations of faking via EGL/GLX and forcing Gazebo to use either GLX- or EGL-based rendering, and all of them work. So even the EGL probing process is not disrupted by this patch. Ad 2: Could we say the correct behavior is whatever happens on a system with non-faked rendering? If so, then the MWE I've provided demonstrates exactly what should happen (the GLX context is unaffected by the EGL calls). I'm not sure, though, whether the behavior I'm seeing is platform-specific or whether it is the same on all GPUs. My initial tests were done on a notebook with an AMD Ryzen iGPU. I have now also tested it on a desktop with an NVIDIA 3090, and it behaves the same.
With the patch, does the EGL probing process produce the same results as it would on a local machine without VGL?
I observed the same thing with my Quadros and a FirePro, so the behavior is at least de facto with the most popular implementations, but that doesn't necessarily mean that it's correct. I can think of several reasons why returning anything other than 0 from
All of those assumptions are patently false, irrespective of whether VGL is used.
Yes. Exactly the same. I understand why you'd rather return an invalid context after GLX and EGL have been mixed in a single program. Feel free not to fix this issue (or to hide the fix behind a CLI flag). Until somebody comes up with a proper example of an app using both GLX and EGL, I think it is hard to estimate what should be happening. Gazebo uses both at the beginning during the probing process, but then it sticks with one of the APIs. I'll push a fix to Gazebo that makes sure the GLX context is restored after the EGL probing.
I have no problem integrating the fix as long as I can see documentation that at least suggests that I am doing the right thing. Let me think about it and do more googling before you go to the trouble of pushing a fix for Gazebo.
The aforementioned patch has been integrated into the 3.1 (main) branch, and a subset of the patch that was applicable to VGL 3.0 has been integrated into the 3.0.x branch. Please verify that everything still works on your end with the latest Git commits. (If you could test both branches, that would be great.)
Thank you! I did thorough testing and everything looks good, no problems noticed. In particular, I tested the Cartesian product of these options:
In all cases, I ran the server with rendering sensors, the GUI, and watched the output of one of the rendering sensors in the GUI. Just to explain a bit more why Gazebo does the probing the way it does: it is actually not done directly by Gazebo, but by the OGRE 2.2 rendering engine's GlSwitchableSupport class. This class is meant to provide a generic GL interface where the rendering device and driver can be selected via runtime options. So technically it is not specific to Gazebo but applies to any OGRE-based app. However, it is apparent from the commit history that GlSwitchableSupport was added because of Gazebo, and I haven't found any example of it being used elsewhere...
Gazebo is an app that allows the user to choose between GLX and EGL-based rendering. For some use cases, however, it would be more practical to use GLX faked by the VirtualGL EGL back end.
In issue gazebosim/gz-rendering#526 we are debugging why the GLX backend loses the current GL context in some setups.
I've traced it down to Gazebo first calling `glXMakeCurrent()` for the GLX context, and then probing EGL availability by calling `eglMakeCurrent()` on all available EGL devices (this is not intentional, but it's the way the OGRE rendering framework tests EGL PBuffer support, and there's probably no easy way around it). One of the `eglMakeCurrent()` calls will, however, reset the GLX context, which is thus lost.
The workaround on the Gazebo side is straightforward: just store the GLX context before EGL probing, and restore it afterwards.
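For illustration only, the store-and-restore idea can be sketched as a small RAII guard. This is not Gazebo's or OGRE's actual code, and it deliberately does not call GLX or EGL: `FakeBinding`, `g_current`, `fakeGlxMakeCurrent()`, `fakeEglProbe()` and `probeEglKeepingGlx()` are hypothetical stand-ins that model a single per-thread "current context" slot. The comments note which real GLX calls would presumably take their place.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Generic RAII guard: snapshots the current binding when constructed and
// re-applies that snapshot when destroyed.
template <typename Binding>
class ScopedBindingRestore {
public:
  ScopedBindingRestore(std::function<Binding()> capture,
                       std::function<void(const Binding&)> restore)
      : saved_(capture()), restore_(std::move(restore)) {}
  ~ScopedBindingRestore() { restore_(saved_); }
  ScopedBindingRestore(const ScopedBindingRestore&) = delete;
  ScopedBindingRestore& operator=(const ScopedBindingRestore&) = delete;

private:
  Binding saved_;
  std::function<void(const Binding&)> restore_;
};

// Hypothetical toy model (NOT real GLX/EGL): one current binding per thread.
// With real GLX, the snapshot would hold the results of glXGetCurrentDisplay(),
// glXGetCurrentDrawable(), glXGetCurrentReadDrawable() and
// glXGetCurrentContext(), and the restore step would call
// glXMakeContextCurrent() with those values.
struct FakeBinding {
  int context;
  int drawable;
};
thread_local FakeBinding g_current = {0, 0};

// Stand-in for glXMakeCurrent(): binds a context and drawable.
void fakeGlxMakeCurrent(int drawable, int context) {
  g_current = FakeBinding{context, drawable};
}

// Stand-in for the EGL probing that clobbers the current binding.
void fakeEglProbe() { g_current = FakeBinding{0, 0}; }

// Runs the probe inside a guarded scope and returns the context that is
// current afterwards.
int probeEglKeepingGlx() {
  {
    ScopedBindingRestore<FakeBinding> guard(
        [] { return g_current; },
        [](const FakeBinding& saved) { g_current = saved; });
    fakeEglProbe();
    assert(g_current.context == 0);  // the binding is lost while probing
  }  // guard goes out of scope here and restores the saved binding
  return g_current.context;
}
```

In this toy model, calling `fakeGlxMakeCurrent(7, 42)` followed by `probeEglKeepingGlx()` leaves context 42 current again, whereas calling `fakeEglProbe()` without the guard leaves nothing bound. A guard object has the advantage over two explicit store/restore lines that the restore also runs on early returns from the probing code.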
Thinking about a proper solution, I first thought VirtualGL could do nothing about this. But then it occurred to me that it could actually report some part of the EGL API as unavailable for the card/display that is used for GLX faking. Would that make sense (at least as a configurable option)? I'm not, however, skilled enough to tell what part of the EGL API would need to be disabled. Actually, it seems to me that some avoidance (or even EGL emulation) is already attempted, but it is probably not enough in this case?
I've assembled a MWE showing the behavior. With the two lines marked with the comment `//FIX`, everything works as expected: that's the store-and-restore context workaround. If you comment out these two lines, the current GLX context will get lost after the first `eglMakeCurrent()` call.
Compile with `g++ -o mwe mwe.cpp -lGL -lGLU -lX11 -lEGL`.