-
-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delayed display output when using GLX backend and VSYNC on AMD #1345
Comments
does this still happen with glx backend? (reason i am asking: if glx backend works fine, this is probably a driver problem with xrender; if it doesn't, this is more likely to be a damage tracking problem in picom). |
Hi, It only happens with the |
@Antiz96 oh, ok. I saw in that case I am incline to think this is a driver problem then. does this happen with minimal config? i.e. |
Ah, I assume this is a mistake, probably a relic of tests we made. Sorry about that. I am not the one that reported the issue, I'll let @christian-heusel confirm. |
Oh yeah sorry for that, having the xrender back in there this was from debugging .. Seems like I grabbed the config at an unfortunate time! 😅 (will edit my post accordingly)
No, the problem does not occur with the minimal config ... |
what about |
I have just updated my system today. I can confirm the slow down when It happens on both This is my
|
This also has the problem 😊 Contrary to what @Neko-Box-Coder has reported in #1345 (comment) I do not face the issue with the xrender backend (so maybe that's a different issue) 😊 |
It is also happening on Intel when |
OK, now can you try |
That's it. No issues! |
No issues for me either! |
hmm, interesting. can you run it would help if you can also record your screen with a clock showing milliseconds so i can relate log to the delay. but if you can't that's also fine. |
You can use this as the clock: https://codepen.io/yshui/pen/abeNvGv I realize it is very probable that this delay will just go away if you have this clock running... But do try. |
Yep, I couldn't reproduce the delay with the clock or by recording the screen :S The smallest log I could capture: trace.log |
hmmm, without knowing exactly when the delay happened the log is not very useful. the delay happens when you press a key to trigger a screen update, right? is there a way to maybe log your key presses with a millisecond timestamp? and then tell me which key press is correlated with the delay. |
Right. When I change tag/desktop/workspace
Got it, here we go: trace.log Delays in:
|
huh, this might be why:
vblank scheduler skipped one msc... now we just need to figure out what caused it. |
I add a debug print and pushed it to |
I also have a hunch what might be wrong here |
After you have got trace log from |
Here is the log from the The |
Once again we found ourselves going into sleep while there are X events in libxcb's buffer. This time it's because both vblank_handle_x_events and x_poll_for_events can read from the X connection. This is introduced as part of commit d617d0b. Before that, we were only calling xcb_poll_for_queued_events after vblank_handle_x_events. By changing that to a xcb_poll_for_events (and later to x_poll_for_events, which calls xcb_poll_for_events), we could read additional vblank events into the buffer. These events will not be handled because vblank_handle_x_events has already be called, so we will be going into sleep with these events in the buffer. Since vblank events driver the frame render loop, this causes indefinitely long delay between frames. Related-to: #1345 #1330 Signed-off-by: Yuxuan Shui <[email protected]>
Once again we found ourselves going into sleep while there are X events in libxcb's buffer. This time it's because both vblank_handle_x_events and x_poll_for_events can read from the X connection. This is introduced as part of commit d617d0b. Before that, we were only calling xcb_poll_for_queued_events after vblank_handle_x_events. By changing that to a xcb_poll_for_events (and later to x_poll_for_events, which calls xcb_poll_for_events), we could read additional vblank events into the buffer. These events will not be handled because vblank_handle_x_events has already be called, so we will be going into sleep with these events in the buffer. Since vblank events driver the frame render loop, this causes indefinitely long delay between frames. Related-to: #1345 #1330 Signed-off-by: Yuxuan Shui <[email protected]>
Here is the trace log from Delays in:
The branch |
thanks for the log, and thanks for confirming! i think this can be closed now. |
i am surprised that that commit actually worked, but this bug isn't fixed. there are more fundamental problems... |
This partially reverts c57b705, the previous fix for #1345 The previous fix falsely assumed if `x_poll_for_event` returned nothing, it must have not read from the X connection. This is just not true. It worked, probably because it's impossible to have vblank events close together (they are only once per frame). But that's an assumption we probably shouldn't make. _Separately_, we have another problem. We assumed if `x_poll_for_event` returned nothing, we don't need to flush again. This is not true either. There could be pending requests being completed, whose callback function might have sent new requests. This, for example, causes `picom-inspect` to hang, because some requests are stuck in the outgoing buffer. The explanation for the fix is going to be long, because this kind of problems are never simple. First of all, we need a version of `x_poll_for_event` that actually guarantees to not read from X. The current version of `x_poll_for_event` is already extremely complex, I don't want to pile more complexity on top of it. So instead we are now using a different approach, one some might consider a ugly hack. The complexity of `x_poll_for_event` all stems from the lack of `xcb_poll_for_queued_{reply,special_event}` API, so we had to use complex to merge message from different streams (events, special events, replies, and errors) all the while trying our best (and fail) to handle all messages in the xcb buffer before going to sleep. There is a `xcb_poll_for_queued_event` which in theory can be used for this and will make things a lot simpler, the problem is we don't control events, so if there is a large gap between event arrivals, picom would just be sitting there doing nothing. And that's exactly what we do in this commit, that is, controlling events. Every time we wait for replies/errors for some requests, we send a request that are guaranteed to _fail_. This is because unchecked errors will be returned by `xcb_poll_for_queued_event` as an event. This way, we can be sure an event will be received in a bounded amount of time, so we won't hang. This vastly simplifies the logic in `x_poll_for_event`, and made adding a no-read version of it trivial. Now we have that, some care need to be taken to make sure that we _do_ read from X sometimes, otherwise we will go to sleep without reading anything, and be woken up again immediately by the file descriptor. This will result in an infinite loop. This has some ripple effects, for example, now we queue redraw whenever the wm tree changes; before, redraws were only queued when the wm tree becomes consistent. Even with this, we still need the `while(true)` loop in `handle_x_events`, this is because of the other problem we mentioned at the beginning - flushing. The way we fix the flushing problem is by tracking the completion of pending requests - if any requests were completed, we flush conservatively (meaning even when not strictly necessary) - simple enough. But flushing could read (yes, read) from X too! So whenever we flush, we need to call `x_poll_for_event` again, hence the while loop. Fixes: #1345 Signed-off-by: Yuxuan Shui <[email protected]>
sorry about this but i need someone to test the fix again. please try the |
This partially reverts c57b705, the previous fix for #1345 The previous fix falsely assumed if `x_poll_for_event` returned nothing, it must have not read from the X connection. This is just not true. It worked, probably because it's impossible to have vblank events close together (they are only once per frame). But that's an assumption we probably shouldn't make. _Separately_, we have another problem. We assumed if `x_poll_for_event` returned nothing, we don't need to flush again. This is not true either. There could be pending requests being completed, whose callback function might have sent new requests. This, for example, causes `picom-inspect` to hang, because some requests are stuck in the outgoing buffer. The explanation for the fix is going to be long, because this kind of problems are never simple. First of all, we need a version of `x_poll_for_event` that actually guarantees to not read from X. The current version of `x_poll_for_event` is already extremely complex, I don't want to pile more complexity on top of it. So instead we are now using a different approach, one some might consider a ugly hack. The complexity of `x_poll_for_event` all stems from the lack of `xcb_poll_for_queued_{reply,special_event}` API, so we had to use complex to merge message from different streams (events, special events, replies, and errors) all the while trying our best (and fail) to handle all messages in the xcb buffer before going to sleep. There is a `xcb_poll_for_queued_event` which in theory can be used for this and will make things a lot simpler, the problem is we don't control events, so if there is a large gap between event arrivals, picom would just be sitting there doing nothing. And that's exactly what we do in this commit, that is, controlling events. Every time we wait for replies/errors for some requests, we send a request that are guaranteed to _fail_. This is because unchecked errors will be returned by `xcb_poll_for_queued_event` as an event. This way, we can be sure an event will be received in a bounded amount of time, so we won't hang. This vastly simplifies the logic in `x_poll_for_event`, and made adding a no-read version of it trivial. Now we have that, some care need to be taken to make sure that we _do_ read from X sometimes, otherwise we will go to sleep without reading anything, and be woken up again immediately by the file descriptor. This will result in an infinite loop. This has some ripple effects, for example, now we queue redraw whenever the wm tree changes; before, redraws were only queued when the wm tree becomes consistent. Even with this, we still need the `while(true)` loop in `handle_x_events`, this is because of the other problem we mentioned at the beginning - flushing. The way we fix the flushing problem is by tracking the completion of pending requests - if any requests were completed, we flush conservatively (meaning even when not strictly necessary) - simple enough. But flushing could read (yes, read) from X too! So whenever we flush, we need to call `x_poll_for_event` again, hence the while loop. Fixes: #1345 Signed-off-by: Yuxuan Shui <[email protected]>
This partially reverts c57b705, the previous fix for #1345 The previous fix falsely assumed if `x_poll_for_event` returned nothing, it must have not read from the X connection. This is just not true. It worked, probably because it's impossible to have vblank events close together (they are only once per frame). But that's an assumption we probably shouldn't make. _Separately_, we have another problem. We assumed if `x_poll_for_event` returned nothing, we don't need to flush again. This is not true either. There could be pending requests being completed, whose callback function might have sent new requests. This, for example, causes `picom-inspect` to hang, because some requests are stuck in the outgoing buffer. The explanation for the fix is going to be long, because this kind of problems are never simple. First of all, we need a version of `x_poll_for_event` that actually guarantees to not read from X. The current version of `x_poll_for_event` is already extremely complex, I don't want to pile more complexity on top of it. So instead we are now using a different approach, one some might consider a ugly hack. The complexity of `x_poll_for_event` all stems from the lack of `xcb_poll_for_queued_{reply,special_event}` API, so we had to use complex to merge message from different streams (events, special events, replies, and errors) all the while trying our best (and fail) to handle all messages in the xcb buffer before going to sleep. There is a `xcb_poll_for_queued_event` which in theory can be used for this and will make things a lot simpler, the problem is we don't control events, so if there is a large gap between event arrivals, picom would just be sitting there doing nothing. And that's exactly what we do in this commit, that is, controlling events. Every time we wait for replies/errors for some requests, we send a request that are guaranteed to _fail_. This is because unchecked errors will be returned by `xcb_poll_for_queued_event` as an event. This way, we can be sure an event will be received in a bounded amount of time, so we won't hang. This vastly simplifies the logic in `x_poll_for_event`, and made adding a no-read version of it trivial. Now we have that, some care need to be taken to make sure that we _do_ read from X sometimes, otherwise we will go to sleep without reading anything, and be woken up again immediately by the file descriptor. This will result in an infinite loop. This has some ripple effects, for example, now we queue redraw whenever the wm tree changes; before, redraws were only queued when the wm tree becomes consistent. Even with this, we still need the `while(true)` loop in `handle_x_events`, this is because of the other problem we mentioned at the beginning - flushing. The way we fix the flushing problem is by tracking the completion of pending requests - if any requests were completed, we flush conservatively (meaning even when not strictly necessary) - simple enough. But flushing could read (yes, read) from X too! So whenever we flush, we need to call `x_poll_for_event` again, hence the while loop. Fixes: #1345 Signed-off-by: Yuxuan Shui <[email protected]>
Sorry for the late feedback. Everything is still working ✅ |
Platform
Arch Linux (Rolling) amd64
GPU, drivers, and screen setup
Environment
i3 version 4.23 (which has gaps support merged)
picom version
Diagnostics
Configuration:
Configuration file
Steps of reproduction
neomutt
)You can i.e. see in the following videos that the clicks of my keyboard usually result in an immediate change but sometimes it needs some time until something happens:
2024-09-29.12-07-48.mp4
Expected behavior
Everything feels snappy.
Current Behavior
Have output lags from time to time
Other details
See the video above.
cc @Antiz96
The text was updated successfully, but these errors were encountered: