-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewind ZSTD support #16959
base: master
Are you sure you want to change the base?
Rewind ZSTD support #16959
Conversation
8291e41
to
83e13a4
Compare
Core/SaveState.cpp
Outdated
|
||
// Temporarily allocate a buffer to do decompression in. | ||
size_t compressCapacity = ZSTD_compressBound(compressed.size()); | ||
u8 *compress_buf = (u8 *)malloc(compressCapacity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, why not result->zstd_compressed->resize(compressCapacity);
and then result->zstd_compressed->resize(result->compressed_size);
? Worst case, it copies (seems unlikely, though)... which is already what this is doing.
-[Unknown]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was gonna make this more OOM-safe (but didn't get around to it). But yeah, let's do that instead.
if (i >= base.size()) { | ||
compressed[i] = state[i]; | ||
} else { | ||
compressed[i] = base[i] ^ state[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end, compressing a lot of zeros can be less efficient than compressing a single zero, since you have to encode the count. A large % of the memory (largest part of a state by weight) doesn't change every second, which is why the block thing works pretty well.
Maybe could do xor on just the modified blocks, though, and that might compress better and not be terribly slow either...
-[Unknown]
How is the performance speed wise, especially on less powerful devices? I could imagine -[Unknown] |
Performance seems fine on the devices I've tested on, but probably would be bad if your device was single-core, in which case you better stay away from the feature anyway... It is a bit more expensive on the battery though, since it spend up to half a second compressing a save state on a background thread on a pretty old device. But I don't really know if I want to add an option... The idea was to get memory consumption down to a level where it's not much worse than for example cranking resolution up to 10x, in which case the "mem hog" label would need to be applied to such features too - or not at all, hopefully you shouldn't have to think about it. It is not quite as good as I hoped yet though... heavy games are now at about 8-15MB / state, so about 300MB ram with 20 saves. Which is a drastic improvement and will be fine on most devices, but still.. it is a bit of a hog :P Though many games are more like 1-4 MB per state now, like Outrun. EDIT: Scratch that, most games shrink down a lot after a little while once the base state isn't from the menu or something... We should probably move things around in save states a bit, so that things like memory will always be at the same offset, at least for the same game. Right now CoreTiming can throw wrenches in that in theory at least... |
83e13a4
to
25e526a
Compare
25e526a
to
077dc8c
Compare
This question probably belongs to earlier commit, but still. How robust is this in face of non-monotonic wall clock? |
Emulated PSP only takes system time on game boot, then it can freely move away from it by running emulation at different speed or getting back in time with save states(including rewind) or changing the system time itself. |
It has the same problems with real-time clock as regular savestates. |
Well, #16958 may have made it so rewind would stop working on clock skew for a little while, although it more matters if But this pull probably doesn't change anything. -[Unknown] |
61a13be
to
a1e1386
Compare
On top of #16958
Makes the rewind functionality much less of a memory hog, takes the memory requirement down to half or a third or so.
Has two modes, one where we first apply the old memcmp-based compression step, and one where it's replaced with xoring against the base state. The former seems to do surprisingly better, not sure why the xoring isn't as good.
It does seems like there are some potential improvements to the savestate format we could do. All the big memory blocks should come first, so that nothing that changes length is before those big blocks.