-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework winapi's QueryPerformance* functions to match XDK #663
base: master
Are you sure you want to change the base?
Changes from 5 commits
ab3a9f5
64f6505
3faa575
235dde3
9a603f7
8ea50ce
eb4385b
ef0e000
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,22 +3,59 @@ | |
// SPDX-FileCopyrightText: 2019 Stefan Schmidt | ||
|
||
#include <profileapi.h> | ||
#ifdef USE_RDTSC_FOR_FREQ | ||
#include <synchapi.h> | ||
#endif | ||
#include <assert.h> | ||
#include <stddef.h> | ||
#include <xboxkrnl/xboxkrnl.h> | ||
|
||
#ifdef USE_RDTSC_FOR_FREQ | ||
static LARGE_INTEGER frequency = {{0, 0}}; | ||
static void __attribute__((constructor)) PrimeQueryPerformanceFrequency () | ||
{ | ||
ULARGE_INTEGER f_rdtsc = {{0, 0}}, s_rdtsc = {{0, 0}}; | ||
ULONG f_ticks = 0, s_ticks = 0; | ||
JayFoxRox marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
KeEnterCriticalRegion(); | ||
|
||
// The values generated after launching aren't accurate, give it time to increment... | ||
Sleep(700); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This can't be true for rdtsc. I'm not sure about the clock source for the KeTickCount, but I assume it's some other low level timer. Also, why would the ticks be incorrect during early boot?! Especially in an XBE I'd expect a soft-reset. By the point the XBE loads, a bunch of kernel code has ran already. The If your reference clock source ( All of this hackery, to me, looks like it's only meant to support overclocking the CPU (so you attempt to measure a constant factor of your CPU clock to some reference clock which runs at the stock clock). |
||
|
||
f_rdtsc.QuadPart = __rdtsc(); | ||
f_ticks = KeTickCount; | ||
|
||
Sleep(200); | ||
|
||
s_rdtsc.QuadPart = __rdtsc(); | ||
s_ticks = KeTickCount; | ||
|
||
s_rdtsc.QuadPart -= f_rdtsc.QuadPart; | ||
s_rdtsc.QuadPart /= s_ticks - f_ticks; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. dumb question, but what are There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like a real big nit and probably shouldn't have been commented. I'll change it if it's bothering. It's private variables in a function that isn't meant to be called by a developer. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed in 8ea50ce There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
❓ ❗ ❓ It's public code in an open-source project that needs to be maintained and should be easily readable and extensible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
frequency.QuadPart = s_rdtsc.QuadPart; | ||
frequency.QuadPart *= 1000LL; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why multiply here? wouldn't it be more stable to do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's getting more digits, but not stable, no. Each run is slightly more variable. Wont fix. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My comment wasn't about clock stability, but https://en.wikipedia.org/wiki/Numerical_stability You are dividing away up to 3 decimal digits (leading to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Let me be clear, as this goes into another comment I made. I don't care about this amount of accuracy. I am only doing it to make you happy. If you do not care, then great. Hopefully other maintainers feel the same. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you don't care, then why is there still a 700ms sleep at startup and a 200ms sleep for the measurement? The review mostly focused on getting that time away because it's absolutely not tolerable to delay startup. You can probably get away with much much less measurement time and get a more accurate result by fixing your math. You want to calculate something like
The current method (hardcoded value) is even more accurate than this.. and it will be more exact than the measurement regardless how broken your caps are (unless you overclock/underclock/swapped your CPU). Ideally you wouldn't have to wait at all (and there's probably many ways to get there). We care about these changes because they matter. You are making it impossible to get this merged by saying "Wont fix". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Hard disagree. The review has been maintainers attempting to get me to justify every single line of this PR. What percentage would you be happy with? We know @thrimbor isn't happy with 0.5%, so we need to be lower than that. How long are you willing to allow for startup delay? (2nd time I've asked) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't need to run this code. I have enough coding experience to know that almost 1 second of additional startup time is bad. I have a couple of applications in mind which would be broken by this (xbe-loader being one of them). I'm also familiar enough with these APIs and now how they are being used that the loss in precision matters.
Yours is 0.1% as accurate as the existing solution (~1000x worse? Don't quote me on that.. too many numbers being thrown around). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Not justify, but modify those lines with errors (or which can be significantly improved with little effort). You should be grateful that people are trying to guide you to better code. The real issue with this particular review is that valid concerns by two maintainers about the same thing are being ignored (like the numerical stability, which then requires excessive startup delay). Initially we reported some problem at one location, but then the next iteration of the same code had the same issue (the loop was replaced by a single wait which was still too long). And when that wasn't addressed (like the numerical stability), I (we?) sometimes pointed out new problems which arose because the first thing wasn't fixed (such as the 700ms wait with the bad comment). Pair that with initial skepticism if more than 50% PR is beneficial. The XDK / rdtsc thing with hardcoded value is mergeable in isolation if the later XDKs also do it (although this might just be a case of function inlining / LTO across the kernel barrier). The dynamic measurement - with some changes - could become acceptable, but it's still worse than the hardcoding and doesn't address the initial use-case you envision (the reasoning / description of which doesn't make sense).. the only use case it does help is with overclocking/underclocking/CPU swaps (which is fine, but not if it has severe drawbacks for everyone else). |
||
|
||
KeLeaveCriticalRegion(); | ||
} | ||
#endif | ||
|
||
BOOL QueryPerformanceCounter (LARGE_INTEGER *lpPerformanceCount) | ||
{ | ||
assert(lpPerformanceCount != NULL); | ||
|
||
lpPerformanceCount->QuadPart = KeQueryPerformanceCounter(); | ||
lpPerformanceCount->QuadPart = __rdtsc(); | ||
return TRUE; | ||
} | ||
|
||
BOOL QueryPerformanceFrequency (LARGE_INTEGER *lpFrequency) | ||
{ | ||
assert(lpFrequency != NULL); | ||
|
||
lpFrequency->QuadPart = KeQueryPerformanceFrequency(); | ||
#ifdef USE_RDTSC_FOR_FREQ | ||
lpFrequency->QuadPart = frequency.QuadPart; | ||
#else | ||
lpFrequency->QuadPart = 733333333; | ||
#endif | ||
return TRUE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't hide correct behavior behind a compile-time flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XDK only replies with 733MHz, so that would be "correct". As stated, I don't like hard numbers, but don't want to force wasted time on developers if they don't require it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct though? I'd even consider rejecting this with "move it to a fork":
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's assumptions like this that stop nxdk with pbkit running from functioning on my Xbox. There are applications now due to MIT licensing that I can't readily modify the source of, which means I need to hack the xbe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is code in the winapi - so it's probably not mission critical.
If your Xbox drifts too far from 733.333MHz to crash applications then it's probably broken - it's not a fault of nxdk.
If you changed the clocks or the CPU (which I suspect), then it's no longer an Xbox and breaks many assumptions that both, MS XDK and the nxdk have (also the kernel probably.. which you probably had to patch then to even get your Xbox to boot).
... which make the Xbox a gaming console instead of a generic PC.
nxdk winapi isn't a generic winapi and it's not a PC OS / environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's entirely nxdk's fault for killing pbkit if the frequency is not default. https://github.com/XboxDev/nxdk/blob/master/lib/pbkit/pbkit.c#L2812
Again bad assumptions about the box we're running on. I've tried multiple times to introduce similar things (NV base address being one of them) and at every turn I seem to hit this review bomb or ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes: That particular check is bad and inprecise - it can easily cause issues and is totally avoidable.
And yes.. and that should be addressed.. but that's not what this PR is doing.
But modifying the NV base address is also not really reasonable. I also needed that for some of my projects.
However, I can see how it's not practical to do such invasive changes in nxdk.
I think one of the things we could do is to expose constants in some header (probably as macro, so they can be changed dynamically, potentially with a source marker so you know where the constant gets used from).
People could just reconfigure their nxdk if necessary then - however, that doesn't help with closed source apps.
However, I don't think it's practical to keep some of these hacks for niche system configurations or even niche software applications (like my attempt to emulate a virtual NIC from a DXT / xbe-loader) in the upstream nxdk codebase.
A lot of these changes quickly become incompatible with much more important (and required) changes.
Testing such niche applications is also a lot more difficult and therefore we'd be constantly introducing new bugs for those niche cases.
We'd also lose the major benefit of a gaming console (at least in the sense up to the 7th console generation or so): fixed hardware which allows massive simplifications and assumptions (which brings performance gains).
I think this is also a huge draw for some of the developers - Xbox being a fixed target makes it an interesting coding challenge and having a fixed spec brings some comfort.
xbox-linux and some of OpenXDK tried to turn Xbox into a PC.. that's also a noble goal, but I think that's a non-goal for nxdk. And even then, the xbox-linux Xbox platform likely has to make a bunch of assumptions.
I feel like this PR has gotten a lot more attention than it deserves.
There aren't many active maintainers (or outside reviewers) in XboxDev, but good changes still gets merged eventually.. occasionally I see merges in my GH notifications.
I'd prefer if some things got merged sooner than later, but this is how it is.
How many PRs did you review today? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should, and I tried. Both an frequency accurate timer version, and a dumb "remove the kill code" version. Both were left stale and nitpicked.
We're all just giving our free time to see the change in the world that we envision. If you feel you're giving it more attention than you should, then I'm confused why you are.
None. I don't do software development for work. Hopefully you had a good day at work today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an assumption of good faith: One would expect to gain more than to lose on doing a review.
Even if there's a bad PR we still have to take the time to review and close it.
That's the cost of doing FOSS.
In turn, we hope that the author improves their code so that it might be eventually mergeable (hopefully giving us a return of invest). Or we hope that some author shows up with perfectly mergeable code (which is an instant win).
This PR in particular looked to become mergeable eventually:
However, then it stagnated quickly, leading to many discussions, rather than progress on the code.
The initial arguments faded into the background as some weird arguments came up like "bad caps" or "unstable clocks at XBE launch" which made little sense.. unless you aren't running a stock Xbox.
Both, @thrimbor and I tried to assist or nudge you in the right direction, but you don't seem to care for our guidance (regarding our comments about numerical stability for example, starting at surface level and then even taking time to explain it with examples - still in good faith.. either that this PR is merged, or that others learn from it).
By now, I've lost faith in this PR. I'll do a last pass responding to the remaining open discussions.
It might still be merged eventually, but I don't think it's going to happen because of comments like "Wont fix".