The Prime95 process doesn't use enough CPU power anymore #49

Open
roaminghawk opened this issue Jul 16, 2023 · 23 comments
Labels
discussion (No real issue, just discussing), under investigation (Checking out the issue)

Comments

@roaminghawk

roaminghawk commented Jul 16, 2023

Hi,

Does this error indicate instability?

ERROR: 10:53:50
ERROR: There has been an error while running Prime95!
ERROR: At Core 0 (CPU 0)
ERROR MESSAGE: The Prime95 process doesn't use enough CPU power anymore (only 0% instead of the expected 4.17%)
ERROR: The last passed FFT size before the error was: 13824K
ERROR: Unfortunately FFT size fail detection only works for Smallest, Small or Large FFT sizes.

I'm getting it at stock settings on a 7900X.

@roaminghawk
Author

For anyone having that error in the future:
That error is caused by a combination of CoreCycler, Prime95, Windows 11 and CPU C-States. Testing with y-cruncher passes. Testing in safe mode with Prime95 passes. Disabling Global C-State and testing in Windows (not safe mode) passes.

@sp00n
Owner

sp00n commented Jul 16, 2023

The CPU utilization error fires if, over the course of 8 seconds, the stress test process doesn't run with the expected amount of CPU utilization and no error message can be found in the stress test log file.
This CPU utilization check depends on the Windows Performance Counters, which can become buggy. Check the readme file for further details; there are ways to fix them if they have somehow become corrupted (which happens more often than I had imagined).
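
As a rough illustration of the check described above, here is a minimal Python sketch. It is not CoreCycler's actual code (CoreCycler is a PowerShell script). The function names, the 50% tolerance and the idea of four readings in the window are assumptions made for this example. Only the "expected" figure comes from the thread: one loaded thread on the 24 logical CPUs of a 7900X gives 100 / 24 ≈ 4.17%.

```python
from typing import Sequence


def expected_usage_percent(stress_threads: int, logical_cpus: int) -> float:
    """Share of the whole CPU that the stress test should occupy."""
    return 100.0 * stress_threads / logical_cpus


def usage_check_fails(
    samples: Sequence[float],
    expected: float,
    tolerance: float = 0.5,  # assumption: "too low" means below half of expected
) -> bool:
    """Return True only if every reading in the check window is too low.

    `samples` stands in for successive counter readings taken during the
    window (for example four readings spread over roughly 8 seconds).
    """
    threshold = expected * tolerance
    return bool(samples) and all(sample < threshold for sample in samples)


expected = expected_usage_percent(stress_threads=1, logical_cpus=24)
print(round(expected, 2))                                  # 4.17
print(usage_check_fails([0.0, 0.0, 0.0, 0.0], expected))   # True: never recovered
print(usage_check_fails([0.0, 0.0, 4.1, 4.2], expected))   # False: recovered in time
```

In this sketch a single good reading inside the window is enough to count as "recovered", which matches the behavior described later in the thread where low readings appear in the log without the error being thrown.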

Normally it should run with C-States enabled if you're running with stock settings (which means no PBO2 and no Curve Optimizer active).
I neither have a Ryzen 7000 nor do I run Windows 11 though, so there may be some incompatibilities with the new chips. Or the chip is actually faulty.
C-States have been known to cause trouble while overclocking, but this is normally not the case with stock settings, so one of these might actually be the case. Or the Performance Counters have just freaked out, as explained above.
Do you still have both the CoreCycler and the Prime95 log file from this run?

@roaminghawk
Author

Thanks for the reply.
I've passed the all-core 6-minute test with C-States disabled with no errors. Logs are attached, including one of the "not enough power" runs.
Currently I'm running a test with a Curve Optimizer offset of -30.

logs.zip

@sp00n
Owner

sp00n commented Jul 17, 2023

You have quite a few of the CPU utilization messages in your log with the C-States enabled, although most of the time it recovered in time, so the error wasn't thrown.
There are no messages at all with the C-States disabled, so there might actually be a problem with the C-States on your setup. Or disabling them somehow reset the Performance Counters and they're now working as expected. It's hard to tell.

The Prime95 log file also doesn't show any errors, so it seems to be running fine for now, although 6 minutes per core is only a first check for the most obvious instabilities.
If the problem reoccurs, you could try to fix the Performance Counters as described in the readme, or as the ultimate workaround you could disable the CPU utilization check altogether by setting disableCpuUtilizationCheck = 1 in the config.ini file, under the Debug section.
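
To make the workaround concrete, here is a small Python sketch that parses a hypothetical excerpt of a config.ini. Only the key name and the "Debug" section name come from the comment above. The surrounding file content and the helper function are assumptions for illustration. CoreCycler reads this file itself, so Python is used here only to show the shape of the setting.

```python
import configparser

# Hypothetical excerpt of config.ini; a real file contains many more settings.
CONFIG_TEXT = """
[Debug]
# 1 = skip the CPU utilization check, 0 = keep it enabled
disableCpuUtilizationCheck = 1
"""


def cpu_check_disabled(config_text: str) -> bool:
    """Return True if the CPU utilization check is switched off."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Treat a missing section or key as "check stays enabled".
    return parser.getboolean("Debug", "disableCpuUtilizationCheck", fallback=False)


print(cpu_check_disabled(CONFIG_TEXT))  # True
```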

I'll have to keep the combination of Windows 11 and C-States in mind for when this error is reported in the future.

@roaminghawk
Author

I'm now testing in safe mode with C-States set to Auto and default settings, and it does not show those errors.
In your opinion, what is the next step after passing the 6-minute test to catch less obvious instabilities, short of the 12-hour-per-core test?

@LucidLuxxx

LucidLuxxx commented Sep 10, 2023

Not sure if this is helpful to the dev or devs, but I have a sensor panel in my tower that uses AIDA64 to show different sensors (CPU utilization, speed, etc.). AIDA64 had the same kind of bug with CPU utilization in Windows 11: my CPU was showing 3% while running Cinebench multicore lol. AIDA64 got an update in their latest beta; in the main application's stability settings there's a new checkbox for a Windows 11 CPU utilization fix. Now it's reading correctly. I'm not a programmer and don't understand code, but maybe the devs could look at AIDA64's application files to see how they fixed it and piggyback off that? I always just run it in safe mode for absolute best stability anyway, but I wanted to throw that out there in case it can help.

@sp00n
Owner

sp00n commented Sep 10, 2023

After googling around, it seems to be a problem that was introduced with Windows 11 22H2.
Many monitoring programs suddenly reported very low usage, even if the CPU was almost fully loaded.

I've found this (Unwinder being the author of MSI Afterburner, if I remember correctly):

AUWUXfg

I'm using \Process(name_of_process)\% Processor Time for the Performance Counter path to get the CPU utilization, which seems to be affected by this Windows bug. Unfortunately I have no way of testing any alternative myself.
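
For readers unfamiliar with this counter, the following Python sketch shows how such a path is put together and how a raw reading relates to the "share of the whole CPU" figures seen in the error messages. It assumes that the per-process counter reports 100 for each fully busy logical CPU, so that a reading has to be divided by the number of logical CPUs. The instance name "prime95" and both function names are hypothetical.

```python
def process_counter_path(instance_name: str) -> str:
    r"""Build a path of the form \Process(<instance>)\% Processor Time."""
    return rf"\Process({instance_name})\% Processor Time"


def share_of_total_cpu(raw_percent: float, logical_cpus: int) -> float:
    """Convert a per-process reading (100 = one busy logical CPU) to a share of the whole CPU."""
    return raw_percent / logical_cpus


print(process_counter_path("prime95"))          # \Process(prime95)\% Processor Time
print(round(share_of_total_cpu(100.0, 24), 2))  # 4.17
```

With 24 logical CPUs, one fully loaded thread works out to about 4.17% of the whole CPU, which is the expected value quoted in the original error message.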

@LucidLuxxx

Is there any way to use the system idle process and then some kind of calculation to get the opposite? Like (100%_total_cpu-4%_idle=96%_usage)? Kinda like that MSI article you shared was saying. Maybe use a VM if your setup is what's preventing you from being able to test an alternative? Again, I'm not a programmer and I'm probably way off on offering any help lol. Figured it's worth a shot. If it helps, good. If not, I still enjoy your program regardless lol.

@LucidLuxxx

I should also say that I had Windows 11 Pro before and had this issue with CoreCycler reporting not enough power. I did a fresh install of Windows 11 Pro, for other reasons, and CoreCycler is now working correctly and doesn't report the power issue anymore. No clue why that would fix it, but it did.

@sp00n
Owner

sp00n commented Sep 10, 2023

The idle "process" is not a real process, it's just the remainder of the CPU resources that are currently not used.
There is a performance counter for % idle time. However, I really need to keep track of the stress test process itself. If I instead checked the total CPU usage (or the total idle percentage, which is just the reverse), other running processes could interfere and cause false (negative) detections.
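
The false-negative scenario can be shown with a few lines of Python. All numbers here are made up for illustration, and the 50% tolerance is an assumption rather than a confirmed CoreCycler value. The point is only that a stalled stress test hides behind any other busy process when the total (or idle) figure is used.

```python
def total_usage_from_idle(idle_percent: float) -> float:
    """Total CPU usage is simply the reverse of the idle percentage."""
    return 100.0 - idle_percent


def is_stalled(observed_percent: float, expected_percent: float, tolerance: float = 0.5) -> bool:
    """Flag the stress test as stalled if usage is below a fraction of the expected value."""
    return observed_percent < expected_percent * tolerance


expected = 100.0 / 24          # one stress thread on 24 logical CPUs
stress_process_usage = 0.0     # the stress test has silently stopped working
other_process_usage = 30.0     # e.g. a video encode running in parallel
idle = 100.0 - stress_process_usage - other_process_usage

# Checking the total: the other process masks the stall.
print(is_stalled(total_usage_from_idle(idle), expected))  # False
# Checking the stress test process itself: the stall is detected.
print(is_stalled(stress_process_usage, expected))         # True
```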

@leorg99

leorg99 commented Feb 20, 2024

I am also seeing this issue with a 7600X at stock settings on Windows 11 23H2 with the latest cumulative update.

15:22:56 - Set to Core 2 (CPU 4 and 5)
                 + Setting the affinity to 48
                 + Successfully set the affinity to 48
           Running until all FFT sizes have been tested...
                 + 15:23:05 - Suspending the stress test process for 1000 milliseconds
                 +            Resuming the stress test process
                 + 15:23:07 - Checking CPU usage: 8.2%
                 + 15:23:09 - ...the CPU usage was too low, waiting 2000ms for another check...
                 + Process Id: 148932
                 + 15:23:13 - Checking CPU usage again (#1): 8.29%
                 +            Still not enough usage (#1)
                 + 15:23:13 - ...the CPU usage was too low, waiting 2000ms for another check...
                 + Process Id: 148932
                 + 15:23:18 - Checking CPU usage again (#2): 8.15%
                 +            Still not enough usage (#2)
                 + 15:23:18 - ...the CPU usage was too low, waiting 2000ms for another check...
                 + Process Id: 148932
                 + 15:23:22 - Checking CPU usage again (#3): 8.41%
                 +            The process seems to have recovered, continuing with stress testing
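
Two details of the log above can be reproduced with a short Python sketch. First, the affinity value 48 is the bitmask with bits 4 and 5 set, which matches "CPU 4 and 5". Second, the four readings are consistent with a threshold of half the expected usage, assuming 2 stress threads on the 12 logical CPUs of a 7600X. That threshold is an inference from these four values only, not something confirmed by the thread, and the function names are hypothetical.

```python
from typing import Iterable, List


def affinity_mask(cpu_indices: Iterable[int]) -> int:
    """Set one bit per logical CPU index."""
    mask = 0
    for index in cpu_indices:
        mask |= 1 << index
    return mask


def cpus_in_mask(mask: int) -> List[int]:
    """Inverse of affinity_mask: list the logical CPUs whose bit is set."""
    return [index for index in range(mask.bit_length()) if (mask >> index) & 1]


# Logical CPUs 4 and 5 -> 2**4 + 2**5 = 48, matching the log line.
print(affinity_mask([4, 5]))  # 48
print(cpus_in_mask(48))       # [4, 5]

# Assumptions: 2 stress threads, 12 logical CPUs, threshold = half of expected.
expected = 100.0 * 2 / 12            # about 16.67
threshold = expected / 2             # about 8.33
readings = [8.2, 8.29, 8.15, 8.41]   # the four values from the log
print([reading >= threshold for reading in readings])  # [False, False, False, True]
```

Under these assumptions, readings of roughly 8% correspond to only one of the two logical CPUs being loaded, which is the same picture the owner later describes seeing in Task Manager.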

When I run Prime95 manually (even the latest beta version 30.19 build 9) after CoreCycler has created the prime.txt and stress.txt files, I see no errors.

Interestingly enough, when I connect to this PC while CoreCycler is running Prime95, the CPU usage does not recover and it throws an error for the CPU usage being too low (usually <1%). Again, there is no issue when running Prime95 directly. This happens on every core. My suspicion from looking through the source is that it's just not detecting the CPU usage/load properly.

Edit: I can give you access to this box, as it's just sitting to the side while I test it. You could remote in through AnyDesk or something and try to debug?

config.zip

@Zrrrrrrg

I run into this problem mostly in the following scenario:
I set up the program, let it run on its own and leave the machine alone (screen off, etc.).
When I come back after a few hours, turn the screen back on and check the current status,
it shows that all iterations have passed and it is still running. But just a few seconds later (maybe 10?), it throws the
"doesn't use enough CPU power anymore" error.
If this is caused by a Windows mechanism, could we develop a WSL version using the Linux ABI? Apart from AIDA64, y-cruncher and Prime95 should have Linux executables.

@sp00n
Owner

sp00n commented Jun 7, 2024

I now have a Windows 11 box which seems to show this behavior.
I have disabled the CPU utilization check in the latest alpha3, so the error itself shouldn't appear anymore.

However I noticed that when I re-enable this CPU check, it happens when I enable 2 threads, and the Windows Task Manager (or Process Explorer, System Informer, etc, whatever you're using) then shows that it doesn't fully load both virtual CPUs of the core.

It looks something like this:

image

If anyone wants to test this, let me know if you see the same happening.
And if you're using the latest 0.9.5.0alpha3, make sure to re-enable the CPU check by setting disableCpuUtilizationCheck = 0 in the config.ini.

I have no idea what's going on there. Right now I suspect that the Windows Thread Director, or whatever it's called, is interfering.

@LucidLuxxx

LucidLuxxx commented Jun 7, 2024

> If anyone wants to test this, let me know if you see the same happening.
> And if you're using the latest 0.9.5.0alpha3, make sure to re-enable the CPU check by setting disableCpuUtilizationCheck = 0 in the config.ini.

I downloaded alpha3 and left everything at default settings except DisableCpuUtilizationCheck which I set to 0, and Threads I set to 2. Mine seems to be working normally. I'm on Windows 11 23H2 Build 22631.3672
Screenshot 2024-06-07 162436

@sp00n
Owner

sp00n commented Jun 7, 2024

> > If anyone wants to test this, let me know if you see the same happening.
> > And if you're using the latest 0.9.5.0alpha3, make sure to re-enable the CPU check by setting disableCpuUtilizationCheck = 0 in the config.ini.
>
> I downloaded alpha3 and left everything at default settings except DisableCpuUtilizationCheck which I set to 0, and Threads I set to 2. Mine seems to be working normally. I'm on Windows 11 23H2 Build 22631.3672

Well, if you don't encounter the error anymore, as you previously said, it's not too unexpected that you also don't see this happening. 😁
I was hoping someone with the error could test this, but at least you did confirm that not all Windows 11 installations suffer from this weird behavior.

@kydex

kydex commented Jul 31, 2024

I have this error using R7 7700 and Windows 11 23H2, disabling C-states solves the problem
CoreCycler v0.9.6.2

@sp00n
Owner

sp00n commented Aug 20, 2024

C-States seem to be a recurring issue with this. I wonder why.
I've heard before that disabling C-States could improve the stability of an overclock/undervolt, but this issue is really weird.

@sp00n added the "under investigation" (Checking out the issue) and "discussion" (No real issue, just discussing) labels on Aug 20, 2024
@LucidLuxxx

I disable C-States, period. It improves your curve stability values, improves latency, and helps your 1% low FPS. With C-States enabled, core 4 and 0 are stable at -2 and +2. With C-States disabled, I get -7 and -5 on core 0 and 4. It uses more power though, and idle temps may be a little higher.

@Demiurg0s

I'm having that problem on two cores with a 7800X3D, testing PBO with -15.

Should I worry about it?

@Deepcuts

I had to switch to Windows 11 before I had completely finished curve testing on Windows 10.
Even with C-States disabled in the BIOS, I get lots of these errors while running HandBrake encodes at the same time.
I did not see these errors under Windows 10.
image

@sp00n
Owner

sp00n commented Nov 23, 2024

Well, that's not too surprising, HandBrake will claim the resources for itself.

I did change the priority of the stress test program back from high to normal in one of the later releases, because it was causing issues when you tried to bring it to the front instead of leaving it running in the background.
Maybe you were still using a version with the higher priority in Windows 10. You can change that back to Above Normal or High in the config.ini. Or Windows 11 just distributes the load differently; I noticed quite significant differences in how the Windows scheduler behaves (e.g. for Intel P and E-Cores).

@LucidLuxxx

LucidLuxxx commented Dec 6, 2024 via email

@sp00n
Owner

sp00n commented Dec 23, 2024

However that makes it even more ominous why it would discover more errors when only running at base clock. ¯\_(ツ)_/¯


8 participants