Add support for Intel turbo frequency throttling through intel_pstate #227
Here's a temperature vs turbo-pct graph of compiling some random 31 packages on a macbook pro 16,2. Note that this patch doesn't control the |
Hello @gktrk, in intel-pstate.c line 128, shouldn't the unsigned int "val" be converted to int so that the "val += step" and "val < 0" comparison works without the risk of overflow?
Sorry for the late reply. You're right, there's a potential for underflow. It should just be |
Support controlling the maximum turbo frequency percentage on Intel CPUs. Under heavy loads with 100% turbo frequency, spinning the fans at their maximum speed does not prevent the CPU from reaching temperatures of 90C and beyond. Reducing the effective clock frequency of the CPU is an effective way to control its temperature. The proposed intel_pstate controller tries to maintain the turbo frequency at the maximum possible value that still keeps the CPU temperature below the user-specified 'max_temp' value. Every time the temperature crosses the max_temp threshold, the controller reduces the maximum frequency percentage by 4%. When the temperature drops, the controller increases the turbo frequency by 1%. To account for the jitter in temperature readings, the controller only increases the turbo frequency when the drop is greater than or equal to 3 degrees Celsius. If the temperature drops below 'high_temp', the maximum turbo frequency percentage is set back to 100%. The values 4%, 1%, and 3C are chosen empirically and the code can be updated later on to make these user-configurable as well through mbpfan.conf. Signed-off-by: Göktürk Yüksek <[email protected]>
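The decision rule described in the commit message can be sketched as a small pure function. This is only an illustration under stated assumptions: the function name `pstate_step` and the constant names are hypothetical, and the order of the checks follows the review hunks quoted in this conversation rather than the full patch.

```c
/* Hypothetical constants mirroring the values quoted in the commit message. */
enum {
    THROTTLE_STEP = -4,  /* percent removed once max_temp is reached */
    RECOVER_STEP  = 1,   /* percent restored after a sufficient drop */
    RECOVER_DROP  = 3,   /* minimum per-poll drop (degrees C) that counts as cooling */
    RESTORE_STEP  = 100  /* large step; the clamp in the adjust routine caps it at 100% */
};

/* Return the step to apply to the turbo percentage, or 0 for no change.
 * pstate_step is an illustrative name, not a function from the patch. */
int pstate_step(int old_temp, int new_temp, int high_temp, int max_temp)
{
    if (new_temp < high_temp)
        return RESTORE_STEP;           /* cool enough: go back to full turbo */
    if (new_temp >= max_temp)
        return THROTTLE_STEP;          /* too hot: back off aggressively */
    if (new_temp - old_temp <= -RECOVER_DROP)
        return RECOVER_STEP;           /* clearly cooling: recover cautiously */
    return 0;                          /* within jitter: leave the cap alone */
}
```

With thresholds of 66 and 86 (assumed here for illustration), a reading that moves from 85 to 86 yields -4, a drop from 76 to 73 yields +1, and a drop from 77 to 76 yields no change.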
Force pushed with the fix. Everything is signed now. |
I'm not familiar with the existing CPU governors -- how do they interact with this PR?
    result = settings_get_int(settings, "general", "intel_pstate_control");

    if (result != 0) {
        intel_pstate_control = (result == 0) ? 0 : 1;
Isn't this always 1? So you can just set intel_pstate_control = result.
Would there be a case where it's not in the config file? Anyone upgrading from a previous release may not immediately update their config file.
Yes, we must handle the missing value case as well. But it appears that settings_get_int returns 0 when it is missing, the same as disabled?
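If "missing" and "disabled" ever need to be told apart (for example, if the default were changed to enabled), one option is to parse the raw value with an explicit fallback. The sketch below is an assumption for illustration only: `parse_bool_setting` is a hypothetical helper, and it takes the raw string (or NULL when the key is absent) instead of calling the settings library, whose exact API is not shown in this thread.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: raw is the text of the config value, or NULL when
 * the key is absent from mbpfan.conf. fallback is the built-in default. */
int parse_bool_setting(const char *raw, int fallback)
{
    char *end;
    long v;

    if (raw == NULL)
        return fallback;          /* key missing: keep the built-in default */

    errno = 0;
    v = strtol(raw, &end, 10);
    if (errno != 0 || end == raw)
        return fallback;          /* not a number: treat it like a missing key */

    return v != 0;                /* any non-zero value enables the feature */
}
```

With a fallback of 0 this behaves the same as the code under review; the difference only shows up when the default is something other than "disabled".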
    else if (new_temp >= max_temp) /* throttle down to keep the temp in control */
        intel_pstate_adjust(intel_pstate, -4);
    else if ((new_temp - old_temp) <= -3) /* core is cooling down, increase turbo */
        intel_pstate_adjust(intel_pstate, +1);
I tested with stress --cpu 4 on a MBA 2014 and found that the pstate gradually decreased under load but after removing the load it stayed in one state for a while then jumped by a lot:
Adjusting intel_pstate: val: 40, step: -4
Old Temp: 85 New Temp: 86 Fan: Exhaust Speed: 6500
Sleeping for 1 seconds
Old Temp: 86 New Temp: 84 Fan: Exhaust Speed: 6443
Sleeping for 1 seconds
Old Temp: 84 New Temp: 83 Fan: Exhaust Speed: 6386
Sleeping for 1 seconds
Old Temp: 83 New Temp: 81 Fan: Exhaust Speed: 6215
Sleeping for 1 seconds
Old Temp: 81 New Temp: 81 Fan: Exhaust Speed: 6215
Sleeping for 1 seconds
Old Temp: 81 New Temp: 80 Fan: Exhaust Speed: 6101
Sleeping for 1 seconds
Old Temp: 80 New Temp: 80 Fan: Exhaust Speed: 6101
Sleeping for 1 seconds
Old Temp: 80 New Temp: 79 Fan: Exhaust Speed: 5968
Sleeping for 1 seconds
Old Temp: 79 New Temp: 79 Fan: Exhaust Speed: 5968
Sleeping for 1 seconds
Old Temp: 79 New Temp: 78 Fan: Exhaust Speed: 5816
Sleeping for 1 seconds
Old Temp: 78 New Temp: 78 Fan: Exhaust Speed: 5816
Sleeping for 1 seconds
Old Temp: 78 New Temp: 78 Fan: Exhaust Speed: 5816
Sleeping for 1 seconds
Old Temp: 78 New Temp: 77 Fan: Exhaust Speed: 5645
Sleeping for 1 seconds
Old Temp: 77 New Temp: 77 Fan: Exhaust Speed: 5645
Sleeping for 1 seconds
Old Temp: 77 New Temp: 77 Fan: Exhaust Speed: 5645
Sleeping for 1 seconds
Old Temp: 77 New Temp: 76 Fan: Exhaust Speed: 5455
Sleeping for 1 seconds
Old Temp: 76 New Temp: 75 Fan: Exhaust Speed: 5246
Sleeping for 1 seconds
Old Temp: 75 New Temp: 77 Fan: Exhaust Speed: 5246
Sleeping for 1 seconds
Old Temp: 77 New Temp: 77 Fan: Exhaust Speed: 5246
Sleeping for 1 seconds
Old Temp: 77 New Temp: 76 Fan: Exhaust Speed: 5246
Sleeping for 1 seconds
Adjusting intel_pstate: val: 36, step: 1
Old Temp: 76 New Temp: 73 Fan: Exhaust Speed: 4771
Sleeping for 1 seconds
Old Temp: 73 New Temp: 73 Fan: Exhaust Speed: 4771
Sleeping for 1 seconds
Old Temp: 73 New Temp: 72 Fan: Exhaust Speed: 4505
Sleeping for 1 seconds
Old Temp: 72 New Temp: 72 Fan: Exhaust Speed: 4505
Sleeping for 1 seconds
Old Temp: 72 New Temp: 71 Fan: Exhaust Speed: 4220
Sleeping for 1 seconds
Old Temp: 71 New Temp: 70 Fan: Exhaust Speed: 3916
Sleeping for 1 seconds
Old Temp: 70 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 69 Fan: Exhaust Speed: 3593
Sleeping for 1 seconds
Old Temp: 69 New Temp: 68 Fan: Exhaust Speed: 3251
Sleeping for 1 seconds
Old Temp: 68 New Temp: 68 Fan: Exhaust Speed: 3251
Sleeping for 1 seconds
Old Temp: 68 New Temp: 67 Fan: Exhaust Speed: 2890
Sleeping for 1 seconds
Old Temp: 67 New Temp: 67 Fan: Exhaust Speed: 2890
Sleeping for 1 seconds
Old Temp: 67 New Temp: 68 Fan: Exhaust Speed: 2890
Sleeping for 1 seconds
Old Temp: 68 New Temp: 67 Fan: Exhaust Speed: 2890
Sleeping for 1 seconds
Old Temp: 67 New Temp: 67 Fan: Exhaust Speed: 2890
Sleeping for 1 seconds
Adjusting intel_pstate: val: 37, step: 100
Old Temp: 67 New Temp: 66 Fan: Exhaust Speed: 2510
I'm guessing you use the default value of high_temp in mbpfan.conf. This was a deliberate design decision by me to maximize the performance when the temperature is below high_temp. Note that setting the pstate doesn't directly change the CPU frequency. An SMT core would opportunistically engage turbo on its own under certain types of load and then disengage. For example, if a core is stalling on a cache miss for one of its threads, it wouldn't engage turbo. If you schedule two compute-bound jobs on the same physical SMT core and the IPC is high, then the core can bet that increasing the clock would provide even more IPC and engage turbo.
Let's take your log for example. The turbo % goes from 40 to 36 at the beginning and the temperature starts falling. At this point, it might be falling because we capped the maximum turbo frequency, or it could be because the load has decreased and turbo isn't kicking in like it used to. There is also the thermal lag, which is apparent in the graph I posted above. Notice how in the middle sections of the graph, every time turbo pct is reduced, it's reflected in the temperature only a few seconds after.
After seeing a temperature drop of 3C (from 76C to 73C), the pstate controller assumes that it's safe to increase the turbo pct by 1%, up to 37%. Clearly, if I increase turbo by 1% for every 3C drop in the temperature, I'll never get back to 100%. At low temperatures, you do want the turbo to be at 100%. The fans will be enough to handle short, bursty turbo kick-ins that improve performance. So below high_temp, I restore it back to 100%. Note again that restoring it to 100% doesn't affect the CPU frequency under no load, which you might have observed yourself too. Notice how in the graph I posted above, at the end the turbo pct is set to 100% yet the temperature keeps dropping.
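The "never get back to 100%" point can be checked with a line of arithmetic. The helper below is a hypothetical illustration, assuming the temperature only falls and that every 1% increase costs at least one polling interval with a drop of the given size.

```c
/* Minimum cumulative cooling (degrees C) needed to climb from pct back to
 * 100% when each +1% step requires a drop of at least drop_per_step degrees
 * within a single polling interval. Illustrative helper, not from the patch. */
int drop_needed(int pct, int drop_per_step)
{
    return (100 - pct) * drop_per_step;
}
```

Starting from 37% with a 3C requirement, this gives 63 steps and at least 189C of cooling, far more than the distance between max_temp and room temperature, which is why the controller jumps straight back to 100% below high_temp.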
Why decrease by 4% at a time? If you look at the beginning of the graph, you'll see that in a matter of seconds the temperature jumps from ~42C to ~77C. With a polling interval of 1 second and a turbo reduction of 1%, we can never respond to that type of increase in a timely manner. So we decrease aggressively and increase opportunistically. Why increase by 1% only after a 3C drop in temperature? I originally coded it with 1C and the change was very jittery. 3C allows for smoother transitions.
The relationship between the turbo and the temperature is unlike that between the fan speed and the temperature. With the fan, you want higher fan speed at high temperatures, and lower fan speed at lower temperatures. With the turbo, because it's an opportunistic performance mechanism, you want it at maximum at low temperatures until the fans are not enough to contain the increased temperature. My mbpro will settle at 79C with a turbo pct of 80% under hours-long compilation workloads. However, for web browsing and things like that, it will have no issues running at 100% turbo.
Sorry this has been a relatively long response to a comment. It's just more intricate than fan control. I'm more than open to suggestions of course.
I tested with the latest PR doing a dnf update. I observed similar results and I still don't understand the intended behavior. In my mind, if the CPU is throttled by a lower pstate, the fan should run as fast as possible to allow the pstate to increase as soon as possible. I am testing on a MBA 2014 with default config -- maybe we observe different behavior based on hardware?
@gktrk Do we have a path forward here? I still don't understand the intended behavior.
I added max MHz to the logs and see other strange behavior:
Somehow this always sets the pstate unless the CPU frequency changes? It shouldn't set pstate if nothing has changed.
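One way to avoid redundant writes is to remember the last value that was applied and return early when the new one matches. The following is a minimal sketch, assuming a hypothetical `pstate_cache` struct; the counter stands in for the real sysfs write, which is not reproduced here.

```c
/* Hypothetical state: the last percentage applied, plus a counter that
 * stands in for the real write so the behavior can be observed. */
struct pstate_cache {
    int current_pct;
    int writes;
};

/* Return 1 if a write was issued, 0 if the value was already in place. */
int pstate_set(struct pstate_cache *c, int new_pct)
{
    if (new_pct == c->current_pct)
        return 0;             /* nothing changed: skip the write and the log line */

    c->current_pct = new_pct;
    c->writes++;              /* the real code would write the new value here */
    return 1;
}
```

Requesting 100% while already at 100% then becomes a no-op, so the "Adjusting intel_pstate" message would only appear when the cap actually moves.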
I believe it's hitting the |
I've pushed the fixes as new commits for ease of review, so you can see what I've changed over the baseline PR. I can squash them into a single commit later, before the merge.
Sorry about the delayed review -- I am using my desktop and tablet more often these days and did not have the opportunity to test until now.
    if (new_val < 0)
        new_val = 0;
    if (new_val > 100)
        new_val = 100;
Should we reparent the min and max macros in mbpfan.c to util.h so we can call them here?
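With shared macros, the two range checks collapse into one expression. The macro definitions below are assumed for the sake of a self-contained example and may differ in detail from the ones in mbpfan.c; `clamp_pct` is a hypothetical name.

```c
/* Assumed definitions; the macros in mbpfan.c may differ in detail. */
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

/* Apply a signed step to a percentage and clamp the result to [0, 100].
 * Doing the arithmetic in a signed int avoids the unsigned wrap-around
 * raised earlier in this review. */
int clamp_pct(int val, int step)
{
    return max(0, min(100, val + step));
}
```

A step of -4 from 2% stops at 0, and the large "restore" step of 100 from 37% stops at 100.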
    extern int intel_pstate_init(t_intel_pstate *intel_pstate);
    extern int intel_pstate_is_available(void);
    extern int intel_pstate_adjust(t_intel_pstate *intel_pstate, int step);
    extern void intel_pstate_exit(t_intel_pstate *intel_pstate);
Does extern have a meaning here? We don't do this for other function prototypes.
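For function prototypes at file scope, C treats a declaration without a storage-class specifier as if it were declared extern, so the keyword is redundant there. A small sketch with a made-up function name shows that both spellings declare the same function:

```c
/* Both prototypes declare the same function with external linkage. */
extern int pstate_demo(int step);
int pstate_demo(int step);       /* same meaning; extern is implied */

/* A single definition satisfies both declarations. */
int pstate_demo(int step)
{
    return step;
}
```

Dropping extern from the header would therefore match the style of the other prototypes without changing behavior.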
Still something going on?
Hi, I've had to remove my native installation because the kernel support for T2 chip proved to be too problematic. I haven't worked on this patch for a while.
It is possible that we observe different behavior based on hardware. My issue before writing this patch was that I would start a long build job, the turbo would kick in, mbpfan would set fans to max, which wouldn't be enough, and eventually the temperature would go above 100C. What I wanted was to maintain the CPU at the maximum allowed temperature when the fans are at 100%, by adjusting the maximum turbo frequency. The following graph from an earlier comment demonstrates my point: As you can see, for most of the workload the temperature stays constant. This is with fans running at 100%. So for me, the pstate is to be controlled when the fans are not enough. Maybe it doesn't make sense if your CPU doesn't heat up too much under full load with max turbo frequency. Then the best course of action is to leave turbo at 100% all the time. Either way, as much as I'd like to go back to a native install, because Mac OS doesn't let me increase the turbo frequency, I don't see myself doing that anytime soon. If it doesn't benefit anybody else in the community, I'm OK with closing this pull request.