Add MSI Katana GF66 11UE and GF66 11UG support #129

Merged: 7 commits, Jun 12, 2024


TGODiamond
Contributor

Fn <-> Windows works, but the files report the opposite.

Real-time fan monitoring is not accurate: the GPU rt fan file can report over 100%, and the CPU rt fan file is very weird. It looks like address 0xc9 doesn't even represent the CPU fan speed linearly.

Everything else works.

EC Dump

     | _0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _a _b _c _d _e _f
-----+------------------------------------------------
0x0_ | 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1_ | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x2_ | 00 00 00 00 00 00 00 00 0a 05 00 00 02 04 0b 0b
0x3_ | 03 01 00 0d 00 00 50 81 d2 11 88 2c c8 01 c0 00
0x4_ | f8 11 5e 00 d1 0f 00 00 c5 0e 28 31 f6 0b fa 32
0x5_ | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x6_ | 00 00 00 00 00 00 00 00 36 00 37 40 49 4c 52 58
0x7_ | 64 2b 26 2b 30 36 3c 46 55 64 08 03 03 03 03 03
0x8_ | 32 00 37 3d 43 49 4f 54 63 00 00 2b 30 36 3c 46
0x9_ | 55 64 08 03 03 03 03 02 06 0f 7d 06 0a 78 45 00
0xa_ | 31 35 38 31 45 4d 53 31 2e 31 30 37 30 36 32 38
0xb_ | 32 30 32 32 30 39 3a 30 37 3a 30 38 00 00 00 28
0xc_ | 00 00 01 00 00 00 00 00 00 d3 00 00 00 00 00 00
0xd_ | 00 00 c4 81 0d 00 05 d0 00 01 00 00 00 00 00 00
0xe_ | e2 00 00 d1 0f 00 00 40 00 00 00 00 00 d1 00 00
0xf_ | 40 00 70 00 00 64 00 00 64 00 00 00 00 00 00 00

@glpnk
Contributor

glpnk commented Jun 6, 2024

Nice work!

Please mention #38 in docs.

.rt_fan_speed_address = 0xc9,

Actually, the original meaning of this value seems to be the %RPM of the CPU cooler, but it's badly documented and has been misunderstood by other contributors. It also has limits, for some reason. Probably all coolers use %RPM in the range 0-150%, unless the MSI app states otherwise for the custom fan curve. The right address for your device should be 0x71. According to the fan curve in the dump, the maximum %RPM should be 100%, but I think cooler boost might set it to 150%, which you need to check. UPD: CPU %RPM has "normalization" and GPU does not, so you need to specify the max value rt_fan_speed_base_max obtained while using cooler boost.

Addresses 0xC8-0xCF hold n/RPM values, and your kernel might have a module that lm-sensors uses to get the RPM.

Fn <-> Windows works, but the files report the opposite.

It's a known issue: the driver assumes the keyboard layout is ctrl | fn | win | alt and not the gaming one.

@TGODiamond
Contributor Author

TGODiamond commented Jun 6, 2024

Good news! I just noticed something: the number is actually linear. The CPU fan uses 0xc8 as an overflow byte ahead of 0xc9. The same goes for the GPU, which uses 0xca as an overflow. The larger the value, the slower the fan, but it is 0 when the fan is not spinning.

I also found out that 0x72-0x77 is the fan curve for the CPU fan, with values from 0% (0x00) to 150% (0x96). These values correspond to these sliders:
[image: cpu fan curve]

0x71 is the current fan speed percent selected by the fan curve.

Just like the CPU, the GPU fan curve is at 0x8a-0x8f, with 0x89 being the current fan speed percent selected by the fan curve.
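To make the address layout above concrete, here is a minimal sketch that pulls the CPU and GPU curve bytes out of the EC dump posted at the top of this PR. The rows are copied from that dump; the helper names (`row_bytes`, `read_curve`) are made up for illustration and are not part of msi-ec.

```python
# Rows 0x7_ and 0x8_ copied verbatim from the EC dump in this PR.
ROW_70 = "64 2b 26 2b 30 36 3c 46 55 64 08 03 03 03 03 03"
ROW_80 = "32 00 37 3d 43 49 4f 54 63 00 00 2b 30 36 3c 46"


def row_bytes(row):
    """Turn one dump row ("64 2b ...") into a list of integers."""
    return [int(token, 16) for token in row.split()]


# Build a sparse address -> value map for the two rows we care about.
ec = {}
for base, row in ((0x70, ROW_70), (0x80, ROW_80)):
    for offset, value in enumerate(row_bytes(row)):
        ec[base + offset] = value


def read_curve(ec, current_addr, first, last):
    """Return (currently selected percent, list of curve percents)."""
    return ec[current_addr], [ec[addr] for addr in range(first, last + 1)]


# Addresses as described in this thread: 0x71 + 0x72-0x77 (CPU),
# 0x89 + 0x8a-0x8f (GPU).
cpu_current, cpu_curve = read_curve(ec, 0x71, 0x72, 0x77)
gpu_current, gpu_curve = read_curve(ec, 0x89, 0x8A, 0x8F)

print("CPU current %:", cpu_current, "curve:", cpu_curve)
print("GPU current %:", gpu_current, "curve:", gpu_curve)
```

In this dump the CPU value at 0x71 equals one of the curve points, which is consistent with it being the percent currently selected from the curve.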

Just a reminder, I don't think my model has a "basic fan mode", only auto, silent and advanced, which is where the fan curve kicks in. It definitely does not show up in my MSI Center.

I also tried to set the fans to silent mode directly by editing the EC. When that happened, it ignored the fan curves, but it still changed 0x71 and 0x89 to the wanted percent from the fan curves. Same with the GPU.

@glpnk
Contributor

glpnk commented Jun 6, 2024

Yeah, silent mode uses a different fan curve.

In the lm-sensors driver, the real-time RPM is calculated like this:

RPMcpu = 48000 / (({0xc8} << 8) + {0xc9})

The GPU is similar, but uses the next pair. Also, it seems there are laptops with 3 and 4 coolers.

@TGODiamond
Contributor Author

Also, turning on cooler boost doesn't directly change 0x71 and 0x89.

@TGODiamond
Contributor Author

It only changes since the temperature changes, thus changing the fan curve's wanted fan speed percent.

@glpnk
Contributor

glpnk commented Jun 6, 2024

I tried writing to the CPU %RPM address: it starts the cooler, then stops it.

@TGODiamond
Contributor Author

I think that address is supposed to be read-only. Maybe the EC reacts when trying to change it?

@TGODiamond
Contributor Author

Ok, I see the GF66 11UG has the same EC version as the GF66 11UE. I'll rename the PR to include that.
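For readers who want to check the EC version themselves, the sketch below decodes rows 0xa_ and 0xb_ of the dump at the top of this PR as ASCII. The split into a 12-byte version, 8-byte date and 8-byte time is an assumption read off the bytes in this particular dump, not something stated in the thread.

```python
# Rows 0xa_ and 0xb_ copied verbatim from the EC dump in this PR.
ROW_A0 = "31 35 38 31 45 4d 53 31 2e 31 30 37 30 36 32 38"
ROW_B0 = "32 30 32 32 30 39 3a 30 37 3a 30 38 00 00 00 28"

# bytes.fromhex ignores the spaces between the hex pairs.
raw = bytes.fromhex(ROW_A0 + " " + ROW_B0)

# Assumed field boundaries (inferred from this dump only):
#   0xa0-0xab version, 0xac-0xb3 build date, 0xb4-0xbb build time.
fw_version = raw[0:12].decode("ascii")
fw_date = raw[12:20].decode("ascii")
fw_time = raw[20:28].decode("ascii")

print("version:", fw_version)
print("date:   ", fw_date)
print("time:   ", fw_time)
```

Comparing these strings between two machines is one way to tell whether they share an EC firmware build.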

@TGODiamond TGODiamond changed the title Add MSI Katana GF66 11UE support Add MSI Katana GF66 11UE support and GF66 11UG Jun 6, 2024
@TGODiamond TGODiamond changed the title Add MSI Katana GF66 11UE support and GF66 11UG Add MSI Katana GF66 11UE and GF66 11UG support Jun 6, 2024
@TGODiamond
Contributor Author

Added GF66 11UG to docs and comments

@teackot
Collaborator

teackot commented Jun 12, 2024

Is this one also ready to be merged?

@TGODiamond
Contributor Author

Only rt fan speed monitoring is not working properly, otherwise it's ready to be merged.

@glpnk
Contributor

glpnk commented Jun 12, 2024

CPU realtime speed .rt_fan_speed_address is 0x71

@teackot
Collaborator

teackot commented Jun 12, 2024

From what I understood from this discussion, 0x71 is the speed preferred by the auto mode. So what is 0xc9?

@TGODiamond
Contributor Author

Does RT fan speed have anything to do with the custom fan curves? Because if not, then it's 0xc9 with 0xc8 used as an overflow. Also, advanced mode should be the only mode where 0x71 is actually used.

@glpnk
Contributor

glpnk commented Jun 12, 2024

image

The 0xC8-0xCF values are RPM-related, 2 bytes per cooler:

RPMcpu = 48000 / (({0xc8} << 8) + {0xc9})

@glpnk
Contributor

glpnk commented Jun 12, 2024

These values are now used by the kernel module that lm-sensors reads to show cooler RPM.

0x71 is the %RPM value currently selected from the fan curve, according to temperature.

@TGODiamond
Contributor Author

TGODiamond commented Jun 12, 2024

Also, I found out that it should be 480000, not 48000.

Found it out from here:
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/msi-wmi-platform.c#L188C10-L188C16

Tested that, and works much better.
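As a rough illustration of the corrected formula, the sketch below applies 480000 / raw to the CPU and GPU byte pairs from the EC dump at the top of this PR. The function name `fan_rpm` is hypothetical, and returning 0 for a raw value of 0 is an assumption based on the earlier observation that the register reads 0 when the fan is not spinning.

```python
FAN_RPM_CONSTANT = 480000  # corrected value from this comment (not 48000)


def fan_rpm(high, low):
    """Convert a two-byte EC fan reading into RPM.

    `high` is the "overflow" byte (e.g. 0xc8) and `low` is the byte
    after it (e.g. 0xc9). A raw value of 0 is treated as a stopped fan.
    """
    raw = (high << 8) | low
    if raw == 0:
        return 0
    return FAN_RPM_CONSTANT // raw


# Values from the dump in this PR: 0xc8 = 0x00, 0xc9 = 0xd3 (CPU),
# 0xca = 0x00, 0xcb = 0x00 (GPU).
cpu_rpm = fan_rpm(0x00, 0xD3)
gpu_rpm = fan_rpm(0x00, 0x00)

print("CPU fan RPM:", cpu_rpm)
print("GPU fan RPM:", gpu_rpm)
```

With the old constant of 48000 the same CPU reading would come out ten times lower, which matches the report that the corrected value works much better.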

@glpnk
Contributor

glpnk commented Jun 12, 2024

Also, I found out that it should be 480000, not 48000.

I just remembered it incorrectly

Thanks for the link. After installing the lm-sensors package, check whether your kernel has this module by running the sensors command.

@TGODiamond
Contributor Author

Yes, it outputs this:

❯ sensors
iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +39.0°C  

BAT1-acpi-0
Adapter: ACPI interface
in0:          11.98 V  
curr1:         0.00 A  

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +56.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +49.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +52.0°C  (high = +100.0°C, crit = +100.0°C)
Core 2:        +52.0°C  (high = +100.0°C, crit = +100.0°C)
Core 3:        +51.0°C  (high = +100.0°C, crit = +100.0°C)
Core 4:        +51.0°C  (high = +100.0°C, crit = +100.0°C)
Core 5:        +46.0°C  (high = +100.0°C, crit = +100.0°C)

nvme-pci-0200
Adapter: PCI adapter
Composite:    +30.9°C  (low  = -273.1°C, high = +80.8°C)
                       (crit = +84.8°C)
Sensor 1:     +30.9°C  (low  = -273.1°C, high = +65261.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +57.0°C  

@glpnk
Contributor

glpnk commented Jun 12, 2024

Your kernel probably doesn't have this module.

UPD: It might be built in starting with kernel 6.10: lm-sensors/lm-sensors#475 (comment)

@TGODiamond
Contributor Author

Oh I see, I'm only running kernel 6.9.3

@teackot
Collaborator

teackot commented Jun 12, 2024

My fan uses 0xcd, with 0xc9 displaying numbers without any effect

So, considering the overflow thing, can c8:c9, ca:cb, cc:cd and ce:cf correspond to different fan "slots"?

@glpnk
Contributor

glpnk commented Jun 12, 2024

Yes. Also, c8:c9 and ca:cb are used in MControlCenter as RPM for the CPU and GPU: https://github.com/dmitry-s93/MControlCenter/blob/main/src/operate.cpp#L164-L176

But it ignores the overflow values, which reduces accuracy at low RPM.
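The slot idea and the low-RPM accuracy problem can be sketched together. The code below reads the four byte pairs from row 0xc_ of the dump in this PR, then compares a full 16-bit reading against one that drops the high byte. The raw value 0x01d3 is a made-up example of a slow fan, and the helper names are hypothetical.

```python
FAN_RPM_CONSTANT = 480000

# Row 0xc_ copied verbatim from the EC dump in this PR.
ROW_C0 = "00 00 01 00 00 00 00 00 00 d3 00 00 00 00 00 00"
row = [int(token, 16) for token in ROW_C0.split()]


def rpm_from_raw(raw):
    """480000 / raw, with raw == 0 treated as a stopped fan."""
    return FAN_RPM_CONSTANT // raw if raw else 0


def slot_rpms(row):
    """Decode the four candidate fan slots c8:c9, ca:cb, cc:cd, ce:cf."""
    rpms = []
    for high_index in (0x8, 0xA, 0xC, 0xE):
        raw = (row[high_index] << 8) | row[high_index + 1]
        rpms.append(rpm_from_raw(raw))
    return rpms


slots = slot_rpms(row)
print("slot RPMs:", slots)

# Hypothetical slow-fan reading where the high byte is non-zero.
raw_full = 0x01D3
with_high_byte = rpm_from_raw(raw_full)
low_byte_only = rpm_from_raw(raw_full & 0xFF)
print("full 16-bit reading:", with_high_byte)
print("low byte only:      ", low_byte_only)
```

In this dump only the first slot is non-zero. For the hypothetical slow reading, dropping the high byte more than doubles the reported RPM, which is the accuracy loss described above.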

@TGODiamond
Contributor Author

For me, it's c8:c9 for the cpu fan and ca:cb for the gpu fan. So yea, that must be the case.

@glpnk
Contributor

glpnk commented Jun 12, 2024

Actually, WMI-ACPI (ver 2), which your device uses, maps a few address groups. This map is based on the DSDT from a similar device.

Fan
image

Temperature
image

Thermal (IDK WTF is this and how works)
image

@teackot
Collaborator

teackot commented Jun 12, 2024

We've got the addresses right, so this PR is ready.

The fan speed formulas need to be reworked and we need to implement the auto mode (0x71 can be used to display the current temp threshold), but that's for a different PR. Merging this one

@glpnk
Contributor

glpnk commented Jun 12, 2024

@teackot as you mentioned in #98, you have an Intel gen 10 device. Your device and mine use the older WMI-ACPI layout; it has the fan curve and cooler RPM registers at the same addresses, but uses a different WMI name.

In the generation 1 WMI layout, the RPM section is called AP.
image

For some reason, my device sometimes stores garbage in random registers + randomize real cooler position

@teackot
Collaborator

teackot commented Jun 12, 2024

@glpnk how many coolers do you actually have?

randomize real cooler position

Like the cooler address changes? I haven't noticed this on my device

@teackot teackot merged commit 02f08cf into BeardOverflow:main Jun 12, 2024
@glpnk
Contributor

glpnk commented Jun 12, 2024

Do we need to recreate functions which other module handles now?

IDK how this method originally worked, because some devices mention an RPM = 480000/X address and others a %RPM one.

how many coolers do you actually have?

1 cooler. On previous screenshot was old dump. On newer, I've highlighted garbage values + real cooler is 3rd:

image psd

UPD: In MControlCenter cooler 1 can use 0xC9 or 0xCD address

@TGODiamond
Contributor Author

Remember to mark #38 as completed.

@teackot
Collaborator

teackot commented Jun 12, 2024

Do we need to recreate functions which other module handles now?

Right, we don't, only the curve. Should we just remove the fan_speed_show functions?

1 cooler. On previous screenshot was old dump. On newer, I've highlighted garbage values + real cooler is 3rd

My EC also spits garbage at 0xc9. Maybe the EC tries to read the values on all the "slots" and the garbage ones are just not wired to actual sensors, so it reads random noise?

@glpnk
Contributor

glpnk commented Jun 12, 2024

Should we just remove the fan_speed_show functions?

I think we don't need to remove it completely, just fix the real-time addresses for all devices and remove the normalization.

msi-ec/msi-ec.c

Lines 2497 to 2499 in 02f08cf

100 * (rdata - conf.cpu.rt_fan_speed_base_min) /
(conf.cpu.rt_fan_speed_base_max -
conf.cpu.rt_fan_speed_base_min));
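To see why this normalization can produce readings above 100%, here is a small sketch of the same arithmetic outside the kernel. The `base_min`, `base_max` and raw readings used below are hypothetical, chosen only to show the effect of a `base_max` that is lower than the largest raw value.

```python
def normalized_percent(rdata, base_min, base_max):
    """Mirror of the quoted expression:
    100 * (rdata - base_min) / (base_max - base_min).

    Python's // floors while C's / truncates toward zero; the results
    agree for the non-negative inputs used here.
    """
    return 100 * (rdata - base_min) // (base_max - base_min)


# Hypothetical raw %RPM readings, including one above 100.
samples = [43, 100, 150]

# base_max = 100: a raw reading of 150 is reported as 150%.
low_max = [normalized_percent(s, 0, 100) for s in samples]

# base_max = 150: the same readings are squeezed into 0-100%.
high_max = [normalized_percent(s, 0, 150) for s in samples]

print("base_max=100:", low_max)
print("base_max=150:", high_max)
```

Either way the reported number depends on per-device constants rather than on the register alone, which is the motivation given here for removing the normalization.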

Plus, we need to figure out what the basic fan speed is, and why some devices refer to the GPU %RPM value for it.

Maybe the EC tries to read the values on all the "slots" and the garbage ones are just not wired to actual sensors, so it reads random noise?

I think these values are timers that return a value after the cooler sends a pulse, so they might catch some noise.

3 participants