Skip to content

Latest commit

 

History

History
81 lines (43 loc) · 5.01 KB

readme.md

File metadata and controls

81 lines (43 loc) · 5.01 KB

SUPER-THREAD

or whatever, this is a simple & stupid utility to tweak affinity or some another settings of windows process and thread, with the inspire of ProcessHacker and GroupExtend.

preview

⚠DISCLAIMER⚠

required administrator privileges to run. this little tool which is written by a n00b with ugly code (O(^) inside!) basically will not damage anything, but there is absolutely no warranty, use on your own risk anyway.

⚠ Windows Defender may detect this program as a malware (the code gets security token maybe), use on your own risk.

⚠ Only tested on Windows 10 yet. as far as I know, Windows 11 and server 2022 thread affinity behavior has changed.

WHY

I wrote this shit because NUMA sucks on Microsoft® Windows®, especially when there are more than 64 threads on your monster machine.

legacy programs does not support SetThreadGroupAffinity() API, in other words, legacy programs can only utilize one single processor group, not the all threads available on system. well, with this tool, you may be able to override this, span threads of legacy program across all processor groups. for example, vmware workstation 16, cannot assign CPU threads to multiple processor groups out of box, with this tool, it may overcome that.

some devices like GPU may only be assigned to one specified CPU, and windows properly launches program which interacts with GPU a lot (yes, games) to opposite CPU. I hate to set affinity for them in task manager over and over. and ProcessHacker or ProcessExp (sysinternal suits) cannot assign group affinity, so I wrote one.

TECHNICAL DETAILS

⚠ undocumented Windows behavior, take a grain of salt while reading.

Windows may schedule things with thread as minimal entity.

msdn doc about scheduling priority which is defined by program priority class (which shows/sets in task manager for each process) and thread priority level. scheduling priority ranges from 0 (lowest priority) ~ 31 (highest priority) .

If a thread has ever called SetThreadGroupAffinity(), then the process it belongs to cannot set process affinity in task manager anymore, so let's call that process is threaded. that's why some processes can not set affinity in task manager sometimes.

isn't there a API named SetProcessGroupAffinity()? yes, and meanwhile, no. there are two APIs named like that in MinGW header, but both of them are stump. starting with Win10, I believed, task manager can directly change process group affinity, there are definitely some APIs can change process group directly. disassembling the taskmgr.exe, and those hidden API were disclosed.

so when a process is not threaded, this tool call hidden API to set process group affinity. and if a program was migrated to another processor group manually, its new threads will still start on previous processor group, it will be come threaded program likely. Windows 11 has an API named SetProcessDefaultCpuSetMasks() can fix this behavior probably, but Windows 10 is likely out-of-luck.

⚠ you have to understand NUMA a bit, in some workloads, assigning threads across different CPU socket will not provide any performance benefits at all, or performance may drop due to accessing remote memory and busy socket communications.

OVERVIEW

this tool runs in a simple manner:

USAGE

./super-thread.exe -c "path_to_json_config"

./super-thread.exe -h

CONFIG

Threads mode is recommended over Processes mode on Windows 11, since the API itself sometimes may even fail in Task Manager.

for details, please check ./config/config_commented.json

  • processes mode
    • by_map: all processes matched profile will be set with static group affinity provided

    • node_rr: all processes matched profile will be assigned to processor group with a round-robin manner and affinity mask provided

    • node_randome: all processes matched profile will be assigned to processor group randomly with affinity provided

    • onload: not implemented

  • threads mode
    • node_rr: set all threads of processes matched profile with round-robin processor group and affinity provided
    • node_random: set all threads of processes matched profile with random processor group and affinity provided
    • cpu_rr: set threads to single logical cpu by round-robin manner with affinity and node provided, to avoid scheduling migration. if multiple nodes are specified in node_map, threads will assign to next node when current node is fully assigned.
    • onload: not implemented

COMPILE

compile with cmake and msys2 mingw64 toolchain.

WHAT IS BROKEN

  • program may hang while querying handle of some processes, use file handle identity with caution
  • system-wide file handle search is broken since may stuck on some processes