Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinuxCNC and musl C library (possible?) #3185

Open
1 of 3 tasks
NTULINUX opened this issue Nov 18, 2024 · 35 comments
Open
1 of 3 tasks

LinuxCNC and musl C library (possible?) #3185

NTULINUX opened this issue Nov 18, 2024 · 35 comments

Comments

@NTULINUX
Copy link
Contributor

NTULINUX commented Nov 18, 2024

Here are the steps I follow to reproduce the issue:

  1. Use a Clang 19 system with musl
  2. Download and attempt to build LinuxCNC
  3. Wait for errors

This is what I expected to happen:

A proper compilation.

This is what happened instead:

A few errors, maybe an easy fix?

It worked properly before this:

Clang 16 and glibc built fine when I last tested this.

Information about my hardware and software:

  • I am using Gentoo

  • I am using this kernel version (shown by uname -a): Linux ntu-gentoo 6.12.0-rc7-BUILD SMP Mon Nov 11 10:20:31 CST 2024 x86_64 AMD Ryzen 7 2700X Eight-Core Processor AuthenticAMD GNU/Linux

  • I am running ...

    • A binary version from linuxcnc.org (including buildbot.linuxcnc.org)
    • A binary I built myself
    • A binary version from some other source besides linuxcnc.org
  • I am using this LinuxCNC version (shown in package manager or, for git versions, scripts/get-version-from-git): latest 2.9 git checkout

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 18, 2024

Compiling libnml/buffer/tcpmem.cc
libnml/buffer/tcpmem.cc:108:46: error: use of undeclared identifier 'u_short'; did you mean 'short'?
  108 |     server_socket_address.sin_port = htons(((u_short) tcp_port_number));
      |                                              ^~~~~~~
      |                                              short


Compiling rtapi/uspace_rtapi_app.cc
rtapi/uspace_rtapi_app.cc:351:14: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  351 |     char buf[len];
      |              ^~~
rtapi/uspace_rtapi_app.cc:351:14: note: read of non-const variable 'len' is not allowed in a constant expression
rtapi/uspace_rtapi_app.cc:350:9: note: declared here
  350 |     int len = read_number(fd);
      |         ^
rtapi/uspace_rtapi_app.cc:699:18: error: use of undeclared identifier 'M_TRIM_THRESHOLD'
  699 |     if (!mallopt(M_TRIM_THRESHOLD, -1)) {
      |                  ^
rtapi/uspace_rtapi_app.cc:704:18: error: use of undeclared identifier 'M_MMAP_MAX'
  704 |     if (!mallopt(M_MMAP_MAX, 0)) {
      |                  ^
rtapi/uspace_rtapi_app.cc:1053:21: error: use of undeclared identifier 'pthread_attr_setaffinity_np'; did you mean 'pthread_setaffinity_np'?
 1053 |           if((ret = pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset)) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     pthread_setaffinity_np
/usr/include/pthread.h:221:5: note: 'pthread_setaffinity_np' declared here
  221 | int pthread_setaffinity_np(pthread_t, size_t, const struct cpu_set_t *);
      |     ^
rtapi/uspace_rtapi_app.cc:1053:49: error: cannot initialize a parameter of type 'pthread_t' (aka 'unsigned long') with an rvalue of type 'pthread_attr_t *'
 1053 |           if((ret = pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset)) != 0)
      |                                                 ^~~~~
/usr/include/pthread.h:221:37: note: passing argument to parameter here
  221 | int pthread_setaffinity_np(pthread_t, size_t, const struct cpu_set_t *);
      |                                     ^


Compiling hal/user_comps/xhc-whb04b-6/usb.cc
In file included from hal/user_comps/xhc-whb04b-6/usb.cc:20:
hal/user_comps/xhc-whb04b-6/./usb.h:269:20: error: field has incomplete type 'struct timeval'
  269 |     struct timeval mLastWakeupTimestamp;
      |                    ^
hal/user_comps/xhc-whb04b-6/./usb.h:269:12: note: forward declaration of 'XhcWhb04b6::timeval'
  269 |     struct timeval mLastWakeupTimestamp;
      |            ^
hal/user_comps/xhc-whb04b-6/usb.cc:503:40: error: variable has incomplete type 'struct timeval'
  503 |                         struct timeval now;
      |                                        ^
hal/user_comps/xhc-whb04b-6/./usb.h:269:12: note: forward declaration of 'XhcWhb04b6::timeval'
  269 |     struct timeval mLastWakeupTimestamp;
      |            ^
hal/user_comps/xhc-whb04b-6/usb.cc:517:44: error: variable has incomplete type 'struct timeval'
  517 |                             struct timeval now;
      |                                            ^
hal/user_comps/xhc-whb04b-6/./usb.h:269:12: note: forward declaration of 'XhcWhb04b6::timeval'
  269 |     struct timeval mLastWakeupTimestamp;
      |            ^


Compiling emc/task/backtrace.cc
emc/task/backtrace.cc:43:9: error: cannot assign to variable 'stderr' with const-qualified type 'FILE *const' (aka '_IO_FILE *const')
   43 |         stderr = freopen(filename, "a", stderr);
      |         ~~~~~~ ^
/usr/include/stdio.h:64:20: note: variable 'stderr' declared const here
   64 | extern FILE *const stderr;
      | ~~~~~~~~~~~~~~~~~~~^~~~~~


Compiling emc/rs274ngc/interp_o_word.cc
emc/rs274ngc/interp_o_word.cc:548:36: error: use of undeclared identifier 'basename'
  548 |     logOword("Entered:%s %s", name,basename(block->o_name));
      |                                    ^
emc/rs274ngc/interp_o_word.cc:549:36: error: use of undeclared identifier 'basename'
  549 |     it = settings->offset_map.find(basename(block->o_name));
      |                                    ^
emc/rs274ngc/interp_o_word.cc:612:28: error: use of undeclared identifier 'basename'
  612 |     settings->skipping_o = basename(block->o_name); // start skipping
      |                            ^
emc/rs274ngc/interp_o_word.cc:613:33: error: use of undeclared identifier 'basename'
  613 |     settings->skipping_to_sub = basename(block->o_name); // start skipping
      |                                 ^

@NTULINUX
Copy link
Contributor Author

For the pthread_attr_setaffinity_np error, this commit from AWS might offer some insight:

https://github.com/awslabs/aws-c-common/pull/755/files

@BsAtHome
Copy link
Contributor

BsAtHome commented Nov 23, 2024

struct timeval is declared in sys/time.h. It looks like it is not included. Maybe it gets included indirectly with glibc and not with musl. Try to add #include <sys/time.h> at the top of hal/user_comps/xhc-whb04b-6/usb.h where the other includes are.

@NTULINUX
Copy link
Contributor Author

Getting closer! That solved that! Thanks!

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 23, 2024

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 23, 2024

Just squashed libnml/buffer/tcpmem.cc by adding:

#include <sys/types.h>

To the file.

WOO!!

@BsAtHome
Copy link
Contributor

BsAtHome commented Nov 23, 2024

The declaration of "extern FILE *const stderr;" is not standard, but it is not wrong.
From the man stderr(3):
Since the symbols stdin, stdout, and stderr are specified to be macros, assigning to them is nonportable. The standard streams can be made to refer to different files with help of the library function freopen(3), specially introduced to make it possible to reassign stdin, stdout, and stderr.
(emphasis mine)

I suggest to change/fix the code to use freopen(3), without reassigning to stderr.

[edit]The return value may be used to check for failure, but the reassignment is done implicitly by freopen.[edit]

@NTULINUX
Copy link
Contributor Author

I suggest to change/fix the code to use freopen(3), without reassigning to stderr.

Now that I don't know how to properly do.

@BsAtHome
Copy link
Contributor

change the lines (src/emc/task/backtrace.cc:43):
fclose(stderr);
stderr = freopen(filename, "a", stderr);

into
freopen(filename, "a", stderr);

Description of the change: There is no need to close "old" stderr before reassigning it. The close operation is implicit in the call to freopen (man page says: The original stream (if it exists) is closed.). The assignment to stderr is the problem that freopen is supposed to solve(*). So, just use freopen and drop the assignment.

(*) There is some irony in the original code. The assignment to stderr and friends is the exactly the problem that freopen is supposed to solve and then you see a piece of code using freopen and assigning to stderr... Gotta love it. ;-)

@NTULINUX
Copy link
Contributor Author

Thanks, I'm not much of a coder! I'll try it in a bit!

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 23, 2024

Things are looking REALLY good! Only two files left; emc/rs274ngc/interp_o_word.cc and rtapi/uspace_rtapi_app.cc

I made a musl branch with the respective changes:

NTULINUX@50946b9

As for M_TRIM_THRESHOLD and M_MMAP_MAX, I spoke to a few people in the musl IRC channel and someone didn't see why 32MB of pre-faulted memory would be any different with or without mmap. It could not be doing anything on glibc builds either, i.e. changing the trim threshold modifies when free() returns memory to the OS, but as this is a real-time application, it pre-allocates and faults in all it's memory. As a result, from the perspective of the allocator, the program doesn't allocate anything as it's only done once on init so M_TRIM_THRESHOLD and M_MMAP_MAX could be doing nothing.

@BsAtHome
Copy link
Contributor

The mmap problem comes into existence when you allocate large blocks. The allocator may choose to use mmap instead of sbrk. This has been explicitly disabled by setting M_MMAP_MAX to zero. I agree that the problem might be academic, but there may be many corner cases.
The same goes for the trim threshold. Once the mlockall is called requesting future pages locked, then that should prevent reducing the heap because reducing sbrk would fail. However, that failure may cause malloc and friends to fail, depending the stupidity/cleverness of the allocator. Again, corner cases may exist.

For musl, it may not apply for their allocator. That can only be answered by investigating the allocator code. For this case, it may be necessary to add a configure test to check mallopt() and the existense of the mmap and threshold flags. Then it can be coded in a preprocessor conditional.

@smoe
Copy link
Contributor

smoe commented Nov 23, 2024

Nice!
OpenWRT is another distribution with musl as their default C library IIRC.

@NTULINUX
Copy link
Contributor Author

Yes it is!

@heatd
Copy link

heatd commented Nov 23, 2024

The mmap problem comes into existence when you allocate large blocks. The allocator may choose to use mmap instead of sbrk. This has been explicitly disabled by setting M_MMAP_MAX to zero. I agree that the problem might be academic, but there may be many corner cases.

What's the problem here, exactly? mmap on Linux behaves the same as brk.

The same goes for the trim threshold. Once the mlockall is called requesting future pages locked, then that should prevent reducing the heap because reducing sbrk would fail.

How? munmap and brk reduction don't care about regions being locked. "The memory lock on an address range is automatically removed if
the address range is unmapped via munmap(2)."

However, that failure may cause malloc and friends to fail, depending the stupidity/cleverness of the allocator. Again, corner cases may exist.

For musl, it may not apply for their allocator. That can only be answered by investigating the allocator code. For this case, it may be necessary to add a configure test to check mallopt() and the existense of the mmap and threshold flags. Then it can be coded in a preprocessor conditional.

FWIW musl's malloc always uses mmap.

@BsAtHome
Copy link
Contributor

The point of configure_memory() in rtapi/uspace_rtapi_app.cc:687 is to get a block of process memory mapped into RAM and make sure it stays there to prevent page-faults from occurring when the real-time process runs.

The code in question allocates a 32 MByte (PRE_ALLOC_SIZE) buffer using malloc() and touches every page. Then it actually does a free() on that same buffer. A call to mlockall() precedes to lock the pages, which makes them resident and unswapable.
If the allocator uses mmap and may unmap the block when freed, ignoring the page lock, then the whole exercise of locking pages for real time in the manner implemented is not working as intended.

This is where the two mallopt() calls come into play. They are, apparently, glibc specific on linux, to inform the allocator never to release memory back to the OS while running, preserving the pages to be resident in memory, regardless whether they are malloc'ed or not. The mallopt() calls are already in a #ifdef __linux__ block.

When musl always uses mmap, then there must be a way to guarantee never to release touched pages back to the OS, even if they are not currently used by the allocator. Remember, the code does a malloc()/free() combi on that large block. So, if musl were to munmap the pages after the call to free(), then musl will break real-time for the process and re-introduce page-faults because the memory would no longer be resident at the next call to malloc.

The question is: does musl's allocator break the resident pages requirement used for real-time by possibly releasing memory back to the OS? If so, is there a way to make musl preserve the resident pages and make it refrain from releasing memory back to the OS?

@heatd
Copy link

heatd commented Nov 23, 2024

The point of configure_memory() in rtapi/uspace_rtapi_app.cc:687 is to get a block of process memory mapped into RAM and make sure it stays there to prevent page-faults from occurring when the real-time process runs.

Yeah.... this doesn't actually work :) mlock doesn't stop you from getting page faults, merely stops you from getting expensive page faults (swapping, re-faulting in from a file, etc). Things like page migration can come in and ruin your day.

The code in question allocates a 32 MByte (PRE_ALLOC_SIZE) buffer using malloc() and touches every page. Then it actually does a free() on that same buffer. A call to mlockall() precedes to lock the pages, which makes them resident and unswapable. If the allocator uses mmap and may unmap the block when freed, ignoring the page lock, then the whole exercise of locking pages for real time in the manner implemented is not working as intended.

Yeah... not only will this not work, it might not even work for glibc (in the future, or maybe now, idk). It seems to expect that your PRE_ALLOC_SIZE allocation is reused for all of your heap needs? This isn't actually written down anywhere.

This is where the two mallopt() calls come into play. They are, apparently, glibc specific on linux, to inform the allocator never to release memory back to the OS while running, preserving the pages to be resident in memory, regardless whether they are malloc'ed or not. The mallopt() calls are already in a #ifdef linux block.

When musl always uses mmap, then there must be a way to guarantee never to release touched pages back to the OS, even if they are not currently used by the allocator. Remember, the code does a malloc()/free() combi on that large block. So, if musl were to munmap the pages after the call to free(), then musl will break real-time for the process and re-introduce page-faults because the memory would no longer be resident at the next call to malloc.

The question is: does musl's allocator break the resident pages requirement used for real-time by possibly releasing memory back to the OS? If so, is there a way to make musl preserve the resident pages and make it refrain from releasing memory back to the OS?

Yes, it will release memory back to the OS in certain situations, and no, you can't stop it from releasing memory. But it does sound like what you really want is your own custom allocator if you really need all of those things (and mlock, again, will not stop you from getting a free trip to the kernel via page fault).

@heatd
Copy link

heatd commented Nov 24, 2024

By the way: it's very possible your prefaulting loop has been compiling into nothing for a while (https://godbolt.org/z/379d57x4v). The last gcc godbolt has that doesn't straight up optimize this example out is gcc 4.4.7 :)

@BsAtHome
Copy link
Contributor

By the way: it's very possible your prefaulting loop has been compiling into nothing for a while (https://godbolt.org/z/379d57x4v). The last gcc godbolt has that doesn't straight up optimize this example out is gcc 4.4.7 :)

Clang seems to cut it out...
However, changing the declaration char *buf; into volatile char *buf; will fix it. You'll need to add a (void *) cast in the free call too.

@BsAtHome
Copy link
Contributor

The point of configure_memory() in rtapi/uspace_rtapi_app.cc:687 is to get a block of process memory mapped into RAM and make sure it stays there to prevent page-faults from occurring when the real-time process runs.

Yeah.... this doesn't actually work :) mlock doesn't stop you from getting page faults, merely stops you from getting expensive page faults (swapping, re-faulting in from a file, etc). Things like page migration can come in and ruin your day.

It isn't supposed to prevent all page-faults. It is supposed to prevent all page-faults on the locked memory because the locked pages are kept resident after being touched. That does work, in principle. Posix says that locked pages must be memory resident. The notes in linux' mlock(2)/mlockall(2) state:
Memory locking has two main applications: real-time algorithms and high-security data processing. Real-time applications require deterministic timing, and, like scheduling, paging is one major cause of unexpected program execution delays.
So, if memory locking does not work as intended, then we would have a general problem...

The code in question allocates a 32 MByte (PRE_ALLOC_SIZE) buffer using malloc() and touches every page.
...

Yeah... not only will this not work, it might not even work for glibc (in the future, or maybe now, idk). It seems to expect that your PRE_ALLOC_SIZE allocation is reused for all of your heap needs? This isn't actually written down anywhere.

Assuming that free()'d memory is reused by the allocator algorithm may not hold now or in the future. That is very true. However, I think it still holds for glibc. Another problem is that it is missing a call mallopt(M_MMAP_THRESHOLD, ...value larger than 32 MB...) to force it using sbrk instead of mmap.
There is also a note in mlockall(2) stating that real-time processes should grown the stack sufficiently too, which is also missing.

So, you are absolutely right, the current implementation is not adequate.

[snip]

The question is: does musl's allocator break the resident pages requirement used for real-time by possibly releasing memory back to the OS? If so, is there a way to make musl preserve the resident pages and make it refrain from releasing memory back to the OS?

Yes, it will release memory back to the OS in certain situations, and no, you can't stop it from releasing memory. But it does sound like what you really want is your own custom allocator if you really need all of those things

Indeed, a custom allocator would be required to do it right.

Anyhow
The conclusion would thus be: musl will probably behave differently than glibc when dynamic memory is in play. Maybe it will be a show-stopper or a jitter-generator, I cannot tell. But a generic solution, so that guarantees exist for musl (and, fwiw, glibc), will require a local allocator with guarantees about memory residency.
This is not going to happen unless someone implements it. So, as an easy fix, musl can be used if the configure_memory() function is emptied in a cpp conditional when musl is used as c-library. Selecting musl should probably be done using a configure option.

@BsAtHome
Copy link
Contributor

Yeah... not only will this not work, it might not even work for glibc (in the future, or maybe now, idk). It seems to expect that your PRE_ALLOC_SIZE allocation is reused for all of your heap needs? This isn't actually written down anywhere.

Just tested on:

  • Raspbian bookworm, g++ 12.2.0, clang++ 14.0.6 and glibc 2.36
  • Fedora 39 using g++ 13.3.1, clang++ 17.0.6 and glibc 2.38

Each compiler/glibc combi with (-Wall -O3) would allocate memory by setting brk, never reduced brk after free and malloc'ing a small piece of memory after the freeing the large block would occupy the same (locked) memory. And I never added the M_MMAP_THRESHOLD option as indicated earlier. It appears not to be required.

I guess it works as intended with glibc.

@NTULINUX
Copy link
Contributor Author

I just want to say this is all over my head :)

@andypugh
Copy link
Collaborator

I just want to say this is all over my head :)

Mine too, pretty much. I understand the problems being described, but I was unaware of them unil now, and have no idea what fixes might be required.

@BsAtHome
Copy link
Contributor

Shining light on basename error:
https://bugzilla.mozilla.org/attachment.cgi?id=8516179&action=diff
bluez/bluez@e2f1211

The basename() problem is rather peculiar. There are two versions in glibc. The Posix version may alter its argument, whereas the GNU version never alters the argument. Now, the GNU version is used because libgen.h is not included. The easiest fix is to include the code in the file to prevent any future problems.

So, add this function above Interp::control_back_to() (line 540):

// Prevent any possibility of the Posix version of basename(3) being used
// (possibly altering the argument). Therefore, local reimplementation
// of the *GNU version* of basename(3).
static char *thebasename(const char *filename)
{
     char *p = strrchr(filename, '/');
     return p ? p + 1 : (char *)filename;
}

And change all occurrences of basename() to thebasename() (lines 548, 549, 612 and 613).

The code originates from glibc (https://sourceware.org/git/?p=glibc.git;a=blob;f=string/basename.c). This small piece of code is rather trivial. Regardless, the licenses are compatible (LGPL 2.1+ in glibc and GPL 2 for the interpreter).

@rmu75
Copy link
Contributor

rmu75 commented Nov 26, 2024

is basename only used in the c++ part? if so would it make sense to use this opportunity and replace this with std::filesystem::path::filename ? AFAIK with currently supported OSes (at least for 2.10, but I think also for 2.9) we can rely on C++17 being available.

@NTULINUX
Copy link
Contributor Author

I'll let BsAtHome be the answer for that. After basename is sorted, we're just left with 'pthread_attr_setaffinity_np' vs. 'pthread_setaffinity_np'.

@BsAtHome
Copy link
Contributor

is basename only used in the c++ part? if so would it make sense to use this opportunity and replace this with std::filesystem::path::filename ?

Yes, it seems only be used in this specific place.

Yes, you can argue to use the c++ equivalent function. It is slightly more expensive to use, which would suggest to cache the filename part. It is also a filesystem-generic solution to the problem of finding the basename. Does the c++ version give you the same results as the current version? Not sure if that should matter, but the question is whether it would be prudent to change.

I'm not sure we need the extra complexity for a generic solution. Or are the O-words' filenames possibly very complex? If so, then we'd seen problems already, which we haven't. So my guess is that a "simple" fix should suffice.

Attached is a suggested diff to fix the basename problem with cached result.
basename.diff.txt

@rmu75
Copy link
Contributor

rmu75 commented Nov 26, 2024

To me it's more a matter of principle to incrementally modernize the codebase, e.g. replace #defines with functions or enum class, avoid C-style string handling etc... I'm pretty sure nothing using basename is in any way performance critical.

@BsAtHome
Copy link
Contributor

To me it's more a matter of principle to incrementally modernize the codebase, e.g. replace #defines with functions or enum class, avoid C-style string handling etc... I'm pretty sure nothing using basename is in any way performance critical.

It is not about being performance critical. Calling the same function giving the same result is just Bad CodeTM in my view...

And, yes, it may be good to change the code to become real c++ instead of c-with-classes. But for the glibc->musl thing it may be too much asked in this instance. Changing c-strings into real c++ strings, while a Good ThingTM to do, is a very large task and making just this one basename change is not going to make the code more c++-like.

It would be an admirable goal to make the code to conform to (modern) c++ standards. But then you should start a separate project to do so because it will be a very long (and probably bumpy) road.

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 26, 2024

NTULINUX@aa03e92

src/emc/rs274ngc/interp_o_word.cc built!

Only two left:

rtapi/uspace_rtapi_app.cc:1053:21: error: use of undeclared identifier 'pthread_attr_setaffinity_np'; did you mean 'pthread_setaffinity_np'?
 1053 |           if((ret = pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset)) != 0)
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     pthread_setaffinity_np
/usr/include/pthread.h:221:5: note: 'pthread_setaffinity_np' declared here
  221 | int pthread_setaffinity_np(pthread_t, size_t, const struct cpu_set_t *);
      |     ^
rtapi/uspace_rtapi_app.cc:1053:49: error: cannot initialize a parameter of type 'pthread_t' (aka 'unsigned long') with an rvalue of type 'pthread_attr_t *'
 1053 |           if((ret = pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset)) != 0)
      |                                                 ^~~~~
/usr/include/pthread.h:221:37: note: passing argument to parameter here
  221 | int pthread_setaffinity_np(pthread_t, size_t, const struct cpu_set_t *);
      |                                     ^

@BsAtHome
Copy link
Contributor

Only two left:

Actually, it is just one problem giving two errors.

pthread_attr_setaffinity_np() does not exist in musl. The function is appended "_np", meaning non-portable.
It is used to setup scheduling affinity of the thread as an attribute to the call to pthread_create. The specific attribute, scheduling affinity, is not supported by musl.

Musl does have a pthread_setaffinity_np() implementation, which could be used after the call to create the thread. However, my pthread portability wisdom is not up-to-date and can not say for sure what to do here. But a guess would be to, preserving the current status quo for glibc:

  1. put the if(nprocs > 1) {...} in a preprocessor conditional that excludes it from musl. Is there a __MUSL__ define or something like that? And, I do suggest not to use __GLIBC__ as conditional because this is a musl specific problem.
  2. add the same if(nprocs > 1) {...} below the pthread_create() call, also in a preprocessor conditional, and set the newly created thread's affinity using pthread_setaffinity_np().

If the status quo for glibc may change, then drop using the attribute for scheduling affinity and always use pthread_setaffinity_np(). Then the same code should work for glibc and musl.

Please, someone with current pthread knowledge should confirm this or point out the flaws in my reasoning :-)

As a test-case, see code below. Note that it cannot fail in the same way because the thread was already created. The choice is to kill the thread and return with an error or live with failing to set scheduling affinity. The glibc implementation actually starts the new thread in a "stopped" mode and kills it if setting the affinity fails. If the glibc behavior is required, then an additional trampoline while creating the thread is needed to emulate glibc behavior.

... remove previous if(nprocs > 1) {}...

... pthread_create()...

  if(nprocs > 1) {
      const static int rt_cpu_number = find_rt_cpu_number();
      if(rt_cpu_number != -1) {
#ifdef __FreeBSD__
          cpuset_t cpuset;
#else
          cpu_set_t cpuset;
#endif
          CPU_ZERO(&cpuset);
          CPU_SET(rt_cpu_number, &cpuset);
          if(0 != pthread_setaffinity_np(task->thr, sizeof(cpuset), &cpuset))
               rtapi_print("ERROR: failed to set thread scheduling affinity (%s)", strerror(errno));
      }
  }

@NTULINUX
Copy link
Contributor Author

NTULINUX commented Nov 26, 2024

1. put the if(nprocs > 1) {...} in a preprocessor conditional that excludes it from musl. Is there a __MUSL__ define or something like that?

They do not, but their reasoning IMO makes a lot of sense:

https://www.openwall.com/lists/musl/2013/03/29/13

Instead of using __MUSL__ they want you to actually test functionality, not assume based on C library.

Also, not sure if you saw my post earlier:

#3185 (comment)

Tried your change and it compiled!

NTULINUX@5c297dc

@BsAtHome
Copy link
Contributor

Instead of using __MUSL__ they want you to actually test functionality, not assume based on C library.

That would indeed be the way. But I will not be writing autoconf tests. Someone else will need to fight configure.ac.

@NTULINUX
Copy link
Contributor Author

You've done more than enough and I greatly appreciate all your help, you've been fantastic! LinuxCNC at least now compiles against musl for the very first time! The LinuxCNC devs can now review the changes and hopefully get some of these merged, or fix up whatever isn't exactly right. Should I open a pull request?

@BsAtHome
Copy link
Contributor

Should I open a pull request?

You could and it can be discussed in the pr. I don't think it will be accepted as a whole, but it gives a good indication of what needs to change. You can use the feedback to distill viable parts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants