-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLEXT full ARM relocation support (adding executable) #70171
Conversation
Hello @selescop, and thank you very much for your first pull request to the Zephyr project! |
6baff14
to
63348e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off this is a lot of great work, so thank you. There's many new things in this one PR and it will be rather time consuming to review all at once.
Some broad suggestions on what some easier to review PRs might be...
- arch relocate API and relocations added for arm (great!) with some test extensions added to the existing test case
- filesystem loader + test/sample updates (great!)
- more explicit control over the location of the heap with the linker script/mpu region macros (not clear at a quick glance why this is needed, some further background information could be helpful)
- cmake changes to the existing llext module as needed in each rather than an all new llext.cmake module I think might be a better route
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many lovely things in this PR, however, some notes:
- This PR is quite large, and contains features that are not related to each other, like the added arm relocations and the fs_loader. This should be split into separate PRs
- Each added relocation (except relocs that are treated equally) should get its own commit, and tests should be added to the test suite to ensure the added relocation is actually produced by one of the extensions so it can be tested.
- Some changes are not needed, like changing function names from test_entry to start, the PR is easier to review if changes like these are not introduced.
Splitting this PR into multiple more focused PRs so we can make incremental changes is preferred over large (somewhat cluttered) PRs, which tend to end up hanging for a long time in review limbo :)
echo "===> Creating container image" | ||
fallocate -l ${DISK_SIZE_KB}k "$OUTPUT" | ||
echo "===> Creating FAT partition image" | ||
mkfs.fat -F12 -S"$LOGICAL_SECTOR_SIZE" "$OUTPUT" >/dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a user on windows use any of this? Or mac?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change 'fallocate' call for 'dd' call.
It works on MSYS2/MSYS and MSYS2/MinGW. Is this what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it needs instructions for windows/mac users on how to use this (e.g. readme.rst)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a README.rst for the fs_loader
I'll make this script more user-friendly (+usage & co)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general shell advice:
- Use shellcheck. shellcheck saves lives.
- Use
set -e
([sof-test] add set -e to all test cases (Unless You Love Debugging) thesofproject/sof-test#312) - All code in a function(); use a
main()
function (clean-up: move all shell script code to a function and use a "main" thesofproject/sof-test#740)
I don't think the Zephyr project has any shell guidelines (for the simple reason that it tries to support vanilla Windows?) but these are small changes that give huge rewards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @marc-hb,
I've written the script to be more friendly (usage & co). It can now runs on GNU OS.
For the 'set-e', I'm not really into that. There are many commands that does not return 0 on success. Moreover, a command that fails in a script does not necessarily means that the script should fail as well. I prefer use 'if'/'else' instead and try to give more information to the user whenever possible (like, why the command failed, maybe why the script fails and maybe give help. Silent failures are not great user experience in my opinion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the 'set-e', I'm not really into that. There are many commands that does not return 0 on success.
There are a few but it's very rare actually. Our entire test suite uses set -e
with great success.
Moreover, a command that fails in a script does not necessarily means that the script should fail as well.
Yes of course. The fix is very simple: unusual_command || true
I prefer use 'if'/'else' instead and try to give more information to the user whenever possible (like, why the command failed, maybe why the script fails and maybe give help.
Yes of course. It's not "either ... or... ". set -e
does absolutely not stop you from doing that. You can and should still do this even with set -e
:
cmd1 "$arg1" || die "cmd1 %s failed\n" "$arg1"
set -e
is for the other commands: the ones you did NOT expect to fail. It brings shell script closer to all other languages (except C)
Silent failures are not great user experience in my opinion)
Agreed 100%. The only thing worse is a silent failure AND a script that keeps running anyway - because no set -e
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a user on windows use any of this? Or mac?
The portability of Zephyr's production code and build system is amazing. On the other hand, test code is a much bigger challenge. Last time I checked, twister was quite limited on Windows.
This small shell script is in the samples/
directory.
63348e9
to
60572d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the relocations are copied pretty directly from linux, which is GPL V2 https://github.com/torvalds/linux/blob/480e035fc4c714fb5536e64ab9db04fedc89e910/arch/arm/kernel/module.c#L326-L341
I am no licensing expert but I don't think we can copy and modify GPL V2 code without going through quite a few hoops, neither do I think we should do so.
It is quite evident from the inheritance of macros like ___opcode_identity32
and a host of undefined const values like (upper & 0x03ff) << 12
that this is not written in the Zephyr code style. static inline functions are preferred over macros if possible, const values are defined and used by name #define SOME_MASK 0xFF3
, and macros are UPPER_CASE (unless they are shadowing actual functions like assert())
My current stance on the added relocations specifically is that we should write them in a Zephyr style, and not copy directly from Linux, to get a simpler and cleaner file with no licensing clash.
agree and good catch @bjarki-trackunit. |
81e9134
to
afba247
Compare
fi | ||
|
||
echo -ne "Creating empty '`basename $OUTPUT`' image..." | ||
dd if=/dev/zero of="$OUTPUT" bs=1k count=${DISK_SIZE_KB} status=none 2>/dev/null || die "failed." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dd if=/dev/zero of="$OUTPUT" bs=1k count=${DISK_SIZE_KB} status=none 2>/dev/null || die "failed." | |
dd if=/dev/zero of="$OUTPUT" bs=1k count=${DISK_SIZE_KB} status=none || die "dd to $OUTPUT failed." |
You are already using status=none
so don't discard the error message. Unless some tool's design is extremely bad and absolutely requires it, never discard stderr. It's really bad practice and can cost hours and hours of debugging.
You wrote that you don't like silent failures but this line currently replaces a relevant error message with a meaningless "failed" which is no better than silence. Same below.
echo -ne "done\nCreating FAT partition image..." | ||
mkfs.fat -F12 -S"$LOGICAL_SECTOR_SIZE" "$OUTPUT" >/dev/null || die "failed." | ||
echo -ne "done\nCopying input files..." | ||
mcopy -i "$OUTPUT" $INPUTS "::/" || die "failed." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mcopy -i "$OUTPUT" $INPUTS "::/" || die "failed." | |
mcopy -i "$OUTPUT" $INPUTS "::/" || die "mcopy "$OUTPUT" $INPUTS failed." |
echo -ne "Creating empty '`basename $OUTPUT`' image..." | ||
dd if=/dev/zero of="$OUTPUT" bs=1k count=${DISK_SIZE_KB} status=none 2>/dev/null || die "failed." | ||
echo -ne "done\nCreating FAT partition image..." | ||
mkfs.fat -F12 -S"$LOGICAL_SECTOR_SIZE" "$OUTPUT" >/dev/null || die "failed." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
# - dosfstools | ||
# - mtools | ||
|
||
PROGNAME=`basename $0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
backquotes are deprecated. Use shellcheck.
mkfs.fat -F12 -S"$LOGICAL_SECTOR_SIZE" "$OUTPUT" >/dev/null || die "failed." | ||
echo -ne "done\nCopying input files..." | ||
mcopy -i "$OUTPUT" $INPUTS "::/" || die "failed." | ||
echo "done" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
echo "done" | |
printf "$0 done\n" |
What is "done"? This line will be buried in thousands of other lines in the build logs.
2d36bf5
to
43ebe02
Compare
@bjarki-trackunit @nordicjm |
I noticed this only now and it is quite puzzling. How is this not duplicating some of the work in #67997 and friends? Sorry no time and not enough knowledge to review it right now. |
Well, not duplicating but it is the same kind of idea. We need a way to build an extension and we both built a cmake target for it (factorization). The other PR#67997 is based on shared library CMake mechanism as here we build a partially linked ELF: same need/idea, different way to implement it. |
Adds support for all relocation type produced by GCC on ARM platform using partial linking (-r flag) or shared link (-fpic and -shared flag). Adds a section MPU according to allow using llext with MPU enabled. Signed-off-by: Cedric Lescop <[email protected]>
Adds a file loader in llext. With this loader, llext can load an extension using a file descriptor. Add shell command to use this loader. Adds a shell command to print llext heap usage. Signed-off-by: Cedric Lescop <[email protected]>
This patch adds a cmake macro named zephyr_llext to define an extension and build it. Source files are defined using zephyr_llext_sources cmake function. Include directories are defined using zephyr_llext_include_directories cmake function. Compilation and link flags are computed from zephyr_interfaces. Signed-off-by: Cedric Lescop <[email protected]>
Adds a sample how to use llext file loader. 2 extensions are defined and added in a fatfs image that can be flash on target. Zephyr application loads and calls start symbol on both extensions. Extension start symbol creates a thread printing thread id each seconds. Signed-off-by: Cedric Lescop <[email protected]>
Updates llext tests to use new cmake extension declaration. MPU is no longer disabled. Signed-off-by: Cedric Lescop <[email protected]>
@selescop, you did an amazing amount of work so first and foremost thank you! 🚀 You could keep this big PR as a 'draft' (so you can run CI on it), and take out one bit at a time for review/inclusion. Once a bit is merged you can rebase and proceed with the next step. |
43ebe02
to
0513e09
Compare
Yep, that's my tomorrow's plan. Now it's family time 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look at the build system related code.
Some early comments before going more in depths in some areas, but initial feeling is that the direction looks promising 👍
# | ||
|
||
|
||
macro(zephyr_llext name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this a macro and not a function ?
All variables defined inside a macro will keep living when macro returns, which can give some nasty surprises, so extra care must be taken when deciding to define a macro.
So far I've not seen a reason why this must be implemented as a macro, so would like to understand the reasons.
list(LENGTH ExtraMacroArgs NumExtraMacroArgs) | ||
# Execute the following block only if the length is > 0 | ||
if(NumExtraMacroArgs GREATER 0) | ||
foreach(ExtraArg ${ExtraMacroArgs}) | ||
if(${ExtraArg} STREQUAL PIC) | ||
set(LLEXT_IS_PIC yes) | ||
endif() | ||
endforeach() | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit unusual.
Why not a proper function and then use cmake_parse_arguments()
?
) | ||
|
||
if(LLEXT_IS_PIC) | ||
target_compile_options(${ZEPHYR_CURRENT_LLEXT} PRIVATE -fpic -fpie) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this code in general is GNU centric, but we actually have compiler flags defined for such cases to avoid custom toolchain flags / handling throughout the codebase.
This makes it easier to add better support for more toolchains in Zephyr.
For example:
zephyr/cmake/compiler/gcc/compiler_flags.cmake
Lines 232 to 234 in 9c05618
set_compiler_property(PROPERTY no_position_independent | |
-fno-pic | |
-fno-pie |
target_compile_options(${ZEPHYR_CURRENT_LLEXT} PRIVATE -fpic -fpie) | ||
else() | ||
if("${ARCH}" STREQUAL "arm") | ||
target_compile_options(${ZEPHYR_CURRENT_LLEXT} PRIVATE -mlong-calls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment regarding compiler flag handling.
printf "Creating empty '%s' image..." "$(basename "$OUTPUT")" | ||
dd if=/dev/zero of="$OUTPUT" bs=1k count=${DISK_SIZE_KB} status=none || die "dd to $OUTPUT" | ||
printf "done\nCreating FAT partition image..." | ||
mkfs.fat -F12 -S"$LOGICAL_SECTOR_SIZE" "$OUTPUT" >/dev/null || die "mkfs.vfat failed" | ||
printf "done\nCopying input files..." | ||
mcopy -i "$OUTPUT" "$INPUTS" "::/" || die "mcopy $OUTPUT $INPUTS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaict this can also be done in Python. Perhaps take a look at pyfatfs
.
Could we have a Python implementation instead, and thereby be one step closer to supporting llext in windows and MacOS ?
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time. |
For the record, a large part of this was merged in #70452 Not sure about the rest. |
Well, just give me sometime and I'll come back to do another small PR on the rest. (we're launching a new product) |
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time. |
Add a full support for ARM relocations.
This allows to load a complete application (ARM ELF) from filesystem into RAM and execute it.
All symbols exported via EXPORT_SYMBOL() macro can be called, ex: thread creation
The "simple" sample has been updated.
Tests are provided making sure to not break the xtensa's one
We'd like to open a discussion about (cyber)security, especially on the USERSPACE mode -> new MPU section to be revisited