-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Makefile.base: use thin static archives. #10195
Conversation
pls try on OS X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely an interesting feature. Currently, the builds use up quite a lot of disk space when testing different applications and boards.
Makefile.base
Outdated
$(BINDIR)/$(MODULE).a: $(OBJ) | $(DIRS:%=ALL--%) | ||
@# Recreate archive to cleanup deleted/non selected source files objects | ||
$(Q)$(RM) $@ | ||
$(Q)$(AR) $(ARFLAGS) $@ $^ | ||
$(Q)$(AR) $(ARFLAGS) $(foreach f,$(abspath $@ $^),$(call relpath,$f)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of forking a new subshell we could patsubst away the BINDIR prefix from the objects, since the output ($@
) is always located in the top of the BINDIR tree where all the objects are
$(Q)$(AR) $(ARFLAGS) $(foreach f,$(abspath $@ $^),$(call relpath,$f)) | |
$(Q)$(AR) $(ARFLAGS) $@ $(patsubst $(dir $@)%,%,$^) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that make is running from a different directory than BINDIR, so those paths will be wrong. Also "$@" also needs to be relative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried stripping RIOTBASE, but that will break external apps and external modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does $@
have to be relative?
I thought the relative paths were supposed to be relative to the archive file, I guess it needs a different approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around a bit with the proposed makefile change.
Using this line I get fully working thin archives both in Docker and outside. ar -t bin/frdm-kw41z/core.a
works on the host after building the archive inside Docker with make BUILD_IN_DOCKER=1
, and the file names are shown as relative to the current directory instead of absolute.
$(Q)$(AR) $(ARFLAGS) $(foreach f,$(abspath $@ $^),$(call relpath,$f)) | |
$(Q)cd $(dir $@) && $(AR) $(ARFLAGS) $@ $(patsubst $(dir $@)%,%,$^) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did some experiments with @cladmi on ar
with relative and absolute paths. In all ar
invocations, the paths given to the tool must make sense when interpreted from the current working directory.
ar rcTs ABSOLUTE.a ABSOLUTE.o
always works, but the path of the o file is stored as absolute in the a file.ar rcTs RELATIVE.a RELATIVE.o
always works, andar
stores the path of the o file relative to the a file (it converts it internally).ar rcTs ABSOLUTE.a RELATIVE.o
causes ar to store the path to the .o file unchanged, as it was given in the command line. This gives an invalid .a, unless the command is ran from the same directory where the a file is located.
Can a OSX user do this for me, please? EDIT: AFAICT OSX does not have realpath (you call that an OS?) |
For the cool kids using Mac, how about they Think Different™ # remaining words in a list
_rest = $(wordlist 2,$(words $(1)),$(1))
# return an empty string if the arguments are different
_streq = $(filter $(1),$(2))
_putdots = $(patsubst %,..,$(1))
_relpath = $(if $(call _streq,$(firstword $(1)),$(firstword $(2))),\
$(call _relpath,$(call _rest,$(1)),$(call _rest,$(2))),\
$(call _putdots,$(1)) $(2))
_joinpath = $(if $(1),$(addsuffix $(call _joinpath,$(call _rest,$(1))),$(firstword $(1))/))
_splitpath = $(subst /, ,$(1))
relpath = $(call _joinpath,$(call _relpath,$(call _splitpath,$(1)),$(call _splitpath,$(2))))
all:
echo $(call relpath,/1/2/3/5/6/7,/1/2/3/4/5/6/7)
echo $(call relpath,/1/2/3/5/6/7,/)
echo $(call relpath,/,/1/2/3/5/6/)
# these fail but it should not matter because $1 should always be a dir
# and $2 a file
echo $(call relpath,/,/)
echo $(call relpath,/1/2/3,/1/2/3) Now we don't need |
d87ec18
to
6a4246d
Compare
output on macOS:
|
I plan to review this after the release as it should not go in after soft feature freeze. |
Can we Murdock this thing, see if it works for all tests/examples/boards? |
The initial space savings measurements were way off. I updated the description with improved statistics. The savings are now close to the expected 50%. |
I think you can apply this diff to make it run in murdock for all build cases: diff --git a/.murdock b/.murdock
index 5d03a2a09..37daef4ec 100755
--- a/.murdock
+++ b/.murdock
@@ -155,7 +155,7 @@ compile() {
# compile
CCACHE_BASEDIR="$(pwd)" BOARD=$board TOOLCHAIN=$toolchain RIOT_CI_BUILD=1 \
- make -C${appdir} clean all -j${JOBS:-4}
+ make -C${appdir} clean all archive-check -j${JOBS:-4}
RES=$?
# run tests I could not directly push to your branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just started reviewing without testing yet.
As said, as it saves half of the disk space when building, making the whole build go from 2TB to 1TB, it is definitely a thing to do. Producing relocated objects from an archive could be a sub-target used only when necessary as it may not give any other benefit by default than --whole-archive
.
Question, maybe also for my future me testing, does the generated archive contains timestamp of the objects ? or is there a way to generate it with a creation time of the newest object.
Because as soon as ELFFILE
uses the archives timestamps to know if it changed, it would treat it as a normal archive. And rebuilding for nothing would not be the best.
It is for future proof evolution even if unused now as everything still depends on FORCE
.
For the tests I would prefer creating a tests/build_system_utils
test that does this instead of putting it directly in utils. As the tests are also mixed in between it makes it really hard to understand the _relpath
implementation.
I proposed something in smlng#6 that could go in this direction.
It could then be executed by murdock. And maybe also during release tests on different systems and with docker too.
If for any reason, a make
only version of _relpath
is not wanted, python
could be used as it is already a requirement for RIOT. It could even be a first version with python and an upcoming PR for the make
only version.
Also, somewhere this should be summarized #10195 (comment)
Makefile.base
Outdated
@@ -70,7 +70,7 @@ $(BINDIR)/$(MODULE).a $(OBJ): | $(BINDIR)/$(MODULE)/ | |||
$(BINDIR)/$(MODULE).a: $(OBJ) | $(DIRS:%=ALL--%) | |||
@# Recreate archive to cleanup deleted/non selected source files objects | |||
$(Q)$(RM) $@ | |||
$(Q)$(AR) $(ARFLAGS) $@ $^ | |||
$(Q)$(AR) $(ARFLAGS) $(foreach f,$(abspath $@ $^),$(call relpath,$(abspath .),$f)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relpath
implementation should take care of the foreach
like most makefile functions do.
Also I would prefer the python arguments ordering https://docs.python.org/3.7/library/os.path.html#os.path.relpath
relpath, directories, optional_ref_directory
Where optional_ref_directory is replaced by $(abspath .)
if it is empty. It can be done with $(if $2,$2,$(abspath .))
in the function call.
|
||
# Rule to check if thin archives are correctly produced, that is, with a correct | ||
# relative path. | ||
ifeq ($(BUILD_IN_DOCKER),1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The archive-check
does not need to be done in docker I think.
Using relative path for archives was done to ensure that a file generated in docker are still valid in the host system.
So checking on the host system when build in docker should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I disable docker support for this one it means that either:
- The user must have the toolchain available locally (to use $(AR))
- The native
ar
could be used, but then another variable should be created to hold it's name (someting like $(NATIVE_AR))
I don't think this is a big deal since it is just a diagnostic tool.
makefiles/utils.inc.mk
Outdated
|
||
# Return an empty string if the arguments are different (works only for single | ||
# words | ||
streq = $(filter $(1),$(2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be more findstring
, maybe with a strip
after, to not use patter matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One requirement for this (I should document it) is that two empty strings are considered different.
makefiles/utils.inc.mk
Outdated
|
||
# Produce an error if the argument is empty. The second argument defines the | ||
# message. | ||
assert = $(if $(1),,$(error $(2))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now the similar to ensure_value
from #10342 so definitely putting ithem in common would help :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think assert can be made an alias for "ensure_value"
makefiles/utils.inc.mk
Outdated
relpath = $(call joinpath,$(call _relpath,$(call splitpath,$(1)),$(call splitpath,$(2)))) | ||
|
||
# Use coreutils to calculate a relative path | ||
sys_relpath = $(shell realpath -m --relative-to=$(1) $(2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cannot be tested with mac, so replacing it with a python
version would be better.
makefiles/utils.inc.mk
Outdated
echo $(call relpath,/1/2/3,/1/2/3) | ||
|
||
# Return a path relative to the current directory | ||
relpath_here = $(call relpath,$(abspath .),$(abspath $(1))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said in the first comment, this should be handled by using a default value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Overloading" relpath
to do something special when the first (or second) argument is empty exposes us to bugs: one of the arguments could come from a variable that is not defined (think a typo) and then, instead of getting an- obviously wrong- empty result one gets a path relative to the current directory.
makefiles/utils.inc.mk
Outdated
@echo OK? $(call assert_eq,$(call intercal,-,ab),ab,"This should not fail") | ||
|
||
_putdots = $(patsubst %,..,$(1)) | ||
_relpath = $(if $(call streq,$(firstword $(1)),$(firstword $(2))),\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A documentation explaining this one would be great as it is the main implementation.
makefiles/utils.inc.mk
Outdated
@echo OK? $(call assert_eq,z,x,"This should fail") | ||
|
||
# Get the remaining words in a list | ||
rest = $(wordlist 2,$(words $(1)),$(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some functions like this one would maybe better be private namespaced functions as the name is not really explicit and could collide with another name.
If the ".o" is a prerequisite of the ".a" then the latter will be re-generated whenever the former changes, so this is not an issue (as long as dependencies are correctly declared). |
@jcarrano I realized after that I was mixing file content and file timestamp in my head. So whenever we re-create it it will be ok. |
I rebased, integrated the make stuff with the |
This is failing because of #12155. |
#12155 is merged, try rebasing. |
@benpicco There's no need to rebase, I just toggled the "ready for ci" label. |
6cb0fb4
to
d9132ee
Compare
On @cladmi's suggestion I rebased so that the pr branch can be directly tested. |
Normal, or thick archives contain a copy of the object code. Thin archives, on the contrary, are just an index to the .o files. This patch does two things: 1. Change ARFLAGS to enable the "T" options. 2. Call AR with all relative paths. The second step is necessary because the build system handles all absolute paths. If the index in the thin archive contains absolute paths, archives created in docker are no usable outside, and moving the objects breaks the archive. If all arguments to AR are relative, the resulting archive contains filenames *relative to the .a file* and nothing should break as long as the relative location of the .a and .o remains unchanged. Compilation time is unchanged, but disc usage is reduced by approximately 50%. These are the result of a full RIOT build: | Thin Archive | no | yes | Savings (%) | | -------------- | ------: | ----: | ----------- | | pkg (10e6 KiB) | 1 790 | 905 | 49% | | Non pkg | 71 | 71 | 1% | | Total | 1 812 | 976 | 46 % |
This adds a new target "archive-check". Thin archives should be created with relative paths so that archives created in a build container are useful in the host. This check ensures there are no absolute paths inside thin archives.
Arduino still has issues:
|
mmm, that's weird. I only seems to happen with llvm. |
I found the issue:
See, the archive is being created from (i.e. the CWD is) the
And then (manually) running the linker step works. |
llvm-ar behaves weidly when creating thin archive. This only manifests itself when using arduino sketches as these are built from the "bin" directory. Specifically, given a directory "m" and an object in "m/obj.o " an invocation with CWD==m: ``` llvm-ar rcTs ../m.a obj.o ``` Will create a maformed archive. Binutils does not have any issue with this. The following command, executed with CWD==m/.. works: ``` llvm-ar rcTs m.a m/obj.o ``` The trick used in this commit is to put the source files in a different directory than the object files and compile from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. Most comments were about the make archive-check
diagnostic tool, the core patch seems pretty straightforward.
I tested #11565 with this patch applied just to see if it also works with the generated code there and it does.
@smlng tested this on OS X almost a year ago, I suppose it still works there?
Let's see if CI is doing something weird in the next days 😉
This breaks |
Contribution description
Normal, or thick archives contain a copy of the object code. Thin archives, on the contrary, are just an index to the .o files.
This patch does two things:
The second step is necessary because the build system handles all absolute paths. If the index in the thin archive contains absolute paths, archives created in docker are no usable outside, and moving the objects breaks the archive.
If all arguments to AR are relative, the resulting archive contains filenames relative to the .a file and nothing should break as long as the relative location of the .a and .o remains unchanged.
Compilation time is unchanged, but disc usage is reduced by approximately 2x:
Disk space savings
The following numbers were extracted from these branches:
After compiling everything, there will be an output directory containing the sizes of the build artefacts. I used these command lines:
Sizes are broken down into package and non-package files because package data is mostly unaffected by this change. Units is million KiB (sorry for the mixture of 1000 and 1024 base)
Testing procedure
Thin archives
One of the issues with thin archives is how paths to object files are stored in .a files. If done incorrectly, there will be absolute paths, which means that docker-generated file won't be usable outside the container and, more generally, that it is not possible to move the bin directory.
I added a new target that checks if the archives have absolute paths. Run:
Inside and outside of docker (the command is docker-enabled).
Make functionsI added tests for the new make functions, intests/build_system_utils
. Just runmake
. I refactored the code and improved the output to make it clearer which tests are being run.Other stuff
The commits following the first one may need some tidying up/reworking.I would like to have feedback regardingutils.inc.mk
(is that the correct place to put the code?).I have no idea why disc space savings are so high. I was expecting 50% percent reduction or so, not 90%.Dependencies
Depends on #11218Depends on #12155.