Docker: use R installation manager #312

Merged
merged 3 commits into from
Nov 7, 2023

jeroen
Contributor

@jeroen jeroen commented Nov 4, 2023

Switch the host R installation to use rig from @gaborcsardi.

This makes it much easier to pin the host installation to match the webr version (4.3.0 currently). Also it automatically sets up pak and PPM binaries, so I think this will much simplify management.
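
For readers unfamiliar with rig, the pinning step described above might look roughly like the following. This is a minimal sketch, not the Dockerfile from this PR: the download URL follows rig's README as best recalled and should be verified, and `rig_setup_commands` is a hypothetical helper that only prints the commands so the snippet is safe to source.

```shell
# Sketch only: pin the host R to the version webR is built from.
# R_VERSION and the /usr/local prefix are assumptions for illustration.
R_VERSION="4.3.0"

rig_setup_commands() {
  # Print the commands instead of running them (dry run).
  cat <<EOF
curl -Ls https://github.com/r-lib/rig/releases/download/latest/rig-linux-latest.tar.gz | tar xz -C /usr/local
rig add ${R_VERSION}
rig default ${R_VERSION}
EOF
}

rig_setup_commands
```

Piping the output to `sh` as root would perform the install; in a Dockerfile the same three commands would typically sit in a single `RUN` step.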

Because HOME is mounted to something ephemeral on GitHub Actions.
@georgestagg
Member

Nice! This looks like a great idea to me, thanks so much.

I'm just doing some local testing in Docker before I approve and merge.

@jeroen
Contributor Author

jeroen commented Nov 6, 2023

OK cool! If rig does a little bit too much for your taste you can also grab the installer directly from:

https://cdn.posit.co/r/ubuntu-2204/pkgs/r-4.3.0_1_amd64.deb

But then you still have to manually set the p3m repositories.
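
As an illustration of what "manually set the p3m repositories" could involve, here is a minimal sketch. It assumes Ubuntu 22.04 ("jammy") and the usual Posit Package Manager URL layout; `p3m_repo_line` is a hypothetical helper, and the `Rprofile.site` path in the comment assumes the `/opt/R/<version>` layout used by the Posit builds.

```shell
# Sketch only: generate the R statement that points CRAN at p3m binaries.
p3m_repo_line() {
  # usage: p3m_repo_line UBUNTU_CODENAME
  printf 'options(repos = c(CRAN = "https://p3m.dev/cran/__linux__/%s/latest"))\n' "$1"
}

p3m_repo_line jammy
# To apply (assumed path):
#   p3m_repo_line jammy >> /opt/R/4.3.0/lib/R/etc/Rprofile.site
```

The package manager may also need R's `HTTPUserAgent` option set before it serves Linux binaries rather than source packages; the Posit Package Manager documentation has the exact setting.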

@georgestagg
Member

georgestagg commented Nov 6, 2023

> If rig does a little bit too much for your taste you can also grab the installer directly from [...]

Hmmm. I'll take a look at how much rig adds to the image size vs. pulling the .deb directly and make a decision once I know for sure.

Edit: Image size before this PR: 2.94GB. After: 4.69GB. However, using the .deb directly doesn't seem to reduce the image size much, so I don't think it's rig adding the extra size.

@gaborcsardi

> Edit: Image size before this PR: 2.94GB. After: 4.69GB. However, using the .deb directly doesn't seem to reduce the image size much, so I don't think it's rig adding the extra size.

The rstudio/r-builds R builds have a lot of dependencies that are not strictly required, e.g. compilers. The rig binary itself is less than 10MB AFAIR, and has no dependencies.

You probably also want to delete the downloaded deb packages after rig add.

@georgestagg
Member

georgestagg commented Nov 6, 2023

> The rig binary itself is less than 10MB AFAIR, and has no dependencies.

Brill, thanks for confirming. 10MB is a bargain for the management tools rig itself provides.

So, the cost is in using the rstudio R builds. I guess it's not so bad, and probably worth the extra GB (if you're already downloading 3GB in any case) to ensure we have a confirmed working native R of the right version.

@jeroen
Contributor Author

jeroen commented Nov 6, 2023

Hmm are you sure those numbers are correct? I would be surprised if the posit build of R has many more dependencies than the default ubuntu builds.

I have the version from the above PR in my own ghcr.io. The uncompressed sizes* are:

# docker image ls ghcr.io/r-wasm/webr:main
REPOSITORY            TAG       IMAGE ID       CREATED      SIZE
ghcr.io/r-wasm/webr   main      3a13fe3601ba   5 days ago   2.94GB

# docker image ls ghcr.io/jeroen/webr:main
REPOSITORY            TAG       IMAGE ID       CREATED       SIZE
ghcr.io/jeroen/webr   main      8456fd927511   3 hours ago   3.17GB

And compressed:

# Sum the compressed layer sizes per platform from the registry manifest
dockersize() {
  docker manifest inspect -v "$1" |
    jq -c 'if type == "array" then .[] else . end' |
    jq -r '[ ( .Descriptor.platform | [ .os, .architecture, .variant, ."os.version" ] | del(..|nulls) | join("/") ), ( [ .SchemaV2Manifest.layers[].size ] | add ) ] | join(" ")' |
    numfmt --to iec --format '%.2f' --field 2 |
    column -t
}

## jeroen@dev:~$ dockersize ghcr.io/r-wasm/webr:main
linux/amd64  1.07G

## jeroen@dev:~$ dockersize ghcr.io/jeroen/webr:main
linux/amd64  1.18G

*scripts from here
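
To put the figures quoted in this thread side by side, the increases can be worked out with a few lines of shell. This is only arithmetic on the numbers reported above; `size_delta` is a hypothetical helper name.

```shell
# Sketch only: absolute and relative growth between two reported image sizes.
size_delta() {
  # usage: size_delta BEFORE AFTER  -> prints "DELTA PERCENT%"
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { d = after - before; printf "%.2f %.1f%%\n", d, 100 * d / before }'
}

size_delta 2.94 3.17   # uncompressed, registry builds  -> 0.23 7.8%
size_delta 1.07 1.18   # compressed, registry builds    -> 0.11 10.3%
size_delta 2.94 4.69   # sizes reported earlier (Nov 6)  -> 1.75 59.5%
```

On these numbers the registry images grow by roughly a quarter of a gigabyte uncompressed, far less than the 1.75GB increase reported earlier in the thread.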

Member

@georgestagg georgestagg left a comment

Interesting! Here is how things look on my end:

[Image: Screenshot 2023-11-06 at 21 37 03]

The untagged images are the builds with and without rig. Perhaps the difference is because I built those locally on arm macOS, rather than through GHA/Linux? I don't really see why that should be the case, though.

In any case, I'm happy to continue with using rig and RStudio's R builds. Other than a few minor suggestions, let's merge and see where things lie after CI catches up.

@jeroen
Contributor Author

jeroen commented Nov 7, 2023

Thanks. I'm reverting the last commit that added apt-cache clean because it did not reduce the image size at all.

@georgestagg georgestagg merged commit 6fd7fac into r-wasm:main Nov 7, 2023
@jeroen
Contributor Author

jeroen commented Nov 8, 2023

Awesome. Going to trigger 40k rebuilds with this R-4.3.0 so that we don't get the warning "this package was built with R-4.3.2" when loading the packages in webr.
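
The mismatch behind that warning can be illustrated with a small version comparison. This is a sketch under the assumption that the warning fires when the R used to build a package is newer than the R loading it; `version_gt` and `check_built_version` are hypothetical helpers, the message text is illustrative rather than R's exact wording, and `sort -V` is assumed to be available (GNU coreutils and recent BSDs).

```shell
# Sketch only: flag packages built with a newer R than the runtime.
version_gt() {
  # True when $1 is strictly newer than $2 (version-aware sort).
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

check_built_version() {
  # usage: check_built_version BUILT_WITH RUNTIME
  if version_gt "$1" "$2"; then
    echo "warning: package was built with R-$1 but is loaded in R-$2"
  else
    echo "ok"
  fi
}

check_built_version 4.3.2 4.3.0
check_built_version 4.3.0 4.3.0
```

Building every package with the same R version the runtime uses keeps the check on the "ok" branch, which is the point of the rebuilds.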

@georgestagg
Member

Wow, that's a lot of packages! Good luck!

@gaborcsardi

Just out of curiosity, why are you not using R 4.3.2?

@jeroen
Contributor Author

jeroen commented Nov 8, 2023 via email

@georgestagg
Member

The current release of webR uses R 4.3.0 as the base source. I think using R 4.3.2 to build Wasm packages does work, but it shows a warning message when loading the resulting package in webR.


As for the reason webR still uses R 4.3.0, it's because webR patches the source code to reimplement some of R's features using web browser APIs, and also makes some other changes for compatibility with the WebAssembly and Emscripten environment. From experience, when I update the version of the R source used, the patches don't apply cleanly and need to be reworked. This has been a non-trivial amount of work in the past, so I don't focus too much on following every R patch release.

Perhaps in the future, some of the patches can be upstreamed to R core to make the Wasm build process simpler.
