build: provide pre-built nodejs/node images or cache #39672

Open
bnb opened this issue Aug 5, 2021 · 45 comments
Labels
build Issues and PRs related to build files or the CI.

Comments

@bnb
Contributor

bnb commented Aug 5, 2021

Is your feature request related to a problem? Please describe.
Presently, making a small contribution to Node.js core is particularly challenging. There are a number of contributing factors, but a primary one is simply having to build Node.js from scratch. For someone unfamiliar with or new to Node.js, this can be daunting compared to basically anything else in the JavaScript ecosystem.

Describe the solution you'd like
It would be a nice contributor experience enhancement to provide a pre-built image (containerd or Docker?) or some kind of cache that would help alleviate the pain of needing to build completely from scratch every single time.

One theoretical manifestation of how this could work:

  • Nightly, build a Docker container.
  • This Docker container includes all necessary tools / nice-to-haves for building Node.js.
  • This Docker container builds Node.js.

Containers provide some nice benefits:

  • We'd potentially be able to enhance many contributors' workflows.
  • We'd enable a better experience through cloud-hosted developer environments like Codespaces that have the option to build from a container.
  • We could include node-core-utils and potentially enhance it further with something like a contributing checklist to ease first-time-contributor or early-contributor burden, effectively automating what's codified in the PR guidance.

Or we could do something along the lines of a cloud cache (in the vein of goma). While these require immense work, they are extremely useful and don't require you to run Docker. If there's more interest in this, I'm happy to see what resources I can pull from Microsoft (including potentially open sourcing some things) to enable it.

Either way, this is a build feature that seems to go in a positive direction that helps both new and existing contributors. Would love thoughts, if there are any.

Describe alternatives you've considered

Doing nothing: I mean this works but also retains an unnecessary bad experience.

@bnb
Contributor Author

bnb commented Aug 5, 2021

Also, FWIW, I am happy to help make this happen however I can. I assume if we go down the route of images, the auto-generation will be the... harder part for me to meaningfully contribute to purely because I don't have access/am not familiar with the infrastructure we'd likely use to do that compute.

@targos
Member

targos commented Aug 5, 2021

The challenge I see with a Docker container is how to "connect" it to the user's GitHub credentials (so they can easily push to their fork after working on a change).

@bnb
Contributor Author

bnb commented Aug 5, 2021

@targos yeah, that's a place I think the enhancement with a small CLI or other kind of helper would be beneficial.

@bnb
Contributor Author

bnb commented Aug 5, 2021

I also think that's a much better problem to have than having to compile Node.js on your own computer 😅

@bnb
Contributor Author

bnb commented Aug 5, 2021

I suppose I should ask, assuming we do images:

  • are there any strong opinions on what needs to be included or excluded outside of the most basic requirements?
  • how would we build and publish nightly? do we want to Do It Ourselves or do we want to see if Actions will be "good enough"?

@mcollina
Member

mcollina commented Aug 6, 2021

I think we should really explore two options and see what is feasible:

  1. have one docker image generated every night compiled with everything we need.

  2. have a platform-specific (Linux/Intel, Mac/Intel, Mac/Apple, Win/Intel) archive created from the build/ folder that is downloaded

A few stats:

  • tar czf time: 30s
  • tar xzf time: 5s
  • .tar.gz size: 284MB

If we consider a 100 Mbit/s line, this bundle will download in around 20-30s.
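
A quick sanity check on that estimate, as a minimal sketch. It assumes an ideal link with no protocol overhead or latency, which is a simplification rather than anything measured in this thread, and the function name is only illustrative.

```python
def transfer_seconds(size_megabytes: float, link_megabits_per_second: float) -> float:
    """Ideal transfer time: convert megabytes to megabits, divide by link speed."""
    return size_megabytes * 8 / link_megabits_per_second


bundle_mb = 284  # .tar.gz size quoted above
link_mbit = 100  # line speed quoted above

print(f"{transfer_seconds(bundle_mb, link_mbit):.1f} s")  # prints "22.7 s"
```

Adding the 5s extraction time quoted above still lands inside the 20-30s range.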

@targos
Member

targos commented Aug 6, 2021

have one docker image generated every night compiled with everything we need.

I'm experimenting with this in https://github.com/targos/node-dev-docker

@targos
Member

targos commented Aug 6, 2021

how would we build and publish nightly? do we want to Do It Ourselves or do we want to see if Actions will be "good enough"?

GitHub actions might be enough, but I don't know where we would push the image. I would try with the GitHub container registry, but I don't know if it's designed to hold short-lived images (we don't want to keep all nightly builds in storage forever).
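
If nightly images should not be kept forever, the pruning rule itself is small. The sketch below only picks which tags fall outside a retention window. The tag names and the 14-day window are hypothetical, and it deliberately makes no registry API calls, since the registry is still undecided at this point in the thread.

```python
from datetime import date, timedelta


def tags_to_prune(built_on: dict[str, date], today: date, keep_days: int) -> list[str]:
    """Return the tags whose build date is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(tag for tag, built in built_on.items() if built < cutoff)


# Hypothetical nightly tags, for illustration only.
nightlies = {
    "nightly-2021-07-20": date(2021, 7, 20),
    "nightly-2021-08-01": date(2021, 8, 1),
    "nightly-2021-08-05": date(2021, 8, 5),
}
print(tags_to_prune(nightlies, today=date(2021, 8, 6), keep_days=14))
```

Whatever registry ends up being used, the returned list would be the input to its own delete mechanism.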

@targos
Member

targos commented Aug 6, 2021

have one docker image generated every night compiled with everything we need.

I'm experimenting with this in https://github.com/targos/node-dev-docker

After 41 minutes on my mac, the image is built. Its size is 2.86 GB

@mcollina
Member

mcollina commented Aug 6, 2021

After 41 minutes on my mac, the image is built. Its size is 2.86 GB

Compressed or uncompressed? Usually those are transferred in a compressed fashion.

@targos
Member

targos commented Aug 6, 2021

I guess uncompressed. It's the size reported in the Docker dashboard:

[image: Docker dashboard screenshot]

@richardlau
Member

You could always remove the out directory. Although that would remove the compiled output, the ccache cache should still exist to speed up build times.

@targos
Member

targos commented Aug 6, 2021

I'm going to try again with --ninja because I don't like the output with the normal build and subsequent calls to make are not no-ops.

@mcollina
Member

mcollina commented Aug 6, 2021

You could always remove the out directory. Although that would remove the compiled output, the ccache cache should still exist to speed up build times.

This might be really interesting.

@targos
Member

targos commented Aug 6, 2021

Still building, but FWIW the image is already 1.86 GB before compilation, so there's probably a lot of room for improvement (like removing the apt cache)

@AshCripps
Member

how would we build and publish nightly? do we want to Do It Ourselves or do we want to see if Actions will be "good enough"?

We do have a GCP account in build used for the download metrics, so we could make use of the container registry there (if it's applicable).

@targos
Member

targos commented Aug 6, 2021

You could always remove the out directory. Although that would remove the compiled output, the ccache cache should still exist to speed up build times.

Somehow ccache isn't used when the image is built... I pushed what I have so far.

@AshCripps
Member

Don't you need to explicitly set CC and CXX to use the ccache versions?

@richardlau
Member

Don't you need to explicitly set CC and CXX to use the ccache versions?

Not if using the ccache symlinks.

@targos
Member

targos commented Aug 6, 2021

The problem is that the $PATH isn't right. I haven't found out why yet, but even with source $HOME/.bashrc it doesn't work.

@richardlau
Member

The problem is that the $PATH isn't right. I haven't found out why yet, but even with source $HOME/.bashrc it doesn't work.

Perhaps because the RUN instruction in the Dockerfile does not run in an interactive shell? From the .bashrc file in the image:

nodejs@76d13c60b94b:~$ cat .bashrc
# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples

# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac
...

But maybe we don't need to modify .bashrc at all and can use Dockerfile's ENV instruction? targos/node-dev-docker#1
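
The effect of that guard can be reproduced outside Docker. This is a minimal sketch, assuming bash is installed and reachable via /usr/bin or /bin; the /usr/lib/ccache directory is an assumed location for the ccache symlinks, and the helper name is made up for illustration.

```python
import os
import subprocess
import tempfile

# Same guard as the .bashrc above, followed by a PATH change.
# /usr/lib/ccache is an assumed location for the ccache symlinks.
GUARDED_RC = """\
case $- in
    *i*) ;;
      *) return;;
esac
export PATH="/usr/lib/ccache:$PATH"
"""

BASE_PATH = "/usr/bin:/bin"


def path_after_sourcing(rc_text: str, base_path: str = BASE_PATH) -> str:
    """Source rc_text in a non-interactive bash and return the resulting PATH."""
    with tempfile.NamedTemporaryFile("w", suffix=".bashrc", delete=False) as rc:
        rc.write(rc_text)
    try:
        result = subprocess.run(
            # "$1" is the temporary rc file; the extra "bash" fills $0.
            ["bash", "-c", 'source "$1"; printf %s "$PATH"', "bash", rc.name],
            capture_output=True,
            text=True,
            check=True,
            env={"PATH": base_path},
        )
        return result.stdout
    finally:
        os.unlink(rc.name)


# The guard returns before the export runs, so PATH comes back unchanged.
print(path_after_sourcing(GUARDED_RC))
```

Passing PATH through the process environment, as `env=` does here, is the same idea as a Dockerfile ENV instruction: the value is in place for later RUN steps without relying on any shell startup file.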

@targos targos added the build Issues and PRs related to build files or the CI. label Aug 9, 2021
@leorossi

leorossi commented Aug 9, 2021

I would like to investigate another path: we could use GH Actions Artifacts in order to expose the build directory (and/or whatever is needed to speed up the process).

The artifact can be compressed and be available for download like in the image below

[image: Screenshot 2021-08-09 at 16 15 43]

What do you think?

@mcollina
Member

mcollina commented Aug 9, 2021

This seems pretty interesting. Once we build, we might be able to build a small CLI inside node-core-utils to download the bundle for the closest base commit.
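
The "closest base commit" lookup could be as simple as walking history from HEAD until a commit with a bundle turns up. A minimal sketch follows; the function name and the SHAs are hypothetical, and nothing like this exists in node-core-utils as far as this thread shows.

```python
from typing import Iterable, Optional


def closest_bundled_commit(history: Iterable[str], bundled: set[str]) -> Optional[str]:
    """Return the first commit in history (newest first) that has a bundle."""
    for sha in history:
        if sha in bundled:
            return sha
    return None


# Hypothetical short SHAs, newest first, in the order `git rev-list HEAD` lists commits.
history = ["c3c3c3c", "b2b2b2b", "a1a1a1a", "0f0f0f0"]
bundled = {"a1a1a1a", "0f0f0f0"}  # commits that have a nightly bundle

print(closest_bundled_commit(history, bundled))  # prints "a1a1a1a"
```

Everything after the returned commit would still need to be rebuilt locally, so the closer the bundle is to HEAD, the less work remains.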

@targos
Member

targos commented Aug 9, 2021

It seems interesting, but isn't the build output very dependent on the environment (OS, build tools, etc.)?

@AshCripps
Member

Would that involve having to run an action for each of the main three (Windows, macOS, Linux)? Because then we run into issues like the macOS action being very slow and unable to use ccache.

@leorossi

I made a workflow that built node on ubuntu-latest and pushed to an s3 bucket.

This is the output: https://nodejs-build-snapshots.s3.eu-west-1.amazonaws.com/builds/node-linux-v17.0.0-build-files.tar.gz

I see that the build directory /home/runner is hardcoded in many thousands of files so I don't know if this is portable.

I am not an expert at building big C++-like projects, so let me know if you have any clues...
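
To gauge how widespread the hardcoded path is in an extracted bundle, a small scan helps. This is a minimal sketch that reads each file fully into memory, which is fine for a one-off check but not tuned for speed; the function name is illustrative.

```python
import os


def files_containing(root: str, needle: bytes) -> list[str]:
    """Return sorted paths under root whose raw bytes contain needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.append(path)
            except OSError:
                # Unreadable entries (broken symlinks, permissions) are skipped.
                continue
    return sorted(hits)
```

Calling `files_containing("out", b"/home/runner")` on the extracted archive would list the affected files, which makes it easier to see whether they are mostly dependency files and build manifests or the compiled objects themselves.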

@bnb
Contributor Author

bnb commented Aug 12, 2021

Found the right person at GitHub to talk to today and they said they'd be able to get us an extra large runner for this if we decide to go down the Actions route 👍🏻

@bnb
Contributor Author

bnb commented Aug 12, 2021

GitHub actions might be enough, but I don't know where we would push the image. I would try with the GitHub container registry, but I don't know if it's designed to hold short-lived images (we don't want to keep all nightly builds in storage forever).

I'm sure we could get GitHub Container Registry set up if we want to, and I'd be happy to help with that. Docker Hub is also an option - I assumed we'd go with this purely based off of it being the default.

re: short-lived images: I honestly don't know if this matters. I actually think it might be nice to have nightly developer builds permanently archived. Perhaps there's justification for not keeping them that I'm missing?

@bnb
Contributor Author

bnb commented Aug 17, 2021

I'm working on an image for this. I'm going to see if I can grab some time with a couple of my coworkers who are much more experienced with Docker than I am to make it Good™ but hoping to publish it to a repo this week. I'll share it here for feedback - I've incorporated everything that's been suggested that is reasonable without enhancing existing tooling (see: nodejs/node-core-utils#554 for context on that!)... I figure we can do that work incrementally.

@targos
Member

targos commented Aug 21, 2021

@bnb in case you missed it, I spent some time working on an image in https://github.com/targos/node-dev-docker. Feel free to take anything from it :)

@bnb
Contributor Author

bnb commented Aug 22, 2021

oooh I did miss it, my apologies. I'll comb through it and see what's similar and what's different - I'm sure there's some things you thought of that I didn't 👍🏻

@bnb
Contributor Author

bnb commented Oct 28, 2021

I've been working on this over the past few weeks. Today I made the final bit of progress needed (thanks @mhdawson and @bmeck for the advice) to get it in a workable state.

Progress

So far, I've got a Docker image that:

  • sets up a user in ubuntu:latest
  • installs all of the necessary dependencies to build Node.js
  • creates the needed directories for building and caching
  • clones Node.js (shallowly - this can be tweaked to be more expansive later if we'd like)
  • builds Node.js with Ninja instructions
  • installs the built Node.js into the system, making it usable along with all of the globals we'd provide (node, npm, etc.)
  • installs node-core-utils globally

This is all done in bnb/devenv and is presently published to bitandbang/devenv:latest on Docker Hub. Taking this approach (publishing an image) has a few benefits, with one of the most impactful (imo) being that it can be set up to run as a VM completely agnostically with whatever specs you can afford to throw at it.

Additionally, I've got an example devcontainer config (available at bnb/node-devcontainer) that includes a minimal-ish .devcontainer.json configuration and Dockerfile. These allow GitHub Codespaces to build out a nice developer environment that can be used from GitHub.dev, from VS Code, or from the command line (thanks to the gh codespaces ssh command released yesterday!), directly within GitHub or from the Docker Desktop app. There are a few extensions I've set up that I think greatly enhance the experience when using Codespaces and VS Code. YMMV, we can always change this.

My proposal with this would be to include the configuration files (those in ./.devcontainer/) in nodejs/node. I'd of course expect some iteration/additions as we get a bit further along, but this works as-is now.

Further Work

I've set this up with building and publishing nightly in mind. Theoretically this should be doable (and ~relatively trivial) with GitHub Actions, but I've been focusing on the Docker side of things until the breakthrough progress I had today.

If we do decide to proceed with this, I'd presumably move the devenv repo to nodejs/ (the name was absolutely temp, I don't have a strong opinion on what it should be named... yet) and have it run/be maintained in the org. I'd also PR the .devcontainer directory to nodejs/node, which would allow people to launch the developer environment directly from nodejs/node on GitHub.com.

Ideally, to proceed, we'd need a few things:

  • GitHub Actions building the Docker image nightly
  • GitHub Actions publishing the Docker image nightly

And to land:

  • move bnb/devenv to the Node.js org.
  • PR ./.devcontainer from bnb/node-devcontainer into nodejs/node

If this is something you'd like to see, I'd appreciate hearing that.

@mcollina
Member

I'm +1 on trying this out and making it part of the Node.js infrastructure. I have a few questions.

Could the Docker image be something our Docker team helps maintain?

Could the image be used to develop Node.js outside of Codespaces? In that case, how would the flow work?

Would it be possible to get some level of sponsorship from GitHub to let developers use Codespaces for free in that environment/as part of contributing to Node.js?

@mhdawson
Member

I'm +1 on trying this out and making it part of the Node.js infrastructure.

+1 from me as well.

@bnb
Contributor Author

bnb commented Oct 29, 2021

Could the Docker image be something our Docker team helps maintain?

It could be maintained by anyone, so yes if they want to. I'd definitely love to continue contributing, of course :)

Could the image be used to develop Node.js outside of Codespaces? In that case, how would the flow work?

Yep! The image is your typical Docker image, just perhaps a bit more chonky than you'd normally want a production service Docker image to be. Spin it up on your local machine or in a cloud, SSH in and you're good to go. We should definitely have instructions for this!

Would it be possible to get some level of sponsorship from GitHub to let developers use Codespaces for free in that environment/as part of contributing to Node.js?

I'm not sure on this, since I don't think this is how Codespaces works in terms of the infrastructure/billing. Specifically, it's presently tied to org membership with the org being responsible for the bill. What I'd assume is more likely than public entitlements, given the billing structure, is getting an entitlement for the org and its members. I'll be happy to ask on both fronts, though.

@targos
Member

targos commented Jan 13, 2022

Is this relevant?

github/roadmap#373

@bnb
Contributor Author

bnb commented Feb 16, 2022

@targos heh, yes. I'll have to see what's needed for that :)

@targos
Member

targos commented Feb 24, 2022

@simoneb
Contributor

simoneb commented Jun 16, 2022


8 participants