-
Notifications
You must be signed in to change notification settings - Fork 810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set singleuser.args to jupyterhub-singleuser #2805
Conversation
After writing https://github.com/pangeo-data/pangeo-docker-images/pull/355/files#diff-a77643b43a7be453fa8556937bf32b27907e152a10d4c693f3e7670c66a44378, I would like to ask that we reconsider *requiring* entrypoint be set to a specific thing (`jupyterhub-singleuser`) in the images users are required to produce. This makes it quite difficult to produce images that *work across multiple tools* without having to configure each of those tools. In this case, we want to use the *exact same image* in both binder and apache beam. Apache beam *requires* an entrypoint be specified, while at least as of now binder does not - so I wrote a script that uses heuristics to determine if the caller is apache beam or something else, and change behavior accordingly. But if I want this image to also work with JupyterHub in the future, I'll have to modify my script to detect if JupyterHub is calling it, and exec `jupyterhub-singleuser`. Now if dask also wants to use this image, I'd need to have a code path for that as well. I think this strong coupling causes a lot of unnecessary friction here. Another point of confusion here is that `cmd` in helm / k8s is actually *ENTRYPOINT* in docker, while `args` in helm / k8s is *CMD* in docker. Earlier, we were setting `cmd` to `jupyterhub-singleuser`, which overrode any entrypoint the docker builder had set up - causing the issues we were trying to fix. However, now that KubeSpawner doesn't actually use `args` (k8s / helm) in any meaningful fashion, can we set the default `args` to `jupyterhub-singleuser`? This has a few advantages: 1. *ENTRYPOINT* set in Dockerfile is *not* overriden by what we set in helm, which is the current problem. This goes away 2. All existing docker images continue to work without admins having to set extra config or the image having to be rebuilt. 3. The same image can be used in multiple places without having to be aware of who is calling it. I think this is an approach to solving the original problem that is both *easier to document*, *easier for our users (existing and new)* and makes life a lot less difficult for people building images.
I can investigate the test failures and try getting this to work if y'all think the general approach is workable, @manics @minrk @consideRatio |
Here's the somewhat tortured test I had to write to validate my 'multi dispatch entrypoint' works ok https://github.com/pangeo-data/pangeo-docker-images/pull/355/files#diff-6027b5fe95c80f9fe0bcead4ae1288105dfee85005ad12d2aa07d58c0d2fa8d2R35 |
The aim of the original PR was to ensure images that run scripts with custom setup will work without changes. For example, the docker-stacks images run some hooks to customise the environment before running jupyterhub-singleuser: If we override the entrypoint or command this will break the docker-stacks images again: |
About docker-stacks imagesDocker stacks relies on the startup scripts to do misc logic, but will call From a docker-stacks perspective, such images should be started with the start-notebook.sh script if users also want to rely on various environment variable influencing those scripts. They can start like that either by default via their Dockerfile's declared ENTRYPOINT+CMD, or via z2jh's configuration of kubespawner, which in turn configures the k8s About reverting #2449There is a key part that I know @minrk wants resolved, and that is the possibility to configuring z2jh in a way that makes the Dockerfile's CMD be respected. Can we do that? I recall an issue, so I'll look into this... Currently, zero-to-jupyterhub-k8s/jupyterhub/files/hub/jupyterhub_config.py Lines 412 to 413 in 58d7d67
The default value of Okay, I think we should be good. Users that want the behavior of relying on the My current opinionI think either choice, of defaulting to Given that this isn't a new project, and we have a current default and have forced users that wanted to rely on Example: singleuser:
cmd:
# declaring cmd like above is the same as declaring it like below
# cmd: null |
cmd: | ||
args: | ||
- jupyterhub-singleuser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For singleuser
, there is no args
- this is Helm chart config mimicing the jupyterhub configuration options for the spawner, not directly mapped to k8s config specification (command
+ args
)
So, singleuser.cmd
is the JupyterHub Spawner cmd
, which is in the end rendered to the k8s args
by KubeSpawner.
I think looking at #2449 is needed to better revert that.
I'm very much +1 to this too. |
I've lost track of where it was discussed recently, but as long as set-to-null in a user chart to opt-in to the image's custom command is easy and doesn't require replicating the command itself (the original motivation of #2138, I think), I'm happy. As @consideRatio mentioned, changing None to be the default in KubeSpawner seeems to have made this possible, so with this change (after fixing
This seems the best balance to me. |
OK, let's revert the breaking change then. Can you update https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/customizing/user-environment.html#choose-and-use-an-existing-docker-image to include |
After writing
https://github.com/pangeo-data/pangeo-docker-images/pull/355/files#diff-a77643b43a7be453fa8556937bf32b27907e152a10d4c693f3e7670c66a44378,
I would like to ask that we reconsider requiring entrypoint be set to
a specific thing (
jupyterhub-singleuser
) in the images users arerequired to produce. This makes it quite difficult to produce images
that work across multiple tools without having to configure each of
those tools. In this case, we want to use the exact same image in both
binder and apache beam. Apache beam requires an entrypoint be
specified, while at least as of now binder does not - so I wrote a
script that uses heuristics to determine if the caller is apache beam or
something else, and change behavior accordingly. But if I want this
image to also work with JupyterHub in the future, I'll have to modify my
script to detect if JupyterHub is calling it, and exec
jupyterhub-singleuser
. Now if dask also wants to use this image, I'dneed to have a code path for that as well. I think this strong coupling
causes a lot of unnecessary friction here.
Another point of confusion here is that
cmd
in helm / k8s is actuallyENTRYPOINT in docker, while
args
in helm / k8s is CMD in docker.Earlier, we were setting
cmd
tojupyterhub-singleuser
, whichoverrode any entrypoint the docker builder had set up - causing the
issues we were trying to fix. However, now that KubeSpawner doesn't
actually use
args
(k8s / helm) in any meaningful fashion, can we setthe default
args
tojupyterhub-singleuser
? This has a fewadvantages:
helm, which is the current problem. This goes away
set extra config or the image having to be rebuilt.
aware of who is calling it.
I think this is an approach to solving the original problem that is both
easier to document, easier for our users (existing and new) and
makes life a lot less difficult for people building images.
Re-implementation of #2449