Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐞 existing services break in 0.15 #9190

Closed
levlaz opened this issue Dec 12, 2024 · 17 comments · Fixed by #9247
Closed

🐞 existing services break in 0.15 #9190

levlaz opened this issue Dec 12, 2024 · 17 comments · Fixed by #9247
Labels
kind/bug Something isn't working

Comments

@levlaz
Copy link
Contributor

levlaz commented Dec 12, 2024

What is the issue?

I have a sample go application (https://github.com/levlaz/snippetbox) that installs mariadb (https://daggerverse.dev/mod/github.com/levlaz/daggerverse/mariadb@cea1668da940b45864116049bd20087855c8c787) from the daggerverse and mounts it as a service binding.

THe mariadb module is targeting 0.11.6 and the snippetbox module is targeting 0.13.16

This is no longer working in 0.15 with the following error:

│ ✔ .withEnvVariable(name: "CACHEBUSTER", value: "2024-12-12 22:56:27.108879878 +0000 UTC m=+0.053134793"): Container! 0.0s
│ ✘ .withExec(args: ["sh", "-c", "mysql -h db -u root < internal/db/init.sql"]): Container! 0.1s
│ ┃ mysql: Deprecated program name. It will be removed in a future release, use '
│ ┃ /bin/mariadb' instead                                                        
│ ┃ ERROR 2026 (HY000): TLS/SSL error: self-signed certificate                   
│ ! process "sh -c mysql -h db -u root < internal/db/init.sql" did not complete successfully: exit code: 1
│ ○ .withExec(args: ["sh", "-c", "mysql -h db -u root snippetbox < internal/db/seed.sql"]): Container! 0.0s
│ ○ .withExec(args: ["go", "run", "./cmd/web", "--dsn", "web:pass@tcp(db)/snippetbox?parseTime=true"]): Container! 0.0s
✘ .terminal: Container! 4.9s
! failed to start service: solve mount /: process "sh -c mysql -h db -u root < internal/db/init.sql" did not complete successfully: exit code: 1

I suspect its not able to get the service to work due to the new API change, but I was under the impression that this change should not break existing pipelines like this.

Dagger version

dagger v0.15.0 (registry.dagger.io/engine:v0.15.0) darwin/arm64

Steps to reproduce

dagger -m github.com/levlaz/snippetbox call server

Log output

Here's a trace: https://dagger.cloud/levs-test-org/traces/f06d326bb79ffd99fb7afdc647b58f21

@levlaz levlaz added the kind/bug Something isn't working label Dec 12, 2024
@rajatjindal
Copy link
Member

Hi @levlaz

Thank you for reporting this issue.

I just tried this with v0.14.0 and got the same error:

mysql: Deprecated program name. It will be removed in a future release, use '/usr/bin/mariadb' instead
ERROR 2026 (HY000): TLS/SSL error: self-signed certificate

(trace: https://v3.dagger.cloud/rajatjindal/traces/7a4d4678fc04d42c8071fe1d03c4e4bb?listen=b913daa7677b6b45&listen=5d710621d7bdd220)

once I added --skip-tls to mysql commands, it passed through that error. (Kindly note that it still gets stuck at withExec to start the golang server, that I am trying to understand why and if that was always the case)

could you please confirm if that makes sense or did I missed something when trying to reproduce the problem?

@rajatjindal
Copy link
Member

I think I know what might be happening here. when calling dagger container up, it calls internally the asService api, which seems like is going to the new implementation even when the module is at version < v0.15.0.

@jedevc is that expected?

We may have to add a legacy vs non-legacy version of dagger container up for this.

@jedevc
Copy link
Member

jedevc commented Dec 13, 2024

Hm, yes. container.Up should also be versioned.

@jedevc
Copy link
Member

jedevc commented Dec 13, 2024

It actually might be possible to do something like this - instead of needing to handle a legacy function:

	var svc dagql.Instance[*core.Service]
	err := s.srv.Select(ctx, ctr, &svc,
		dagql.Selector{
			Field: "asService",
			View:  s.srv.View,
		},
	)

But that's just a guess, we need a test for this as well.

@rajatjindal
Copy link
Member

rajatjindal commented Dec 13, 2024

I actually tried it, and it seems like it always pass the server view if we use s.srv.View. so I am trying to find an alternative for this. (we might be able to get current module source and use view from there. trying this now).

@jedevc
Copy link
Member

jedevc commented Dec 13, 2024

pass the server view if we use s.srv.View

Is this not what we want?

@jedevc
Copy link
Member

jedevc commented Dec 13, 2024

The server is created from the module source, so I would expect this to work. We already use it like this:

// custom dagger field
dagql.Func("__schemaVersion", func(ctx context.Context, self T, args struct{}) (string, error) {
return srv.View, nil
}).View(dagql.AllView{}),

@jedevc jedevc added this to the v0.15.2 milestone Dec 13, 2024
@rajatjindal
Copy link
Member

hmm I must be missing something (will take a short break and try again), but it seems to be returning v0.15.2 when I use hack/dev and hack/dev dagger query --doc ./modules/compatcheck/introspection.graphql

I also tried adding a logline to engine when this api is called, and that prints v0.15.2 as well.

@jedevc
Copy link
Member

jedevc commented Dec 13, 2024

hack/dev dagger query --doc ./modules/compatcheck/introspection.graphql this is definitely expected - the dagger command here would be using v0.15.2 (it doesn't use the module it's in intentionally).

If you call this in the context of a module - it should give the right result though.

@rajatjindal
Copy link
Member

I confirmed by adding logging that the view is set to v0.14.0 (from dagger.json) when calling the function that returns a container. BUT when we call up on that container, the version seen on server side is v0.15.2 (the current engine version).

I am trying to identify how to fix this.

@jedevc
Copy link
Member

jedevc commented Dec 16, 2024

@rajatjindal what's the command you're running to see this? Are you doing dagger call server up?

(if so, then yes, the up is being called from the CLI - which is version v0.15.2, so we would expect to use v0.15.2 here).

Only modules are given the backwards compat guarantee - the CLI always consumes it's version of the API. There was a point when I considered versioning the CLI like this - but if you do this, you can very easily get problems where the CLI expects a specific API version for it's own internal queries.

@jedevc
Copy link
Member

jedevc commented Dec 16, 2024

Okay okay, I synced with @rajatjindal on this, and we worked out next steps.

So - the reason this breaks is this:

  • server returns a Container. We then call up on it (in the context of the cli, which has been upgraded to v0.15).
  • I think the right thing to do here is to change server to return a Service instead of a Container - it should be a ready-to-run Service. In this case, the Service conversion would happen completely as intended with module compat.
  • But when returning a Container and converting it to a Service once we've crossed this boundary just can't really work.

Aside from the fixes we'll make below - I think it's worth updating snippetbox to return a Service from this function.

Note

Why can't we just use the API versioned from the module in the CLI? Why can't we just have Container.up here use the v0.14.0 API from the module instead of the v0.15.0 API from the CLI?

This is a deliberate design decision we made when doing module compat - without this, it's really weird. Suppose we make a breaking change to an API in withExec. Now, if that applies on the CLI as well, as an external caller, I need to modify my call to match the API version of the module I'm calling - which isn't easy to guess or know. This would be really confusing, so we just present one version constantly - the CLI always gets it's version of the API (v0.15.0 in this case).

That said, the current state is horribly difficult to migrate to, so here's what we're going to do:

  • Container.up isn't properly versioned - you'd only hit this if you called Container.up in a module, so that's not exactly the issue here, but we should still fix it. @jedevc will fix this in Container.up consistency with Container.asService #9231.
  • Container.up should have all the args from Container.asService, and all the args from Service.up - this would let you do dagger call server up --args=.... @jedevc will fix this in Container.up consistency with Container.asService #9231
  • Container.up and Container.asService should fail if no args are provided anywhere. This'll fix the infinite hanging problem where the last WithExec will be run as default, and also catch this case. @rajatjindal will fix this - Services hang if no entrypoint, args or withdefaultargs is used #9198.
    • This would still break the example in snippetbox that you're using @levlaz - but it would provide a significantly improved error message.
    • We might also extend this to version WithDefaultArgs - if snippetbox's server method used WithDefaultArgs, we wouldn't trigger this error - but if we add a versioned WithDefaultArgs, we could be able to detect this case, and error out. If we do this, @jedevc will fix this.

I think this is the best we can do, keeping the API design that we've done - sadly, I don't think there's a feasible way to get the snippetbox module working with v0.15.0 as-is, with no changes to either the module or the call command that invokes it. I think this is probably a miss from us, we should have caught this significantly earlier in the process - might be worth a retro to work out how to not do this again.

The aim is to try and fix this ASAP, and get it into v0.15.2 sometime early this week.

@marcosnils
Copy link
Contributor

Thx for the writeup Jeremy, it's super clear.

Dagger does not use the container ENTRYPOINT by default inAsService or WithExec functions, but uses CMD/defaultArgs by default. This allows the CMD/defaultArgs to be overridden by args provided to AsService or WithExec.

Should we expand this a bit and briefly explain why?

marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Dec 18, 2024
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
@marcosnils
Copy link
Contributor

@jpadams let's not forget to update your last comment based on the latest interactions with @shykes 🙏

@jpadams
Copy link
Contributor

jpadams commented Dec 21, 2024

PROPOSAL
tl;dr: For calls to AsService on a Container without args and with NO changes/overrides of the ENTRYPOINT+CMD, Dagger will execute ENTRYPOINT+CMD on service start.

  • note: CMD is aka DefaultArgs in Dagger
  • note: also applies to Up when used as syntactic sugar for AsService+Up
  • see In practiceAsService/Up below for more detail

Said another way: For AsService (doesn't apply to WithExec), if you don't "break glass" and change the image's ENTRYPOINT+CMD with WithEntrypoint, WithoutEntrypoint, WithDefaultArgs, WithoutDefaultArgs, or with args on the AsService itself, then Dagger would just use the image's ENTRYPOINT+CMD.

Terminology note:

  • ENTRYPOINT/Entrypoint/entrypoint is the Docker/OCI/Dagger term. command is the corresponding k8s term.
  • CMD/Cmd/cmd is the Docker/OCI term. DefaultArgs (various case) is the Dagger term. args is the corresponding k8s term.

Users being bit by need for UseEntrypoint:
https://discord.com/channels/707636530424053791/1320415936213155891/1320430956397592733
https://discord.com/channels/707636530424053791/1319810585399001148/1319834611643187271

The Dagger philosophy:
Dagger combines both build capabilities of Dockerfile / docker build and run capabilities of docker run (interactive and service/daemon mode) and docker compose, while adding additional capabilities not available in those tools. Our desire is for a unified experience with a DX that is intuitive for both experienced users of Docker and Kubernetes as well as users new to containers and pipelines.

In most cases, for the best containerized pipelines-as-code experience, users should not rely upon the OCI container image’s ENTRYPOINT or CMD, but rather explicitly state their desired arguments in the WithExec or AsService functions as args. In this way, the arguments are kept “close” to the point of execution and users avoid any confusion that could come about from relying on the implicit ENTRYPOINT+CMD behavior used by default in Docker and Kubernetes or the ENTRYPOINT+args behavior in docker run. There is an exception to this rule to accommodate the ecosystem of service containers (e.g. database, webserver, cache, proxy) that use ENTRYPOINT to execute an entrypoint.sh script to perform essential service configuration (see AsService below).

Note: In order to have compatibility with the Docker and Kubernetes's ecosystems, users have full control over setting the ENTRYPOINT and CMD when building container images for consumption outside of Dagger.

In practice, we should support:
AsService/Up - Since pre-built container images for popular long-running services (e.g. database, webserver, cache, proxy) often use ENTRYPOINT+CMD to execute an entrypoint.sh script to perform essential service configuration (spreadsheet of popular images), for calls to Up or AsService on a Container without args and with NO mutations/overrides to the ENTRYPOINT+CMD/DefaultArgs, execute ENTRYPOINT+CMD (one or both of ENTRYPOINT or CMD could be empty) on service start as if useEntrypoint=True were set (as in [1], [2] below).

Note: If both ENTRYPOINT+CMD would run, but they are both empty, we should allow the natural error to occur.

Note: Calling Up directly on a Container is equivalent to first calling AsService (with no args) and then Up, and will use the same CMD/DefaultArgs.

[1] dagger shell -c 'container | from postgres | with-exposed-port 5432 | with-env-variable POSTGRES_PASSWORD postgres | as-service | up'

[2] dagger shell -c 'container | from postgres | with-exposed-port 5432 | with-env-variable POSTGRES_PASSWORD postgres | up'

Note today in 0.15.1, you need to use:
[3] dagger shell -c 'container | from postgres | with-exposed-port 5432 | with-env-variable POSTGRES_PASSWORD postgres | as-service --use-entrypoint | up'
or
[4] dagger shell -c 'container | from postgres | with-exposed-port 5432 | with-env-variable POSTGRES_PASSWORD postgres | with-default-args docker-entrypoint.sh,postgres | as-service | up'
or (should also work in 0.15.1, but has a shell bug)
[5] dagger shell -c 'container | from postgres | with-exposed-port 5432 | with-env-variable POSTGRES_PASSWORD postgres | as-service --args docker-entrypoint.sh,postgres | up'
so here's the dagger core version:
dagger core container from --address postgres with-exposed-port --port 5432 with-env-variable --name POSTGRES_PASSWORD --value postgres as-service --args docker-entrypoint.sh,postgres up

@jpadams
Copy link
Contributor

jpadams commented Jan 2, 2025

@shykes Looks like we have consensus on this proposal ☝️ from most of team and I know we discussed it live where I believe you were 👍. Could you drop a comment here to record decision to re-introduce the prior behavior of:

For calls to AsService on a Container without args and with NO changes/overrides of the ENTRYPOINT+CMD, Dagger will execute ENTRYPOINT+CMD on service start.

@shykes
Copy link
Contributor

shykes commented Jan 2, 2025

@jpadams @marcosnils @levlaz I opened a separate issue specifically for the proposed API change (better for the record): #9305

@jedevc jedevc removed this from the v0.15.2 milestone Jan 6, 2025
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 7, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 8, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit to marcosnils/dagger that referenced this issue Jan 8, 2025
this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes dagger#9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
marcosnils added a commit that referenced this issue Jan 11, 2025
* engine: return error if service images can't be started.

this addresses the issue introduced in v0.15.0 where services will hang
to start if neither CMD or ENTRYPOINT is used.

fixes #9190

Signed-off-by: Marcos Lilljedahl <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants