Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic HTTP health check #4352

Merged
merged 21 commits into from
Feb 10, 2023
Merged

Add basic HTTP health check #4352

merged 21 commits into from
Feb 10, 2023

Conversation

richlander
Copy link
Member

@richlander richlander commented Jan 24, 2023

I followed these instructions: https://learn.microsoft.com/aspnet/core/host-and-deploy/health-checks.

I tested with curl and Kubernetes. Both worked.

% curl http://localhost:8000/healthz
Healthy

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

Copy link
Member

@mthalman mthalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, having the app capable of doing a health check and leaving it at that, it doesn't go far enough. We should also demonstrate how it's used in the Dockerfile. I think we should update the Dockerfiles to include a HEALTHCHECK instruction. This is still useful in plain old Docker, outside the context of Kubernetes. The health check status will show up in a docker ps output of the running sample container.

@richlander
Copy link
Member Author

richlander commented Jan 24, 2023

I will do that Matt. That's a good idea.

This issue is now more relevant: #4300. I didn't realize when we had that conversation that non-distroless images also didn't include curl or wget. It increases the desire for this feature. I'm now going to need to install one of those tools in our sample, which is a bit unfortunate.

Also, for folks just wanting a healthcheck experience in dev, installing the extra tool might be overkill.

@richlander
Copy link
Member Author

Before I go any further, I want to validate the change for just Dockerfile. I'm happy to make the same change to the others.

Before:

$ docker run --rm -d -p 8000:80 mcr.microsoft.com/dotnet/samples:aspnetapp
42f412fdb600bbdc515ec7839f2632bdb3658400f76e021b1da83b61544f8033
$ docker ps -l
CONTAINER ID   IMAGE                                        COMMAND         CREATED         STATUS         PORTS                  NAMES
42f412fdb600   mcr.microsoft.com/dotnet/samples:aspnetapp   "./aspnetapp"   4 seconds ago   Up 3 seconds   0.0.0.0:8000->80/tcp   elegant_buck

After:

$ docker run --rm -d -p 8000:80 aspnetapp
5a992022ff999d10e82992b7f729105ba12ef630655a3a5ccf0f76424d61f6bc
$ docker ps -l
CONTAINER ID   IMAGE       COMMAND                  CREATED          STATUS                             PORTS
       NAMES
5a992022ff99   aspnetapp   "dotnet aspnetapp.dll"   10 seconds ago   Up 10 seconds (health: starting)   0.0.0.0:8000->80/tcp   beautiful_kilby
docker ps -l
CONTAINER ID   IMAGE       COMMAND                  CREATED          STATUS                    PORTS                  NAMES
5a992022ff99   aspnetapp   "dotnet aspnetapp.dll"   34 seconds ago   Up 33 seconds (healthy)   0.0.0.0:8000->80/tcp   beautiful_kilby

Looks good?

RUN apt-get update \
&& apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
HEALTHCHECK CMD curl http://localhost/healthz
Copy link
Member

@jander-msft jander-msft Jan 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to somehow account for if the user specifies HTTPS_PORTS or HTTPS_PORTS environment variables on the container?

Copy link
Member Author

@richlander richlander Jan 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point. Ports suck.

With .NET 8, that will be easier. This is still .NET 7. I don't know what to do. I added a comment.

@@ -12,6 +12,11 @@ RUN dotnet publish -c Release -o /app --use-current-runtime --self-contained fal

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
RUN apt-get update \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to be a multi-platform Dockerfile. This breaks that contract. This is another case for needing #4300.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that this approach is ugly and we should provide a helper app. My intent was to demonstrate the basic case and see where the conversation led.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks that contract.

And it breaks the build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that this approach is ugly and we should provide a helper app.

Even with the helper app, I'm wondering whether this is going to overcomplicate the sample and it should be factored out into a separate Dockerfile that demonstrates the health check.

&& apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
# This port needs to match the port being used
HEALTHCHECK CMD curl http://localhost:80/healthz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first saw this, I thought it was a bit weird to see the HEALTHCHECK instruction prior to the app being copied into the image. After thinking about it, it makes a bit of sense in that it never changes and therefore can better utilize the docker layer cache in a build environment. That aside I wonder if it will be confusing to the reader.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach is the correct one. The caching doesn't add much but it is caching all the same. The ASPNET_CORE_URLS ENV is added before all of this. It's just layering of one form or another.

samples/aspnetapp/aspnetapp/Program.cs Outdated Show resolved Hide resolved
@@ -12,6 +12,11 @@ RUN dotnet publish -c Release -o /app --use-current-runtime --self-contained fal

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
RUN apt-get update \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks that contract.

And it breaks the build.

samples/aspnetapp/aspnetapp/EnvironmentInfo.cs Outdated Show resolved Hide resolved
@@ -12,6 +12,11 @@ RUN dotnet publish -c Release -o /app --use-current-runtime --self-contained fal

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
RUN apt-get update \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that this approach is ugly and we should provide a helper app.

Even with the helper app, I'm wondering whether this is going to overcomplicate the sample and it should be factored out into a separate Dockerfile that demonstrates the health check.

@richlander
Copy link
Member Author

We could do the healthcheck in a separate Dockerfile. However, we'd want to push that version to MCR.

@richlander
Copy link
Member Author

I switched to Alpine (because wget is there) and added a new pattern that could be used to collapse other Alpine Dockerfiles. I'm happy to do that in another PR.

In our initial conversations, we were going to call this one Dockerfile.health. It would be best to avoid that if we have the goal of enable health checks throughout our fleet of samples.

I will add this to the manifest.json file once everyone is happy with the approach. I tested it and it works as expected.

Copy link
Member

@mthalman mthalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these file names should indicate their difference from the result of the samples. How about Dockerfile.healthcheck-alpine-slim and Dockerfile.healthcheck-nanoserver-slim?

The https://github.com/dotnet/dotnet-docker/blob/main/manifest.samples.json file should be updated to reference these Dockerfiles so that they will be published.

Did we also want to update https://github.com/dotnet/dotnet-docker/blob/main/samples/aspnetapp/README.md to have a blurb that pointed people to these Dockerfiles and explained the health check?

@richlander
Copy link
Member Author

On the naming, it's a question of where we are headed with this. I chose the naming w/o healthcheck with the idea that:

  • These new Dockerfiles would replace the existing slim ones (for Alpine and Nanoserver to start).
  • It would be odd to not have the health check in the Alpine and Nanoserver ones given that it is just one more (useful) line.
  • We'll design an approach for the rest of the dockerfiles in a bit.
  • We can point out that we have healthcheck examples from other docs/READMEs.

@richlander
Copy link
Member Author

richlander commented Jan 26, 2023

You can check the output of health checks via docker inspect. How you use curl and wget matters for that. I also discovered that the details of using these tools differs a fair bit.

Reference: https://stackoverflow.com/questions/47722898/how-to-do-a-docker-healthcheck-with-wget-instead-of-curl

On Alpine, the following appears to be the best approach:

HEALTHCHECK CMD wget -qO- -t1 http://localhost:80/healthz || exit 1

It produces the following output for the success case:

$ docker inspect 21ec64322542 | jq .[-1].State.Health
{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2023-01-26T23:18:42.790331305Z",
      "End": "2023-01-26T23:18:42.953258349Z",
      "ExitCode": 0,
      "Output": "Healthy"
    },
    {
      "Start": "2023-01-26T23:19:12.962127063Z",
      "End": "2023-01-26T23:19:13.095388607Z",
      "ExitCode": 0,
      "Output": "Healthy"
    }
  ]
}

And for the failure case:

$ docker inspect 0fd5206a0f1b | jq .[-1].State.Health
{
  "Status": "starting",
  "FailingStreak": 2,
  "Log": [
    {
      "Start": "2023-01-26T23:25:14.920091363Z",
      "End": "2023-01-26T23:25:15.090543745Z",
      "ExitCode": 1,
      "Output": "wget: server returned error: HTTP/1.1 404 Not Found\n"
    },
    {
      "Start": "2023-01-26T23:25:45.098815567Z",
      "End": "2023-01-26T23:25:45.168346131Z",
      "ExitCode": 1,
      "Output": "wget: server returned error: HTTP/1.1 404 Not Found\n"
    }
  ]
}

I found that Alpine and Ubuntu had different behavior, in terms of the silent behavior while still printing error information. I ended up with the following.

  • On Alpine, use -q.
  • On Ubuntu, use -nv.

For curl, I found the following:

HEALTHCHECK CMD curl -sf --show-error http://localhost:80/healthz || exit 1

FYI: curl produces a variety of error codes, while Docker requires 0 or 1. In one case, it produced an error code of 22. It's best to control the error code.

The same again, using docker inspect, this pattern produces the following output for the success case:

"Health": {
                "Status": "healthy",
                "FailingStreak": 0,
                "Log": [
                    {
                        "Start": "2023-01-26T16:05:57.2499917-08:00",
                        "End": "2023-01-26T16:05:57.3792955-08:00",
                        "ExitCode": 0,
                        "Output": "Healthy"
                    },
                    {
                        "Start": "2023-01-26T16:06:27.3966966-08:00",
                        "End": "2023-01-26T16:06:27.4224827-08:00",
                        "ExitCode": 0,
                        "Output": "Healthy"
                    }
                ]
            }

And for the failure case:

"Health": {
                "Status": "starting",
                "FailingStreak": 1,
                "Log": [
                    {
                        "Start": "2023-01-26T16:04:21.0513775-08:00",
                        "End": "2023-01-26T16:04:21.1536597-08:00",
                        "ExitCode": 1,
                        "Output": "curl: (22) The requested URL returned error: 404\r\n"
                    }
                ]
            }

Works?

Copy link
Member

@mthalman mthalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to also run eng/readme-templates/Get-GeneratedReadmes.ps1 to update the readmes due to the Dockerfile path change in the manifest.

samples/aspnetapp/README.md Show resolved Hide resolved
@Saibamen
Copy link

This feature will be enabled in base Docker images?
It could be enabled by setting ENV variable (to make sure that Developer enables Healthz on code-level).

@richlander
Copy link
Member Author

richlander commented Jan 28, 2023

That occurred to me, but it doesn't sound like the best model since it would still require configuration if another port is chosen. Setting an ENV and opting into HEALTHCHECK seems pretty similar into of extra work required. We want to ensure that all the capabilities are in place to use HEALTHCHECK. It's 100% possible today, just less pretty and convenient than we realized.

Is this a capability that you will use? Have you been using healthcheck to date? If we only implement it in these samples, will you apply that pattern to your Dockerfiles?

@richlander
Copy link
Member Author

Can I do args for the nanoserver Dockerfile? Ideally, we could avoid duplicating the Dockerfile just to accommodate the difference between the two FROMs.

@mthalman
Copy link
Member

Can I do args for the nanoserver Dockerfile? Ideally, we could avoid duplicating the Dockerfile just to accommodate the difference between the two FROMs.

Can you explain with an example? What are you looking to reuse that you'd use ARGs for?

@richlander
Copy link
Member Author

Docker.nanoserver-slim, for multiple versions of nanoserver. Right now, it is targeting just one and I'm not sure how to resolve that.

@Saibamen
Copy link

Is this a capability that you will use? Have you been using healthcheck to date? If we only implement it in these samples, will you apply that pattern to your Dockerfiles?

No, I was not using healthchecks before, but I think I will be using it for Docker images where we don't have implemented `"api/version" controller's action

@mthalman
Copy link
Member

Docker.nanoserver-slim, for multiple versions of nanoserver. Right now, it is targeting just one and I'm not sure how to resolve that.

Ah, got it. I don't have a problem using an ARG for that, as long as it gets defaulted.

@richlander
Copy link
Member Author

I was super unclear as to what I was asking. How do I specify an arg in the manifests.json file? I didn't see that used?

@mthalman
Copy link
Member

I was super unclear as to what I was asking. How do I specify an arg in the manifests.json file? I didn't see that used?

Here's an example: https://github.com/dotnet/docker-tools/blob/bf5e6ad6053b16d023b4021bef892403783709d2/src/Microsoft.DotNet.ImageBuilder/manifest.json#L28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants