Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement gRPC standard health checking service #135

Closed
jen20 opened this issue Nov 12, 2019 · 20 comments · Fixed by #304
Closed

Implement gRPC standard health checking service #135

jen20 opened this issue Nov 12, 2019 · 20 comments · Fixed by #304
Assignees
Labels
C-enhancement Category: New feature or request
Milestone

Comments

@jen20
Copy link
Contributor

jen20 commented Nov 12, 2019

Feature Request

Motivation

gRPC defines a standard health checking service, here. This is a somewhat common requirement for services.

Proposal

Implement a crate which allows this to be implemented via a simple trait and exposed via the Tonic add_service function.

Alternatives

Implement this as a crate outside tonic. This may be a better option, but I don't think there is a better place to track it than here regardless of which path we take implementing it.

@jen20 jen20 added the C-enhancement Category: New feature or request label Nov 12, 2019
@jen20 jen20 self-assigned this Nov 12, 2019
@LucioFranco
Copy link
Member

@jen20 separate crate makes sense 👍

@xd009642
Copy link
Contributor

Has there been any progress on this? Just wondering before I implement my own 😄

@jen20
Copy link
Contributor Author

jen20 commented Nov 28, 2019

@xd009642 I haven't made any yet, but do need to at some point. If you do it before Saturday you'll likely beat me to it though :-)

@xd009642
Copy link
Contributor

I think I'll be starting on Monday so you may still beat me 😄

@zakhenry
Copy link
Contributor

zakhenry commented Nov 28, 2019

I’ve done the other end of this (client health checking) using tonic - https://github.com/zakhenry/nightingale It might make sense for this lib to be in the same repo? I’d be happy to contribute 😁

It would be cool to move it into the hyperium namespace to make it more “official”

@rlabrecque
Copy link

Here's something for someone to start on, I hacked this together quickly a few weeks back, I don't think it quite follows the spec, and is kinda gross, but it was good enough for my needs initially.

pub trait NamedService {
    fn get_name() -> String;
}
use grpc::health::v1::{
    health_check_response::ServingStatus,
    server::{Health, HealthServer},
    HealthCheckRequest, HealthCheckResponse,
};
use std::collections::VecDeque;
use tonic::{transport::ServiceName, Code, Request, Response, Status};
#[allow(unused_imports)]
use tracing::{debug, error, info, trace, warn};

type HealthServiceResult<T> = Result<Response<T>, Status>;
type Stream = VecDeque<Result<HealthCheckResponse, Status>>;

#[derive(Clone, Debug)]
pub struct HealthService {
    services: Vec<String>,
}

impl HealthService {
    pub fn new(service_names: Vec<String>) -> HealthServer<Self> {
        let mut health_service = HealthService {
            services: service_names,
        };

        health_service.services.push(Self::get_name());
        health_service.services.push(String::new()); // Empty string to match against for "any service"

        HealthServer::new(health_service)
    }
}

impl NamedService for HealthService {
    fn get_name() -> String {
        <HealthServer<Self> as ServiceName>::NAME.to_string()
    }
}

#[tonic::async_trait]
impl Health for HealthService {
    async fn check(&self, request: Request<HealthCheckRequest>) -> HealthServiceResult<HealthCheckResponse> {
        if self.services.contains(&request.get_ref().service) {
            let response = Response::new(HealthCheckResponse {
                status: ServingStatus::Serving as i32,
            });

            Ok(response)
        } else {
            Err(Status::new(Code::NotFound, ""))
        }
    }

    type WatchStream = Stream;

    async fn watch(&self, _request: Request<HealthCheckRequest>) -> HealthServiceResult<Self::WatchStream> {
        Err(Status::unimplemented("not implemented"))
    }
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    LogTracer::init()?;

    let tracing_fmt_subscriber = FmtSubscriber::builder()
        .with_env_filter(EnvFilter::from_default_env())
        .finish();
    tracing::subscriber::set_global_default(tracing_fmt_subscriber)?;

    let config = Config::new();

    let addr = format!("{}:{}", config.service_ip, config.service_port).parse()?;

    let router = Server::builder()
        .add_service(RedactedService::new(&config));

    let router = router.add_service(HealthService::new(vec![
        RedactedService::get_name(),
    ]));

    router.serve(addr).await?;

    Ok(())
}

@xd009642
Copy link
Contributor

xd009642 commented Dec 1, 2019

I started something and used some of @rlabrecque's work to go off https://github.com/xd009642/tonic-health I was thinking of using a spmc queue to effectively signal the wait method and potentially adding the signal handling for graceful shutdown (although looking at it I think I won't do that here).

@xd009642
Copy link
Contributor

xd009642 commented Dec 1, 2019

Okay I'm not there yet but it's starting to look some what there. My wait implementation is still a bit up in the air though

@rlabrecque
Copy link

If you know how to get rid of the NamedService stuff, that would be fantastic!

@frederikbosch
Copy link

@xd009642 As I have also been busy with the health check service, I will share my experience, thanks for sharing yours.

From your HealthCheckService I extracted a ServiceRegister, and then I inject this ServiceRegister into the HealthCheckService. The consumer of the health check service communicates changes with the register.

The register can shared in threads. Now, my experience with Arc, Mutex and RwLock is limited. Now I use Arc<RwLock<ServiceRegister>> to inject into HealthCheckService but I am not sure if that is the best solution. Maybe it also depends on the use-case. If someone has more experience and has a suggestion in this regard, I would love to hear some feedback.

I have published my code as a gist. If @LucioFranco thinks this is going into the right direction, I could create a PR to have the register and service build into the library.

@xd009642
Copy link
Contributor

xd009642 commented Feb 3, 2020

@frederikbosch oh nice, I kept meaning to get back to this but there's so many other things I have to do. I'll try out your version and see if it sorts the issues I found with my initial impl 👍

@LucioFranco
Copy link
Member

This seems fine but I don't quite see how an end user would use this? Like how would you know that your service is unavailable? Do you have any code that shows how you may do that?

@daniel-emotech
Copy link

(so I'm also xd009642, just on my work account).

So the standard usage is if you implement graceful shutdown any pending requests will be finished, new requests rejected and the server close when the last requests are processed or a timeout has elapsed.

With that implementation, if you receive a SIGTERM the signal handler will set all the services to not serving, the services will check the registry when they receive a request and respond back with unavailable for new requests and the health check service will be send back not serving.

Anything else is less standard and more application specific

@frederikbosch
Copy link

@LucioFranco The following code is used by the end user.

// initialize service register on boot of app
let health_services = ServiceRegister::new(ServingStatus::Serving);
let guarded_services: Arc<RwLock<ServiceRegister>> = Arc::new(RwLock::new(health_services));

// inject the guarded services into some struct as service_register
pub struct SomeStruct {
    service_register: Arc<RwLock<ServiceRegister>>,
}

// and then call the service register to register the service
self.service_register.write().await.set_status("", ServingStatus::Serving).unwrap();

@LucioFranco
Copy link
Member

Ok, I think this makes sense to make a tonic-heath crate, that contains a service implementation for this. It should return a handle that can set certain services to not serving anymore. I'd like to avoid the user having to think about arc<mutex<>>. I'd be happy to vendor this within the tonic repo.

@frederikbosch
Copy link

frederikbosch commented Feb 5, 2020

I would love to learn how we can bypass Arc<Mutex<>>. I think I lack the skills for this (atm). And then I could create a PR for this.

@LucioFranco
Copy link
Member

@frederikbosch well easily, you can just make that an internal implementation detail :)

or you can use channels.

@frederikbosch
Copy link

frederikbosch commented Feb 5, 2020

@LucioFranco I did not use channels because that would require another thread, one for listening health check requests and one for awaiting updates from the channel. With the solution I presented above, you only need a thread for the health check requests. How to make Arc<Mutex<>> an implementation detail is something I would love learn.

@LucioFranco
Copy link
Member

Sure, so I think really what you need to do is just store the arc mutex internally to some struct and expose easy to use async fn to set the correct items.

@LucioFranco LucioFranco added this to the 0.2 milestone Mar 27, 2020
jen20 added a commit that referenced this issue Mar 27, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
@jen20 jen20 linked a pull request Mar 27, 2020 that will close this issue
5 tasks
jen20 added a commit that referenced this issue Mar 28, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
jen20 added a commit that referenced this issue Mar 30, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
jen20 added a commit that referenced this issue Mar 30, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
jen20 added a commit that referenced this issue Mar 30, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
jen20 added a commit that referenced this issue Mar 30, 2020
This commit adds a new crate `tonic-health` which implements the
[standard GRPC Health Checking][checking] protocol.

Currently there is only a server implementation, though others have
alluded in the discussion in #135 that client implementations exist
which could also be imported as necessary.

A example server has also been added - once the client work is done a
client for this should be added also.

[checking]: https://github.com/grpc/grpc/blob/master/doc/health-checking.md

Fixes #135.
@frederikbosch
Copy link

Awesome

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants