Allow secure communication between components #458

Closed
4 of 7 tasks
jpkrohling opened this issue Oct 6, 2017 · 25 comments

@jpkrohling
Contributor

jpkrohling commented Oct 6, 2017

Update 2019-09-20: replaced by #1718

Document and/or implement secure communication channels[1] between components, like:

This is related to #404.

1 - TLS for HTTP, but not sure how it would work with thrift

@jpkrohling
Contributor Author

Current state:

Jaeger relies on other libraries to handle communication with remote components, such as storage access (Cassandra / Elasticsearch). Its HTTP endpoints are not encrypted, which could (and probably should) be solved by placing a reverse proxy in front of the component to be protected. The tracer components are able to send data via HTTPS to a remote collector server. The Agent is not yet able to send data to the collector over a secure communication channel.

@yurishkuro yurishkuro changed the title Allow security communitication between components Allow secure communication between components Oct 6, 2017
@Dieterbe
Contributor

Dieterbe commented Nov 3, 2017

FYI my use case:

  • we have various k8s clusters in various locations (datacenters) in the US (though in the future likely in other continents) and using various cloud providers
  • in each k8s cluster we have various services generating spans and complete traces (currently a single trace always comes from 1 location only, we don't have requests that span across multiple locations, though that may come later)
  • we want 1 central jaeger deployment in our 'ops' cluster, it has a multi-node cassandra cluster in that cluster, the cluster does not extend into the other locations across the internet. everything (cassandra etc) is contained within that one location
  • i honestly don't care much whether to run the collector centrally in the ops cluster, or run collectors in each location and then those collectors talk to the central ops cassandra over the internet (though that does sound a bit weird). given that collectors can be scaled linearly (at least that's what it looks like) it seems better to run them centrally, and then have agents in each location talk over the internet to the collectors (that's also what people seem to be recommending)
  • obviously we don't want people to be able to sniff our jaeger traffic, because it contains confidential information, and we don't want anyone to be able to send crap into our environment, so we need encryption+authentication.
  • this prompted me to enquire with my infra folks about a vpn/secure tunnel between the locations and the central ops cluster. they informed me of various limitations ("some of our cloud providers do not provide a layer2 network, so it is not possible to add custom routes", "in some locations the nodes/pods in each cluster cant connect out"), so they're asking instead for an application level solution such as https with auth.
  • i'm not sure if going from tchannel to http(s) is a big performance downgrade; a secure encrypted/authenticated tchannel would also work for me, I suppose.
  • simple solutions are good solutions, maybe it's just a matter of terminating ssl and basic auth via a kubernetes ingress

@yurishkuro
Member

Agent to collector path is using tchannel for legacy reasons. I would much rather use grpc, which will have standard support for https.

@yurishkuro yurishkuro added this to the Core Infra Best Practices milestone Nov 30, 2017
@rbtcollins
Contributor

Hi @Dieterbe we had much the same use case, with a variation:

  • we're deliberately not including confidential info (PII, customer data) in our traces: we want the traces to be accessible to all the teams and not have to try to do masking on a per-span basis!

So what we've done is deploy cassandra and the query centrally, and then put an agent on every node via a daemonset (to avoid the per-pod overheads of sidecars), and a collector ha pair for the whole k8s cluster, then used TLS client certs to secure the collector -> cassandra traffic, and the user -> query traffic.

We had to improve some bits of Jaeger to permit this, but I think they have all been merged now, though I haven't fully verified the dependency job change in prod for us (soon though).

We aren't worried about sniffing of agent -> collector traffic w/in our k8s clusters, and the rest is secured (or localhost only).

@yurishkuro
Member

Cf #773 for gRPC work.

One question I have about using HTTPS is what's the accepted practice for certificates? Are we ok to use some internally generated certificate for the servers in the collectors? If someone has a link to a blog post discussing this it would be appreciated.

@sneko

sneko commented May 6, 2018

@yurishkuro I'm using ES operator (https://github.com/upmc-enterprises/elasticsearch-operator) to manage ES clusters on Kubernetes. The operator can set up Kibana and Cerebro at same time while enabling secured communication over HTTPS.

They are using an opaque secret to store different files related to certs:

Name:         es-certs-elasticsearch-cluster
Namespace:    logging
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
kibana.pem:         1631 bytes
cerebro-key.pem:    1675 bytes
kibana-key.pem:     1679 bytes
cerebro.pem:        1631 bytes
node-key.pem:       1679 bytes
node-keystore.jks:  3506 bytes
node.pem:           1631 bytes
truststore.jks:     1032 bytes
ca-key.pem:         1679 bytes
ca.pem:             1367 bytes

Then they are mounting a volume at /elasticsearch/config/certs that Kibana/Cerebro can use.

I'm not sure you were expecting this kind of information, or if it's the best but that's a possible way to secure Jaeger <-> ESCluster 😃

@jpkrohling
Contributor Author

One question I have about using HTTPS is what's the accepted practice for certificates?

IMO, it's sufficient for us to just add a couple of configuration options:

  • what cert to offer to clients
  • what the key is to decrypt the content the client is sending
  • which CA cert to use when trusting server certs

Platforms like OpenShift and Kubernetes are able to generate certs on demand via an internal CA, as well as rotate the certs/keys based on certain rules. This is not the kind of knowledge we want within our code.
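The three options listed above can be sketched in Go as a thin layer over `crypto/tls`. This is only an illustration under stated assumptions: the `tlsOptions` type, its field names, and `buildTLSConfig` are hypothetical and are not part of Jaeger. The point is that the code only loads files and passes them on; issuing and rotating certificates stays with the platform.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// tlsOptions is a hypothetical holder for the three settings from the list above.
type tlsOptions struct {
	CertPath string // certificate offered to the remote side
	KeyPath  string // private key matching CertPath
	CAPath   string // CA bundle used when deciding whether to trust a server certificate
}

// buildTLSConfig loads the configured files and hands them to crypto/tls.
// It knows nothing about how the files were issued or when they rotate.
func buildTLSConfig(o tlsOptions) (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}

	// A certificate is useless without its key, and vice versa.
	if (o.CertPath == "") != (o.KeyPath == "") {
		return nil, errors.New("cert and key must be set together")
	}
	if o.CertPath != "" {
		pair, err := tls.LoadX509KeyPair(o.CertPath, o.KeyPath)
		if err != nil {
			return nil, fmt.Errorf("load key pair: %w", err)
		}
		cfg.Certificates = []tls.Certificate{pair}
	}
	if o.CAPath != "" {
		pemBytes, err := os.ReadFile(o.CAPath)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pemBytes) {
			return nil, errors.New("no certificates found in CA file")
		}
		cfg.RootCAs = pool
	}
	return cfg, nil
}

func main() {
	// With nothing configured, the result is a plain config that uses the system trust store.
	cfg, err := buildTLSConfig(tlsOptions{})
	fmt.Println(cfg != nil, err)
}
```

Leaving `CAPath` empty keeps `RootCAs` nil, which makes `crypto/tls` fall back to the host's trust store.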

@rbtcollins
Contributor

@yurishkuro The key behavioural decisions deployers will be making are:

  • what CA to issue client certs from
    • often becomes a custom trust root on the server side
  • what CA to issue server certs from (they can be different)
    • if a private CA will be a trust root on the clients

This translates into the config options that @jpkrohling mentioned, though that list is incomplete.

The full set for a single direction of authentication is:

  • public side of cert
  • cert to present (aka private key)
  • trust chain for verification of certs (sometimes delivered in the public side of the server cert, but logically separate)
  • trust root (to be installed on the component verifying the creds). Note that for client certs this is typically supplied explicitly, even if it is a sub-CA, because you don't want any cert from the root CA to be considered a valid client.

So there are up to 8 unique config values in the most complex case of having two private CA's.

@jpkrohling
Contributor Author

The full set for a single direction of authentication is:

We are talking about different things here. You are probably talking about Mutual TLS authentication, whereas I'm talking about only encrypting the communication channel.

I still believe that auth should be handled at the infra layer. Mutual TLS Auth fits this scenario and can be easily accomplished by tools like Istio. At most, we should allow clients to send auth data (basic HTTP auth, bearer tokens), but that's it.

To allow secure communications, on the other hand, all we need to do is pass the cert data to the underlying handler, so, there's minimal code on our side.

@rbtcollins
Contributor

@jpkrohling If it's just the channel that needs encrypting, OE can be used without any certificate authority at all: I believe you're really talking about authenticating the well known endpoint and encrypting the channel, otherwise no CA would be involved in the discussion.

There's minimal code on our side for handling client certificates as well: it's really quite straightforward. I think that we should either say 'deploy all our components behind a service-mesh or similar layer, running only on localhost and using an outbound proxy', or support things fully. Doing half-a-TLS support is worse than none IMO because it leads folk into a setup that cannot grow with them.

@jpkrohling
Contributor Author

If it's just the channel that needs encrypting, OE can be used without any certificate authority at all: I believe you're really talking about authenticating the well known endpoint and encrypting the channel, otherwise no CA would be involved in the discussion.

The Certificate Authority is to tell the client side of the communication that the cert being offered by the server is to be trusted. Otherwise, there could be a man in the middle intercepting the traffic. It's particularly relevant if the server certificate was generated by an internal CA like Kubernetes' Service CA.
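To make the role of the CA concrete, here is a small self-contained Go sketch. It assumes a throwaway HTTPS server from `net/http/httptest`, whose certificate is not signed by any public CA, as a stand-in for a component with an internally issued certificate. The helper names `get` and `demo` are made up for this example.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// get fetches url with a client that trusts only the CAs in pool.
// A nil pool means "use the system trust store".
func get(url string, pool *x509.CertPool) error {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool},
	}}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// demo contacts the same server twice: once without and once with
// the issuing certificate installed as a trust root on the client.
func demo() (withoutCA, withCA error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	// The client has no reason to trust the certificate, so verification should fail.
	withoutCA = get(srv.URL, nil)

	// Installing the test certificate as a trust root lets verification succeed.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())
	withCA = get(srv.URL, pool)
	return withoutCA, withCA
}

func main() {
	withoutCA, withCA := demo()
	fmt.Println("rejected with system roots only:", withoutCA != nil)
	fmt.Println("accepted with internal CA trusted:", withCA == nil)
}
```

The first request fails at certificate verification rather than at the network level, which is exactly the check that stops an intermediary from presenting its own certificate.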

(I think I should know what OE is about, but I'm currently having a blank...)

There's minimal code on our side for handling client certificates as well: it's really quite straightforward

If we are just delegating CLI options to the underlying library, I'm all for it. But it should not be a feature of Jaeger.

Doing half-a-TLS support is worse than none IMO because it leads folk into a setup that cannot grow with them

Client Auth Cert is quite different and significantly more complex than just encrypting a pipe using TLS. I don't think we should mix this issue with auth at all.

@jpkrohling
Contributor Author

If we are just delegating CLI options to the underlying library, I'm all for it. But it should not be a feature of Jaeger.

I mean something like what is being requested by #678

@justinclift

justinclift commented Jan 2, 2019

Looking at this too, for an initial small deployment on servers in Scaleway.

It seems like Jaeger Query doesn't (at present) have any support for clients wanting to access it via HTTPS/TLS.

That part should be fairly straight forward to implement, as (in the simplest case) it's just a slightly different Go library call. http.ListenAndServeTLS() instead of http.ListenAndServe()

The TLS version of the call just needs a certificate file and key file supplied.

For our use case, they'd be generated by LetsEncrypt. The cert and key files would be passed via command line, or config file argument. Something like:

  • --query.certificate-file string        Path to the TLS certificate file
  • --query.certificate-key-file string Path to the key file for the TLS certificate

Does that sound reasonable? 😄
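A minimal sketch of that idea, using only the Go standard library. The two certificate flag names are the ones proposed above. The `listen` flag, the placeholder handler, and the `useTLS` helper are assumptions made for illustration and do not reflect how Jaeger Query is actually wired.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"log"
	"net/http"
)

// newHandler stands in for the real query service handler.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "query service placeholder")
	})
	return mux
}

// useTLS reports whether TLS should be enabled. Supplying only one
// of the two files is treated as a configuration error.
func useTLS(certFile, keyFile string) (bool, error) {
	switch {
	case certFile == "" && keyFile == "":
		return false, nil
	case certFile == "" || keyFile == "":
		return false, errors.New("certificate file and key file must be given together")
	default:
		return true, nil
	}
}

func main() {
	addr := flag.String("listen", "", "address to listen on, e.g. 127.0.0.1:16686 (hypothetical flag)")
	certFile := flag.String("query.certificate-file", "", "Path to the TLS certificate file")
	keyFile := flag.String("query.certificate-key-file", "", "Path to the key file for the TLS certificate")
	flag.Parse()

	if *addr == "" {
		flag.Usage()
		return
	}
	secure, err := useTLS(*certFile, *keyFile)
	if err != nil {
		log.Fatal(err)
	}
	if secure {
		// Same handler, different listener call.
		log.Fatal(http.ListenAndServeTLS(*addr, *certFile, *keyFile, newHandler()))
	}
	log.Fatal(http.ListenAndServe(*addr, newHandler()))
}
```

Rejecting a half-configured pair up front avoids silently falling back to plain HTTP when one of the two paths is mistyped.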

@yurishkuro
Member

We have precedent for TLS for storage, so should be using consistent flag names, e.g.

      --cassandra.tls                                   Enable TLS
      --cassandra.tls.ca string                         Path to TLS CA file
      --cassandra.tls.cert string                       Path to TLS certificate file
      --cassandra.tls.key string                        Path to TLS key file
      --cassandra.tls.server-name string                Override the TLS server name
      --cassandra.tls.verify-host                       Enable (or disable) host key verification (default true)

      --es.tls                                     Enable TLS
      --es.tls.ca string                           Path to TLS CA file
      --es.tls.cert string                         Path to TLS certificate file
      --es.tls.key string                          Path to TLS key file
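One way to keep such names consistent is to register them from a single helper that takes the component prefix. The sketch below uses the standard library `flag` package purely for illustration, which is an assumption about tooling and not a description of Jaeger's own flag handling. `tlsFlags`, `addTLSFlags`, and `parseTLSFlags` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
)

// tlsFlags holds the values of one component's TLS flags.
type tlsFlags struct {
	Enabled bool
	CA      string
	Cert    string
	Key     string
}

// addTLSFlags registers <prefix>.tls, <prefix>.tls.ca, <prefix>.tls.cert
// and <prefix>.tls.key so that every component spells them the same way.
func addTLSFlags(fs *flag.FlagSet, prefix string) *tlsFlags {
	o := &tlsFlags{}
	fs.BoolVar(&o.Enabled, prefix+".tls", false, "Enable TLS")
	fs.StringVar(&o.CA, prefix+".tls.ca", "", "Path to TLS CA file")
	fs.StringVar(&o.Cert, prefix+".tls.cert", "", "Path to TLS certificate file")
	fs.StringVar(&o.Key, prefix+".tls.key", "", "Path to TLS key file")
	return o
}

// parseTLSFlags is a convenience wrapper used by the example and its checks.
func parseTLSFlags(prefix string, args []string) (*tlsFlags, error) {
	fs := flag.NewFlagSet(prefix, flag.ContinueOnError)
	o := addTLSFlags(fs, prefix)
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return o, nil
}

func main() {
	o, err := parseTLSFlags("query", []string{
		"--query.tls",
		"--query.tls.cert=/etc/tls/tls.crt",
		"--query.tls.key=/etc/tls/tls.key",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *o)
}
```

Because the names are generated from the prefix, a misspelled flag on the command line is rejected by the parser instead of being silently ignored.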

@justinclift

justinclift commented Jan 2, 2019

Ahhh. So more like this?

--query.tls.cert string   Path to TLS certificate file
--query.tls.key string    Path to TLS key file

@justinclift

Hmmm, it should be possible to provide a query.tls.ca option as well, but I'd have to look into it more. Pretty sure it just means the TLS setup needs to be done a bit differently first, but that's from dodgy memory and it's been ages since I wrote TLS specific handling code. 🤷‍♂️

@jpkrohling
Contributor Author

An HTTP server typically sets only a cert (chain) and a key. The cert chain would include the CA that was used to sign the server's own cert and all upstream CAs.

@iori-yja

iori-yja commented Feb 28, 2019

TLS option is good for collector's http as well.

Use case:
I am trying to report from AWS lambda which is usually running outside of AWS VPC, which requires collector to listen to the internet request. To keep bearer secret, it would be nice to have TLS connection on tracer->collector communication.

@jpkrohling
Contributor Author

To keep bearer secret, it would be nice to have TLS connection on tracer->collector communication.

On the backend side, a reverse proxy could be used for this purpose. On the client side, the env var JAEGER_ENDPOINT can be used with some clients, where an HTTPS URL would be specified.
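As a rough illustration of the client-side half, the sketch below reads `JAEGER_ENDPOINT` and refuses to proceed when a bearer token would travel over plain HTTP. The `checkEndpoint` helper is hypothetical and not part of any Jaeger client. Whether a given client library performs a check like this is not something this thread establishes.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// checkEndpoint rejects collector endpoints that would expose a bearer
// token in clear text. Endpoints without a token are allowed over HTTP.
func checkEndpoint(raw string, sendsToken bool) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid endpoint %q: %w", raw, err)
	}
	if u.Host == "" {
		return fmt.Errorf("endpoint %q has no host", raw)
	}
	if sendsToken && u.Scheme != "https" {
		return fmt.Errorf("refusing to send a bearer token over %q; use https", u.Scheme)
	}
	return nil
}

func main() {
	endpoint := os.Getenv("JAEGER_ENDPOINT")
	if endpoint == "" {
		fmt.Println("JAEGER_ENDPOINT is not set")
		return
	}
	if err := checkEndpoint(endpoint, true); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("endpoint is acceptable for token auth:", endpoint)
}
```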

@jpkrohling
Contributor Author

With the inclusion of gRPC between the agent and the collector, I think this item is complete; the only missing pieces are official documentation about securing the UI/Query and about the communication between the client and agent.

@jpkrohling jpkrohling self-assigned this Apr 2, 2019
@tcolgate
Contributor

tcolgate commented Jun 7, 2019

The existing gRPC TLS code doesn't support authenticating the clients. In TLS terms, the normal thing to do is allow the clients to present a key/cert, and have the server verify that against a CA

I've taken the liberty of putting together a PR, #1591
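For readers unfamiliar with the mechanism, here is a generic Go sketch of server-side client certificate verification with `crypto/tls`. It is not the code from the PR. `requireClientCerts` and `newThrowawayCAPEM` are hypothetical helpers, and the CA is generated in memory only so the example runs on its own. A gRPC server would typically wrap the resulting config with `credentials.NewTLS` from `google.golang.org/grpc/credentials`, which is left out here to keep the example dependency-free.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// requireClientCerts returns a copy of base that rejects any client whose
// certificate does not chain to one of the CAs in clientCAPEM.
func requireClientCerts(base *tls.Config, clientCAPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(clientCAPEM) {
		return nil, errors.New("no usable client CA certificate found")
	}
	cfg := &tls.Config{}
	if base != nil {
		cfg = base.Clone()
	}
	cfg.ClientCAs = pool
	cfg.ClientAuth = tls.RequireAndVerifyClientCert
	return cfg, nil
}

// newThrowawayCAPEM creates a short-lived self-signed CA certificate,
// purely so the example has something to load.
func newThrowawayCAPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example client CA"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	caPEM, err := newThrowawayCAPEM()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A real server would start from a config that already carries its own certificate.
	cfg, err := requireClientCerts(&tls.Config{MinVersion: tls.VersionTLS12}, caPEM)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("client auth policy:", cfg.ClientAuth)
}
```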

@yurishkuro
Member

@tcolgate go for it!

We also need to implement some basic auth and/or API key

@yurishkuro
Member

replaced by #1718

@karnveerayush

Hi All,

Is there a way to host Jaeger UI over HTTPS rather than HTTP?

If it is possible, what are the steps required to achieve that?

Thanks

@pavolloffay
Member

You would have to use a separate component to secure the UI/query service. Here is one blog post that might help you: https://medium.com/jaegertracing/protecting-jaeger-ui-with-an-oauth-sidecar-proxy-34205cca4bb1?source=collection_detail----99735986d50-----37-----------------------

Also, our Kubernetes operator is able to take care of securing the UI.
