proposal for v3 auth #4475

xiang90 · 2016-02-10T03:39:02Z

We have a role-based auth system for v2. We are pretty happy with its simplicity and flexibility.

For v3, we will probably adopt a similar role-based system. But we need to solve a few things:

the auth should be connection-based, not request-based. We want to use the gRPC handshake or a cert to solve this.
the auth management API should always return the detailed view of roles and users.

raoofm · 2016-02-10T18:33:22Z

> the auth should be connection-based, not request-based. We want to use the gRPC handshake or a cert to solve this.

Correct me if I'm wrong, but I think this will partially solve the current performance issue with etcd auth mentioned in #3223.

If it doesn't, then what is the plan to solve this issue for v3?

xiang90 · 2016-02-10T18:41:04Z

> If it doesn't, then what is the plan to solve this issue for v3?

It will be almost the same, just at the gRPC layer.

mitake · 2016-02-15T14:30:46Z

@xiang90 I have a question about non-quorum reads in the v2 protocol. Will they be provided in v3?

It is important for auth, because a non-quorum read allows stale results but improves performance. It is hard to integrate such functionality with auth, because auth information should also be stored in the raft storage, and checking that information in a consistent manner would sacrifice the benefit of stale reads.

IIUC, the lease mechanism can provide consistent and low-cost reads in v3, so non-quorum reads will not be required. This is an important point for the new auth mechanism, so I want to confirm it first.

xiang90 · 2016-02-15T15:56:49Z

@mitake v3 enables quorum reads (q-read) by default. https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto#L121-L126

mitake · 2016-02-15T23:54:15Z

@xiang90 I didn't notice the flag, thanks!

For the reason I described above, it is difficult for non-quorum reads to coexist with the auth mechanism. For simplicity, rejecting non-quorum reads when auth is enabled would be safer. Is that acceptable?

xiang90 · 2016-02-16T04:23:07Z

> For the reason I described above, it is difficult for non-quorum reads to coexist with the auth mechanism. For simplicity, rejecting non-quorum reads when auth is enabled would be safer. Is that acceptable?

As we discussed:

admin should first blacklist all requests and then add role/perms to whitelist.
auth is not designed for frequent role/perm changes.

What you mentioned might make sense, but it is not what we wanted to ensure from the very beginning.

If you have a specific use case that requires it, please share it with us. Then we can reconsider it.

mitake · 2016-02-16T04:42:57Z

> admin should first blacklist all requests and then add role/perms to whitelist.
> auth is not designed for frequent role/perm changes.

Should these operational restrictions be introduced in v3, too? I thought

> the auth management API should always return the detailed view of roles and users.

means an inconsistent state isn't allowed in v3. Am I misunderstanding?

heyitsanthony · 2016-02-16T04:58:43Z

@mitake to be clear, does this describe the auth problem?

A requests role change to deny B's access to "k"; quorum
A receives role change response; revision = n
B requests data for "k", non-quorum
B receives data for "k"; revision < n
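
A minimal sketch of the sequence above, assuming a toy model rather than etcd itself: the `Replica` class, its fields, and the log format are all hypothetical. It shows a lagging follower serving "k" to B at a revision below n, while the leader already rejects the same read.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that applies a shared log one entry at a time."""
    revision: int = 0
    data: dict = field(default_factory=dict)
    denied: set = field(default_factory=set)  # (user, key) pairs with access denied

    def apply(self, entry):
        kind, payload = entry
        if kind == "put":
            key, value = payload
            self.data[key] = value
        elif kind == "deny":
            self.denied.add(payload)
        self.revision += 1

    def read(self, user, key):
        # The check uses this replica's local state, which may be stale.
        if (user, key) in self.denied:
            raise PermissionError(f"{user} may not read {key!r}")
        return self.data[key], self.revision


log = [("put", ("k", "secret")), ("deny", ("B", "k"))]

leader, follower = Replica(), Replica()
for entry in log:
    leader.apply(entry)          # leader has applied the denial: revision n = 2
follower.apply(log[0])           # follower lags behind: revision 1 < n

value, rev = follower.read("B", "k")   # non-quorum read served by the follower
stale_read_succeeded = rev < leader.revision

try:
    leader.read("B", "k")        # a read through the leader sees the denial
    leader_denied = False
except PermissionError:
    leader_denied = True
```

In this model the follower's answer is correct for its own revision; the inconsistency only appears when compared against the revision A was told about.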

mitake · 2016-02-16T05:02:17Z

@heyitsanthony yes. I'm thinking about the case you described.

heyitsanthony · 2016-02-16T05:18:35Z

@mitake OK, is there a threat that's handled with consistent auth? I don't think it fixes stale reads. For example, B can still set up a proxy that watches on "k" which would serve a stale "k" even after the access denial goes through.

mitake · 2016-02-16T05:39:03Z

@heyitsanthony of course proxies will introduce a chance of inconsistency. But I don't understand how the proxy will interact with other features in the v3 protocol (it will be redesigned for v3, right? #4318).

I was thinking that lease and watch provide consistent client-side caching for improving read performance. Therefore I still don't understand the benefit of serializable range in v3. So I'd like to separate it from the auth problem for now.

Of course I might be missing something. I'd be happy if you could kindly point out any misunderstanding :)

heyitsanthony · 2016-02-16T06:03:40Z

@mitake I was talking about a rogue proxy with B's credentials which ignores auth. My point is that any data available prior to the denial revision is already as good as leaked to the client; read consistency with auth can't take it back.

mitake · 2016-02-16T06:54:54Z

@heyitsanthony OK, I understand your point, thanks for the explanation. It seems reasonable because newly created values (after enabling auth) cannot be read without auth. The commit message that enables auth must be received by the followers' state machines before the newly created values arrive (it took me some time to understand this, sorry!).

I have another question (sorry for the frequent questions): if v3 auth provides per-connection auth (as described in the first message of this issue), the inconsistent state would matter, because even after a connection is authorized, that authorization becomes invalid once the auth configuration changes. How should we provide an efficient auth mechanism?

I'm thinking that using the authentication mechanism of gRPC (as the current implementation does) and letting messages from clientv3 carry permission information (user ID, password) would be the simplest way. The permission check would be done by the etcd servers when processing every read and write request. Is that OK, @xiang90?

mitake · 2016-02-18T04:42:10Z

@xiang90 @heyitsanthony I'm writing a design doc for the v3 auth and ACL here: https://docs.google.com/document/d/1Zs5lc4H8kHr4sgoniKlL6YRsLKFNy8m-1pyGhkCmmO4/edit?usp=sharing

I'd be really glad if you could review and comment on it.

heyitsanthony · 2016-02-18T07:56:39Z

@mitake took a look, left some comments. Thanks for taking the time to write the doc.

mitake · 2016-02-22T07:08:07Z

@xiang90 @heyitsanthony thanks for your comments on the doc.

It seems that we have almost reached consensus on the design. I'd like to have a clear consensus on these points:

every request requires a permission check
the leader node checks permissions for linearizable reads and writes
both the leader and the followers check permissions for non-linearizable reads

Is this OK with you?
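
The three points above can be sketched roughly as follows. This is only an illustration under assumptions: the `Role` type, the prefix-based matching (borrowed from the v2-style `"read": ["/fleet/"]` permissions), and both function names are hypothetical, not the design doc's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role holding key prefixes it may read or write."""
    name: str
    read: tuple = ()
    write: tuple = ()


def is_permitted(roles, op, key):
    """Return True if any role grants `op` ("read" or "write") on a prefix of `key`."""
    for role in roles:
        prefixes = role.read if op == "read" else role.write
        if any(key.startswith(prefix) for prefix in prefixes):
            return True
    return False


def nodes_that_check(op, linearizable):
    """Which kinds of node run the permission check for a given request."""
    if op == "write" or linearizable:
        return {"leader"}
    # A non-linearizable read may be served by any node, so any node must check.
    return {"leader", "follower"}


fleet = Role("fleet", read=("/fleet/",), write=("/fleet/",))
can_read = is_permitted([fleet], "read", "/fleet/units")
can_write_elsewhere = is_permitted([fleet], "write", "/other/key")
```

Every request goes through `is_permitted`; `nodes_that_check` only decides where that call happens.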

xiang90 · 2016-02-22T07:21:17Z

@mitake Authentication per connection. Re-Authentication when the password of a user is changed.

mitake · 2016-02-22T07:27:18Z

@xiang90 yes, I agree with your points. What do you think about the design of the permission check?

mitake · 2016-02-25T23:22:45Z

@xiang90 @heyitsanthony updated the doc with an implementation plan. Could you take a look?

mitake · 2016-03-01T06:00:30Z

@xiang90 listed RPCs in the doc. Could you review? Can I start implementation?

xiang90 · 2016-03-01T06:28:31Z

@mitake RPC looks good. Let's write the proto and send a pull request?

mitake · 2016-03-01T06:42:26Z

@xiang90 thanks, I'll send a PR later.

rominf · 2016-03-04T06:55:03Z

It seems to me that it's only possible to set plaintext passwords via the v2 API. Is that correct? If so, I propose allowing passwords to be set as a hash.

mitake · 2016-03-04T06:58:15Z

@rominf Even in the v2 API, passwords are stored in the etcd store after hashing. Do you want to hash passwords on the client side?

rominf · 2016-03-04T07:05:58Z

@mitake Yes, I do. I'm talking about changing:

Change password

PUT /v2/security/users/charlie/password

Sent Headers:
    Authorization: Basic <BasicAuthString>
Put Body:
    {"user": "charlie", "password": "newCharliePassword"}
Possible Status Codes:
    200 OK
    403 Forbidden
    404 Not Found
200 Headers:
    ETag: "users/charlie:<tzNow>"
200 Body:
    JSON user struct, updated

to

Change password

PUT /v2/security/users/charlie/password

Sent Headers:
    Authorization: Basic <BasicAuthString>
Put Body:
    {"user": "charlie", "password": "newCharliePassword"}
    OR using a salted SHA-512 hashed password:
    {"user": "charlie", "password": "$6$ke3amfbPHV0WalYF$df2lVawKuP4gc.gpZuGMlmA27R4bvuKgf.YO/0bRR1tQZEIQ1kpXVxPmQyZ31.ieGFuJvU7UtXyibVRZtGamT/"}
Possible Status Codes:
    200 OK
    403 Forbidden
    404 Not Found
200 Headers:
    ETag: "users/charlie:<tzNow>"
200 Body:
    JSON user struct, updated

mitake · 2016-03-04T07:13:53Z

@rominf I understand. @xiang90 @heyitsanthony what do you think about the idea of client-side password hashing? I think it is reasonable.

heyitsanthony · 2016-03-04T07:19:46Z

@mitake I'm not sure if it really improves security; the hashed password can still be used in a replay attack unless the server provides a nonce for the hash salt.
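
A minimal sketch of the replay concern, assuming a simple challenge-response shape in which the server issues a one-time nonce and the client answers with an HMAC over it. This is one common pattern, not necessarily the exact scheme meant above; the `Server` class, the PBKDF2 parameters, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def stored_hash(password: str, salt: bytes) -> bytes:
    """What the server keeps (assumption: salted PBKDF2-HMAC-SHA512)."""
    return hashlib.pbkdf2_hmac("sha512", password.encode(), salt, 10_000)


def client_response(pw_hash: bytes, nonce: bytes) -> bytes:
    """Client proves knowledge of the hash without sending the hash itself."""
    return hmac.new(pw_hash, nonce, hashlib.sha512).digest()


class Server:
    def __init__(self, pw_hash: bytes):
        self.pw_hash = pw_hash
        self.outstanding = set()  # nonces issued but not yet used

    def challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self.outstanding.add(nonce)
        return nonce

    def verify(self, nonce: bytes, response: bytes) -> bool:
        if nonce not in self.outstanding:
            return False  # unknown or already-used nonce: replay rejected
        self.outstanding.discard(nonce)
        expected = hmac.new(self.pw_hash, nonce, hashlib.sha512).digest()
        return hmac.compare_digest(expected, response)


salt = secrets.token_bytes(16)
pw_hash = stored_hash("newCharliePassword", salt)
server = Server(pw_hash)

nonce = server.challenge()
response = client_response(pw_hash, nonce)
first_attempt = server.verify(nonce, response)   # genuine exchange
replayed = server.verify(nonce, response)        # same bytes sent again
```

Without the nonce, a captured hash is as good as the password itself, which is the point being made above. Note that in this sketch the stored hash is still password-equivalent for anyone who obtains it.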

rominf · 2016-03-04T07:57:39Z

What's the purpose of the nested dictionary for permissions? Why:

 { "role" : "fleet", "permissions" : { "kv" : { "read" : [ "/fleet/" ], "write": [ "/fleet/" ] } }, "grant" : {"kv": {...}}, "revoke": {"kv": {...}} }

but not:

 { "role" : "fleet", "permissions" : { "read" : [ "/fleet/" ], "write": [ "/fleet/" ] }, "grant" : {...}, "revoke": {...} }

?

mitake · 2016-04-14T06:07:42Z

@xiang90 @heyitsanthony I'd like to use JWT (https://jwt.io/) for the token of v3 auth. JWT seems to be a simple and mature token format. In addition, it enables verifying tokens with an RSA key pair, so it would be friendly to the new proxy for the v3 API. Proxies would be able to verify requests from clients, which would be a great help for reducing read traffic.

I'd like to reach a consensus about using JWT (it will introduce a new vendored dependency; I'm considering using https://github.com/dgrijalva/jwt-go). Can I hear your opinion?

xiang90 · 2016-04-14T14:38:00Z

@mitake What do you want to put inside the JWT? What benefits will JWT bring us in our use case besides the proxy (the proxy should either terminate TLS/auth or just be transparent, in my opinion)? I would imagine a simple opaque random token would do the job.

mitake · 2016-04-15T13:44:48Z

@xiang90 The main benefits of JWT are as follows:

It can provide a signed token that cannot be created illegitimately, so the token helps protect data from malicious or buggy clients. At first I also thought a random token would be enough for this purpose (as described in the doc). However, it is hard to analyze how much randomness is required to prevent problems caused by malicious or buggy clients. JWT is built on proven techniques, so using it is safer than using my own ad-hoc solution.
With JWT, a server can verify a token sent from a client with its RSA public key. If proxies share a common key with servers, proxies can verify requests from clients by themselves. This will help reduce get request traffic from proxy to server (the most important feature of a proxy, I think).

xiang90 · 2016-04-15T14:23:44Z

> It can provide a signed token that cannot be created illegitimately, so the token helps protect data from malicious or buggy clients. At first I also thought a random token would be enough for this purpose (as described in the doc). However, it is hard to analyze how much randomness is required to prevent problems caused by malicious or buggy clients. JWT is built on proven techniques, so using it is safer than using my own ad-hoc solution.

As long as the token space and the signing key space are the same, I feel the problem is pretty much equivalent, unless you have a significant number of tokens compared to the token space. A client can guess a token, and a client can guess a signing key too.

> With JWT, a server can verify a token sent from a client with its RSA public key. If proxies share a common key with servers, proxies can verify requests from clients by themselves. This will help reduce get request traffic from proxy to server (the most important feature of a proxy, I think).

This is the benefit I can see that JWT actually provides, since it contains the auth info and that info is verifiable by both parties. However, the proxy issue you mentioned is different from what we are trying to solve right now. Token invalidation can be complicated. I feel we should first finish the simplest solution. Then we can explore a better token mechanism.

mitake · 2016-04-18T02:17:58Z

> A client can guess a token, and a client can guess a signing key too.

JWT's token space is constrained by the RSA-based signature, so guessing a valid token is harder than guessing a plain random number.

> Token invalidation can be complicated.

JWT allows tokens to carry various metadata, so we can let a token hold the storage revision at which its client was authorized. We can then check whether the authentication is obsolete by comparing revision numbers. It doesn't require a heavy invalidation mechanism.
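
A rough sketch of the revision-in-token idea, under stated assumptions. To stay within the standard library it signs with HMAC-SHA256 rather than RSA, and the token layout is a simplified stand-in, not real JWT. `SECRET`, `issue_token`, `check_token`, and the notion of an "auth revision" that increases whenever the auth configuration changes are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # hypothetical; the proposal above would use an RSA key pair


def issue_token(user: str, auth_revision: int) -> str:
    """Sign a payload recording the revision at which the user was authorized."""
    payload = json.dumps({"user": user, "rev": auth_revision}, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())


def check_token(token: str, current_auth_revision: int):
    """Return the user name if the token is authentic and not obsolete, else None."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or corrupted token
    claims = json.loads(payload)
    if claims["rev"] < current_auth_revision:
        return None  # auth config changed after this token was issued
    return claims["user"]


token = issue_token("alice", 5)
still_valid = check_token(token, 5)
after_config_change = check_token(token, 6)
```

Here a token becomes obsolete simply because the server's current revision has moved past the one embedded in it; no per-token revocation list is kept.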

> I feel we should first finish the simplest solution. Then we can explore a better token mechanism.

Actually, I want to use JWT to simplify the auth package without reinventing the wheel. My handmade token mechanism would bloat the package and would be abandoned in the near future. Using a well-designed and well-reviewed methodology will make the package simpler and minimize API changes, I think.

xiang90 · 2016-04-18T03:31:47Z

> Actually, I want to use JWT to simplify the auth package without reinventing the wheel.

What does JWT provide us beyond an opaque token right now?

> My handmade token mechanism

I feel it should be just a getToken method that randomly generates a fixed-length token. It won't take a lot of effort, and it allows us to evaluate what the token should be in the future, separately from the auth implementation.
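
A minimal sketch of such a getToken approach, assuming a 128-bit token drawn from the operating system's CSPRNG and a simple in-memory lookup table. `TOKEN_BYTES`, `get_token`, and `TokenStore` are hypothetical names chosen for illustration.

```python
import secrets

TOKEN_BYTES = 16  # assumption: 128 bits of randomness per token


def get_token() -> str:
    """Return a fixed-length opaque token that carries no information itself."""
    return secrets.token_hex(TOKEN_BYTES)


class TokenStore:
    """Hypothetical server-side table mapping opaque tokens to users."""

    def __init__(self):
        self._tokens = {}

    def assign(self, user: str) -> str:
        token = get_token()
        self._tokens[token] = user
        return token

    def lookup(self, token: str):
        # Unknown tokens map to None; validity is decided purely by this table.
        return self._tokens.get(token)


store = TokenStore()
token = store.assign("alice")
owner = store.lookup(token)
unknown = store.lookup("not-a-real-token")
```

Because the token is opaque, swapping `get_token` for another generator later would not change how callers use `TokenStore`.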

xiang90 · 2016-04-18T03:34:37Z

> JWT's token space is constrained by the RSA-based signature, so guessing a valid token is harder than guessing a plain random number.

I mean the attacker can guess the RSA private key anyway. I do not feel there is a big win. A long enough truly random token would be equally safe.

> JWT allows tokens to carry various metadata.

By invalidation, I mean on the proxy side. If we do not want the token to carry any meaningful information to be validated against, then we do not need JWT.

mitake · 2016-04-18T04:28:41Z

OK, then I'll implement the simple approach first. In any case, it would be useful for easy trials that don't require public/private keys.

But,

> I mean the attacker can guess the RSA private key anyway.

I don't think this is a realistic assumption. It would mean that public key cryptography cannot be used at all. And random number approaches always have the problem of exhausting entropy on machines.

aarondav · 2016-04-18T16:07:53Z

@mitake ACLs seem to be a part of the v3 auth design (they're already documented as a feature on the etcd home page!), but I don't think they're yet implemented -- correct me if I'm wrong. Is there a timeline for that part of the implementation? We'd really like to make use of path-based restrictions on users.

Edit: #2384 seems to indicate that there is some support for keyspace restrictions. Are ACLs fully implemented, or just keyspace restrictions per user? Apologies for my misunderstanding.

mitake · 2016-04-19T02:27:55Z

Hi @aarondav, thanks for your interest. As you found, the feature is already implemented in the v2 API. We are working on auth and ACL in the v3 API. v2 auth is already implemented and ready to use (it is included in the v2.3.0 release).
The v3 API feature will be finished in May. Will you use the v3 API? If so, could you wait for a while? I'll let you know about the progress.

Just curious: will you use etcd in Spark?

aarondav · 2016-04-19T02:37:18Z

Ahh, gotcha! Great, didn't realize that distinction between the v2 and v3 APIs.

We are actually using etcd via flannel in Kubernetes, and are experimenting with an architecture with multiple isolated networks sharing the same etcd, which requires that each network has access to only its sub-tree of etcd.

mitake · 2016-04-19T02:42:29Z

@aarondav I see, thanks. Flannel seems to be using the v2 API, so you can use the feature with etcd v2.3.0 or newer.

xiang90 · 2016-06-17T22:35:15Z

@mitake Anything left here for 3.0? Or can we close it?

mitake · 2016-06-18T03:07:38Z

@xiang90 yes. Although there are still remaining tasks, e.g. JWT tokens and revision handling, the basic functionality is already implemented. I agree with closing it.

xiang90 · 2016-06-18T13:15:37Z

@mitake Probably create issues for the things we want to do for the next steps: jwt + revision in token.

mitake · 2016-06-20T04:38:37Z

New issues related to auth v3:

revision aware permission checking: auth v3: revision aware permission checking #5719
jwt as token: auth v3: support jwt as auth token #5718

xiang90 added this to the v3.0.0 milestone Mar 10, 2016

shanegibbs mentioned this issue Mar 24, 2016

Support Basic Auth for Etcd v2 kubernetes/kubernetes#23398

Closed

xiang90 self-assigned this May 10, 2016

xiang90 closed this as completed Jun 18, 2016
