proposal for v3 auth #4475

Closed
xiang90 opened this issue Feb 10, 2016 · 44 comments

@xiang90
Contributor

xiang90 commented Feb 10, 2016

We have a role based auth system for v2. We are pretty happy about its simplicity and flexibility.

For v3, we will probably adopt a similar role-based system. But we need to solve a few things:

  1. the auth should be connection based, not request based. We want to use gRPC handshake or cert to solve this.
  2. the auth management API should always return the detailed view of roles and users.

/cc @mitake

@raoofm
Contributor

raoofm commented Feb 10, 2016

the auth should be connection based, not request based. We want to use gRPC handshake or cert to solve this.

Correct me if I'm wrong, I think this will partially solve the current performance issue with etcd auth mentioned here #3223

If it doesn't, then what is the plan to solve this issue for v3?

@xiang90
Contributor Author

xiang90 commented Feb 10, 2016

If it doesn't, then what is the plan to solve this issue for v3?

It will be almost the same, just at gRPC layer.

@mitake
Contributor

mitake commented Feb 15, 2016

@xiang90 I have a question about non-quorum reads in the v2 protocol. Will they be provided in v3?

It is important for auth, because a non-quorum read allows stale results in exchange for better performance. It is hard to integrate such functionality with auth, because auth information should also be stored in the raft storage, and checking that information in a consistent manner would sacrifice the benefit of the stale read.

IIUC, the lease mechanism can provide consistent, low-cost reads in v3, so non-quorum reads would not be required. This is an important point for the new auth mechanism, so I want to confirm it first.

@xiang90
Contributor Author

xiang90 commented Feb 15, 2016

@mitake
Contributor

mitake commented Feb 15, 2016

@xiang90 I didn't notice the flag, thanks!

For the reason I described above, it is difficult for non-quorum reads to coexist with the auth mechanism. For simplicity, rejecting non-quorum reads when auth is enabled would be safer. Is that acceptable?

@xiang90
Contributor Author

xiang90 commented Feb 16, 2016

For the reason I described above, it is difficult for non-quorum reads to coexist with the auth mechanism. For simplicity, rejecting non-quorum reads when auth is enabled would be safer. Is that acceptable?

As we discussed:

  1. admin should first blacklist all requests and then add role/perms to whitelist.
  2. auth is not designed for frequent role/perm changes.

What you mentioned might make sense, but it is not something we set out to guarantee from the very beginning.

If you have a specific use case that requires it, please share it with us. Then we can reconsider it.

@mitake
Contributor

mitake commented Feb 16, 2016

admin should first blacklist all requests and then add role/perms to whitelist.
auth is not designed for frequent role/perm changes.

Should these operational restrictions be introduced in v3, too? I thought

the auth management API should always return the detailed view of roles and users.

means the inconsistent state isn't allowed in v3. Am I misunderstanding?

@heyitsanthony
Contributor

@mitake to be clear, does this describe the auth problem?

  1. A requests role change to deny B's access to "k"; quorum
  2. A receives role change response; revision = n
  3. B requests data for "k", non-quorum
  4. B receives data for "k"; revision < n
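
The four steps above can be replayed in a small simulation. This is only an illustrative sketch, not etcd code: the `Member` class, its `apply` and `serializable_read` methods, and the two-entry log are all hypothetical names chosen for the example. It assumes a follower answers a non-quorum read purely from what it has applied locally.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical cluster member that has applied the log up to applied_rev."""
    applied_rev: int = 0
    data: dict = field(default_factory=dict)
    denied: set = field(default_factory=set)  # (user, key) pairs

    def apply(self, rev, entry):
        kind, payload = entry
        if kind == "put":
            key, value = payload
            self.data[key] = value
        elif kind == "deny":
            self.denied.add(payload)
        self.applied_rev = rev

    def serializable_read(self, user, key):
        # Served from local state only, with no quorum round trip.
        if (user, key) in self.denied:
            raise PermissionError(f"{user} may not read {key}")
        return self.data[key], self.applied_rev


log = [("put", ("k", "secret")), ("deny", ("B", "k"))]

leader, follower = Member(), Member()
for rev, entry in enumerate(log, start=1):
    leader.apply(rev, entry)      # the leader has applied everything
follower.apply(1, log[0])         # the follower has not yet applied the denial

n = leader.applied_rev            # revision of the role change (step 2)
value, rev = follower.serializable_read("B", "k")   # steps 3 and 4
print(value, rev < n)
```

In this sketch the lagging follower still returns "k" to B at a revision below n, while the same read against the leader is rejected.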

@mitake
Contributor

mitake commented Feb 16, 2016

@heyitsanthony yes. I'm thinking about the case you described.

@heyitsanthony
Contributor

@mitake OK, is there a threat that's handled with consistent auth? I don't think it fixes stale reads. For example, B can still set up a proxy that watches on "k" which would serve a stale "k" even after the access denial goes through.

@mitake
Contributor

mitake commented Feb 16, 2016

@heyitsanthony of course proxies will introduce a chance of inconsistency. But I don't understand how proxy will interact with other features in v3 protocol (it will be redesigned for v3, right? #4318).

I was thinking that lease and watch provide consistent client-side caching for improving read performance. Therefore I still don't understand the benefit of serializable range in v3, so I'd like to separate it from the auth problem at first.

Of course I might be missing something. I'd be happy if you could kindly point out my misunderstanding :)

@heyitsanthony
Contributor

@mitake I was talking about a rogue proxy with B's credentials which ignores auth. My point is that any data available prior to the denial revision is already as good as leaked to the client; read consistency with auth can't take it back.

@mitake
Contributor

mitake commented Feb 16, 2016

@heyitsanthony OK, I understand your point, thanks for your explanation. It seems reasonable because newly created values (after enabling auth) cannot be read without auth. The commit message that enables auth must be received by the followers' state machines before the newly created values arrive (it took me a while to understand this, sorry!).

I have another question (sorry for the frequent questions): if v3 auth provides per-connection auth (as described in the first message of this issue), the inconsistent state would matter, because even after a connection is authorized, that authorization becomes invalid once the auth configuration changes. How should we provide an efficient auth mechanism?

I'm thinking that using the authentication mechanism of gRPC (as the current implementation does) and letting messages from clientv3 carry permission information (user ID, password) would be the simplest way. The permission check would be done by etcd servers when processing every read and write request. Is that ok, @xiang90 ?

@mitake
Contributor

mitake commented Feb 18, 2016

@xiang90 @heyitsanthony I'm writing a design doc for the v3 auth and ACL here: https://docs.google.com/document/d/1Zs5lc4H8kHr4sgoniKlL6YRsLKFNy8m-1pyGhkCmmO4/edit?usp=sharing

I'd be really glad if you could review and comment on it.

@heyitsanthony
Contributor

@mitake took a look, left some comments. Thanks for taking the time to write the doc.

@mitake
Contributor

mitake commented Feb 22, 2016

@xiang90 @heyitsanthony thanks for your comments on the doc.

It seems that the consensus on the design is almost achieved. I'd like to have a clear consensus on this point:

  1. every request requires permission check
  2. a leader node checks permission for linearizable read and write
  3. both the leader and followers check permission for non-linearizable reads

Is this ok for you?
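
The three points above could be sketched roughly as follows. This is a hedged illustration only: the `Op` enum, the `ROLES` and `USERS` tables, and the functions `is_permitted`, `node_must_check`, and `handle` are hypothetical names, and prefix matching on `/fleet/` is borrowed from the v2 role examples rather than from any v3 design.

```python
from enum import Enum


class Op(Enum):
    WRITE = "write"
    LINEARIZABLE_READ = "linearizable read"
    SERIALIZABLE_READ = "serializable read"


# Hypothetical tables: role -> allowed key prefixes, user -> roles.
ROLES = {"fleet": {"read": ["/fleet/"], "write": ["/fleet/"]}}
USERS = {"charlie": ["fleet"]}


def is_permitted(user, op, key):
    """Point 1: every request is checked against the user's roles."""
    access = "write" if op is Op.WRITE else "read"
    for role in USERS.get(user, []):
        if any(key.startswith(prefix) for prefix in ROLES[role][access]):
            return True
    return False


def node_must_check(op, is_leader):
    if op is Op.SERIALIZABLE_READ:
        return True       # point 3: whichever member serves the read checks
    return is_leader      # point 2: the leader checks writes and linearizable reads


def handle(user, op, key, is_leader):
    if not node_must_check(op, is_leader):
        return "forwarded to leader"
    if not is_permitted(user, op, key):
        raise PermissionError(f"{user} may not perform a {op.value} on {key}")
    return "served"


print(handle("charlie", Op.SERIALIZABLE_READ, "/fleet/units", is_leader=False))
```

Under these assumptions a follower never decides on a write by itself; it only enforces permissions for the reads it answers from local state.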

@xiang90
Contributor Author

xiang90 commented Feb 22, 2016

@mitake Authentication per connection. Re-Authentication when the password of a user is changed.

@mitake
Contributor

mitake commented Feb 22, 2016

@xiang90 yes, I agree with your points. What do you think about the design of the permission check?

@mitake
Contributor

mitake commented Feb 25, 2016

@xiang90 @heyitsanthony updated the doc for a plan of implementation. Could you take a look?

@mitake
Contributor

mitake commented Mar 1, 2016

@xiang90 listed RPCs in the doc. Could you review? Can I start implementation?

@xiang90
Contributor Author

xiang90 commented Mar 1, 2016

@mitake RPC looks good. Let's write the proto and send a pull request?

@mitake
Contributor

mitake commented Mar 1, 2016

@xiang90 thanks, I'll send a PR later.

@rominf

rominf commented Mar 4, 2016

It seems to me that it's only possible to set plaintext passwords via API v2. Is that correct? If so, I propose allowing passwords to be set using a hash.

@mitake
Contributor

mitake commented Mar 4, 2016

@rominf Even in the v2 API, passwords are hashed before being stored in the etcd store. Do you want to hash passwords on the client side?

@rominf

rominf commented Mar 4, 2016

@mitake Yes, I do. I'm talking about changing:

Change password

PUT /v2/security/users/charlie/password

Sent Headers:
    Authorization: Basic <BasicAuthString>
Put Body:
    {"user": "charlie", "password": "newCharliePassword"}
Possible Status Codes:
    200 OK
    403 Forbidden
    404 Not Found
200 Headers:
    ETag: "users/charlie:<tzNow>"
200 Body:
    JSON user struct, updated

to

Change password

PUT /v2/security/users/charlie/password

Sent Headers:
    Authorization: Basic <BasicAuthString>
Put Body:
    {"user": "charlie", "password": "newCharliePassword"}
    OR using a salted SHA-512 hashed password:
    {"user": "charlie", "password": "$6$ke3amfbPHV0WalYF$df2lVawKuP4gc.gpZuGMlmA27R4bvuKgf.YO/0bRR1tQZEIQ1kpXVxPmQyZ31.ieGFuJvU7UtXyibVRZtGamT/"}
Possible Status Codes:
    200 OK
    403 Forbidden
    404 Not Found
200 Headers:
    ETag: "users/charlie:<tzNow>"
200 Body:
    JSON user struct, updated

@mitake
Contributor

mitake commented Mar 4, 2016

@rominf I understand. @xiang90 @heyitsanthony what do you think about the idea of client-side password hashing? I think it is reasonable.

@heyitsanthony
Contributor

@mitake I'm not sure if it really improves security; the hashed password can still be used in a replay attack unless the server provides a nonce for the hash salt.
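
To illustrate the replay concern, here is a minimal challenge-response sketch in which the server supplies a fresh nonce per login. It is an assumption-laden example, not anything etcd implements: the `Server` class, `client_response`, the PBKDF2 parameters, and the use of HMAC-SHA256 over the nonce are all hypothetical choices.

```python
import hashlib
import hmac
import secrets


def password_hash(password, salt):
    # Stand-in for whatever salted hash the server would store.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


def client_response(pw_hash, nonce):
    # The client proves knowledge of the hash without sending it directly.
    return hmac.new(pw_hash, nonce, hashlib.sha256).digest()


class Server:
    def __init__(self, hashes):
        self.hashes = hashes      # user -> stored password hash
        self.pending = {}         # user -> outstanding nonce

    def challenge(self, user):
        nonce = secrets.token_bytes(16)
        self.pending[user] = nonce
        return nonce

    def verify(self, user, response):
        nonce = self.pending.pop(user, None)   # each nonce is single use
        if nonce is None:
            return False
        expected = hmac.new(self.hashes[user], nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


salt = b"example-salt"
pw_hash = password_hash("newCharliePassword", salt)
server = Server({"charlie": pw_hash})

nonce = server.challenge("charlie")
captured = client_response(pw_hash, nonce)
print(server.verify("charlie", captured))    # legitimate login

server.challenge("charlie")                   # a new login attempt gets a new nonce
print(server.verify("charlie", captured))    # replaying the captured response
```

Without the nonce, the captured hash itself would be sufficient to log in, which is the replay problem described above.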

@rominf

rominf commented Mar 4, 2016

What's the purpose of nested dictionary for permissions? Why:

 { "role" : "fleet", "permissions" : { "kv" : { "read" : [ "/fleet/" ], "write": [ "/fleet/" ] } }, "grant" : {"kv": {...}}, "revoke": {"kv": {...}} } 

but not:

 { "role" : "fleet", "permissions" : { "read" : [ "/fleet/" ], "write": [ "/fleet/" ] }, "grant" : {...}, "revoke": {...} } 

?

@mitake
Contributor

mitake commented Apr 14, 2016

@xiang90 @heyitsanthony I'd like to use JWT (https://jwt.io/) for the token of v3 auth. JWT seems to be a simple and mature token format. In addition, it enables verifying tokens with an RSA key pair, so it would be friendly to the new proxy for the v3 API. Proxies would be able to verify requests from clients, which would be a great help for reducing read traffic.

I'd like to reach a consensus about using JWT (it will introduce a new vendored dependency; I'm considering using https://github.com/dgrijalva/jwt-go). Can I hear your opinion?

@xiang90
Contributor Author

xiang90 commented Apr 14, 2016

@mitake What do you want to put inside the JWT? What benefits will JWT bring us in our use case besides the proxy (a proxy should either terminate TLS/auth or just be transparent, in my opinion)? I would imagine a simple obscure random token will do the work.

@mitake
Contributor

mitake commented Apr 15, 2016

@xiang90 The main benefits from JWT are like below:

  • It can provide a signed token that cannot be created illegally. So the token is helpful for protecting data from malicious or buggy clients. At first I also thought a random token would be enough for the purpose (as described in the doc). However, it is hard to analyze how much randomness is required to prevent problems from malicious or buggy clients. JWT is built on a proven technique, so using it is safer than using my own ad-hoc solution.
  • With JWT, a server can verify a token sent from a client with its RSA public key. If proxies share a common key with servers, proxies can verify requests from clients by themselves. It will contribute to reducing get request traffic from proxy to server (the most important feature of proxy I think).
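
As a rough picture of what a signed token looks like, the sketch below builds and verifies a JWT by hand. Note the substitution: it uses the HS256 variant (HMAC with a shared secret) because that needs only the Python standard library, whereas the comment above talks about RSA signatures, where a proxy would hold only the public key. The helper names `sign` and `verify`, the key, and the claim contents are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    # JWT uses unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input, key):
    digest = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return b64url(digest)


def sign(claims, key):
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(claims, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _signature(signing_input, key)


def verify(token, key):
    """Return the claims if the signature matches, otherwise None."""
    pieces = token.split(".")
    if len(pieces) != 3:
        return None
    header_b64, payload_b64, sig_b64 = pieces
    expected = _signature(header_b64 + "." + payload_b64, key)
    if not hmac.compare_digest(expected.encode(), sig_b64.encode()):
        return None
    return json.loads(b64url_decode(payload_b64))


key = b"hypothetical-shared-secret"
token = sign({"user": "charlie"}, key)
print(verify(token, key))
```

Any party holding the verification key can check the token locally, which is the property the proxy argument relies on; a client that alters the payload cannot produce a matching signature without the key.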

@xiang90
Contributor Author

xiang90 commented Apr 15, 2016

It can provide a signed token that cannot be created illegally. So the token is helpful for protecting data from malicious or buggy clients. At first I also thought a random token would be enough for the purpose (as described in the doc). However, it is hard to analyze how much randomness is required to prevent problems from malicious or buggy clients. JWT is built on a proven technique, so using it is safer than using my own ad-hoc solution.

As long as the token space and the signing key space are the same size, I feel the problem is pretty much equivalent, unless you have a significant number of tokens compared to the token space. A client can guess a token, and a client can guess a signing key too.

With JWT, a server can verify a token sent from a client with its RSA public key. If proxies share a common key with servers, proxies can verify requests from clients by themselves. It will contribute to reducing get request traffic from proxy to server (the most important feature of proxy I think).

This is the one benefit I can see that JWT actually provides, since it contains the auth info and that info is verifiable by both parties. However, the proxy issue you mentioned is different from what we are trying to solve right now. Token invalidation can be complicated. I feel we should first finish the simplest solution. Then we can explore a better token mechanism.

@mitake
Contributor

mitake commented Apr 18, 2016

A client can guess a token, and a client can guess a sign key too.

JWT's token space is constrained by the RSA-based signature, so guessing a valid token is harder than guessing a random number.

Token Invalidation can be complicated.

JWT allows tokens to carry various metadata. So we can let a token carry the storage revision at which its client was authorized. We can then check whether the authentication is obsolete by comparing revision numbers. It doesn't require a heavy invalidation mechanism.
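
A minimal sketch of the revision idea, under stated assumptions: the `AuthStore` class, the `auth_rev` claim name, and the rule that any auth configuration change bumps a single counter are all hypothetical, and signing of the claims is left out here.

```python
class AuthStore:
    """Hypothetical server-side auth state with a single revision counter."""

    def __init__(self):
        self.revision = 1

    def issue_claims(self, user):
        # The claims record the auth revision at which the user was authorized.
        return {"user": user, "auth_rev": self.revision}

    def change_config(self):
        # Any role, permission, or password change bumps the revision.
        self.revision += 1

    def is_current(self, claims):
        # A token issued under an older revision is treated as obsolete.
        return claims.get("auth_rev") == self.revision


store = AuthStore()
claims = store.issue_claims("charlie")
print(store.is_current(claims))

store.change_config()
print(store.is_current(claims))
```

This version is deliberately coarse: one change invalidates every outstanding token, so clients would need to re-authenticate after any auth configuration update.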

I feel we should first finish the simplest solution. Then we can explore more about a better token mechanism.

Actually I want to use JWT to simplify the auth package without reinventing the wheel. My handmade token mechanism would bloat the package and would be abandoned in the near future. Using a well-designed and well-reviewed methodology will make the package simpler and will minimize API changes, I think.

@xiang90
Contributor Author

xiang90 commented Apr 18, 2016

Actually I want to use JWT for simplifying auth package without reinventing the wheel.

What does JWT provide us beyond an obscure token right now?

My handmade token mechanism

I feel it should be just a getToken method that randomly generates a fixed-length token. It won't take a lot of effort, and it allows us to evaluate what the token should be in the future, separately from the auth implementation.
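
For comparison, the simple approach might look like the sketch below. It is only an illustration of the idea in this comment: the `SimpleTokenStore` class, its method names, and the 16-byte token size are hypothetical, and the token carries no information, so the server has to look it up.

```python
import secrets

TOKEN_BYTES = 16  # hypothetical size; 16 random bytes give 32 hex characters


class SimpleTokenStore:
    """Maps opaque random tokens to user names on the server side."""

    def __init__(self):
        self._tokens = {}

    def get_token(self, user):
        token = secrets.token_hex(TOKEN_BYTES)   # fixed-length random token
        self._tokens[token] = user
        return token

    def user_for(self, token):
        return self._tokens.get(token)           # None if the token is unknown

    def invalidate_user(self, user):
        # For example, after a password change.
        self._tokens = {t: u for t, u in self._tokens.items() if u != user}


store = SimpleTokenStore()
token = store.get_token("charlie")
print(len(token), store.user_for(token))
```

Because the token is meaningless outside the store, only a party with access to the store can validate it, which is the trade-off against the signed-token approach discussed in this thread.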

@xiang90
Contributor Author

xiang90 commented Apr 18, 2016

JWT's token space is constrained by the RSA-based signature, so guessing a valid token is harder than guessing a random number.

I mean the attacker can guess the private key of the RSA anyway. I do not feel there is a big win. A long enough true random token would be equally safe.

JWT allows tokens to have various metadata.

By invalidation, I mean on the proxy side. If we do not want the token to carry any meaningful information to be validated against, then we do not need JWT.

@mitake
Contributor

mitake commented Apr 18, 2016

OK, then I'll implement the simple methodology first. Anyway, it would be useful for easy trial that doesn't require public/private keys.

But,

I mean the attacker can guess the private key of the RSA anyway.

I don't think this is a realistic assumption. It would mean public key encryption methodologies cannot be used. And the random number methodologies always have a problem of exhausting entropy in machines.

@aarondav

aarondav commented Apr 18, 2016

@mitake ACLs seem to be a part of the v3 auth design (they're already documented as a feature on the etcd home page!), but I don't think they're yet implemented -- correct me if I'm wrong. Is there a timeline for that part of the implementation? We'd really like to make use of path-based restrictions on users.

Edit: #2384 seems to indicate that there is some support for keyspace restrictions. Are ACLs fully implemented, or just keyspace restrictions per user? Apologies for my misunderstanding.

@mitake
Contributor

mitake commented Apr 19, 2016

Hi @aarondav, thanks for your interest. As you found, the feature is already implemented in the v2 API. We are working on auth and ACL in the v3 API. v2 auth is already implemented and ready to use (it is included in the v2.3.0 release).
The v3 API feature will be finished in May. Will you use the v3 API? If so, could you wait for a while? I'll let you know about the progress.

Just curious: will you use etcd in Spark?

@aarondav

Ahh, gotcha! Great, didn't realize that distinction between the v2 and v3 APIs.

We are actually using etcd via flannel in Kubernetes, and are experimenting with an architecture with multiple isolated networks sharing the same etcd, which requires that each network has access to only its sub-tree of etcd.

@mitake
Contributor

mitake commented Apr 19, 2016

@aarondav I see, thanks. Flannel seems to be using v2 API, so you can use the feature with etcd v2.3.0 or newer.

@xiang90 xiang90 self-assigned this May 10, 2016
@xiang90
Contributor Author

xiang90 commented Jun 17, 2016

@mitake Anything left here for 3.0? Or we can close it?

@mitake
Contributor

mitake commented Jun 18, 2016

@xiang90 yes, though there are still remaining tasks e.g. jwt token and handling revision, the basic functionalities are already implemented. I agree to closing it.

@xiang90
Contributor Author

xiang90 commented Jun 18, 2016

@mitake Probably create issues for the things we want to do for the next steps: jwt + revision in token.

@xiang90 xiang90 closed this as completed Jun 18, 2016
@mitake
Contributor

mitake commented Jun 20, 2016

New issues related to auth v3:

6 participants