ui: require web login for secure clusters #28207

couchand · 2018-08-02T16:59:02Z

Closes: #6307
Closes: #18206
Closes: #26518
Release note (admin ui change): Login is now required for secure clusters.
Users log in with a regular database username, so that user must already have a
password set. Insecure clusters do not require login, and have a visual
indicator showing that they are insecure.

cockroach-teamcity · 2018-08-02T16:59:08Z

This change is

Once cockroachdb#28207 lands, all of these endpoints will be protected by login already, so we can skip the check for the setting remote_debugging.mode. Closes: cockroachdb#24992 Release note (admin ui change): The cluster setting remote_debugging.mode no longer controls access to any web ui API endpoints, since they are already protected behind user login.

Closes: cockroachdb#6307 Closes: cockroachdb#18206 Closes: cockroachdb#26518 Release note (admin ui change): Login is now required for secure clusters. Users log in with a regular database username, so that user must already have a password set. Insecure clusters do not require login, and have a visual indicator showing that they are insecure.

Customers should have login on by default for secure clusters, but may need to turn it off if there is some bug with login that we don't know about, or if they really feel they'd rather leave security of the admin UI to auth proxys, network firewalls, etc. This commit provides that escape hatch, in a way that does not encourage the customer to use it: it's not very discoverable, and requires a restart to take effect. Release note (admin ui change): add cluster setting to disable web ui authentication on secure clusters

mberhault · 2018-08-09T19:57:23Z

LGTM. We'll want some way to disable login for 2.1, but not so official as to make it a cluster setting. I think an environment variable strikes the right balance of 1) ease of use to disable in case of issues and 2) debug-level feature that can go away in 2.2.

vilterp · 2018-08-09T20:10:20Z

I spoke to @BramGruneir, @mberhault and @vivekmenezes, who all thought that there should be a way to disable it. @piyush-singh was more on the side of not making it disable-able. We need a decision here.

mberhault · 2018-08-09T20:20:31Z

The main thinking behind the environment variable is that login can be disabled in case of issues (which is not entirely out of the realm of possibilities).

Unlike cluster settings, environment variables are about deeper settings that shouldn't be widely toggled. It also provides no guarantees about future support, meaning we can (and should) remove it in 2.2 after a full cycle without issues.

It absolutely should not be documented or advertised as a way to easily disable login. Instead, we should make users aware of it if they run into issues. We can re-evaluate the need for a cluster setting to disable login if we get a lot of push back in 2.1.

vilterp · 2018-08-09T20:32:48Z

Copied my commit to a separate PR (#28416) to avoid the confusion of a PR that hard codes it and then introduces the env var.

vilterp · 2018-08-09T21:05:43Z

Just bors r+d the other PR. There's still time to remove the env var before 2.1 if we so decide.

vivekmenezes · 2018-08-10T01:02:11Z

I'm for having an env variable. It makes sense given the variety of users we have.

…

On Thu, Aug 9, 2018 at 5:05 PM Pete Vilter ***@***.***> wrote: Just bors r+d the other PR. There's still time to remove the env var before 2.1 if we so decide. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#28207 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ALOpBA2dK3oN0iw24-ell12s0fjMqWfUks5uPKQ1gaJpZM4VsqkN> .

BramGruneir · 2018-08-10T03:04:56Z

I don't understand the reluctance to have a cluster setting. We should default to higher levels of security but we shouldn't dictate how users determine their own threat model. A lot of clusters have extremely locked down access, requiring login on top of that is just punishing. Furthermore, having to restart a node to disable login is going to further confuse them and having to do this on every node is going to be even worse. This should be a cluster setting, defaulted to true. Anything else is just getting in their way.

…

On Thu, Aug 9, 2018, 21:02 vivekmenezes ***@***.***> wrote: I'm for having an env variable. It makes sense given the variety of users we have. On Thu, Aug 9, 2018 at 5:05 PM Pete Vilter ***@***.***> wrote: > Just bors r+d the other PR. There's still time to remove the env var > before 2.1 if we so decide. > > — > You are receiving this because you were mentioned. > > > Reply to this email directly, view it on GitHub > < #28207 (comment) >, > or mute the thread > < https://github.com/notifications/unsubscribe-auth/ALOpBA2dK3oN0iw24-ell12s0fjMqWfUks5uPKQ1gaJpZM4VsqkN > > . > — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#28207 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABihuTQfqZUa6MJW-cVR7cFMSeWjeZjeks5uPNuZgaJpZM4VsqkN> .

couchand · 2018-08-11T06:06:59Z

Closing, since #28416 was merged despite what seem to be reasonable, unresolved objections. Oh well.

couchand · 2018-08-11T06:56:00Z

This PR itself isn't the problem anymore, but I guess this is where the discussion is happening.

I have a number of concerns here. First and foremost, I don't understand why this is coming up again now. We've already discussed each and every point raised here, yet the prior decision (as far as I was aware) was to not go down this path. That decision seems to have been revisited and then changed without any compelling reason to revisit the decision and without any new arguments made in favor.

Generally, I think it's a bad idea to relitigate decisions made previously without presenting new information. Down that path lies madness. We need some amount of stability, and if we're just randomly changing our minds based on whomever happens to be around, that seems like the opposite of stability. It will be hard to pay down a bug list and deliver new features if we're also freely revisiting every decision we've made.

Specifically, I don't think any of the arguments made on this thread hold water. I don't know if there were other points made IRL that weren't recorded here. Since it seems like we're just going over this discussion once again, I'll add my own perspective. Let's go ahead and consider each of the arguments made on this thread. This is something I'm loathe to do at all, since I'm supposed to be on vacation right now, but because it's critical to get this issue right, I will take the time to help with the analysis. But first, some context.

We discussed previously the idea that we might allow some users to turn off authentication for the web UI. We decided that any such option would need to also disable endpoints that return sensitive information, and the UI elements for the same. We termed this feature "anonymous access" and all agreed that, while it was a good idea, it would be too expensive.

Ok, here we go with addressing the comments in this thread.

@mberhault is arguing that we need to hedge against potential bugs. A bit conservative in my mind (we should probably just invest in sufficient testing!), but not on the whole a crazy argument. He says we should just not document it and plan to remove it in 2.2. The only thing lacking is actually making the short-term nature of the switch clear. According to our recent decisionmaking around unsupported features, such a feature should be clearly indicated by adding EXPERIMENTAL or some such to the environment variable name. Also notably, Marc's position is mutually inconsistent with everyone else's.

@vivekmenezes suggests that we have a variety of users, but without more details we can't really be sure what he means or why it would matter if there's such a switch.

@BramGruneir argues that we aren't going far enough with an environment variable, and the switch should rather be implemented as a cluster setting, so as to not get in the way of our users. Notably, he's the only one to have (explicitly) considered a threat model, though his argument is that we should just expect our users to worry about it.

There are a few problems with this position. Most critically, taken to its logical conclusion, this argument would suggest we should allow users to deploy a "secure" cluster with no authentication even for SQL access. After all, why would we want to get in our users' way? "Requiring login... is just punishing." So why do we not have such a switch to disable all authentication?

Well, I think the arguments in favor of requiring authentication for secure clusters are quite valid: don't give your users a footgun. This is just basic security practice, if you give users a choice they will choose the wrong option.

Moreover, it disrespects the vast majority of our users that don't have sufficient resources to audit the information exposed through the web UI. We don't even have those resources ourselves! (If we did, we would just implement anonymous access...) Exposing this switch actually imposes a significant burden on our users, and it's irresponsible to do so.

A concern that no one on this thread raised (but which was raised in our previous discussions) regards single sign-on: many (most? maybe all?) of our users would actually prefer to authenticate with an external system rather than CRDB's internal user table. This somewhat dovetails into the fact that everywhere in our documentation we advise users against setting passwords (rightly! passwords are broken!), but the only form of authentication we currently support for the web UI is passwords. These arguments are sound, but only really suggest that we need to prioritize SSO support ASAP.

So I'll just wrap up this diatribe with: we shouldn't do this. We agreed not to do this and we don't have a good reason to do this. But I guess it was done anyway.

vivekmenezes · 2018-08-11T22:04:59Z

I'm happy with this on by default. I'm okay taking the approach that we wait until we hear users complain about this and they will but they do have the workaround of setting a username:password

BramGruneir · 2018-08-13T21:43:24Z

So I'd like to rebut a few points you made @couchand.

To your point that we shouldn't relitigate a decision that's already been made, in general, I think that goes against general engineering principles. We often revisit arguments with new information and change prior decisions. To be clear, I don't know where these previous discussions took place. Where did this happen and as this clearly is an impactful change that affects everyone that's using the product, why was this not an RFC? It's a lot easier to defend if we can just point back to the RFC and say this was already settled there. Especially, in the case where these arguments have already been considered. Doing so would have avoided the whole "randomly changing our minds" problem that I think everyone here wants to avoid.

Since I was not really privy to the discussion beforehand, these arguments are all new to me. Including the proposed "anonymous access" mode, which I think would be a great idea.

Now I'd like to rebut against the points you made towards my comments.

To start, I agree entirely with your concern about having only one means of authentication. It is a massive hole in our security model. Not being able to use LDAP or some SSO system yet is frustrating.

To the other points you raise:

Most critically, taken to its logical conclusion, this argument would suggest we should allow users to deploy a "secure" cluster with no authentication even for SQL access. After all, why would we want to get in our users' way? "Requiring login... is just punishing." So why do we not have such a switch to disable all authentication?

To start, I'm not arguing for that. And to be clear, there is right now a huge difference between what can be done in SQL and what can be done via the UI and even the endpoints. We do not require passwords (and even suggest against them) for secure cluster access. So I'd argue that this is materially different.
That being said, I also think your argument against disabling all authentication, while being clearly taken as an extreme example, might actually be useful for some security minded users. It's common practice to block access to the machines already using some other means. So why require yet another level of security on top of that? This is doubly so when dealing with our certs and trying to get a SQL connection established via drivers and ORMs. We only recently got a lot of those examples working in our docs and that was not a simple task and is quite brittle.

Well, I think the arguments in favor of requiring authentication for secure clusters are quite valid: don't give your users a footgun. This is just basic security practice, if you give users a choice they will choose the wrong option.

I think our stance should always be this: Let's default to the most secure option, but allow user to disable it when they choose to. To not give users options is to disrespect them. And to assume that they don't know their own systems and security model is misguided at best.

If we're really being security conscious here, requiring login is asking for the UI to be exposed to the internet. Have we tested this? Have we tried to see if the endpoints can withstand a DDOS attack or other such vectors? Have we tested timed attacks against our login with password system? Do we have an audit trail for all UI logins? Do we have different permissions for UI/endpoint access vs DB access? Do we limit access based on db and table grants in the UI? If not we're clearly giving the wrong impressions to our users on multiple fronts.

I'd like to go back to my original argument which I don't think you've adequately addressed.
Most customers lock down access to their DB machines based on IP, ssh cert, or some other means. Requiring login would be a significant burden on them that in those cases is unnecessary. Why would we not have a cluster setting to turn off login in the UI and let users decide access on their own?

couchand · 2018-08-20T20:10:31Z

Thanks for taking the time to review my comment @BramGruneir, I really appreciate that you care enough to engage with this issue critically.

we shouldn't relitigate a decision that's already been made, in general, I think that goes against general engineering principles

We should absolutely revisit something if we get new information that invalidates an assumption we've made. But I don't think that's happened in this case, the only change (as far as I can tell) was a rotation of personalities.

I agree with your point that the documentation here seems scarce, @piyush-singh has mentioned several times that he wishes we had gone full-RFC on the login question. I totally agree with that point; it seems organizationally we've been shying away from RFCs, because they keep us from moving fast "enough". That seems really unfortunate to me, since careful analysis and documentation has the potential to significantly reduce churn.

there is right now a huge difference between what can be done in SQL and what can be done via the UI and even the endpoints

True, but maybe a bit myopic. I argued very early in this release cycle that the introduced pain of authentication needed to be offset by additional functionality to make it seem worth it. We also are definitely moving in the direction of exposing more sensitive details as well as administrative knobs, and I don't want to get in a situation where we are constantly looking over our shoulder to make sure we're remembering to keep individual pieces of functionality safe -- I'd much rather have the luxury that SQL does: if they have access, they can do stuff.

It's common practice to block access to the machines already using some other means. So why require yet another level of security on top of that?

Onion security. Don't just rely on a single wall repelling all invaders and trust everything inside. That's the sort of thinking that led to https://www.androidauthority.com/wp-content/uploads/2014/06/SSL-Added-and-Removed-Here.jpg

Let's default to the most secure option, but allow user to disable it when they choose to.

I'd suggest that the model should be: default to the most secure option, and allow the user to relax that to any other option we deem to also be secure. Don't provide a footgun. Again, we don't allow a user to run a secure cluster without SQL authentication, why would we allow one without web UI authentication? Given the direction we want things to go, they should really be considered to be the same class of access.

login is asking for the UI to be exposed to the internet. Have we tested this?

Great point! Let's test it, then! (oh, and we should probably also hold SQL to the same standard, which as far as I know has undergone none of the testing or auditing that you recommend, which is unfortunate)

Do we limit access based on db and table grants in the UI? If not we're clearly giving the wrong impressions to our users on multiple fronts.

I've made this argument over and over, you won't get any disagreement from me. Unfortunately, that's been deprioritized.

I'd like to go back to my original argument which I don't think you've adequately addressed.
Most customers lock down access to their DB machines based on IP, ssh cert, or some other means. Requiring login would be a significant burden on them that in those cases is unnecessary. Why would we not have a cluster setting to turn off login in the UI and let users decide access on their own?

I do think I've addressed that argument, by proxy. I think the argument can identically be made regarding a SQL connection. Unless we're willing to allow unfettered access universally (and we absolutely should not), we shouldn't allow unfettered access to the web UI.

timaa2k · 2019-06-12T10:19:10Z

We at Mesosphere are currently relying on the undocumented environment variable COCKROACH_DISABLE_WEB_LOGIN to disable the Admin UI login. Our own Identity and Access Management component takes care of guarding the CockroachDB UI but we still require a secure cluster for TLS support between CockroachDB nodes.

In the future we'd really like to avoid scripting our way to do an automated CockroachDB login procedure in the current form with gRPC + protobuf. So please keep this option around until there are better alternatives for an automated login procedure.

couchand · 2019-06-12T14:17:27Z

We at Mesosphere are currently relying on the undocumented environment variable COCKROACH_DISABLE_WEB_LOGIN to disable the Admin UI login.

I was deathly afraid something like that might happen. Oh well, I do hope nothing bad comes of it.

Of course the proper answer here is single-sign on, but it looks like there's not many resources going to addressing the problem now.

couchand requested a review from a team August 2, 2018 16:59

couchand mentioned this pull request Aug 2, 2018

ui: remove remote_debugging checks #28209

Closed

couchand mentioned this pull request Aug 2, 2018

Web UI Authn & Authz (Tracking Issue) #25529

Closed

vilterp force-pushed the feature/enable-login branch from f12d895 to fa5ff70 Compare August 7, 2018 19:39

couchand requested a review from a team August 7, 2018 19:39

vilterp force-pushed the feature/enable-login branch from fa5ff70 to d90517d Compare August 9, 2018 19:22

couchand closed this Aug 11, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ui: require web login for secure clusters #28207

ui: require web login for secure clusters #28207

couchand commented Aug 2, 2018

cockroach-teamcity commented Aug 2, 2018

mberhault commented Aug 9, 2018

vilterp commented Aug 9, 2018

mberhault commented Aug 9, 2018

vilterp commented Aug 9, 2018 •

edited

Loading

vilterp commented Aug 9, 2018

vivekmenezes commented Aug 10, 2018 via email

BramGruneir commented Aug 10, 2018 via email

couchand commented Aug 11, 2018

couchand commented Aug 11, 2018 •

edited

Loading

vivekmenezes commented Aug 11, 2018

BramGruneir commented Aug 13, 2018

couchand commented Aug 20, 2018

timaa2k commented Jun 12, 2019

couchand commented Jun 12, 2019

ui: require web login for secure clusters #28207

ui: require web login for secure clusters #28207

Conversation

couchand commented Aug 2, 2018

cockroach-teamcity commented Aug 2, 2018

mberhault commented Aug 9, 2018

vilterp commented Aug 9, 2018

mberhault commented Aug 9, 2018

vilterp commented Aug 9, 2018 • edited Loading

vilterp commented Aug 9, 2018

vivekmenezes commented Aug 10, 2018 via email

BramGruneir commented Aug 10, 2018 via email

couchand commented Aug 11, 2018

couchand commented Aug 11, 2018 • edited Loading

vivekmenezes commented Aug 11, 2018

BramGruneir commented Aug 13, 2018

couchand commented Aug 20, 2018

timaa2k commented Jun 12, 2019

couchand commented Jun 12, 2019

vilterp commented Aug 9, 2018 •

edited

Loading

couchand commented Aug 11, 2018 •

edited

Loading