Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OrientDB authorization model can not be used as application level authorization #2229

Closed
enisher opened this issue Apr 15, 2014 · 73 comments
Closed
Assignees

Comments

@enisher
Copy link
Contributor

enisher commented Apr 15, 2014

Such features as record level security encourage user to use OrientDB authorization as application level authorization. In other words, use OUser class as a main class for application user.

However there are some significant design issues in binary network protocol.

I suppose users would like to use approach similar to following

  • Application opens several sockets to database.
  • For each user connected to the application a db session is opened.
  • OPEN_DATABASE command is used to authorize user.
  • SessionID or similar entity is used as auth token.

Issues with concurrent requests

ODatabaseDocumentTx is not thread safe. And in binary network protocol each client session holds its instance of ODatabaseDocumentTx.

Thus session can not be used by several connections from different sockets because they can be handled in different server threads.

As a result we can not reuse session in different client threads.

This means that we have to open a new session for each user * each application thread. That leads to a need to reauthorize user in db for each thread. And as soon as user request can be handled by several application thread we have to store username and password at application layer.

Memory consumption

Having a thousands of active users may lead to thousands of opened connections in client connection manager. This could be memory consuming.

Possible memory leaks

If user is not logged out and socket was reused by other user, its session may remain in client connection manager forever.
It won't be closed automatically as soon as socket still alive and used by application to handle another user sessions.

Session Id shouldn't be used as auth token

Session Id is serial, so it is not really safe to use it as a auth token.

No native way to make users rememberable

As soon as OrientDB supports only user/password authorization, "Remember me" feature could be implemented only by workarounds.

Automatic shutdown of session

Automatic shutdown of sessions that associated to broken connections are not designed for sessions that could be used by several connections.

So some session that is closed in such way may still be used by some other thread.

@phpnode
Copy link
Contributor

phpnode commented Apr 16, 2014

just to reiterate how severe this is, it makes it impossible to use orient's permission model efficiently for most websites. It renders OUser, ORole and all of the class and record level security features inaccessible.

@ruckc
Copy link

ruckc commented Apr 17, 2014

How about tackling this from a different method. I think this is actually two issues, the first is connection pooling and the second is session handling across the pool.

To address connection pooling the application could have its own account (typical setup) and allow it to "switch user" to the actual user's account. This could be implemented as an additional privilege with a role restriction. The application in this case would be responsible for validating credentials initially and it could be given a ticket for future requests. The tickets could be infinite lifespan (tied into their session record), max lifespan, or have a max inactive lifespan. This approach could also be used to allow external authentication (to Active Directory or some external service) and OrientDB would have to "trust" the application in its switch user capacity. PostgreSQL supports a portion of this workflow through SET ROLE.

To handle the session state issue OrientDB could just store session data in a cluster (either memory or disk based) with each user having one record (and record level permissions in-place). A plugin could be written to handle purging old session.

@phpnode
Copy link
Contributor

phpnode commented Apr 17, 2014

@ruckc I agree that this is two issues, but I don't really like your proposed solution - why do we need a special role and "user switching" when global sessions, ala HTTP, could easily solve the problem without having to introduce a greater degree of trust for the client app?

I think it's brittle and a lot of overhead if the application has to do a SET ROLE = foo before each subsequent command - we're trusting the wrong thing

I agree we could create a new OSession class as you propose, treat it like any other cluster but old records are removed when their TTL expires. Developers can then use OSession to store temporary data just for that particular user.

@phpnode
Copy link
Contributor

phpnode commented Apr 17, 2014

Copied / pasted from another issue:

currently Orient has a 1 socket = 1 thread structure, but this means that if clients want concurrency, they must implement their own connection pool (and you can't reuse session ids over multiple sockets because it's not thread safe). Wouldn't a thread pool help with this? Instead of having 1 socket = 1 thread, when a socket receives a request it should take a thread from the thread pool, do its work, then return it to the pool?

@ruckc
Copy link

ruckc commented Apr 18, 2014

So, the solutions to this type of problem across other databases/software:

  1. PostgreSQL, use set role. Since users are roles in PostgreSQL, could work with the application having its own credentials, and would require validating the user credentials at the application, and trusting the application to SET ROLE appropriately.
  2. Unix, use su, requires authentication credentials of user switching to.
  3. SQL Server's whitepaper on the concepts even make assumptions that by doing RLS you are not using connection pools due to the requirements driving row level security.
  4. Oracle implements Proxy Authentication which is similar in concept to PostgreSQL's SET ROLE. It supports connection pooling by associating the proxy'd account in connection metadata when fetching/creating a new connection.

I think the easiest/lightest method is using Unix's switch user (su) model to allow an application to elevate existing open'd databases to a user's granted role. Essentially extending OUser to OApplication and OIndividual each having credentials required. Also the client API should provide a connection pool that can automatically exit and persist the user's OSession when a connection returns to the pool, but the connection/database should stay open, just with the OApplication as db.getUser() until the application requests another connection for a user. Additionally, the OApplication would have no actual CRUD rights on the database other than a permission to allow switching user. This could also be extended with another permission (i.e. connect) to only allow OApplication's to create connections.

One hinderance to the connection pool above would be implementing scrypt/bcrypt hashing due to the time requirements to validate the credentials over and over. This could potentially be mitigated by having a in-memory cache of sha256 hashes of recently validated passwords.

Something to keep the model flexible enough would be overriding openDatabase(String user,String pass) to openDatabase(Credentials creds), which would support user/pass authentication along with PKI or any other two factor authentication.

@phpnode
Copy link
Contributor

phpnode commented Apr 18, 2014

@ruckc the point of OSession is to avoid the need to reauthenticate again and again, the OSession id is a long, secret key that is enough, by itself, to provide authentication. So no need to check the password on each request.

The binary protocol already has a thing called a sessionID, required for every request, so there's no need for the application to have its own account, and there's no need to deal with user switching (which implies that the client application can keep state). All that the application needs to do is forward the correct sessionID for that user - this is how it already works. The problem is that the existing sessionIDs are sequential (easy to guess), non-global and non-reusable. If we can change that we don't need to really change anything else about the security model i think.

@lvca
Copy link
Member

lvca commented Apr 18, 2014

@phpnode so generating a random long could be good enough for you? What else is missing?

@enisher
Copy link
Contributor Author

enisher commented Apr 22, 2014

@lvca, @phpnode I suppose concurrent request processing in a single session are also important for such case

@phpnode
Copy link
Contributor

phpnode commented Apr 22, 2014

@enisher exactly, that's the only remaining issue I think

@phpnode
Copy link
Contributor

phpnode commented Apr 22, 2014

@lvca @enisher on second thoughts, there's a remaining problem - AFAIK sessionIds are not shared between orientdb servers, which makes working with server clusters awkward. This is another reason to go with a new OSession class I think, because it could leverage Orient's existing replication features, and give an obvious method for manually expiring active sessions - just delete the record.

@giastfader
Copy link
Contributor

Hi guys,
If I am still on time, I would to give my contribute on this.
As far the sessionId type is concerned I think that it shouldn't be a long.
I think that a UUID can fit better for unique-identifier than a long and can be more difficult to guess by an attacker. But I suppose that in this case the impact on the protocol will be significant.
I also think that there should be a way to automatically invalidate all the session ID related to a specific user if he/she changes the password. This is a typical scenario when data are accessed by multiple devices.

@phpnode
Copy link
Contributor

phpnode commented Apr 22, 2014

@giastfader totally agree, in fact I meant that the session key itself should be long (i.e. unguessable), not that it should be a Long, sorry for the confusion!

@giastfader
Copy link
Contributor

@phpnode :)
Maybe the @lvca 's reply led me astray.

@lvca
Copy link
Member

lvca commented Apr 22, 2014

@phpnode and @giastfader Thanks for your suggestions. I don't know if writing the record to the database could be a good idea because we've Hazelcast under the hood and creating a clustered Map with sessions it's straightforward. The problem could be: how many sessions could we have? thousands or millions?

@phpnode
Copy link
Contributor

phpnode commented Apr 22, 2014

@lvca potentially millions for the largest of sites. Without the record approach, how would we expire an existing session? how would we find the active sessions for a given user etc? Presumably via new SQL commands?

@giastfader
Copy link
Contributor

@lvca however I think that you should pay attention if you try to solve a problem that maybe, and I mean maybe, should not be solved by the database engine. Trying to solve this problem could lead you in scalability issues, and sessions (and stateful machine in general) IMHO cause the biggest concerns in scalable systems.
What do you think about encrypted session tokens?
Let me explain:

  • A user try to access to the db providing both username and password
  • The database build a token using the provided credentials AND a timestamp
  • The database encrypts this token with a secret key defined somewhere (maybe in the config file like the root password)
  • The database returns the encrypted token to the client
  • When the user wants to access to the database, he/she provides the token instead of the username/password pair
  • The database knows how to decrypt the token, extracts the username and the timestamp and grants or denied the access

PRO: the db engine does not need to store anything, clients don't need to store username/password, just the tokens
CONS: decrypt the tokens could be CPU intensive

WDYT?

@lvca
Copy link
Member

lvca commented Apr 22, 2014

By using Hazelcast it's vert easy: we have such Map to manage and all the nodes see the same map. So all the lookups works in distributed fashion. Current background task that purges all the expired sessions would work in the same way. WDYT?

@phpnode
Copy link
Contributor

phpnode commented Apr 22, 2014

@lvca sounds reasonable I think, but I'd imagine it won't be possible to query, e.g. there'll be no way to do the equivalent of this:

SELECT COUNT(DISTINCT(user.name)) FROM OSession;

to retrieve the number of currently online users.

@lvca
Copy link
Member

lvca commented Apr 23, 2014

@phpnode No, but we could create a "memory" cluster (non-persistent) and in case of distributed architecture the cluster would be replicated among servers.

@phpnode
Copy link
Contributor

phpnode commented Apr 23, 2014

@lvca this would be ideal i think!

@phpnode
Copy link
Contributor

phpnode commented May 6, 2014

@lvca @enisher any updates on this? It's really blocking us 😦

@tglman
Copy link
Member

tglman commented May 7, 2014

Hi, so speaking with the @phpnode the blocking issue is just the "migration" of the session cross connection/server, so we thought to two possible easy solutions:

the first one is a random generated uuid, to use a key of a distributed map, where in the value we have the detail of the session, this uuid will replace the current sessionId, but will be used also for new connections.
advantages:

  • easy/fast algorithm
  • easy implementation

disadvantages:

  • management of session expire server side
  • stateful server

The second solution can be generate a token as described by @giastfader
advantages:

  • server completely stateless
  • the session can be migrated also if the servers could not see each other

disadvantages:

  • complex/slow algorithm
  • need to share of a key at setup time

all the other features, like the possibility to query the active session, are more a nice to have and not blocking right now ;)

shall we choose one of the two ?!?!
I actually prefer slow and scalable (token)

@enisher
Copy link
Contributor Author

enisher commented May 7, 2014

Ok, so we don't need concurrent request processing in the same session for now, that is much easier.

The second proposed solution seems more sophisticated as for me.

@phpnode
Copy link
Contributor

phpnode commented May 7, 2014

Definitely think that the first option is much easier and faster. All we'd need is to store a map of UUIDs to user ids. Since we can reliably assume that user ids fit in integers, this can be as little as 20 bytes per user (plus a little more if we want to do TTLs).

Doing this doesn't really make the server any more stateful than it is at the moment - the server already keeps a map of sessionIds to userIds, we'd just be making that map shared rather than being per connection.

If we go with the second option, I don't think it's possible for the server to expire a key, so we'd have no way of revoking session ids for users that have been deactivated etc.

@mattaylor
Copy link
Contributor

is this a 2.0 thing now?

@young-druid
Copy link
Contributor

If a decision to use OUser as an authorization entity is made it is important to add more security into its model. I suggest not just encrypting password with SHA-256 but also adding a 'salt' field into OUser class and using it during encryption. What do you think?

@phpnode
Copy link
Contributor

phpnode commented Jul 10, 2014

@young-druid there's another issue for that - #2242

@young-druid
Copy link
Contributor

Oh, right. Thanks for that. I put my comment with a link about hashing in Java there.

@phpnode
Copy link
Contributor

phpnode commented Aug 4, 2014

@lvca can this be part of 2.0? IMHO it's really essential and blocking one of Orient's most useful features.

@lvca
Copy link
Member

lvca commented Dec 9, 2014

Emanuele will continue documentation and tests in #3155

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests