Support atomic operation (separate from ACID) [moved] #1056

Closed
lvca opened this issue Dec 10, 2012 · 15 comments

@lvca
Member

lvca commented Dec 10, 2012

This is Issue 1056 moved from a Google Code project.
Added on 2012-09-13T18:10:57.000Z by tal.liron.
Please review that bug for more context and additional comments, but update this bug.

Original labels: Type-Defect, Priority-Medium, Release

Original description

OrientDB (as of 1.1.0) does not support atomic operations outside of full-blown transactions.

Transactions are a great way to enforce atomicity. Unfortunately, they do not work in clusters, and they generally add a layer of complexity and performance degradation that is neither needed nor desired for strict compare-and-set atomicity.

I suggest that small atomic operations are a necessary feature for a broad range of applications, and it may be a good idea to allow for them even when transactions do not exist. OrientDB's competitor, MongoDB, has several atomic operations that make it extremely useful:

http://www.mongodb.org/display/DOCS/Atomic+Operations

(Relational databases traditionally allow for limited atomicity for AUTOINCREMENT columns, even in replication, but this is a one-off solution for a very specific problem.)

Of course, we want this feature in OrientDB to be fast and optimal, and not involve locking either the class or the record (it should rely on OrientDB's MVCC implementation). The pervasiveness of atomic operations is one reason why MongoDB does not allow for multi-master replication. Obviously, adding this feature to OrientDB will involve thinking through the whole architecture.
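
To make the MVCC idea concrete, here is a minimal Python sketch of a version-checked compare-and-set on a single record. It is not OrientDB code: the `Store` class, its method names, and the `VersionConflict` exception are hypothetical, and the short internal mutex only stands in for whatever atomic swap the storage engine would provide.

```python
import threading


class VersionConflict(Exception):
    """Raised when the stored record version differs from the expected one."""


class Store:
    """Toy in-memory record store with a per-record version counter (hypothetical)."""

    def __init__(self):
        self._records = {}              # rid -> (version, fields)
        self._guard = threading.Lock()  # stands in for the engine's atomic swap

    def create(self, rid, fields):
        with self._guard:
            self._records[rid] = (0, dict(fields))

    def read(self, rid):
        with self._guard:
            version, fields = self._records[rid]
            return version, dict(fields)

    def compare_and_set(self, rid, expected_version, new_fields):
        """Write new_fields only if the record is unchanged since it was read."""
        with self._guard:
            version, _ = self._records[rid]
            if version != expected_version:
                raise VersionConflict(rid)
            self._records[rid] = (version + 1, dict(new_fields))
            return version + 1


store = Store()
store.create("#12:0", {"name": "Tal"})

# Read, modify locally, then write back against the version that was read.
version, fields = store.read("#12:0")
fields["name"] = "Luca"
new_version = store.compare_and_set("#12:0", version, fields)

# A second writer still holding the old version is rejected.
try:
    store.compare_and_set("#12:0", version, {"name": "stale"})
    conflict = False
except VersionConflict:
    conflict = True
```

A caller that hits the conflict would re-read the record and retry, which is the optimistic alternative to holding a record lock for the whole read-modify-write.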

I would suggest special extensions to OrientDB's SQL language to allow for a few atomic operations. For example:

DELETE ONCE * FROM Person WHERE banned = TRUE

The above would atomically delete these records, but also *return them* if any were found (atomically) due to the "ONCE" keyword, so it works like a SELECT query. This is equivalent to MongoDB's findAndRemove. An example of an UPDATE:

UPDATE ONCE Person SET name = "Luca" WHERE name = "Tal"

Again, the "ONCE" keyword will make it atomic and also return the record. As in MongoDB's findAndModify, it may be nice to be able to control whether you want the record before or after the update, so I suggest the following two variations:

UPDATE ONCE BEFORE Person SET name = "Luca" WHERE name = "Tal"
UPDATE ONCE AFTER Person SET name = "Luca" WHERE name = "Tal"

(The default should be BEFORE.)

Another important feature in these atomic operations is the ability to operate on one or many records. The default in SQL is many, but sometimes you just want one, so we can add the "FIRST" keyword:

DELETE ONCE FIRST * FROM Person WHERE banned = TRUE ORDER BY created DESC
@tliron

tliron commented Dec 2, 2013

Is there any plan to support this? Until this happens, I can't see OrientDB replacing MongoDB for any of my clients.

@lvca
Member Author

lvca commented Dec 2, 2013

What are your specific use cases? Are you using MongoDB like a queue?

@tliron

tliron commented Dec 2, 2013

My use cases are large data-driven web applications. It's simply impossible to program anything useful without atomicity. OrientDB supports transactions, but those are heavy and are also not supported in a cluster. The right way to go is atomic compare-and-set operations per document.

@lvca
Member Author

lvca commented Dec 2, 2013

This is relatively easy to implement with the following syntax:

... RETURNING before|after

"RETURNING" is taken from Postgres and indicates what to return. We could use before and after to choose which version of the record is returned. This could be supported by the UPDATE and DELETE statements. Any better names?

@andrii0lomakin
Member

Sorry for closing; I was actually going to add my 5 cents. To be atomic, an operation should be truly atomic: a server crash should not leave data in an intermediate state (right?). That will be introduced in #1604, which is the second task, and I am working on it now. Also, the new LINKBAG (a LINKSET that allows duplicates) and the document are two different instances, but they are updated CAS-style (MVCC actually, but I think we are talking about the same thing).

@lvca
Member Author

lvca commented Dec 2, 2013

@Laa you're right. That is fundamental to having truly ATOMIC operations. Here the focus is more on the SQL commands: they should return something so that the app can do the operation with just one call instead of executing SELECT + UPDATE / DELETE.

@tliron

tliron commented Dec 2, 2013

Well, structured document databases are a bit different from relational ones. Look closely at the options available to you in MongoDB: you can increment integers deep inside the structure, pull/push elements in arrays, and much more, as well as the regular compare-and-set. I'm honestly not picky about the syntax, as long as OrientDB can reach feature parity with MongoDB on this. I was just brainstorming some ideas that I thought could keep the SQL style and still do what MongoDB does.

My "ulterior motive" here is that I'm the developer behind Diligence, a very comprehensive web framework based on Prudence and MongoDB. It's JVM-based, and as such OrientDB could potentially be a very good fit.

@lvca
Member Author

lvca commented Dec 2, 2013

This is needed also to build persistent queue-like structures:

PUSH -> insert into ... set ...
POP -> delete from ... where ... limit 1 returning before
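
A rough Python sketch of the queue semantics above, using an in-memory stand-in for the class being queried. `RecordQueue` and its methods are hypothetical names, not an OrientDB API; the point is only that deleting the record and returning its "before" image happen as one step, so no two consumers can pop the same record.

```python
import threading
from collections import deque


class RecordQueue:
    """Toy persistent-queue stand-in: push inserts, pop deletes and returns."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def push(self, record):
        # PUSH -> insert into ... set ...
        with self._lock:
            self._items.append(record)

    def pop(self):
        # POP -> delete from ... limit 1 returning before
        # Returns the deleted record, or None when nothing matched.
        with self._lock:
            return self._items.popleft() if self._items else None


queue = RecordQueue()
for i in range(1000):
    queue.push({"id": i})

popped = []
popped_lock = threading.Lock()


def worker():
    # Each consumer pops until the queue is empty.
    while True:
        record = queue.pop()
        if record is None:
            return
        with popped_lock:
            popped.append(record["id"])


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a separate SELECT followed by a DELETE, two consumers could read the same record before either deletes it; folding both into one call is what rules that out.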

@tliron

tliron commented Dec 3, 2013

You're absolutely right, but queues are just the tip of the iceberg of what's needed.

@lvca
Member Author

lvca commented Dec 3, 2013

@tliron, could you explain all your use cases, so we can be sure to cover them with one issue? I mean, what kinds of atomic operations do you need?

@tliron

tliron commented Dec 3, 2013

I will do my best. Quick update: the MongoDB documentation has changed a bit, and you can see all the atomic operators here:

http://docs.mongodb.org/manual/reference/operator/update/

Pay special attention to the "$" placeholder, which can do some very powerful things:

http://docs.mongodb.org/manual/reference/operator/update/positional/

The truth is that I use practically all of these operators, a lot! The general principle is this: in the web world, all user requests can happen concurrently, so you must always be prepared for handling multiple conflicting requests without losing data or duplicating it. Here are some examples from Diligence:

Diligence Documents/Wiki: Two people may edit a wiki page at the same time. All versions of the wiki page are stored in the same document. Via a "$push" operation, you can add an element to a JSON array atomically, thus making sure that if a different user edits the wiki page at the same time, no version is lost.

Diligence Notification: This is a classic queue as you noted. Notifications are added to one or more queues and are then atomically removed via a MongoDB findAndRemove operation.

Diligence Serials: A service used by other services to generate unique IDs. Relies on the "$inc" operation.

Diligence Authentication/Registration: An "unauthorized" document is created for users when they first sign in, with another document specifying a special unique token (which expires after a time). They will be sent an email with a link they must click to authorize the account, based on that token. Atomic operations are used here to make sure that the account is only authorized once, and that the token is removed once. Something similar is done with the external authentication providers (OpenID, Facebook, Google, etc.)

Diligence Discussion: This service lets you add a mini-forum to any document, where people can post comments and reply, so it is built like a tree that can handle concurrent posts. Perhaps this is the strongest example for when atomic operations are absolutely necessary, but it's also quite complex. I do various tricks to support a tree format with MongoDB, which is a special problem field. Actually, OrientDB would offer graphs, which could be a better approach, but not necessarily: the big advantage of having it as a single document is for extremely fast fetches.

Diligence Shopping Cart: If you are adding to the same document concurrently, you must be sure to account for all simultaneous changes, which can only be done atomically.

And that's just in basic Diligence: applications built on top of Diligence of course do a lot of other work that would require atomicity.
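
The wiki and serials cases can be illustrated with a small deterministic Python sketch. `Document` and its `push`/`inc` methods are hypothetical stand-ins modelled loosely on the "$push" and "$inc" operators mentioned above; they are not a MongoDB or OrientDB API. The first half replays two editors doing read-modify-write on the same snapshot, the second half uses per-field atomic operations instead.

```python
import copy
import threading


class Document:
    """Toy document with whole-document overwrite and per-field atomic operations."""

    def __init__(self, fields):
        self._fields = copy.deepcopy(fields)
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return copy.deepcopy(self._fields)

    def overwrite(self, fields):
        with self._lock:
            self._fields = copy.deepcopy(fields)

    def push(self, key, value):
        # Append to an array field in one step (in the spirit of "$push").
        with self._lock:
            self._fields[key].append(value)

    def inc(self, key, amount=1):
        # Increment a numeric field and return the new value (in the spirit of "$inc").
        with self._lock:
            self._fields[key] += amount
            return self._fields[key]


# Read-modify-write: both editors start from the same snapshot.
page = Document({"versions": ["v1"]})
alice = page.read()
bob = page.read()
alice["versions"].append("alice's edit")
bob["versions"].append("bob's edit")
page.overwrite(alice)
page.overwrite(bob)          # silently discards alice's edit
lost_update = page.read()["versions"]

# Atomic push: each edit is applied to the current state of the document.
page = Document({"versions": ["v1"]})
page.push("versions", "alice's edit")
page.push("versions", "bob's edit")
atomic_push = page.read()["versions"]

# Serials: every caller gets a distinct ID from the same counter.
serials = Document({"next": 0})
first_id = serials.inc("next")
second_id = serials.inc("next")
```

In the first half the last writer wins and one version disappears; in the second half both edits survive regardless of the order in which they arrive.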

@ghost ghost assigned lvca Feb 2, 2014
@lvca
Member Author

lvca commented Feb 2, 2014

What we can support is the RETURNING keyword, which lets DELETE and UPDATE commands return the record before or after the change. Example to pop a record:

DELETE FROM Queue ORDER BY date LIMIT 1 LOCK record RETURNING before

@lvca
Member Author

lvca commented Feb 2, 2014

Ok, changed RETURNING -> RETURN, which accepts:

  • count -> default, as in versions before v1.7. Returns the count of updated/deleted records
  • before -> returns the records before the update/delete
  • after -> returns the records after the update. Not available on delete
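
As a sketch of what each RETURN mode would hand back to the caller, here is a single-threaded Python model over a plain list of dicts. The `update` and `delete` functions and the `return_mode` parameter are hypothetical and say nothing about how OrientDB implements the keyword; the model deliberately ignores locking and only shows the shape of the result.

```python
import copy


def update(records, predicate, changes, return_mode="count"):
    """Apply changes to matching records; return count, before images, or after images."""
    if return_mode not in ("count", "before", "after"):
        raise ValueError("unknown return mode: %r" % return_mode)
    before, after = [], []
    for record in records:
        if predicate(record):
            before.append(copy.deepcopy(record))
            record.update(changes)
            after.append(copy.deepcopy(record))
    if return_mode == "before":
        return before
    if return_mode == "after":
        return after
    return len(before)


def delete(records, predicate, return_mode="count"):
    """Remove matching records; 'after' is rejected because nothing is left to return."""
    if return_mode == "after":
        raise ValueError("RETURN after is not available on delete")
    if return_mode not in ("count", "before"):
        raise ValueError("unknown return mode: %r" % return_mode)
    removed, kept = [], []
    for record in records:
        (removed if predicate(record) else kept).append(record)
    records[:] = kept
    return removed if return_mode == "before" else len(removed)


people = [{"name": "Tal", "banned": False}, {"name": "Eve", "banned": True}]

before = update(people, lambda r: r["name"] == "Tal", {"name": "Luca"}, "before")
count = update(people, lambda r: r["banned"], {"banned": True})
removed = delete(people, lambda r: r["banned"], "before")

try:
    delete(people, lambda r: True, "after")
    rejected = False
except ValueError:
    rejected = True
```

Keeping count as the default means existing callers that expect a number see no change in behaviour.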

@kowalot
Contributor

kowalot commented Feb 2, 2014

Let me share my preliminary test results of that functionality, showing clearly how beneficial it is.
I ran a simple test that increments an INT field in ONE object 100000 times under different scenarios:
UPDATE #RID INCREMENT SALARY = 1 RETURN COUNT LOCK....

1 thread LOCK NONE (MVCC) - 2500 increments /sec
12 concurrent threads LOCK NONE (MVCC, repeating at remote client) - 329 / sec

1 thread LOCK RECORD - 2200 increments /sec
12 concurrent threads LOCK RECORD - 7200! / sec
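
For readers unfamiliar with the two modes, the difference can be sketched in Python as an optimistic retry loop versus a pessimistic record lock. This is an in-process toy with hypothetical names (`Counter`, `increment_optimistic`, `increment_locked`); it does not reproduce the benchmark or its numbers, and it leaves out the network round trip that each retry costs a remote client.

```python
import threading


class Counter:
    """Toy record holding one INT field plus an MVCC-style version."""

    def __init__(self):
        self.value = 0
        self.version = 0
        self.retries = 0                       # failed optimistic writes
        self._guard = threading.Lock()         # stands in for the engine's atomic write
        self._record_lock = threading.Lock()   # pessimistic lock on the record

    def _read(self):
        with self._guard:
            return self.version, self.value

    def _write_if_version(self, expected_version, new_value):
        with self._guard:
            if self.version != expected_version:
                self.retries += 1
                return False
            self.value = new_value
            self.version += 1
            return True

    def increment_optimistic(self):
        # LOCK NONE style: read, try to write, repeat on version conflict.
        while True:
            version, value = self._read()
            if self._write_if_version(version, value + 1):
                return

    def increment_locked(self):
        # LOCK RECORD style: writers queue on the record, so the write never conflicts.
        with self._record_lock:
            version, value = self._read()
            if not self._write_if_version(version, value + 1):
                raise RuntimeError("unexpected concurrent writer")


def hammer(increment, n_threads=12, per_thread=500):
    """Run the given increment function from several threads at once."""
    def worker():
        for _ in range(per_thread):
            increment()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


optimistic = Counter()
hammer(optimistic.increment_optimistic)

locked = Counter()
hammer(locked.increment_locked)
```

Both counters end at the same total; the optimistic one may burn extra attempts under contention, which is consistent with the 12-thread LOCK NONE figure dropping when every retry has to be repeated from the remote client.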

Nothing more to add :) I hope you will reconsider adding pessimistic locking to TX_COMMIT as well.
Cheers
Pawel

@lvca
Member Author

lvca commented Feb 2, 2014

These numbers are super good! We have issue #12 for that.

@lvca lvca modified the milestones: 1.7rc1, 1.7 Feb 5, 2014
4 participants