-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tree operations #71
Comments
That was a good read. I am all for tree operations. The basics of this model may be harded to implement but the advantages you described are very real. |
+1 |
Btw, talking about terminology, let's use the right terms and not start confusion. We're always talking about "Operational Transformations" (not "operations") and no matter if we have linear data or a tree, we're always talking about OT. There is no default set of operations in OT. The operations themselves are application specific and in our case they serve as tree operations, that's the only difference. Our operations would still be merged/combined, ofc. For example, two "add attribute" operations may result in a single one. Just like the insertion followed by the removal of the same node. But ofc, most of the operations will not be mergeable, which doesn't bring much limitations anyway. As a final comment, I have the feeling that conflicts will be better handled by using tree operations, especially when dealing with node splitting and merging. Well done @pjasiun! |
+1. We may find some troubles, but it seems that it will be easier to understand them with tree operations than linear operations. Tree operations will be a better experience for us and for end developers and even if we won't be able to solve some edge case situations it may be a win for us if we'll be able to easier solve the most common ones. |
I found an insightful paper on OT using tree data structure: http://www.collide.info/Lehre/SeminarWS0405/DavisSunLu02.pdf. It basically confirms what we felt - that OT should be doable on a tree if we stop using retains and start using path-addresses. This might not be the best way to implement collaborative editing but at least it is something that we can stick with for now. It's documented and we can be sure that we will have a solution. Unfortunately, we will stil be facing problems described by PJ:
For a note, the complexity of combining a set of operations will be This may feel like a lot, but honestly, If we really want Another issue I can see is that the Also it is not clear to me whether Then we will have even more complex operations like Last, but not least, it also not clear to me if undo will work as easy as in linear OT. As for now, I am ready to start implementing those algorithms in JavaScript, however... Before I have found and read the mentioned paper, I was also trying to explore a different approach that is both similar to what PJ described at the beginning of his first post (all user's operations) but is fundamentally different. The reasoning behind it was this: combining operations that are strictly connected with how we model document structure and how DOM works is difficult. When user presses enter, we know that we have to split the node, which we do by creating a node with the same type and attributes and then we move some stuff from the first node to the created one. We can implement this using: The idea is to focus on user's intensions rather than effects of those intentions. So, instead of resolving |
I have doubts about the use of custom elements for the user's intentions, but in general it looks like we have a good understanding about the proper way to proceed now. So let's move on. I'm also curious about move being a delete/insert pair or not. If it is an atomic operation we don't have to face the garbage collection issue, otherwise it'll be another problem to solve. In the other hand, using delete/insert for it simplifies the amount of operations and logic we have to deal with. So maybe this approach may the first to take. Btw, talk about potential performance issue, let's not be limited to simultaneous operations performed by different users. The most complex situation will be Undo/Redo in collaborative editing, where one should undo only their own operations. For example, having user A and B, we may have a undo stack like this: A1, A2, B1, B2, B3, B4... B100. Then A decides to undo A2. What we expect to happen then? I expect the inverted version of A2 to be transformed by all the 100 subsequent Bn operations and then applied to the document. |
On undo: Yes, that's what would happen. I will think if this can be optimized in any way. |
On intensions: These custom elements won't be applied to DOM. That would be just a way to mark user's action in our simplified tree structure. I don't know if you understood that correctly. Those custom elements would be then resolved to "real" operations, but only after all user's actions are combined. That would be just a way for easier combines and custom nodes won't be used anywhere else. I should stick with "node" and not write "element". For intensions-based approach I had (and still have) some hopes that this structure could be linear, or linear-like and maybe we would be able to merge two following operations. |
You're proposing dirtying up the data with "temporary" markers. Maybe it is fine, as long as the data get "locked" during processing and the markers get removed once completed. Still it sounds like hacking, not the right way to do it, but it is just a feeling. Anyway, I'm not voting against nether in favor of it. |
Back to the tree-approach and undo. There is one more problem I can see. Suppose that we delete a node from a tree and then want to undo that operation. If we removed that node from our structures completely we cannot do it. So we will have to keep everything in the structure, even if we removed it a long time ago. I don't know how problematic this can be. As far as transforming by 100 Bx operations is concerned, maybe we will be able to somehow optimize it by providing nodes to operations instead of addresses. Or by providing addresses but saving in memory which node that was. Those |
I assume that the operations stack will save the inverted operation as well... so a delete operation will save an insert operation with a copy of the node that has been deleted. So we don't save it in the data structure. |
Yes, that will happen. We will save a reference to that node with the operation. By "saved in the structure" I mean that it is just saved as a reference somewhere but, of course, it will be detached from the structure. This is dual/similar to what we often do in DOM. The bottom line is that we will have to keep those references "forever". |
BTW. If we create inverted operation at the same time as original operation, maybe we could transform it when we get a new operation from the server? So instead of doing 100 transforms after user uses undo, we will do a bit of work when we get a new operation. This will make our script run twice as long when we get a new operation from server, but undo will be as fast as normal operation. This is a wild thought though, I didn't think it through :) |
Nah... but then you'll have to this for all ops in the undo stack and for every new operation that arrive, while the other way we do it only when undo is requested (on demand). |
When we replace addresses with nodes (and offset if needed) I think we will be able to quickly make undo for From what I understand, in our approach, attributes are mostly connected with styling of characters (so we keep there info about font style,size,weight,family, text-decoration, etc.). In CKE4 it is impossible to attach such styles to block element (let's say So, back to the problem, sometimes undoing attributes change doesn't really make sense anymore. Let's say you have If we saved our original I tried to see how Google Docs handles it, so I typed some text and bolded a part of that text. Then, on another browser I moved part of the text (D&D) somewhere else. When I undone bold only not-moved part of text got undone. So I tried and moved the other part (on second browser). Then, when I tried to undone bold, nothing happend (Google had an "empty" undo step). |
Well, in that case I think we're simply doing with conflicts, which will definitely happen. For that, we must define a standard resolution pattern which will define the behavior. Much probably that's what's happening in GDocs... a conflict is hit and nothing happens. In any case, much probably we're talking about operations on ranges, which boundaries will get updated during transformation. If the final boundaries will point to an existing range, the undo operation will get applied there. Otherwise, a conflict would be identified and much probably nothing would happen. |
That is a good new, that some papers confirmed tree-operations model. Definitely the final document need to be clear and simple. However, I agree that at some point we will need to keep user intentions for better collision handling. For example wrap and split are in fact sets of one insert and multiple move operations. But when you think about merging them they should act differently. We can handle this problem in the several ways: we can have a meta data for operations or have special operations for split or wrap. We can add markers in the document (for the time of the algorithm) or we can keep these markers in some external data structure. But these are implementation details. Not that important now, when we need to bring the basics, though it is good that we have some ideas future problems now. In the first implementation we need an universal solution. We need to be able to handle all types of collisions, somehow. Then we can start thinking how to handle some special cases better. About the idea of move being insert + delete operation. The question is if operation keep the node or its copy. On the one hand if we delete an element which has a lot of children we need to keep the whole detached sub-tree in the delete operation to be able to insert it again. Making copy of it may be the waste of time and memory. On the other hand, if the insert operation contain the node which will be inserted into the document and then another operation add some attribute to this node the original operation will contain node with the attribute what may be a reason of the mess. And more changes in that node (attributes, name, children, parent) means more mess. The insert operation should keep a copy of the node to avoid such problems. If the insert keeps the copy it will not be that easy to notice that the pair of insert and delete is the move operation. There is one more problem with keeping nodes instead of positions. If you add the bold attribute on the range 0-5 which is
It is a better behavior then leaving And one more topic. That is true that if we want to undo 3 operations in the history we need to combine them with all later operations. But if we combine 3 operations with 100 it gives us 300 transformations. Note it is less then combining 20 operations with another 20. And it grows linear. However, there are some potential mechanism to optimise such collaborative undo. For example we could normalize operations to keep only minimal number of operations needed to transform document from the last our change to the next one - in fact types and sources on the remote operations are not important. But to start thinking about such optimization we need to have a problem. I do not know if 300 operation will be a common case and if calculating it will be noticeably long. I do not think we should over-engineer collaboration editing on this stage. The most important is that we are going in the good directions and we have some ideas for the potential problems. Now we need to code basics and start solving real issues. |
Are you sure that keeping copy of the original node with There are two things that we do with operations:
|
I changed my post from "reference" to "copy" -- I make a comment to highlight this in case you are reading the post in your email box, as this changes the whole meaning of the post. |
OK, PJ explained me how it can be messy. Though we can't provide real example, it has a bad feeling, i.e. when you inserted node |
I've implemented algorithms from the paper to see how fast they work and if implementation is difficult. Here you can play with the implementation in testing environment: https://github.com/ckeditor/ckeditor5-design/tree/op-trans Note, that not everything may work correctly, but from what I've tested it is pretty good. And what interests us most is speed and AFAICS it is great. |
For starters: tl;dr so, bear with me :) |
@pjasiun Hi, Before moving with Tree structure i suggest trying to solve "classic table OT problem", if you will. The problem is:
The result should be a consistent table with +1 row and +1 column. Also in case of DOM OT editor must guarantee (i guess CKEditor does) that DOM representation is independent and consistent across different browsers. Another option I was researching recently is to have JSON (you can peek at https://github.com/ottypes/json0) for top-level structure (define paragraphs and other important blocks) and use rich-text based OT (like https://github.com/dmitryuv/otdartlib/) for working with formatted text inside these blocks. Additionally, tables will require special 2-dimensional operations to handle & resolve concurrency. |
This is why I do not talk about DOM OT, but about Tree OT which will transform the Document Model. This Model will be then transformed into the DOM, but Document Model is abstract and browser independent.
In fact there are more collisions which are hard to solve without the context, without knowing the user's intention. One of the solutions may be adding some meta information about the change (change is a set of operations). So the change will be a set of atomic "insert element (cell)" operations, but on the other hand it will have a meta information that it is "insert row". Thanks to this "labeling" a conflict with the "insert column"-change will be handled in the special way and other while other operations will be transformed in the standard way. Thanks to this, we won't need to resolve every conflict separately. (We won't have to write a separate logic for transforming insert-row against change operation or remove operation). This is just one of ideas. Another is to represent table in Tree Data Model in a different way than it is in DOM. So the table would be a node that has attributes rows and cols. Then each cell that has some content is represented as a child. Empty cells won't be nodes in Tree Data Model. This is convenient on Tree Data Model side but may be difficult to handle in real dirty-DOM operations.
I talked with @fredck last month about such solution and I do not like it because tree (structure) operations and text operations are not totally separated. You need to solve cases like split node, moving part of the text or inserting tree structure into the text (inline widgets). I am afraid that the final solution would have too many pairs of operations to handle. But I agree that there are advantages of such structure too. |
We already checked your repository last week. It is very interesting but, to be honest, it is hard to said how it will fit CKEditor 5 needs at this stage. |
Agree, because of this we actually use linear rich-text OT right now with placeholders that can hold widgets (a subdocuments that can use different OT type, like JSON). This creates a set of limitations for the possible document complexity but easier (and requires less resources) to handle. I have working prototype for rich-text editor based on CodeMirror because it have most powerful API to watch for changes and manipulate content. It supports basic formatting, lists, images & widgets. But lacking of spell checker is a deal breaker for going to production, so I was very encouraged seeing such discussion around CKEditor. Also I know guys who work on office suite and collaboration based on OT (they also mentioned they use tree-based structure). I'm going to reach them to find out if they're willing to share more details. |
I updated https://github.com/ckeditor/ckeditor5-design/tree/op-trans with complementary set of algorithms and tests that uses node+offset as an address. Compared to root+path addressing the new operations are much simpler when it comes to transforming them, there are also less edge-cases. They will probably also be easier to handle when creating features/plugins. On the other hand we will still need some arbitrary addressing, like root+path, if we want to pass operations between client and server, because we have to pass strings. So the node+offset addressing would be for internal use. I have to think if conversion from root+path to node+offset will make problems when passing operations between client and server. Another option is keeping some kind of nodes' IDs but it seems bad... |
As far as I understand, we will have have a trouble with converting those two address notations. Say we have a document in state A and we create and apply operation O that changes document structure to A'. Then, if we get a remote operation with root+path address that applies to document in state A we will get wrong node if we convert root+path to node+offset using structure A'. So we would have to remember structure A. It won't be that difficult because, AFAICS we would need just one structure remembered - the one that is last known structure of the document on the server. But it is some complexity gain. As I proposed earlier, we could create a unique ids for nodes during their creation. Then, if we send operation to the server it would be something like Node+offset notation has its merits (mostly simplicity when it comes to defining OT algorithms for typical operations like |
Due to various issues connected with tree-operations on ranges we will introduce a different approach to removing (ranges of) nodes from data model. tl;dr: Problem: Similiar problem occurs when User A sets Solution: Since those simpler ideas aren't perfect I revisited the "artifical" nodes idea but went a step further. Most issues occur when User A does something and User B removes a range of nodes. Instead of introducing "artificial" parents for removed ranges I propose creating a special root (let's call it graveyard root). This root will be one the same "level" as main document root but won't be rendered. In fact, it will serve as a store for removed nodes. So, the core change introduced by this idea is that Thanks to this idea we achieve a few things:
Since the new idea is pretty ground-breaking a lot of code has to be changed. But this new soultion to removing nodes is bringing many advantages and it is profitable to introduce it. Right now I estimate that there is around 50% of work done in this matter. We are also keeping the solution that one operation when transformed against another operation may, in some special cases, produce two operations. For example, when we have a text |
I pushed branch
While writing tests I documented all the cases from tests to have a reference for myself, to see if I haven't omitted anything and if cases from tests are described and coded correctly. It does not mean that everything is 100% correct for sure, but those scenarios should work fine. At the same time, it's a documentation of how editor contents will be transformed when different scenarios happen so it is a description of collaborative editing UX. It's also a short summary of our OT research. I'll just post that stuff here. Scenarios are mostly in the order from the code (except of |
I was about to implement
document.applyDelta
to apply linear operations:I was creating tests for them to apply basic operations we will meet using these operations:
Then I started to think how to handle errors during
applyDelta
, for example if delta contains opening tag but no closing tag or opening and closing tags cross exiting tags in wrong order? We can prepare correct deltas (set of operations), but we can not be sure that combinations of them are correct. We would need to be able to revert part of the delta which failed, so we need to record what tree modifications were done.Then this thought appears: what if we use tree operations at the first place?
Definitely we can not have a big set of operations. When it come to combining operations we need to handle all pairs of then. And even if the result of the combination will be simple (one user split text node, another inert an image into that node, it is pretty simple to find where the image should be inserted after the splitting), considering 12 operations listed above we would need to handle 12^2 = 144 combinations (they are not necessarily commutative when it comes to collisions), it would be crazy. Fortunately we do not need so many operations, 5 would be enough:
Having tree operations, the positions of the node can be represented as a path (first child of the second child...) instead of the offset. It means there is not reason to keep the linear data structure, everything can be a tree, where characters are leaves.
Using these operations we can implement all of the higher level operations.
To compress operations we can make them working on ranges, not only singular nodes:
What are advantages of having tree operations:
added/removed/modified
by the change will be simple a normalize list of operations. There will be no linear data to iterate and not separation collection for reverting delta. So we will have 2 data structures.And two, in my opinion most important advantages:
But not being too optimistic it is not an universal solution for all our problems:
In case of troubles with combining changes during the collaborative editing it should be possible to transform tree operations to the linear operations (add child node is a set of retain and insert operations) combine linear operations and transform result back to the tree form. But it may not be very efficient.
I believe using tree operations simplified our document model. We will better understand what is going on, so we will be able to better and faster solve problems we will, definitely, meet.
The text was updated successfully, but these errors were encountered: