Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with the <table> abstraction in the model #6506

Closed
jodator opened this issue Mar 27, 2020 · 10 comments
Closed

Problems with the <table> abstraction in the model #6506

jodator opened this issue Mar 27, 2020 · 10 comments
Labels
package:engine package:table resolution:expired This issue was closed due to lack of feedback. status:stale type:bug This issue reports a buggy (incorrect) behavior. type:debt This issue describes a technical debt. type:refactor This issue requires or describes a code refactoring.

Comments

@jodator
Copy link
Contributor

jodator commented Mar 27, 2020

Issue description (from #6630)

Atomic converters works in on operation at a time basis. In order to work out some of the above constrains they calculate operations based on the model state. This fails if more than two operations happens on a table.

Consider changing header rows from 2 to 1. The downcast converter:

  1. Takes second model row H2 (because it is at index which is after a heading section) finds its view element and move it to <tbody> section.
  2. Taking model state calculates which view cells requires renaming to th.

Now consider removing multiple rows (H2 and B1). The RemoveRow command will:

  1. Remove H2 row.
  2. Change heading rows to 1.
  3. Set selection in B2.

Now this will fail because heading rows converter sees the same attribute change and will assume that table is valid. But what this converter will do is:

  1. Take the second model row (which is now B2 ) and move it's view to <tbody>.

After table's attributes change conversion the remove row converter will step in. Removing elements from model is done on positions (offset in parent). So it will:

  1. For a removed row at given position it will get model parent (table).
  2. Get view row at an expected offset (let it be 2).
  3. As the view has been modified (wrongly) it will find row B2 to be removed instead of H2.
  4. 💥 The model to view mapping is broken as view for B2 was removed instead of H2.

So the spotted problems are:

  1. Using model in downcast conversion can cause problems. The fix for changing header rows looks quite simple (I've almost got it working in I/6437: Fix - the table breaks after removing a last header row ckeditor5-table#301).
  2. Table model and its conversion is hard to operate and figure out side effects. Downcast converters are too simplistic and relays on model state / single change.
  3. It may encourage developers to use enqueueChange() - I don't see other option for the problem described here. The thing with enqueueChange() is that run conversion on each step so it acts like step-by-step conversion but it looks like more work is done than necessary.
  4. I don't like the headingRows conversion. One idea is to have <tableRow isHeading="true"> - this will require some other checks (no isHeading gaps have to be checked) than now but conversion "should" be simple for this case. And I don't see how we can make similar change with columns.
  5. I didn't check the columns heading conversion and its problems...

Original issue description

It is rather a general list of spotted problems with current table abstraction which might be unsolvable or would require much work.

  1. No table sections in the model (<tbody>, <thead>)

WIP - the list should be updated with other issues.


Ad. 1 - No table sections in the model

After introducing table selection operations on table started to be more complex than anticipated when creating table conversion. The problem is that we do not have header or body in the model to make model simpler while HTML <table> has them and those sections have some constrains. One of them is that a table cell cannot be spanned with cells from body section. This leads to various calculation when changing "simple" attribute of a table: headingRows. This worked surprisingly well with simpler operations (ie remove 1 row, change headings count) or was a bug al along which didn't surfaced.

The problem with remove row command (#6391) is that it removes multiple rows and changes headingRows in one action. The headingRows attribute downcast works on moving rows around and it worked with one row being removed. It fails beautifully when more than one row is removed. This is because by default the table attributes are converted before removing rows. Thus this operates on different set of data than and should be re-written completely to anticipate this.

The easy fix for this was to make headingRows being downcasted after table rows removal with model.enqueueChange. The other way of dealing with this could be always re-creating body and head sections and moving rows to them but it will probably fail in most scenarios with removing rows and changing heading rows count (ie algorithm could move to head rows being removed and thus creating invalid view state).

The above leads to a my conclusion that not having 1:1 view to model projection requires from us handling some stuff carefully. IE headingRows operations in the model must be done always after modifying table structure because conversion must operate on target table state (number of rows) and we do not have coversion API to delay some operations rather then using model.enqueueChange() . This is a bit unfortunate as it tangles model and the view in an implicit (not obvious) way.


If you'd like to see this fixed sooner, add a 👍 reaction to this post.

@jodator jodator added type:bug This issue reports a buggy (incorrect) behavior. package:table labels Mar 27, 2020
@Reinmar Reinmar added package:engine type:refactor This issue requires or describes a code refactoring. labels Mar 27, 2020
@Reinmar Reinmar changed the title Problems with the <table> abstraction in the model. Problems with the <table> abstraction in the model Mar 27, 2020
@Reinmar
Copy link
Member

Reinmar commented Mar 27, 2020

I always wanted to make the conversion base on as simple mechanism as possible. The complexity should be kept under the hood and should not leak to converters.

For non 1:1 mappings, the only reasonable solution I can see is doing a "reconversion" whenever we cannot handle a model change with an easy 1:1 converter.

For instance, in case of tables, a table style can be converted with attrribute-to-attribute mapping. However, table heading rows count cannot. It should re-trigger conversion of the entire table.

Unfortunately, the conversion system went the other way around and we have a lot of abstraction leaking to converters making them incomprehensible. IMO, this must change because there's simply no other way around.

The issue that you fixed with change+enqueueChange is fixed in the command. But what about collaboration or undo? What about integration code and other features? There's a great chance that one of these change sources will try to modify the table too much at once.

@scofalik
Copy link
Contributor

My thoughts on this matter:

@jodator:

  1. On complex conversion. You will never get rid of the complexity fully. Maybe you will lower it, but a lot of it will be just moved elsewhere.
  2. On quick fix and using enqueueChange. No, this is incorrect use of enqueueChange. This method exists so that you are able to add a change to other batch than the current one (putting non-consecutive changes in one undo step) or to create different kind of batch (mostly transparent, to get around undo). It wasn't created to get around timing restrictions or to hack stuff. You'd be hacking stuff with a ticking bomb. I've already either just complained privately or created an issue that having enqueueChange somewhere in remove row command was messing up with track changes. Such hack introduced unpredictability. enqueueChange should be avoided. Especially, if there is a high possibility that it will be used in another change/enqueue change block. Command might be called by another script and then your enqueueChange messes up with it.
  3. Instead, why not simply calling remove row command n times, when n rows are selected? You'll get n undo steps, that isn't desirable, but this is a simple change that won't impact other places of code. This will get around your headingRows problem untill we will propose a clean solution.
  4. Is this a problem just with headingRows or cols are a problem too? I guess just rows, because of <thead>? How about introducing isHeader attribute for <tableRow>?

@Reinmar 

  1. I have mixed feelings about recreation. I am always afraid of messing up with parts of code that might depend on the view. Moving stuff around is fine but creating new instances of view elements is worse. Mappings mostly, maybe some UI states depending on view. Glitches like flickering may occur. You might lose changes which where done directly on the view (I remember a lot of problems with <span>/<p> conversion in table cell). The recreation process would have to be called on model-level, I think, so that differ would actually have an information about removing and adding everything, to be sure that mappings are correct, etc. This will also reinforce our stance on not doing any change on the view outside of the conversion process.
  2. However, I am sure if I agree with saying that conversion should be simple. I think that model should be simple, that post-fixers should be simple, that OT should be simple. This guarantees stability of our feature's base and separates problems.

Since we are talking about a bigger refactor, maybe we could think about totally different representation of a table? I've always thought that there must be a better way to do it. I've seen some discussions on Twitter in this area, too. I feel there's general agreement that there's something wrong with how tables are represented in HTML. Maybe instead of going more into HTML, let's get really outside of it? I wonder how spreadsheet applications represent their tables.

I wanted to write that:

But, in general, I think that we should simply delete most of the code we have and re-think how conversion works, how commands work, how post-fixers work. Start again with a better understanding of the feature and of how it fits our architecture. After all, I don't think that current model is really bad.

But after another thought I change my mind:

Of course, I don't have a full understanding of issues in tables. If you think that having 1:1 representation will solve some problems, then that's fine, let's go this way. We will move complexity into commands and post-fixers. If I wanted to propose removing a lot of code, we can remove the idea for the model as well.

@Reinmar
Copy link
Member

Reinmar commented Mar 27, 2020

I'm afraid we need to discuss this f2f :D

@jodator
Copy link
Contributor Author

jodator commented Mar 27, 2020

  1. Instead, why not simply calling remove row command n times, when n rows are selected?

Dunno why we went in this direction (I wasn't here ;P). It looks like a single undo step requirement - but this can be "fixed". I was thinking about the same - especially that we already have a pattern for passing a batch to commands. The only other thing is that it might be desired to have a command that works on multi-cell selection.

  1. Is this a problem just with headingRows or cols are a problem too? I guess just rows, because of <thead>? How about introducing isHeader attribute for <tableRow>?

This is a problem with moving stuff around (<tr> to <thead>/<tbody>) and doing it before changing table structure in the downcast conversion.
The isHeader might be a way to do it - dunno what other problems might occur here (single attribute change vs multiple flags on different elements).

Maybe instead of going more into HTML, let's get really outside of it? I wonder how spreadsheet applications represent their tables.

I think that the problem here is that HTML <table> have <thead> and <tbody> sections and we should support them (ps. there's <tfoot> also :D ). It might be a case that a spreadsheet doesn't need to have a heading as it is defined in HTML.

@Reinmar Reinmar added this to the nice-to-have milestone Apr 27, 2020
@Reinmar Reinmar added the type:debt This issue describes a technical debt. label Apr 27, 2020
@jodator
Copy link
Contributor Author

jodator commented May 26, 2020

I've updated the issue description with similar analysis from #6630. The conversation above was related to the "original issue description" (also in the first post).

@jodator
Copy link
Contributor Author

jodator commented Jul 7, 2020

Another problems listed there #6630 (comment) (dunno I keep posting there).

@jodator
Copy link
Contributor Author

jodator commented Aug 10, 2020

OK I had this thought that is either brilliant or completely dumb, but why we made editing so complex? The desired data must have <thead> and <tbody> sections, that's for sure. But WDYT about making editing as straightforward as possible?

What if we'd ditch <thead> and <tbody> from the editing view? We could bring back some atomic converters (for the performance reasons) while possibly removing the complexity of the headingRows manipulations?

Besides:

  1. it would be nice to have the same thing in editing as in data views
  2. CSS styling issues because of no <thead> and <tbody>

I don't see any real need for <thead> and <tbody> there, so the editing view might be just as well:

<figure>
    <div>...</div> <!-- UI -->
    <table>
        <tr class="heading-row">
            <th><p>Heding row cell</p></th>
            <th><p>Heding row cell</p></th>
        </tr>
        <tr>
            <th><p>Heading column cell</p></th>
            <td><p>Data cell</p>
        </tr>
    </table>
</figure>

Pros:

  • Simpler view structure - no need to handle moving <tr> between sections or no need to re-convert the whole <table>

Cons:

  • Might not fix the problem with multiple changes and re-rendering whole <table> might be still needed.
  • ...

@Reinmar
Copy link
Member

Reinmar commented Aug 10, 2020

I'd say this is a good last resort solution :D If we won't be able to handle this better on our side, then we can change the structure. However, it comes with its problems – the styles written for the content outside and inside the editor will need to differ. And there may be some a11y issues due to not using a full table structure. In general, as long as we're able to keep the structures similar we should do that.

@pomek pomek removed this from the nice-to-have milestone Feb 21, 2022
@CKEditorBot
Copy link
Collaborator

There has been no activity on this issue for the past year. We've marked it as stale and will close it in 30 days. We understand it may still be relevant, so if you're interested in the solution, leave a comment or reaction under this issue.

@CKEditorBot
Copy link
Collaborator

We've closed your issue due to inactivity over the last year. We understand that the issue may still be relevant. If so, feel free to open a new one (and link this issue to it).

@CKEditorBot CKEditorBot added the resolution:expired This issue was closed due to lack of feedback. label Nov 12, 2023
@CKEditorBot CKEditorBot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
package:engine package:table resolution:expired This issue was closed due to lack of feedback. status:stale type:bug This issue reports a buggy (incorrect) behavior. type:debt This issue describes a technical debt. type:refactor This issue requires or describes a code refactoring.
Projects
None yet
Development

No branches or pull requests

5 participants