Due to how Word works and how mammoth works, sometimes we define a particular class of element as a paragraph when it might also be a list.
Here's a good example:
| Word document styling | mammoth document HTML conversion |
| --- | --- |
| lots of lists | almost everything is a paragraph now |
What is happening mechanically is that I am defining style maps that map particular classes to HTML elements. However, some of these classes are very general (paragraph, normaltextrun, Default) and can apply equally to a paragraph or a list, depending on how the list is created.
I am not sure if there is a good way to:
- preserve the intent of the original Word doc, while
- being explicit about how to treat content that comes in
Something I've done in the past has been to ignore styles that are too general (like Default), but it feels risky to me because then we don't really know what it will come in as. However, if we can't do anything else about this, then that's probably the best thing to do on balance.
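To make the "ignore styles that are too general" option concrete, here is a minimal sketch of building a style map while skipping those names. The helper `build_style_map`, the `TOO_GENERAL` set, and the example mappings are all hypothetical and not taken from this project's actual style map; the only thing assumed from mammoth is its `p[style-name='...'] => element` rule syntax.

```python
# Hypothetical sketch: names and mappings below are illustrative only.
# Style names considered too general to map safely, because they can
# apply to both plain paragraphs and list items.
TOO_GENERAL = {"paragraph", "normaltextrun", "Default"}


def build_style_map(mappings, skip=TOO_GENERAL):
    """Return a mammoth-style style map string, omitting skipped styles."""
    lines = []
    for style_name, target in mappings.items():
        if style_name in skip:
            # Leave this style unmapped rather than forcing it to one element.
            continue
        lines.append(f"p[style-name='{style_name}'] => {target}")
    return "\n".join(lines)


style_map = build_style_map({
    "Heading 1": "h1:fresh",
    "Default": "p:fresh",  # dropped: too general
    "List Paragraph": "ul > li:fresh",
})
print(style_map)
```

The trade-off is the one described above: anything left out of the map falls back to whatever the converter does by default, so it is worth checking the conversion output for styles that were skipped.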
The idea is that we can actually identify list item elements
without a named style map, and instead convert them directly to lists.
The TL;DR here is that more lists will convert automatically.
I am using an undocumented but stable stylemap syntax for this.
More can be learned here:
- #142
- mwilliamson/python-mammoth#151
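
As a rough sketch of what matching list items without a named style can look like, the rules below use mammoth's `p:unordered-list(N)` and `p:ordered-list(N)` matchers for first-level lists. The constant name `LIST_STYLE_MAP` is hypothetical, and this may not be the exact rule set used in the change referenced here.

```python
# Hypothetical sketch: rules that match list paragraphs by list type and
# level instead of by a named Word style. Only level 1 is covered here.
LIST_RULES = [
    "p:unordered-list(1) => ul > li:fresh",
    "p:ordered-list(1) => ol > li:fresh",
]

LIST_STYLE_MAP = "\n".join(LIST_RULES)

# Usage, assuming python-mammoth is installed and "input.docx" exists:
#
#   import mammoth
#   with open("input.docx", "rb") as docx_file:
#       result = mammoth.convert_to_html(docx_file, style_map=LIST_STYLE_MAP)
#   html = result.value
print(LIST_STYLE_MAP)
```

Because these rules key off the list structure rather than a style name, they should apply regardless of which general style Word attached to the list paragraphs.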
I've opened up an issue in the mammoth repo to see if there is a better answer to this: mwilliamson/python-mammoth#151
Will update this issue once that one gets an answer.