Code relying on the method make_bookmark_tree doesn't work since release 53.0 #1428

Closed
vicmion opened this issue Aug 26, 2021 · 5 comments

@vicmion

vicmion commented Aug 26, 2021

Hello,

I just noticed that the make_bookmark_tree method of the Document class was dropped in release 53.0.
I was relying on that method to automatically generate an index table at the beginning of a document.
Is there a way to make this work with the new version of WeasyPrint?

I took a look at this issue #1420 but couldn't find a solution.

Thanks

@vicmion vicmion changed the title Legacy code relying on the method make_bookmark_tree doesn't work since release 53.0 Code relying on the method make_bookmark_tree doesn't work since release 53.0 Aug 26, 2021
@EugenMayer

@liZe we depend heavily on make_bookmark_tree because of the way we generate our bookmarks.

You might remember #1121 (comment); we run this in production.

We cannot upgrade to 53.0 because of this issue. What we do is:

   bookmarks = document.make_bookmark_tree()
   toc_inject_page_numbers_for_bookmarks(bookmarks, root)

import hashlib

def toc_inject_page_numbers_for_bookmarks(bookmarks, root: Element):
    """Traverse all given bookmarks and their children and, for each bookmark,
    find the referencing TOC table entry and update that entry with the
    page number from the bookmark.
    """
    for title, (page_number, _, _), children, _ in bookmarks:
        title_ident: str = "toc-%s" % hashlib.md5(title.encode("utf-8")).hexdigest()

        # There should usually be only a single match: the TOC entry that
        # references the bookmark / header in the content. We identify the TOC entry
        # by an MD5 hash of the title instead of the title text, which avoids
        # issues with special characters/escaping in the XPath expression.
        toc_entry = root.find(".//body/div[@class='content content-toctable']//a[@id='%s']//span[@class='toc-entry-page-number']" % title_ident)
        if toc_entry is not None:
            real_page_number = page_number + 1
            toc_entry.text = str(real_page_number)

        toc_inject_page_numbers_for_bookmarks(children, root)
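
For readers trying to reproduce this, here is a minimal sketch of the other half: producing TOC entries whose ids match the lookup above. It uses only the standard library. The helper names (toc_ident, build_toc_entry), the "toc-entry-title" class, and the tiny document skeleton are hypothetical and only for illustration; the "toc-" + MD5 id scheme, the "content content-toctable" div, and the "toc-entry-page-number" span come from the snippet above.

```python
import hashlib
import xml.etree.ElementTree as ET


def toc_ident(title: str) -> str:
    """Return the anchor id for a TOC entry: 'toc-' + MD5 hex digest of the title."""
    return "toc-%s" % hashlib.md5(title.encode("utf-8")).hexdigest()


def build_toc_entry(title: str) -> ET.Element:
    """Build one TOC link with an empty page-number span (hypothetical markup)."""
    link = ET.Element("a", {"id": toc_ident(title)})
    ET.SubElement(link, "span", {"class": "toc-entry-title"}).text = title
    ET.SubElement(link, "span", {"class": "toc-entry-page-number"})
    return link


# Hypothetical minimal document: <html><body><div class="content content-toctable">
root = ET.Element("html")
body = ET.SubElement(root, "body")
toc = ET.SubElement(body, "div", {"class": "content content-toctable"})

# A title with quotes and an ampersand, which would be awkward inside an
# XPath string literal; the hashed id contains only hex digits.
title = "Intro & 'quotes'"
toc.append(build_toc_entry(title))

# Same lookup as in the function above.
entry = root.find(
    ".//body/div[@class='content content-toctable']"
    "//a[@id='%s']//span[@class='toc-entry-page-number']" % toc_ident(title)
)

# Bookmark page numbers are treated as zero-based above, hence the + 1.
page_number = 0
entry.text = str(page_number + 1)
```

Because the id is a hash, two headings with identical titles would collide on the same id; whether that matters depends on the document.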

So we really rely on getting all the bookmarks from the PDF. Can we sponsor this? (We are a bronze sponsor/supporter.) Is there any chance of getting some work done on this issue?

Thanks

@vicmion
Author

vicmion commented Aug 27, 2021

I'm posting the solution I found. Hopefully it can help someone with the same problem.

bookmark_tree = document.make_bookmark_tree()

can be replaced with

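def make_bookmark_tree(document):
    # Collect (page index, label) pairs from the per-page bookmark lists.
    res = []
    for i, page in enumerate(document.pages):
        for bmark in page.bookmarks:
            # bmark[1] is the bookmark label; i is the zero-based page index.
            res.append((i, bmark[1]))
    return res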

bookmark_tree = make_bookmark_tree(document)

Note that, of all the information returned by make_bookmark_tree, all I needed was the label and the page number of each bookmark. Including more information should be straightforward.
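
As an illustration of "including more information", here is a rough sketch that also keeps the heading level and rebuilds a nested structure from the flat per-page lists. It assumes that bmark[0] is an integer heading level (1 for top-level headings); only the label at bmark[1] is confirmed by the workaround above. The function names and the stand-in document built with SimpleNamespace are hypothetical, and the returned shape is not the one the original make_bookmark_tree returned.

```python
from types import SimpleNamespace


def collect_bookmarks(document):
    """Flatten per-page bookmarks into (level, label, page_index) tuples."""
    flat = []
    for page_index, page in enumerate(document.pages):
        for bmark in page.bookmarks:
            # ASSUMPTION: bmark[0] is the heading level; bmark[1] is the label.
            flat.append((bmark[0], bmark[1], page_index))
    return flat


def nest_bookmarks(flat):
    """Nest the flat list into (label, page_index, children) tuples by level."""
    root = []
    stack = [(0, root)]  # (level, children list) of the currently open ancestors
    for level, label, page_index in flat:
        # Close every open bookmark that is not a strict ancestor of this one.
        while len(stack) > 1 and stack[-1][0] >= level:
            stack.pop()
        children = []
        stack[-1][1].append((label, page_index, children))
        stack.append((level, children))
    return root


# Stand-in for a rendered document, so the sketch runs without WeasyPrint.
doc = SimpleNamespace(pages=[
    SimpleNamespace(bookmarks=[(1, "Intro", (0, 0), "open")]),
    SimpleNamespace(bookmarks=[(2, "Scope", (0, 0), "open"),
                               (1, "Usage", (0, 100), "open")]),
])

tree = nest_bookmarks(collect_bookmarks(doc))
```

Page indices stay zero-based here, so add 1 before printing them in a table of contents.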

@EugenMayer

@vicmion thank you for the workaround.

@liZe considering that this custom code operates on an internal data structure and we already have 3 people who need the make_bookmark_tree method, I would suggest implementing it inside the library, rather than having users reimplement it externally on top of internal data structures.

What are your thoughts?

@liZe liZe added this to the 53.2 milestone Aug 27, 2021
@liZe liZe closed this as completed in c51bc60 Aug 27, 2021
@EugenMayer

@liZe thank you for implementing that quickly! Awesome!

@grewn0uille
Member

Hello!
So we finally re-added the make_bookmark_tree method, and it works like before :)
