fix lean syntax highlighting. #77

lovettchris · 2022-09-20T02:35:40Z

and fixes #10

The snippets now look nice:

But there's a problem with "lean, error" blocks, can we drop the "lean" language prefix on these?

david-christiansen · 2022-10-09T17:46:00Z

How is that code highlighting the Lean? It looks to me like it's using an inaccurate regexp-based highlighter. For instance, in head?, the ? is not highlighted as part of the name, which I think has a strong potential to confuse readers. Is there a place where we can adjust the regexps being used to highlight to make sure that it's up to date with modern Lean 4 conventions?

The formatting of the error blocks (and presumably also the info blocks?) is much better, though!

lovettchris · 2022-10-10T18:21:27Z

Yes it is using highlight.js from mdbook, which we customized adding a "Lean" section to the end of it, so yes, it is mostly a regex based syntax highlighter and this is the same technique used in

So if you find bugs in the syntax highlighting then we need to fix highlight.js for all these books. There's some question marks in https://leanprover.github.io/theorem_proving_in_lean4/interacting_with_lean.html which seem ok, so we'll have to investigate what is going on with your example...

david-christiansen · 2022-10-10T19:33:26Z

The link in question doesn't have any question marks in the same syntactic category that I see it in in your screenshot above - a name being defined with def. My guess is that something like [a-zA-Z][a-zA-Z0-9_]+ needs to have an extra \?char put in it. Fairly classic issue.

Precisely because a trailing ? is so unfamiliar to programmers who don't already know Scheme or Ruby, I think it's important to highlight that accurately before we merge the overall highlighting patch. I'm happy to do any fixups of Lean to lean needed as part of the merge (that's a super easy find-replace while resolving conflicts), but I don't think I have the bandwidth to debug the hightlighting JS, at least not until the next release is out.

I'd merge a PR that improves the display of info and errors immediately, that's a real clear win. And it's fine to drop the Lean prefix as long as that doesn't break the CSS (I think mdbook sticks different classes on the Markdown language indication than it does on the Markdown class indications, so we might need to write something like:

```output error

instead of

```lean error

which is another easy find-replace.

What would it take to use LeanInk instead as part of the build script for the book? I'm imagining something like processing its intermediate JSON notation to at least get accurate highlighting, even if the full experience is too much work to do. But if we can just grep the LeanInk HTML for code bits and it works to just inline those and add some JS at the top, even better! I definitely don't have time to do it, though.

lovettchris · 2022-10-10T20:18:04Z

Yes, totally agree about '?' being an unusual identifier char, confused me first time I saw it, I had to ask Leo about it. I think the LEAN_IDENT regex is wrong, and needs to include ? and not just at the end of the identifier:

def test?forteen? :=
  IO.println "yep"

#eval test?forteen?

I'll work on a fix to the highlight.js and ensure Leo and others are fine with the fix in the other books also so we have the same highlight.js used everywhere. I'll think more about your leanink question and post comment on that soon.

lovettchris · 2022-10-10T20:21:48Z

Note that "LeanInk" seems to get the colors wrong also, see https://leanprover.github.io/lean4/doc/monads/transformers.lean.html where def indexOf? highlights the ? in purple making out that it's something different from the identifier, which it isn't. So I'll work on a fix for this also.

lovettchris · 2022-10-10T20:47:35Z

Ok, I'm proposing this fix, we'll see what Leo and Sebastian say about this:
leanprover/lean4#1713

lovettchris · 2022-10-10T22:30:16Z

Ok, I just updated the highlight.js in this PR so it is in sync with the fixed version in Lean4, it does not change the formatting of meta variables because in Lean you cannot start an identifier with question mark, but it does fix things like this, in book/html/getting-to-know/conveniences.html

lovettchris · 2022-10-10T23:14:42Z

Ok, now back to your LeanInk question. First off yes, we should change the error outputs to language output error or something other than lean, except in rare cases where the lean snippet is invoking a lean pretty printer :-)

But LeanInk doesn't actually do syntax highlighting, it is alectryon that does that and it is a big complex python program for generating formatted markdown, but yes, you could write your book in .lean then pre-process that with alectryon (which calls leanink just to get the elaborated type information) and get a bunch of markdown which you then run through mdbook.

But all your python needs to run BEFORE alectryon to insert the code snippets! So I think it is too complicated to think about integration of alectryon right now for your book. I think long term it would be better to port all the python code in alectryon to Lean, integrate that into LeanInk, and then port all your python code to .lean and integrate that into LeanInk then we have one command line to rule them all, replacing mdbook :-) But that may take a while... but it is something we are talking about doing...

lovettchris · 2022-10-10T23:21:33Z

Note: with this "PR" you need to remember from here on out to use lower case lean on the triple back tick lean code blocks.

david-christiansen · 2022-10-11T19:35:03Z

This looks good!

RE Alectryon and LeanInk - I'm aware of how it works, my curiosity is more as to whether it would be plausible to replace the current build process:

Lean file -> Python data structure mapping named examples to code strings -> Markdown with substituted strings

with something like:

Lean file -> LeanInk JSON -> Alectryon HTML/JS -> Python data structure mapping named examples to code HTML -> Markdown with inline HTML

I suspect it would require truly fearsome regular expressions, or alternatively some smarter HTML matching library together with fearsome regular expressions. But it seems plausilble that it could work.

The other, simple alternative is:

Lean file -> LeanInk JSON -> Accurately syntactically highlighted HTML -> Python data structure mapping named examples to code HTML -> Markdown with inline HTML

This seems easy enough, if somewhat tedious to implement. I could do this build process today:

Lean file -> Emacs buffer with LSP-based accurate highlighting -> Accurately syntactically highlighted HTML -> Python data structure mapping named examples to code HTML -> Markdown with inline HTML

with about 10 lines of elisp and a bit more Python, but I don't want to include batch-mode Emacs in the build script :-)

In any case, I'll merge this for now, as it's a clear improvement over what we had, and I'll be watching for formatting issues in my next proofreading pass. Thanks!

lovettchris · 2022-10-11T19:35:09Z

The fixed highlight.js was just merged into the lean repo for the lean reference manual to use.

lovettchris added 4 commits September 19, 2022 19:34

fix lean syntax highlighting.

16d107e

fix: move highlight.js to the right place.

836e2f8

fix: add missing lean language on lakefile.lean snippet.

36fe10e

Merge branch 'master' into clovett/syntax_highlighting

cb03dc8

update highlight.js

b5e2535

Merge branch 'master' into clovett/syntax_highlighting

cf52ebf

david-christiansen merged commit 0cfb5bf into master Oct 11, 2022

david-christiansen deleted the clovett/syntax_highlighting branch October 11, 2022 19:35

david-christiansen mentioned this pull request Oct 11, 2022

Remove "lean" language from info and error fenced code blocks #86

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix lean syntax highlighting. #77

fix lean syntax highlighting. #77

lovettchris commented Sep 20, 2022 •

edited

Loading

david-christiansen commented Oct 9, 2022

lovettchris commented Oct 10, 2022 •

edited

Loading

david-christiansen commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022 •

edited

Loading

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

david-christiansen commented Oct 11, 2022

lovettchris commented Oct 11, 2022

fix lean syntax highlighting. #77

fix lean syntax highlighting. #77

Conversation

lovettchris commented Sep 20, 2022 • edited Loading

david-christiansen commented Oct 9, 2022

lovettchris commented Oct 10, 2022 • edited Loading

david-christiansen commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022 • edited Loading

lovettchris commented Oct 10, 2022

lovettchris commented Oct 10, 2022

david-christiansen commented Oct 11, 2022

lovettchris commented Oct 11, 2022

lovettchris commented Sep 20, 2022 •

edited

Loading

lovettchris commented Oct 10, 2022 •

edited

Loading

lovettchris commented Oct 10, 2022 •

edited

Loading