Capturing plugin metadata for reproducibility #13

Open
bollwyvl opened this issue Nov 3, 2020 · 6 comments
Labels
enhancement New feature or request

Comments

@bollwyvl
Contributor

bollwyvl commented Nov 3, 2020

While I love the idea of extensible markup in the Jupyter ecosystem, it's gonna get gross, and non-reproducible, real fast if content just silently looks bad when plugins are missing.

As this doesn't seem forthcoming in the CommonMark spec, this repo should probably demonstrate an approach to:

  • instrumenting and capturing which plugins were actually used per render
  • storing this metadata
    • in the notebook, this seems like a top-level metadata field
        metadata:
          jupyter-markup:
            plugins:
              - footnote
              - deflist
        cells: []
    • in plaintext, presumably a comment syntax (gah!) could be added
        <!-- jupyter-markup: footnote deflist -->
  • providing some feedback (status bar?) when authoring/reading if plugins are missing
    • and how to get the missing plugins... this seems really hard to manage, especially considering...
  • demonstrating/testing headless rendering in nbconvert/jupyter-book vs in-browser content
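
A minimal sketch of the two storage options above, assuming the `jupyter-markup` metadata key and the `<!-- jupyter-markup: ... -->` comment form proposed in this comment. Neither is part of nbformat or CommonMark, and the function names are hypothetical.

```python
import re
from typing import Iterable, List

# Key proposed in this issue; an assumption, not an nbformat field.
METADATA_KEY = "jupyter-markup"

# Matches the proposed plaintext comment, e.g.
# <!-- jupyter-markup: footnote deflist -->
COMMENT_RE = re.compile(r"<!--\s*jupyter-markup:\s*(.*?)\s*-->")


def record_plugins(notebook: dict, used: Iterable[str]) -> dict:
    """Store the plugins used in a render under top-level notebook metadata."""
    meta = notebook.setdefault("metadata", {})
    # Sort and de-duplicate so repeated renders produce a stable diff.
    meta.setdefault(METADATA_KEY, {})["plugins"] = sorted(set(used))
    return notebook


def plugins_from_comment(text: str) -> List[str]:
    """Read plugin names from the first matching comment, if any."""
    match = COMMENT_RE.search(text)
    return match.group(1).split() if match else []
```

Sorting the list is a design choice for this sketch only: it keeps the stored metadata from churning in version control when plugins are hit in a different order.
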
@agoose77
Owner

agoose77 commented Nov 3, 2020

I agree that it would be very hard to raise an error when plugins are missing, as incorrect syntax and new syntax are the same problem seen from different perspectives. I think it might be something to embed in the notebook, because it would become very tedious to have to declare it for every MD cell. The tricky thing is that the same logic could apply to having notebooks declare which lab extensions they expect. I'm going to give this more thought myself, because there will be some reproducibility from the fact that we can define the Python packages that add the plugins as dependencies. Still, silent errors...

@agoose77
Owner

agoose77 commented Nov 9, 2020

EDIT: moved from the parent PR

I've been having some more thoughts about this @bollwyvl

Are you thinking that we store the IDs of the MarkdownIt plugins themselves, or of the JupyterLab Markdown Plugin Extensions? I was initially thinking of some kind of system where the MarkdownIt extension checks whether it has the providers for the requested MarkdownIt plugins, but this would tie the metadata source (e.g. notebook JSON or markdown header) to the implementation details (that we use MarkdownIt).
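
The provider check described above could look roughly like the following. This is a sketch only: the real extension is a JupyterLab frontend plugin, and `missing_plugins` and `status_message` are hypothetical names used for illustration.

```python
from typing import Iterable, List


def missing_plugins(requested: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the requested plugin IDs that have no registered provider."""
    known = set(available)
    # Preserve the order in which the document requested the plugins.
    return [name for name in requested if name not in known]


def status_message(missing: List[str]) -> str:
    """Build a short message suitable for something like a status bar."""
    if not missing:
        return "All requested markup plugins are available"
    return "Missing markup plugins: " + ", ".join(missing)
```
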

I was thinking that it would be better for the metadata source to request JLab extensions. If this were the case, at what point is it not better to generalise this approach and have notebooks be able to suggest which frontend extensions they expect? This wouldn't be a hard requirement, because people use all kinds of notebook frontends, and there are often different extensions that implement the same functionality. But, at least in JupyterLab, this might be quite useful, e.g. a notebook could state:

I need the following extensions:

  • jupyterlab-diagrams
  • ipympl

But, then we arrive at the point for JLab 3 where these extensions are already managed by the Python dependency management, and the whole thing becomes a lot simpler. I know that you can still load extensions with npm etc., but from the "good notebook workflow" perspective, the standard approach is to use a requirements.txt or conda environment.yml to capture Python dependencies; there is a precedent for reproducibility.

TL;DR, is it sufficient to implicitly capture md-it plugin requirements using the Python dependency manager?
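
To make the question concrete, here is a hedged sketch of what "implicit capture" buys a reader: it can only confirm that the distributions providing plugins are installed in the current environment. The function names are hypothetical, and any distribution names passed in would come from the project's own requirements file.

```python
from importlib import metadata
from typing import Callable, Iterable, List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def unmet_requirements(
    dist_names: Iterable[str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """List the distributions that the lookup cannot find."""
    return [name for name in dist_names if lookup(name) is None]
```

The `lookup` parameter exists only so the check can be exercised without touching the real environment. Note that this says nothing about which plugins a given document actually uses.
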

@agoose77
Owner

agoose77 commented Sep 1, 2021

I've thought more about this - with LSP integration, we can't rely on out-of-document information like pyproject.toml, and indeed, if those extensions are disabled, it would not be reflected in the LSP support. So, I think a per-doc metadata entry is needed.

See #40 (comment)

@bollwyvl
Contributor Author

bollwyvl commented Sep 1, 2021 via email

@agoose77
Owner

agoose77 commented Sep 3, 2021

Yes, I agree. The LS would need to understand the idea of configurable syntax options. The first step is to get this metadata into our documents, and from there we can look at the LSP side of things. I don't have time right now to work on this, but I'll pop back with more thoughts!

agoose77 added the enhancement New feature or request label Sep 3, 2021

@agoose77
Owner

agoose77 commented Sep 9, 2021

Alright, I opened a Discourse discussion on this topic here.

The TL;DR is that I wonder whether the notebook should store the MIME-type of the cell for non-code cells, so that information like the plugins we're using (but also, the markdown renderer for "normal" notebooks) can be read by clients of the notebook.
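
As a rough illustration of how a client might read such information, the sketch below assumes a hypothetical `mimetype` entry in cell metadata and a hypothetical `plugins` media-type parameter. Neither exists in the notebook spec today; only the parsing calls from Python's standard `email.message` module are real.

```python
from email.message import Message
from typing import List, Tuple

# Assumed fallback for non-code cells that declare nothing.
DEFAULT_MIMETYPE = "text/markdown"


def parse_cell_mimetype(cell: dict) -> Tuple[str, List[str]]:
    """Split a cell's declared MIME type into its type and plugin list."""
    # "mimetype" is a hypothetical metadata field, not an nbformat one.
    value = cell.get("metadata", {}).get("mimetype", DEFAULT_MIMETYPE)
    msg = Message()
    msg["Content-Type"] = value  # reuse the stdlib media-type parser
    plugins = msg.get_param("plugins")  # hypothetical parameter name
    names = plugins.split() if plugins else []
    return msg.get_content_type(), names
```

A cell declaring `text/markdown; plugins="footnote deflist"` would then yield the base type plus the two plugin names, while a cell with no declaration falls back to plain `text/markdown`.
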

Although this benefits us as extension authors, it would also fix a hole in the notebook spec that has been apparent for some time.
