Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jupyterlab adds jupytext metadata, but then jupytext --sync removes it again #900

Closed
jli opened this issue Jan 4, 2022 · 9 comments · Fixed by #903
Closed

Jupyterlab adds jupytext metadata, but then jupytext --sync removes it again #900

jli opened this issue Jan 4, 2022 · 9 comments · Fixed by #903

Comments

@jli
Copy link

jli commented Jan 4, 2022

Use case

My team's use case: we edit .ipynb notebooks in Jupyterlab. We use jupytext to create companion .py files for code review.

Historically, we've relied on everyone installing jupytext locally to update the .py files after notebook edits. Now, we'd like to add jupytext --sync via pre-commit to ensure the files are kept in sync.

Issue

  • With jupytext installed locally, edits to the notebook via Jupyterlab also automatically update the companion .py file. jupyterlab/jupytext also adds a jupytext.text_representation dict to the notebook's metadata.
  • When jupytext --sync runs, it removes metadata.jupytext.text_representation from the notebook file again.
  • This causes problems when running jupytext via pre-commit: it updates the .ipynb notebook file to remove the extra jupytext metadata and then fails because the index is outdated:
$ git commit -a -m 'updated notebook'
jupytext.................................................................Failed
- hook id: jupytext
- exit code: 2
- files were modified by this hook

[jupytext] Reading test_notebook.ipynb in format ipynb
[jupytext] Loading test_notebook.py
[jupytext] Updating test_notebook.ipynb
[jupytext] Error: the git index is outdated.
Please add the paired notebook with:
    git add test_notebook.ipynb
[jupytext] Updating the timestamp of test_notebook.py
[jupytext] Reading test_notebook.py in format py
[jupytext] Loading test_notebook.ipynb
[jupytext] Error: the git index is outdated.
Please add the paired notebook with:
    git add test_notebook.ipynb
[jupytext] Updating the timestamp of test_notebook.py

Rerunning git commit -a works, since we add jupytext's change to the git index. However, this isn't ideal because it's confusing, and it's annoying to have to do this every time we edit notebooks.

Details

I'm wondering why jupytext is adding this metadata in one context (editing via Jupyterlab) but removing it in another (called directly with jupytext --sync). I believe this is the code that removes the metadata:

jupytext_metadata.pop("text_representation", {})
(call stack: cli.main -> cli.jupytext_single_file -> lazy_write -> jupytext.writes)

It also seems that this only happens when using --sync. If I use --from ipynb --to py instead, then the original notebook isn't written, which avoids this issue. I'm using this as a workaround for now.

Versions

This is happening with recent versions:

$ python -V
Python 3.8.12
$ pip list | grep -i jup
jupyter-client       7.1.0
jupyter-core         4.9.1
jupyter-server       1.13.1
jupyterlab           3.2.5
jupyterlab-pygments  0.1.2
jupyterlab-server    2.10.2
jupytext             1.13.5
@mwouts
Copy link
Owner

mwouts commented Jan 4, 2022

Hi @jli , thank you for reporting this. Well that is interesting! May I ask you to run this Python command?

from jupytext.config import find_jupytext_configuration_file
find_jupytext_configuration_file('.')

in the folder where you have the notebook? If any configuration file is found, can you share its content?

Note: It may happen that Jupytext CLI and Jupyter Lab are using different configurations, as Jupyter Lab cannot access files above the Jupyter root.

@jli
Copy link
Author

jli commented Jan 4, 2022

@mwouts Thanks for the quick response! Yep, I do have a jupytext.toml config file. It looks like:

formats = "ipynb,py:percent"

(With further experimentation, I've found that without this config file, jupyterlab won't automatically create the companion .py file when I create new notebooks. But with the config, jupyterlab does automatically create the companion file.)

I've also put together a small demo repo for this issue: https://github.com/jli/jupytext_sync_900
It has jupytext.toml, requirements.txt, and (optional) .pre-commit-config.yaml files which I'm able to use to reproduce this issue, along with a README.md with specific reproduction steps.

Thanks!

@mwouts
Copy link
Owner

mwouts commented Jan 5, 2022

Thank you @jli for the detailed information! I will make sure I can reproduce this, and then we will fix the issue. I'll keep you posted.

mwouts added a commit that referenced this issue Jan 5, 2022
mwouts added a commit that referenced this issue Jan 5, 2022
@mwouts
Copy link
Owner

mwouts commented Jan 5, 2022

Hi again @jli , I am working on a test at https://github.com/mwouts/jupytext/compare/new_test_for_pre_commit_sync, but at the moment I have not been able yet to reproduce the issue.

The test looks like what you reported by maybe I have missed something. Do you think you could evolve that test into something that better reproduces the problematic flow? Thanks!

@jli
Copy link
Author

jli commented Jan 10, 2022

I believe the difference between that test and my situation is that when I modify the notebook file, something is adding the jupytext.text_representation to the notebook metadata.

This is what's happening in my case:

  1. start: clean working tree, ipynb and py files in sync, no jupytext in notebook metadata.
  2. edit and save ipynb file with Jupyterlab + jupytext: both files are updated with the edit, but the notebook metadata also now has a jupytext.text_representation section.
  3. run jupytext --sync: invoking jupytext via the CLI removes the newly added jupytext metadata.

So in your test after updating the notebook, this assert should be true in order to match my situation:

    assert "text_representation" in tmpdir.join("test.ipynb").read()

I tried adding this, and the assert fails. So, the test isn't reproducing the part where editing the notebook adds the jupytext metadata section.

@mwouts
Copy link
Owner

mwouts commented Jan 10, 2022

Thanks @jli for the additional information. I'll add steps 1. and 2. to the test and hopefully we'll reproduce this problematic jupytext.text_representation (which in theory belongs only to text files)

@mwouts
Copy link
Owner

mwouts commented Jan 10, 2022

Indeed with the new version of the test at 8e2f3bb I can reproduce the issue that you documented above (thanks for pointing out at the exact pattern!)

        # modify the ipynb file in Jupyter
        # Reload the notebook
        nb = cm.get("test.ipynb")['content']
        nb.cells.append(new_markdown_cell("A new cell"))
        cm.save(dict(type="notebook", content=nb), "test.ipynb")
    
        # The text representation metadata is in the py file
        assert "text_representation" in tmpdir.join("test.py").read()
        # But not in the ipynb notebook
>       assert "text_representation" not in tmpdir.join("test.ipynb").read()
E       assert 'text_representation' not in ('{\n'\n ' "cells": [\n'\n '  {\n'\n '   "cell_type": "markdown",\n'\n '   "id": "cell-1",\n'\n '   "metadata": {},\n'\n '   "source": [\n'\n '    "A short notebook"\n'\n '   ]\n'\n '  },\n'\n '  {\n'\n '   "cell_type": "markdown",\n'\n '   "id": "cell-3",\n'\n '   "metadata": {},\n'\n '   "source": [\n'\n '    "A new cell"\n'\n '   ]\n'\n '  }\n'\n ' ],\n'\n ' "metadata": {\n'\n '  "jupytext": {\n'\n '   "text_representation": {\n'\n '    "extension": ".py",\n'\n '    "format_name": "percent",\n'\n '    "format_version": "1.3",\n'\n '    "jupytext_version": "1.13.5"\n'\n '   }\n'\n '  },\n'\n '  "kernelspec": {\n'\n '   "display_name": "Python 3",\n'\n '   "language": "python",\n'\n '   "name": "python_kernel"\n'\n '  }\n'\n ' },\n'\n ' "nbformat": 4,\n'\n ' "nbformat_minor": 5\n'\n '}\n')
E         'text_representation' is contained here:

@mwouts
Copy link
Owner

mwouts commented Jan 11, 2022

Hi @jli , this should be fixed in the new version 1.13.6 that is being published on pypi. Would you mind giving it a try and report if the problem disappears? Thanks

@jli
Copy link
Author

jli commented Jan 20, 2022

@mwouts I confirmed that with 1.13.6, the problem goes away: now edits in jupyter-lab no longer add the jupytext metadata.

Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants