Have you always wished Jupyter notebooks were plain text documents? Wished you could edit them in your favorite IDE? And get clear and meaningful diffs when doing version control? Then, Jupytext may well be the tool you're looking for!
A Python notebook encoded in the py:percent format has a .py extension and looks like this:
# %% [markdown]
# This is a markdown cell
# %%
def f(x):
    return 3*x+1
Only the notebook inputs (and optionally, the metadata) are included. Text notebooks are well suited for version control. You can also edit or refactor them in an IDE: the .py notebook above is a regular Python file.
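To make the cell structure concrete, here is a minimal sketch of how a py:percent script breaks down into cells. The function split_percent_cells is a hypothetical helper written for this illustration, not Jupytext's own parser. It assumes that cells start at lines beginning with "# %%" and that markdown cells are stored as comments.

```python
def split_percent_cells(text):
    """Split a py:percent script into (cell_type, source) pairs.

    Illustrative only: this is not Jupytext's parser, and it ignores
    cell metadata and any text that precedes the first "# %%" marker.
    """
    cells = []
    cell_type, lines = None, []
    for line in text.splitlines():
        if line.startswith("# %%"):
            # A new marker closes the previous cell, if there is one
            if cell_type is not None:
                cells.append((cell_type, "\n".join(lines).strip()))
            cell_type = "markdown" if "[markdown]" in line else "code"
            lines = []
        elif cell_type is not None:
            if cell_type == "markdown":
                # Markdown cells are commented out: drop the leading "# "
                line = line[2:] if line.startswith("# ") else line.lstrip("#")
            lines.append(line)
    if cell_type is not None:
        cells.append((cell_type, "\n".join(lines).strip()))
    return cells


script = '''# %% [markdown]
# This is a markdown cell

# %%
def f(x):
    return 3*x+1
'''

cells = split_percent_cells(script)
for cell_type, source in cells:
    print(cell_type, repr(source))
```

Because the markers are ordinary comments, the same file still runs unchanged with the regular Python interpreter.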
We recommend the percent format for notebooks that mostly contain code. The percent format is available for Julia, Python, R and many other languages.
If your notebook is documentation-oriented, a Markdown-based format (text notebooks with a .md extension) might be more appropriate. Depending on what you plan to do with your notebook, you might prefer the MyST Markdown format, which interoperates very well with Jupyter Book, or Quarto Markdown, or even Pandoc Markdown.
Install Jupytext in the Python environment that you use for Jupyter. Use either
pip install jupytext
or
conda install jupytext -c conda-forge
Then, restart your Jupyter Lab server, and make sure Jupytext is activated in Jupyter: .py and .md files have a Notebook icon, and you can open them as Notebooks with a right click in Jupyter Lab.
Text notebooks with a .py or .md extension are well suited for version control. They can be edited or authored conveniently in an IDE. You can open and run them as notebooks in Jupyter Lab with a right click. However, the notebook outputs are lost when the notebook is closed, as only the notebook inputs are saved in text notebooks.
Paired notebooks are a convenient alternative to text notebooks. A paired notebook is a set of two files, say .ipynb and .py, that contain the same notebook in different formats.
You can edit the .py version of the paired notebook, and get the edits back in Jupyter by selecting reload notebook from disk. The outputs will be reloaded from the .ipynb file, if it exists. The .ipynb version will be updated or recreated the next time you save the notebook in Jupyter.
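The reload behaviour described above can be pictured as a merge: inputs come from the text file, outputs from the .ipynb file. The sketch below models cells as plain dictionaries and assumes that an output is reused only when the cell source is unchanged. The helper merge_inputs_and_outputs is hypothetical, and Jupytext's actual matching logic may be more elaborate.

```python
def merge_inputs_and_outputs(text_cells, ipynb_cells):
    """Combine inputs from a text notebook with outputs from an .ipynb file.

    text_cells: list of (cell_type, source) pairs read from the .py file.
    ipynb_cells: list of dicts with "cell_type", "source" and "outputs".
    Simplification: outputs are looked up by exact cell source.
    """
    saved_outputs = {}
    for cell in ipynb_cells:
        if cell["cell_type"] == "code":
            # Keep the first output seen for a given source
            saved_outputs.setdefault(cell["source"], cell.get("outputs", []))

    merged = []
    for cell_type, source in text_cells:
        cell = {"cell_type": cell_type, "source": source}
        if cell_type == "code":
            # Edited or new cells have no saved output yet
            cell["outputs"] = saved_outputs.get(source, [])
        merged.append(cell)
    return merged


# The second cell was edited in the .py file since the last save
text_cells = [("code", "x = 1"), ("code", "x + 2")]
ipynb_cells = [
    {"cell_type": "code", "source": "x = 1", "outputs": ["<saved output A>"]},
    {"cell_type": "code", "source": "x + 1", "outputs": ["<saved output B>"]},
]

merged = merge_inputs_and_outputs(text_cells, ipynb_cells)
```

In this model the unchanged cell keeps its saved output, while the edited cell comes back empty until you run it again.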
To pair a notebook in Jupyter Lab, use the command Pair Notebook with percent Script from the Command Palette.
To pair all the notebooks in a certain directory, create a configuration file with this content:
# jupytext.toml at the root of your notebook directory
formats = "ipynb,py:percent"
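The formats value above is a comma-separated list with one entry per paired file, each made of an extension and an optional format name after a colon. As a rough illustration, the hypothetical helper split_formats below breaks such a string into pairs. It is not Jupytext's own parser and only handles this simple form.

```python
def split_formats(formats):
    """Split a value like "ipynb,py:percent" into (extension, format_name) pairs.

    Hypothetical helper for illustration: format_name is None when the
    entry has no ":name" part.
    """
    pairs = []
    for item in formats.split(","):
        extension, _, format_name = item.strip().partition(":")
        pairs.append((extension, format_name or None))
    return pairs


pairs = split_formats("ipynb,py:percent")
print(pairs)
```

Read this way, the configuration above asks for every notebook to exist both as an .ipynb file and as a .py file in the percent format.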
Jupytext is also available at the command line. You can
- pair a notebook with
  jupytext --set-formats ipynb,py:percent notebook.ipynb
- synchronize the paired files with
  jupytext --sync notebook.py
  (the inputs are loaded from the most recent paired file)
- convert a notebook in one format to another with
  jupytext --to ipynb notebook.py
  (use -o if you want a specific output file)
- pipe a notebook into a code formatter with e.g.
  jupytext --pipe black notebook.ipynb
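The synchronization rule mentioned above (inputs are loaded from the most recent paired file) can be sketched with file timestamps. This example assumes that "most recent" means the latest modification time. The helper most_recent is hypothetical and only shows the idea; it does not call Jupytext.

```python
import os
import tempfile


def most_recent(paths):
    """Return the existing path with the latest modification time."""
    existing = [path for path in paths if os.path.exists(path)]
    return max(existing, key=os.path.getmtime)


with tempfile.TemporaryDirectory() as folder:
    ipynb = os.path.join(folder, "notebook.ipynb")
    py = os.path.join(folder, "notebook.py")
    for path in (ipynb, py):
        with open(path, "w") as f:
            f.write("")

    # Pretend the .py file was edited after the .ipynb file was saved
    os.utime(ipynb, (1_000, 1_000))
    os.utime(py, (2_000, 2_000))

    source_of_inputs = os.path.basename(most_recent([ipynb, py]))

print(source_of_inputs)
```

Under that assumption, editing notebook.py in an IDE and then synchronizing would take the inputs from notebook.py.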
This is a quick how-to:
- Open your .ipynb notebook in Jupyter and pair it to a .py notebook, using either the pair command in Jupyter Lab, or a global configuration file
- Save the notebook; this creates a .py notebook
- Add this .py notebook to version control
You might exclude .ipynb files from version control (unless you want to see the outputs versioned!). Jupytext will recreate the .ipynb files locally when users open and save the .py notebooks.
Collaborating on Jupyter notebooks through Git becomes as easy as collaborating on text files.
Assume that you have your .py notebooks under version control (see above). Then,
- Your collaborator pulls the .py notebook
- They open it as a notebook in Jupyter (right-click in Jupyter Lab)
- At that stage the notebook has no outputs. They run the notebook and save it. Outputs are regenerated, and a local .ipynb file is created
- They edit the notebook, and push the updated notebook.py file. The diff is nothing more than a standard diff on a Python script.
- You pull the updated notebook.py script, and refresh your browser. The input cells are updated based on the new content of notebook.py. The outputs are reloaded from your local .ipynb file. Finally, the kernel variables are untouched, so you have the option to run only the modified cells to get the new outputs.
Once your notebook is paired with a .py file, you can easily edit or refactor the .py representation of the notebook in an IDE.
Once you are done editing the .py notebook, just reload the notebook in Jupyter to get the latest edits there.
Note: It is simpler to close the .ipynb notebook in Jupyter when you edit the paired .py file. There is no obligation to do so; however, if you don't, you should be prepared to read the pop-up messages carefully. If Jupyter tries to save the notebook while the paired .py file has also been edited on disk since the last reload, a conflict will be detected and you will be asked to decide which version of the notebook (in memory or on disk) is the appropriate one.
Read more about Jupytext in the documentation.
If you're new to Jupytext, you may want to start with the FAQ or with the Tutorials.