jupyter / nbformat

Reference implementation of the Jupyter Notebook format

Home Page:http://nbformat.readthedocs.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

NBFormat Does Not Recognize Indentation

z3ht opened this issue · comments

Hello,

Thanks for the awesome tool! I would like to use it for a project I'm working on but have the following issue:

NBFormat does not recognize indentation levels people set for their notebook's JSON file (more info: https://jupyter-notebook.readthedocs.io/en/latest/frontend_config.html#example-changing-the-notebook-s-default-indentation). This is a problem because when a notebook with more than one space for indentation is formatted by NBFormat, all lines are rewritten to use single spaced indentation. As a result, the git diff is unreadable.

I propose the following two possible solutions (either would work):

  • NBFormat could parse how many spaces are used in a file and default to using that many spaces when writing JSON back to that file
  • NBFormat could accept an indent option within the nbformat.write function allowing users to specify how many spaces should be used

I suspect the second possible solution would be easier and therefore the better option.

Problem Walkthrough
Original nb JSON (note the double-space indent):

  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "nbgrader": {
          "grade": false,
          "locked": true,
          "schema_version": 3,
          "solution": false
        },
        "id": "peZlh40I73Ql"
      },
      "source": [
        "# Deep Learning\n",
        "\n",
        "In this exercise, you will use a deep neural network to predict the values of houses based on some provided input data. You will use keras to build the model. Below is a description of how the keras models are set up."
      ]
    },
    ...
  }

After a call to nbformat.write(nb=nb, fp=notebook_path, version=nbformat.NO_CONVERT), the following is the JSON output (note the single-space indent):

 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "peZlh40I73Ql",
    "nbgrader": {
     "grade": false,
     "locked": true,
     "schema_version": 3,
     "solution": false
    }
   },
   "source": [
    "# Deep Learning\n",
    "\n",
    "In this exercise, you will use a deep neural network to predict the values of houses based on some provided input data. You will use keras to build the model. Below is a description of how the keras models are set up."
   ]
  },
  ...
}

Best,
Andrew

IMO, the fixed indentation reduces line noise in diffs. Avoiding line noise in diffs justifies setting the indentation level to a fixed indent.

Hmm... Now that you say this, I'm realizing just how much line noise two different developers with two different indentation settings could create. I concede that the better solution for my problem is to enforce a uniform indentation level across our project.

If the/a JSON parser was extended to support sniffing the (first few?) indentation levels, I still don't think nbformat should support configurable indentation levels.

Although possibly not necessary, I think parsing just the first line would work fine. I know this invites the argument that there may be a hidden line with different indention than the rest of the file that would cause line noise when formatted but I'm really not concerned with it.

If you need to reindent a JSON document to e.g. manually review the ipynb, python -m json.tool --indent 2 nb.ipynb may be helpful. https://docs.python.org/3/library/json.html#module-json.tool

Thanks for pointing this out! If we decide to go with an indentation level other than 1 and NBFormat does not add custom-indention support, I'll be sure to look into using this.

I'm closing this Issue because I no longer feel I need an indentation feature

Rog. For my nbformat with JSON-LD @context designs, perhaps others would feel differently about 2, maybe 3, 4 spaces instead of tabs?

The notebook-aware diffing support in nbdime may be helpful for your use case?

From markusschanta/awesome-jupyter#8 :

https://github.com/Yogayu/awesome-jupyterlab-extension#version-control