Org-babel as an alternative to Jupyter notebooks

24 Jul 2018 in Programming on Python, Jupyter, Emacs, Orgmode

Foldable hierarchical outlines¹ must, surely, be one of the building blocks of the universe. I wasn't particularly keen on the notebook format when I was still using Mathematica (which does have foldable headlines), preferring the Matlab-style popping up of windows which I could arrange next to each other on the screen. But I started adopting the Jupyter notebooks for python when I realized the Quixotism of my attempts to give figures names which would allow the subsequent reconstruction of their meaning. (I have gone so far as to save additional information in file metadata.)

The notebooks do have a lot going for them, beyond the obvious prettiness and out-of-the-box usability; for me, above all, the fact that if I leave a notebook server running at work, I can just dig an ssh tunnel from home and fire up my browser. Nonetheless, they don't fold and therefore stubbornly resist the imposition of higher order, at least in my hands. Sure, it would probably be relatively easy to implement some folding plugin for the browser interface, and I bet someone will at some point or other. But then, I really like org-mode anyway, in principle at least, and I keep hoping that with increased use, there will come an inflection point in my ability to tame it. It's another quest, for sure, and it might turn out quixotic as well, but for now I decided to write up my tales from the rabbit-hole of find-function calls, for the benefit of my fellow org-curious.

Saving figures directly in python with matplotlib

Ob-ipython implements, as of this writing, a few nice features such as auto-completing through company-mode (asking the ipython kernel for completion suggestions). It is the org-babel python backend I'm test-driving, but I don't think there's anything in the workflow described below that wouldn't work with the regular ob-python backend.

One aspect of how images are handled in ob-ipython is that the %matplotlib inline backend is used to grab image data directly from the ipython kernel, which is then written to file from within Emacs; finally, a link to the file is inserted and org-display-inline-images (which generates image overlays from links) is called via the org-babel-after-execute-hook. Since I'm way more familiar with python than with elisp, I prefer the idea of controlling the process of writing out the image file in python directly. This allows for a different insertion procedure, with the additional advantage that I can, if I so desire, use an interactive matplotlib backend to pop up the figure in a window first.

The following function saves a previously generated figure using the savefig command (assuming matplotlib.pyplot has been imported as plt), and uses the function org-babel-insert-result to append a #+RESULTS: block after the current src block, containing a link to the generated file. A call to org-display-inline-images at the end replaces links with overlay images:

```
(require 'cl-lib)  ; for cl-union
(require 'subr-x)  ; for when-let and when-let*

(defun nandu-append-figure ()
  "Save current matplotlib figure to file and append link as result.

Execute a matplotlib.pyplot.savefig on the current figure in the
`ob-ipython-resources-dir' directory and append the link/overlay
to the current src block. The file will be named according to the
:savefig header param, or randomly if that isn't present.
:savefig should either be given a full filename with
extension (e. g. \"image.png\"), or just the extension (e. g.
\"png\"), in which case a random name is created also. The
:results header of the src block is handed over to
`org-babel-insert-result'. An optional :style header can give the
name of a style with which to print the figure to hardcopy."
  (interactive)
  (let* ((babel-args '((:session . nil)))
         (info (or (nth 2 (org-babel-get-src-block-info))
                   (org-babel-parse-header-arguments
                    (org-element-property :end-header (org-element-at-point)))))
         (res-dir (expand-file-name (file-name-sans-extension (buffer-name))
                                    ob-ipython-resources-dir))
         ;; default figure format
         (ext "png")
         (path nil)
         ;; merge any result-params from the src block header with these
         (result-params '("replace" "raw")))

    ;; a :savefig value with an extension is used as the file name and the
    ;; catch is left early; otherwise a random name is generated
    (catch :im_format
      (when-let ((file_name (alist-get :savefig info)))
        (setq ext (split-string file_name "\\."))
        (if (cdr ext)
            (progn
              (setq path (expand-file-name file_name res-dir))
              (throw :im_format t))
          (setq ext (car ext))))
      (setq path (concat (make-temp-name (file-name-as-directory res-dir)) "." ext)))

    ;; if plt.close() isn't called, the figures accumulate weirdness over time
    (let ((cmd (format "plt.gcf().savefig('%s'); plt.close()" path)))
      ;; wrap call in a style context if :style is given; the style is looked
      ;; for first in nandu-mpl-styles-directory, then among the standard mpl styles
      (when-let* ((style (alist-get :style info))
                  (mplstyle (or (car (directory-files nandu-mpl-styles-directory t style)) style)))
        (setq cmd (format "with plt.style.context('%s'):\n\t%s" mplstyle cmd)))
      (org-babel-execute:ipython cmd babel-args))

    (when-let ((results (alist-get :results info)))
      ;; compare with equal so that duplicate strings are actually merged
      (setq result-params (cl-union (split-string results) result-params :test #'equal)))
    (org-babel-insert-result (format "[[file:%s]]" path) result-params)
    (org-display-inline-images)))
```
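For readers more at home in python than in elisp, the file naming logic of the function above can be restated as a rough sketch. This is an illustration only and not part of the configuration layer: `resolve_figure_path` and its parameters are made-up names, and `uuid` merely stands in for what `make-temp-name` does on the Emacs side.

```python
import os
import uuid


def resolve_figure_path(savefig, buffer_name, resources_dir, default_ext="png"):
    """Sketch of the :savefig handling (hypothetical helper, not from ob-ipython).

    A value containing a dot is treated as a full file name; a bare
    extension, or no value at all, yields a randomly named file.
    """
    # one subdirectory per .org file, named like the file without its extension
    res_dir = os.path.join(resources_dir, os.path.splitext(buffer_name)[0])
    if savefig and "." in savefig:
        # full file name given, e.g. "image.png"
        return os.path.join(res_dir, savefig)
    # only an extension given (e.g. "svg"), or nothing: fall back to the default
    ext = savefig or default_ext
    # random base name; stand-in for make-temp-name
    return os.path.join(res_dir, uuid.uuid4().hex[:8] + "." + ext)


# prints a path with a random base name inside obipy-resources/analysis
print(resolve_figure_path("svg", "analysis.org", "obipy-resources"))
```

Since the extension ends up in the path handed to savefig, it is the file name alone that decides the output format.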

There are a few points to note:

- I prefer a finer granularity for where the resulting image files get saved: underneath the ob-ipython-resources-dir, I add an additional level of directories, one for each .org file which contains images. Each subdirectory has the same name as the file to which it corresponds (without the .org).
- Instead of hijacking the :ipyfile keyword, I use my own to denote the filename: :savefig. Another advantage of this python-side approach is that I can use the filename to signal to savefig which format I want the file to be saved in. This is useful in case some figures are better suited to a bitmap format and others to a vector one. There is no need to choose an image format via %config InlineBackend.figure_format. Since it is hard to find the documentation for the IPython / Jupyter options, in case you stumble across this post looking for them, there are two valid forms:
  ```
  InlineBackend.figure_format = 'png' # or any other format, of course
  InlineBackend.figure_formats = ['png', 'svg']
  ```
  These can either be set via the %config magic on a per-file basis or in some configuration file for whatever Jupyter frontend you use.
- The code above parses the src block header for an additional keyword, :style, which, if given, leads to the application of a particular style sheet. I look first in a particular directory for a matching style sheet (nandu-mpl-styles-directory²), then among the default matplotlib styles.
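The :style lookup from the last point, and the command string that is eventually sent to the kernel, can be sketched in python as well. Again this is only an illustration under assumptions: `resolve_style` and `build_savefig_command` are hypothetical names, and unlike directory-files the sketch quietly falls back to the bare style name when the styles directory does not exist.

```python
import os
import re


def resolve_style(style, styles_dir):
    """Return the first file in styles_dir whose name matches `style`
    (treated as a regular expression), else the bare style name so that
    matplotlib can look it up among its own styles. Hypothetical helper."""
    if os.path.isdir(styles_dir):
        for name in sorted(os.listdir(styles_dir)):
            if re.search(style, name):
                return os.path.join(styles_dir, name)
    return style


def build_savefig_command(path, style=None, styles_dir="."):
    """Build the python source that saves and closes the current figure,
    wrapped in a style context if a style is requested."""
    cmd = "plt.gcf().savefig('%s'); plt.close()" % path
    if style:
        cmd = "with plt.style.context('%s'):\n\t%s" % (
            resolve_style(style, styles_dir), cmd)
    return cmd


# show the generated source for a figure saved without a style
print(build_savefig_command("fig.png"))
```

The returned string is what would be executed in the session; it assumes, like the elisp function, that matplotlib.pyplot has already been imported there as plt.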

The resulting workflow is hence as follows:

1) Execute the src block with C-c C-c or ,, (in Spacemacs).
2) Execute the above function, which I have bound to s-<return>, for example (I'm on a mac and use the apple key as super).

Note that if step 1) produces regular string output, it will be displayed by the usual mechanism, but step 2) will replace it with the image link and overlay.

If no interactive window pop-up is desired, one can simply select an appropriate backend via the matplotlib.use() function. This can be handy as well if it turns out that the interactive figure canvas is distorted and one wants to re-plot the figure onto a non-interactive canvas (use pyplot.switch_backend() in that case). Of course, it would also be possible to generally decide on a non-interactive backend and bind a function combining the two steps to the desired key, thereby replicating the usual org-babel workflow.

There is, of course, one immediately obvious drawback, at least if you're on a computer with a high-resolution screen: The resolution of the images displayed inside Emacs doesn't come anywhere close to what it is in a browser…

A note on regular ob-ipython result blocks

At the time of this writing, ob-ipython inserts an ipython-like Out[...] as a comment before the returned results. Using a separate function, as suggested above, circumvents all output-specific code on the ob-ipython side. However, for regular, non-image output, the inserted comment seems to interrupt org's parsing of the #+RESULTS: drawer, which is why I currently override the ob-ipython function ob-ipython--process-response with my own version, with the offending line

(format "# Out[%d]:\n" (cdr (assoc :exec-count ret)))

removed. Personally I also prefer my image output not to live in drawers in the way that ob-ipython does it, since those are always closed unless explicitly opened.

Footnotes:

¹ One of the main features of org-mode (and of OmniOutliner, to be fair, which got me hooked on them in the first place, back when it was still bundled for free with OS X). Nowadays, though, I need cross-platform tools, and org-mode is more powerful, more hackable, and more free.

² Nandu is the name of my personal spacemacs configuration layer, from which the code is taken, so just ignore the name.

betaplane
