Org Babel is an Underappreciated Tool
Emacs Org mode’s source code blocks and the complementary org babel are underappreciated tools.
Emacs is a highly customisable text editor.
Org mode is a broad set of functionality for files written in org markup.
Functionality ranges from todo list management, note taking and agenda management
to spreadsheets and document exporting/publishing.
Org mode’s support for source code allows blocks of source code to be:
- evaluated (with the results embedded in the document; i.e. “plaintext Jupyter Notebooks”),
- exported (so the code can be presented in e.g. PDF documents),
- tangled (rearranging snippets of source code into files for the compiler).
Org Babel is the meta-programming functionality integrated with that, which allows source code blocks of different languages to interact.
It’s an underappreciated tool. For how useful the functionality is, it’s (at best) limited to Emacs users.
What I’ve Found it Useful For
“Plaintext Jupyter Notebooks” gets 95% of the way to describing org’s source code block evaluation.
– “Plaintext”, on the one hand, means it lacks some user-friendliness that Jupyter Lab’s rich web interface provides.
On the other hand, plaintext has its own advantages.
Anyway. I’ve found the “run the source code blocks, include the results in the document” workflow (also known as a “notebook”) to be useful.
I’ve used it for:
Playing around with / learning programming language subtleties.
- This is similar to how you’d use “scratch playgrounds” like https://play.rust-lang.org/: being able to run a snippet of code and see the result, without having to fuss with editing a file in some fresh location.
Exploring the behaviour of under-documented APIs.
- Similar to how notebooks are useful for writing tutorials or documentation, when I’ve been curious about how an under-documented API works, I’ve found the notebook interface to be useful.
DevOps work: spelunking around cloud environments and resources.
- howardism.org’s “Literate DevOps” is where I first came across the idea. – As well as the output of commands, a notebook supports idiomatically embedding links to reference material, or explanation for why commands were run. These are useful things for DevOps-related tasks.
Nice-to-Have Properties for Notebooks
I suspect this takes a bit of practice to get right.
Easy to execute: the notebook author should be able to run the notebook with little effort.
Easy to export to a human-readable format: e.g. I want to be able to export the report to markdown, and have the code in that be easy to read/understand, and easy to execute (or at least easy to copy-and-paste).
Maintainable “As-Code”: The notebook’s code snippets should be maintainable (easy to read). Ideally, the snippets have only as much boilerplate as they need, and only as much abstraction as helps make the code understandable.
Sometimes I care about the results as a snapshot in time. e.g. when querying a system that’s in a malformed state, it’d be useful to have a notebook which recorded that bad state, without overwriting those results with the latest (presumably correct) state.
Dynamics: Executable Document, Living Document, Literate Programming
Emacs org mode’s code block support enables some dynamics:
Executable Document
Because org mode’s source code blocks can be executed, with the results embedded in the document, the resulting document can be described as an “executable document”: it’s a document where you can execute the (embedded) code snippets.
If the document’s emphasis is on the results (e.g. statistics/graphs), then this can be described as a “reproducible research” tool. (To reproduce the results, you’d just re-execute the document).
If the document emphasises documentation/specification, you can call it “executable documentation” or “executable specification”.
i.e. it combines the benefits of a specification (communicating what something is) with the benefits of an automated check (confidence in the system’s correctness).
Living Document
A “living” document is a document which is fresh and maintained. (As opposed to outdated).
In programming, ‘living documents’ are often associated with some kind of automated execution or generation.
Executable documents can be maintained as living documents: you can be confident the document is ‘up to date’ by re-executing it.
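As a rough illustration of "confident by re-executing", here is a minimal Python sketch that re-runs the Python blocks in an org-style document and reports any whose recorded =#+RESULTS:= no longer match. This is a toy model and not how org itself evaluates blocks: the sample document, the regular expression and the =stale_blocks= helper are all hypothetical, and it assumes every block uses =:results output= and is followed by recorded results.

```python
import contextlib
import io
import re

# A hypothetical Org document: one Python block plus the results that were
# recorded the last time it was evaluated.
DOCUMENT = """\
The answer should be 42.

#+begin_src python :results output
print(6 * 7)
#+end_src

#+RESULTS:
: 42
"""

# Assumption: each Python block is immediately followed (blank lines aside)
# by a "#+RESULTS:" section whose lines are prefixed with ": ".
BLOCK = re.compile(
    r"^#\+begin_src python[^\n]*\n(?P<code>.*?)^#\+end_src\n"
    r"\s*^#\+RESULTS:\n(?P<recorded>(?:^: [^\n]*\n)*)",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def stale_blocks(document):
    """Re-run each Python block; return (code, recorded, actual) for mismatches."""
    stale = []
    for match in BLOCK.finditer(document):
        code = match.group("code")
        # Strip the ": " prefix from each recorded result line.
        recorded = "".join(
            line[2:] for line in match.group("recorded").splitlines(keepends=True)
        )
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            exec(code, {})  # fresh namespace per block; only run code you trust
        if captured.getvalue() != recorded:
            stale.append((code, recorded, captured.getvalue()))
    return stale


if __name__ == "__main__":
    print(stale_blocks(DOCUMENT) or "document is up to date")
```

An empty list means the recorded results still match what the code produces today. Anything else is a block whose prose or results may have gone stale.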
Literate Programming
I think ‘literate programming’ is broadly (mis)understood as programming where the code and prose are freely mixed together.
I think a more accurate and precise definition is: literate programming emphasises constructing an explanation of some module, where the literate program can be ‘woven’ to construct a beautifully typeset document, or ‘tangled’ to produce source code files for the compiler. – The emphasis is on presenting the program for a clear explanation (without being constrained by the order that a compiler would require).
Org mode’s source code blocks support this.
Though, I suspect good ‘literate programming’ requires a good amount of discipline and thought.
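To make "not constrained by compiler order" concrete, here is a minimal Python sketch of noweb-style tangling, where named chunks refer to each other as =<<name>>= (the reference syntax org uses with the =:noweb= header argument). It is a toy model under stated assumptions, not org's implementation: the chunk names and the =tangle= helper are hypothetical, and references are assumed to sit at the start of a line.

```python
import re

# Hypothetical named chunks, listed in the order that suits the explanation:
# the overall shape first, then the use, then the definition.
CHUNKS = {
    "main": "<<define-greet>>\n\n<<use-greet>>",
    "use-greet": "print(greet('world'))",
    "define-greet": "def greet(name):\n    return f'hello, {name}'",
}

REFERENCE = re.compile(r"<<([^<>\n]+)>>")


def tangle(name, chunks, _seen=()):
    """Return the text of chunk `name` with every <<reference>> expanded."""
    if name in _seen:
        raise ValueError(f"circular reference: {' -> '.join(_seen + (name,))}")
    return REFERENCE.sub(
        lambda match: tangle(match.group(1), chunks, _seen + (name,)),
        chunks[name],
    )


if __name__ == "__main__":
    # The tangled program is in the order the interpreter needs,
    # regardless of the order the chunks were presented in.
    print(tangle("main", CHUNKS))
```

The chunks are written in whatever order explains the program best, and tangling puts them back in the order the interpreter needs.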
Org-Babel is Meta-Programming
From the org mode community’s introduction to org babel:
Because the return value of a function written in one language can be passed to a function written in another language, or to an Org table, which is itself programmable, Babel can be used as a meta-functional programming language. With Babel, functions from many languages can work together. You can mix and match languages, using each language for the tasks to which it is best suited.
For example, let’s take some system diagnostics in the shell and graph them with R.
The code blocks can be named. The code blocks can take in :var inputs, and can reference the output values from other code blocks.
Code blocks can be re-called with different arguments.
That is: writing an org document with source code blocks is itself programming. It’s up to the programmer to choose an appropriate way of arranging the code.
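The named-block mechanics above can be modelled in a few lines of Python. This is only a sketch: in a real org document the blocks would be =#+name:= source blocks in different languages, the inputs would be =:var= header arguments, and re-calling would be done with =#+call:=. Here both blocks are plain Python functions, and =Ref=, =BLOCKS=, =call= and the disk usage numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Marks a variable whose value is the result of another named block."""
    name: str


def disk_usage():
    # Stand-in for a shell block (e.g. one running "df"); the numbers are made up.
    return [("/", 71), ("/home", 38), ("/tmp", 5)]


def fullest(data, top=1):
    # Stand-in for a block in another language that post-processes the table.
    return sorted(data, key=lambda row: row[1], reverse=True)[:top]


# name -> (block body, default variable bindings), loosely like
#   #+name: fullest
#   #+begin_src python :var data=disk-usage top=1
BLOCKS = {
    "disk-usage": (disk_usage, {}),
    "fullest": (fullest, {"data": Ref("disk-usage"), "top": 1}),
}


def call(name, **overrides):
    """Evaluate a named block, resolving references to other blocks first."""
    func, defaults = BLOCKS[name]
    bindings = {**defaults, **overrides}
    resolved = {
        var: call(value.name) if isinstance(value, Ref) else value
        for var, value in bindings.items()
    }
    return func(**resolved)


if __name__ == "__main__":
    print(call("fullest"))         # default arguments
    print(call("fullest", top=2))  # loosely like "#+call: fullest(top=2)"
```

The point of the sketch is that the document itself carries the wiring between blocks, which is why arranging the blocks is itself a programming decision.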
I have not made much use of the polyglot nature of org-babel. But, I think it’s interesting.
It’s easy to imagine how having a notebook with a mix of =shell= (e.g. bash) cells and some =python= cells could be useful.
Notebook Interface: Partway Between “Manual” and “Automated”
I think the most common ways to interact with some program are either entirely manual, or entirely automated.
e.g. if you’ve got some Python code, you’ll either run the code to observe its output; or perhaps you’ll write some automated check to have confidence that the code works.
e.g. if you want to launch some VM, you’ll either launch it in a web console, or run a command line command; or perhaps apply some Terraform code.
A notebook interface sits partway between these.
Unlike a script/automation: You can easily execute just the snippets you want, when you want. – If you’re interrupted (e.g. some unexpected error occurs), you can either fix that externally, or embed instructions in the notebook to fix it.
– The instructions to fix it could even just be an explanation.
This means a notebook does not need to be as robust as an automation/script.
Unlike manual commands: A notebook document can easily include commentary, and provides an easy way to read through what was executed, and the output of what was executed.
I think those properties make notebooks a compelling tool for cases where you don’t quite have a fully automated setup easily available.