The Inescapable Pragmatism of Procedures

Dr. Ben Maughan writes:

At the moment I am rewriting some LaTeX notes into org mode to use in lecture slides. This involves several repetitive tasks, like converting a section heading like this

into this

Whenever I come across a problem like this, my first inclination is always to write a regular expression replacement for it.

A regular expression solution would likely be concise. It would be elegant. It would neatly state the abstract transform that needs to be performed, rather than getting bogged down in the details of how to carry it out. A clean, beautiful, stateless function from input to output.

And by definition, a regular expression replacement solution would have a well-defined model of valid input data. Only lines that match the pattern would be touched.
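For the sake of illustration, here is roughly what such a replacement might look like in Python. This is only a sketch: it assumes the heading has the shape `\subsection{Title}` sitting alone on its line (inferred from the macro steps quoted later), and the names `HEADING` and `latex_to_org` are mine, not Maughan's.

```python
import re

# Assumption: a heading is a line consisting only of \subsection{Title}.
HEADING = re.compile(r"^\\subsection\{(.*)\}\s*$")


def latex_to_org(line: str) -> str:
    """Return the org-style heading, or the line unchanged if it does not match."""
    return HEADING.sub(r"** \1", line)


print(latex_to_org(r"\subsection{Regular Expressions}"))  # ** Regular Expressions
print(latex_to_org("Some ordinary paragraph text."))      # left untouched
```

Notice that the pattern is also the statement of valid input: anything that is not a lone `\subsection{...}` line passes through unmodified.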

Like I said, a regular expression is always my first thought. But then I’ll work on the regex for a while, and start to get frustrated. There’s always some aspect that’s just a little bit tricky to get right. Maybe I’ll get the transform to work right on one line, but then it will fail on the next, because of a slight difference I hadn’t taken into account.

Minutes will tick by, and eventually I’ll decide I’m wasting time, throw it away, and just do the editing manually.

Or, on a good day, when I’ve had just the right amount of coffee, I will instead remember that macros exist. Macros are the subject of Maughan’s article.

The trick to making a good macro is to make it as general as possible, like searching to move to a character instead of just moving the cursor. In this case I did the following:

  1. Start with the cursor somewhere on the line containing the subsection and hit C-x ( to start the macro recording
  2. C-a to go to the start of the line
  3. C-SPC to set the mark
  4. C-s { to search forward to the “{” character
  5. RET to exit the search
  6. C-d to delete the region
  7. Type “** ” to add my org style heading
  8. C-e to move to the end of the line
  9. BACKSPACE to get rid of the last “}”
  10. C-x ) to end the recording

Now I can replay my macro with C-x e but I know I’ll need this again many times in the future so I use M-x name-last-kbd-macro and enter a name for the macro (e.g. bjm/sec-to-star).
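To make the shape of those recorded steps concrete, here is a rough Python mirror of them. It is not what Emacs stores for the macro; it is a sketch that models the buffer as a list of characters and tracks `point` and `mark` by hand, and the function name `sec_to_star` is just borrowed from the macro name above.

```python
def sec_to_star(line: str) -> str:
    """Replay the recorded keystrokes against one line of text."""
    buffer = list(line)           # mutable text, standing in for the Emacs buffer
    point = 0                     # C-a: go to the start of the line
    mark = point                  # C-SPC: set the mark
    # C-s { RET: search forward; point lands just after the "{".
    # list.index raises ValueError if the line has no "{" at all.
    point = buffer.index("{", point) + 1
    del buffer[mark:point]        # C-d: delete the region between mark and point
    point = mark
    buffer[point:point] = "** "   # type the org heading prefix
    point = len(buffer)           # C-e: move to the end of the line
    del buffer[point - 1]         # BACKSPACE: delete the character before point
    return "".join(buffer)


print(sec_to_star(r"\subsection{Intro}"))  # ** Intro
```

The sketch never checks that the last character really is a "}". It just deletes whatever is there, which is exactly the kind of unstated input assumption a recorded procedure carries around with it.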

If I ask Emacs to show me an editable version of Maughan’s macro, I see this:

This is the antithesis of a pattern-matching, functional-style solution. This is imperative code. It’s a procedure.

Let’s list some of the negatives of the procedural style:

  • It reveals nothing about the high-level transformation being performed. You can’t look at that procedure definition and get any sense of what it’s for.
  • It’s almost certainly longer than a pattern-replacement solution.
  • It implies state: the “point” and “mark” variables that Emacs uses to track cursor and selection position, as well as the mutable data of the buffer itself.
  • It has no clear statement of the acceptable inputs. It might start working on a line and then break halfway through.

Now let’s talk about some of the strengths of the procedural approach:

  • It is extraordinarily easy to arrive at using hands-on trial and error.
  • The hands-on manipulation becomes the definition, rather than forcing the writer to first identify the transforms, then mentally convert them into a transformation language.
  • It has a fair amount of robustness built-in: by using actions like “go to the next open bracket”, it’s likely to work on a variety of inputs without any specific effort on the part of the programmer.
  • It can get part of the work done and then fail and ask for help, instead of rejecting input that fails to match the pattern.
  • It lends itself to a compelling human-oriented visualization: a cursor, moving around text and adding and deleting characters. In other words, it can tell its own story.
  • You can edit it without thinking too hard. You don’t have to hold a whole pattern in your head. You can just advance through the story until you get to the point where something different needs to happen, and add, delete, or edit lines at that point.
  • As the transforms become more elaborate, a regex-transformational approach will eventually hit a wall where regex is no longer a sufficiently powerful model, and the whole thing has to be rewritten. There’s no such inflection point with procedural code.

Time after time, the pattern-matching, functional, transformational approach is the first that appeals to me. And time after time it becomes a frustrating time-sink process of formalising the problem. And time after time, I then turn to the macro approach and just get shit done.

The procedural solution strikes me as being at the “novice” level on the Dreyfus Model of Skill Acquisition. We tell the computer: do this sequence of steps. If something goes wrong, call me.

By contrast, more “formal” solutions strike me as an attempt to jump straight to the “competent” or even “proficient” level: here is an abstract model of the problem. Get it done.

One problem with this, at least looking at it from an anthropomorphic point of view, is that this isn’t how knowledge transfer normally works. People work up to the point of advanced beginner, then competent, then proficient by doing the steps, and gradually intuiting the relations between them, understanding which parts are constant and which parts vary, and then gaining a holistic model of the problem.

Of course, we make it work with computers. We do all the hard steps of modeling the problem, of gaining that level-three comprehension, and then freeze-dry that model and give it to the computer.

But this imposes an artificially high “first step”: witness me trying, and failing, to get a regex solution working in a short period of time, before reverting to the “dumb” solution of writing a procedural macro through trial and error.

And I worry about the scalability of this approach, as we have to do the hard work of modeling the problem for every last little piece of an application. And then re-modeling when our understanding turns out to be flawed.

This is one reason I’m not convinced that fleeing the procedural paradigm as fast as possible is the best approach for programming languages. I fear that by assuming that a problem must always be modeled before being addressed, we’re setting ourselves up for the exhausting assumption that we have to be the ones doing the modeling.

(And I think there might be a tiny bit of elitism there, as well: so long as someone has to model the problem before telling the computer how to solve it, we’ll always have jobs.)

This is also why I worry a little about a movement toward static systems. The interactive process described above works because Emacs is a dynamic lisp machine. A machine which can both do a thing and reflect on and record the fact that it is doing a thing, and then explain what it did, and then take a modified version of that explanation and do that instead.

I’ve recently realized that I’m one of those nutjobs who wants to democratize programming. And I think in order for that to happen, we need computing systems which are dynamic, but which moreover are comfortable sitting down at level 1 of the Dreyfus model. Systems that can watch us do things, and then repeat back what we told them. Systems that can go through the rote steps, and ask for help when something doesn’t go as expected.

Systems that have a gradual, even gradient from manual to automated.

And then, gradually, systems that can start to come up with their own models of the problem based on those procedures. And revise those models when the procedures change. Maybe they can come up with internal, efficient, elegant transformational solutions which accomplish the same task. But always with the procedure to fall back on when the model falls apart. And the users to fall back on when the procedure falls apart.

Now, there are some false dichotomies that come up when we talk about procedural/functional, formal/informal. For instance: there’s no reason that stateful, destructive procedures can’t be built on top of persistent immutable data structures. The bugbear of statefulness needn’t haunt every discussion of imperative coding.
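As a small sketch of that last point, here is one way an imperative-feeling editing session could sit on top of immutable snapshots. Everything here (`Buffer`, `Editor`, and the command functions) is a hypothetical illustration in Python, not a description of how Emacs works internally.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Buffer:
    """An immutable snapshot of the text plus the cursor position."""
    text: str
    point: int = 0


def delete_through(buf: Buffer, char: str) -> Buffer:
    """Delete from point through the next occurrence of char."""
    end = buf.text.index(char, buf.point) + 1
    return replace(buf, text=buf.text[:buf.point] + buf.text[end:])


def insert(buf: Buffer, s: str) -> Buffer:
    """Insert s at point and move point past it."""
    text = buf.text[:buf.point] + s + buf.text[buf.point:]
    return replace(buf, text=text, point=buf.point + len(s))


def end_of_line(buf: Buffer) -> Buffer:
    return replace(buf, point=len(buf.text))


def backspace(buf: Buffer) -> Buffer:
    """Delete the character before point, if there is one."""
    if buf.point == 0:
        return buf
    text = buf.text[:buf.point - 1] + buf.text[buf.point:]
    return replace(buf, text=text, point=buf.point - 1)


class Editor:
    """Feels stateful to the user, but every step is a fresh snapshot."""

    def __init__(self, text: str) -> None:
        self.history = [Buffer(text)]

    @property
    def current(self) -> Buffer:
        return self.history[-1]

    def run(self, command, *args) -> None:
        self.history.append(command(self.current, *args))

    def undo(self) -> None:
        if len(self.history) > 1:
            self.history.pop()


ed = Editor(r"\subsection{Intro}")
ed.run(delete_through, "{")
ed.run(insert, "** ")
ed.run(end_of_line)
ed.run(backspace)
print(ed.current.text)      # ** Intro
print(ed.history[0].text)   # the original line is still intact
```

The session reads like a sequence of destructive edits, yet no snapshot is ever modified, so undo is just a matter of dropping the latest one.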

But anyway, getting back to the point at hand: there is an inescapable pragmatism to imperative, procedural code that mutates data (at least locally). There is a powerful convenience to it. And I think that convenience is a signal of a deeper dichotomy between how we show things to other people, vs. how we [think we should] explain things to computers. And for that reason, I’m nervous about discarding the procedural model.

P.S: I’m going to flagrantly exploit the popularity of this post to say: if you like software thinky-thoughts such as this one, you might also enjoy my newsletter!