Rake Part 5: File Operations

If you're a Ruby programmer, you've almost certainly used Rake, the build utility created by the late Jim Weirich. But you might not realize just how powerful and flexible a tool it can be. I certainly didn't, until I decided to use it as the basis for Quarto, my e-book production toolchain.

This post is part of a series on Rake, starting with the basics and then moving on to advanced usage. It originated as a series of RubyTapas videos published to subscribers in August-September 2013. Each post begins with a video, followed by the script for those who prefer reading to viewing.

My hope in publishing these episodes for free is that more people will come to know and love the full power of this ubiquitous but under-appreciated tool. If you are grateful for Rake, please consider donating to the Weirich Fund in Jim's memory.

Here’s the Rakefile we’ve been working on for the last few episodes. It finds Markdown source files in a sources subdirectory of a project, and produces a parallel hierarchy of HTML files in an outputs subdirectory.
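As a reminder of its general shape, here is a minimal sketch of a Rakefile matching that description. It is a reconstruction under assumptions rather than the exact file from earlier episodes: the glob pattern, the `source_for_html` helper name, and the choice of `pandoc` as the converter are all illustrative.

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

# Assumed layout: Markdown lives under sources/, HTML goes under outputs/.
SOURCE_FILES = Rake::FileList.new("sources/**/*.md")

# Map each sources/NAME.md to outputs/NAME.html.
OUTPUT_FILES = SOURCE_FILES.pathmap("%{^sources/,outputs/}X.html")

task default: :html
task html: OUTPUT_FILES

# Hypothetical helper: given an output path, name the Markdown file
# it should be built from.
def source_for_html(html_file)
  html_file.pathmap("%{^outputs/,sources/}X.md")
end

rule ".html" => ->(f) { source_for_html(f) } do |t|
  # pandoc is an assumption; any Markdown-to-HTML converter would do.
  sh "pandoc -o #{t.name} #{t.source}"
end
```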

Now that we are recreating the input file hierarchy in an outputs directory, we need to ensure that the destination directory exists before generating any HTML files. An easy way to do this in Rake is to use a directory task. This is like a file task, except that its target is a directory. But unlike a file task, we don’t have to supply any code for how to make the directory appear if it doesn’t already exist. Simply by specifying the task, we are giving Rake implicit instructions to create the directory if it is needed.
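As a minimal sketch, assuming the destination directory is literally named `outputs`, the declaration is a single line with no action block:

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

# Declares a task named "outputs". Invoking it creates the directory
# if it does not already exist; no action block is required.
directory "outputs"
```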

We add this directory to the list of dependencies for the .html rule.
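A sketch of what that might look like follows. The `source_for_html` helper and the `pandoc` command are assumptions for illustration; the part that matters is the second entry in the rule's prerequisite list.

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

directory "outputs"

# Hypothetical helper: map an output path back to its Markdown source.
def source_for_html(html_file)
  html_file.pathmap("%{^outputs/,sources/}X.md")
end

# The rule now has two prerequisites: the Markdown source and the
# "outputs" directory task. t.source is still the first one.
rule ".html" => [->(f) { source_for_html(f) }, "outputs"] do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end
```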

Now when we run rake, we can see that it creates the directory before beginning to generate the HTML files. Unfortunately, it runs into a problem as it tries to build the appendix.html file. Since this file is in a subdirectory of the sources directory, we want the HTML output file to be in a corresponding subdirectory of the outputs directory. But this subdirectory doesn’t yet exist.

To ensure this or any other intermediate directory exists before producing an HTML file, we could execute a mkdir -p shell command, using #pathmap to pass just the directory portion of the target filename.
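Here is a sketch of that shell-based approach. The `%d` pathmap specifier yields the directory portion of a path, so `"outputs/subdir/appendix.html".pathmap("%d")` gives `"outputs/subdir"`. As before in this series, the helper name and the `pandoc` command are assumptions, and `mkdir -p` presumes a Unix-like shell.

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

directory "outputs"

# Hypothetical helper: map an output path back to its Markdown source.
def source_for_html(html_file)
  html_file.pathmap("%{^outputs/,sources/}X.md")
end

rule ".html" => [->(f) { source_for_html(f) }, "outputs"] do |t|
  # Create the target's parent directory (and any missing ancestors)
  # by shelling out, then run the converter.
  sh "mkdir -p #{t.name.pathmap('%d')}"
  sh "pandoc -o #{t.name} #{t.source}"
end
```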

But Rake gives us a shortcut for this. Instead of running a shell command, we can use a mkdir_p method right in the task:
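A sketch of the rule using that helper might look like the following. `mkdir_p` comes from Ruby's FileUtils, which Rake makes available inside Rakefiles; the `source_for_html` helper and the `pandoc` command remain assumptions for illustration.

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

directory "outputs"

# Hypothetical helper: map an output path back to its Markdown source.
def source_for_html(html_file)
  html_file.pathmap("%{^outputs/,sources/}X.md")
end

rule ".html" => [->(f) { source_for_html(f) }, "outputs"] do |t|
  # Same effect as `mkdir -p`, but without spawning a shell.
  mkdir_p t.name.pathmap("%d")
  sh "pandoc -o #{t.name} #{t.source}"
end
```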

Now when we run rake, it ensures the target directory exists before each Markdown-to-HTML transformation.

Often when writing build scripts it’s convenient to have a quick way to blow away all of the generated files. Let’s add a task to handle this. Once again, instead of running a shell command, we’ll use a Rake helper method called rm_rf. This mirrors the shell rm -rf command, which recursively deletes files and directories without any warnings or confirmation.
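A minimal sketch of such a task, assuming the generated files all live under `outputs` and that we call the task `clean`:

```ruby
require "rake" # only needed when run as a plain Ruby script, not via `rake`

# Remove the whole generated tree. Like `rm -rf`, rm_rf deletes
# recursively and does not complain if the directory is already gone.
task :clean do
  rm_rf "outputs"
end
```

With this in place, `rake clean` removes the outputs directory, and the next build starts from nothing.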