Riffing on `interpose` implementations in Ruby

I very much enjoyed Brian Cobb’s step-by-step translation of the Clojure interpose  function to Ruby. I too agree that interpose  would be a handy method to have around.

As a quick TL;DR: interpose is kind of like Array#join , except that it produces a sequence instead of a string.

[1, 2].interpose(:sep).to_a
# => [1, :sep, 2]

Brian’s solution works by building an Enumerator.

module Enumerable
  def interpose(sep)
    Enumerator.new do |y|
      items = each

      loop do
        begin
          y << items.next
        rescue StopIteration
          break
        end

        begin
          items.peek
        rescue StopIteration
          break
        else
          y << sep
        end
      end
    end
  end
end

Returning an Enumerator  is an Enumerable  convention. And it makes for very composable code.

The other characteristic of a conventional Enumerable  method is that it yields elements to a block when called with one:

[1, 2, 3].interpose do |item|
  # ...do something with items and separators...
end

Brian's solution doesn't currently do this, but it would be easy to add.

module Enumerable
  def interpose(sep, &block)
    enum = Enumerator.new do |y|
      items = each

      loop do
        begin
          y << items.next
        rescue StopIteration
          break
        end

        begin
          items.peek
        rescue StopIteration
          break
        else
          y << sep
        end
      end
    end
    if block
      enum.each(&block)
    else
      enum
    end
  end
end

This got me thinking about what it would look like to implement the same functionality based on a yielding idiom. Most Enumerable methods are implemented in terms of yield, only constructing an Enumerator  if they are called without a block.

Let's start with an empty Interpose  module. Rather than globally adding it to Enumerable , we'll make it an optional refinement for Enumerable . (This only works under Ruby 2.4 or newer)

module Interpose
  refine Enumerable do
    # ...
  end
end

The simplest requirement for interpose is that, given an empty sequence, it should yield nothing.

using Interpose

RSpec.describe "Enumerable#interpose" do
  it "yields nothing when given nothing" do
    expect{|b| [].interpose(:sep, &b)}.not_to yield_control
  end
end

This one's easy, all we need is an empty method.

module Interpose
  refine Enumerable do
    def interpose(*)
    end
  end
end

Next, if we give it a sequence with only one method, we should get back the same sequence since there are no pairs of items between which to interpose the separator.

it "yields back a one-item sequence unaltered" do
  expect{|b| [1].interpose(:sep, &b)}.to yield_successive_args(1)
end

Simply delegating to each  is sufficient.

def interpose(*, &block)
  each(&block)
end

A two-element sequence is where things start to get interesting.

it "interposes separator between a pair of items" do
  expect{|b| [1, 2].interpose(:sep, &b)}.to yield_successive_args(1, :sep, 2)
end

Now we have to actually use the method's one parameter, which means giving it a name: separator .

def interpose(separator, &block)
  # ...
end

As for the implementation... well, we could try outputting the separator before each item.

def interpose(separator, &block)
  each do |item|
    yield separator
    yield item
  end
end

But that doesn't give us what we want.

expected given block to yield successively with arguments, but yielded with unexpected arguments
expected: [1, :sep, 2]
     got: [:sep, 1, :sep, 2]

If only we could insert the separator before each item except when it's the first item. Is there a way to tell which iteration we're currently in? Yes, there is! We just need to switch from each  to each_with_index , and then we can check to see if the current index is 0.

def interpose(separator, &block)
  each_with_index do |item, index|
    yield separator if index != 0
    yield item
  end
end

How about longer lists?

it "interposes separator between each item in a 3-element list" do
  expect{|b| [1, 2, 3].interpose(:sep, &b)}
    .to yield_successive_args(1, :sep, 2, :sep, 3)
end

Our existing code handles this just fine with no further changes.

4 examples, 0 failures, 4 passed

There's one last requirement: when called with no block, the method should return an Enumerator  that has the same behavior as the block form.

context "with no block" do
  it "returns an empty enumerator when given nothing" do
    e = [].interpose(:sep)
    expect(e.to_a).to eq([])
  end
end

Ordinarily we could accomplish this using the following magic incantation at the top of the method:

def interpose(separator, &block)
  return to_enum(__callee__) unless block_given?
  # ...
end

This tells Ruby that when there's no block supplied, it should take the current method (__callee__ ), construct an Enumerator around it, and return it.

But this results in a test error:

NoMethodError: undefined method `interpose' for []:Array

The problem here is that we've defined our Enumerable  extension as a refinement. Refinements are strictly lexically-scoped to the current file. When the Enumerator  code gets around to actually sending the interpose  message to the receiver, it occurs over in Ruby's implementation of Enumerator . Since that implementation is in a different file from our interpose.rb  (and a compiled C file at that), the refinement is not in effect.

This is a feature, not a bug. The great virtue of refinements is that it is impossible for them to "leak" into contexts where they are not expected. In order for a message send to be "diverted" by a refinement, that message send must actually be visible in the context where the refinement is used. If you're looking at some code, and you scroll up and don't see a using  declaration, you can be confident that no refinements are in effect.

This does mean, though, that we need a way to construct an Enumerator where the interpose  message send occurs physically within the current file. Fortunately, this isn't difficult. It's just a little more verbose.

def interpose(separator, &block)
  unless block_given?
    return Enumerator.new do |yielder|
      interpose(separator) do |item|
        yielder << item
      end
    end
  end
  # ...
end

With the inner call to interpose  now captured in a block inside this file, the refinement is active and our test passes.

5 examples, 0 failures, 5 passed

This feels like a big distraction before the "meat" of the interpose  method. Let's extract the Enumerator  creation out into its own method.

module Interpose
  refine Enumerable do
    def interpose(separator, &block)
      return interpose_enumerator(separator) unless block_given?
      each_with_index do |item, index|
        yield separator if index != 0
        yield item
      end
    end

    private

    def interpose_enumerator(separator)
      Enumerator.new do |yielder|
        interpose(separator) do |item|
          yielder << item
        end
      end
    end
  end
end

Again: the extra boilerplate for generating an Enumerator  is only because we're defining this code as a refinement. If it were an ordinary module method, returning an optional Enumerator  would be a one-liner.

We've tested the block-less version of interpose with an empty sequence. We'd like to ensure that the generated Enumerator  has the exact same semantics as the block form of interpose, in all scenarios.

There are three more tests for the block-form call. These all have a very similar shape to each other.

it "yields back a one-item sequence unaltered" do
  expect{|b| [1].interpose(:sep, &b)}.to yield_successive_args(1)
end

it "interposes separator between a pair of items" do
  expect{|b| [1,2].interpose(:sep, &b)}
    .to yield_successive_args(1, :sep, 2)
end

it "interposes separator between each item in a 3-element list" do
  expect{|b| [1,2,3].interpose(:sep, &b)}
    .to yield_successive_args(1, :sep, 2, :sep, 3)
end

Let's extract them into a shared example group and re-use them for both block-form and block-less versions of the call. We'll start by abstracting the way interpose  is called into a helper method called expect_transformation .

def expect_transformation(from:,to:)
  expect{|b| from.interpose(:sep, &b) }.to yield_successive_args(*to)
end

it "yields back a one-item sequence unaltered" do
  expect_transformation from: [1], to: [1]
end

it "interposes separator between a pair of items" do
  expect_transformation from: [1,2], to: [1, :sep, 2]
end

it "interposes separator between each item in a 3-element list" do
  expect_transformation from: [1, 2, 3], to: [1, :sep, 2, :sep, 3]
end

Then we'll move these three abstract examples into a shared example group.

shared_examples_for "interpose semantics" do
  it "yields back a one-item sequence unaltered" do
    expect_transformation from: [1], to: [1]
  end

  it "interposes separator between a pair of items" do
    expect_transformation from: [1,2], to: [1, :sep, 2]
  end

  it "interposes separator between each item in a 3-element list" do
    expect_transformation from: [1, 2, 3], to: [1, :sep, 2, :sep, 3]
  end
end

And then we'll create a new spec context specifically for the block form of interpose . We move the "yields nothing when given nothing spec" into this context, along with our yield-oriented definition of expect_transformation .

context "with a block" do
  it "yields nothing when given nothing" do
    expect { |b| [].interpose(:sep, &b) }.not_to yield_control
  end

  def expect_transformation(from:,to:)
    expect{|b| from.interpose(:sep, &b) }.to yield_successive_args(*to)
  end

  include_examples "interpose semantics"
end

Now we can construct a matching example group for the no-block-given scenario.

context "with no block" do
  it "returns an empty enumerator when given nothing" do
    e = [].interpose(:sep)
    expect(e).to be_a(Enumerator)
    expect(e.to_a).to eq([])
  end

  def expect_transformation(from:,to:)
    expect(from.interpose(:sep).to_a).to eq(to)
  end

  include_examples "interpose semantics"
end

Note the separate, context-specific definition of expect_transformation . Instead of using RSpec block expectations, it converts the return value of interpose into an array and compares that to the expected value.

  def expect_transformation(from:,to:)
    expect(from.interpose(:sep).to_a).to eq(to)
  end

We can now confirm that we have the expected semantics both when called with and without a block.

Enumerable#interpose
  with a block
    yields nothing when given nothing
    yields back a one-item sequence unaltered
    interposes separator between a pair of items
    interposes separator between each item in a 3-element list
  with no block
    returns an empty enumerator when given nothing
    yields back a one-item sequence unaltered
    interposes separator between a pair of items
    interposes separator between each item in a 3-element list

Finished in 0.01501 seconds (files took 0.24217 seconds to load)
8 examples, 0 failures

Choosing a yield-based implementation over an Enumerator -based implementation is more than just a matter of style. Constructing and using Ruby Enumerator  objects can be comparatively heavyweight. We can see this if we benchmark the original Enumerator-based version to the yielding version. We'll use the benchmark/ips gem for the comparison.

require "benchmark/ips"
require "interpose"
using Interpose

items = Array.new(10){rand}

Benchmark.ips do |x|
  x.report("enum, w/block") do
    result = []
    items.interpose_enum(:sep) do |i|
      result << i
    end
  end

  x.report("yield, w/block") do
    result = []
    items.interpose(:sep) do |i|
      result << i
    end
  end

  x.report("enum, no block") do
    result = items.interpose_enum(:sep).to_a
  end

  x.report("yield, no block") do
    result = items.interpose(:sep).to_a
  end
end

Here are the results on my machine:

Calculating -------------------------------------
       enum, w/block     32.723k (± 9.3%) i/s -    164.832k in   5.087004s
      yield, w/block    327.682k (± 8.3%) i/s -      1.635M in   5.030449s
      enum, no block     31.692k (±12.7%) i/s -    156.288k in   5.023343s
     yield, no block     91.646k (±12.0%) i/s -    455.760k in   5.047500s

When called with a block, the yielding version is at a major advantage: about 10x faster. This reflects the fact that it doesn't have to construct an Enumerator  object at all, and can get straight to work.

But even when they are both returning Enumerators, the yielding version is still about 3x faster. I'm not sure, but I wonder if this is due in part to the fact that the Enumerator  version is forced to use exceptions for flow control. Raising exceptions tends to introduce a lot of extra overhead.

I'm pretty happy with the end result of this refactoring. It expresses its semantics concisely, and offers a meaningful performance improvement over the original.

module Interpose
  refine Enumerable do
    def interpose(separator, &block)
      return interpose_enumerator(separator) unless block_given?
      each_with_index do |item, index|
        yield separator if index != 0
        yield item
      end
    end

    private

    def interpose_enumerator(separator)
      Enumerator.new do |yielder|
        interpose(separator) do |item|
          yielder << item
        end
      end
    end
  end
end

We've made use of a bunch of different tools and techniques in this article. If there's anything here you want to know more about, and you're a RubyTapas subscriber, here are some links for further exploration:

Not a subscriber? Get a free week of access to check out the links above (and hundreds more) by using coupon code INTERPOSE when you sign up.

2 comments

Leave a Reply

Your email address will not be published. Required fields are marked *