Golfing in Reverse

I got it in my head today to write an RPN calculator. It’s not anything that I needed for any reason, and I’ve probably written half a dozen of them over the years, but it’s a fun little bit of code kata, and I find that I enjoy the exercise every time.

While I was at it, I thought I might turn it into a game of code golf, trying to use as few characters as possible.

Here’s the result:

puts $stdin.read.split.inject([]) { |s, t| s << (t =~ /\d+/ ? t.to_i : s.pop(2).inject(t.to_sym)) }.pop

That’s it. A whole RPN calculator in one line of Ruby! So good!

After saving this into rpn.rb, we can run it from the shell:

$ echo "1 2 + 3 *" | ruby rpn.rb
9

Of course, this code is almost incomprehensible. So, just for giggles, let’s reverse the golfing process and refactor it into a piece of code we can bring home to the family. First, let’s indent that big inject block:

puts $stdin.read.split.inject([]) { |s, t|
  s << (t =~ /\d+/ ? t.to_i : s.pop(2).inject(t.to_sym))
}.pop

And break the ternary operator into a regular conditional:

puts $stdin.read.split.inject([]) { |s, t|
  if t =~ /\d+/
    s << t.to_i
  else
    s << s.pop(2).inject(t.to_sym)
  end
}.pop

Then split the logic from the I/O:

def rpn(expr)
  expr.split.inject([]) { |s, t|
    if t =~ /\d+/
      s << t.to_i
    else
      s << s.pop(2).inject(t.to_sym)
    end
  }.pop
end

input = $stdin.read
puts rpn(input)

And we should definitely pick some better variable names.

def rpn(expr)
  expr.split.inject([]) { |stack, token|
    if token =~ /\d+/
      stack << token.to_i
    else
      stack << stack.pop(2).inject(token.to_sym)
    end
  }.pop
end

input = $stdin.read
puts rpn(input)

That big inject block has a nicely functional look about it, but I think the code would be even clearer if we did away with it:

def rpn(expr)
  stack = []

  expr.split.each do |token|
    if token =~ /\d+/
      stack << token.to_i
    else
      stack << stack.pop(2).inject(token.to_sym)
    end
  end

  stack.first
end

input = $stdin.read
puts rpn(input)

Relying on the return value of a destructive operation like pop makes me nervous. I’d feel better if we refactored that into a destructuring splat. I also prefer public_send to inject in this case, since we’re not dealing with a list of arbitrary length.

def rpn(expr)
  stack = []

  expr.split.each do |token|
    if token =~ /\d+/
      stack << token.to_i
    else
      *stack, arg_1, arg_2 = stack
      stack << arg_1.public_send(token, arg_2)
    end
  end

  stack.first
end

input = $stdin.read
puts rpn(input)
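The two swaps in this step can be checked in isolation. The snippet below is only a small sanity check, not part of the calculator; the variable names (popped, rest, via_inject, via_send) are made up for illustration.

```ruby
stack = [1, 2, 3]

# pop(2) removes and returns the last two elements; we call it on a copy
# here so the original array stays intact for the comparison below.
popped = stack.dup.pop(2)

# The destructuring splat yields the same two values, plus the remainder,
# without calling any mutating method.
*rest, arg_1, arg_2 = stack

# With exactly two operands, inject with a symbol and public_send agree.
via_inject = [arg_1, arg_2].inject(:-)
via_send   = arg_1.public_send("-", arg_2)

p popped      # [2, 3]
p rest        # [1]
p via_inject  # -1
p via_send    # -1
```

Using subtraction rather than addition makes the operand order visible: the element that was pushed first ends up as the receiver.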

And voilà! Now it’s relatively easy to see what’s going on. The rpn method takes an expression expr (a string) and splits it into a list of tokens. Integer tokens are pushed straight onto the stack. A mathematical operator takes the top two elements off the stack, is applied to them, and the result is pushed back on. If the expression is well formed, exactly one element is left on the stack at the end, and that element is the result of evaluating the expression.
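To watch the stack evolve, here is a rough sketch of a tracing variant. rpn_trace is a hypothetical helper of my own, not something from the code above; it uses the same logic as rpn but records a snapshot of the stack after each token.

```ruby
# Hypothetical helper: same logic as rpn, but returns [token, stack snapshot]
# pairs instead of the final result.
def rpn_trace(expr)
  stack = []

  expr.split.map do |token|
    if token =~ /\d+/
      stack << token.to_i
    else
      *stack, arg_1, arg_2 = stack
      stack << arg_1.public_send(token, arg_2)
    end

    # dup, because << mutates the same array on later iterations
    [token, stack.dup]
  end
end

rpn_trace("1 2 + 3 *").each do |token, stack|
  puts "#{token.ljust(2)} #{stack.inspect}"
end
# 1  [1]
# 2  [1, 2]
# +  [3]
# 3  [3, 3]
# *  [9]
```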

This still isn’t perfect—if we were wearing our object-orientation hats we’d really want to apply the extract class and extract method patterns a few times—but it’s hugely better than that mysterious one-liner we started with.
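For the curious, one way those extractions might turn out is sketched below. This is only a guess at a reasonable shape: the class name RPNCalculator and the method names evaluate, process, number? and apply are all my own inventions for illustration.

```ruby
# Hypothetical sketch of the extract class / extract method refactoring.
# All names here are illustrative assumptions.
class RPNCalculator
  def evaluate(expr)
    @stack = [] # start fresh so an instance can be reused
    expr.split.each { |token| process(token) }
    @stack.first
  end

  private

  def process(token)
    if number?(token)
      @stack << token.to_i
    else
      apply(token)
    end
  end

  # Same loose check as before: any token containing a digit counts as a number.
  def number?(token)
    token =~ /\d+/
  end

  def apply(operator)
    *@stack, arg_1, arg_2 = @stack
    @stack << arg_1.public_send(operator, arg_2)
  end
end

puts RPNCalculator.new.evaluate("1 2 + 3 *")
# 9
```

Whether this is clearer than the single method is debatable at this size; the extraction mostly pays off once there is more than one kind of token to handle.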

There’s something about refactoring dense or messy code that I find incredibly relaxing (even—or especially—when it’s old code that I wrote myself). It’s like solving a chess puzzle or untangling a knot. There’s a set of well-defined moves that you can make, and by applying them judiciously you can transform a confusing, complex situation into something clear and simple and aesthetically appealing.
