Ruby has the Comparable module: if you implement the spaceship operator <=> (winner of “Best Named Operator” 10 years running!), it gives you a bunch of comparison operators for free (<, <=, ==, >= and >). Win. Enumerable’s #sort method uses the spaceship operator to do its sorting too, so implementing the spaceship gives you a whole bunch of interesting behaviour pretty much for free.
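To make that concrete, here's a minimal sketch. The Version class and its major/minor attributes are made up purely for illustration; the only real API in play is Comparable and #sort.

```ruby
# Hypothetical class, for illustration only.
class Version
  include Comparable

  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # The one method we write by hand; Comparable derives the rest from it.
  def <=>(other)
    return nil unless other.is_a?(Version)

    [major, minor] <=> [other.major, other.minor]
  end
end

v1 = Version.new(1, 2)
v2 = Version.new(1, 10)

less_than = v1 < v2                 # comes from Comparable
equal     = v1 == Version.new(1, 2) # also comes from Comparable
sorted    = [v2, v1].sort           # #sort calls <=> as well

puts less_than
puts equal
puts sorted.map { |v| "#{v.major}.#{v.minor}" }.join(", ")
```

Returning nil from <=> for a foreign type is a design choice on my part: it tells Comparable the two objects can't be ordered, rather than blowing up with a NoMethodError.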

However, while you might think that Array#uniq used ==, or even <=> directly, to implement its uniqueness semantics, you’d be wrong.

First, let’s take a step back. There are a few ways to determine different qualities of equality in Ruby:

  • == is value equality: two objects are equal if they represent the same value, whatever their type. So, 1 == 1 and also 1 == 1.0.

  • eql? is stronger than ==: in addition to value equality, the two objects must have type equality. The above example demonstrates the difference: 1.eql?(1) is true but 1.eql?(1.0) is false.

  • equal? is even stronger still; both sides must be the very same object (the same object id). To be honest, the only time I’ve had call to use this is when testing that a DataMapper-style implementation was doing the right thing.
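The three levels above are easy to poke at in irb. A quick sketch, using nothing but core Integer, Float and String (the variable names are mine):

```ruby
# Value equality: the Integer and the Float represent the same value.
value_equal = (1 == 1.0)

# eql? on numerics also requires the same type.
eql_mixed = 1.eql?(1.0)
eql_same  = 1.eql?(1)

# Two distinct String objects with the same contents.
a = "ruby"
b = a.dup

strings_value_equal = (a == b)
strings_eql         = a.eql?(b)
strings_identical   = a.equal?(b) # different objects, so not identical
self_identical      = a.equal?(a)

puts value_equal, eql_mixed, eql_same
puts strings_value_equal, strings_eql, strings_identical, self_identical
```

I've used a.dup rather than a second literal so that the two strings are guaranteed to be separate objects, regardless of any frozen string literal settings.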

And it turns out that Ruby uses #eql? to determine whether an object is unique or not.

Here’s the first thing I found slightly surprising: eql? is not implemented in terms of ==. To my mind, it’s an obvious implementation:

def eql?(other)
  self.class == other.class && self == other
end

and there would be a faster implementation in C, of course, but I guess there’s a reason it doesn’t do that. Then again, I wonder if that should be:

def eql?(other)
  is_a?(other.class) && self == other
end

It’s not really obvious from the Pickaxe, and I can’t think of an example to test it with. Anyway, back on track. So, if you’re wanting to customise the way that #uniq works, you’re going to have to implement #eql? on your own class.

And you’re still going to be surprised that doing so doesn’t make #uniq work. See, there’s a bit of a performance optimisation going on here. First of all, #uniq uses #hash to group together potential duplicates, then it uses #eql? to verify that the candidates in each group definitely are the same.
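Here's a small sketch of that surprise. Both classes are hypothetical and exist only to show the difference: EqlOnly overrides #eql? and nothing else, while WithHash also overrides #hash.

```ruby
# Hypothetical class: overrides eql? but keeps the default, per-object #hash.
class EqlOnly
  attr_reader :code

  def initialize(code)
    @code = code
  end

  def eql?(other)
    other.is_a?(EqlOnly) && code == other.code
  end
end

# Hypothetical subclass: same eql?, but equal codes now hash alike too.
class WithHash < EqlOnly
  def hash
    code.hash
  end
end

# The default #hash differs between the two objects, so they never land in
# the same bucket and eql? is never consulted.
eql_only_size = [EqlOnly.new(1), EqlOnly.new(1)].uniq.size

# Matching hashes put both objects in one bucket; eql? then confirms it.
with_hash_size = [WithHash.new(1), WithHash.new(1)].uniq.size

puts eql_only_size
puts with_hash_size
```

On the Ruby versions I've tried, the first array keeps both elements and the second collapses to one.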

So, if you want to customise the way #uniq works for your particular class, you also have to implement #hash. Here’s what I came up with in the end (imagine that nominal_code fully encompasses the object’s identity and ordering if you will):

class NominalAccount
  attr_reader :nominal_code

  include Comparable

  def <=>(other)
    nominal_code <=> other.nominal_code
  end

  def eql?(other)
    self.class == other.class && self == other
  end

  def hash
    nominal_code.hash
  end
end

Now I can sort and uniqueify collections of these objects through their natural primary key. Win.
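For completeness, here's a runnable sketch of the class in use. The constructor and the sample nominal codes are assumptions, added only so the example can stand on its own.

```ruby
class NominalAccount
  include Comparable

  attr_reader :nominal_code

  # Assumed constructor, added here so the example can run.
  def initialize(nominal_code)
    @nominal_code = nominal_code
  end

  def <=>(other)
    nominal_code <=> other.nominal_code
  end

  def eql?(other)
    self.class == other.class && self == other
  end

  def hash
    nominal_code.hash
  end
end

# Made-up nominal codes, with one duplicate.
accounts = [4000, 1200, 4000, 2100].map { |code| NominalAccount.new(code) }

unique_sorted_codes = accounts.uniq.sort.map(&:nominal_code)

puts unique_sorted_codes.inspect
```

The same #hash and #eql? pair is what Hash uses for key lookup, so these objects should also behave sensibly as hash keys.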
