Don’t Call it “Case Equality”

July 30th, 2009 by Brett Rasmussen

I’ve recently learned to love Ruby’s “triple equals” operator, sometimes referred to as the “case equality operator”. But I stand with Hal Fulton, author of The Ruby Way, in disliking the latter term, since no real equality is involved in its use. It’s also not really an operator (it’s a method), but I’m not going to complain too loudly about that one, considering that I prefer the term “relationship operator”. I’m also not opposed to “trequals”, which has a certain jeunesse dorée about it. You could say “trequals” at a trendy restaurant with post-modern decor and everyone wearing black.

With one equals sign you assign a value to a variable:

composer = "Beethoven"

With two equals signs you check whether two things are equal:

puts "9th Symphony" if melody == "Ode to Joy"

With three equals signs you get, well, essentially a placeholder: a method you can define to express an arbitrary relationship between objects. You will mostly never call it by hand yourself, but Ruby will call it for you when you run case statements:

class Composer
  attr_accessor :works
  def initialize(*works)
    @works = works
  end

  def ===(work)
    @works.include?(work)
  end
end

The trequals operator (ok, method) returns true or false depending on a condition I’ve defined. Now I can test a given work against a bunch of composer objects using a case statement:

beethoven = Composer.new("Fur Elise", "Missa Solemnis", "9th Symphony")
mozart = Composer.new("The Magic Flute", "C Minor Mass", "Requiem")
bach = Composer.new("St. Matthew Passion", "Jesu, Joy of Man's Desiring")

case "Requiem"
  when beethoven
    process_beethoven_work
  when mozart
    process_mozart_work
  when bach
    process_bach_work
end

The trequals is called behind the scenes by Ruby. Since I’ve defined it on the Composer class to look for a matching entry in that composer’s list of works, the case statement becomes a way of running different code based on which composer wrote the work in question.
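To make the "behind the scenes" part concrete, here is a minimal sketch of what the case statement boils down to. The Composer class is repeated from above so the snippet runs on its own; the constants and the two helper methods (composer_by_case and composer_by_hand) are hypothetical names invented for this illustration.

```ruby
# Same Composer class as above, repeated so this sketch is self-contained.
class Composer
  attr_accessor :works

  def initialize(*works)
    @works = works
  end

  def ===(work)
    @works.include?(work)
  end
end

BEETHOVEN = Composer.new("Fur Elise", "Missa Solemnis", "9th Symphony")
MOZART = Composer.new("The Magic Flute", "C Minor Mass", "Requiem")

# Hypothetical helper: lets Ruby call === for us through a case statement.
def composer_by_case(work)
  case work
  when BEETHOVEN then :beethoven
  when MOZART    then :mozart
  else                :unknown
  end
end

# Hypothetical helper: the same checks with === called by hand.
def composer_by_hand(work)
  if BEETHOVEN === work
    :beethoven
  elsif MOZART === work
    :mozart
  else
    :unknown
  end
end

p composer_by_case("Requiem")   # => :mozart
p composer_by_hand("Requiem")   # => :mozart
```

Both helpers should agree for any work you pass in, since the case version is just a tidier way of writing the if/elsif chain.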

This example is contrived, of course, because if the need were this simple you’d probably just check some_composer.works.include?('Requiem') by hand. But the example demonstrates the crucial point, that there’s no equality being checked for. A work in no way is the composer. It’s a relationship that the case statement is checking for (the given work was written by the given composer), and it’s a relationship that I’ve defined explicitly for my own music-categorizing purposes.

That case statements work this way is yet another example of the magical and powerful stuff that characterizes Ruby. Instead of simply a strict equality match, we can now switch against multiple types, all with different definitions of what qualifies as a relationship:

class String
  def ===(other_str)
    self.strip[0, other_str.length].downcase == other_str.downcase
  end
end

class Array
  def ===(str)
    self.any? {|elem| elem.include?(str)}
  end
end

class Integer  # this class was called Fixnum before Ruby 2.4
  def ===(str)
    self == str.to_i
  end
end

string_to_test = "99 Monkeys"
case string_to_test
  when "99 monkeys jumping on the bed"
    do_monkey_stuff
  when ["77 Rhinos Jumped", "88 Giraffes Danced", "99 Monkeys Sang"]
    do_animal_behavior_stuff
  when 99
    do_quantity_stuff
  when /^\d+\s+\w+/
    do_regex_stuff
end

Here, the respective code will be run if the string to be tested is the first portion of the larger string (case-insensitively speaking), if it is part of any of the elements in the specified array, if it starts out with 99 (String#to_i reads only the leading digits), or if it matches the given regular expression. In this case, it matches all of them, so only the code for the first branch, the string match, will be run (in Ruby, a case statement automatically stops at the first match, so you don’t need to end each branch with a “break”).
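The first-match-wins behavior is easy to check without reopening any core classes. The following is only a sketch using the trequals definitions that Class and Range already ship with; classify and all_matches are hypothetical helper names, not anything from the code above.

```ruby
# Hypothetical helper: several branches may match, but only the first one runs.
def classify(value)
  case value
  when Integer then "integer"            # checked first
  when 1..100  then "between 1 and 100"
  when Numeric then "numeric"
  else              "something else"
  end
end

# Hypothetical helper: lists every matcher whose === accepts the value.
def all_matches(value)
  [Integer, 1..100, Numeric].select { |matcher| matcher === value }
end

p all_matches(42)     # => [Integer, 1..100, Numeric]
puts classify(42)     # prints "integer", even though all three matched
puts classify(42.5)   # prints "between 1 and 100"
```

Reordering the when clauses changes the answer for 42, which is a handy reminder to put the most specific matcher first.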

Note that I didn’t need to define (actually, override) the trequals on the regular expression. The relationship operator is a method on Object, so all Ruby objects inherit it. If not overridden, it defaults to a simple double-equals equality check (thus contributing to the momentum of the misnomer “case equality”). But some standard Ruby classes already come with their own definition for trequals. Regexp and Range are the notable examples: Regexp defines it to mean a match on that regular expression, and Range defines it to mean a number that falls somewhere within that range, as such:

num = 77
case num
  when 1..50
    puts "found a lower number"
  when 51..100
    puts "found a higher number"
end
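If you want to see the built-in definitions at work without a case statement, you can call the trequals by hand. This is just an illustrative sketch; PLAIN and BUILT_IN_CHECKS are names made up for the example.

```ruby
PLAIN = Object.new

# Each entry calls === directly on an instance of a standard Ruby class.
BUILT_IN_CHECKS = {
  regexp_match:   /^\d+\s+\w+/ === "99 Monkeys",  # Regexp#===: does the pattern match?
  range_member:   (51..100) === 77,               # Range#===: is the value in the range?
  range_outsider: (1..50) === 77,
  same_object:    PLAIN === PLAIN,                # Object#===: falls back to ==
  other_object:   PLAIN === Object.new
}

BUILT_IN_CHECKS.each { |name, result| puts "#{name}: #{result}" }
```

The last two entries show the default inherited from Object: with no override in place, two distinct plain objects are not related, but an object is always related to itself.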

Note that since === is really a method, it is not commutative, meaning you can’t swap sides on the call; “a === b” is not the same as “b === a”. If you think through it, it makes sense. You’re really calling “a.===(b)”. If a is an array, you’re calling a method on Array, which will be defined for Array’s own purposes. If b is a string, and you swapped the order, you’d be calling a String method, which would have a different purpose for its trequals operator, so “b.===(a)” would most likely be something quite different. This concept also means that the variable you’re testing in a case statement is being passed as a parameter to the trequals methods of the various case objects, not the other way around. These two snippets are equivalent:

case "St. Matthew Passion"
  when mozart
    process_mozart_work
end

process_mozart_work if mozart === "St. Matthew Passion"

Note that the second snippet was not:

process_mozart_work if "St. Matthew Passion" === mozart
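You can watch the sides refuse to swap using nothing but built-in classes. A small sketch follows; both_ways is a hypothetical helper that simply evaluates the call in both directions.

```ruby
# Hypothetical helper: returns [a === b, b === a] so the two can be compared.
def both_ways(a, b)
  [a === b, b === a]
end

# Range#=== asks "is 5 inside the range?"; Integer#=== asks "is 5 equal to the range?"
p both_ways(1..10, 5)            # => [true, false]

# Module#=== asks "is this an instance of String?"; String#=== asks "are these equal strings?"
p both_ways(String, "Requiem")   # => [true, false]
```

In each pair the receiver decides what the question means, which is why the value under test always ends up as the argument.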

It’s also good to know (although I’m not sure how useful) that the relationship operator is used implicitly by Ruby when rescuing errors in a begin-rescue block:

begin
  do_some_stuff
rescue ArgumentError, SyntaxError
  handle_arg_or_syn_error
rescue IOError
  handle_io_error
rescue NoMemoryError
  handle_mem_error
end

In this example, Ruby runs ArgumentError.===, passing it the global variable $!, which holds the most recent error. If that returns false, it moves along, doing the same with SyntaxError, IOError, and NoMemoryError, each in turn. For error classes, the trequals is the one that every class inherits from Module: it returns true when the error that occurred is an instance of the candidate class (in this case, ArgumentError, etc.) or of one of its subclasses.
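One way to convince yourself that rescue really goes through the trequals is to hand it a module with its own definition. This is a sketch rather than a recommended pattern; ShortMessage and handle are hypothetical names invented for the demonstration.

```ruby
# Hypothetical matcher: rescue accepts any class or module and calls === on it.
module ShortMessage
  def self.===(error)
    error.message.length < 10
  end
end

# Hypothetical helper: raises an error, then reports which rescue clause caught it.
def handle(message)
  raise ArgumentError, message
rescue ShortMessage
  :short
rescue StandardError
  :other
end

p handle("oops")                    # => :short
p handle("a much longer message")   # => :other

# The stock behavior, called by hand: instances of subclasses match their ancestors.
p StandardError === ArgumentError.new("x")   # => true
p ArgumentError === StandardError.new("x")   # => false
```

The first rescue clause never mentions an error class at all, yet it catches the ArgumentError because its trequals says the two are related.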

It took me a long time before I cared about this little Ruby feature, which I think is sad. I think I just saw the phrase “case equality” and thought something like “Hmm, another subtle variation on what it means for two objects to be equal. I’m sure I’ll have occasion to use this someday. I’ll figure it out then.” But it’s more useful than that, and I think it would get better traction without the specious nomenclature.

Ruby file trimming app

July 17th, 2009 by hals

We recently had an interesting experience with very large files. These were comma delimited files (.csv) containing hundreds of thousands of records, each with a dozen or so fields.

e.g.

rec1,field2,,,,,,xxx,fieldn,,,1,2,3,,,fieldx
rec2,field22,,,a,s,d,fieldmore,,,,etc
.
.
.
recn,field2n,,,,ring,,,,ring,1,2,,,hello?,,etc

While testing the setup, we had smaller files to work with. The goal was to create a new file containing only the first field from each record.

e.g.

rec1
rec2
.
.
.
recn

During testing this was easily done by opening the file in a spreadsheet program (such as OpenOffice), which would split the records on the comma delimiter and place each field in a different column. Then, it was easy to select the first column and write it out to the new file.

On switching to production files, we discovered that OpenOffice has a limit of 65k rows – a fraction of what we needed. We then tried some other spreadsheet programs, which produced the same results. We knew there was at least one spreadsheet program that would work, but it was not open source.

At this point the comment was made: “well, we ARE ruby developers …”

And that led to the following simple solution to the problem at hand.

With a few lines of Ruby code, the source files could be read in, line by line, split on the comma delimiter, and the first entry written out to the destination file.

So, when the usual tools just don’t work, remember that a new Ruby tool might be just around the corner.

#!/usr/bin/ruby
#
#  trimfile.rb
#
# Writes the first comma-delimited field of each input line to a new file.

class Trimfile
  attr_accessor :fileName, :newFile

  def initialize(fileName, newFile)
    puts "\nSplit off first comma delimited item of each line."
    # Fall back to default filenames when no arguments are given.
    @fileName = fileName || "trimin.txt"
    @newFile = newFile || "trimout.txt"
    linecount = 0
    puts "\nFilenames – input: #{@fileName}, output: #{@newFile}"
    # The block form of File.open closes the output file even if an error occurs.
    File.open(@newFile, "w") do |aFile|
      # Read one line at a time so a very large file never has to fit in memory.
      File.foreach(@fileName) do |line|
        aFile.puts line.split(',')[0]
        linecount += 1
      end
    end
    puts "\nTotal lines: #{linecount}"
  end
end

test = Trimfile.new(ARGV[0], ARGV[1])
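If you want to try the core of this without creating any files, the same idea can be sketched against in-memory streams. This is an illustration under stated assumptions, not the script we ran: trim_first_field and SAMPLE are hypothetical names, and the limit argument to split (stop after the first comma) is a small tweak that the original script does not use.

```ruby
require "stringio"

# Hypothetical helper: copies the first comma-delimited field of each line
# from one IO-like object to another and returns the number of lines read.
def trim_first_field(input, output)
  count = 0
  input.each_line do |line|
    # A limit of 2 stops splitting after the first comma, so the rest of a
    # long record is never broken into fields we would throw away anyway.
    output.puts line.chomp.split(",", 2).first
    count += 1
  end
  count
end

SAMPLE = "rec1,field2,,,xxx\nrec2,field22,,,a,s,d\n"

out = StringIO.new
p trim_first_field(StringIO.new(SAMPLE), out)   # => 2
print out.string                                # prints rec1 and rec2 on separate lines
```

Because the helper only needs each_line and puts, it should work just as well with real File objects in place of the StringIO ones. Note that, like the script above, it assumes plain comma-delimited data with no quoted fields containing commas.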


Copyright © 2005-2016 PMA Media Group. All Rights Reserved