Ruby design pattern: How to make an extensible factory class?
Ok, suppose I have Ruby program to read version control log files and do something with the data. (I don't, but the situation is analogous, and I have fun with these analogies). Let's suppose right now I want to support Bazaar and Git. Let's suppose the program will be executed with some kind of argument indicating which version control software is being used.
Given this, I want to make a LogFileReaderFactory which given the name of a version control program will return an appropriate log file reader (subclassed from a generic) to read the log file and spit out a canonical internal representation. So, of course, I can make BazaarLogFileReader and GitLogFileReader and hard-code them into the program, but I want it to be set up in such a way that adding support for a new version control program is as simple as plopping a new class file in the directory with the Bazaar and Git readers.
So, right now you can call "do-something-with-the-log --software git" and "do-something-with-the-log --software bazaar" because there are log readers for those. What I want is for it to be possible to simply add a SVNLogFileReader class and file to the same directory and automatically be able to call "do-something-with-the-log --software svn" without ANY changes to the rest of the program. (The files can of course be named with a specific pattern and globbed in the require call.)
I know this can be done in Ruby... I just don't how I should do it... or if I should do it at all.
You don't need a LogFileReaderFactory; just teach your LogFileReader class how to instantiate its subclasses:
class LogFileReader
def self.create type
case type
when :git
GitLogFileReader.new
when :bzr
BzrLogFileReader.new
else
raise "Bad log file type: #{type}"
end
end
end
class GitLogFileReader < LogFileReader
def display
puts "I'm a git log file reader!"
end
end
class BzrLogFileReader < LogFileReader
def display
puts "A bzr log file reader..."
end
end
As you can see, the superclass can act as its own factory. Now, how about automatic registration? Well, why don't we just keep a hash of our registered subclasses, and register each one when we define them:
class LogFileReader
@@subclasses = { }
def self.create type
c = @@subclasses[type]
if c
c.new
else
raise "Bad log file type: #{type}"
end
end
def self.register_reader name
@@subclasses[name] = self
end
end
class GitLogFileReader < LogFileReader
def display
puts "I'm a git log file reader!"
end
register_reader :git
end
class BzrLogFileReader < LogFileReader
def display
puts "A bzr log file reader..."
end
register_reader :bzr
end
LogFileReader.create(:git).display
LogFileReader.create(:bzr).display
class SvnLogFileReader < LogFileReader
def display
puts "Subersion reader, at your service."
end
register_reader :svn
end
LogFileReader.create(:svn).display
And there you have it. Just split that up into a few files, and require them appropriately.
You should read Peter Norvig's Design Patterns in Dynamic Languages if you're interested in this sort of thing. He demonstrates how many design patterns are actually working around restrictions or inadequacies in your programming language; and with a sufficiently powerful and flexible language, you don't really need a design pattern, you just implement what you want to do. He uses Dylan and Common Lisp for examples, but many of his points are relevant to Ruby as well.
You might also want to take a look at Why's Poignant Guide to Ruby, particularly chapters 5 and 6, though only if you can deal with surrealist technical writing.
edit: Riffing of off Jörg's answer now; I do like reducing repetition, and so not repeating the name of the version control system in both the class and the registration. Adding the following to my second example will allow you to write much simpler class definitions while still being pretty simple and easy to understand.
def log_file_reader name, superclass=LogFileReader, &block
Class.new(superclass, &block).register_reader(name)
end
log_file_reader :git do
def display
puts "I'm a git log file reader!"
end
end
log_file_reader :bzr do
def display
puts "A bzr log file reader..."
end
end
Of course, in production code, you may want to actually name those classes, by generating a constant definition based on the name passed in, for better error messages.
def log_file_reader name, superclass=LogFileReader, &block
c = Class.new(superclass, &block)
c.register_reader(name)
Object.const_set("#{name.to_s.capitalize}LogFileReader", c)
end
This is really just riffing off Brian Campbell's solution. If you like this, please upvote his answer, too: he did all the work.
#!/usr/bin/env ruby
class Object; def eigenclass; class << self; self end end end
module LogFileReader
class LogFileReaderNotFoundError < NameError; end
class << self
def create type
(self[type] ||= const_get("#{type.to_s.capitalize}LogFileReader")).new
rescue NameError => e
raise LogFileReaderNotFoundError, "Bad log file type: #{type}" if e.class == NameError && e.message =~ /[^: ]LogFileReader/
raise
end
def []=(type, klass)
@readers ||= {type => klass}
def []=(type, klass)
@readers[type] = klass
end
klass
end
def [](type)
@readers ||= {}
def [](type)
@readers[type]
end
nil
end
def included klass
self[klass.name[/[[:upper:]][[:lower:]]*/].downcase.to_sym] = klass if klass.is_a? Class
end
end
end
def LogFileReader type
Here, we create a global method (more like a procedure, actually) called LogFileReader
, which is the same name as our module LogFileReader
. This is legal in Ruby. The ambiguity is resolved like this: the module will always be preferred, except when it's obviously a method call, i.e. you either put parentheses at the end (Foo()
) or pass an argument (Foo :bar
).
This is a trick that is used in a few places in the stdlib, and also in Camping and other frameworks. Because things like include
or extend
aren't actually keywords, but ordinary methods that take ordinary parameters, you don't have to pass them an actual Module
as an argument, you can also pass anything that evaluates to a Module
. In fact, this even works for inheritance, it is perfectly legal to write class Foo < some_method_that_returns_a_class(:some, :params)
.
With this trick, you can make it look like you are inheriting from a generic class, even though Ruby doesn't have generics. It's used for example in the delegation library, where you do something like class MyFoo < SimpleDelegator(Foo)
, and what happens, is that the SimpleDelegator
method dynamically creates and returns an anonymous subclass of the SimpleDelegator
class, which delegates all method calls to an instance of the Foo
class.
We use a similar trick here: we are going to dynamically create a Module
, which, when it is mixed into a class, will automatically register that class with the LogFileReader
registry.
LogFileReader.const_set type.to_s.capitalize, Module.new {
There's a lot going on in just this line. Let's start from the right: Module.new
creates a new anonymous module. The block passed to it, becomes the body of the module – it's basically the same as using the module
keyword.
Now, on to const_set
. It's a method for setting a constant. So, it's the same as saying FOO = :bar
, except that we can pass in the name of the constant as a parameter, instead of having to know it in advance. Since we are calling the method on the LogFileReader
module, the constant will be defined inside that namespace, IOW it will be named LogFileReader::Something
.
So, what is the name of the constant? Well, it's the type
argument passed into the method, capitalized. So, when I pass in :cvs
, the resulting constant will be LogFileParser::Cvs
.
And what do we set the constant to? To our newly created anonymous module, which is now no longer anonymous!
All of this is really just a longwinded way of saying module LogFileReader::Cvs
, except that we didn't know the "Cvs" part in advance, and thus couldn't have written it that way.
eigenclass.send :define_method, :included do |klass|
This is the body of our module. Here, we use define_method
to dynamically define a method called included
. And we don't actually define the method on the module itself, but on the module's eigenclass (via a small helper method that we defined above), which means that the method will not become an instance method, but rather a "static" method (in Java/.NET terms).
included
is actually a special hook method, that gets called by the Ruby runtime, everytime a module gets included into a class, and the class gets passed in as an argument. So, our newly created module now has a hook method that will inform it whenever it gets included somewhere.
LogFileReader[type] = klass
And this is what our hook method does: it registers the class that gets passed into the hook method into the LogFileReader
registry. And the key that it registers it under, is the type
argument from the LogFileReader
method way above, which, thanks to the magic of closures, is actually accessible inside the included
method.
end
include LogFileReader
And last but not least, we include the LogFileReader
module in the anonymous module. [Note: I forgot this line in the original example.]
}
end
class GitLogFileReader
def display
puts "I'm a git log file reader!"
end
end
class BzrFrobnicator
include LogFileReader
def display
puts "A bzr log file reader..."
end
end
LogFileReader.create(:git).display
LogFileReader.create(:bzr).display
class NameThatDoesntFitThePattern
include LogFileReader(:darcs)
def display
puts "Darcs reader, lazily evaluating your pure functions."
end
end
LogFileReader.create(:darcs).display
puts 'Here you can see, how the LogFileReader::Darcs module ended up in the inheritance chain:'
p LogFileReader.create(:darcs).class.ancestors
puts 'Here you can see, how all the lookups ended up getting cached in the registry:'
p LogFileReader.send :instance_variable_get, :@readers
puts 'And this is what happens, when you try instantiating a non-existent reader:'
LogFileReader.create(:gobbledigook)
This new expanded version allows three different ways of defining LogFileReader
s:
- All classes whose name matches the pattern
<Name>LogFileReader
will automatically be found and registered as aLogFileReader
for:name
(see:GitLogFileReader
), - All classes that mix in the
LogFileReader
module and whose name matches the pattern<Name>Whatever
will be registered for the:name
handler (see:BzrFrobnicator
) and - All classes that mix in the
LogFileReader(:name)
module, will be registered for the:name
handler, regardless of their name (see:NameThatDoesntFitThePattern
).
Please note that this is just a very contrived demonstration. It is, for example, definitely not thread-safe. It might also leak memory. Use with caution!
One more minor suggestion for Brian Cambell's answer -
In you can actually auto-register the subclasses with an inherited callback. I.e.
class LogFileReader
cattr_accessor :subclasses; self.subclasses = {}
def self.inherited(klass)
# turns SvnLogFileReader in to :svn
key = klass.to_s.gsub(Regexp.new(Regexp.new(self.to_s)),'').underscore.to_sym
# self in this context is always LogFileReader
self.subclasses[key] = klass
end
def self.create(type)
return self.subclasses[type.to_sym].new if self.subclasses[type.to_sym]
raise "No such type #{type}"
end
end
Now we have
class SvnLogFileReader < LogFileReader
def display
# do stuff here
end
end
With no need to register it
This should work too, without the need for registering class names
class LogFileReader
def self.create(name)
classified_name = name.to_s.split('_').collect!{ |w| w.capitalize }.join
Object.const_get(classified_name).new
end
end
class GitLogFileReader < LogFileReader
def display
puts "I'm a git log file reader!"
end
end
and now
LogFileReader.create(:git_log_file_reader).display