Friday, 6 August 2010

The Venusian Landscape of Ruby Metaprogramming

In this post, I am going to resist the urge to rant about the stench of hackery that both WebMatrix and Microsoft.Data.dll have brought forth and instead use it to solidify some Ruby idioms in my head.  I use this blog either as a cathartic vehicle to vent my frustrations or, when I blog more technically, as a way to cement new concepts that I am likely to forget.

I want to know what tools are available to me in the dynamic land of Ruby rather than code my applications as I would in the more familiar land of C#.  Of course I might end up using them just for the sake of it, but they are useful to know nonetheless.

I am still very new to this brave new world of Ruby, so if you think that I am blogging out of my ass then please feel free to write a rude comment telling me how wrong I am.  God knows I have done this enough times myself so maybe a bit of payback will not do me any harm.

As I write more Ruby and dig more into Rails, I am utterly blown away by just how abstract and surreal the language is.  To me, some of the dynamic language constructs are akin to watching a David Lynch film while on a heady cocktail of hallucinogens.

In Ruby, there really are some very odd concepts and idioms that I see repeated over and over in the source code that I have been reading in my quest to get up to speed with Ruby.

If you have used ActiveRecord, the default ORM in Ruby on Rails, you will be very familiar with the syntax below:


    class ExpenseType < ActiveRecord::Base
      has_many :expenses
    end


Method calls like has_many are known as class macros.  The ActiveRecord association macros define relationships between entities in much the same way as the <one-to-many/> and <many-to-one/> elements do in NHibernate mapping files.

 

I have been writing these method calls for a while but I really did not have much of a clue what was going on.  There really is quite a lot of metaprogramming magic going on behind the scenes which is not initially obvious.  As the ActiveRecord source code is included in the installation when you install the gem, I dug deep into the crazy, crazy code.

It is rather odd to see these association-defining method calls dangling outside of a method definition and instead hanging on the class itself.  It is now that I will broach the first weird concept.  The has_many class macro above is not an instance method; it is what is known as a class method.
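
To convince myself that a call like this is just an ordinary method call, I put together a minimal sketch.  The names MiniRecord and associations are hypothetical and this is certainly not how ActiveRecord implements has_many; it only assumes that a class body is plain code that runs when the class is defined, with self set to the class.

```ruby
# Hypothetical base class for illustration; NOT ActiveRecord's implementation.
class MiniRecord
  # A class method, so it can be called on MiniRecord and its subclasses.
  def self.has_many(name)
    associations << name
  end

  # Each class keeps its own list in a class-level instance variable.
  def self.associations
    @associations ||= []
  end
end

class ExpenseType < MiniRecord
  # The class body runs with self == ExpenseType, so this line is
  # simply ExpenseType.has_many(:expenses).
  has_many :expenses
end

p ExpenseType.associations   # => [:expenses]
```

The macro looks declarative, but all it does here is record a symbol on the class at the moment the class body is evaluated.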

This is where things start getting very surreal so let us see how long you stick around before navigating away to some dull and predictable .NET related DDD post like "Obsessive use of Aggregate roots with absolutely no Mutators enables your projects to ship on time.".

Class methods, as it turns out, are very odd: they are members of what is often known as the singleton class or, even more weirdly, the eigenclass.  I still have doubts whether singleton classes and eigenclasses are the same thing but I think they are.

Time for an example before I try to explain any more:

    duck = "Daffy Duck"

    def duck.speak
      puts "That's all folks"
    end

 
Here the speak method is certainly not part of the String class.  No other string responds to this method.  The speak definition above creates a method that only exists for a single object, not for every instance of that object's class.  While most object oriented languages have class structures that support both instance methods and class methods (often known as static methods), Ruby under the hood only has instance methods.  If Ruby only has instance methods, where does the speak method end up?  On the singleton class or eigenclass of course, as an instance method of a class that belongs to that one object alone.
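
A quick way to check this for yourself is to ask the objects directly.  In this sketch the second string, other, is my own addition, speak returns its text instead of printing it so the result is easy to inspect, and I use String.new so the string is mutable even when frozen string literals are switched on.

```ruby
duck  = String.new("Daffy Duck")    # String.new always gives an unfrozen string
other = String.new("Donald Duck")   # a second string, added for comparison

# Defines speak on duck's singleton class only.
def duck.speak
  "That's all folks"
end

p duck.respond_to?(:speak)                         # => true
p other.respond_to?(:speak)                        # => false
p duck.singleton_methods                           # => [:speak]
p String.instance_methods(false).include?(:speak)  # => false
```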

This is possible because Ruby classes are actually objects instantiated from the Class class.  That is right, the Class class, you heard me right.  The Class of an object in Ruby is an object instance itself.  Time for another example to flesh this out:

    class Duck
      def self.waddle
        puts "wibble wobble"
      end

      class << self
        def swim
          puts "we are swimming"
        end
      end

      def procreate
        puts "do you come here often"
      end
    end


waddle and swim are both class methods, with swim defined using a different syntax (class << self) that opens up the singleton class for extension.  procreate is an instance method that behaves as you would expect.
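
Here is a small sketch that pokes at the Duck class to show where each method lives.  It repeats the class so it can run on its own, and the methods return their strings rather than calling puts, which is my change to make the results easy to check.

```ruby
class Duck
  def self.waddle
    "wibble wobble"
  end

  class << self          # opens Duck's singleton class
    def swim
      "we are swimming"
    end
  end

  def procreate
    "do you come here often"
  end
end

p Duck.class                       # => Class, so Duck is itself an object
p Duck.singleton_methods.sort      # => [:swim, :waddle]
p Duck.instance_methods(false)     # => [:procreate]
p Duck.respond_to?(:procreate)     # => false, it needs an instance
p Duck.new.respond_to?(:waddle)    # => false, it lives on the class object
```

Both syntaxes put the method in the same place, the singleton class of the Duck object, which is why singleton_methods reports them together.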

Confused? I know I have been. 

I could probably spend the whole post writing about this mysterious concept but I now want to get back to the code that is woven from the has_many class method.

Below is a class I am going to use to illustrate just about all the Ruby metaprogramming constructs that I have learned by this stage, so I am deliberately going over the top here.

    1 class ClassWithExpensiveMethod
    2   include CacheMixins
    3
    4   def long_method #expensive method call
    5     sleep 2
    6     "result"
    7   end
    8
    9   cache_result :long_method
   10 end



You can see we have a rather badly named class called ClassWithExpensiveMethod which calls a class method named cache_result, much as ExpenseType called has_many.  cache_result takes as an argument the name of the method it should......cache the result of.

What we want cache_result to do is call the method it is given (the long_method instance method in this case) the first time it is invoked to get the result, but after that return a cached result as the call is expensive.

I am going to illustrate a lot of Ruby dynamic concepts to show how this is achieved but suffice it to say that I would not go to this much trouble in reality for something so trivial.

You can see in line 2 of the above code that we are including a module called CacheMixins which will Mixin behaviour to any class that includes it, I mentioned mixins in my previous post.

Here is the implementation of CacheMixins which is mixed in to the above class:

    1 module CacheMixins
    2   def self.included(base)
    3     base.extend(ClassMethods)
    4   end
    5
    6   module ClassMethods
    7     def cache_result(name)
    8       real_method = "_real_#{name}"
    9       alias_method :"#{real_method}", name
   10
   11       define_method name do
   12         cache = instance_variable_get("@#{name}")
   13
   14         if cache
   15           return cache
   16         else
   17           result = send(real_method)
   18           instance_variable_set("@#{name}", result)
   19           return result
   20         end
   21       end
   22     end
   23   end
   24 end



There are a lot of concepts so I will take them one at a time.

On line 2, we are overriding included, a method every module inherits from Module and one of the methods known as hook methods.  Ruby provides a number of hooks that cover events in the object model.  The included hook fires whenever the module that overrides it is included by a class.  You can see in line 2 of ClassWithExpensiveMethod that we have the include method call and the name of the module that will be mixed in.  Whenever the include method is called, the included hook is fired and the including class (ClassWithExpensiveMethod) is passed as an argument to the hook method.

In the included method, the including class, which is often known as the inclusor (ClassWithExpensiveMethod in this instance), has class methods mixed in to it via the extend method.  The extend method mixes in behaviour to the singleton class or eigenclass we mentioned previously.  The class methods are defined in a nested module rather unoriginally called ClassMethods.  It turns out that this is a popular Ruby idiom and you do see it quite a lot, or more than once anyway.  Naming the inner module ClassMethods is just a convention and is not a keyword or anything like that.

Still with me or are you reading about the importance of the ubiquitous language in one of the plague of DDD posts that I am trying to weed out of my RSS reader? 

OK, onto the actual implementation of cache_result.

We stated earlier that we want to execute the method call the first time to get the result and then return the cached result thereafter and I will now explain how cache_result is achieving this.

In lines 8 and 9 of cache_result we are using the aliasing feature of Ruby whereby you can give an alternate name to a Ruby method.  Here we are redefining the long_method method of ClassWithExpensiveMethod which is passed to cache_result by the method call cache_result :long_method. 

Using aliases, we give long_method the alternate name _real_long_method (the name is built on line 8 and the alias is created on line 9).  The alias now refers to the original method long_method.  When you redefine a method like this, you do not really change the method.  Instead you define a new method and attach the existing name to it, while the alias keeps pointing at the old one.  On line 11 we are using one of Ruby's dynamic powers, define_method, to create an instance method on the including class.  define_method takes a name for the new method and a block for its body.

You can see here we are defining a new method with the name we just aliased.  This allows us to wrap a sort of AOP style functionality around the original method.  This technique is often called an around alias and is another idiom you see quite a lot.
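
Here is the around alias on its own, as a minimal sketch.  Calculator, add_without_logging and log are hypothetical names for illustration; the two steps are the same alias_method and define_method calls used in cache_result.

```ruby
class Calculator                     # hypothetical class
  def add(a, b)
    a + b
  end

  # Step 1: keep the original reachable under a second name.
  alias_method :add_without_logging, :add

  # Step 2: attach the original name to a new method that wraps the old one.
  define_method(:add) do |a, b|
    log << "add(#{a}, #{b})"
    add_without_logging(a, b)
  end

  def log
    @log ||= []
  end
end

calc = Calculator.new
p calc.add(1, 2)                            # => 3
p calc.log                                  # => ["add(1, 2)"]
p Calculator.instance_methods(false).sort   # => [:add, :add_without_logging, :log]
```

Callers keep using add and never need to know that the original body now lives under another name.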

Line 12 calls instance_variable_get which returns the value of an instance variable or nil (which is the case the first time the method is called) if the instance variable has not been set.  The @ part of the variable name should be included for regular instance variables.

Line 14 checks to see if we got an actual value from instance_variable_get and if one exists, we return the cached value.  If cache is nil, we enter the else branch where, on line 17, we actually call the real method by using Ruby's send method, which allows you to call methods dynamically.  Here we are sending a message to the aliased method we created via alias_method on line 9.  We then use instance_variable_set to create an instance variable that will hold the value of the returned method call before returning the result of the call.
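
The three reflective calls are easier to follow away from the caching logic.  This is a small sketch with a hypothetical Report class and total method; instance_variable_get, instance_variable_set and send are the real Ruby methods used above.

```ruby
class Report                         # hypothetical class
  def total
    42
  end
end

report = Report.new
name   = :total                      # method name only known at runtime

# An instance variable that was never set reads as nil.
p report.instance_variable_get("@#{name}")   # => nil

# send calls the method whose name is held in a variable.
result = report.send(name)

# Store the result under a variable name built from the method name.
report.instance_variable_set("@#{name}", result)
p report.instance_variable_get("@#{name}")   # => 42
```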

The end result is this:

    obj = ClassWithExpensiveMethod.new

    puts obj.long_method    #actual method is called
    puts obj.long_method    #cached result is returned



And that is it, nothing to it.  Perfectly simple and any bugs happening would be easy to track down, right?  We are calling include on the inclusor, which in turn calls extend to mix in class methods to the inclusor's eigenclass.  That actually makes sense to me.  You might need to dig about the web like I did, but I think these concepts are worth noting.
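
If you want to run the whole thing without waiting on sleep, here is a self-contained variant.  Two things are my own changes rather than the code above: a hypothetical calls counter stands in for sleep 2 so you can see how often the real method runs, and the check uses instance_variable_defined? because a plain "if cache" would re-run any method whose result is nil or false.

```ruby
module CacheMixins
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def cache_result(name)
      real_method = "_real_#{name}"
      alias_method real_method, name

      define_method(name) do
        ivar = "@#{name}"
        # Variant: test for presence so nil and false results are cached too.
        unless instance_variable_defined?(ivar)
          instance_variable_set(ivar, send(real_method))
        end
        instance_variable_get(ivar)
      end
    end
  end
end

class ClassWithExpensiveMethod
  include CacheMixins

  attr_reader :calls                 # hypothetical counter, not in the original

  def initialize
    @calls = 0
  end

  def long_method
    @calls += 1                      # stands in for the original sleep 2
    "result"
  end

  cache_result :long_method
end

obj = ClassWithExpensiveMethod.new
p obj.long_method   # => "result" (real method runs)
p obj.long_method   # => "result" (served from @long_method)
p obj.calls         # => 1
```

Note that the instance variable name is built from the method name, so this sketch only works for method names that are also valid variable names, which rules out names ending in ? or !.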

I got through the whole post without ranting uncontrollably  about how much I hate Silverlight and Xaml.







4 comments:

  1. Yes, "singleton class" and "eigenclass" mean the same thing.

    You sound concerned that all this method-pushing weirdness could hide nasty bugs. It could, and so could any advanced technique. Once you get used to them, these idioms become as natural as regular OOP constructs - and just as easy to abuse, of course. :)

  2. @Paolo, thanks for clearing that up about the eigenclass. It just seems weird after programming in C# for so long. Weird and fascinating.

  3. The thing I dislike the most about current Ruby hype is an assumption that people who prefer .NET and DDD do not know what eigenclass or method_missing is.

  4. @ashmind, I never inferred that .NET developers (of which I am one) don't know the dynamic concepts and in fact I stated that it was ME who did not know what they are. I wrote the blog to solidify the concepts in my head.

    Hating DDD is a personal thing and totally down to me.
