Wednesday 7 September 2011

This Blog Has Moved


I have finally had enough of blogger and moved my blog here.



Click on the link on the top right of the new site to subscribe via FeedBurner.

I hope you will continue to follow me as I continue on my war against terror.

Tuesday 14 June 2011

WINRM - Loosely Translated as SSH for Windows

When I am developing a side project, I always make sure it is on a different operating system and a different language than what I would normally use in my day job.  I want there to be a clear distinction or else the lines can become all too blurred.  My current side project is written in ruby and rails and it is while developing this that I have fallen in love with both SSH and Capistrano.  SSH or secure shell is a protocol for remote administration of Unix computers.  Simply put, SSH allows me to stay in the same terminal instance when I am moving from my client machine to other remote servers.  

Capistrano is a DSL for deploying ruby on rails applications.  Capistrano has highlighted just what a gaping hole this is in current ugly .NET deployment practices.  I very much include my own crude powershell scripts in this dirty breed.  Capistrano uses SSH to connect to the remote server and run your deployment from your client machine.  Pretty nifty and yet another source of rails envy I now have.

Returning to my day job and windows, I missed this facility and started looking at powershell for a similar capability.  There is more than one way of executing powershell on a remote server.  Winrm looked to be the best fit for me.  With winrm, you can start and finish powershell sessions on the remote server.  In the examples I am about to give, all the machines have powershell 2.0 installed.  There are different requirements for each operating system.  The following guide should help you install winrm on the client and on any remote servers you wish to access.

Enabling Winrm on a remote Server for Client Requests

Enabling WinRM between computers on the same domain is very easy.  Simply run the following command in an elevated Powershell console on the remote server.

PS> Enable-PSRemoting                  

You should get a response like this:



If you do not get a response like above you might want to check that winrm is installed and that the windows service is running.

By default winrm runs over http on port 5985 and you might need your network guys to open a hole in the firewall for this port.   It took the network guys I work with five weeks to accomplish this task which is an absurd amount of time for such a task but I better not get into that now.  We also restricted access to this port to an I.P. range for extra security.  

If the server is on a different domain, you also need to run the following command on the server.  It sets a registry entry that allows a client on a different domain to authenticate against the local server.

PS> New-Itemproperty -name LocalAccountTokenFilterPolicy -path HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System -propertyType DWord -value 1

Another gotcha that I fell for is to make sure that you restart the winrm windows service to pick up the changes. 

Enabling Winrm on the Client

You might not need to do this but I had to add a trust entry for every server I wanted to connect to with the following command on my workstation:

PS> winrm s winrm/config/client '@{TrustedHosts="samedomainserver,99.999.999.999"}'       

You will note that I have more than one server listed here and they are comma delimited.  I also have a computer name and an I.P. address in my list.  The IP address is for a server on a different domain.

Executing Commands on the Remote Server

There are two ways that I know of executing commands on the remote server.  The first is by invoking the Invoke-Command cmdlet which allows you to access one off commands:

PS> Invoke-Command -ComputerName MyRemoteServer -ScriptBlock {Get-Process} -Credential (Get-Credential)

The Get-Credential cmdlet opens a windows form dialog to allow you to log in against the required account:



The second allows you to start a remote powershell session using the Enter-PSsession cmdlet:

PS> Enter-PSSession -ComputerName MyRemoteServer -Credential (Get-Credential)

I can also exit my remote session with the Exit-PSsession cmdlet.                                                                                                            

The Icing on the Winrm Cake

I talked about the powershell profile in this post and starting a remote powershell session is a prime candidate for adding to my powershell profile.  I have the following two functions defined in my powershell profile:

    function PS-Production
    {
        # The parentheses make PowerShell run Get-Credential and pass its result
        Enter-PSSession -ComputerName 99.999.999.999 -Credential (Get-Credential)
    }

    function PS-Demo
    {
        Enter-PSSession MyRemoteServer -Credential (Get-Credential)
    }

It can get confusing when bouncing about from server to server so I have the following function in my profile to let me know exactly which account I am running under:

    function whoami
    {
        (get-content env:\userdomain) + "\" + (get-content env:\username);
    }

If you are on windows and still flapping about with DOS or worse, using windows explorer to do everything, then I beseech you to pick up powershell.  It is worth the effort.

I think a capistrano like experience could be created with powershell and winrm.  Maybe something I will look into at some stage.


Tuesday 7 June 2011

Gracefully handling Nil and Empty in Ruby

One of the nice side effects of extension methods in .NET is that you can easily handle nulls without littering your code with null checks.  

For example if you look at the following extension method that I have defined in C#:

    public static bool HasElements<T>(this IEnumerable<T> collection)
    {
        return collection != null && collection.Count() != 0;
    }

I can then use this method on any object that is an IEnumerable of T:

    if (trainingEvent.Contacts.HasElements())
    {
        //do something
    }

Which is much easier on the eye than:

    if (trainingEvent.Contacts != null && trainingEvent.Contacts.Count > 0)
    {
        //do something
    }

The reason this is possible is because an extension method in .NET is really a bit of syntactic sugar that effectively rearranges the method into a static method call, which might be equivalent to this:

    if (EnumerableExtensions.HasElements(trainingEvent.Contacts))
    {
        // do something
    }

It is all done with compiler magic.

When coding in Ruby, I found myself wanting the same experience.  All too often, I found myself writing code like this:

    if !lead.address.nil? && !lead.address.empty?

or this:

    if lead.contacts.nil? || lead.contacts.empty?

In Ruby everything is an object, even nil (null in .NET speak), and everything is open for extension.  You just reopen the class and add your own customisations.  I actually thought about extending NilClass myself before telling myself that was very hacky.  I then found out that this is exactly what happens in the core extensions of the activesupport module.
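As a rough sketch of what "open for extension" means in practice, the snippet below reopens a core class and adds a method to it.  The shout method is a made-up name used purely for illustration, it is not part of Ruby or activesupport.

```ruby
# nil is an ordinary object whose class is NilClass.
puts nil.class      # prints NilClass

# Reopening a core class adds the method to every string,
# including strings that were created before the class was reopened.
# `shout` is a hypothetical method used only for this example.
class String
  def shout
    upcase + "!"
  end
end

puts "hello".shout  # prints HELLO!
```

This is the same mechanism that activesupport relies on, just applied to a trivial method.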

In this set of extensions, the authors have extended a number of the core objects to add support for a blank? method.  According to the documentation:

An object is blank when it's false, empty or a whitespace string.  

All that I need to do is add the following line to my ~/config/initializers/requires.rb file and I get the blank? extension method over the whole project.  

    require 'active_support/core_ext/object'

I can then use blank? with strings:

    if !address.blank?

Or with arrays:

    if lead.contacts.blank?

That will take care of nil, empty strings, and arrays with no elements.

I get the same result for arrays, booleans and hashes.

So how is this achieved?  A quick look at the source reveals that some of the core objects are extended (hence the name stupid!).

For example NilClass is extended to always return true for a call to blank?:

    class NilClass #:nodoc:
      def blank?
        true
      end
    end

This will obviously elegantly handle all our nil? cases in one fell swoop.  The code then opens up several other classes for extension and adds the blank? method.

For example Array is opened and a nice alias is added to enable a call to blank? to respond with empty?

    class Array #:nodoc:
      alias_method :blank?, :empty?
    end

The String class is nicely extended with the ruby not matches operator on a simple regex:

    class String #:nodoc:
      def blank?
        self !~ /\S/
      end
    end

But the nice bit is the extension that is added to the Object class:

    class Object
      def blank?
        respond_to?(:empty?) ? empty? : !self
      end
    end

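The respond_to? method simply checks whether the object you are sending a blank? message to has an empty? method.  If it does, blank? defers to empty?, otherwise it falls back to !self, which is only true for nil and false.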
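To see the pieces working together without installing activesupport, here is a minimal sketch that pastes the definitions quoted above into one file and runs them against a handful of values.  The list of sample values is my own choice, and the real library covers more classes than this.

```ruby
# Fallback for any object: use empty? if it exists, otherwise
# treat only nil and false as blank.
class Object
  def blank?
    respond_to?(:empty?) ? empty? : !self
  end
end

class NilClass
  def blank?
    true
  end
end

class String
  # Blank when the string contains no non-whitespace character.
  def blank?
    self !~ /\S/
  end
end

# Sample values chosen for illustration only.
[nil, false, "", "   ", [], {}, "text", [1], 0].each do |value|
  puts "#{value.inspect}.blank? => #{value.blank?}"
end
```

The first six values report true and the last three report false.  Note that 0 is not blank, because every object other than nil and false is truthy in Ruby.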

There are other useful extensions in this module that you can find out about here.

I think this is quite nice and has tidied up my code a lot.  


If you are a seasoned Ruby developer, you probably knew this and have not bothered reading this far.




Sunday 22 May 2011

Own Your Powershell Profile

I have not blogged in quite a while but I think it is worth giving powershell a mention.  I think that a lot of windows developers are missing a trick by first of all not using powershell and secondly by not customising their powershell profile.  

Like a lot of windows developers, I had never really seen the need for powershell and I had written it off without even trying it.  I happily fumbled about with DOS and batch files.  That is until I bought a MacBook Air and decided that the bash shell was a much more agreeable experience compared to messing about with Finder, the Mac equivalent of windows explorer.  I was instantly struck by how much more efficient using the terminal (command prompt in windows speak) is than messing about with windows.  The obvious advantage is that you can create reusable scripts for the recurring actions you frequently execute during the course of your terminal session.  In bash you can configure your shell by placing customisation commands in one of three configuration files.  The one I used most was the .bashrc file that is processed every time you open a non-login shell.  One of the most productive customisations that I discovered was the ability to set up typing shortcuts or aliases.

Once I returned to windows, I found myself loath to use windows explorer, FileZilla and other graphical user interfaces.  I was struck by how inefficient they were and how I was repeating the same actions over and over again.  The whole point of programming is to automate repetitive tasks.  I then turned to powershell and I was delighted to discover that powershell has the concept of the powershell profile, which is analogous to the .bashrc file in that it is processed every time you open a powershell command prompt.

Your Divine Right To Customise Your Shell

The first task that you need to accomplish, that is if you have yet to customise your profile, is to find the damn thing.  Luckily powershell holds the location in a built in global variable that you can access by typing $PROFILE at your powershell command prompt:

PS> $PROFILE                                                                                                                                                                                         

$PROFILE returns the full path of the file that powershell will try to run when it starts:

C:\users\paul.cowan\Documents\windowsPowerShell\Microsoft.PowerShell_profile.ps1

Strangely, the file is not created for you by default.   You can of course use powershell to test for its existence using the Test-Path cmdlet:

PS> Test-Path $profile                                                                                                                                                                

If your profile does not exist, you need to create it (obviously) and the New-Item cmdlet will enable you to achieve this:

PS> New-Item -path $profile -type file -force

Once you have created the $profile, you can set to work customising it by opening the file in notepad:

PS> notepad $profile                                                                                                                                                                   

Aliases

I mentioned aliases at the start of this post and this is where we can start to add typing shortcuts to drastically speed up our workflow.  We can for example add an alias that will open up our favourite text editor. This is accomplished with the set-alias cmdlet:

    set-alias npp "C:\Program Files (x86)\Notepad++\notepad++.exe"

If we type the above entry into our powershell profile file (Microsoft.PowerShell_profile.ps1) and then try to run the new command by entering npp into an existing powershell command prompt, we will get an error along the lines of

The term 'npp' is not recognised as the name of a cmdlet, function or operable program.

This is because the profile is only processed at the beginning of each new powershell prompt session.  We can though reload the profile using the dot source syntax below or by opening a new powershell command prompt:

PS> . $PROFILE                                                                                                                                                                            

Now if we type npp into the prompt, the notepad++ executable will start.

Functions

Being able to add your own functions to your powershell profile and have them available for every powershell session is insanely useful.  This is best illustrated with a very simple example.  I have a number of functions and aliases that let me teleport from location to location on the file system.  The below example will cd into my projects directory:

    function pr
    {
        set-location C:\projects\
    }

PS> pr                                                                                                                                                                                                                                                                                                           

Two keystrokes will get me where I want to go.  You really need to contrast this puerile example with the same actions in windows explorer.

This is ridiculously simple but I have lots of these simple location changers that zip me about from place to place.

Below is a powershell function that I use constantly when finding files:

    function ff ([string] $glob)
    {
        get-childitem -recurse -include $glob
    }

If I want to recursively find all text files in a directory hive, I can simply type the following into the prompt:

PS> ff *.txt                                                                                                                                                                                         

Once you start adding these customisations, you will find yourself constantly tweaking your workflow.

Combining Functions and Aliases

The following example shows how I can create a powershell remote session on another server.  I am not going to get into the syntax here but I might blog about remote powershell sessions in another post which at this rate will be in another year.  The point of this example is to show how I can create the session with 2 key strokes.  First of all is the function to create the session:

    function New-PSSecureRemoteSession
    {
        param ($sshServerName, $Cred)
        $Session = New-PSSession $sshServerName -UseSSL -Credential $Cred -ConfigurationName C2Remote
        Enter-PSSession -Session $Session
    }

I then set an alias to the function:

    set-alias sh New-PSSecureRemoteSession

Two keystrokes and I have a new session on the server.  I hope this illustrates what is truly possible as you become more comfortable.  I was inspired to look for the ability to execute remote powershell commands after having this ability with ssh on unix and linux.  I would never have even considered this if I had not looked into other platforms which really is the moral of this story.  You should always look to other platforms to bring ingenuity into your own.

The Icing on the cake

As you build up your profile over time, it becomes indispensable.  You want to have it with you at all times.  I keep mine in git and I can simply clone it onto any new machine that I am working on.  You could also use dropbox.

I think I read in the Pragmatic Programmer that you should know your shell and as usual, this is sound advice.

Here is a link to my entire profile.


Monday 22 November 2010

Creating dynamic methods with closures in Ruby

I came across an interesting problem while working on my yet to be birthed MicroISV project leadcapturer.  I want to document this discovery before I forget all about it.  My product, leadcapturer, will scrape company and lead details from directory websites to provide fresh leads for marketing departments.  In order to help me achieve this difficult goal, I have been using the totally awesome Nokogiri which is a wrapper around the C based libxml libraries.  With Nokogiri’s assistance, I can perform xpath on html documents even if the documents are invalid xml which, let us face facts, most html documents are.

Which brings me to the point of this post.  I wanted to use a regular expression in an xpath expression to return all text nodes that match the regex pattern.  This is not possible in normal xpath but is possible with the help of Nokogiri.  The nokogiri documentation provides the code listed below as an example:

    node.xpath('.//title[regex(., "\w+")]', Class.new {
      def regex(node_set, regex)
        node_set.find_all { |node| node['some_attribute'] =~ /#{regex}/ }
      end
    }.new)

In the above example, a new instance method named regex is created within an anonymous class defined by Class.new.  This ruby instance method is used as a predicate for the xpath expression.  A predicate is a Boolean expression which in an xpath context will return all nodes that are true for the boolean function defined within the square brackets.  In the above example, we want all nodes with an attribute of some_attribute that match the regex “\w+”.

The problem was that I wanted to make use of closures to call methods on the outer class that defines the Class.new anonymous class.  When I mention closure in this context, I specifically mean being able to refer to variables from the context in which the closure was created.  The variable I wanted to make use of in this case was the self of the outer class that defines the anonymous class.  In Ruby methods are not closures, only blocks are.  I could not use self in the Class.new regex instance method because self in this context would refer to the anonymous class and not the containing class.
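The difference between a method body and a block is easy to see in isolation.  The following is only a sketch, and the names greeting, with_def and with_block are invented for the example.

```ruby
greeting = "hello"

# A method defined with def starts a fresh scope, so the local
# variable `greeting` above is not visible inside it.
def with_def
  defined?(greeting) ? greeting : "greeting is not visible here"
end

# A block (here wrapped in a lambda) is a closure: it carries the
# surrounding local variables along with it.
with_block = lambda { greeting }

puts with_def         # prints greeting is not visible here
puts with_block.call  # prints hello
```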

The answer to this puzzle was to use define_method which allows an author to dynamically create a new method on a class.  define_method takes a method or a block as an argument.  As I stated earlier, methods are not closures in Ruby but blocks are.  As I can pass a block to define_method, my problem was solved.  Here is a stripped down version of the end result:

   1 lead = self
   2 expression = /\w+/
   3 parent.xpath("./descendant::text()[regex(.)]", Class.new{
   4   define_method(:regex) do |node_set|
   5     result = node_set.find_all do |node|
   6       if node.text =~ expression
   7         lead.attribute = node.text
   8         return true
   9       end
  10       false
  11     end
  12   end
  13 }.new)

In line 1, I am binding the current instance of the containing or outer class to a variable named lead which means I can call methods of the outer class in the anonymous inner class defined by Class.new.

In line 4, I am using define_method to dynamically create a method named regex and also pass a block as an argument to define_method that will become the dynamically created instance method’s method body.

In line 7 I am able to call a method (attribute=) of the outer class by using the variable I bound to self in line 1.
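The same trick can be tried without Nokogiri.  The sketch below is an assumption-laden stand-in: the Lead class, the build_matcher method and the list of plain strings are all invented here to play the roles of the outer class, the xpath call and the node set.

```ruby
# Hypothetical stand-in for the outer class in the post.
class Lead
  attr_accessor :attribute

  # Returns an instance of an anonymous class. Because the method body
  # is a block, it can still reach this Lead through the local `lead`
  # and the regex through the local `expression`.
  def build_matcher(expression)
    lead = self
    Class.new {
      define_method(:regex) do |texts|
        texts.select do |text|
          matched = text =~ expression
          lead.attribute = text if matched
          matched
        end
      end
    }.new
  end
end

lead = Lead.new
matcher = lead.build_matcher(/\d+/)
p matcher.regex(["abc", "route 66", "xyz"])  # prints ["route 66"]
p lead.attribute                              # prints "route 66"
```

Had regex been written with def inside Class.new, neither lead nor expression would have been in scope.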

I think this is pretty cool and one of the reasons why I am really enjoying the different dynamic paradigms available in Ruby.  I also think that this is a more readable alternative that can be used instead of the “magic” of method_missing.  Depending on your circumstances of course.


Friday 6 August 2010

The Venusian Landscape of Ruby Metaprogramming

In this post, I am going to resist the urge to rant about the stench of hackery that both WebMatrix and Microsoft.Data.dll have brought forth and instead use this post to solidify some Ruby idioms in my head.  I use this blog either as a cathartic vehicle for me to vent my frustrations or, when I blog more technically, as a way for me to cement in my head new concepts that I am likely to forget.

I want to know what tools are available to me in the dynamic land of Ruby and not code my applications like I would in the more familiar land of C#.  Of course I might end up using them just for the sake of it but it is useful to know none the less.

As the author, I am still very new to this brave new world of Ruby, so if you think that I am blogging out my ass then please feel free to write a rude comment telling me how wrong I am.  God knows I have done this enough times myself so maybe a bit of payback will not do me any harm.

As I write more Ruby and dig more into Rails, I am utterly blown away by just how abstract and surreal the language is.  To me, some of the dynamic language constructs are akin to watching a David Lynch film while on a heady cocktail of hallucinogenics.

In Ruby, there really are some very odd concepts and idioms that I see repeated over and over in the source code that I have been reading in my quest to get up to speed with Ruby.

If you have used the Ruby on Rails Default ORM ActiveRecord, you will be very familiar with the syntax below:


    class ExpenseType < ActiveRecord::Base
      has_many :expenses
    end


These method calls are also known as class macros and are the ActiveRecord association class macros that define relationships between entities in much the same way as the <one-to-many/> and <many-to-one/> type elements do in NHibernate mapping files.

 

I have been writing these methods for a while but I really did not have much of a clue what was going on.  There really is quite a lot of metaprogramming magic going on behind the scenes which is not initially obvious.  As the ActiveRecord source code is included in the installation when you install the gem, I dug deep into the crazy, crazy code.

It is rather odd to see these association defining method calls dangling outside of a method definition and instead hanging on the class itself.  It is now that I will broach the first weird concept.  The has_many class macro method above is not actually an instance method, it is actually what is known as a class method.

This is where things start getting very surreal so let us see how long you stick around before navigating away to some dull and predictable .NET related DDD post like "Obsessive use of Aggregate roots with absolutely no Mutators enables your projects to ship on time.".

Class methods as it turns out are very odd, they are members of what is often known as the singleton class or even more weirdly, the Eigenclass. I still have doubts whether singleton classes and Eigenclasses are the same thing but I think they are.

Time for an example before I try to explain any more:

    duck = "Daffy Duck"

    def duck.speak
       puts "That's all folks"
    end

 
Here the speak method is certainly not part of the String class.  No other string responds to this method.  The speak definition above creates a method that only exists for a single object, not for all objects of that class.  While most object oriented languages have class structures that support both instance methods and class methods (often known as static methods), Ruby only supports instance methods.  If Ruby only supports instance methods, where does the speak method end up?  On the singleton or Eigenclass of course.
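Ruby's reflection methods make it possible to check where speak ended up.  This is a small sketch and the second string, other, is added only for comparison.

```ruby
# String.new gives a plain, unfrozen string regardless of any
# frozen-string-literal setting.
duck  = String.new("Daffy Duck")
other = String.new("Donald Duck")

def duck.speak
  "That's all folks"
end

puts duck.speak                             # prints That's all folks
p other.respond_to?(:speak)                 # prints false
p duck.singleton_methods                    # prints [:speak]
p String.instance_methods.include?(:speak)  # prints false
```

The method is listed on the one object's singleton class and nowhere on String itself.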

This is possible because Ruby classes are actually objects instantiated from the Class class.  That is right, the Class class, you heard me right.  The Class of an object in Ruby is an object instance itself.  Time for another example to flesh this out:

    class Duck
        def self.waddle()
            puts "wibble wobble"
        end

        class << self
           def swim()
              puts "we are swimming"
           end
        end

        def procreate()
           puts "do you come here often"
        end
    end


waddle and swim are both class methods, with swim defined by a different syntax that opens up the singleton class for extension.  procreate is an instance method that behaves as you would expect.
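A quick way to convince yourself that the two syntaxes land in the same place is to ask the class what it holds.  The sketch below repeats the Duck class with return values instead of puts so that it runs on its own.

```ruby
class Duck
  def self.waddle
    "wibble wobble"
  end

  class << self
    def swim
      "we are swimming"
    end
  end

  def procreate
    "do you come here often"
  end
end

# Both class methods live on the singleton class of Duck.
p Duck.singleton_methods.sort                        # prints [:swim, :waddle]
p Duck.singleton_class.instance_methods(false).sort  # prints [:swim, :waddle]

# Only procreate is an instance method of Duck itself.
p Duck.instance_methods(false)                       # prints [:procreate]
p Duck.new.respond_to?(:waddle)                      # prints false
```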

Confused? I know I have been. 

I could probably spend the whole post writing about this mysterious concept but I now want to get back to the code that is woven from the has_many class method.

Below is a class I am going to use to illustrate just about all the Ruby metaprogramming constructs that I have learned by this stage so I am going deliberately over the top here.

    1 class ClassWithExpensiveMethod
    2   include CacheMixins
    3 
    4   def long_method #expensive method call
    5     sleep 2
    6     "result"
    7   end
    8 
    9   cache_result :long_method
   10 end



You can see we have a rather badly named class called ClassWithExpensiveMethod which calls a class method named cache_result in much the same way as has_many is called.  cache_result takes as an argument the name of the method it wants to......cache the result of.

What we want cache_result to do is call the method it is given (the long_method instance method in this case) the first time it is invoked to get the result, and after that return the cached result as the call is expensive.

I am going to illustrate a lot of Ruby dynamic concepts to show how this is achieved but suffice it to say that I would not go to this much trouble in reality for something so trivial.

You can see in line 2 of the above code that we are including a module called CacheMixins which will Mixin behaviour to any class that includes it, I mentioned mixins in my previous post.

Here is the implementation of CacheMixins which is mixed in to the above class:

    1 module CacheMixins
    2   def self.included(base)
    3     base.extend(ClassMethods)
    4   end
    5 
    6   module ClassMethods
    7     def cache_result(name)
    8       real_method = "_real_#{name}"
    9       alias_method :"#{real_method}", name
   10 
   11       define_method name do
   12         cache = instance_variable_get("@#{name}")
   13 
   14         if cache
   15           return cache
   16         else
   17           result = send(real_method)
   18           instance_variable_set("@#{name}", result)
   19           return result
   20         end
   21       end
   22     end
   23   end
   24 end



There are a lot of concepts so I will take them one at a time.

On line 2, we are overriding the included method, which is one of the methods known as hook methods.  Ruby provides a number of hooks that cover events in the object model.  Here we are overriding the included hook method which fires whenever the module that overrides the method is included by a class.  You can see in line 2 of ClassWithExpensiveMethod that we have the include method call and the name of the module that will be mixed in.  Whenever the include method is called, the included method is fired and the including class (ClassWithExpensiveMethod) is passed as an argument to the hook method.

In the included method, the including class, which is often known as the inclusor (ClassWithExpensiveMethod in this instance), has class methods mixed in to it via the extend method.  The extend method mixes in behaviour to the singleton class or Eigenclass we mentioned previously.  The class methods are defined in a nested module rather unoriginally called ClassMethods.  It turns out that this is a popular Ruby idiom and you do see it quite a lot or more than once anyway.  Naming the inner module ClassMethods is just a convention and is not a keyword or anything like that.
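Stripped of the caching logic, the include plus extend idiom looks like the sketch below.  Trackable, Invoice and tracked_name are hypothetical names chosen for the example and have nothing to do with ActiveRecord.

```ruby
module Trackable
  # Hook method: Ruby calls this with the including class as `base`.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Everything in here ends up as class methods on the inclusor.
  module ClassMethods
    def tracked_name
      "tracked #{name}"
    end
  end
end

class Invoice
  include Trackable
end

puts Invoice.tracked_name                                    # prints tracked Invoice
p Invoice.singleton_class.include?(Trackable::ClassMethods)  # prints true
p Invoice.new.respond_to?(:tracked_name)                     # prints false
```

A single include therefore gives the class both the instance behaviour of the module (none in this sketch) and the class level macros.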

Still with me or are you reading about the importance of the ubiquitous language in one of the plague of DDD posts that I am trying to weed out of my RSS reader? 

OK, onto the actual implementation of cache_result.

We stated earlier that we want to execute the method call the first time to get the result and then return the cached result thereafter and I will now explain how cache_result is achieving this.

In lines 8 and 9 of cache_result we are using the aliasing feature of Ruby whereby you can give an alternate name to a Ruby method.  Here we are redefining the long_method method of ClassWithExpensiveMethod which is passed to cache_result by the method call cache_result :long_method. 

Using aliases, we are giving long_method the alternate name _real_long_method (lines 8 and 9).  The alias now refers to the original method long_method.  When you redefine a method like this, you do not really change the method.  Instead you define a new method and attach an existing name to the new method.  On line 11 we are using one of Ruby's dynamic powers by using define_method to create an instance method on the including class.  define_method takes a name for the new method and a block for the functionality.

You can see here we are defining a new method with the name we just aliased.  This allows us to wrap a sort of AOP style functionality around the original method.  This technique is often called an around alias and is another idiom you see quite a lot.
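Here is the around alias on its own, as a minimal sketch.  The Greeter class and the text added before and after are invented for illustration.

```ruby
class Greeter
  def greet
    "hello"
  end

  # Keep the original implementation reachable under a new name...
  alias_method :_real_greet, :greet

  # ...then reuse the old name for a wrapper that calls the original.
  define_method(:greet) do
    "[before] " + _real_greet + " [after]"
  end
end

puts Greeter.new.greet        # prints [before] hello [after]
puts Greeter.new._real_greet  # prints hello
```

Callers keep using greet and never need to know that the original has been wrapped.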

Line 12 calls instance_variable_get which returns the value of an instance variable or nil (which is the case the first time the method is called) if the instance variable has not been set.  The @ part of the variable name should be included for regular instance variables.

Line 14 checks to see if we got an actual value from instance_variable_get and if one exists, we return the cached value.  If cache is nil, we then enter the else part of the if statement on line 17 where we actually call the real method by using Ruby's send method which allows you to call methods dynamically.  Here we are sending a message to the aliased method we created via alias_method on line 9.  We then use instance_variable_set to create an instance variable that will hold the value of the returned method call before returning the result of the call.
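The three reflective calls used above can be tried out on any object.  In this sketch the Report class and its build method are made up for the purpose.

```ruby
class Report
  def build
    "result"
  end
end

report = Report.new

# An instance variable that has never been set reads as nil.
p report.instance_variable_get("@build")  # prints nil

# send calls a method whose name is only known at runtime.
value = report.send("build")

# instance_variable_set creates the variable on this one object.
report.instance_variable_set("@build", value)
p report.instance_variable_get("@build")  # prints "result"
p report.instance_variables               # prints [:@build]
```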

The end result is this:

    obj = ClassWithExpensiveMethod.new

    puts obj.long_method    #actual method is called
    puts obj.long_method    #cached result is returned
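If you want proof that the second call really is served from the cache, a call counter does the job.  The sketch below is a condensed variant of CacheMixins, and the CountingExample class with its calls counter is my own addition, not something from the original code.

```ruby
module CacheMixins
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def cache_result(name)
      real_method = "_real_#{name}"
      alias_method real_method, name

      define_method(name) do
        cached = instance_variable_get("@#{name}")
        return cached if cached

        # instance_variable_set returns the value it stored.
        instance_variable_set("@#{name}", send(real_method))
      end
    end
  end
end

# Hypothetical class that counts how often the real method runs.
class CountingExample
  include CacheMixins
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def long_method
    @calls += 1
    "result"
  end

  cache_result :long_method
end

obj = CountingExample.new
obj.long_method
obj.long_method
p obj.calls  # prints 1
```

One caveat of testing the cache for truthiness: a method that returns nil or false is never treated as cached and will be called again every time.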



And that is it, nothing to it.  Perfectly simple and any bugs happening would be easy to track down, right?  We are calling include on the inclusor that in turn calls extend to mixin class methods to the inclusor's eigenclass.  That actually makes sense to me, you might need to dig about the web like I did but I think these concepts are worth noting.

I got through the whole post without ranting uncontrollably  about how much I hate Silverlight and Xaml.







Saturday 24 July 2010

Ditching the Service Layer and IOC for Mixins in Ruby

I still classify myself as very new to Ruby on Rails, although I, like many other .NET developers, am annoying the life out of other people by droning on about how great the brave new(?) world of Ruby is.  Before I start, if anyone disagrees with what I am about to post here then please, please, please write a comment below that explains a better approach.

I have been developing in .NET since framework beta 1 (doesn't everyone say that?), back in 1892.  I, like many (or so it seems) .NET developers, am looking at other frameworks and platforms like Ruby on Rails due to a frustration with the continuous abominations coming out of Microsoft like RIA services, oData, Entity Framework, Workflow, Silverlight etc.  Microsoft continually wants to lower the bar to lesser developers and all this does is....attract lesser developers who offer nothing back.  I find myself just hating everything.  I also find myself getting angry at the constant ill appropriated blog posts about Domain-Driven Design and other nonsense like CQRS that fill up my RSS reader from the better educated .NET developers.  Am I alone in thinking that DDD is just impractical for 99% of all the applications that it is used in and the blue book was pretty dull?  I still enjoy C# but the surrounding patterns, practices and new features just really annoy me now.

I cannot work out if I am truly this enamoured with Ruby on Rails or if it is just that it is different.  I am probably just hankering after something new and shiny, and Ruby on Rails ticks all the new and shiny boxes for me.  I find the dynamic paradigms fascinating and I have been poring over the source of things like the Rails ORM ActiveRecord for an idea of how to do things differently.  The source of ActiveRecord is crazy, crazy, crazy but I do have a better understanding of just what shape shifting is available in a dynamic language now.  One thing I really do not want to do is to start coding in a Ruby on Rails application just like I was coding a .NET application.  What would be the point?  I want to fully explore what a dynamic language has to offer and why so many Java developers before us felt the need to move here.

I am working on a MicroISV project in my spare time and I came across an interesting decision today which has led me to this post.  My infant product will scrape text from a yellow pages like directory website and then go through a data scrubbing process to make sense of the unstructured data.  I could write the data scrubber myself but I would have to start delving into natural language processing or other time consuming concepts that would probably mean postponing the release date of the application to one that stretches beyond my lifetime.  Thankfully there are a number of 3rd party webservices, REST webservices or whateverwebservices that can unburden me of this somewhat difficult task.  I do not want to favour one just now and I also want to leave open the option of changing provider if and when I feel the need.  I also do not want to call the service from my unit tests for all the reasons that are usually trotted out against calling external resources from unit tests.  The obvious thing to do is to isolate this functionality.

If I was doing this in .NET, I would create a service layer that is defined by an interface for the requested service just like below:

    public interface IDataScrubber
    {
        Lead ScrubData(Dictionary<string, object> parts);
    }


I would then autowire this service into the IOC container of my choice that would inject the service via constructor injection into a controller or something where I would use it.  This allows me to mock the service and also depend on an interface rather than the concrete implementation. 

While looking at a lot of the Ruby source code that I have installed in the form of RubyGems on my hard drive, I just do not see this type of implementation.  There are no interfaces in Ruby for starters and you rarely see an IOC container, although they do exist.  This has led me to seek a way that takes advantage of the flexibility of a dynamic language.  As I said previously, I see little point in writing this application in Ruby as if it were a static language like C#.  I want to try to comprehend the power and malleability of a dynamic language.

The first approach does not lean on any dynamic magic.  I am just going to suggest using default parameters, which are part of Ruby.  I can simply suggest a default implementation in the constructor of the Parser class below.  initialize is the constructor of a Ruby object if you did not know.


    class Parser
      attr_reader :scrubber

      def initialize(scrubber = DataScrubber.new)
        @scrubber = scrubber
      end
    end

    class DataScrubber
      # real implementation goes here
    end



That is ridiculously easy and for simple cases there is no reason not to use this approach.  What could be simpler?  In the constructor of Parser, I have a default parameter for the scrubber member variable.  This will create the real DataScrubber if none other is offered when creating an instance of Parser.  When testing, I can simply create the parser with a stubbed out version of the DataScrubber.
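To make the testing side concrete, here is a small sketch of passing a stub through the default parameter.  StubScrubber and the scrub_data bodies are assumptions made up for illustration; the real DataScrubber would call the 3rd party service.

```ruby
class DataScrubber
  def scrub_data(parts = {})
    # Stand-in for the real implementation that would hit the external service.
    raise "would call the third-party service"
  end
end

class Parser
  attr_reader :scrubber

  # The default is only evaluated when no argument is supplied.
  def initialize(scrubber = DataScrubber.new)
    @scrubber = scrubber
  end
end

# Hypothetical stub used only in tests.
class StubScrubber
  def scrub_data(parts = {})
    "stubbed"
  end
end

default_parser = Parser.new                    # gets the real DataScrubber
test_parser    = Parser.new(StubScrubber.new)  # gets the stub instead

puts default_parser.scrubber.class             # prints DataScrubber
puts test_parser.scrubber.scrub_data({:one => "one"})  # prints stubbed
```

Nothing here checks that the stub honours the same methods as the real class; with no interface to lean on, that is down to the tests.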

I still do not feel satisfied by this answer and next I want to look at mixins.  From what I can tell, Ruby code seems to use fewer classes than other OO languages like C#.  In C# we tend to compose things through a lot of small classes that run under the guise of Separation of Concerns.  The problem with this is that there is generally a lot of implementation detail needed when creating these interwoven class structures.  Hence we have the IOC container to take away the complex creation process that often exists.

With mixins, we can implement behaviour in one module that gets mixed into one or more classes that should have that behaviour.  A good way to ensure that the concerns are, in fact, separated is to develop the mixins in a test driven manner.

We could define our DataScrubber in a module like this:

    module DataScrubber
      def self.included(base)
        puts "module included by #{base}"
      end

      def scrub_data(parts = {})
        parts.each_value { |value| puts "doing something with #{value}" }
      end
    end


Note, I am using the hook method included above, which fires whenever the module is included by a class.  The hook method here is purely for illustration but it is useful to know it is there.


This then allows me to test the mixin in isolation by mixing it in to a simple empty instance, like this:

    class MixinTest < ActiveSupport::TestCase

      test "should test DataScrubber" do
        scrubber = Object.new    # a blank instance rather than a class

        scrubber.instance_eval do
          extend DataScrubber    # mixes the module into this one object
        end

        scrubber.scrub_data({:one => "one", :two => "two"})

        # assert behaviour here
      end

    end


Here I am creating a blank instance of an object and passing a block or anonymous method to instance_eval which will evaluate a block in the context of the instance.  You can add, override and modify object instances at runtime using instance_eval which is part of Ruby's dynamic playground.  I like this approach because we are not having to create a plethora of classes to add behaviour and I can test the behaviour in isolation.  If you have ever looked into creating Mixins in .NET then you will know how difficult a task this is.
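As a rough illustration of what instance_eval does, the sketch below opens up one object, sets an instance variable on it and defines a method that only that object gets.  The names scrubber, other and describe are invented for the example.

```ruby
scrubber = Object.new
other    = Object.new

scrubber.instance_eval do
  @name = "scrubber"     # self is scrubber here, so this is its instance variable

  def describe           # defined on scrubber's singleton class only
    "I am #{@name}"
  end
end

puts scrubber.describe                 # prints I am scrubber
puts other.respond_to?(:describe)      # prints false
```

The second object is left alone, which is what makes this handy for one-off changes inside a test.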


When it comes to running a unit test against the Parser class that has this functionality mixed in via the include statement, I somehow want to stub out this behaviour when the DataScrubber is mixed into the Parser class like it is below:

    class Parser
      include DataScrubber

      def parse
        parts = {:one => "one", :two => "two"}

        scrub_data(parts)
      end

      def initialize
      end
    end

 
As Ruby is a dynamic language, this is very easy.  Here is one way of doing it:

    test "should stub out DataScrubber" do
      parser = Parser.new

      parser.instance_eval do
        undef scrub_data  # not strictly necessary

        def scrub_data(parts)
          puts "stubbed out method"
        end
      end

      parser.parse
    end

 

Here I am using our old friend instance_eval to open up the parser instance and undefine the scrub_data method that has been mixed in before re-adding it with the behaviour that I want to run in my tests without having to call the 3rd party service.
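One property worth spelling out is that the stub only lives on the instance that was opened up.  The sketch below assumes a cut-down DataScrubber and Parser that return strings rather than printing, purely so the difference is easy to see.

```ruby
# Simplified stand-ins for the module and class in this post (assumed bodies).
module DataScrubber
  def scrub_data(parts = {})
    "real scrub of #{parts.size} parts"
  end
end

class Parser
  include DataScrubber

  def parse
    scrub_data({:one => "one", :two => "two"})
  end
end

stubbed   = Parser.new
untouched = Parser.new

# Redefine scrub_data on this one instance only.
stubbed.instance_eval do
  def scrub_data(parts)
    "stubbed out method"
  end
end

puts stubbed.parse     # prints stubbed out method
puts untouched.parse   # prints real scrub of 2 parts
```

Because the override sits on the object's singleton class, it disappears with the object and never leaks into other tests.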

Of course any modern platform is defined by the number of mocking frameworks on offer and Ruby certainly has plenty.  flexmock is my weapon of choice and below is an example of how I could use flexmock to create a partial mock to return some test data:

 

    test "should stub out DataScrubber" do
      parser = Parser.new

      flexmock(parser).should_receive(:scrub_data).and_return { "some data" }

      result = parser.parse

      # assert result data
    end

 
The last approach I am going to mention is inheriting from ActiveResource.  I am not sure this is appropriate for me, as the service must understand Rails-style URLs and I might not have that luxury with whatever 3rd party service provider I choose.

I think mixins are the right choice for me.  The Ruby way seems to be (and I could be wrong, I regularly am) to create lots of methods as opposed to interwoven class structures that require complex object creation techniques just to get to the functionality you require.

If I was approaching this problem in C#, I would say, "I would isolate this functionality in a service that I inject into the controller" but in Ruby, I would say "I would write this as a Ruby module that I include in my class".

Testing, stubbing or mocking are, as is often stated, ridiculously easy.


It is funny to note that when I was musing about this earlier on Twitter, somebody who describes himself in his bio as a "passionate Rubyist" made the following questionable comment: "Composing services with mixins? Ouch, enjoy the impending doom."  I suspect this guy is like me, a .NET developer who now suddenly thinks he is a "rubyist".  Another thought is that perhaps real Ruby on Rails guys are thinking of moving on to Clojure or Scala, now that Ruby is no longer as hip as it was.