Saturday, 11 April 2009

Parsing the Horn DSL with the help of a Rhino

In my previous post, I explained the purpose of the horn DSL. In the introduction I mentioned that horn, the open source project I contribute to, is based on the Gentoo Portage package management system. At the heart of both Portage and horn is the metaphor of the package tree. Conceptually, the package tree is a tree whose leaves are package build instructions. In reality, it is a directory structure that contains DSL instance files of package build instructions.

In another previous post, I listed the reasons why Boo was chosen as the language to host the internal DSL.

The DSL contains the build metadata that horn requires both to retrieve an open source project from a remote source control management system and to build or compile its components.

I mentioned in the previous post that the purpose of the DSL is to create the in-memory domain model or semantic model that horn uses to retrieve and build open source software packages.

Below is the horn.boo file that tells the horn software system how to retrieve and build itself:

install horn:
    description "A .NET build and dependency manager"
    get_from svn("http://scotaltdotnet.googlecode.com/svn/trunk/")
    build_with msbuild, buildfile("src/horn.sln"), FrameworkVersion35
    output "Output"
    shared_library "."

dependencies:
    depend @log4net >> "lib"
    depend @castle >> "castle.core"
    depend @castle >> "Castle.DynamicProxy2"
    depend @castle >> "castle.microKernel"
    depend @castle >> "castle.windsor"


We like to think we are dogfooding by providing a build script for horn, but others may differ. I hope that we achieved our goal of build instructions that read like semi-English prose, but I will leave that judgement to the reader. If you think it could be improved, please leave a comment.

I mentioned in the previous post that when horn receives a command line instruction such as horn -install:horn, it searches through the horn package tree (the directory structure of build files) until it finds the requested package's build file, which in this case is the DSL script outlined above.

The main engine of horn, known as the core, is a .NET assembly written in C#. We also have a thin client, a .NET console application also written in C#, that accepts command line instructions and passes them on to the core for execution. The code that searches the package tree for the requested build file is standard and uninteresting, so we will not cover it. Suffice to say that it retrieves the correct build file, which is then parsed. We will now take time to explain how we go about parsing a Boo build file.

I must mention at this point that if I had not read Ayende's excellent book on creating domain specific languages in Boo, I do not think I would have been able to use Boo as the host language for the DSL. This book really is the best reference guide out there and trust me, I have looked.

One thing to bear in mind when writing a DSL in Boo, compared to a more typical internal host language such as Python or Ruby, is that we do not have an interpreter to run our scripts. The DSL is compiled down to IL before it is executed. Boo has an extensible compiler architecture that allows us to create a more palatable DSL.

The basic structure of the Boo compiler is the pipeline; it is how the compiler transforms text in a file into executable code. Boo allows you, the user, to plug your own steps into the compiler pipeline. This lets you add to the syntax of the language, much as the C# language designers did in C# 3.0 by adding query expressions. A compiler step can interact with the parsed code and change the abstract syntax tree. New pipelines and steps can be defined at will. The compiler pipeline is a collection of loosely coupled objects, each with its own task to complete.
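To make the pipeline idea concrete, here is a minimal sketch written in Python rather than Boo. It is not the Boo compiler's API: the Pipeline class and the Parse, RenameKeyword and Emit steps are hypothetical names invented for this illustration, and the "program" is just a list of words. It only shows the shape of the mechanism, namely an ordered list of loosely coupled steps into which a user can insert a step of their own.

```python
class Pipeline:
    """An ordered list of steps; each step receives the program and returns it."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def insert_before(self, step_type, step):
        # Place `step` directly in front of the first step of type `step_type`.
        for index, existing in enumerate(self.steps):
            if isinstance(existing, step_type):
                self.steps.insert(index, step)
                return
        raise ValueError(f"no step of type {step_type.__name__} in the pipeline")

    def run(self, program):
        # Feed the output of each step into the next one.
        for step in self.steps:
            program = step.run(program)
        return program


class Parse:
    """Toy 'parser': split the source text into a list of words."""

    def run(self, text):
        return text.split()


class Emit:
    """Toy 'code generator': join the words back into one string."""

    def run(self, words):
        return " ".join(words)


class RenameKeyword:
    """A user-defined step that rewrites one keyword in the parsed program."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def run(self, words):
        return [self.new if word == self.old else word for word in words]


pipeline = Pipeline([Parse(), Emit()])
pipeline.insert_before(Emit, RenameKeyword("get_from", "GetFrom"))
result = pipeline.run("get_from svn")
print(result)
```

The custom step never needs to know about the parser or the emitter; it only sees the parsed program handed to it, which is what keeps the steps loosely coupled.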

The technique we have chosen for horn is to create what is known as an anonymous base class as the basis of our DSL. The anonymous base class is a standard approach for building a DSL in Boo. It is composed of a base class in our application code and a Boo compiler step that turns the DSL script into a class derived from that base class. It is known as an anonymous base class because there is no mention of the base class in the DSL script. If you look at the horn.boo DSL script at the top of the page, you can think of the keywords install, description, get_from etc. as methods on the base class that we are simply calling in our script.
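The same idea can be sketched in a few lines of Python. This is an analogy only, not how Boo or horn implement it: ConfigReaderBase, compile_script and the script text are hypothetical, and Python's exec stands in for the compiler step. The point is that the script never names the base class, yet its bare calls end up as calls to methods on an instance of a generated subclass.

```python
class ConfigReaderBase:
    """Hypothetical base class that supplies the DSL 'keywords' as methods."""

    def __init__(self):
        self.package_name = None
        self.description_text = None

    def install(self, name):
        self.package_name = name

    def description(self, text):
        self.description_text = text

    def prepare(self):
        # Replaced in the generated subclass by the body of the script.
        raise NotImplementedError


def compile_script(script_text, base_class):
    """Return a new subclass of base_class whose prepare() runs the script."""
    code = compile(script_text, "<dsl script>", "exec")

    def prepare(self):
        # Expose the instance's public methods as bare names for the script.
        keywords = {
            name: getattr(self, name)
            for name in dir(self)
            if not name.startswith("_") and callable(getattr(self, name))
        }
        exec(code, {"__builtins__": {}}, keywords)

    return type("GeneratedScript", (base_class,), {"prepare": prepare})


# The script mentions neither a class nor `self`.
SCRIPT = '''
install("horn")
description("A .NET build and dependency manager")
'''

script_class = compile_script(SCRIPT, ConfigReaderBase)
reader = script_class()
reader.prepare()
print(reader.package_name, "-", reader.description_text)
```

After prepare() runs, the state set by the script lives on the reader object, which is the role the semantic model plays in horn.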

Below is a code snippet of the basic outline of the BooConfigReader anonymous base class we have defined in the horn application code to act as a base class for our Dsl scripts:

public abstract class BooConfigReader
{
    [Meta]
    public static Expression install(ReferenceExpression expression, Expression action)
    {
        var installName = new StringLiteralExpression(expression.Name);
 
        return new MethodInvocationExpression(
                new ReferenceExpression("GetInstallerMeta"),
                installName,
                action
            );
    }
 
    public void description(string text)
    {
        Description = text;
    }

    // ... remaining DSL keyword methods omitted ...
}


DSL script instances will inherit from this class. Below are the first two lines of the horn.boo DSL build script:

install horn:
    description "A .NET build and dependency manager"


You can see that we have methods in the base class that correspond to the keywords used in the DSL script. We will explain how this works in the next post.

And below is a code snippet of the same anonymous base class written in boo:

abstract class BooConfigReader(IQuackFu): 
    callable Action()
 
    [Meta]
    static def install(expression as ReferenceExpression, action as Expression):
        name = StringLiteralExpression(expression.Name)
 
        return [|
            self.GetInstallerMeta($name, $action)
        |]
 
    def description(text as string):
        desc = text


As before, I will explain this code in more detail in the next post. In this post I want to concentrate on how to actually parse individual DSL scripts.

It is at this point that I will introduce the excellent Rhino.Dsl library. I mentioned Ayende earlier; he is the creator of, and probably the main contributor to, the Rhino Tools open source project. I use many of these tools in my day job, so it came as no surprise to discover that there is a Rhino.Dsl project whose components take care of the labour-intensive work of building and running DSLs written in Boo.

The Rhino.Dsl library provides a number of classes that take care of a lot of the heavy lifting for the common DSL tasks that we would otherwise have to code in horn ourselves. Examples of such tasks are parsing the DSL files and shielding us from the repetitive boilerplate code that is needed to add an anonymous base class to the Boo compiler pipeline.

Let us examine the following code that uses two such Rhino.Dsl classes:

var factory = new DslFactory
            {
                BaseDirectory = packageTree.CurrentDirectory.FullName
            };
 
factory.Register<BooConfigReader>(new ConfigReaderEngine());


In the above code we create an instance of Rhino.Dsl.DslFactory and set its BaseDirectory to the folder in the package tree that holds the build scripts.

We then have the following line:

factory.Register<BooConfigReader>(new ConfigReaderEngine());


The Register method of the DslFactory allows us to register instances of another Rhino.Dsl class named DslEngine or, as with the ConfigReaderEngine above, of classes that derive from DslEngine. The DslEngine class abstracts away a lot of the boilerplate code that is required to compile DSL scripts. It also contains a number of extension points, one of which is a method called CustomizeCompiler that allows us to customise the compiler pipeline. We mentioned earlier that the main mechanism of the horn DSL is an anonymous base class registered in the Boo compiler pipeline. We pass the type of the anonymous base class as a generic argument to the Register method.

As mentioned earlier, the Rhino.Dsl.DslEngine class provides extension points for customising the Boo compiler pipeline. The extension point we use in horn is the CustomizeCompiler method. Below is the code listing for the ConfigReaderEngine class, which inherits from DslEngine, adds our own compiler modifications, and is the class we registered with the DslFactory in the previous code listing.
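The relationship between the factory, the engine and its extension point can be pictured with the following Python sketch. It is a toy model under stated assumptions and not the Rhino.Dsl implementation: Factory, Engine, ConfigEngine and ReaderBase are hypothetical names, and "compiling" here merely runs a few string functions. What it shows is a registry keyed by base type, plus an engine whose fixed compile routine calls an overridable hook.

```python
class Engine:
    """Hypothetical engine: owns the fixed compile routine and one hook."""

    def customize_compiler(self, steps):
        """Extension point; the default implementation adds no extra steps."""

    def compile(self, source):
        steps = [str.strip]              # a built-in step every engine runs
        self.customize_compiler(steps)   # subclasses may add their own steps
        for step in steps:
            source = step(source)
        return source


class ConfigEngine(Engine):
    """Derived engine that uses the hook to add a step of its own."""

    def customize_compiler(self, steps):
        steps.append(str.lower)


class ReaderBase:
    """Hypothetical base type, used here only as the registration key."""


class Factory:
    """Maps a base type to the engine that compiles scripts for it."""

    def __init__(self):
        self._engines = {}

    def register(self, base_type, engine):
        self._engines[base_type] = engine

    def create(self, base_type, source):
        if base_type not in self._engines:
            raise KeyError(f"no engine registered for {base_type.__name__}")
        return self._engines[base_type].compile(source)


factory = Factory()
factory.register(ReaderBase, ConfigEngine())
compiled = factory.create(ReaderBase, "  INSTALL Horn  ")
print(compiled)
```

The caller only ever talks to the factory; which steps run is decided by whichever engine was registered for that base type.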

public class ConfigReaderEngine : DslEngine
{
    protected override void CustomizeCompiler(BooCompiler compiler, CompilerPipeline pipeline, string[] urls)
    {
        pipeline.Insert(1, new ImplicitBaseClassCompilerStep(typeof(BooConfigReader), "Prepare", "Horn.Core.Dsl"));
        pipeline.InsertBefore(typeof(ProcessMethodBodiesWithDuckTyping), new RightShiftToMethodCompilerStep());
        pipeline.Insert(2, new UnderscorNamingConventionsToPascalCaseCompilerStep());
        pipeline.Insert(3, new UseSymbolsStep());           
    }
}


The code above shows how we can add our own custom steps to the compiler pipeline.

The important line of code to observe in the above example is the following:

pipeline.Insert(1, new ImplicitBaseClassCompilerStep(typeof(BooConfigReader), "Prepare", "Horn.Core.Dsl"));


This adds our anonymous base class to the pipeline, which means that every build script we parse will inherit from this class. We use another Rhino.Dsl class named ImplicitBaseClassCompilerStep that again shields us from a lot of the Boo infrastructure code needed to register an anonymous base class. The ImplicitBaseClassCompilerStep constructor takes as arguments the type of the anonymous base class, the name of the abstract method that will be called to invoke the script, and the namespace of the anonymous base class.

In horn, after registering the anonymous base class compiler step, we add another compiler step with the code below:

pipeline.InsertBefore(typeof(ProcessMethodBodiesWithDuckTyping), new RightShiftToMethodCompilerStep());


This allows us to transform the >> bitwise operators in the dependencies section of the Dsl into something else:

dependencies:
    depend @log4net >> "lib"


We will cover that in a future post.
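Until then, here is a rough idea of what rewriting an operator in the syntax tree looks like, sketched with Python's ast module rather than a Boo compiler step. This is not the RightShiftToMethodCompilerStep itself: the target function add_dependency, the RightShiftToCall class and the two-line script are assumptions made up for the example. The transformer replaces every `left >> right` node with a call before the code is compiled.

```python
import ast


class RightShiftToCall(ast.NodeTransformer):
    """Rewrite `left >> right` into `add_dependency(left, right)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.RShift):
            call = ast.Call(
                func=ast.Name(id="add_dependency", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def run_script(source, add_dependency):
    """Parse, rewrite, compile and execute a script."""
    tree = ast.parse(source, mode="exec")
    tree = ast.fix_missing_locations(RightShiftToCall().visit(tree))
    exec(compile(tree, "<dsl script>", "exec"), {"add_dependency": add_dependency})


recorded = []
run_script(
    '"log4net" >> "lib"\n"castle" >> "castle.core"\n',
    lambda package, target: recorded.append((package, target)),
)
print(recorded)
```

Without the rewrite, shifting one string by another would raise a TypeError in Python; the transformation gives the operator a meaning that only makes sense inside the DSL.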

Below is the code listing for how we get the DslFactory and DslEngine instances to work together to parse the Dsl script into our semantic model:

configReader = factory.Create<BooConfigReader>(buildFilePath);
 
configReader.Prepare();


We call the Create method of the factory, passing in the path to the DSL script and the type of the anonymous base class as a generic argument. The Prepare method is an abstract method on the anonymous base class. When we call this method, the DSL script is invoked, which in turn calls methods on the anonymous base class that is now the base class of the DSL script.

Thanks to Rhino.Dsl, we can concentrate on authoring our DSL and let it take care of the boilerplate Boo plumbing code.

In the next post we will go line by line through the DSL script syntax to see how the build script syntax was achieved.

If any of this is of interest to you then please join the Horn user group for updates or check out the source here.
