C# Tips - I

LINQ Expression tree to generate prefix notation of expressions

Yesterday I was going through one of the LINQ hands-on labs. I have always been interested in the new expression trees in C# 3.0, and one of the expression tree samples in the lab grabbed my attention. I built on it to create a prefix notation generator that works from a lambda expression.

What are Expression trees

An expression tree is an in-memory tree built from a lambda expression, which lets you inspect and manipulate the expression as data instead of only executing it. Expression trees are created as follows:
Expression<Func<int, bool>> filter = n => !((n * 3) < 5);
Now filter holds the expression n => !((n * 3) < 5) as data, so it can be inspected, or rewritten into a new tree, at will.
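
To make "expression as data" concrete, here is a minimal sketch that walks the top of the tree for the same lambda and then compiles it back into a delegate. The class name ExpressionTreeDemo is only an illustration and is not part of the lab sample.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionTreeDemo
{
    // The same lambda as in the text, captured as a tree rather than as compiled code.
    public static readonly Expression<Func<int, bool>> Filter = n => !((n * 3) < 5);

    static void Main()
    {
        // The body is a tree: Not -> LessThan -> (Multiply, Constant).
        var not = (UnaryExpression)Filter.Body;
        var lessThan = (BinaryExpression)not.Operand;

        Console.WriteLine(not.NodeType);            // Not
        Console.WriteLine(lessThan.NodeType);       // LessThan
        Console.WriteLine(lessThan.Left.NodeType);  // Multiply
        Console.WriteLine(lessThan.Right);          // 5

        // The tree can also be compiled into a delegate and invoked.
        Func<int, bool> compiled = Filter.Compile();
        Console.WriteLine(compiled(1));             // False, because 3 < 5
        Console.WriteLine(compiled(2));             // True, because 6 is not < 5
    }
}
```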

Prefix notation generation

This expression tree is like any other tree and can be traversed in preorder to generate the prefix notation of the expression. So given the expression !((n * 3) < 5), it is easy to generate the prefix form ! ( < ( * ( n 3 ) 5 )).
I wrote a small extension method on Expression that prints the prefix notation by doing a preorder traversal, as follows:

// Extension methods must be declared inside a static class.
// NodeTypeLookUp (not shown) is indexed by ExpressionType and yields the text to print.
static void PrefixForm(this Expression exp)
{
    if (exp is BinaryExpression)
    {
        // operator first, then both operands in parentheses
        BinaryExpression binEx = (BinaryExpression)exp;
        Console.Write(" {0} ", NodeTypeLookUp[(int)binEx.NodeType]);
        Console.Write("(");
        binEx.Left.PrefixForm();
        binEx.Right.PrefixForm();
        Console.Write(")");
    }
    else if (exp is UnaryExpression)
    {
        // operator first, then its single operand in parentheses
        UnaryExpression unEx = (UnaryExpression)exp;
        Console.Write(" {0} ", NodeTypeLookUp[(int)unEx.NodeType]);
        Console.Write("(");
        unEx.Operand.PrefixForm();
        Console.Write(")");
    }
    else if (exp is ParameterExpression)
    {
        // leaf: a lambda parameter such as n
        Console.Write(" {0} ", ((ParameterExpression)exp).Name);
    }
    else if (exp is ConstantExpression)
    {
        // leaf: a literal value such as 3 or 5
        Console.Write(" {0} ", ((ConstantExpression)exp).Value);
    }
    else
    {
        Console.WriteLine("{0} is not yet supported", exp.GetType().FullName);
    }
}

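Since the lookup table is not shown here, the following is a self-contained sketch of the same preorder traversal under two assumptions of mine: it prints the ExpressionType name (Not, LessThan, Multiply) instead of an operator symbol, and it returns a string instead of writing to the console. PrefixPrinter and ToPrefix are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

static class PrefixPrinter
{
    // Builds the prefix form of a tree made of binary, unary,
    // parameter and constant nodes; anything else is rejected.
    public static string ToPrefix(Expression exp)
    {
        var sb = new StringBuilder();
        Append(exp, sb);
        return sb.ToString();
    }

    static void Append(Expression exp, StringBuilder sb)
    {
        switch (exp)
        {
            case BinaryExpression bin:
                // visit the node first, then left and right (preorder)
                sb.Append(bin.NodeType.ToString()).Append(" ( ");
                Append(bin.Left, sb);
                sb.Append(' ');
                Append(bin.Right, sb);
                sb.Append(" )");
                break;
            case UnaryExpression un:
                sb.Append(un.NodeType.ToString()).Append(" ( ");
                Append(un.Operand, sb);
                sb.Append(" )");
                break;
            case ParameterExpression p:
                sb.Append(p.Name);
                break;
            case ConstantExpression c:
                sb.Append(c.Value);
                break;
            default:
                throw new NotSupportedException(
                    exp.GetType().FullName + " is not yet supported");
        }
    }

    static void Main()
    {
        Expression<Func<int, bool>> filter = n => !((n * 3) < 5);
        // Pass the body; the lambda node itself is not handled.
        Console.WriteLine(ToPrefix(filter.Body));
        // Not ( LessThan ( Multiply ( n 3 ) 5 ) )
    }
}
```

Returning a string rather than printing makes the traversal easy to check in a unit test.
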
Virtual method calls from constructors

C++ and C# vary considerably in how virtual method calls from constructors work. I feel that the approach taken by C++ is significantly better. Let's consider the following code in C++.

#include <iostream>
using namespace std;

class Base
{
public:
    Base()
    {
        Foo(); // virtual call from the constructor
    }
    virtual void Foo()
    {
        cout << "Base::Foo" << endl;
    }
};

class Derived : public Base
{
public:
    Derived()
    {
    }
    void Foo()
    {
        cout << "Derived::Foo" << endl;
    }
};

int main()
{
    Derived* der = new Derived();
    delete der;
    return 0;
}

The output of the program is Base::Foo. 

Here the Base class constructor calls the virtual method Foo. In C++ (as in most OO languages) the base-class part of an object is constructed before the derived part, so the base class constructor runs first. When the call to Foo is made in Base::Base(), the Derived part does not exist yet, and hence the call to Foo() ends up in Base::Foo and not in its override Derived::Foo.
In C# the behavior is different: the most derived override is always called.

class Base
{
    public Base()
    {
        Foo();
    }

    public virtual void Foo()
    {
        Console.WriteLine("Base::Foo");
    }
}

class Derived : Base
{
    string str = null;
    public Derived()
    {
        str = "Hello world";
    }

    public override void Foo()
    {
        Console.WriteLine("Derived::Foo");
        //Console.WriteLine(str);
    }
}

class Program
{
    static void Main(string[] args)
    {
        Derived der = new Derived();
    }
}

In this case the output is Derived::Foo. Even in C# the base class is created first and its constructor is called first. However calls to virtual methods always land on the most derived version which in this case is Derived::Foo().

However, there are issues with this. Even though Derived::Foo gets called, the Derived class is not yet fully initialized because its constructor body has not run. In the above code str is still null when Derived::Foo is called from the Base constructor, so dereferencing it there (for example str.Length) throws a NullReferenceException. So I tend to believe that even though the C++ implementation needs a bit more understanding of object creation (vtable build-up), it is safer.
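
The ordering problem can be observed directly. In this sketch the override records what it saw instead of dereferencing str; the Observed field and the Current property are hypothetical additions made only for the demonstration.

```csharp
using System;

class Base
{
    public Base()
    {
        Foo(); // virtual call from the constructor
    }

    public virtual void Foo() { }
}

class Derived : Base
{
    string str = null;

    // Hypothetical field for this sketch: records what Foo observed.
    public string Observed = "Foo was not called";

    public Derived()
    {
        str = "Hello world"; // runs only after Base() has returned
    }

    public override void Foo()
    {
        // Invoked from Base() before the Derived constructor body has run.
        Observed = (str == null) ? "str was null" : "str was " + str;
    }

    public string Current => str;
}

static class Program
{
    static void Main()
    {
        var der = new Derived();
        Console.WriteLine(der.Observed); // str was null
        Console.WriteLine(der.Current);  // Hello world
    }
}
```

Field initializers in Derived run before the Base constructor, which is why Observed already has its initial value when Foo overwrites it, while the assignment in the constructor body comes too late.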

Due to all these subtleties it's always recommended not to call virtual methods from constructors. If, for example, you are writing code that will be used by others (as in frameworks), this may break client developers who derive from your class. They may never anticipate that calls to their overrides can hit variables that are still uninitialized, even though they explicitly initialize them in the derived class's constructor.

Code Generation in multiple languages

I'm currently working on a personal project that needs to spit out code after parsing an XML file. I had previously used the .NET Framework's CodeDom for on-the-fly compilation, so I dug it up again to see if I could use it for code generation. Within a short time I was completely blown away by the feature set and by how much I could achieve with so little effort.

I had initially expected little support from the framework and had thought I'd manipulate text to generate the C# code. I then figured out that I could use the CodeDom to build the code structure hierarchy, pass in a language parameter, and, if that language is supported, generate code in that language. Suddenly my application was not limited to C#; I could use VB.NET or VJ# for my output code as well. To demonstrate this I'll skip the XML parsing (serialization and logic) part. The following code generates a hello world program in any of the supported .NET languages.

// Requires: using System; using System.CodeDom;
// using System.CodeDom.Compiler; using System.IO;
private void BtnGenerate_Click(object sender, EventArgs e)
{
    TextCode.Text = GenerateCode("C#");
    TextCode.Text += GenerateCode("VJ#");
    TextCode.Text += GenerateCode("VB");
}

public string GenerateCode(string language)
{
    // get CodeDom provider from the language name
    CodeDomProvider provider =
        CodeDomProvider.CreateProvider(language);
    // generate the code
    return GenerateCode(provider);
}

public string GenerateCode(CodeDomProvider provider)
{
    // open string based in memory streams
    using (StringWriter writer = new StringWriter())
    using (IndentedTextWriter tw =
        new IndentedTextWriter(writer, " "))
    {
        // create top level namespace
        CodeNamespace myNamespace =
            new CodeNamespace("AbhinabaNameSpace");
        // create file level comment
        CodeComment comment = new CodeComment(
            string.Format("Generated on {0}",
                DateTime.Now.ToShortDateString()),
            false);
        CodeCommentStatement commentStatement =
            new CodeCommentStatement(comment);
        myNamespace.Comments.Add(commentStatement);
        // add using statements for the required namespaces
        myNamespace.Imports.Add(
            new CodeNamespaceImport("System"));
        // define the one and only class
        CodeTypeDeclaration mainClass =
            new CodeTypeDeclaration();
        mainClass.IsClass = true;
        mainClass.Name = "HelloWorldMainClass";
        mainClass.Attributes = MemberAttributes.Public;
        myNamespace.Types.Add(mainClass);
        // define the entry point which'd be
        // Main method in C#
        CodeEntryPointMethod mainMethod =
            new CodeEntryPointMethod();
        mainMethod.Comments.Add(
            new CodeCommentStatement("<summary>", true));
        mainMethod.Comments.Add(
            new CodeCommentStatement("Entry point", true));
        mainMethod.Comments.Add(
            new CodeCommentStatement("</summary>", true));
        mainClass.Members.Add(mainMethod);
        // define the string variable message
        CodeVariableDeclarationStatement strDecl =
            new CodeVariableDeclarationStatement(
                new CodeTypeReference(typeof(string)),
                "message");
        mainMethod.Statements.Add(strDecl);
        // create the message = "hello world" statement; a primitive
        // expression is rendered as a literal by each language's generator
        CodeAssignStatement ptxAssign =
            new CodeAssignStatement(
                new CodeVariableReferenceExpression("message"),
                new CodePrimitiveExpression("hello world"));
        mainMethod.Statements.Add(ptxAssign);
        // call Console.WriteLine to print the local variable message
        CodeMethodInvokeExpression invokeConsoleWriteLine =
            new CodeMethodInvokeExpression(
                new CodeTypeReferenceExpression(typeof(Console)),
                "WriteLine",
                new CodeExpression[]
                {
                    new CodeVariableReferenceExpression(
                        "message"),
                }
            );
        mainMethod.Statements.Add(invokeConsoleWriteLine);
        // code generation options
        CodeGeneratorOptions opt = new CodeGeneratorOptions();
        opt.BracingStyle = "C";
        opt.BlankLinesBetweenMembers = false;
        // generate the code and return it
        provider.GenerateCodeFromNamespace(myNamespace, tw, opt);
        return writer.ToString();
    }
}
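
Because a language name only works when a provider is registered for it on the machine (VJ#, for instance, needs its own redistributable), it may be worth checking before generating. This is a small sketch using CodeDomProvider.IsDefinedLanguage; the helper name TryCreateProvider is my own, and I'm assuming the .NET Framework here.

```csharp
using System;
using System.CodeDom.Compiler;

static class LanguageCheck
{
    // Returns a provider for the language, or null when none is registered.
    public static CodeDomProvider TryCreateProvider(string language)
    {
        return CodeDomProvider.IsDefinedLanguage(language)
            ? CodeDomProvider.CreateProvider(language)
            : null;
    }

    static void Main()
    {
        foreach (string language in new[] { "C#", "VB", "VJ#" })
        {
            // a using statement accepts a null resource, so no extra check is needed
            using (CodeDomProvider provider = TryCreateProvider(language))
            {
                Console.WriteLine("{0}: {1}", language,
                    provider != null ? "supported" : "not installed");
            }
        }
    }
}
```

The result depends on which providers are installed, so I'm not listing an expected output.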

If only more languages were on .NET, I could build the list at http://www2.latech.edu/~acm/HelloWorld.shtml in a couple of hours :)
I get the following output:

In C#
=======

// Generated on 2/27/2006
namespace AbhinabaNameSpace
{
    using System;

    public class HelloWorldMainClass
    {
        /// <summary>
        /// Entry point
        /// </summary>
        public static void Main()
        {
            string message;
            message = "hello world";
            System.Console.WriteLine(message);
        }
    }
}


VJ#
=======
// Generated on 2/27/2006
package AbhinabaNameSpace;
import System.*;

public class HelloWorldMainClass
{
    /** Entry point */
    public static void main(String[] args)
    {
        String message;
        message = "hello world";
        System.Console.WriteLine(message);
    }
}


VB
=======
Imports System

'Generated on 2/27/2006
Namespace AbhinabaNameSpace
    Public Class HelloWorldMainClass
        '''<summary>
        '''Entry point
        '''</summary>
        Public Shared Sub Main()
            Dim message As String
            message = "hello world"
            System.Console.WriteLine(message)
        End Sub
    End Class
End Namespace


Optional arguments in C#

One of the things I missed a lot when I moved to C# is optional arguments. In C++ optional arguments are used a lot. Code like the following is a common sight.

void foo(int reqdParam, int optParam = 0)
{
// ...
}


foo(5); // gets compiled as foo(5,0)
foo(5, 10);

C# (as of version 3.0) does not include them, mainly because of a versioning problem. Optional arguments are handled in most programming languages by inserting the default value of the optional argument at the call site. So for the above code foo(5) is compiled as foo(5, 0).
The versioning issue comes into play if the call site and the method are in different assemblies. In the next version of the method the default value of optParam may change from 0 to 1, and 0 may become an unsupported value. However, the already-compiled calling code still passes 0, and hence we may get a run-time issue. The way around this is to recompile all the assemblies that contain calls to the method, and this simply does not scale.
Another way of handling optional arguments would be to automatically generate method overloads based on the optional arguments. So the above code on compilation would yield something like

void foo(int reqdParam)
{
    foo(reqdParam, 0); // forwards with the default value
}
void foo(int reqdParam, int optParam)
{
    // ...
}

foo(5); // calls the first overload
foo(5, 10); // calls the actual function

This is versioning safe, because the default value lives in the callee's assembly rather than at the call site, and at the same time it gives all the benefits of optional arguments. However, it is not used by most languages, including C#, and I do not know why. The side effects would be code bloat and the inclusion of these forwarding methods in the call stack.

Abstract base class over interface


I'm currently reading the book Framework Design Guidelines. It is one of the best books I have read in some time. Most books that cover design in general are one-sided and highlight the author's beliefs and convictions. This book, however, is very different. It gives suggestions about various aspects of framework design and at the same time highlights the views and opinions of different people on the .NET team, which don't necessarily conform to the guidelines. This makes the book a very interesting read.

One of the suggestions is "Favor defining classes over interfaces". While this is highly debatable, I agree with it in general. In this section I read a comment from Brian Pepin that reminded me of some framework code I saw a long time back, which convinced me that abstract base classes are sometimes much superior to interfaces for defining contracts.

In that UI framework one of the requirements was that individual controls should support loading bitmaps that act as application skins, with the following method prototypes:

void LoadBitmap(string fileName);
void LoadBitmap(string fileName, Color transparentCol);
void LoadBitmap(string fileName, int width, int height);
void LoadBitmap(string fileName, Color transparentCol,
                int width, int height);


The last method is the actual implementation; all the other methods fill in default values and pass them on to this method.
Suppose this were defined in an interface, as in

public interface ILoadBitmap
{
    void LoadBitmap(string fileName);
    void LoadBitmap(string fileName, Color transparentCol);
    void LoadBitmap(string fileName, int width, int height);
    void LoadBitmap(string fileName, Color transparentCol,
                    int width, int height);
}


All classes that implement the interface have to validate parameters in all four methods and call the last method, passing default values for the parameters not specified. Not only does this become tedious if some 20 types of controls support skinning, it also invites programmer error, where a wrong default value is passed in some of the controls. This is where an abstract base class (ABC) comes in. Using an ABC you can code this as

public abstract class SkinControl
{
    public void LoadBitmap(string fileName)
    {
        Debug.Assert(!string.IsNullOrEmpty(fileName));
        LoadBitmap(fileName, Color.Magenta, -1, -1);
    }
    public void LoadBitmap(string fileName, Color transparentCol)
    {
        Debug.Assert(!string.IsNullOrEmpty(fileName));
        LoadBitmap(fileName, transparentCol, -1, -1);
    }
    public void LoadBitmap(string fileName, int width, int height)
    {
        Debug.Assert(!string.IsNullOrEmpty(fileName));
        Debug.Assert(width > 0 && height > 0);
        LoadBitmap(fileName, Color.Magenta, width, height);
    }
    public abstract void LoadBitmap(string fileName,
                                    Color transparentCol,
                                    int width, int height);
}

public class SkinnedButton : SkinControl
{
    public override void LoadBitmap(string fileName,
                                    Color transparentCol,
                                    int width, int height)
    {
        // Actual implementation
    }
}

All the convenience overloads are implemented in the ABC and only the last method is abstract. So in every class that derives from this abstract class, the developer needs to implement only the fourth overload. Most of the contract is coded directly into the ABC, which results in less code and fewer programming errors.
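
As a quick check of how the defaults flow through, here is a trimmed-down sketch in which the derived control just records the arguments it finally receives. RecordingButton and LastCall are hypothetical names for the demonstration, the file name is made up, and I'm assuming System.Drawing.Color is available.

```csharp
using System;
using System.Drawing;

public abstract class SkinControl
{
    // Convenience overloads live in the base class and fill in defaults.
    public void LoadBitmap(string fileName)
    {
        LoadBitmap(fileName, Color.Magenta, -1, -1);
    }

    public void LoadBitmap(string fileName, int width, int height)
    {
        LoadBitmap(fileName, Color.Magenta, width, height);
    }

    // The only method a derived control has to implement.
    public abstract void LoadBitmap(string fileName, Color transparentCol,
                                    int width, int height);
}

public class RecordingButton : SkinControl
{
    public string LastCall { get; private set; }

    public override void LoadBitmap(string fileName, Color transparentCol,
                                    int width, int height)
    {
        // Stand-in for the real loading code: remember what arrived.
        LastCall = string.Format("{0}, {1}, {2}x{3}",
            fileName, transparentCol.Name, width, height);
    }
}

static class SkinDemo
{
    static void Main()
    {
        var button = new RecordingButton();

        button.LoadBitmap("skin.bmp");
        Console.WriteLine(button.LastCall); // skin.bmp, Magenta, -1x-1

        button.LoadBitmap("skin.bmp", 32, 16);
        Console.WriteLine(button.LastCall); // skin.bmp, Magenta, 32x16
    }
}
```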




Added on December 28, 2007