Sunday, December 20, 2009

Expression Trees - serializing your data

Update: Just to clarify - the code snippets below are under the MIT/X11 license.

I spent a few hours over the weekend writing a binary serializer using expression trees. I wanted to see how things would look using the new features available in .NET 4.0. My requirements were pretty simple:

1) Serialize all public properties in a type or a subset of them
2) Control the order in which they're serialized - sometimes you need to interop with an existing format and you must write your data in a specific order
3) Control how a primitive is converted - Do you need to write value types in big endian, little endian, middle endian?
4) Easy to use API.

So let's start with the API. This is what I was hoping to use:

public class Secondary
{
public int First { get; set; }
public int Second { get; set; }
public int Third { get { return First + Second; } }
}

public class MyClass
{
public byte ByteProp { get; set; }
public short ShortProp { get; set; }
public int IntProp { get; set; }
public long LongProp { get; set; }
public string StringProp { get; set; }
}

static void Main(string[] args)
{
// Register a message so that all public properties will be serialized
Message.Register<MyClass>();

// Register a message so that only some properties are serialized and
// they are serialized in the specified order
Message.Register<Secondary>(
d => d.Second,
d => d.First
);

// Create a stream to serialize the data to
Stream s = new MemoryStream();
var message = new MyClass {
IntProp = 1,
LongProp= 2,
ByteProp= 3,
ShortProp = 4,
StringProp = "Hello World"
};

// Encode the message to the stream
MessageEncoder.Encode(message, s);

// Rewind the stream and then decode the message
s.Position = 0;
var decoded = MessageDecoder.Decode<MyClass>(s);
}

It's pretty standard stuff. You can work with the standard serializer logic (serialize properties alphabetically) by registering an object without specifying any specific properties or you can customise which properties are serialized. This could also be done using attributes, but using attributes to control the order in which properties are serialized would be more error prone than the above.
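The implementation further down only shows registration with an explicit property list. As a rough sketch of how the default case could pick its properties, assuming "all public properties" means public instance properties with both a getter and a setter, sorted by name, something like the following would do. `DefaultProperties` is a hypothetical helper name and is not part of the code in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class Secondary
{
    public int First { get; set; }
    public int Second { get; set; }
    public int Third { get { return First + Second; } }
}

public static class DefaultProperties
{
    // Hypothetical helper: public, non-indexed instance properties that can be
    // both read and written, in alphabetical (ordinal) order.
    public static IEnumerable<PropertyInfo> For<T>()
    {
        return typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetIndexParameters().Length == 0)
            .Where(p => p.GetGetMethod() != null && p.GetSetMethod() != null)
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Third has no setter, so it can't be decoded and is skipped.
        foreach (var property in DefaultProperties.For<Secondary>())
            Console.WriteLine(property.Name);
        // prints "First" then "Second"
    }
}
```

The resulting list could be handed straight to the `RegisterMessage<T>(IEnumerable<PropertyInfo>)` overloads shown later.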

Sometimes you need to write your data in big endian, other times in little endian, and sometimes you won't care. Either way, you need to be able to control it:
MessageEncoder.RegisterPrimitiveEncoder<int>((value, stream) => {
stream.Write(BitConverter.GetBytes(value));
});

It's simple. Any type which can be directly converted to an array of bytes is classified as a 'primitive'. Each primitive can have an encoder/decoder pair registered as above.
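To make the byte-order choice concrete, here is a small sketch of the two conversions an int encoder might pick between. `ByteOrderDemo`, `ToBigEndian` and `ToLittleEndian` are names made up for this illustration; only `BitConverter` and `IPAddress.HostToNetworkOrder` come from the framework.

```csharp
using System;
using System.Net;

public static class ByteOrderDemo
{
    // Big endian (network order): most significant byte first.
    public static byte[] ToBigEndian(int value)
    {
        // HostToNetworkOrder swaps the bytes on little-endian machines
        // and leaves them alone on big-endian ones.
        return BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value));
    }

    // Little endian: least significant byte first.
    public static byte[] ToLittleEndian(int value)
    {
        var bytes = BitConverter.GetBytes(value);
        if (!BitConverter.IsLittleEndian)
            Array.Reverse(bytes);
        return bytes;
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(ToBigEndian(1)));    // 00-00-00-01
        Console.WriteLine(BitConverter.ToString(ToLittleEndian(1))); // 01-00-00-00
    }
}
```

Either function could be wrapped in a lambda and passed to `RegisterPrimitiveEncoder<int>`, as long as the matching decoder reverses the same conversion.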

public static class MessageEncoder
{
static Dictionary<Type, Delegate> encoders;
static Dictionary<Type, Delegate> primitives;

static MessageEncoder()
{
encoders = new Dictionary<Type, Delegate>();
primitives = new Dictionary<Type, Delegate>();
RegisterPrimitiveEncoders();
}

static void RegisterPrimitiveEncoders()
{
RegisterPrimitiveEncoder<byte>((value, stream) =>
stream.WriteByte(value)
);

RegisterPrimitiveEncoder<short>((value, stream) =>
stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
);

RegisterPrimitiveEncoder<int>((value, stream) =>
stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
);

RegisterPrimitiveEncoder<long>((value, stream) =>
stream.Write(BitConverter.GetBytes(IPAddress.HostToNetworkOrder(value)))
);

var intWriter = (Action<int, Stream>)primitives[typeof (int)];
RegisterPrimitiveEncoder<string>((value, stream) => {
var buffer = Encoding.UTF8.GetBytes(value);
intWriter(buffer.Length, stream);
stream.Write(buffer);
});
}

public static void RegisterPrimitiveEncoder<T>(Action<T, Stream> encoder)
{
primitives [typeof (T)] = encoder;
}

public static void RegisterMessage<T>(params Expression<Func<T, object>>[] properties)
{
RegisterMessage<T>(properties.Select(p => p.AsPropertyInfo ()));
}

public static void RegisterMessage<T>(IEnumerable<PropertyInfo> properties)
{
var propertyEncoders = new List<Expression>();

// The encode function takes an instance of the class we're encoding and the Stream
// which we should write the data to.
ParameterExpression source = Expression.Parameter(typeof(T), "source_param");
ParameterExpression stream = Expression.Parameter(typeof(Stream), "stream");

// For each property, get the encoder which will convert the value of the property to a byte[]
// which can be written to the stream.
foreach (var property in properties) {
// Get the encoder for this property type
var action = primitives[property.PropertyType];
// Create a var which holds the Action <T, Stream> which encodes the data to the stream
Expression converter = Expression.Constant(action, action.GetType ());
// Invoke the encoder passing the value of the property and the 'stream'
Expression invoker = Expression.Invoke(converter, Expression.Property(source, property), stream);
// Add the encoder for this property to the list.
propertyEncoders.Add(invoker);
}

// Create an expression block which will execute each of the encoders one by one
Expression block = Expression.Block(propertyEncoders);
encoders.Add(typeof(T), Expression.Lambda<Action<T, Stream>>(
block,
source,
stream
).Compile());
}

public static void Encode<T>(T message, Stream s)
{
var encoder = (Action<T, Stream>)encoders[typeof (T)];
encoder (message, s);
}
}

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Linq.Expressions;
using System.Reflection;
using System.Net;
using System.IO;

namespace Encoder
{
public static class MessageDecoder
{
static Dictionary<Type, Delegate> decoders;
static Dictionary<Type, Delegate> primitives;

static MessageDecoder()
{
decoders = new Dictionary<Type, Delegate>();
primitives = new Dictionary<Type, Delegate>();
RegisterDefaultDecoders();
}

static void RegisterDefaultDecoders()
{
RegisterPrimitiveDecoder<byte>((s) => {
var val = s.ReadByte();
if (val == -1)
throw new EndOfStreamException();
return (byte)val;
});

RegisterPrimitiveDecoder<short>((s) => IPAddress.NetworkToHostOrder (s.ReadShort()));
RegisterPrimitiveDecoder<int>(s => IPAddress.NetworkToHostOrder (s.ReadInt()));
RegisterPrimitiveDecoder<long>(s => IPAddress.NetworkToHostOrder (s.ReadLong()));

var intDecoder = (Func<Stream, int>)primitives[typeof(int)];
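RegisterPrimitiveDecoder<string>(s => {
var length = intDecoder(s);
var buffer = new byte[length];
// Stream.Read may return fewer bytes than requested, so loop until the buffer is full
int offset = 0;
while (offset < buffer.Length) {
int read = s.Read(buffer, offset, buffer.Length - offset);
if (read == 0)
throw new EndOfStreamException();
offset += read;
}
return Encoding.UTF8.GetString(buffer);
});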
}

public static void RegisterPrimitiveDecoder<T>(Func<Stream, T> decoder)
{
primitives.Add(typeof(T), decoder);
}

public static void RegisterMessage<T>(params Expression<Func<T, object>>[] properties)
{
RegisterMessage<T>(properties.Select(d => d.AsPropertyInfo()));
}

public static void RegisterMessage<T>(IEnumerable<PropertyInfo> properties)
{
var propertyDecoders = new List<Expression>();

// The decode function takes an instance of the class we're decoding and the Stream
// containing the data to decode.
ParameterExpression source = Expression.Parameter(typeof(T), "source_param");
ParameterExpression stream = Expression.Parameter(typeof(Stream), "stream");

// For each property, get the primitive decoder which will read data from the stream and
// return a value of the correct type.
foreach (var property in properties) {
var action = primitives[property.PropertyType];
// Create a var which holds the Func <Stream, T> which decodes the data from the stream
Expression decoder = Expression.Constant(action, action.GetType());
// Invoke the decoder passing 'stream' as the parameter
Expression invoker = Expression.Invoke(decoder, stream);
// Store the return value of the decoder in the property.
Expression setter = Expression.Call(source, property.GetSetMethod(), invoker);
// Add the decoder for this property to the list.
propertyDecoders.Add(setter);
}

// Create a block which will execute the decoders for all the fields one after another.
Expression block = Expression.Block(propertyDecoders);
decoders.Add (typeof (T), Expression.Lambda<Action<T, Stream>>(
block,
source,
stream
).Compile ());
}

public static T Decode<T>(Stream s) where T : class, new()
{
T t = new T();
var decoder = (Action<T, Stream>)decoders[typeof(T)];
decoder(t, s);
return t;
}
}
}
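The encoder and decoder above call a few helpers that aren't shown in this post: `stream.Write(byte[])`, `ReadShort()`, `ReadInt()`, `ReadLong()` and `AsPropertyInfo()`. What follows is only a guess at what they might look like, written as extension methods. `ReadBytes` is an extra hypothetical helper, and the class names are invented for this sketch.

```csharp
using System;
using System.IO;
using System.Linq.Expressions;
using System.Reflection;

public static class StreamHelpers
{
    // Writes the whole buffer. Newer runtimes already accept a byte[] here
    // through the span-based Stream.Write overload.
    public static void Write(this Stream stream, byte[] buffer)
    {
        stream.Write(buffer, 0, buffer.Length);
    }

    // Hypothetical helper: reads exactly 'count' bytes or throws if the stream ends first.
    public static byte[] ReadBytes(this Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count) {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }

    // These return the raw value in host byte order; the decoders above
    // then apply IPAddress.NetworkToHostOrder to it.
    public static short ReadShort(this Stream s) { return BitConverter.ToInt16(s.ReadBytes(2), 0); }
    public static int ReadInt(this Stream s) { return BitConverter.ToInt32(s.ReadBytes(4), 0); }
    public static long ReadLong(this Stream s) { return BitConverter.ToInt64(s.ReadBytes(8), 0); }
}

public static class ExpressionHelpers
{
    public static PropertyInfo AsPropertyInfo<T>(this Expression<Func<T, object>> expression)
    {
        Expression body = expression.Body;

        // Value-typed properties are boxed to object, which shows up as a Convert node.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        var member = body as MemberExpression;
        var property = member == null ? null : member.Member as PropertyInfo;
        if (property == null)
            throw new ArgumentException("Expression must be a simple property access", "expression");
        return property;
    }
}

public static class Program
{
    public static void Main()
    {
        var ms = new MemoryStream();
        ms.Write(BitConverter.GetBytes(12345));
        ms.Position = 0;
        Console.WriteLine(ms.ReadInt()); // 12345

        Expression<Func<string, object>> length = s => s.Length;
        Console.WriteLine(length.AsPropertyInfo().Name); // Length
    }
}
```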


The idea is quite simple. For each class we can generate an ideal serializer using expression trees which doesn't require boxing or casting. Reflection is only needed once, at registration time, so we avoid it when actually serializing objects, along with the performance penalty that comes with it. The code above only handles the simple case where a class consists of primitive types (int, long, string), though it'd be easy enough to extend it to support more complex scenarios.
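Stripped of the registration plumbing, the technique boils down to a few calls. The sketch below builds and compiles an encoder for a made-up two-property `Point` class; `Point`, `TreeDemo` and `BuildEncoder` are illustrative names and not part of the serializer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq.Expressions;

public class Point
{
    public byte X { get; set; }
    public byte Y { get; set; }
}

public static class TreeDemo
{
    public static Action<Point, Stream> BuildEncoder()
    {
        // The 'primitive' encoder that every property value is handed to.
        Action<byte, Stream> writeByte = (value, stream) => stream.WriteByte(value);

        ParameterExpression source = Expression.Parameter(typeof(Point), "source");
        ParameterExpression target = Expression.Parameter(typeof(Stream), "stream");
        Expression writer = Expression.Constant(writeByte, typeof(Action<byte, Stream>));

        // One Invoke per property, in the order we want them on the wire.
        var steps = new List<Expression> {
            Expression.Invoke(writer, Expression.Property(source, "Y"), target),
            Expression.Invoke(writer, Expression.Property(source, "X"), target)
        };

        // Roughly equivalent to the hand-written lambda:
        //   (source, stream) => { writeByte(source.Y, stream); writeByte(source.X, stream); }
        return Expression
            .Lambda<Action<Point, Stream>>(Expression.Block(steps), source, target)
            .Compile();
    }

    public static void Main()
    {
        var encode = BuildEncoder();
        var ms = new MemoryStream();
        encode(new Point { X = 1, Y = 2 }, ms);
        Console.WriteLine(BitConverter.ToString(ms.ToArray())); // 02-01
    }
}
```

The compiled delegate is built once and cached, so each later call costs about the same as calling the hand-written lambda.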

The serializer as you see it could not have been written with .NET 3.5, the first release to include expression trees. Some of the key components like BlockExpression were only introduced with .NET 4.0. If your object contains an array which needs to be serialized, you'll need the new IndexExpression too. Sure, it's possible to fake these using some anonymous delegates and Actions, but that's not pretty :)

The total implementation is less than 170 LOC. I'd be willing to bet that with another 100 LOC you could support most constructs. If you're currently a heavy user of reflection to provide object serialization, it's time to update ;)

Tuesday, December 15, 2009

New Year's resolutions

It's tradition in quite a lot of countries to make a New Year's resolution on the 1st of January. Most people forget about them within a few days or weeks. This year, I'll be making one I'm going to keep!

I want to take part in a dancing [0] flash mob whether it's in this country or another.



What ideas do you have? Anything strange, interesting, unusual? Leave a comment and let me know, maybe you have a better idea than being part of a flash mob.

[0] Me and dancing don't get on particularly well, so it'll be an interesting challenge ;)

Sunday, December 06, 2009

Yet another INotifyPropertyChanged with Expression Trees - Part 2

In my last post, I described a method whereby you can implement INotifyPropertyChanged with zero performance overhead and near-zero boilerplate code. The only boilerplate left was the delegate you had to create to invoke the event:

public Book()
{
// Boilerplate - eugh!
Action<string> notify = (propertyName) => {
var h = PropertyChanged;
if (h != null)
h(this, new PropertyChangedEventArgs(propertyName));
};

author = new ChangeNotifier<string> (() => Author, notify);
price = new ChangeNotifier<decimal> (() => Price, notify);
quantity = new ChangeNotifier<int> (() => Quantity, notify);
title = new ChangeNotifier<string> (() => Title, notify);
}

The entire point of my implementation was to avoid writing boilerplate, so this was slightly irritating. Unfortunately, there's no trivial way around the problem as the .NET framework really limits what you can do with events. The first thing you'd think of is "pass the actual object into the ChangeNotifier constructor and just raise the event that way". For example my constructors would change to:

new ChangeNotifier<string>(() => Author, this);

That's well and good, right up until you realise that an event can only be raised from inside the class that declares it.

public class A
{
public event EventHandler MyEvent;
}

public class B
{
public void AccessEvent (A a)
{
// Invalid - you can't raise an event which is declared in another class
a.MyEvent(this, EventArgs.Empty);

// Invalid - you can't copy the event either
EventHandler h = a.MyEvent;
h(this, EventArgs.Empty);
}
}

Another alternative would be to pass the event itself into the ChangeNotifier object:

new ChangeNotifier<string> (() => Author, PropertyChanged);

But this won't work because delegates are immutable: what gets passed is a snapshot of the handler list as it is at that moment (null if nobody has subscribed yet). That means if anyone adds event handlers later on, they won't be invoked when the property changes. So with that stuck firmly in my mind, I never gave much thought to removing that last remaining bit of boilerplate. That's about to change!

What I really want is for my final implementation to look more like this:

public class Book : INotifyPropertyChanged
{
public event PropertyChangedEventHandler PropertyChanged;

ChangeNotifier<string> author;

public string Author
{
get { return author.Value; }
set { author.Value = value; }
}

public Book()
{
author = ChangeNotifier.Create(() => Author, ????);
}
}

That's short and sweet. The generic types should be automatically inferred, you shouldn't have to create the delegate to raise the event, it's beautiful! The only problem is to figure out what I should replace the question marks with. I need something that will allow me to get at the current list of event handlers from outside of the Book object, i.e. something along the lines of this:

Func<PropertyChangedEventHandler> getter = delegate { return PropertyChanged; };
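To see why a getter helps where passing the event directly did not, here is a small self-contained comparison. `Publisher`, `Snapshot` and `Getter` are names made up for this sketch.

```csharp
using System;
using System.ComponentModel;

public class Publisher : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // A copy of whatever handlers were attached at construction time (none, so null).
    public readonly PropertyChangedEventHandler Snapshot;

    // Re-reads the event field every time it is called.
    public readonly Func<PropertyChangedEventHandler> Getter;

    public Publisher()
    {
        Snapshot = PropertyChanged;
        Getter = () => PropertyChanged;
    }
}

public static class SnapshotDemo
{
    public static void Main()
    {
        var publisher = new Publisher();
        publisher.PropertyChanged += (o, e) => Console.WriteLine("Changed: " + e.PropertyName);

        // The handler added after construction is not in the copy...
        Console.WriteLine(publisher.Snapshot == null); // True
        // ...but the getter sees the current handler list.
        Console.WriteLine(publisher.Getter() == null); // False
    }
}
```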

Prettying it up a little, this is how my Book class looks:

public class Book : INotifyPropertyChanged
{
public event PropertyChangedEventHandler PropertyChanged;

ChangeNotifier<string> author;

public string Author {
get { return author.Value; }
set { author.Value = value; }
}

public Book()
{
author = ChangeNotifier.Create (() => Author, () => PropertyChanged);
}
}

Beautiful! The more astute readers might notice a problem at this stage. Fine, the ChangeNotifier object can get the event list and raise the event, but it can't fill in the 'sender' - it has no reference to the 'book' object! Have no fear, it's already taken care of! The getter delegate has a reference to the book object (Delegate.Target), so we can fill everything in perfectly! The final implementation of the ChangeNotifier class is this:
public static class ChangeNotifier
{
public static ChangeNotifier<TValue> Create<TValue>(Expression<Func<TValue>> expression, Func<PropertyChangedEventHandler> notifier)
{
return new ChangeNotifier<TValue>(expression, notifier);
}
}

public class ChangeNotifier<TValue>
{
Func<PropertyChangedEventHandler> notifier;
string propertyName;
TValue value;

public TValue Value {
get { return value; }
set {
if (!EqualityComparer<TValue>.Default.Equals(this.value, value)) {
this.value = value;
// Get the current list of registered event handlers
// then invoke them with the correct 'sender' and event args
PropertyChangedEventHandler h = notifier();
if (h != null)
h(notifier.Target, new PropertyChangedEventArgs(propertyName));
}
}
}

public ChangeNotifier(Expression<Func<TValue>> expression, Func<PropertyChangedEventHandler> notifier)
{
if (expression.NodeType != ExpressionType.Lambda)
throw new ArgumentException("Value must be a lambda expression", "expression");
if (!(expression.Body is MemberExpression))
throw new ArgumentException("The body of the expression must be a memberref", "expression");

MemberExpression m = (MemberExpression)expression.Body;
this.notifier = notifier;
this.propertyName = m.Member.Name;
}
}
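The 'sender' trick depends on what Delegate.Target returns for the getter lambda. Here is a small sketch to check it, with the caveat that where the compiler places a lambda is an implementation detail. As far as I know, a lambda that captures nothing but 'this' is emitted as an instance method on the containing class, so Target is the Book itself. A lambda that also captures a local variable would normally get a compiler-generated closure object as its Target instead. `TargetDemo` and `TargetIsOwner` are names invented for the example.

```csharp
using System;
using System.ComponentModel;

public class Book : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    public readonly Func<PropertyChangedEventHandler> Getter;

    public Book()
    {
        // Captures only 'this', so the delegate's Target is expected to be this Book.
        Getter = () => PropertyChanged;
    }
}

public static class TargetDemo
{
    public static bool TargetIsOwner(Book book)
    {
        return ReferenceEquals(book.Getter.Target, book);
    }

    public static void Main()
    {
        // Prints whether the getter's Target is the Book that created it.
        Console.WriteLine(TargetIsOwner(new Book()));
    }
}
```

If the check ever came back false, for example because the lambda was changed to capture a local, the 'sender' passed to handlers would be the wrong object, so it may be worth keeping the getter lambda as simple as shown.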

I have one final trick up my sleeve. Suppose you have a property (Progress) whose value is calculated from other properties (CurrentStep, TotalSteps) and you want a notification for it whenever any of them changes. Well, that's easy!

public class Worker : INotifyPropertyChanged
{
public event PropertyChangedEventHandler PropertyChanged;

ChangeNotifier<int> currentStep;
ChangeNotifier<int> totalSteps;

public int CurrentStep {
get { return currentStep.Value; }
set { currentStep.Value = value; }
}
public int TotalSteps {
get { return totalSteps.Value; }
set { totalSteps.Value = value; }
}
public double Progress
{
get { return (double)CurrentStep / TotalSteps; }
}

public Worker()
{
Func<PropertyChangedEventHandler> notifier = () => PropertyChanged;

currentStep = ChangeNotifier.Create(() => CurrentStep, notifier);
totalSteps = ChangeNotifier.Create(() => TotalSteps, notifier);

// A PropertyChanged notification will be created for Progress every time
// either the CurrentStep *or* TotalSteps changes.
ChangeNotifier.CreateDependent(
() => Progress,
notifier,
() => CurrentStep,
() => TotalSteps
);
}
}

And the new helper methods are:
public static class ChangeNotifier
{
static string GetPropertyName(Expression expression)
{
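while (!(expression is MemberExpression)) {
if (expression is LambdaExpression)
expression = ((LambdaExpression)expression).Body;
else if (expression is UnaryExpression)
expression = ((UnaryExpression)expression).Operand;
else
// Anything else would loop forever, so fail loudly instead
throw new ArgumentException("The expression must be a property access", "expression");
}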

return ((MemberExpression)expression).Member.Name;
}

public static void CreateDependent<TValue>(Expression<Func<TValue>> property, Func<PropertyChangedEventHandler> notifier, params Expression<Func<object>>[] dependents)
{
// The name of the property which is dependent on the value of other properties
var name = GetPropertyName(property);
// The names of the other properties
var dependentNames = dependents.Select<Expression, string>(GetPropertyName).ToArray();

INotifyPropertyChanged sender = (INotifyPropertyChanged)notifier.Target;
sender.PropertyChanged += (o, e) => {
// If one of our dependents changes, emit a PropertyChanged notification for our property
if (dependentNames.Contains(e.PropertyName)) {
var h = notifier();
if (h != null)
h(o, new PropertyChangedEventArgs (name));
}
};
}

public static ChangeNotifier<TValue> Create<TValue>(Expression<Func<TValue>> expression, Func<PropertyChangedEventHandler> notifier)
{
return new ChangeNotifier<TValue>(expression, notifier);
}
}

The only change is that I need a slightly more complicated way of getting the property name. The dependents are passed as Expression<Func<object>>, so value-typed properties such as int are boxed and end up wrapped in a Convert node (a UnaryExpression).
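The wrapping is easy to see by inspecting the node type of the lambda body. This is a minimal sketch; `Sample`, `ConvertDemo` and `BodyKind` are names invented for the example.

```csharp
using System;
using System.Linq.Expressions;

public class Sample
{
    public int CurrentStep { get; set; }
    public string Name { get; set; }
}

public static class ConvertDemo
{
    // Returns the node type of the lambda body, before any unwrapping.
    public static ExpressionType BodyKind(Expression<Func<object>> expression)
    {
        return expression.Body.NodeType;
    }

    public static void Main()
    {
        var sample = new Sample();

        // int -> object needs boxing, so the property access is wrapped in a Convert node.
        Console.WriteLine(BodyKind(() => sample.CurrentStep)); // Convert
        // string -> object is a plain reference conversion, so no wrapper is added.
        Console.WriteLine(BodyKind(() => sample.Name));        // MemberAccess

        // Unwrapping the Convert node gets us back to the property.
        Expression<Func<object>> boxed = () => sample.CurrentStep;
        var inner = (MemberExpression)((UnaryExpression)boxed.Body).Operand;
        Console.WriteLine(inner.Member.Name);                  // CurrentStep
    }
}
```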

Saturday, December 05, 2009

Yet another INotifyPropertyChanged with Expression Trees

There are dozens of examples out there showing you how to avoid having to refer to property names as strings when implementing INotifyPropertyChanged. The most important reason why you don't want to have to do this is because property names can get refactored but the hardcoded strings might be forgotten. No-one wants to end up getting a Changed notification for a property which doesn't exist.

My issue with all these examples is that none of them thought far enough ahead. Fine, they all show you how to refer to properties without using hardcoded strings but they still require you to write lots of boilerplate code to raise the PropertyChanged event - boilerplate you have to write for every property. What I want is to be able to declare all my properties like:

public string Title {
get { return title; }
set { title = value; }
}

and yet still get my property change notifications. I also want this method to be reasonably high performance. I don't want every property change to have extra memory or CPU overhead as every developer expects that changing the value of a property will not do any complex calculations. So how can I accomplish this?

To start off with, we can all tell that it's impossible to achieve the required behaviour using just the snippet above. We're going to have to add (at least) one additional level of indirection. That means I should be able to implement my requirements using code like:

public string Title {
get { return title.Value; }
set { title.Value = value; }
}

The object 'title' must then contain all the logic required to raise the property changed notification. So what might this magical object look like?

public class ChangeNotifier<TValue>
{
Action<string> notifyHandler;
string propertyName;
TValue value;

public TValue Value {
get { return value; }
set {
if (!EqualityComparer<TValue>.Default.Equals(this.value, value)) {
this.value = value;
notifyHandler(propertyName);
}
}
}


public ChangeNotifier(Expression<Func<TValue>> expression, Action<string> notifyHandler)
{
if (expression.NodeType != ExpressionType.Lambda)
throw new ArgumentException("Value must be a lambda expression", "expression");
if (!(expression.Body is MemberExpression))
throw new ArgumentException("The body of the expression must be a memberref", "expression");

MemberExpression m = (MemberExpression)expression.Body;
this.propertyName = m.Member.Name;
this.notifyHandler = notifyHandler;
}
}

You're probably looking at this thinking "What the hell is this Expression<Func<TValue>> ? How do I even use that monstrosity?". Well... simples!

public class Book : INotifyPropertyChanged
{
public event PropertyChangedEventHandler PropertyChanged;

ChangeNotifier<string> author;
ChangeNotifier<decimal> price;
ChangeNotifier<int> quantity;
ChangeNotifier<string> title;

public string Author {
get { return author.Value; }
set { author.Value = value; }
}
public decimal Price {
get { return price.Value; }
set { price.Value = value; }
}
public int Quantity {
get { return quantity.Value; }
set { quantity.Value = value; }
}
public string Title {
get { return title.Value; }
set { title.Value = value; }
}

public Book()
{
Action<string> notify = (propertyName) => {
var h = PropertyChanged;
if (h != null)
h(this, new PropertyChangedEventArgs(propertyName));
};

author = new ChangeNotifier<string> (() => Author, notify);
price = new ChangeNotifier<decimal> (() => Price, notify);
quantity = new ChangeNotifier<int> (() => Quantity, notify);
title = new ChangeNotifier<string> (() => Title, notify);
}
}

All that happens here is that when constructing the ChangeNotifier object, an Expression referencing the required property is passed into the constructor, along with a delegate which will raise the PropertyChanged event. We parse that expression tree once to retrieve the property name and store it. After that everything Just Works (tm) with little to no performance penalty. The days of writing boilerplate code for INotifyPropertyChanged are gone! You also have the benefit that you can't make a mistake writing the boilerplate code because you don't write it anymore!
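The name lookup can be tried in isolation. This is a minimal sketch of that one step, using a cut-down Book class and a hypothetical `NameOf` helper that isn't part of the ChangeNotifier code:

```csharp
using System;
using System.Linq.Expressions;

public class Book
{
    public string Title { get; set; }

    // Hypothetical helper: pulls the member name out of a lambda like () => Title.
    public static string NameOf<TValue>(Expression<Func<TValue>> expression)
    {
        var member = expression.Body as MemberExpression;
        if (member == null)
            throw new ArgumentException("The body of the expression must be a member access", "expression");
        return member.Member.Name;
    }

    public string TitlePropertyName()
    {
        // If Title is renamed with a refactoring tool, this lambda is renamed with it.
        return NameOf(() => Title);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new Book().TitlePropertyName()); // Title
    }
}
```

Because the name comes from the expression rather than a string literal, a typo in it is a compile error instead of a notification for a property that doesn't exist.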

Friday, December 04, 2009

Can't you feel the Moonlight? Part deux

As I was saying yesterday, the live version of the Silverlight Toolkit site didn't work right in Moonlight. All the pretty charts rendered as you see them below, very empty.

I figured that since a slightly older version worked near-flawlessly, surely I could fix the live version with only a few minor tweaks. It's not like they would've completely rewritten the Chart controls within the space of 1 release.

I checked everything from DataBinding, to TemplateBinding, to Styles, to Measure/Arrange bugs and nothing was showing up as causing the issue. I finally narrowed it down to a bug in VisualStateGroup. For some reason the Name property was empty even though it was declared with a name in XAML.

One. Tiny. Patch. Later.


Success. I can't believe that the bug was that simple. In the end, those bugs are actually by far the worst. There's no exception thrown or any kind of visible indication that something has failed other than an empty screen. The only reason I found the bug was because the toolkit is open source and I was running it locally with a few dozen Console.WriteLines, gradually reducing the area of code where I thought the bug was. Unfortunately this fix arrived too late for the 1.99.9 release, but it will definitely be in the release after it.

Wednesday, December 02, 2009

Can't you feel the moonlight?

It's time for the obligatory screenshots again. This is what the Data Visualization demos from the Silverlight Toolkit (March edition) looked like yesterday:

Note the empty graphs. It doesn't look very pretty now, does it? However, one very minor fix later we now have the following:

Things are near-perfect in all the Data Visualization demos. One graph is missing a background colour and the elements in one graph aren't clickable when they should be. Neither should be particularly difficult to fix, the only problem is figuring out the cause.

Unfortunately the version of the Toolkit Demo on the live site still doesn't render perfectly, but as we already have one version near-perfect, getting a newer revision to work shouldn't be hard! Things are shaping up to give us a great 2.0 release.
