Date: Wednesday, 18 Nov 2009 23:26

Wow, it has been a long time since I have blogged.  Yesterday, we made the first official public release of Rx on devlabs.  And that means that I can now talk about what has been on my mind for the past while: Rx.  What is Rx?  Rx is short for the Reactive Extensions for .NET.  Think of it as a prescription strength library for writing reactive programs in your favorite .NET language.

Today, I want to give you a brief introduction to Rx that complements my screencast.  In subsequent posts, I will dive deep into the semantics of Rx providing detailed explanations and examples.  Hopefully, this will positively impact your day to day work and allow you to have a lot of fun in the process.  Please feel free to ask any questions that you may have or provide feedback about Rx.  This first release is a sort of preview of the technology.  We will provide regular updates that both extend and refine the library and we want you to be a part of that process.

First Steps

Rx is currently available for .NET Framework 3.5 SP1, .NET Framework 4 Beta 2, and Silverlight 3.  Download and install one of these versions and then open Visual Studio.

Create a WPF Application or a Silverlight Application.

The Chrome

When the designer opens, replace the existing Grid control with the following XAML.

<Canvas>
    <TextBlock
        x:Name="textBlock"
        Text="Rx Rocks!" />
    <Image
        x:Name="image"
        Source="
http://upload.wikimedia.org/wikipedia/commons/7/7a/Rx_symbol.png"
        Width="100"
        Height="100"
        Canvas.Left="0"
        Canvas.Top="25" />
</Canvas>

Then switch to the code view.

Starting with Rx

To begin using Rx, first add a reference to System.Reactive and System.CoreEx.  The two other assemblies included with this release are System.Threading (task parallel library) and System.Interactive (more enumerable goodness).

Once you have added references to System.Reactive and System.CoreEx, you can use the reactive operators found in Observable.

Observable Sequences

Observables, like enumerables, are sequences.  The only difference is that enumerables are pull-based while observables are push-based.  Both are possibly empty sequences of values optionally followed by either normal termination or abnormal termination.

When you want to enumerate an enumerable:

1.  Get the enumerator from the enumerable

2.  Call MoveNext and call Current on the enumerator for each value in the sequence

3.  The sequence ends when MoveNext returns false (normal termination) or MoveNext or Current throws an exception (abnormal termination)

The user writes a foreach statement over an enumerable sequence.

foreach (var x in xs)
    F(x);

With enumerables, each value is pulled out of the source and bound to the iteration variable before executing the user code.  Note that the consumer controls this process.

When you want to observe an observable sequence:

1.  Subscribe an observer to the observable

2.  OnNext is called on the observer for each value in the sequence

3.  The sequence ends when OnCompleted (normal termination) or OnError (abnormal termination) is called on the observer

The user writes a subscribe statement over an observable sequence.

xs.Subscribe(x => F(x));

With observables, each value is pushed out of the source and bound to the handler variable before executing the user code.  Note that the producer controls this process.
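
For reference, here is a minimal sketch of the shape of the two interfaces behind this duality (the members in the released bits may differ in detail; the handler-taking Subscribe used above is layered on top of this shape).

public interface IObservable<T>
{
    // Subscribing hands an observer to the source; disposing the result unsubscribes.
    IDisposable Subscribe(IObserver<T> observer);
}

public interface IObserver<T>
{
    void OnNext(T value);          // a value in the sequence
    void OnError(Exception error); // abnormal termination
    void OnCompleted();            // normal termination
}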

Hello, World!

Let’s write our first Rx application, the obligatory “Hello, World!” application.

Remember all those times that you wrote UI applications with asynchronous operations and got the dreaded cross-thread operation exception?

Well, Rx makes this simpler, even when a lot of asynchronous and event-based operations are going on at once.

The user may specify the default SynchronizationContext to which Rx sends messages.  By default, Rx uses a reactive event loop, except in Silverlight, where it uses the dispatcher.  The user can change this by setting Observable.Context to the appropriate SynchronizationContext.  Rx provides a few commonly used synchronization contexts in the class SynchronizationContexts.

Since we are using WPF, we should set the default context to the dispatcher context.

Observable.Context = SynchronizationContexts.CurrentDispatcher;

Now, we can use the UI without getting cross-thread exceptions.

Rx includes a number of basic operations for constructing observable sequences.  For example, Return creates a singleton sequence that contains an OnNext message with the specified value followed by an OnCompleted message.  We can then subscribe to the observable using a value handler which specifies what to do with each OnNext message from the observable sequence.

var helloWorld = Observable.Return("Hello, World!");

helloWorld.Subscribe(value => textBlock.Text = value);

When we run the application, we see that the text is set to “Hello, World!”.

A Second Step

Another basic operation for constructing observable sequences is the Interval operator.  It creates an observable sequence that sends an OnNext message each time the specified time interval elapses.  For example, we can create an observable sequence that sends a message every second.

var timer = Observable.Interval(TimeSpan.FromSeconds(1));

timer.Subscribe(value => textBlock.Text = value.ToString());

Wrapping .NET Events

Another source of observable sequences is existing .NET events and asynchronous computations.  We can bring these existing reactive sources into Rx by using conversions.  For example, we can convert .NET events to observable sequences using the FromEvent operator.  The user must specify the type of EventArgs for the event, the target, and the event name.  We can use this to convert the MouseDown event on the image to an observable sequence of events.

var mouseDown = Observable.FromEvent<MouseButtonEventArgs>(image, "MouseDown");

LINQ to Rx

Observable contains an implementation of the standard query operators for observable sequences, which means that we can use LINQ to write queries over observable sequences.  For example, we can modify our mouseDown observable sequence to contain a sequence of position values by projecting each event to the current mouse position relative to the image.

var mouseDown = from evt in Observable.FromEvent<MouseButtonEventArgs>(image, "MouseDown")
                select evt.EventArgs.GetPosition(image);

Composition

Rx also contains a number of operators to compose together observable sequences in a way very similar to how LINQ to Objects composes together enumerable sequences.  For example, the Until operator takes all of the messages from the left source until an OnNext message occurs on the right source, at which time it unsubscribes from both the left and the right source.

var q = timer.Until(mouseDown);

q.Subscribe(value => textBlock.Text = value.ToString());

Drag and Drop

We can put together these simple pieces and build drag and drop.  In the screencast, I wrote drag and drop using the mouse deltas; here I will write it using the mouse offset to show another way to do the same thing.  As you begin to use Rx, you will find that you can often simplify your queries repeatedly, and if you do it right then at the end you will get a query that looks remarkably like the specification of the problem.

What is the specification of drag and drop?  Every time the user clicks down on the image, we want to move the image relative to where she clicked down until the mouse button goes up.

Therefore, we need three events: the mouse down on the image, the mouse move on the window, and the mouse up on the window.  For the mouse down, we need the position relative to the image, for the mouse move, we need the position relative to the window, and for the mouse up we do not need the position.

var mouseDown = from evt in Observable.FromEvent<MouseButtonEventArgs>(image, "MouseDown")
                select evt.EventArgs.GetPosition(image);

var mouseUp = Observable.FromEvent<MouseButtonEventArgs>(this, "MouseUp");

var mouseMove = from evt in Observable.FromEvent<MouseEventArgs>(this, "MouseMove")
                select evt.EventArgs.GetPosition(this);

Now, we can write a query that expresses our intent.  When the mouse goes down grab the offset of the mouse relative to the image, and then listen to the mouse moves relative to the window until the mouse goes up.  For each mouse move during this time, compute the image position as the mouse position subtracted by the original offset relative to the image.

var q = from imageOffset in mouseDown
        from pos in mouseMove.Until(mouseUp)
        select new { X = pos.X - imageOffset.X, Y = pos.Y - imageOffset.Y };

Finally, we can subscribe to the observable sequence of positions and update the position of the image.

q.Subscribe(value =>
                {
                    Canvas.SetLeft(image, value.X);
                    Canvas.SetTop(image, value.Y);
                });

Looking Forward

Drag and drop is fun, but it is only one small example of the power of Rx.  Next time, we will start our deep dive into the semantics of Rx.  Until then take the time to download it and take it for a test drive.

Author: "wesdyer" Tags: "Rx"
Date: Friday, 11 Jan 2008 03:06

If the word "continuation" causes eyes to glaze over, then the word "monad" induces mental paralysis.  Perhaps, this is why some have begun inventing more benign names for monads.

These days, monads are the celebrities of programming language theory.  They gloss the cover of blogs and have been compared to everything from boxes of fruit to love affairs.  Nerds everywhere are exclaiming that the experience of understanding monads causes a pleasantly painful mental sensation.

Like continuations, monads are simpler than they sound and are very useful in many situations.  In fact, programmers write code in a variety of languages that implicitly use common monads without even breaking a sweat.

With all of the attention that monads get, why am I writing yet another explanation of monads?  Not to compare them to some everyday occurrence or to chronicle my journey to understanding.  I explain monads because I need monads.  They elegantly solve programming problems in a number of languages and contexts.

Introducing Monads

Monads come from category theory.  Moggi introduced them to computer scientists to aid in the analysis of the semantics of computations.  In an excellent paper, The Essence of Functional Programming, Wadler showed that monads are generally useful in computer programs to compose together functions which operate on amplified values rather than values.  Monads became an important part of the programming language Haskell where they tackle the awkward squad: IO, concurrency, exceptions, and foreign-function calls.

Monads enjoy tremendous success in Haskell, but like an actor who does well in a particular role, monads are now stereotyped in the minds of most programmers as useful only in pure lazy functional languages.  This is unfortunate, because monads are more broadly applicable.

Controlling Complexity

Composition is the key to controlling complexity in software.  In The Structure and Interpretation of Computer Programs, Abelson and Sussman argue that composition beautifully expresses complex systems from simple patterns.

In our study of program design, we have seen that expert programmers control the complexity of their designs with the same general techniques used by designers of all complex systems. They combine primitive elements to form compound objects, they abstract compound objects to form higher-level building blocks, and they preserve modularity by adopting appropriate large-scale views of system structure.

One form of composition, function composition, succinctly describes the dependencies between function calls.  Function composition takes two functions and plumbs the result from the second function into the input of the first function, thereby forming one function.

public static Func<T, V> Compose<T, U, V>(this Func<U, V> f, Func<T, U> g)
{
    return x => f(g(x));
}

For example, instead of applying g to the value x and then applying f to the result, compose f with g and then apply the result to the value x.  The key difference is the abstraction of the dependency between f and g.

var r = f(g(x));         // without function composition
var r = f.Compose(g)(x); // with function composition

Given the function Identity, function composition must obey three laws.

public static T Identity<T>(this T value)
{
    return value;
}

1.  Left identity

     Identity.Compose(f) = f

2.  Right identity

     f.Compose(Identity) = f

3.  Associative

     f.Compose(g.Compose(h)) = (f.Compose(g)).Compose(h)

Often, values are not enough.  Constructed types amplify values.  The type IEnumerable<T> represents a lazily computed list of values of type T.  The type Nullable<T> represents a possibly missing value of type T.  The type Func<Func<T, Answer>, Answer> represents a function, which returns an Answer given a continuation, which takes a T and returns an Answer.  Each of these types amplifies the type T.

Suppose that instead of composing functions which return values, we compose functions which take values and return amplified values.  Let M<T> denote the type of the amplified values.

public static Func<T, M<V>> Compose<T, U, V>(this Func<U, M<V>> f, Func<T, M<U>> g)
{
    return x => f(g(x)); // error, g(x) returns M<U> and f takes U
}

Function composition fails, because the return and input types do not match.  Composition with amplified values requires a function that accesses the underlying value and feeds it to the next function.  Call that function "Bind" and use it to define function composition.

public static Func<T, M<V>> Compose<T, U, V>(this Func<U, M<V>> f, Func<T, M<U>> g)
{
    return x => Bind(g(x), f);
}

The input and output types determine the signature of Bind.  Therefore, Bind takes an amplified value, M<U>, and a function from U  to M<V>, and returns an amplified value, M<V>.

public static M<V> Bind<U, V>(this M<U> m, Func<U, M<V>> k)

The body of Bind depends on the mechanics of the amplified values, M<T>.  Each amplified type will need a distinct definition of Bind.

In addition to Bind, define a function which takes an unamplified value and amplifies it.  Call this function "Unit".

public static M<T> Unit<T>(this T value)

Together the amplified type, M<T>, the function Bind, and the function Unit enable function composition with amplified values.

Meet the Monads

Voilà, we have invented monads.

Monads are a triple consisting of a type, a Unit function, and a Bind function.  Furthermore, to be a monad, the triple must satisfy three laws:

1.  Left Identity

     Bind(Unit(e), k) = k(e)

2.  Right Identity

     Bind(m, Unit) = m

3.  Associative

     Bind(m, x => Bind(k(x), y => h(y))) = Bind(Bind(m, x => k(x)), y => h(y))

The laws are similar to those of function composition.  This is not a coincidence.  They guarantee that the monad is well behaved and composition works properly.

To define a particular monad, the writer supplies the triple, thereby specifying the mechanics of the amplified values.

The Identity Monad

The simplest monad is the Identity monad.  The type represents a wrapper containing a value.

class Identity<T>
{
    public T Value { get; private set; }
    public Identity(T value) { this.Value = value; }
}

The Unit function takes a value and returns a new instance of Identity, which wraps the value.

static Identity<T> Unit<T>(T value)
{
    return new Identity<T>(value);
}

The bind function takes an instance of Identity, unwraps the value, and invokes the delegate, k, with the unwrapped value.  The result is a new instance of Identity.

static Identity<U> Bind<T,U>(Identity<T> id, Func<T,Identity<U>> k)
{
    return k(id.Value);
}

Consider a simple program that creates two Identity wrappers and performs an operation on the wrapped values.  First, bind x to the value within the wrapper containing the value five.  Then, bind y to the value within the wrapper containing the value six.  Finally, add the values, x and y, together.  The result is an instance of Identity wrapping the value eleven.

var r = Bind(Unit(5), x =>
            Bind(Unit(6), y =>
                Unit(x + y)));

Console.WriteLine(r.Value);

While this works, it is rather clumsy.  It would be nice to have syntax for dealing with the monad.  Fortunately, we do.

C# 3.0 introduced query comprehensions which are actually monad comprehensions in disguise.  We can rewrite the identity monad to use LINQ.  Perhaps, it should have been called LINM (Language INtegrated Monads), but it just doesn't have the same ring to it.

Rename the method Unit to ToIdentity and Bind to SelectMany.  Then, make them both extension methods.

public static Identity<T> ToIdentity<T>(this T value)
{
    return new Identity<T>(value);
}

public static Identity<U> SelectMany<T, U>(this Identity<T> id, Func<T, Identity<U>> k)
{
    return k(id.Value);
}

The changes impact the calling code.

var r = 5.ToIdentity().SelectMany(
            x => 6.ToIdentity().SelectMany(
                y => (x + y).ToIdentity()));

Console.WriteLine(r.Value);

Equivalent methods are part of the standard query operators defined for LINQ.  However, the standard query operators also include a slightly different version of SelectMany for performance reasons.  It combines Bind with Unit, so that lambdas are not deeply nested.  The signature is the same except for an extra argument that is a delegate which takes two arguments and returns a value.  The delegate combines the two values together.  This version of SelectMany binds x to the wrapped value, applies k to x, binds the result to y, and then applies the combining function, s, to x and y.  The resultant value is wrapped and returned.

public static Identity<V> SelectMany<T, U, V>(this Identity<T> id, Func<T, Identity<U>> k, Func<T,U,V> s)
{
    return id.SelectMany(x => k(x).SelectMany(y => s(x, y).ToIdentity()));
}

Of course, we can remove some of the code from the generalized solution by using our knowledge of the Identity monad.

public static Identity<V> SelectMany<T, U, V>(this Identity<T> id, Func<T, Identity<U>> k, Func<T,U,V> s)
{
    return s(id.Value, k(id.Value).Value).ToIdentity();
}

The call-site does not need to nest lambdas.

var r = 5.ToIdentity()
         .SelectMany(x => 6.ToIdentity(), (x, y) => x + y);

Console.WriteLine(r.Value);

With the new definition of SelectMany, programmers can use C#'s query comprehension syntax.  The from notation binds the introduced variable to the value wrapped by the expression on the right.  This allows subsequent expressions to use the wrapped values without directly calling SelectMany.

var r = from x in 5.ToIdentity()
        from y in 6.ToIdentity()
        select x + y;

Since the original SelectMany definition corresponds directly to the monadic Bind function and because the existence of a generalized transformation has been demonstrated, the remainder of the post will use the original signature.  But, keep in mind that the second definition is the one used by the query syntax.

The Maybe Monad

The Identity monad is an example of a monadic container type where the Identity monad wrapped a value.  If we change the definition to contain either a value or a missing value then we have the Maybe monad.

Again, we need a type definition.  The Maybe type is similar to the Identity type but adds a property denoting whether a value is missing.  It also has a predefined instance, Nothing, representing all instances lacking a value.

class Maybe<T>
{
    public readonly static Maybe<T> Nothing = new Maybe<T>();
    public T Value { get; private set; }
    public bool HasValue { get; private set; }
    Maybe()
    {
        HasValue = false;
    }
    public Maybe(T value)
    {
        Value = value;
        HasValue = true;
    }
}

The Unit function takes a value and constructs a Maybe instance, which wraps the value.

public static Maybe<T> ToMaybe<T>(this T value)
{
    return new Maybe<T>(value);
}

The Bind function takes a Maybe instance and if there is a value then it applies the delegate to the contained value.  Otherwise, it returns Nothing.

public static Maybe<U> SelectMany<T, U>(this Maybe<T> m, Func<T, Maybe<U>> k)
{
    if (!m.HasValue)
        return Maybe<U>.Nothing;
    return k(m.Value);
}

The programmer can use the comprehension syntax to work with the Maybe monad.  For example, create an instance of Maybe containing the value five and add it to Nothing.

var r = from x in 5.ToMaybe()
        from y in Maybe<int>.Nothing
        select x + y;

Console.WriteLine(r.HasValue ? r.Value.ToString() : "Nothing");

The result is "Nothing".  We have implemented the null propagation of nullables without explicit language support.
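
As an aside, the same Bind can be written directly against the built-in Nullable<T>.  Below is a sketch (the combining form, so that the query syntax works; restricted to value types):

public static V? SelectMany<T, U, V>(this T? m, Func<T, U?> k, Func<T, U, V> s)
    where T : struct
    where U : struct
    where V : struct
{
    if (!m.HasValue)
        return null;
    var u = k(m.Value);
    if (!u.HasValue)
        return null;
    return s(m.Value, u.Value);
}

int? five = 5;
int? nothing = null;

var r = from x in five
        from y in nothing
        select x + y;     // r is null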

The List Monad

Another important container type is the list type.  In fact, the list monad is at the heart of LINQ.  The type IEnumerable<T> denotes a lazily computed list.

The Unit function takes a value and returns a list, which contains only that value.

public static IEnumerable<T> ToList<T>(this T value)
{
    yield return value;
}

The Bind function takes an IEnumerable<T>, a delegate, which takes a T and returns an IEnumerable<U>, and returns an IEnumerable<U>.

public static IEnumerable<U> SelectMany<T, U>(this IEnumerable<T> m, Func<T, IEnumerable<U>> k)
{
    foreach (var x in m)
        foreach (var y in k(x))
            yield return y;
}

Now, the programmer can write the familiar query expressions with IEnumerable<T>.

var r = from x in new[] { 0, 1, 2 }
        from y in new[] { 0, 1, 2 }
        select x + y;

foreach (var i in r)
    Console.WriteLine(i);

Remember that it is the monad that enables the magic.

The Continuation Monad

The continuation monad answers the question that was posed at the end of the last post: how can a programmer write CPS code in a more palatable way?

The type of the continuation monad, K, is a delegate which when given a continuation, which takes an argument and returns an answer, will return an answer.

delegate Answer K<T,Answer>(Func<T,Answer> k);

The type K fundamentally differs from types Identity<T>, Maybe<T>, and IEnumerable<T>.  All the other monads represent container types and allow computations to be specified in terms of the values rather than the containers, but the continuation monad contains nothing.  Rather, it composes together continuations the user writes.

To be a monad, there must be a Unit function which takes a T and returns a K<T,Answer> for some answer type.

public static K<T, Answer> ToContinuation<T, Answer>(this T value)

What should it do?  When in doubt, look to the types.   The method takes a T and returns a function, which takes a function from T to Answer, and returns an Answer.  Therefore, the method must return a function and the only argument of that function must be a function from T to Answer.  Call the argument c.

return (Func<T, Answer> c) => ...

The body of the lambda must return a value of type Answer.  Values of type Func<T,Answer> and a T are available.  Apply c to value and the result is of type Answer.

return (Func<T, Answer> c) => c(value);

To be a monad, Bind must take a K<T,Answer> and a function from T to K<U, Answer> and return a K<U, Answer>.

public static K<U, Answer> SelectMany<T, U, Answer>(this K<T, Answer> m, Func<T, K<U, Answer>> k)

But what about the body?  The result must be of type K<U, Answer>, but how is a result of the correct type formed?

Expand K's definition to gain some insight.

return type

     Func<Func<U, Answer>, Answer>

m's type

     Func<Func<T, Answer>, Answer>

k's type

     Func<T, Func<Func<U, Answer>, Answer>>

Applying k to a value of type T results in a value of type K<U,Answer>, but no value of type T is available.  Build the return type directly by constructing a function, which takes a function from U to Answer.  Call the parameter c.

return (Func<U,Answer> c) => ...

The body must be of type Answer so that the return type of Bind is K<U,Answer>.  Perhaps, m could be applied to a function from T to Answer.  The result is a value of type Answer.

return (Func<U,Answer> c) => m(...)

The expression inside the invocation of m must be of type Func<T,Answer>.  Since there is nothing of that type, construct the function by creating a lambda with one parameter, x, of type T.

return (Func<U,Answer> c) => m((T x) => ...)

The body of this lambda must be of type Answer.  Values of type T, Func<U,Answer>, and Func<T,Func<Func<U,Answer>, Answer>> haven't been used yet.  Apply k to x.

return (Func<U,Answer> c) => m((T x) => k(x)...)

The result is a value of type Func<Func<U,Answer>,Answer>.  Apply the result to c.

return (Func<U,Answer> c) => m((T x) => k(x)(c));

The continuation monad turns the computation inside out.  The comprehension syntax can be used to construct continuations.

Construct a computation, which invokes a continuation with the value seven.  Pass this computation to another computation, which invokes a continuation with the value six.   Pass this computation to another computation, which invokes a continuation with the result of adding the results of the first two continuations together.  Finally, pass a continuation, which replaces "1"s with "a"s, to the result.

var r = from x in 7.ToContinuation<int,string>()
        from y in 6.ToContinuation<int,string>()
        select x + y;

Console.WriteLine(r(z => z.ToString().Replace('1', 'a'))); // displays a3

The continuation monad does the heavy-lifting of constructing the continuations.
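
One detail: just as with the Identity monad, the query syntax above relies on the combining form of SelectMany.  Here is a sketch of it for K, obtained by the same generalized transformation shown earlier.

public static K<V, Answer> SelectMany<T, U, V, Answer>(this K<T, Answer> m, Func<T, K<U, Answer>> k, Func<T, U, V> s)
{
    return m.SelectMany(x => k(x).SelectMany(y => s(x, y).ToContinuation<V, Answer>()));
}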

Monadic Magic

Beautiful composition of amplified values requires monads.  The Identity, Maybe, and IEnumerable monads demonstrate the power of monads as container types.  The continuation monad, K, shows how monads can readily express complex computation.

Stay tuned for more with monads.  Until then, see what monads can do for you.

Author: "wesdyer" Tags: "Functional Programming, Theory"
Date: Saturday, 22 Dec 2007 09:51

There are some technical words that cause quite a stir even amongst geeks.  When someone says the word "continuation", people's eyes glaze over and they seek the first opportunity to change the subject.  The stir is caused because most people don't understand what a continuation is or why someone would want to use one.  This is unfortunate, because they really are simpler than they sound.

Defining Continuations

A continuation represents the remainder of a computation given a point in that computation.  For example, suppose that two methods are defined: M and Main.  The method Main invokes M and then writes "Done" to the console.  The method M assigns x the value five, invokes F, increments x, and writes x to the console.

static void M()
{
    var x = 5;
    F();  // <----------- Point in the computation
    ++x;
    Console.WriteLine(x);
}

static void Main()
{
    M();
    Console.WriteLine("Done");
}

The continuation of the computation at the invocation of F is the remainder of the program execution beginning with incrementing x in the method M.  In this case, the continuation includes incrementing x, writing x, returning to Main, and writing "Done" to the console.

Some languages give programmers explicit access to continuations.  For example, Scheme has a function which calls a function passing a function representing the current continuation.  The function is aptly named call with current continuation or call/cc.  If such a function existed in .NET it might return a T given a delegate that returns a T given a delegate from T to T.

T CallCC<T>(Func<Func<T,T>, T> f)

Continuations are the functional counterparts of GOTOs both in power and inscrutability.  They can express arbitrary control flow like coroutines and exceptions while bewildering some of the brightest programmers.

Returning Values with Continuations

The simplest use of a continuation is to simulate returning out of a function.  Suppose that .NET had the function CallCC.  If Main calls a function Foo passing the value four and if Foo immediately invokes CallCC with a lambda that binds Return to the continuation at the point of the call to CallCC, then when Return is invoked inside of the lambda with the value four, computation will immediately jump out of the lambda to the point after CallCC but before the return with the value four on the stack.  This happens regardless of what occurs after the invocation of Return inside the lambda.

static int Foo(int n)
{
    return CallCC<int>(Return =>
        {
            Return(n);
            return n + 1;
        });
}

static void Main()
{
    Foo(4);
}

The result of running the program is therefore four and not five.

All of this demonstrates that returning from a function is equivalent to invoking the continuation defined at the function's call-site.  When a function "returns", it implicitly invokes the continuation of its call-site.

When people explain continuations, they usually discuss stack frames and instruction pointers.  It is easy to see why.  The implicit invocation of the "return" continuation restores the stack frame at the call-site and sets the instruction pointer to the instruction immediately after the call-site.  This is what invoking a continuation does: restore the appropriate stack frame and set the instruction pointer.

Continuation-Passing Style

It is still possible to use continuations if a language does not support a function like call/cc.  A programmer can explicitly construct the continuation of a computation and pass it directly to a function.

To illustrate this transformation, suppose that a function, Identity, is defined that returns the value it is given.

static T Identity<T>(T value)
{
    return value;
}

The return statement in Identity implicitly invokes the continuation of the call-site causing computation to leave Identity and resume at the point immediately after the invocation of Identity at the call-site.  If Main contains only a call to WriteLine passing the result of the call to Identity, then computation resumes with the call to WriteLine.

static void Main()
{
    Console.WriteLine(Identity("foo"));
}

However, instead of implicitly invoking Identity's continuation, the programmer can pass the continuation to Identity explicitly.  An extra argument is added to Identity which is a void-returning delegate with one parameter that is the same type as the return type of the former Identity function.

static void Main()
{
    Identity("foo", s => Console.WriteLine(s));
}

static void Identity<T>(T value, Action<T> k)
{
    k(value);
}

At the call-site, the remainder of the computation has moved into the lambda passed to Identity.  The continuation is explicitly passed.  On the callee's side, the return statement is replaced by the invocation of the continuation parameter, k, with the return value.

This pattern is called continuation-passing style or CPS.

Converting to CPS

Now that we know enough to be dangerous, let's see what we can do.  Consider the typical Max function.

static int Max(int n, int m)
{
    if (n > m)
        return n;
    else
        return m;
}

To convert this function to CPS, follow these steps:

1.  Change the return type to void

2.  Add an extra argument of type Action<T> where T is the original return type

3.  Replace all return statements with invocations of the new continuation argument passing the expression used in the return statement

Applying these steps to the integer version of Max, the function now returns void, takes an extra parameter, k, of type Action<int>, and invokes k everywhere a return statement appeared.

static void Max(int n, int m, Action<int> k)
{
    if (n > m)
        k(n);
    else
        k(m);
}

To convert the call-site to CPS:

1.  Remove all of the remaining computation after the call-site

2.  Put the remaining computation in the body of the lambda corresponding to the continuation argument

For example, suppose the user defined a method Main which invokes WriteLine with the result of applying Max to three and four.

static void Main()
{
    Console.WriteLine(Max(3, 4));
}

The remaining computation after the invocation of Max is the call to WriteLine.  Therefore, the call to WriteLine is moved into the lambda representing the continuation passed to Max.

static void Main()
{
    Max(3, 4, x => Console.WriteLine(x));
}

The steps for converting the call-site skirted the issue of what to do with return statements in the continuation.  For example, suppose there are three methods Main, F, and G where Main calls F and F calls G.  To convert F and G to CPS, follow the same transformation steps as with Max.

static void Main()
{
    Console.WriteLine(F(1) + 1);
}

static int F(int n)
{
    return G(n + 1) + 1;
}

static int G(int n)
{
    return n + 1;
}

However, what should the transformation do with the return statement in F?  The continuation passed to G should definitely not attempt to return a result because the continuation is a void-returning delegate.

static void F(int n, Action<int> k)
{
    G(n + 1, x => { return x + 1; });  // error, Action<int> cannot return a value
}

That wouldn't make any sense.  Delegates with void return-types cannot return values.  Furthermore, the code completely ignores k.

Instead, the return statement should be transformed into an invocation of k.

static void Main()
{
    F(1, x => Console.WriteLine(x + 1));
}

static void F(int n, Action<int> k)
{
    G(n + 1, x => k(x + 1));
}

static void G(int n, Action<int> k)
{
    k(n + 1);
}

This is how functions that use explicit continuations are composed together.

What about recursive functions like Factorial?

static void Main()
{
    Console.WriteLine(Factorial(5));
}

static int Factorial(int n)
{
    if (n == 0)
        return 1;
    else
        return n * Factorial(n - 1);
}

Recursive functions can be transformed just as easily following the same steps.

static void Main()
{
    Factorial(5, x => Console.WriteLine(x));
}

static void Factorial(int n, Action<int> k)
{
    if (n == 0)
        k(1);
    else
        Factorial(n - 1, x => k(n * x));
}

CPS turns a program inside-out.  In the process, the programmer may feel that his brain has been turned inside-out as well.

Why CPS?

What exactly can a programmer do with CPS besides show off at parties?

Compilers use a more thorough CPS transformation to produce an intermediate form amenable to many analyses.  UI frameworks use CPS to keep the UI responsive while allowing nonlinear program interaction.  Web servers use CPS to allow computation to flow asynchronously across pages.

Most programmers have used functions which take a callback.  Often, the callback is the code that is invoked upon completion of the function.  In these cases, the callback is an explicitly-passed continuation.
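
For instance, here is a sketch of a hypothetical callback-based helper (the name and the use of System.Net and System.Threading are mine, not part of any particular API); the onComplete parameter is an explicitly-passed continuation.

// Assumes using System; using System.Net; using System.Threading;
static void DownloadLength(string address, Action<int> onComplete)
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        using (var client = new WebClient())
        {
            var content = client.DownloadString(address);
            onComplete(content.Length);   // "return" by invoking the continuation
        }
    });
}

static void Main()
{
    // The remainder of the computation lives in the continuation.
    DownloadLength("http://example.com/", length => Console.WriteLine(length));
    Console.ReadLine();   // keep the process alive for the thread-pool callback
}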

Asynchronous Calls

Hiding network latency requires asynchronous calls.  In the first technology preview, Volta allows programmers to add asynchronous versions of methods on tier boundaries.

[RunAtOrigin]
static class C
{
    public static int F(string s)
    {
        return s != null ? s.Length : 0;
    }
}

To make a method asynchronous, define the CPS-equivalent method signature and annotate it with the Async attribute.  Volta will generate the body and modify the call-sites accordingly.

[Async]
public static void F(string s, Action<int> k);

If the programmer invokes an asynchronous method F, then Volta will launch the invocation on another thread and invoke the continuation upon completion of the call to F.

C.F("foo", x => Console.WriteLine(x));

Wrapping it Up

Continuations are powerful.  CPS gives programmers a way to use continuations in their day-to-day work.  Perhaps, someday soon we'll discuss how to make CPS more palatable, but we will need to discuss the "M" word first.

Author: "wesdyer" Tags: "C#, Functional Programming, Volta"
Date: Friday, 07 Dec 2007 22:42

It was spring 2003, I had just finished a weekend camping in the southern Arizona desert.  I was dusty and physically exhausted from hours of playing paintball.  For those who have never been in those parts, imagine long straight roads with dry sage brush, painful cactus, and jagged mountains.  I needed a book to pass the time during the drive.  Fortunately, I had brought along "Extreme Programming Explained" by Kent Beck.  After having spent too much time in projects that either completely lacked process or had way too much of it, Beck's position seemed like a revelation.  Over the next few months, I became acquainted with test-driven development, design patterns, refactoring and many of the other topics typically associated with agile programming.

Test Driven Development

In the spring, I had a course that ended up just teaching agile programming concepts.  One of the things that the professor emphasized was following the test-driven development (TDD) process to the letter.

The TDD Process:

1.  Add a test

2.  Run all tests and see the new one fail

3.  Write some code

4.  Run the automated tests and see them succeed 

5.  Refactor code

Repeat

When I say refactor, I mean it in the strictest sense: changing how a program is arranged internally without changing the semantics of the program at all.  This of course ignores the fact that things like timing, working set, or stack size may change, which can be material to the run-time behavior of the program.

During step three a programmer should remember, "It is important that the code written is only designed to pass the test; no further (and therefore untested) functionality should be predicted and 'allowed for' at any stage".

Machine Learning

I fully bought into this rigorous approach to TDD until one summer day when I was reading through Tom Mitchell's marvelous book "Machine Learning".  The book begins by defining machine learning:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Intuitively, the learning process is trying to learn some unknown concept.  The learner has access to its past experiences (the training data) and uses them to generate a hypothesis of the unknown concept.  This hypothesis is then evaluated and the new experiences are added to the training data.  The learning process then repeats itself as the learner forms a better approximation of the underlying concept. 

TDD is a learning process, where the training experience E is the automated tests, the task T is improving the quality of the program, and the performance measure P is the percentage of tests passed.  In this view of TDD, the programmer continues to hack away at his program generating hypotheses, where each hypothesis is a version of the code, until one satisfies the data.  Note that unless the tests exhaustively specify the desired behavior of a program, the tests are not the target concept that needs to be learned but rather a few data points where the target concept is manifest.  Furthermore, the program should not only do well against the tests but also against real world scenarios which may not be covered by the tests.

Bias and Learning

A learner is characterized by its hypothesis space and how it searches through that space.  Notice that since each learner has a hypothesis space, a given concept may not be in that space.  One solution to this problem is to make a learner's hypothesis space include all hypotheses, but this leads to a fundamental problem: "a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances."  The only thing that an unbiased learner can do is regurgitate the training data because it has no ability to make the assumptions and logical leaps necessary to deal with unseen data.

For example, suppose that I am trying to learn to classify animals as either mammals or non-mammals and suppose that I make no assumptions whatsoever in the process of learning this task.  If I am told that giraffes, lions, mice, bats, and whales are mammals but snakes, butterflies, frogs, lobsters, and  penguins are not, then I only have the ability to answer questions regarding those animals for which I already know the answer.  I could have done much better by trying to guess at what makes an animal a mammal even if I got it wrong!

Returning to TDD, note that at no time is a programmer supposed to write code that generalizes to the target concept.  The programmer should only write either the minimal code to make a test pass or refactor the code and not change the semantics.  So according to the pure TDD approach, the program would only learn to regurgitate the training data and do who knows what against new examples.  A critic might respond that more test cases are needed.  In some cases where the input space is finite and small, this might well help, but in general this is impractical.

Sufficiently Large Projects

A similar situation arises in projects that are sufficiently large.  These projects are characterized by the fact that they cannot be held inside any one person's head.  This means that when someone is making a change, they cannot know all of the consequences of their change.  Often in projects of this size, when people ask if a particular change is good, the response is, "It passed the tests."  This leads to a state where the development process as a whole is a learning process that uses the test cases as the training data and generates programs as hypotheses that are evaluated against the training data.  Most likely, the target concept is not normally fully determined by the tests.

You know the projects that I am talking about, and chances are that you have seen some code that was added in just to satisfy some test but didn't really capture the essence of what the test was originally trying to make the program do.  While tests are invaluable to programmers for debugging purposes, it would be better if programmers returned to the specification in order to understand what they are supposed to develop instead of returning to the tests.

Testing practices like clean-room testing and various forms of white box testing address the programmer bias deficiency, but it can also be addressed automatically.

Testing an Infinite Space

I had a development lead who liked to say that testing is just sampling.  In a sense he is right, since a test is a particular input sequence given to a program from the set of all possible input sequences.  Think of a program as a state machine encoded in your favorite process calculus.  A test then verifies that a particular path of edges corresponding to the transitions based on the input ends in a particular state.  Most interesting programs have an infinite number of possible input sequences because of their recursive nature and so the set of tests must be sampled from an infinite input space.

Testing creates tests by drawing input sequences out of the input sequence space.  It is easy to imagine that tests could be generated randomly by selecting random input sequences.  Random selection has several advantages.  First, it enables statistical measures of quality.  For example, if we sample uniformly then the pass/fail rate should be representative of the whole given a large enough sample.  We could also keep generating more and more tests to improve our confidence that the program is indeed correct.  Second, random tests have the added benefit of not allowing the programmers to just "pass the tests" since the programmers cannot anticipate what tests will be run.  This forces the programmers to generate a hypothesis that more closely matches the target concept instead of the training data.

There are however some disadvantages.  The first apparent problem is repeatability of the tests.  This can be remedied by storing the seed that generates a set of tests.  A second more serious problem is that a uniform random sample doesn't necessarily deliver the best results.
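
To make the seed idea concrete, here is a small sketch; the input alphabet is invented for illustration.  Logging the seed with each run makes any failing set of tests reproducible.

// Assumes using System; using System.Collections.Generic;
static IEnumerable<string> GenerateInputSequences(int seed, int count, int maxLength)
{
    var random = new Random(seed);   // same seed => same generated tests
    var alphabet = new[] { "Click", "KeyPress", "Resize", "Close" };
    for (int i = 0; i < count; i++)
    {
        var length = random.Next(1, maxLength + 1);
        var steps = new string[length];
        for (int j = 0; j < length; j++)
            steps[j] = alphabet[random.Next(alphabet.Length)];
        yield return string.Join(" -> ", steps);
    }
}

// Log the seed with the run; replay the same seed to reproduce a failure.
foreach (var sequence in GenerateInputSequences(12345, 10, 5))
    Console.WriteLine(sequence);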

On Bugs

Not all bugs are created equal.  Practices such as testing the boundary conditions and integration testing are based on that fact.  Various testing practices explore parts of the input space that are richer in bugs or are more likely to contain serious bugs.

The cost of a bug can be quantified as the total monetary loss due to the presence of that bug in the software.  This includes all sorts of things like lost sales through negative customer perception, cost of software patches, legal action, etc.  While this is an accurate measure of cost, it is not very practical because it is very hard to estimate.

A more practical measure for estimating the relative cost of a bug might be the probability that users will find it multiplied by the severity of the bug.  The first part of the equation is interesting because it indicates that those bugs that customers never find are not important for testing to find.

Back to Training

The testing ideal is to minimize the total cost of bugs.  There are many good methods for writing test cases.  In addition to these, we could also select input sequences based upon the probability that a user would select such an input sequence.

Imagine if it were possible to collect all of the input sequences that would ever be given to a program including duplicates.  Then our problem would be reduced to uniformly selecting input sequences out of this collection.  Obviously, this collection is unavailable but if we use the same machine learning principles which we have been discussing then we could develop a program that could learn to generate test cases according to some distribution by using a representative set of user input sequences as training data.

Markov Chains

One way that this could be done is by returning to the formulation of a computer program as a state machine.  The object is to learn the probability that a user would take some transition given a particular state and a sequence of previous states and transitions.  This can be formulated as a Markov chain.

Consider generating random test cases for a parser.  A straightforward approach is to randomly derive a string in the language from the root non-terminal; however, it will quickly become apparent that the least used parts of your language are getting undue attention.  Some more sophisticated tools allow productions in the grammar to be attributed with weights.  This works better but forces the tester to learn the proper weights manually and it doesn't accurately describe the weights since the probability that a particular production for a non-terminal will be used is also based on the history of previously expanded productions.
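
To make the weighted-production idea concrete, here is a small sketch that randomly derives strings from a toy expression grammar; the grammar and the weights are invented for illustration.

// Toy grammar: Expr -> Num | Expr "+" Expr | "(" Expr ")", with hand-assigned weights.
// Assumes using System; using System.Linq;
static readonly Random Random = new Random();

static string DeriveExpr(int depth)
{
    if (depth <= 0)
        return Random.Next(10).ToString();   // force termination with the terminal production

    var productions = new[]
    {
        new { Weight = 6, Produce = new Func<string>(() => Random.Next(10).ToString()) },
        new { Weight = 3, Produce = new Func<string>(() => DeriveExpr(depth - 1) + " + " + DeriveExpr(depth - 1)) },
        new { Weight = 1, Produce = new Func<string>(() => "(" + DeriveExpr(depth - 1) + ")") },
    };

    var pick = Random.Next(productions.Sum(p => p.Weight));
    foreach (var production in productions)
    {
        if (pick < production.Weight)
            return production.Produce();
        pick -= production.Weight;
    }
    return productions[0].Produce();   // not reached
}

// Console.WriteLine(DeriveExpr(3));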

A better approach is to learn weights for the Markov chain corresponding to the pushdown automata.  Since a parser for an arbitrary context-free grammar can be expressed by a pushdown automaton, we can instrument the automaton to collect data on the probability of some transition given a state and a history of states and transitions.  The instrumented automaton can then be fed a large number of user-written programs.  Finally, a generator can be created that uses the data to generate programs that syntactically look like user-written programs.

Another approach might be to instrument branches and event sequences in a program and then train against user input.  The learner could then build a Markov chain that can be used to generate input sequences roughly like user input.

Conclusion

Next time you find yourself in a project that is devolving into an unbiased learning process, restore the sanity by using the specification and not the tests to decide what to implement.

Author: "wesdyer" Tags: "Programming"
Date: Thursday, 06 Dec 2007 20:37

Yesterday, Volta was made publicly available for the first time.  It is an experimental project in the early stages of development.  The team decided to release an early technology preview so that developers everywhere can help guide the project through experience and feedback.  We want your feedback.

The first release provides the basic feature set that will be improved upon with time.  It has some obvious shortcomings that we are aware of and are actively addressing.  But really, at this stage, the preview is more concerned with sparking your imagination about what is possible than ironing out all of the details.

Perhaps you disagree.  Maybe the most important feature to you is the completeness of a final product.  If that is the case, then say so and we will seriously consider making it a higher priority for the upcoming early experimental releases.

At some point, Volta may become, feed into, or inform a product, but that is a little way off yet.  So let's enjoy the unique opportunity of working together to make something great.

In the coming months, I will alternate between three types of posts:

1.  Volta focused posts: explaining the motivation, features, and technical details

2.  C#: this includes both 3.0 and eventually 4.0 features

3.  Random thoughts: like it says; two that will be discussed soon are programmer tests and continuations

I hope you enjoy the posts and I look forward to engaging with you in discussion.

Author: "wesdyer" Tags: "Volta"
Date: Wednesday, 05 Dec 2007 22:14

Anyone who writes web applications knows that web development is not easy.  Developers wrangle with a soup of technologies distributed across multiple tiers.  We live in a world where programmers accept the fact that they need to know four or five different languages, tools, and environments just to get a site up and running.  In ancient times, the Egyptians built marvelous structures despite the primitive tools that the workmen used.  Building a pyramid took most of the national resources of wealth and labor.  Today, we build structures which are vastly more complicated and yet require only a tiny fraction of the resources.  The difference is in the tools and the infrastructure.

In a similar way, Volta significantly improves web development.  Programmers write web applications using familiar .NET languages, libraries, and tools.  Volta splits the application into multiple parts potentially running on different tiers, say, client and server.  Client code needs only a minimal, JavaScript-enabled browser, though Volta will take advantage of additional runtimes that may be present.

Programmers simply refactor classes that need to run on tiers other than the client and Volta injects the boilerplate code for communication, serialization, synchronization -- all the remoting code.  The developer enjoys all of the benefits of .NET: a great debugger, test tools, profiling, intellisense, refactorings, etc.

Just how simple is Volta?  Let's write an application that uses a button to query the server for a string and displays that string to the client: the hello world web application.

Now, let's write the code for the web page.  We need a Div for the output and an Input for the interaction.  Of course, we could have constructed the page elements with HTML/CSS instead.

using System;
using Microsoft.LiveLabs.Volta.Html;
using Microsoft.LiveLabs.Volta.Xml;

namespace HelloWorld
{
    public partial class VoltaPage1 : Page
    {
        public VoltaPage1()
        {
            var output = new Div();

            var b = new Input();
            b.Type = "button";
            b.Value = "Get Message";
            b.Click += () => output.InnerHtml = C.GetMessage();

            Document.Body.AppendChild(output);
            Document.Body.AppendChild(b);
        }
    }

    class C
    {
        public static string GetMessage()
        {
            return "Hello, World";
        }
    }
}

But we want to produce the message on the server.  Time to refactor.

Browser clients call the server, the "Origin", because this ensures the message will come from the same server that supplied the HTML.

[RunAtOrigin]
class C
{
    public static string GetMessage()
    {
        return "Hello, World";
    }
} 

That is it.  Try it out.

Now, click "Get Message".

Great.  But is it cross-browser compatible?

Yes.  And you can debug it.

You can even debug across tiers.

There is a lot more to Volta in the first technology preview, which was made publicly available at 11 am PST today, and there will be a lot more to come.

Still skeptical?  Try it out for yourself.

http://labs.live.com/volta

Author: "wesdyer" Tags: "Volta"
Date: Thursday, 31 May 2007 08:34

Thank you everyone for the feedback.  If you have any more to say then please do express your opinion (and yes it is valued ;).  I think there has been a bit of misunderstanding about what partial methods really are.  So let's set the matter straight.  Here is the current spec (quoting from here on out):

Partial methods are a new feature complementing partial classes with the option of declaring “hooks” – or compile-time events – in the code.

Here is an example

partial class Customer
{
  public string Name {
    get { … }
    set {
      OnNameChanging(value);
      _name = value;
      OnNameChanged();
    }
  }

  partial void OnNameChanging(string value);
  partial void OnNameChanged();
}

Partial methods are only allowed in partial types; either classes or structs. There are no other kinds of partial members.

There are two parts of a partial method; the defining and the implementing partial method declaration. There can be at most one defining declaration, and at most one implementing declaration, and the implementing declaration is only allowed if there is a defining declaration. The defining declaration is distinguished by having a “;” instead of a method body. The two can both appear in the same partial type declaration.

Partial methods must always return void. The contextual keyword partial has to appear right before void – that is how we recognize it. They can have ref but not out parameters; both params and this modifiers are allowed in the usual positions.

Access modifiers on partial methods are not allowed, but partial instead implies private. The only modifiers allowed on a partial method declaration (apart from partial) are unsafe and static. Thus a partial method declaration cannot declare or override a virtual method. Also, partial methods cannot be extern, because the presence of the body determines whether they are defining or implementing.

If both a defining and an implementing partial method declaration are given, all modifiers on parameters and the method declarations themselves must be given in both declarations, although not necessarily in the same order.

Partial methods can be generic. Constraints are placed on the defining partial method declaration, and must be repeated on the implementing one. Parameter and type parameter names do not have to be the same in the implementing declaration and the defining one.
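
For example, a sketch following those rules:

partial class C
{
  // Defining declaration: the constraint appears here...
  static partial void Process<T>(T item) where T : class;
}

partial class C
{
  // ...and is repeated on the implementing declaration.  The parameter and
  // type parameter names need not match the defining declaration.
  static partial void Process<TItem>(TItem value) where TItem : class
  {
    Console.WriteLine(value);
  }
}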

Whether or not there is an implementing declaration, a partial method participates in overload resolution and type checking in the normal way. If there is no implementing declaration, however, the method will be removed. Furthermore, calls to the method are removed, including evaluation of arguments to the call, in a manner similar to conditional methods.

You can only make a delegate to a partial method if there is an implementing declaration; otherwise a compile time error occurs. Similarly, calls to a partial method occurring within the body of a lambda which is converted to an expression tree are only allowed if there is an implementing declaration of the partial method.

Because partial methods are always private, they cannot implement interface methods, neither implicitly nor explicitly.

If both a defining and an implementing partial method declaration are given, attributes on the resulting method and its return type, parameters and type parameters are determined by assembling, in unspecified order, the attributes from both the defining and implementing declarations. Duplicates are not removed. Thus, for attributes that can only occur once on a given declaration, having that attribute on both declarations will result in a compile time error.

Definite assignment is only checked after the possible removal of partial method calls and evaluation of the argument expressions. Thus, if assignments occur within these argument expressions, and the method is removed, they will not affect definite assignment. This can potentially lead to compile time errors.
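To illustrate the definite assignment point (again, my example rather than spec text; Log and ComputeMessage are hypothetical names), suppose Log is a partial method with no implementing declaration:

partial class C
{
  static partial void Log(string message);

  static void Main()
  {
    string s;
    C.Log(s = ComputeMessage()); // the call and its argument are removed, so the assignment never happens
    Console.WriteLine(s);        // compile time error: s is considered unassigned
  }

  static string ComputeMessage() { return "hello"; }
}

If an implementing declaration of Log is added, the assignment survives and the program compiles.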

The Main method of a program can be a partial method, but will only be considered as the entry point for the program if there is an implementing declaration.

XML doc comments are allowed on both defining and implementing partial method declarations, but only the ones that appear on the implementing declaration will be used by the compiler. Other tools may make use of the comments on defining declarations.

Author: "wesdyer"
Date: Wednesday, 23 May 2007 21:19

It has been a while since I have posted.  We have been working hard to get Orcas beta 1 and beta 2 done.  So I apologize for the long interlude between posts but I hope that you are enjoying beta 1 and that you are looking forward to beta 2.

Now that beta 1 is out there, what do you all think of it?  Beta 1 has most of the features that we intend to put into Orcas but not all of them.  Some notable differences you will see in later releases are improved performance, error messages, error recovery, stability, and a few refinements.  Basically the kind of polish that people expect as a product nears completion.

Partial Methods

One feature that you may have not heard of yet is a feature called partial methods.  This is some of the work that I did in the last coding milestone a few months back.  Partial methods are methods that enable lightweight event handling.  Here is an example declaration:

partial class C
{
  static partial void M(int i);
}

There are a few notable things here:

1.  Partial methods must be declared within partial classes

2.  Partial methods are indicated by the partial modifier

3.  Partial methods do not always have a body (we'll look at this more below)

4.  Partial methods must return void

5.  Partial methods can be static

6.  Partial methods can have arguments (including this, ref, and params modifiers)

7.  Partial methods must be private

Now suppose that I make a call to C.M.

partial class C
{

  static void Main()
  {
    C.M(Calculation());
  }
}

Since M has no body, all calls to C.M are removed at compile time as well as the evaluation of the arguments.  So the above program is equivalent to:

partial class C
{

  static void Main()
  {
  }
}

In this sense, partial methods are the distant cousins of conditional methods which also sometimes remove the call and the evaluation of the arguments.  But partial methods go even further.  When a partial method has no body then the partial method is not even emitted to metadata.

So far this might seem a little confusing.  C# now allows users to declare methods for which the calls, the evaluation of the arguments, and the method itself are not emitted.  Fortunately, the story doesn't end there.  Partial methods allow users to define a body for the method so that the method, the calls, and the evaluation of the arguments are emitted.

partial class C
{
  static partial void M(int i); // defining declaration
  static partial void M(int i)  // implementing declaration
  {
  }
}

In the code above, we see that the difference between a defining declaration of a partial method and an implementing declaration is whether or not the method has a body.  These declarations don't have to be in the same partial class declaration.  There may be at most one defining declaration and at most one implementing declaration, and an implementing declaration is only allowed if a defining declaration exists.

Why Partial Methods?

So how are these partial methods used?  The common scenario is to use them to do lightweight event handling.  For example, a tool that generates code may wish to have hooks for users to customize what code is run.  Imagine that such a tool generated a bunch of code for a class representing a customer.  It might look like this:

partial class Customer
{
  string name;

  public string Name
  {
    get
    {
      return name;
    }
    set
    {
      OnBeforeUpdateName();
      OnUpdateName();
      name = value;
      OnAfterUpdateName();
    }
  }

  partial void OnBeforeUpdateName();
  partial void OnAfterUpdateName();
  partial void OnUpdateName();
}

If the user doesn't add any implementing declarations then this code is equivalent to:

partial class Customer
{
  string name;

  public string Name
  {
    get
    {
      return name;
    }
    set
    {
      name = value;
    }
  }

}

No extra metadata for things that are not used and no extra instructions for useless operations.  On the other hand, if the user listened to the OnUpdateName "event" like this:

partial class Customer
{
  partial void OnUpdateName()
  {
    DoSomething();
  }
}

Then the original definition is equivalent to:

partial class Customer
{
  string name;

  public string Name
  {
    get
    {
      return name;
    }
    set
    {
      OnUpdateName();
      name = value;
    }
  }

  partial void OnUpdateName();
}

Comparing Partial Methods to the Alternatives

At this point, it is sensible to ask why not just use subclassing and virtual methods?  Of course, this would also work, but it has the drawback that the calls, the methods, and the evaluation of the arguments will still be emitted even if the virtual methods are never overridden.  So in a system like Linq to SQL, which has thousands of little events, partial methods allow these events to be very lightweight so that the user only pays for the events that she uses.
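For comparison, here is a rough sketch (my example, not generated code) of the subclassing alternative; the hook exists in metadata and is invoked on every set, even when nobody overrides it:

class CustomerBase
{
  string name;

  public string Name
  {
    get { return name; }
    set
    {
      OnUpdateName(); // this call site is always emitted
      name = value;
    }
  }

  protected virtual void OnUpdateName() { } // the empty body is always emitted
}

class NotifyingCustomer : CustomerBase
{
  protected override void OnUpdateName()
  {
    // user customization runs only when someone actually overrides the hook
  }
}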

A Few Fine Points

Consider the following program...

partial class C
{
  static void Main()
  {
    int i = 3;
    C.M(i = 5);
    Console.WriteLine(i);
  }
}

What does it write to the console?  3, 5, ...?

Actually, it is impossible to tell from just this code.  If there is no implementing declaration then the program will display 3 because i = 5 will never be evaluated, but if there is an implementing declaration then the program will display 5.  The same is true for conditional methods.  So if you need a side-effect to occur, do not put it in an argument to a partial method.  Of course, the same trick can be used to hide expensive calculations.

partial class C
{
  static void Main()
  {
    C.M(VeryVeryExpensiveCalculation());
  }
}

If there is no implementing declaration then the very very expensive calculation will never be performed.

Now, what about attributes?  How do those work?

partial class D
{
  [W]
  [return:X]
  partial void M<[Y]T>([Z]int foo)
  {
  }

  [Z]
  [return:W]
  partial void M<[X]T>([Y]int foo);
}

What attributes are actually emitted on M?  W and Z are the attributes on M; X and W are the attributes on the return type; Y and X are the attributes on the type parameter; Z and Y are the attributes on the parameter.

Enjoy!

Author: "wesdyer" Tags: "C#"
Date: Friday, 23 Mar 2007 11:26

Design patterns have been all the rage for a number of years now.  We have design patterns for concurrency, user interfaces, data access, object creation, and so many other things.  The seminal work on the topic is the Gang of Four's book, Design Patterns.  When used appropriately, they are a fantastic way to codify the wisdom gleaned from the battles we have fought building software systems.

One of the criticisms leveled at design patterns is that they are simply formalisms to address weaknesses in programming languages.  They require the human compiler to generate code whenever a specific recurring problem is encountered that cannot be solved directly with language support.  Now this might sound like heresy to some, but there is some truth to the criticism.  Programmers adapt to the shortcomings in the languages they use by generating pattern like code either by hand or with metaprogramming (generics, dynamic code generation, reflection, expression trees, macros).

Let's take a look at one such example.

The Iterator Design Pattern

Among the many patterns in the literature is the iterator pattern.

Iterator Design Pattern 

In .NET this pattern is embodied by IEnumerable/IEnumerable<T> and IEnumerator/IEnumerator<T>.

IEnumerable<T> and IEnumerator<T>

An IEnumerable<T> is something that can be enumerated (iterated) by calling GetEnumerator which will return an IEnumerator<T> (iterator).  The IEnumerator<T> is used to move a virtual cursor over the items that are iterated.

Implementing the iterator pattern is a bit onerous.  For example, here is the suggested iterator implementation for List<T> from the Design Pattern book.

class ListIterator<T> : IEnumerator<T>
{
  List<T> list;
  int current;

  public ListIterator(List<T> list)
  {
    this.list = list;
    current = -1;
  }

  public T Current { get { return list[current]; } }

  // IEnumerator<T> also requires the non-generic Current and Dispose
  object IEnumerator.Current { get { return Current; } }

  public void Dispose() { }

  public bool MoveNext()
  {
    return ++current < list.Count;
  }

  public void Reset()
  {
    current = -1;
  }
}

Since we can't change the list itself, we can introduce a ListIterable<T> that wraps a list.

class ListIterable<T> : IEnumerable<T>
{
  List<T> list;

  public ListIterable(List<T> list)
  {
    this.list = list;
  }

  public IEnumerator<T> GetEnumerator()
  {
    return new ListIterator<T>(list);
  }

  // required by the non-generic IEnumerable
  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}

And finally, we can write a method called GetElements which returns an IEnumerable<T> over the elements of a List<T>.

static IEnumerable<T> GetElements<T>(this List<T> list)
{
  return new ListIterable<T>(list);
}

Iterators in C#

Fortunately in most cases programmers don't need to deal with the iterator design pattern directly since the introduction of iterators in C# 2.0.  Instead of writing the iterator above, we can simply write the following:

static IEnumerable<T> GetElements<T>(this List<T> list)
{
  for (int index = 0; index < list.Count; ++index)
    yield return list[index];
}

When the C# compiler sees this method, it translates it into something very similar to the ListIterator<T> and ListIterable<T> above.  Using Reflector or ILDasm we can see that the GetElements method is rewritten as (the names have been changed for clarity, as the compiler generates unspeakable names):


private static IEnumerable<T> GetElements<T>(this List<T> list)
{
  GetElementsIterator<T> temp = new GetElementsIterator<T>(-2);
  temp.listParameter = list;
  return temp;
}

This is remarkably close to the first GetElements we wrote, rather than the second.  Like in the first GetElements, we create an object that implements IEnumerable<T> and parameterize this object with the list that was passed in.  The only other thing that happens in this implementation that doesn't in the first GetElements method is that a -2 is passed into the object.  I'll come back to this later, but first let's take a look at the GetElementsIterator<T> class (I omitted a few details and changed names for clarity).


private sealed class GetElementsIterator<T> : IEnumerable<T>, IEnumerator<T>
{
  int state;
  T current;
  public List<T> listParameter;
  int initialThreadId;
  public int index;
  public List<T> list;

  public GetElementsIterator(int state);
  private bool MoveNext();
  IEnumerator<T> IEnumerable<T>.GetEnumerator();
  void IEnumerator.Reset();

  T IEnumerator<T>.Current { get; }
}

The most important thing to note at this time is that GetElementsIterator implements both IEnumerable<T> and IEnumerator<T>.  A strange combination, but we will see why in a little while.  Now let's look at what actually happened when we ran the constructor. 


public GetElementsIterator(int state)
{
  this.state = state;
  this.initialThreadId = Thread.CurrentThread.ManagedThreadId;
}

Did the constructor run the for loop?  No, it didn't.  It took some variable called state in and marked what thread created the iterator (if you look at Whidbey code the initialThreadId stuff won't be there...I'll get to that).  The state variable marks what state the GetElementsIterator is in.  The number -2 is the "I'm an IEnumerable<T>" state.  Now, let's look at the GetEnumerator method.


IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
  GetElementsIterator<T> temp;
  if ((Thread.CurrentThread.ManagedThreadId == initialThreadId) && (state == -2))
  {
    state = 0;
    temp = this;
  }
  else
  {
    temp = new GetElementsIterator<T>(0);
  }
  temp.list = this.listParameter;
  return temp;
}


The method first checks to see if the thread that is calling GetEnumerator is the same thread that created the GetElementsIterator object.  If it is the same thread and if the GetElementsIterator is in the "I'm an IEnumerable<T>" state then the state is changed to the "I'm an initialized IEnumerator<T>" state.  Otherwise, a new object of the same type is created but it is immediately put in the "I'm an initialized IEnumerator<T>" state so that thread safety is maintained.  Finally, in either case the list that was passed from the GetElements method is copied into the list field for consumption by the MoveNext method.

In Whidbey code, you will see that the if statement is different because an Interlocked.Exchange was used to perform the same task.  The change was made to improve performance (especially for iterators that have 0 or 1 items to iterate over).

Now we have seen why GetElementsIterator implements both IEnumerable<T> and IEnumerator<T>, because it can morph from the IEnumerable<T> role into the IEnumerator<T> role without creating any new objects in most cases.

The Current property is very simple.


T IEnumerator<T>.Current
{
  get
  {
    return current;
  }
}


The Reset method simply throws a NotSupportedException.

This leaves us with only the MoveNext to examine.  We still haven't seen where the for loop went.  Hopefully, it is in the MoveNext.  The following code is produced with the /o+ compiler option (optimizations turned on).


private bool MoveNext()
{
  switch (state)
  {
  case 0:
    state = -1;
    index = 0;
    break;

  case 1:
    state = -1;
    index++;
    break;

  default:
    goto Done;
  }
  if (index < list.Count)
  {
    current = list[index];
    state = 1;
    return true;
  }
Done:
  return false;
}


Hm...a switch statement over a variable called state with a number of integer case labels.  It looks like a state machine and indeed it is.  When MoveNext is first called, state is equal to 0, so the state is set to -1 (the "I'm finished" state) and the index is set to 0.  Then we check to see if the index is less than the number of things in the list.  If so then we set current to the current list element based on the index and set the state to 1.  We return true indicating that there is something to consume.

The second call to MoveNext will run case 1 since the state is equal to 1.  It will again set the state to -1 and increment the index.  If we still have elements in the list then the current will be set appropriately and the state will be set to 1 (the "we need to check again" state) and true will be returned.

This continues until there is nothing left to consume and false is returned.

Finally, we found our for loop.  It is encoded in the MoveNext.  But note that each call to MoveNext only computes the part of the for loop that is needed to realize the next element.  It never computes more than it needs to.  This is why iterators are an example of deferred execution.  When the actual GetElements method was called, no elements were realized at all!  Later as MoveNext is called, one element at a time is realized.  This of course enables all sorts of great scenarios such as the pay-as-you-go model and the ability to have infinite lists.
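As a quick illustration (my example, not from the original post), an iterator over an infinite sequence is perfectly fine as long as the consumer only realizes finitely many elements:

static IEnumerable<int> Naturals()
{
  for (int i = 0; ; ++i)
    yield return i;
}

var firstFive = Naturals().Take(5); // nothing has been realized yet
foreach (var n in firstFive)        // realizes 0, 1, 2, 3, 4 one element at a time
  Console.WriteLine(n);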

The Cost of Iterators

Iterators are very performant.  In almost all situations that I have encountered they are more than performant enough and they simplify the code drastically as we have seen.  But sometimes you can get into trouble.

Consider the definition of the Concat sequence operator in Linq to Objects.  It looks something like this:

static IEnumerable<T> Concat<T>(this IEnumerable<T> sequence1, IEnumerable<T> sequence2)
{
  foreach (var item in sequence1)
    yield return item;
  foreach (var item in sequence2)
    yield return item;
}

Let's write a little benchmark to evaluate the performance of Concat.

var stopWatch = new Stopwatch();
for (int length = 0; length <= 10000; length += 1000)
{
  var list = new[] { 1 };
  IEnumerable<int> ones = list;
  for (int i = 0; i < length; ++i)
    ones = ones.Concat(list);

  stopWatch.Reset();
  stopWatch.Start();

  foreach (var item in ones) ;

  stopWatch.Stop();
  Console.WriteLine("Length: {0} Time: {1}", length, stopWatch.ElapsedMilliseconds);
}

The results may be surprising.  The time to evaluate the foreach statement is not linearly proportional to the number of Concats that are composed together.  In fact, it is proportional to the square of that number.

Upon closer inspection the reason why is obvious.  The time complexity of Concat is O(m+n) where m is the number of items in the first sequence and n is the number of items in the second sequence.  But note that in this example, n is always 1.  The outermost call is O(m+1).  The next call has O((m-1)+1), then O((m-2)+1), ... O(1+1).  There are m of these calls so the running time should be O(m^2).  Essentially, composing concats together like this causes O(m^2) yield returns to be executed.

Of course, using a List<T> here and adding on the sequences would have been much more performant because it eliminates the redundant calculations but it would not have been evaluated lazily.
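For instance, the eager alternative to the benchmark above might simply accumulate the elements into a List<int> up front (my sketch, reusing the benchmark's length variable); enumeration is then linear, but nothing is deferred:

var ones = new List<int> { 1 };
for (int i = 0; i < length; ++i)
  ones.Add(1);

foreach (var item in ones) ; // O(length) to enumerate, but the whole list lives in memory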

Iterators are even more fun if the data structure that is being enumerated is more complicated.  For example, consider iterating over n-ary trees.  Here is a quick definition of an n-ary tree.

class Tree<T>
{
  public T Value { get; private set; }
  public Tree<T> NextSibling { get; private set; }
  public Tree<T> FirstChild { get; private set; }

  public Tree(T value, Tree<T> nextSibling, Tree<T> firstChild)
  {
    Value = value;
    NextSibling = nextSibling;
    FirstChild = firstChild;
  }

  public IEnumerable<Tree<T>> GetChildren()
  {
    for (var current = FirstChild; current != null; current = current.NextSibling)
      yield return current;
  }
}

Now it is easy to define an iterator that performs a preorder traversal of an n-ary tree.

static IEnumerable<T> PreOrderWalk<T>(this Tree<T> tree)
{
  if (tree == null)
    yield break;
  yield return tree.Value;
  foreach (var subTree in tree.GetChildren())
    foreach (var item in subTree.PreOrderWalk())
      yield return item;
}

Just the way that I like code: clear and concise.  The only problem is that the iterator could be more efficient.  Whether that matters depends on the situation, but in a library it almost certainly does.

We can improve the efficiency somewhat by changing the code:

static IEnumerable<T> PreOrderWalk<T>(this Tree<T> tree)
{
  var stack = new Stack<Tree<T>>();
  if (tree != null)
    stack.Push(tree);
  while (stack.Count > 0)
  {
    for (var current = stack.Pop(); current != null; current = current.FirstChild)
    {
      yield return current.Value;
      if (current.NextSibling != null)
        stack.Push(current.NextSibling);
    }
  }
}

This second iterator doesn't recursively call iterators thus avoiding both the recursive call and the extra allocations.  Instead, it maintains a stack of work to do after the leftmost path has been exhausted.  Once the leftmost path has been exhausted then a node is popped off and the leftmost traversal is resumed at that node.

When we measure the difference, we see that the improvement is noticeable but that the number of nodes, O(b^d) where b is the branching factor and d is depth, dominates the cost of the traversal.  In the graph below, the green line indicates the total number of nodes in the tree.  The trees have a branching factor of 2.

So the key takeaway here is that iterators have great performance but as always measure the performance of your code.  If you find that performance is suffering, use a combination of profiling and analysis to find the problem.  If the problem is an iterator, you might be able to increase the performance by reworking the iterator as in the n-ary tree case.  In other cases, it might be the usage of the iterators as with pathological Concats.

One Possibility for Language Improvement (not in Orcas)

The Concat sequence operator is interesting because there is a lot of code that is seemingly redundant.  It takes two sequences and then has to iterate over them and yield their elements.  It's like it needs to expand their insides just to package them up together again.  As we have seen this doesn't lead to the best performance and the code is overly verbose.

Bart Jacobs, Erik Meijer, Frank Piessens, and Wolfram Schulte wrote a very interesting paper on a possible language feature that would improve both the usability and the performance of iterators.  The second half of the paper details what they call nested iterators, which avoid the multiple evaluation problem of the composed Concats and implicitly keep an internal stack like the modified n-ary tree iterator.

For example with this language feature, the Concat sequence operator would look something like this:

static IEnumerable<T> Concat<T>(this IEnumerable<T> sequence1, IEnumerable<T> sequence2)
{
  yield foreach sequence1;
  yield foreach sequence2;
}

Notice that the code is simpler and also more performant.

A programmer could use yield return, yield break, and yield foreach in the same iterator.  An iterator FromTo can be defined recursively as follows:

static IEnumerable<int> FromTo(int b, int e)
{
  if (b > e)
    yield break;
  yield return b;
  yield foreach FromTo(b + 1, e);
}

If, instead of the yield foreach, a foreach loop yielded each result, then the FromTo method would suffer from quadratic performance; with nested iterators, the performance is linear.

The next post will pick up on understanding the performance of Linq to objects queries.

Author: "wesdyer" Tags: "Programming, C#, Design Patterns"
Date: Tuesday, 20 Mar 2007 12:52

Recently, many people have asked me about the performance of Linq.  The questions have ranged from the broad, "How can I analyze the performance of code using Linq?", to the very specific, "Why is the performance of this code sample not what I expected?"

I want to address these questions.  So before I dive into some of them individually, I want to set up a framework for discussions about performance.  The framework consists of ideas that are well known and understood and yet often when people talk or think about performance they forget about them.

Performance is critical to the success of any application and engineering performance is a fundamental part of what developers should be doing on a day to day basis.  Great performance doesn't just happen.  It must have requirements, be designed, implemented, and tested just like any other feature.  Individuals and teams must foster a culture of performance.

Performance Requirements

Rico Mariani wrote a great post about why performance must be measured in context.  That means that performance goals should be made in relation to what the user sees and cares about.  Then when possible these goals should be translated into more low-level goals that are easier to quantify and directly measure, but never take your eye off the original goals.  Many times when performance is tuned in one dimension it can degrade in other dimensions.  It is like trying to cram a whole bunch of stuff into a tiny space: you may push it in here but it pops out over there.  So how does a programmer know which dimension is most important?  Consider the performance in context.  Furthermore, the performance tuning that a programmer makes might not even matter when considered in context.  So spend your time where it counts.  Finally, it is much easier to consider trade-offs between performance requirements and other requirements when the performance requirements are in context.  Set up a performance budget in terms that are material and important to the customer.

Designing for Performance

Designing for performance is important.  Many performance problems can be fixed as they are identified later, but some are deeply rooted in poor design choices made early on that are difficult to resolve later.  However, care must be taken, because these poor design choices are often made for the sake of "better performance".  For example, in an app with a lot of data parallelism, it doesn't make sense to complicate the design with stateful operations to the point that concurrency is precluded.

Most of these poor design choices are caused by the design not solving the right problem (performance requirements) or because the requirements changed over time and the earlier design made it difficult to adapt to the new requirements.

Understanding Performance

There are at least two basic tools for understanding performance.

1.  Asymptotic Analysis

I am sure that my readers know and love asymptotic analysis.  It is so important that developers know what they are doing when they call a System.* function that has O(n) time complexity n times -- that is O(n^2) behavior.  Know what is going on in terms of both space and time.
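A classic example (hypothetical code, mine rather than the post's): List<T>.Contains is O(n), so calling it once per element of another list gives O(n^2), while a HashSet<T> membership test keeps the whole operation roughly O(n).

static List<int> DistinctSlow(List<int> items)
{
  var result = new List<int>();
  foreach (var item in items)
    if (!result.Contains(item)) // O(n) scan on every iteration => O(n^2) overall
      result.Add(item);
  return result;
}

static List<int> DistinctFast(List<int> items)
{
  var seen = new HashSet<int>();
  var result = new List<int>();
  foreach (var item in items)
    if (seen.Add(item))         // expected O(1) per membership test
      result.Add(item);
  return result;
}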

2.  Benchmarks / Profilers

So often when a performance problem exists, developers will convince themselves that they know what the problem is without accurately measuring the behavior of the system.  They will proceed to "fix" the problem only to discover that what they thought would be of great benefit is in fact unimportant.

I've done that.  When C# 1.0 was in beta back in 2000, I was working at a small company that decided to try out this new .NET stuff.  My first application was a good-sized application that included a custom scripting engine.  The performance wasn't great and so I set about fixing it.  I was sure that I knew what the problem was and so I began improving some algorithms in various complex parts of the system.  But I found that it didn't help matters at all.

At this point, I broke out a profiler and began measuring the actual behavior of the system.  The surprising result (at the time) was that pretty much all of the performance problems came from a small loop that was building up the text buffer for the scripting engine.  It was using the + operator on strings.  Since this was my first .NET application, I had no idea that concatenating hundreds of thousands of strings was a bad idea.

When I changed the code to use a StringBuilder instead, it sailed.  I made a few other improvements (all of which were targeted and very small), and the application was running fine.
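The pattern in question looks roughly like this (a sketch, not the original code; scriptLines is a hypothetical sequence of strings):

// each += allocates a new string and copies everything so far: O(n^2) characters copied overall
string buffer = "";
foreach (var line in scriptLines)
  buffer += line;

// StringBuilder appends into a growable buffer: roughly O(n) overall
var builder = new StringBuilder();
foreach (var line in scriptLines)
  builder.Append(line);
string result = builder.ToString();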

Now the point of all of this is not that the + operator on strings should not be used.  If that were true then we would make the C# compiler issue a warning when you used it.  The point is that a programmer should be aware of the costs and tradeoffs involved with a programming decision and act accordingly.  Measurement is a powerful tool.  Knowing where the problems lie is the key to success.  Measure early and often.

The result of profiling is a large set of various statistics about an application.  If the measurements are taken in context then they are a powerful tool.  As with any statistics, what is being measured is as important as the actual measurement.

Analyzing an application is very rewarding.  You are able to see the material benefits of any improvement in a very quantitative way.  Apply the scientific method in the process of analyzing a problem.  Once you suspect you know what the problem is then consider it the hypothesis and then prove it by observing the data (profiling data) from an operation (the suspect code) across various trials.

Try to think big and understand not only what is the problem but why it is a problem.  The best solutions attack not only the symptoms but the fundamental causes of the problem.  Look to understand the nature of the problem.

Some Good Performance Resources

Rico Mariani's (Visual Studio Performance Architect) Blog

Vance Morrison's (CLR Performance Architect) Blog

Author: "wesdyer" Tags: "Programming"
Date: Friday, 09 Mar 2007 14:06

When people think of C# 3.0 and Linq, they commonly think of queries and databases.  The phenomenal work of the Linq to SQL guys provides ample reason to think of it this way; nevertheless, C# 3.0 and Linq are really much much more.  I have discussed a number of things that can be done with lambdas, expression trees, and queries and will continue to do so but I want to pause and discuss a little gem that is often overlooked in C# 3.0.  This new language feature has fundamentally changed both the way that I work in C# and my view of the world.  I've been using it a lot without ever drawing attention explicitly to it.  At least one reader noticed it and the possibilities it opens up and at least a couple of readers want an expanded version of it without even knowing it.

So what is the feature?  It's extension methods.

At first glance they don't look very special.  I mean really, all they are is one extra token in the definition of a static method inside of a static class.

static class Foo {

  public static Bar Baz(this Qux Quux) { ...

But as is usually the case, it's the semantics that are more interesting than the particular syntax.

The first argument of an extension method (the argument marked with this) is the implicit receiver of the method.  The extension method appears to be an instance method on the receiver but it is not.  Therefore, it cannot access private or protected members of the receiver.

For example, let's say that I detested the fact that the framework doesn't have a ToInt method defined on string.  Now, I can just provide my own:

public static int ToInt(this string s)
{
  return int.Parse(s);
}

And I can then call it as:

"5".ToInt()

The compiler transforms the call into:

ToInt("5")

Notice how it turns it inside out.  So if I have three extension methods A, B, and C

x.A().B().C()

The calls get turned into

C(B(A(x)))

While all of this explains how extension methods work, it doesn't explain why they are so cool.

A few months back, I was reading various online content related to C# 3.0.  I wanted to get a feel for what customers were feeling and incorporate it as much as possible into the product.  In the process, I came across an interesting post, Why learning Haskell/Python makes you a worse programmer.  The author argues that learning a language like Python or Haskell can make things more difficult for you if your day job is programming in a language like C#.

I sympathize with what the author has to say, and I have spent enough time programming in languages I didn't like to think that I understand the pain.

That said, I hope that the author (and others who feel like him) will be pleasantly surprised by C# 3.0.  For example, let's look at his example of painful programming:

"I have a list of Foo objects, each having a Description() method that returns a string. I need to concatenate all the non-empty descriptions, inserting newlines between them."

In Python, he says that he would write:

"\n".join(foo.description() for foo in mylist if foo.description() != "")

In Haskell, his solution looks like:

concat $ List.intersperse "\n" $ filter (/= "") $ map description mylist

These both look like reasonable code and I rather like them.  Fortunately, you can express them in C# 3.0.  Here is the code that looks like the Python solution.

"\n".Join(from x in mylist where x.Description != "" select x.Description)

And here is the code that is closer to his Haskell solution:

mylist.Where(x => x.Description != "").Select(x => x.Description).Intersperse("\n").Concat();

At this point, some will protest that there is no Join instance method on string and there is no Intersperse defined on IEnumerable<T>.  And for that matter, how can you define a method on an interface in the first place?  Of course, extension methods are the answer to all of these questions.

public static string Join(this string s, IEnumerable<string> ss)
{
  return string.Join(s, ss.ToArray());
}


public static IEnumerable<T> Intersperse<T>(this IEnumerable<T> sequence, T value)
{
  bool first = true;
  foreach (var item in sequence)
  {
    if (first)
      first = false;
    else
      yield return value;
    yield return item;
  }
}
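The parameterless Concat used in the pipeline above isn't shown here either; a minimal sketch (assuming it simply concatenates a sequence of strings) might be:

public static string Concat(this IEnumerable<string> strings)
{
  var builder = new StringBuilder();
  foreach (var s in strings)
    builder.Append(s);
  return builder.ToString();
}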

It is as if these methods were defined on the receiver to begin with.  At this point the realization sets in: a whole new mode of development has been opened up.

Typically for a given problem, a programmer is accustomed to building up a solution until it finally meets the requirements.  Now, it is possible to extend the world to meet the solution instead of just building up until we get to it.  That library doesn't provide what you need?  Just extend the library to meet your needs.

I find myself switching between the two modes frequently: building up some functionality here and extending some there.  In fact, these days I find that I often start with extension methods and then when certain patterns begin to emerge then I factor those into classes.

It also makes some interesting styles of programming easier.  I am sure it has some name, but since I don't know what it is I'll call it data interface programming.  First we declare an immutable interface that includes only data elements.

interface ICustomer
{
  string Name { get; }
  int ID { get; }
}

Then, we declare an inaccessible implementation of ICustomer that allows customers to be created through a factory that only exposes the immutable version.

class Factory
{
  class Customer : ICustomer
  {
    public string Name { get; set; }
    public int ID { get; set; }
  }

  public static ICustomer CreateCustomer(int id, string name)
  {
    return new Customer { ID = id, Name = name };
  }
}

Then we can declare behavior through extension methods.

public static string GetAlias(this ICustomer customer)
{
  return customer.Name + customer.ID.ToString();
}

And finally, we can use the behavior.

var customer = Factory.CreateCustomer(4, "wes");
Console.WriteLine(customer.GetAlias());

All of this may seem like a roundabout way to declare an immutable abstract base class with various derived classes.  But there is a fundamental difference: the interface and behavior can change depending upon which extension methods are in scope.  So one part of the program or system can treat them one way and another can have an entirely different view of things.

Of course, what I really want to be able to do (and we don't do it yet) is something like:

var customer = new ICustomer { ID = 4, Name = "wes" };
Console.WriteLine(customer.GetAlias());

And then I skip the whole Factory thing altogether.  The customer is immutable and the definition of the type is short and sweet.  All of the work is done by the compiler, which incidentally doesn't need the factory because it can name-mangle the implementation class and provide customized constructors automatically.  But I digress; hopefully we can do something like that in the future.

Of course extension methods don't make the traditional techniques inapplicable, they are still as useful as ever.  As with all design considerations, there are trade-offs involved.  Care must be taken to manage extension methods so that chaos doesn't ensue, but when they are used appropriately they are fantastically useful.

As I have been writing C# code, I have accumulated a library of useful extension methods, and I encourage you to do the same so that the ideas you think of roll naturally off your fingertips.

Author: "wesdyer" Tags: "C#"
Date: Thursday, 01 Mar 2007 10:44

How often do you write code that just works?  It seems to happen so rarely that I find myself suspicious if code does just work.  When was the last time that you wrote a non-trivial program that had no compile errors on the first try and then ran fine as well?  If it happened recently then congratulations.  If it hasn't then join the club.  Admittedly, most programmers don't just write a whole program and then submit it to a compiler and a test harness to see if it works.  Typically it is built incrementally: a function here, a class there, testing all the time.  Code-Compile-Test...rinse and repeat.

I recently read a post where the author describes why Haskell programs just seem to work.  It got me thinking about my own experience in Haskell.  One of the most interesting things about Haskell is that many programmers don't use a debugger (GHCi does have one).  It seems that the impulse to break out the debugger to better understand code occurs far less than in other languages. 

Can you imagine working in C# without a debugger?  So why is it that many Haskell users seem to get along without a debugger?

A Tale of Two Sorts

Consider writing a function that sorts an IEnumerable<T> using the quicksort algorithm.  Here is one possible implementation:

static IEnumerable<T> QuickSort<T>(this IEnumerable<T> sequence) where T : IComparable<T>
{
  var array = sequence.ToArray();
  QuickSort(array, 0, array.Length - 1);
  return array;
}

static void QuickSort<T>(this T[] array, int start, int end) where T : IComparable<T>
{
  if (end - start < 1)
    return;
  var pivot = array[start];
  var j = start;
  for (var i = start + 1; i <= end; ++i)
  {
    if (array[i].CompareTo(pivot) < 0)
    {
      ++j;
      var temp = array[j];
      array[j] = array[i];
      array[i] = temp;
    }
  }
  array[start] = array[j];
  array[j] = pivot;
  QuickSort(array, start, j - 1);
  QuickSort(array, j + 1, end);
}

Now imagine that ++j was in the wrong place (at the end of the block).  The output of sorting

4, 3, 6, 9, 1, 0, 2, 7, 5, 8

would be

4, 4, 4, 4, 4, 9, 6, 7, 7, 8

Clearly this isn't right.  So after a quick scan of the code, most programmers would attach a debugger to see what is going on.

The programmer might then step through the code until some state transition ends in a corrupt state.  Hopefully the programmer understands the intent of the code so that a fix can be issued to bring the implementation in line with the intent.

Contrast the previous definition of quicksort with the following definition:

static IEnumerable<T> QuickSort<T>(this IEnumerable<T> sequence) where T : IComparable<T>
{
  if (!sequence.Any())
    return sequence;
  var pivot = sequence.First();
  var remaining = sequence.Skip(1);
  return ((from x in remaining where x.CompareTo(pivot) < 0 select x).QuickSort())
           .Concat(new[] { pivot }) // wrap the pivot in a one-element array so the standard Concat overload applies
           .Concat((from x in remaining where x.CompareTo(pivot) >= 0 select x).QuickSort());
}

What do you notice that is different besides the number of lines of code?

The first definition creates an array and then does an in-place quicksort on the array, destructively updating the contents.  It creates several temporary values (i, j) needed to store intermediate results.  If a programmer hasn't written the algorithm recently then chances are that the code will contain at least one off-by-one error or some other similar error.  I've seen too many people do poorly on whiteboards on similar problems (where incidentally there is no debugger, syntax highlighting, or intellisense) to believe that most people would just get it right the first time.

The second definition does not modify any data.  Sure it makes assignments but each of these assignments are to a newly introduced variable.  These variables are really just aliases for the expressions that are assigned to them because the expressions are side-effect free.  Furthermore, the second definition is essentially just a brief description of what the quicksort is rather than a step by step description of how it is accomplished.

At this point, some people will probably be thinking, "But isn't the first approach so much faster than the second approach?"

Maybe, maybe not.  Perhaps it doesn't even matter.

I think it is important to use the following principle:  Make it correct, make it clear, make it concise, make it fast.  In that order. 

If I find that the code in question is not performant enough given the user scenarios that we need to support or if the code is overly complex for the sake of eliminating side-effects then I consider adding state.  But note that if I wanted to introduce state to our quicksort, I could do it in two different ways: contained within the quicksort function or exposed to the caller of quicksort.  This is somewhat analogous to calling either the first function or the second function in the first definition of quicksort.  If the state is exposed then quicksort could do an in-place sort on the collection passed to the quicksort function.  Alternatively, quicksort could simply return a new collection that is sorted and not update the original collection.  This leads to the second principle:

Write side-effect free code.  If side-effects are required then confine them to the smallest scope possible.

Most programmers have been following this to some extent for years.  We all know that generally it is not a good idea to use global variables.  This is basically the extreme of exposing side-effects (the global scope).  Unfortunately, many of the programmers who don't use global variables don't realize that the same principles apply to fields, properties, parameters, and variables on a more limited scale: don't mutate them unless you have a good reason. 

A Little Story about State

Some time ago, I wrote a post about a way to save the state of an IEnumerator<T>.  What I didn't tell is the story behind the post.

I was writing a managed lexer and parser generator.  The system is pretty neat and includes things like GLR parsing, automatic inference of the AST types, automatic error recovery, and a whole slew of other features.  The parser takes an IEnumerable<IToken> in order to decouple the lexer from the parser.  However, for many of the features of the parser, it needs a way to save the state of the underlying token stream and possibly restore it at a later time after more of the stream has been consumed.

Originally, I wrote a MarkableEnumerator<T> which wrapped an IEnumerator<T> and consumed the input.  Clients could request a mark which they could then restore at a later point.  If they did this then the MarkableEnumerator<T> would continue at the restored point when MoveNext was called again.  This little piece of software (about 100 lines of code) caused Cyrus and me lots of trouble.  It was super complicated because it was heavily optimized to keep its internal storage as small as possible while preserving requested past state (and possibly restoring it).  It used a cyclical buffer and a number of opaque handles.  One of the worst problems is that it was a somewhat leaky abstraction since the complexity leaked into the code consuming it.

When the code was first introduced it was reasonably contained but over time its inherent complexity began to pervade the whole parser.  Eventually, it was hard to make improvements to the parser without breaking the invariants of the MarkableEnumerator.  It was an absolute horror.  I remember one night well past midnight, Cyrus and I were talking about how much we hated the thing, but we needed it to provide good performance.

Then it occurred to us that we could replace the whole mess of opaque handles, circular buffers, mark and release code, and MarkableEnumerators within MarkableEnumerators with just a linked list that wrapped the enumerator.  If someone wanted to save the state of the enumerator and resume it then all they need to do is keep a reference to the desired node.  To all of the clients, it looked just like an immutable linked list.  That is it.  So much simpler and guess what?  It was almost as fast.  But even more importantly, because of its simplicity we were able to make even more important code faster.  Our development time for new features rocketed again.

There are several lessons from this story, but one of them is perhaps the most important considering our current discussion.  Reducing the number and scope of side-effects simplifies the code.

Immutable Binary Trees

Mutation often seems to just cause problems.

Suppose that we want to create a BinaryTree class.  One straightforward implementation is

class BinaryTree<T>
{
  public T Value              { get; set; } // yes, we have autoimplemented
  public BinaryTree<T> Parent { get; set; } // properties in case you haven't

  public BinaryTree<T> Left   { get; set; } // heard
  public BinaryTree<T> Right  { get; set; }
}

I don't like it very much.  Those sets give me an uneasy feeling.  Of course it depends on how the objects are used, but suppose that we intend to use the binary tree in a multithreaded environment.  If someone changes any of the members of a node there is a chance that some other thread already has a reference to a node in the tree and will then have a tree in an inconsistent state.  Yes, we could use a lock to fix the problem but I tend to like a different approach to this problem. 

class BinaryTree<T>
{
  T value;
  BinaryTree<T> left;
  BinaryTree<T> right;

  public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
  {
    this.value = value;
    this.left = left;
    this.right = right;
  }

  public T Value { get { return value; } }
  public BinaryTree<T> Left { get { return left; } }
  public BinaryTree<T> Right { get { return right; } }
}

This is essentially the same as the first BinaryTree except that it is immutable (no sets and no methods causing mutation).  This requires that we have a constructor that takes in the appropriate values.  Also, you may have noticed that there is no Parent property.  The reason is that if the node is immutable then it is hard to know what the parent is at the time of creation.  With an immutable tree, you generally can either create parents before children or children before parents.  I prefer the latter (despite the biological conundrum posed by this).  Note that this doesn't mean that we can't know what the parent of a given node is.  Indeed we can, but we need to introduce a new concept before addressing this.
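Constructing a small tree children-first might look like this (hypothetical values):

var leftLeaf  = new BinaryTree<int>(1, null, null);
var rightLeaf = new BinaryTree<int>(3, null, null);
var root      = new BinaryTree<int>(2, leftLeaf, rightLeaf);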

Nodes can really be parented by any number of nodes, since any node can be constructed with a given node as either the left or the right child.  But given a tree (a root), we can enforce (assuming that we don't want DAGs) that a node has at most one parent.  We will call this a BinaryTreeVersion.

class BinaryTreeVersion<T>
{
  BinaryTree<T> tree;
  Dictionary<BinaryTree<T>, BinaryTree<T>> parentMap;

  public BinaryTreeVersion(BinaryTree<T> tree)
  {
    this.tree = tree;
  }

  public BinaryTree<T> Tree { get { return tree; } }
  public BinaryTree<T> GetParentOf(BinaryTree<T> node)
  {
    if (parentMap == null)
    {
      parentMap = new Dictionary<BinaryTree<T>, BinaryTree<T>>();
      FillParentMap(tree, null);
    }
    return parentMap[node];
  }

  void FillParentMap(BinaryTree<T> tree, BinaryTree<T> parent)
  {
    if (tree == null)
      return;
    parentMap.Add(tree, parent);
    FillParentMap(tree.Left, tree);
    FillParentMap(tree.Right, tree);
  }

}

The parent operation is exposed on the BinaryTreeVersion.  The parent of a given node is stored in the parentMap which is filled with node-parent pairs.  So now we have a tree that is immutable and the ability to access a node's parent.

But we can take it one step further: we can create functions that will "modify" the tree.  Note that we will not actually modify the tree but we will return a new BinaryTreeVersion.  The ChangeNodeValues function takes two functions:  a function that determines whether to modify the given node's value and a function to compute the node's new value (there are better ways to do this for specific modifications and yes I could have used a visitor).

class BinaryTreeVersion<T>
{

 ... 

  public BinaryTreeVersion<T> ChangeNodeValues(Func<BinaryTree<T>, bool> predicate, Func<BinaryTree<T>, T> newValue)
  {
    return new BinaryTreeVersion<T>(ChangeNodeValue(tree, predicate, newValue));
  }

  BinaryTree<T> ChangeNodeValue(BinaryTree<T> node, Func<BinaryTree<T>, bool> predicate, Func<BinaryTree<T>, T> newValue)
  {
    if (node == null)
      return null;
    var left = ChangeNodeValue(node.Left, predicate, newValue);
    var right = ChangeNodeValue(node.Right, predicate, newValue);
    var value = predicate(node) ? newValue(node) : node.Value;
    if (value.Equals(node.Value) && left == node.Left && right == node.Right)
      return node;
    return new BinaryTree<T>(value, left, right);
  }
}

The neat thing is that since we use immutable trees, we can share subtrees that are not changed.  Even so, if another thread has a reference to the first BinaryTreeVersion or a node in the first tree then things will just work, because we never mutate the data.
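A hypothetical use looks like this (root being some BinaryTree<int> built as above); the second version shares every subtree in which no value changed:

var version1 = new BinaryTreeVersion<int>(root);
var version2 = version1.ChangeNodeValues(
  node => node.Value % 2 == 0, // which nodes to change
  node => node.Value * 2);     // their new values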

The Benefits of Purity

One of the most painful things in software development is integrating two or more things (objects, assemblies, functions, ...) together.  Invariably something will go wrong and a programmer that worked on one or the other of the things in question will protest, "It worked on my machine!"  Yes, indeed it probably did and it probably even works by itself, but when put together with other things it doesn't play so nice.

One way to increase the reliability of a unit is to eliminate the side-effects.  This makes composing and integrating units together much easier and more robust.  Since they are side-effect free, they always work the same no matter the environment.  This is called referential transparency.

Conversely, things that are side-effect free are also easier to divide.  For a good example of this take a look at PLinq which farms out querying work to the available threads.  It divides up the work and then integrates the results: map-reduce.

But that is a discussion for another day...

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Friday, 23 Feb 2007 09:02

Last night I was searching for an audio version of Hackers and Painters by Paul Graham.  Pretty soon I had completely forgotten about the book and found myself reading the Wikipedia article about Hackers.  Isn't Internet search great?

Of all of the things in the article, the one thing that captured my attention was the ASCII picture of Adrian Lamo.  I immediately thought, "How cool is that!?"  So instead of popping off my current stack frame and returning to searching for Hackers and Painters, I began thinking about how to create such a picture.  I certainly did not want to create it by hand.  I didn't even want to pick which characters or colors would be printed to the screen.  So accordingly, I began thinking of writing a program to do it for me.

I grabbed the latest bits and began writing code.  A few hours later, I had a working program with a few bells and whistles too.

ASCII Art

Here is an example of converting Gandalf from pixels to ASCII:

How to Convert an Image to ASCII Art

The process works like this:

Determine how big of an area will be used for displaying the ASCII art.  By default, the width of the image will be the width of the console's buffer and the height will be computed so as to preserve the ratio of width to height from the original image.

Now divide the original image into as many rectangles as will be used to display the ASCII.  For each rectangle in the original picture determine its grayscale value by averaging the grayscale values of each pixel.  The grayscale value of a pixel is:

GrayScale = .3 * Red + .59 * Green + .11 * Blue
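Per pixel, that computation might look something like this (my sketch, normalizing the result to the range 0 to 1):

public static double GrayScale(this Color color)
{
  return (.3 * color.R + .59 * color.G + .11 * color.B) / 255.0;
}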

We then increase the contrast of each region by some amount (d is the luminosity of a region, contrast is any double but usually around 1.5):

public static double Contrast(this double d, double contrast)
{
  return (((d - .5) * contrast) + .5).Bound(0, 1);
}
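Bound isn't shown in this excerpt; presumably it just clamps a value to a range, something like:

public static double Bound(this double d, double min, double max)
{
  return d < min ? min : d > max ? max : d;
}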

Finally, for each region in the ASCII art we choose from an array of possible ASCII figures (sorted) by the grayscale value for that region.  The ASCII figure with the closest luminosity is picked.

The only point that I haven't covered is how the array of ASCII figures was generated.  I could have done this by hand if I wanted to, but I didn't want to.  Instead, I generate the grayscale figures each time.  Each figure consists of a character and a console color: either ConsoleColor.DarkGray, ConsoleColor.Gray, or ConsoleColor.White.  The characters are one of the alphanumeric characters or the special symbols (things like %, $, @, ...).  The program generates all of the combinations and then measures their grayscale value by drawing them to a hidden bitmap and then computing their average grayscale value.  The figures are sorted by this value and the final array of grayscale figures is built.

A Few Snippets

If you take a look at the source, you will notice that it is written rather differently than most C# code.  A lot of the code resides in extension methods and there are a number of lambdas and queries.  In fact, loops are kind of rare in the code.  For example, here is a function that returns all of the pixels in a rectangular region of a bitmap.

public static IEnumerable<Color> GetPixels(this Bitmap image, Rectangle rect)
{
  return from y in rect.Top.To(rect.Bottom)
         from x in rect.Left.To(rect.Right)
         select image.GetPixel(x, y);
}

The To function is a helpful little function that I wrote.

public static IEnumerable<int> To(this int start, int end)
{
  var diff = end - start > 0 ? 1 : -1;
  for (var current = start; current != end; current += diff)
    yield return current;
}

Of course, I also could have written Iterate and then written To in terms of that.

public static IEnumerable<T> Iterate<T>(T initialValue, Func<T, bool> predicate, Func<T, T> next)
{
  for (var current = initialValue; predicate(current); current = next(current))
    yield return current;
}

public static IEnumerable<int> To(this int start, int end)
{
  var diff = end - start > 0 ? 1 : -1;
  return Iterate(start, x => x != end, x => x + diff);
}

Another interesting part is setting up the display and then restoring the original display properties of the console.

using (SetupConsole())

  ...

Where SetupConsole is defined as:

static ActionDisposable SetupConsole()
{
  var originalBackgroundColor = Console.BackgroundColor;
  var originalForegroundColor = Console.ForegroundColor;
  return new ActionDisposable(() =>
    {
      Console.BackgroundColor = originalBackgroundColor;
      Console.ForegroundColor = originalForegroundColor;
    });
}
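ActionDisposable isn't shown here either; presumably it simply runs the supplied action when it is disposed.  A minimal sketch:

class ActionDisposable : IDisposable
{
  readonly Action action;

  public ActionDisposable(Action action)
  {
    this.action = action;
  }

  public void Dispose()
  {
    action();
  }
}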

Notice that the original state of the console is captured only in the SetupConsole method and does not clutter the rest of the code.  The closure that is created contains all of the necessary information to restore the console to its original state.  This way we hide data even from private members of the same class.  If they don't need to know, then they don't need to know.  Instead only a means for restoring the original state is provided.

I'm sure that the program can be improved a lot.  So if you have any comments, suggestions, or bugs then just post a comment.  Enjoy!

Update:

I took the comments as well as some of my own ideas and improved the ASCII art generator.  I also modified the How to Convert an Image to ASCII Art section to reflect some changes.  Thank you to everyone that contributed.

Download the Source

AsciiPictureSource.zip

AsciiPictureSourcev1.1.zip

Author: "wesdyer" Tags: "Programming, C#"
Date: Wednesday, 14 Feb 2007 00:11

It seems that I riled some people up with my blog post yesterday.  After some thought, I think the primary reason that there was some backlash is because some people feel that I violated one of the sacred principles of FP:  lists are *the* data structure.  Well, let me set the matter straight.  I love lists.  Especially the singly linked immutable kind that are found in many functional languages.  Furthermore, Microsoft is not trying to launch some massive marketing campaign to sell IEnumerable<T> as the *new* list.  So to be clear, I talked about IEnumerable<T> yesterday because people who have used functional languages will wonder where the lists are in C# 3.0.  IEnumerable<T> is not the same as a list, but there are many similarities between IEnumerable<T> and lazy lists that I pointed out yesterday when I showed they are isomorphic by describing the bijective mapping between them.

Furthermore, it is the case that some data structures are generally better than others.  But there are tradeoffs between using the various data structures.  Also, *not* all of the tradeoffs are captured by looking exclusively at their time complexity for some operation.  There is of course the space complexity and there is the complexity of implementation itself.  And don't forget that often they have different characteristics for different operations.  The key point here is that a given problem will have a number of constraints and each of the competing designs has a number of trade-offs.  As an interviewer, when I notice a candidate is not aware of the trade-offs that he is making then I start to worry that they were not considered at all.  Furthermore, when I notice that a candidate seems to favor one data structure to the exclusion of others, I start to wonder how many tools are in the toolbox.  But after observing this behavior many times, I am convinced that it often is not the coding problem that is driving usage of some data structure but the mode of thinking that the candidate employs.

One more thing before I continue on.  At least one reader wondered why I said the following:

"Where most programmers who are accustomed to imperative style would naturally use an array, a variable, or a mutable object, a functional programmer will often use a list."

Yes, I meant to say exactly what I said.  In fact, when I wrote it, I paused before deciding to include it because I thought it might be misunderstood.  When I first read SICP, the most mind bending and rewarding topic was at the end of chapter 3.  The section on streams.  One of the motivations that was given for using a stream (infinite list) was that variables that change their value over time cause problems.  One way to address this was to have streams represent the state of the variable over time where each element of the stream represents a state of the variable at some point in time.

So without further ado, let's take a look at streams...

What we want is an infinite list.  The problem is that an infinite list can never actually be fully realized because of its infinite nature.  So if we attempt to realize an infinite list, either the computer will enter an infinite loop or it will run out of resources (stack overflow, out of memory, etc.).

We can overcome this problem by having lazy lists.  Lists where the next element in the list is not realized until it is needed.  Yesterday, I presented one such lazy list which uses an enumerator to realize the next element.  Today, I present another which has a more intriguing definition.

class Stream<T> : IList<T>
{
  Func<IList<T>> next;
  T value;

  public Stream(T value, Func<IList<T>> next)
  {
    this.value = value;
    this.next = next;
  }

  public IList<T> Next { get { return next(); } }
  public T Value { get { return value; } }
}

This lazy list is very similar to a normal list.  The only difference is that instead of taking a list as the value of the next node in the list, it takes a function which will evaluate to the next node in the list.  But this difference is critical.

The first difference can easily be seen by imagining a list type, ErrorList, that throws an exception when constructed.

class ErrorList<T> : IList<T>
{
  public ErrorList()   { throw new Exception(); }
  public IList<T> Next { get { throw new Exception(); } }
  public T Value       { get { throw new Exception(); } }
}

Now consider the following code (List<T> here is the eager list from yesterday's post, whose constructor takes a value and the next node):

var list = new List<int>(1, new ErrorList<int>());           // error, exception thrown
var stream = new Stream<int>(1, () => new ErrorList<int>()); // no error

An exception is thrown when the list is constructed but when the stream is constructed no exception is thrown.  Unless the Next property is evaluated on the stream, there will never be an exception.

The second difference can be seen in the following code:

IList<BigInteger> onesList = null;
onesList = new List<BigInteger>(1, onesList);
IList<BigInteger> onesStream = null;
onesStream = new Stream<BigInteger>(1, () => onesStream);

If you try to list all of the values in onesList, then you will notice that it only contains one value.  Whereas onesStream contains an infinite number of values.  The reason is that when we constructed onesList, we passed in onesList but onesList had the value null at the time so the next node was set to null.  But in the stream case we passed in a function that would be evaluated sometime in the future.  By the time that we evaluate it, it will return the proper value of onesStream.
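A quick check makes the difference concrete (using the definitions above):

Console.WriteLine(onesList.Next == null);            // True - the list stops after one node
Console.WriteLine(onesStream.Next.Next.Next.Value);  // 1 - the stream keeps producing nodes on demand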

A third difference is found in the performance of the two lists.  With lazy lists, parts of the list that are never realized are never paid for.  So it is a pay-as-you-go model as opposed to paying everything up front.  Furthermore, less space can be required since the whole list is not necessarily held in memory at once but only a description of the process that can compute each element as it is required.

So now we have this infinite stream of ones, but can we do anything more interesting?  Sure.  First, let's define a zip function between lists.

  public static IList<V> Zip<T,U,V>(Func<T,U,V> f, IList<T> list1, IList<U> list2)
  {
    if (list1 == null || list2 == null)
      return null;
    return new Stream<V>(f(list1.Value, list2.Value), () => Zip<T,U,V>(f, list1.Next, list2.Next));
  }

Yes, that is somewhat of a mouthful.  Here is what it does.  It takes two lists where the lists may differ from each other in what kind of elements they contain.  It also takes a function from the type of the first list's elements and the type of the second list's elements and returns possibly a new type.  Then if either of the lists is empty we return an empty list (null).  But if both lists are non-empty then we return a stream where the first element is the application of the given function to the first element of both of the lists and the rest of the list will be the evaluation of Zip on the rest of the two lists.  It is important that Zip uses a stream because these lists may be infinite and if we try to immediately evaluate the entire list we will run out of resources.

Now that we have Zip, let's put it to use to define the natural numbers.

IList<BigInteger> ones = null;
ones = new Stream<BigInteger>(1, () => ones);
IList<BigInteger> natural = null;
natural = new Stream<BigInteger>(0, () => Zip((x, y) => x + y, natural, ones));

So what we are saying is that the natural numbers begin with zero and then are followed by the sum of the first natural number with the first element of ones (0 + 1).  The second natural number is the sum of the second element of natural numbers with the second element of ones (1 + 1) and so on.  This works because each element of natural numbers is defined only in terms of the elements of natural numbers that occur previous to it.

So now we can easily define the odd numbers (2k + 1) and even numbers (2k).  But first we need a map function for lists.

public static IList<U> Map<T, U>(Func<T, U> f, IList<T> list)
{
  if (list == null)
    return null;
  return new Stream<U>(f(list.Value), () => Map(f, list.Next));
}

Now here are the definitions of odd and even numbers.

IList<BigInteger> odds = Map(x => 2 * x + 1, natural);
IList<BigInteger> evens = Map(x => 2 * x, natural);

We can also define the fibonacci sequence as a stream.

IList<BigInteger> fibs = null;
fibs = new List<BigInteger>(0, new Stream<BigInteger>(1, () => Zip((x, y) => x + y, fibs, fibs.Next)));

What we are saying here is that the first fibonacci number is zero and the second is one but then the next one is the sum of the first number and the second number and so on.  This is similar to the natural numbers definition which used itself to compute itself.

If you try it out, you will also notice that it isn't very efficient.  This is because we are back to our exponential time complexity.  But this can easily be remedied by memoizing the next function in the constructor to the stream:

  public Stream(T value, Func<IList<T>> next)
  {
    this.value = value;
    this.next = next.Memoize();
  }

Now it has linear time complexity by trading off some space (linear as well).

Let's finish by solving the famous problem of producing the Hamming numbers.  The problem is to list, in ascending order, all of the positive integers that have no prime factors other than 2, 3, or 5.  The first ten Hamming numbers are:

1, 2, 3, 4, 5, 6, 8, 9, 10, 12

This problem is notoriously difficult without lazy evaluation and is used to demonstrate the power of laziness.

To solve this problem, first note that the first Hamming number is 1.  Then if h is a Hamming number, so are 2h, 3h, and 5h.  So we can define three streams which map the Hamming numbers to 2h, 3h, and 5h respectively.  The only remaining requirement is that they must be in order.  We can maintain this invariant by defining a function named Merge:

public static IList<T> Merge<T>(IList<T> list1, IList<T> list2) where T : IComparable<T>
{
  if (list1 == null)
    return list2;
  else if (list2 == null)
    return list1;
  int c = list1.Value.CompareTo(list2.Value);
  if (c < 0)
    return new Stream<T>(list1.Value, () => Merge(list1.Next, list2));
  else if (c > 0)
    return new Stream<T>(list2.Value, () => Merge(list1, list2.Next));
  else
    return new Stream<T>(list1.Value, () => Merge(list1.Next, list2.Next));
}

Now we are ready to define the Hamming numbers.  Notice how close the definition in the code is to our description:

IList<BigInteger> hamming = null;
hamming = new Stream<BigInteger>(1, () => Merge(Map(x => x * 2, hamming), Merge(Map(x => x * 3, hamming), Map(x => x * 5, hamming))));
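To check the definition against the list above, we can walk the stream and print the first ten values:

var h = hamming;
for (int i = 0; i < 10; i++)
{
  Console.Write("{0} ", h.Value);   // prints: 1 2 3 4 5 6 8 9 10 12
  h = h.Next;
}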

Now for fun, try to think of how to do it without lazy evaluation.

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Monday, 12 Feb 2007 22:41

One of the things that I have noticed when participating in interviews with potential candidates is that most candidates have a favorite data structure.  Presumably this is the data structure that they feel the most comfortable with.  The problem is that no matter what coding question I ask them, they see the answer through the lenses of their favorite data structure.  I have met programmers who are in love with hash tables, others who have a certain fondness for B-trees, and quite a few who adore arrays.  My feeling is that this happens not because they understand the data structure itself better than other data structures (honestly, I can't believe that the guy who loves b-trees doesn't understand what arrays are), but it is because each data structure encourages a certain way of thinking about problems.  So really the candidates are just thinking about the question in a certain way that is expressed on the outside as using some particular data structure.

One of the first things that a newcomer to functional programming will notice is that lists are all the rage.  Where most programmers who are accustomed to imperative style would naturally use an array, a variable, or a mutable object, a functional programmer will often use a list.  Why is this?

If we consider the insight that it is not the data structure itself but rather the mode of thinking that encourages the usage of the data structure, then we can make some progress on understanding this.  Lists are the simplest recursive data structure.  A list is just a node consisting of a value and the next node in the list.  A list is defined in terms of itself.

interface IList<T>
{
  IList<T> Next { get; }
  T Value { get; }
}

But notice how similar this way of thinking is to recursive functions.  For example a very common function defined over lists is Length.  This function computes the length of a given list.

public static int Length<T>(this IList<T> list)
{
  if (list == null)
    return 0;
  return 1 + Length(list.Next);
}

Length is a function that is defined in terms of itself.  Like the definition of list, Length has a value (either the 0 or the 1 +) and it has a next (the recursive call to Length(list.Next)).  So one reason why lists are so popular in functional programming is that they encourage a recursive way of thinking about problems.

In C# 3.0, lists are not used nearly as commonly as IEnumerable<T>.  In fact, IEnumerable<T> is used so much that it sticks out in the same way the usage of lists in functional programming languages sticks out.  The primary Linq provider that will be shipped is Linq to Objects, which requires that the source of a query implement IEnumerable<T>.

In some senses, IEnumerable<T> is not so very different from the definition of IList<T> shown above.  Both of them represent sequences of elements where only the current value and the next element are available at a given time.  In fact, it is relatively easy to define one in terms of the other.

Here is a definition of IEnumerable<T> in terms of IList<T>.

public static IEnumerable<T> GetEnumerator<T>(this IList<T> list)
{
  for (var current = list; current != null; current = current.Next)
    yield return current.Value;
}

And here is a corresponding definition of IList<T> in terms of IEnumerable<T> (taken from this post).

class LazyList<T> : IList<T>
{
  IList<T> next;
  T value;
  IEnumerator<T> enumerator;

  LazyList(T value, IEnumerator<T> enumerator)
  {
    this.value = value;
    this.enumerator = enumerator;
  }

  public static IList<T> Create(IEnumerable<T> enumerable)
  {
    return Create(enumerable.GetEnumerator());
  }

  public static IList<T> Create(IEnumerator<T> enumerator)
  {
    if (enumerator.MoveNext())
      return new LazyList<T>(enumerator.Current, enumerator);
    return null;
  }

  public IList<T> Next
  {
    get
    {
      if (enumerator != null)
      {
        next = Create(enumerator);
        enumerator = null;
      }
      return next;
    }
  }

  public T Value { get { return value; } }
}

But it should be noted that there is a critical difference between both of these definitions and the following definition of List<T>.

class List<T> : IList<T>
{
  T value;
  IList<T> next;

  public List(T value, IList<T> next)
  {
    this.value = value;
    this.next = next;
  }

  public IList<T> Next { get { return next; } }
  public T Value { get { return value; } }

  public static IList<T> Empty { get { return null; } }
}

Did you notice what it is?  The previous definitions were lazy but this definition is not.  The iterator did not describe the data itself but rather a process for sequencing data that will be executed possibly sometime in the future.  The LazyList also does not actually realize an entire list.  It describes a way to construct the next element of a list on demand.  But the definition of List does not defer anything to the future.  Now the question is, does it matter?

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Sunday, 11 Feb 2007 09:49

Baby Names

I recently finished reading Freakonomics.  It is a fascinating book about a number of strange phenomena.  Its topics range from the economics of dealing crack to cheating in sumo wrestling.  Among the sundry topics is a discussion concerning the psychology and sociology underlying baby names.

This topic has interested me ever since I found out that my wife and I gave our first two children some of the most popular names for their birth year.  We did not intentionally look for popular names, but we picked them anyway.  I always wondered how it was that I picked with the crowd.  I had theories but I didn't have any data.

So when I heard Levitt and Dubner's theory about why some baby names are so popular, I was naturally very curious.  Their hypothesis is that affluent and successful families set the trend (not celebrities).  Some baby names begin to take hold in affluent circles.  Then when less successful parents are looking for a baby name, they choose the names of children of privilege and opportunity.  Thus the name continues to extend its influence from one family to another.  Eventually, everyone is giving their child the name (often misspelling it).  Finally as more and more of the common folk use the name, the elitists stop using the name.

Their theory seemed probable enough and they had some good data to back it up, but I had my doubts.  I certainly didn't feel like I picked my son's name because I thought some other child's opportunities would rub off on him.

Later on, I was in need of an app to test Linq to Objects query performance over a large dataset.  Unfortunately, we didn't have a suitable app and I didn't have a lot of time.  Furthermore, I didn't have a large easily accessible in-memory dataset to work with.  So I decided to write a quick little app to screen scrape the Social Security Administration's Popular Baby Names site.  The app pulled down the top one hundred most popular names for every year in every state by gender since the year 1960.  I ended up with 40 megabytes of XML where each element looked something like this:

<PopularName state="Alaska" year="1960" rank="5" gender="Female" name="Susan" count="50" />

I then wrote an app that loaded all of the data into memory.  Each XML element became a PopularName object which has a property for each attribute in the XML.  These names were stored in a local variable called names of type List<PopularName>.

I then wrote a number of queries against the baby name dataset.  One of the queries shows the number of children named that name by year.  This query is run by calling NameUsage and passing in names and the name to use in the query.

NameUsage(names, "Wesley");

Where the body of the method looks like:

static void NameUsage(IEnumerable<PopularName> names, string searchName)
{
  Console.WriteLine("{0} Usage", searchName);
  var q = from name in names
          where name.Name == searchName
          group new { name.Name, name.Count } by name.Year
            into g
            orderby g.Key ascending
            select new { Year = g.Key, TotalCount = g.Sum(x => x.Count) };

  foreach (var item in q) Console.WriteLine(item);
}

This particular query displays:

Wesley Usage
{ Year = 1960, TotalCount = 107 }
...
{ Year = 2005, TotalCount = 159 }

Here is a graph of the data for the usage of the name "Wesley".

So it seems that my parents were victims of their time as well.  But is it only me and my children?

Apparently not.  My wife was also given her name during its period of popularity.  Note that these names are not necessarily really popular names.  Even so, the giving of various names seems to ebb and flow.  It is fascinating to think about how this behavior emerges from the vast array of parents seeking the best name for their child.

 

Nameless Keys

Another one of the queries that I wrote listed the most popular names overall.  I wanted to distinguish names by gender usage (Terry, Pam, ...) but how can we do that with queries?

What I really want is to make the equality of the names based on the name itself and the gender usage.  In C# 3.0, we added anonymous types.  These neat little guys are very useful and one of the ways that they are the most useful is as composite keys.

Anonymous types have structural equality semantics within an assembly.  This means that if two anonymous types are created with the same property names in the same order and of the same type then they will be the same type and if the values are the same then the two instances will be equal and have the same hashcode.
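For example, two separately created instances with the same shape and the same values compare equal and hash identically:

var key1 = new { Name = "Terry", Gender = "Male" };
var key2 = new { Name = "Terry", Gender = "Male" };

Console.WriteLine(key1.Equals(key2));                         // True
Console.WriteLine(key1.GetHashCode() == key2.GetHashCode());  // True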

We can use these facts to write queries which define equality between objects on several values.  In this case equality depends on the name and the gender.  So in the group...by clause we will group not on the name but on an anonymous type with members corresponding to the name and to the gender of the item.

static void TopNames(List<PopularName> names, int count)
{
  Console.WriteLine("Top {0} Names", count);
  var q = (from name in names
           group name.Count by new { name.Name, name.Gender }
             into g
             let TotalCount = g.Sum()
             orderby TotalCount descending
             select new { g.Key.Name, g.Key.Gender, TotalCount })
          .Take(count)
          .Select((x, rank) => new { Rank = rank + 1, x.Name, x.Gender, x.TotalCount });

  foreach (var item in q) Console.WriteLine(item);
}

Anonymous types can be used as composite keys in other query clauses such as join and orderby.  We can also use them to add multi-argument support to our memoize function.  We will use them as multi-argument keys in the map contained in the memoization function.

public static Func<A, B, R> Memoize<A, B, R>(this Func<A, B, R> f)
{
  var map = new Dictionary<???,R>();
  return (a, b) =>
  {
    var tuple = new { a, b };
    R value;
    if (map.TryGetValue(tuple, out value))
      return value;
    value = f(a, b);
    map.Add(tuple, value);
    return value;
  };
}

Memoize takes a function of two arguments of types A and B respectively.  It also returns a function of two arguments of types A and B.  What is different is that when the two arguments are passed into the lambda, they are put into a tuple and the map is checked to see whether that tuple is already present.  Essentially, we are using the anonymous type to form a composite key of the two arguments passed to the lambda.

Mumbling

But how can we create a Dictionary from an anonymous type to type R?  While it is easy to specify the type of map using the contextual keyword var even though the type doesn't have a speakable name, it isn't obvious how to specify the type parameters to the Dictionary constructor when we want to instantiate the type to an anonymous type.

We can get around this problem by introducing a new helper class.

static class DictionaryHelper<Value>
{
  public static Dictionary<Key, Value> Create<Key>(Key prototype)
  {
    return new Dictionary<Key, Value>();
  }
}

Here we put the type parameters that we can name (Value in this case) on the helper class.  Then we create a static method in the helper class that takes the remaining type parameters (Key in this case) but also takes one parameter of the corresponding type for each of those type parameters.  This is so we can pass in arguments that type inference will use to infer the unspeakable type parameters.  We therefore do not need to specify these types.

This means that we can replace the map creation in Memoize with the following code.

var map = DictionaryHelper<R>.Create(new { a = default(A), b = default(B) });

We specify one of the type parameters (R) of the Dictionary explicitly, but we specify the other (an anonymous type) implicitly by providing an example of the type.

I love using anonymous types as composite keys because they define equality and hashing semantics in terms of their members.  So next time you need a composite key, try using anonymous types.

In any case, now that my wife and I are expecting our third child, I have been writing a number of queries against this dataset to understand the ebb and flow of baby names.

Author: "wesdyer" Tags: "C#, Personal, Anonymous Types"
Date: Monday, 05 Feb 2007 21:12

Keith Farmer brought it to my attention that there is at least a little confusion about how closures work.  Hopefully, I can help shed a little light on the subject.  The question is why doesn't the following code actually memoize fib in the call to Test?

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
Test(fib.Memoize());

Before I explain why this code doesn't work, I want to return to the example that I used in my last post.

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
Func<int, int> fibCopy = fib;
Console.WriteLine(fib(6));                      // displays 8
Console.WriteLine(fibCopy(6));                  // displays 8
fib = n => n * 2;
Console.WriteLine(fib(6));                      // displays 12
Console.WriteLine(fibCopy(6));                  // displays 18

Probably the easiest way to see why this code behaves so strangely is to show what code the C# compiler generates.  This can easily be done with ildasm.exe or reflector.  There are two lambdas defined in the code.  The first one captures the local variable named fib and the second does not capture any locals.  Because fib is captured by a lambda, it must be hoisted onto the heap.  So a class is declared that contains the captured local.

class DisplayClass
{
  public Func<int, int> fib;
}

Now the compiler must emit a method that corresponds to the first lambda.  This method will need access to the members of DisplayClass and must also be convertible to a Func<int,int> delegate.  Today the compiler achieves this by emitting the method on the display class as an instance method (we could also have used currying but that is a story for another day).

class DisplayClass
{
  public Func<int, int> fib;
  public int M1(int n)
  {
    return n > 1 ? fib(n - 1) + fib(n - 2) : n; // <-- 1st lambda body
  }
}

The second lambda does not capture any local variables.  So this method can be emitted as a static method.

static int M2(int n)
{
  return n * 2;
}

Finally, we are ready to show the emitted code for the original code fragment.

DisplayClass locals = new DisplayClass();
locals.fib = null;
locals.fib = locals.M1;
Func<int, int> fibCopy = locals.fib;
Console.WriteLine(locals.fib(6));       // displays 8
Console.WriteLine(fibCopy(6));          // displays 8
locals.fib = M2;
Console.WriteLine(locals.fib(6));       // displays 12
Console.WriteLine(fibCopy(6));          // displays 18

Notice how the first call to fib is really just a call to M1 and when M1 "recurses" on fib it just ends up calling M1 again because that is what is assigned to fib.  The call to fibCopy is a little trickier because the original call is really a call to M1 as well but when it "recurses" it invokes fib instead of fibCopy which also happens to be M1 at the time.  So the first two calls behave as expected.

Now it starts to get a little strange.  First we assign M2 to fib.  Then when we invoke fib on the next line, it doesn't invoke our original "fib" function M1 anymore but instead invokes M2.  This of course displays the result of multiplying 6 by 2.

Now for the strangest part, we invoke fibCopy.  But fibCopy actually still references M1 and not M2 and since 6 > 1 then it "recurses" by invoking fib twice with 5 and 4 respectively and then summing the results.  But the calls to fib actually invoke M2 now.  So the results of the calls to fib are 10 and 8 which then are summed to produce 18.

Now let's return to our original problem.

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
Test(fib.Memoize());

Notice that the function passed to Test is memoized but the body of the function still calls the unmemoized fib.  The function itself is memoized but all of the "recursive" calls are not.  So it probably will not behave as intended.  This can be corrected by doing the following.

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
fib = fib.Memoize();
Test(fib);

Now the calls to fib when n > 1 are made to the memoized fib delegate.

If you read my last post on anonymous recursion in C# which introduces the Y combinator then you may have tried the following.

Func<int, int> fib = Y<int, int>(f => n => n > 1 ? f(n - 1) + f(n - 2) : n).Memoize();

This behaves very similarly to the incorrectly written fib above.  Calls to the function itself are memoized but all of the recursive calls are not.  This is because fib is memoized but f is not.  Erik Meijer pointed me to a fantastic paper that discusses how to memoize functions that are the fixed point of functionals.  In this paper, Cook and Launchbury present a function written in Haskell that enables memoized anonymous recursive functions.

memoFix f = let g = f h
                h = memo g
            in h

Using the MemoizeFix function in C# looks very similar to using the Y combinator, but instead of just returning a recursive function, it would also memoize the function and all the recursive calls.

Func<int, int> memoFib = MemoizeFix<int, int>(f => n => n > 1 ? f(n - 1) + f(n - 2) : n);

But there are two problems with implementing the function in C#.  First, Haskell lets programmers write mutually recursive definitions (g is defined in terms of h and h is defined in terms g).  Second, Haskell is a lazy language.  A straightforward implementation in C# will either produce a null reference exception or a stack-overflow because of the eager evaluation and mutually recursive definitions.  Fortunately, lambdas can be used to solve both problems.

static Func<A, R> MemoizeFix<A, R>(this Func<Func<A, R>, Func<A, R>> f)
{
  Func<A, R> g = null;
  Func<A, R> h = null;
  g = a => f(h)(a);
  h = g.Memoize();
  return h;
}

Here we define both g and h before they are assigned their final values.  Then instead of assigning f(h) to g which would immediately invoke f causing a Null Reference exception, we assign a lambda to g which will be evaluated sometime in the future when h has a non-null value.  The rest of the function is very similar to the Haskell version.

Notice what happens when memoFib is invoked.  This will actually invoke h which was defined in MemoizeFix, but h is really just a memoized version of g.  Since this is the first invocation of h, there will be no precomputed value, so g will be invoked.  But when g is invoked, it actually invokes f passing h (memoized g) as the function to use for recursion.  This is exactly what we want.  Beautiful isn't it?

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Friday, 02 Feb 2007 22:17

Recursion is beautiful and lambdas are the ultimate abstraction.  But how can they be used together?  Lambdas are anonymous functions and recursion requires names.  Let's try to define a lambda that computes the nth fibonacci number.

Func<int, int> fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;

But this doesn't work.  The compiler will complain about the use of fib in the lambda.

Use of unassigned local variable 'fib'

Eric Lippert has a great blog post on this issue.  The problem is that the right hand side is evaluated before fib is definitely assigned.  In this case, the compiler could potentially deduce (if the language spec allowed it) that fib is not used before it is definitely assigned, but in other cases it might need to be used before fib is assigned.

A quick workaround is to assign the value null to fib and then assign the lambda to fib.  This causes fib to be definitely assigned before it is used.

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
Console.WriteLine(fib(6));                        // displays 8

In fact, letrec in Scheme does something very similar.

But our C# workaround doesn't really use recursion.  Recursion requires that a function calls itself.  The fib function really just invokes the delegate that the local variable fib references.  It may seem that this is just nit picking about words, but there is a difference.  For example, consider the following code:

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
Func<int, int> fibCopy = fib;
Console.WriteLine(fib(6));                        // displays 8
Console.WriteLine(fibCopy(6));                    // displays 8
fib = n => n * 2;
Console.WriteLine(fib(6));                        // displays 12
Console.WriteLine(fibCopy(6));                    // displays 18

Huh!?  Notice how the result of calling fib changes and that the result of calling fibCopy differs even from the result of calling fib!  (See if you can figure out why)

We can stop this kind of craziness by passing in the function that will be used for the recursive call.  So the lambda looks the same except that it takes an extra parameter f that is called instead of fib.  When a call to f is made, we need to pass f to itself as the first argument.

(f, n) => n > 1 ? f(f,n - 1) + f(f,n - 2) : n

Unfortunately, not all is done yet.  In order to convert the lambda to a delegate, we need a delegate type.  Although it is clear what type fib should be, it may not be apparent what should be the type of our new delegate.  So let's start with the type of fib, Func<int,int>.  We do know that the return type of the delegate should be the same, so it must be int.  We also know that the second argument's type should be the same which is also int.  As for the type of the first argument, we will be passing in a delegate which will then be called with the same arguments as the delegate we are defining.  But wait!  That is recursion.  We are defining the delegate type in terms of itself.  Therefore, the first parameter will be of the same type that the delegate we are defining is.

delegate int Recursive(Recursive r, int n);

The delegate can be generalized by parameterizing the type of the first argument and the return type.

delegate R Recursive<A,R>(Recursive<A,R> r, A a);

Now we can use the lambda that we defined.

Recursive<int, int> fib = (f, n) => n > 1 ? f(f,n - 1) + f(f,n - 2) : n;
Console.WriteLine(fib(fib,6));                      // displays 8

While this is an improvement, neither the lambda nor its application look as nice as the original code.  The fact that the first argument should be the delegate itself seems a little strange.  The lambda can be curried in order to separate the passing of f to f from the mechanics of the fib function.  The innermost lambda will have the same signature as the original fib function.

f => n => n > 1 ? f(f)(n - 1) + f(f)(n - 2) : n

Now that the lambda has been curried, a new definition for the Recursive delegate is required.  It now only takes one argument which is the first argument of the previous definition, but the return type changes.  Recursive returns a function (the effect of currying it) which takes the second parameter of the original definition and returns the original return type.

delegate Func<A,R> Recursive<A,R>(Recursive<A,R> r);

Once the lambda is assigned to a Recursive delegate then we can apply the delegate to itself to get the underlying function.  Furthermore, if we made a copy of the delegate and then changed the original then none of the strange effects that happened early would occur.

Recursive<int, int> fibRec = f => n => n > 1 ? f(f)(n - 1) + f(f)(n - 2) : n;
Func<int, int> fib = fibRec(fibRec);
Console.WriteLine(fib(6));                           // displays 8

It is now possible to clean-up the duplicated self-application of f.  We can get rid of it by creating a lambda inside of the fibRec lambda which contains the essential fibonacci definition but which takes an argument that references what function should be called for recursion.  Then, inside of fibRec, we can do the self-application of f.

Recursive<int, int> fibRec = f => n =>
  {
    Func<Func<int, int>, int, int> g = (h, m) => m > 1 ? h(m - 1) + h(m - 2) : m;
    return g(f(f), n);
  };
Func<int, int> fib = fibRec(fibRec);
Console.WriteLine(fib(6));                           // displays 8

This inner lambda looks very similar to our original lambda which took two parameters.  In fact, it appears that the whole process of construction and self-application that initially happened at the top level is now happening inside of the lambda.  The only difference is that the inner lambda does not pass a reference to itself around.  This has now been abstracted out into the call to g.

We can simplify the fibRec definition by currying the inner lambda.

Recursive<int, int> fibRec = f => n =>
{
  Func<Func<int, int>, Func<int, int>> g = h => m => m > 1 ? h(m - 1) + h(m - 2) : m;
  return g(f(f))(n);
};
Func<int, int> fib = fibRec(fibRec);
Console.WriteLine(fib(6));                           // displays 8

Note that the definition of g does not depend on f or n at all.  Therefore, we can move it outside of the outer lambda.

Func<Func<int, int>, Func<int, int>> g = h => m => m > 1 ? h(m - 1) + h(m - 2) : m;
Recursive<int, int> fibRec = f => n => g(f(f))(n);
Func<int, int> fib = fibRec(fibRec);
Console.WriteLine(fib(6));                          // displays 8

Notice in the above code that g now represents our original concept of the fibonacci function while fibRec does all of the handy work to enable anonymous recursion.  The whole process of building of fibRec and then applying it to itself only requires a reference to g.  So let's move the definition of fibRec and fib to a different function named CreateFib.

static Func<int, int> CreateFib(Func<Func<int, int>, Func<int, int>> g)
{
  Recursive<int, int> fibRec = f => n => g(f(f))(n);
  return fibRec(fibRec);
}

We can now call CreateFib instead of creating fibRec.

Func<Func<int, int>, Func<int, int>> g = h => m => m > 1 ? h(m - 1) + h(m - 2) : m;
Func<int, int> fib = CreateFib(g);
Console.WriteLine(fib(6));                          // displays 8

But really our CreateFib function is useful for more than just creating fibonacci functions.  It can create any recursive function which takes one argument.  Parameterizing the types used by CreateFib leads to what is known as the Y fixed-point combinator.

static Func<A, R> Y<A, R>(Func<Func<A, R>, Func<A, R>> f)
{
  Recursive<A, R> rec = r => a => f(r(r))(a);
  return rec(rec);
}

We can now create any recursive lambda of one parameter that we want.

Func<int,int> fib = Y<int,int>(f => n => n > 1 ? f(n - 1) + f(n - 2) : n);
Func<int, int> fact = Y<int, int>(f => n => n > 1 ? n * f(n - 1) : 1);
Console.WriteLine(fib(6));                          // displays 8
Console.WriteLine(fact(6));                         // displays 720

Finally, the goal of anonymous recursion has been reached.

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Monday, 29 Jan 2007 23:36

When I first heard the term Currying, I thought immediately of tasty Thai and Indian food.  To my dismay, I found that the conversation was not about wonderful spices but rather about transforming a function that takes n arguments into a function that takes only one argument and returns a curried function of n - 1 arguments.  Why in the world would that be useful?

From a theoretical standpoint, it is interesting because it simplifies the lambda calculus to include only those functions which have at most one argument.  From a practical perspective, it allows a programmer to generate families of functions from a base function by fixing the first k arguments.  It is akin to pinning up something on the wall that requires two pins.  Before being pinned, the object is free to move anywhere on the surface; however, when the first pin is put in, the movement is constrained.  Finally, when the second pin is put in, there is no longer any freedom of movement.  Similarly, when a programmer curries a function of two arguments and applies it to the first argument, the functionality is limited by one dimension.  Finally, when he applies the new function to the second argument, a particular value is computed.

What this means in C# is if I have a delegate which is of type Func<A,B,R> (delegate with two arguments of type A and B respectively and returns type R) then I can create a delegate which is of type Func<A,Func<B,R>>.  Notice how the curried delegate only has one argument but it returns a delegate which takes the original function's second argument and finally returns a value.

Consider generating functions from the add function.

Func<int,int,int> add = (x,y) => x + y;

We can curry add by applying the Curry function to add.

Func<int,Func<int,int>> curriedAdd = add.Curry();

This curried add function is really a function that creates functions that add n where n is the argument to the curried add function.  For example, we can create an increment function by applying the curried add function to the value one.

Func<int,int> inc = curriedAdd(1);

The increment function will now return one plus the value of its argument when invoked.  We can now use our three functions to do various forms of addition.

Console.WriteLine(add(3,4));            // 7
Console.WriteLine(curriedAdd(3)(5));    // 8
Console.WriteLine(inc(2));              // 3

So how would this function Curry look?  It's really pretty simple.

public static Func<A, Func<B, R>> Curry<A, B, R>(this Func<A, B, R> f)
{
  return a => b => f(a, b);
}

It just takes a function of two arguments and then returns a lambda which fixes the first argument and then the second argument.  Once both arguments have been provided, it evaluates the original function with the arguments.  It is easy to follow the same pattern and create a Curry function for functions of other arities.
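For example, following the same pattern, a version for three-argument functions might look like this (a sketch, not part of the library code shown above):

public static Func<A, Func<B, Func<C, R>>> Curry<A, B, C, R>(this Func<A, B, C, R> f)
{
  // Fix the arguments one at a time: first a, then b, then c.
  return a => b => c => f(a, b, c);
}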

Let's examine what went on when we created each of the functions.  First we created a function called add which looked like:

(x, y) => x + y

 Once we curried add, the function became:

x => y => x + y

We created inc by calling curried add with the value 1.  This essentially created the following function:

y => 1 + y

The idea of currying a function and then fixing the first n arguments of the original function can be generalized into a concept called partial function application.  For instance, if we consider our add function from the previous example, then we can directly create the increment function from add without having to create curriedAdd first.

Func<int,int> inc = add.Partial(1);

Where the Partial function is written as:

public static Func<B, R> Partial<A, B, R>(this Func<A, B, R> f, A a)
{
  return b => f(a, b);
}

Notice how the function takes a function and a value which has the same type as the first argument of the function.  It then returns a function which takes the remaining arguments and then applies all of the arguments to the original function.  This can be generalized into a set of functions that produce partially applied functions.
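For instance, a sketch of the three-argument overloads might look like this (fixing the first argument, or the first two; these are illustrative and not necessarily how the library defines them):

public static Func<B, C, R> Partial<A, B, C, R>(this Func<A, B, C, R> f, A a)
{
  // Fix the first argument; the remaining two are supplied later.
  return (b, c) => f(a, b, c);
}

public static Func<C, R> Partial<A, B, C, R>(this Func<A, B, C, R> f, A a, B b)
{
  // Fix the first two arguments; only the last remains.
  return c => f(a, b, c);
}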

Author: "wesdyer" Tags: "C#, Functional Programming"
Date: Friday, 26 Jan 2007 04:44

One of my favorite pastimes is playing games.  No not XBox 360, PS3, or Wii games nor other computer games, but board games, card games, and other such games.  It's probably because I'm from a large family - I have 8 siblings - and we would often spend time together playing games.  It is a good way to accommodate a wide variety of interests, skills, and aptitudes.  Some of my favorite games are Settlers of Catan, Carcassonne, Spades, Rook, Mafia, Take Two, Stratego, Ticket to Ride, and of course Axis and Allies.

For my birthday this past year, I received an interesting board game titled RoboRally.  The premise of the game is there are a number of robots in a factory and they are bored, so they begin to indulge themselves by competing in races around the factory floor.  Each player takes command of a robot and seeks to win these competitions by controlling the robot.  The players are dealt a number of instruction cards whereupon each player elects five cards to load into his program registers.  After all the players have programmed their robots, their programs are executed somewhat simultaneously.  The robots can interact with each other by shooting, pushing, or otherwise hampering each other's progress.  But the real catch is that the board also interacts with the robots by way of conveyer belts, lasers, pushers, gears, pits, etc.  The uncertainty caused by the sheer number of interactions coupled with the deterministic behavior of programmed robots creates pandemonium.

One day I was just itching to write some code so I thought why not implement a computer version of RoboRally.  I thought that it would be particularly fun because of the number of interacting objects and the amount of state that is involved.  So I began to code away on a board editor.  This took a little bit of time mostly because I'm not very skilled at graphic art.  Here is a screenshot from an early version of the board editor.

The code for the editor consists of three assemblies: the windows application, a game engine, and a functional programming library.  This was one of the first non-trivial apps that I wrote using C# 3.0 and it was very fun to use the new language features.  In fact, it was incredible because I found that many design patterns that I commonly used changed drastically or were no longer needed.  Furthermore, I found that entirely new designs were possible.  Today, I will touch on just one of these design patterns and the underlying principle used in it.

The game engine has a class called Board and these boards are composed of many Tiles which are each associated with a Location.  Tiles can be simple or rather complex.  For instance, there are blank tiles and there are move left express conveyer belt tiles.  But there are also gear tiles with a normal wall on the top side and a laser wall on the left side which also have a flag on the tile.  I made a design decision early on to have basic tiles and decorator tiles.  So complex tiles are formed from the composition of a basic tile and any number of decorator tiles.  I wrote a factory which creates either some basic tile or some decorator tile.  For a first cut, I didn't worry that I created an overabundance of blank tiles even if they are all identical.  Nor did I worry about the fact that all move right conveyer tiles are the same.  However, later as I was refining the design I came back to the factory and wanted to improve it by reducing the number of tiles that were created to the bare minimum.  A typical solution to this problem for the blank tiles would look like this:

IBlankTile blankTile;
public IBlankTile CreateBlankTile()
{
  if (blankTile == null)
    blankTile = new BlankTile();
  return blankTile;
}

But how do we do this for conveyer tiles?

Dictionary<Direction, IConveyerTile> conveyerTiles = new Dictionary<Direction, IConveyerTile>();
public IConveyerTile CreateConveyerTile(Direction direction)
{
  IConveyerTile result;
  if (!conveyerTiles.TryGetValue(direction, out result))
  {
    result = new ConveyerTile(direction);
    conveyerTiles.Add(direction, result);
  }
  return result;
}

Ick!  It only gets even more complicated for tiles that are parameterized on more items.  Furthermore, it seems that there is a lot of repetition going on.  Specifically, all of the machinery to track instance existence and creation is being duplicated for each tile type.  To some extent the Singleton and Multiton patterns also suffer from the same problem.  There must be a better way.

In fact, there is a better way.  What we really want here is a function that returns the same value each time it is applied to the same parameter values.  Functional programmers will instantly recognize this as function memoization.

Memoization can be implemented as a function which takes a function as a parameter and returns a new function which when invoked the first time on some set of parameters will compute the result and when invoked with the same values will not recompute the result but will instead return the previously computed result.

Now let's see how our code would have looked if we had a static method called Memoize in a functional library called Fun (why not? functional programming is fun, isn't it?).

public readonly Func<IBlankTile> CreateBlankTile = Fun.Memoize<IBlankTile>(() => new BlankTile());
public readonly Func<Direction,IConveyerTile> CreateConveyerTile = Fun.Memoize<Direction, IConveyerTile>(dir => new ConveyerTile(dir));

The new code is beautiful and doesn't have all of the redundancy and complexity of the old code.  This is a very practical example of the modularity of functional style code that I referred to previously.

"But wait", you say, "how exactly does the magic happen?"  Well, let's take a look.  First, let's examine a memoize function which takes a function of no arguments and returns a memoized version of that function.

public static Func<R> Memoize<R>(this Func<R> f)
{
  R value = default(R);
  bool hasValue = false;
  return () =>
    {
      if (!hasValue)
      {
        hasValue = true;
        value = f();
      }
      return value;
    };
}

Memoize takes a function which has no arguments and a return type of R and returns a function with the same signature.  When Memoize is called it creates two local variables value and hasValue. Memoize then returns a new function that returns value if hasValue is true otherwise it computes value by evaluating the parameter f, sets hasValue to true, and then returns value.  Notice that the function that is returned from Memoize accesses hasValue, value, and f.  These three variables are local to Memoize.  The three variables together with a reference to the function that is created by Memoize form a closure.

So what happens when we invoke the function returned from Memoize for the first time?  The variable hasValue will be set to false, so we will set hasValue to true and then apply f saving the result to value.  Finally, we will return the value which is the result of f.  So the memoized function will return the same thing that f would have returned on its first invocation.

But what about on subsequent invocations?  When we call memoized function again, hasValue will be true so we will simply return value without recomputing it.  So the memoized function will always return the same value as it did on the first invocation without recomputing the result through f.

This has subtle implications such as if f has side effects then those side effects will only occur when we apply the memoized function the first time.  So we need to be careful about using it.  But consider how CreateBlankTile used Memoize.  When CreateBlankTile is run the first time, it will create a blank tile and then return the result, but on subsequent invocations it will continue to return the same blank tile.  This is exactly the behavior that we want.

Can we do the same thing for functions that have arguments?  Absolutely, let's take a look at how to Memoize functions of one argument.

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
  var map = new Dictionary<A, R>();
  return a =>
    {
      R value;
      if (map.TryGetValue(a, out value))
        return value;
      value = f(a);
      map.Add(a, value);
      return value;
    };
}

This time instead of storing the computed values in a local variable directly, we store them in a dictionary and instead of marking that we have computed the value we just check for existence in the dictionary.  This can be generalized to functions of n arguments by creating composite keys of n fields that are used in the dictionary.

Consider how this Memoize function works with CreateConveyerTile.  When we invoke it with some direction for the first time then it will not find that direction in the dictionary and so it will create a new conveyer tile parameterized on that direction and then store it in the dictionary.  Subsequent calls with the same direction will not create new tiles but instead grab out the existing one from the dictionary.  Again this is the behavior that we want.  Furthermore, all of the mechanics of storing and fetching tiles are located in Memoize instead of CreateBlankTile or CreateConveyerTile.

Memoize is also useful for improving the performance of recursive functions.  Eric Lippert has a great post about how not to teach recursion where he specifically mentions not to use fibonacci numbers.  However, I'm going to use fibonacci numbers not to teach recursion but to show how to improve the performance of recursive functions.  Recall that the definition of the nth fibonacci number where n >= 0 is:

fib(0) = 0

fib(1) = 1

fib(n) = fib(n - 1) + fib(n - 2) if n > 1

We can easily write a little function to do this:

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;

The problem is this function is very inefficient.  Consider calculating the 5th fibonacci number.

fib(5)

  = fib(4) + fib(3)

  = fib(3) + fib(2) + fib(2) + fib(1)

  = fib(2) + fib(1) + fib(1) + fib(0) + fib(1) + fib(0) + 1

  = fib(1) + fib(0) + 1 + 1 + 0 + 1 + 0 + 1

  = 1 + 0 + 1 + 1 + 0 + 1 + 0 + 1

  = 5

Notice how the computation branches out forming a tree of expansions.  In fact, the number of leaves in a tree for the computation of the nth fibonacci number is fib(n + 1) where the number of ones is fib(n) and the number of zeros is fib(n - 1).

So calculating fib(5) causes fib to be evaluated fifteen times (one less than twice fib(6)), even though only six distinct calls are needed (fib(0), fib(1), ..., fib(5)).  The problem only gets worse because the fibonacci numbers grow very quickly.  Here are the first 20 fibonacci numbers:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181

So our straightforward recursive function for computing fibonacci numbers will be exponential in n.  Ouch!  But wait, we can Memoize fib so that if it calls fib again with the same argument then it will not recalculate but instead use the previous result.  This will move our exponential function to a linear one at the cost of linear space to store the previous results (we could possibly use a weak reference hash map instead to solve problems with space if they existed).  In fact, subsequent calls to fib with previously computed values will be evaluated in constant time.

Func<int, int> fib = null;
fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;
fib = fib.Memoize();

If you want, you can add a Console.WriteLine to show when the evaluations happen to both versions and compare.  It is very enlightening.
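For example, something along these lines makes the difference obvious:

Func<int, int> fib = null;
fib = n =>
  {
    Console.WriteLine("evaluating fib({0})", n);    // trace each evaluation
    return n > 1 ? fib(n - 1) + fib(n - 2) : n;
  };
fib = fib.Memoize();
fib(5);    // each of fib(0) through fib(5) is printed only once

Remove the call to Memoize and the same invocation prints the duplicate evaluations instead.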

Memoize is turning into quite the useful function.  We have used it to solve parameterized identity creation problems and performance issues.  Later, we will see other ways we can use Memoize to solve design problems.

Author: "wesdyer" Tags: "C#, Functional Programming"