Fibonacci Fun with F#

As you probably know, the Fibonacci sequence is the infinite sequence of integers where each element is the sum of the previous two (the first two elements being 0 and 1). Recently, I was inspired by a blog post, Ruby vs. Haskell – project Euler #25 deathmatch. In particular, I enjoyed the Haskell solution for its simplicity and declarativeness.

I decided to try and solve the same problem, but using F#, the functional programming language being introduced as a first class .NET citizen for the first time with Visual Studio 2010. If you have never seen F# code before, the snippets included in this post may be difficult to comprehend, especially if you are used to reading code written in imperative languages like C# or Java.

To declaratively create infinite sequences in F#, the Seq module provides the unfold function. This function takes two parameters, a generator function and an initial state. The generator function must take a state parameter and produce an option tuple with a sequence element and a new state. In F# notation, the unfold function has the signature Seq.unfold : ('State -> 'T * 'State option) -> 'State -> seq<'T>. Note that if the generator function always returns Some(_) and never None, the resulting sequence will be infinite.

An example of Seq.unfold in action is shown in the following one-liner, producing an infinite sequence of all positive integers.

let positiveIntegers = Seq.unfold (fun x -> Some(x, x + 1)) 1

In this example, the generator function takes an integer state as input. The sequence element produced by this function is the current state, while the next state is calculated by incrementing the current state. Thus, each time the generator function is called, the input integer state is one higher than the previous time. The initial state, 1, is passed as the final parameter to Seq.unfold. The result is the sequence “1, 2, 3, …, ∞” (or, strictly speaking, as far as 32 bit integers go).

So, how do we go from this sequence to the Fibonacci sequence? First of all, since each element of the Fibonacci sequence is the sum of the previous two, the state cannot consist only of a single integer. Rather, the state has to be a tuple of two integers. By choosing (0, 1) as the initial state, the generator function can use the first tuple element as sequence output and construct the next state as ([next], [current] + [next]), where [current] and [next] are the first and second element, respectively, from the current state tuple.

Translated into F# code, this yields the following definition of the Fibonacci sequence.

let fibonacci =
    Seq.unfold
        (fun (current, next) -> Some(current, (next, current + next)))
        (0, 1)

When enumerating this sequence, however, one problem becomes apparent. Element number 48 is a negative number. This is definitely erroneous behavior, as the Fibonacci sequence consists solely of positive integers. The error is due to the limited value space of 32 bit integers, causing an overflow. To circumvent this problem, we can use the BigInteger type, capable of representing integers of arbitrary size. The only change we need to make to our original Fibonacci definition is to change the initial state tuple to contain BigInteger values. The F# type inference system handles the rest.

open System.Numerics

let fibonacci =
    Seq.unfold
        (fun (current, next) -> Some(current, (next, current + next)))
        (BigInteger 0, BigInteger 1)

Now to the actual solution to Project Euler #25. Modelled after the previously mentioned Haskell solution, my solution also counts the number of elements in the Fibonacci sequence having a value less than 10999.

Again, translating this into F# results in the following code.

open System.Numerics

let limit = BigInteger.Pow(BigInteger 10, 999)

let fibonacci =
    Seq.unfold
        (fun (current, next) -> Some(current, (next, current + next)))
        (BigInteger 0, BigInteger 1)

let term =
    fibonacci
    |> Seq.takeWhile (fun n -> n < limit)
    |> Seq.length

printfn "%d" term

I am intrigued by how this functional solution focuses on what the Fibonacci sequence is, rather than how it is calculated. Constructing an infinite Fibonacci sequence in C# would typically require an iterator consisting of an infinite loop, representing state with two local variables. Counting elements having a value less than 10999, however, could easily have been accomplished in a functional manner using LINQ.

Custom Message Handling with WCF (Part 2 of 2)

In part one, I introduced the concepts behind custom message handling in WCF. This post is all about showing these principles in code. Allow me to first introduce the service I will be working with, a slightly modified version of the service from the default Visual Studio WCF template:

[ServiceContract(Namespace = "https://lookingsharp.wordpress.com/demo/")]
public interface IMyService
{
    [OperationContract]
    CompositeType GetData(CompositeType composite);
}

[DataContract(Namespace = "https://lookingsharp.wordpress.com/demo/")]
public class CompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    [DataMember]
    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    [DataMember]
    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}

My ultimate goal for this post is to communicate with this service, without deserializing the response into a CompositeType object. This requires a custom WCF client, which I prefer to base on the code generated by the Svcutil tool. Running Svcutil against the metadata exposed by the service is done as follows:

C:\temp>Svcutil http://localhost:8731/MyServiceLibrary/MyService/mex
Microsoft (R) Service Model Metadata Tool
[Microsoft (R) Windows (R) Communication Foundation, Version 3.0.4506.2152]
Copyright (c) Microsoft Corporation.  All rights reserved.

Attempting to download metadata from 'http://localhost:8731/MyServiceLibrary/MyS
ervice/mex' using WS-Metadata Exchange or DISCO.
Generating files...
C:\temp\MyService.cs
C:\temp\output.config

C:\temp>

The service definition in MyService.cs generated by Svcutil is equivalent to this code (adjusted to fit the page):

[GeneratedCode("System.ServiceModel", "3.0.0.0")]
[ServiceContract(
    Namespace="https://lookingsharp.wordpress.com/demo/", 
    ConfigurationName="IMyService")]
public interface IMyService
{
    [OperationContract(
        Action="https://lookingsharp.wordpress.com/demo/IMyService/GetData", 
        ReplyAction="https://lookingsharp.wordpress.com/demo/IMyService/GetDataResponse")]
    CompositeType GetData(CompositeType composite);
}

The ConfigurationName property of the ServiceContract attribute refers to the name of the configuration generated in the output.config file.

Now, in order to prevent WCF from deserializing the response of this service, the return type of the GetData method must be Message, and the input parameter must be of type Message or of a type tagged with the MessageContract attribute. Message contracts are explained in the MSDN article “Using Message Contracts“.

According to the metadata of our service, requests are expected to have the following format:

<GetData xmlns="https://lookingsharp.wordpress.com/demo/">
	<composite>
		<BoolValue>True</BoolValue>
		<StringValue>SomeString</StringValue>
	</composite>
</GetData>

Mapping this request format to a WCF message contract is quite straight-forward. Simply reuse most of the CompositeType class generated by Svcutil and wrap it in a request class tagged with the MessageContract attribute, like this:

[MessageContract(
    WrapperName = "GetData",
    WrapperNamespace = "https://lookingsharp.wordpress.com/demo/")]
public class GetDataRequest
{
    [MessageBodyMember(
        Name = "composite",
        Namespace = "https://lookingsharp.wordpress.com/demo/")]
    public CompositeType Request { get; set; }
}

[DataContract(
    Name = "CompositeType",
    Namespace = "https://lookingsharp.wordpress.com/demo/")]
public class CompositeType
{
    [DataMember]
    public bool BoolValue { get; set; }

    [DataMember]
    public string StringValue { get; set; }
}

I can now modify the service contract generated by Svcutil to accept GetDataRequest objects as input and produce Message objects as output. This results in the following definition:

[ServiceContract(
    Namespace = "https://lookingsharp.wordpress.com/demo/",
    ConfigurationName = "IMyService")]
public interface IMyService
{
    [OperationContract(
        Action = "https://lookingsharp.wordpress.com/demo/IMyService/GetData",
        ReplyAction = "https://lookingsharp.wordpress.com/demo/IMyService/GetDataResponse")]
    Message GetData(GetDataRequest request);
}

As for implementing the actual WCF client, I simply inherit ClientBase<IMyService> and delegate requests to base.Channel:

public class MyServiceClient : ClientBase<IMyService>, IMyService
{
    public Message GetData(GetDataRequest request)
    {
        return base.Channel.GetData(request);
    }
}

By default, this client will use the configuration specified by IMyService. I copy the relevant content of the output.config file generated by Svcutil into my project’s app.config file.

That is it. A console application’s Main method using this client could like like this:

static void Main(string[] args)
{
    var client = new MyServiceClient();
    var request = new GetDataRequest
    {
        Request = new CompositeType
        {
            BoolValue = true,
            StringValue = "Hello "
        }
    };
    var response = client.GetData(request);

    Console.WriteLine(response.ToString());
    Console.ReadLine();
}

Feel free to download the complete source code for this post. (Note that the service requires administrative privileges.)

Custom Message Handling with WCF (Part 1 of 2)

Windows Communication Foundation (WCF) is a great framework for creating service-oriented applications. It provides APIs for creating, hosting and consuming services, through text-based protocols like HTTP or binary protocols like TCP or even named pipes. Most of the time, you would conveniently rely on WCF to serialize and deserialize requests and responses on the fly, letting you focus on the business logic. However, sometimes it is necessary to take full control of the messages transferred to another endpoint, or to interpret response messages without WCF interfering with its deserialization functionality.

To take control of the messages, you need to interacts with WCF’s Message class. By default, when you add a service reference in Visual Studio, a set of classes are generated to represent the elements of the messages recognized by the service. To interact with the Message class, you need to write your own WCF client, rather than relying on Visual Studio service references. This might sound intimidating, but is actually quite easy.

With the .NET SDK comes a tool called ServiceModel Metadata Utility Tool (Svcutil.exe). This command line tool generates WCF clients based on WSDL metadata. As a matter of fact, Visual Studio relies on Svcutil to generate its service references. When creating WCF clients without the support of Visual Studio service references, I find it useful to base my code on the types generated by Svcutil.

In order to interact with the Message class, the service interfaces generated by Svcutil require some slight modifications. MSDN provides an article, “Using the Message Class“, which contains information and examples on how to modify the service interface to enable interaction with the Message class. In short, for each operation you require custom handling, replace the parameters (if any) with a single Message parameter or a single MessageContract-attributed type. Further, if it is a non-void operation, the return type must be Message.

Once the interface is modified to fit your requirements, you are ready to create your custom WCF client. This is, in fact, very easy. Creating a custom WCF client is simply a matter of inheriting ClientBase<TChannel> and delegate all method calls to base.Channel.

Consider this simple interface:

[ServiceContract(
    Namespace = "https://lookingsharp.wordpress.com/demo/",
    ConfigurationName = "IMyService")]
public interface IMyService
{
    [OperationContract(
        Action = "https://lookingsharp.wordpress.com/demo/IMyService/GetData",
        ReplyAction = "https://lookingsharp.wordpress.com/demo/IMyService/GetDataResponse")]
    Message GetData(Message request);
}

The code for the corresponding WCF client would the be:

public class MyServiceClient : ClientBase<IMyService>, IMyService
{
    public Message GetData(Message request)
    {
        return base.Channel.GetData(request);
    }
}

This client will by default use the configuration called IMyService, generated by Svcutil along with the IMyService interface. Include this generated configuration in you application’s app.config file, and tweak it as you wish.

In part two I show, step by step, how to create a WCF client that does not rely on the default WCF message deserialization, and rather prints the raw XML content of the response message to the console.

The Lazy Nature of Sequences

One of the great things about sequences is their lazy nature. As discussed in previous posts, sequences in .NET are represented by IEnumerable<T> objects, in the same manner lists are represented by IList<T> objects. Unlike lists, however, a sequence does not have to be readily available for an IEnumerable<T> object to be created. Your sequence reference is merely a handle to a generator which promises the ability to enumerate a sequence when requested. Let me illustrate this with a code example.

public static IEnumerable<string> FindCommonItems(
    IEnumerable<string> sequence1,
    IEnumerable<string> sequence2)
{
    Console.WriteLine("Iteration starting");
    foreach (string string1 in sequence1)
    {
        foreach (string string2 in sequence2)
        {
            if (string1 == string2)
            {
                Console.WriteLine("Common item found: " + string1);
                yield return string1;
            }
        }
    }
    Console.WriteLine("Iteration completed");
}

This method simply enumerates two string sequences and returns a sequence of all items which appear in both of them. To be able to determine when items are found equal and consequently added to the result sequence, some Console.WriteLine statements have been added. The lazy nature of sequences implies that calling this FindCommonItems method returns a reference to an IEnumerable<string> object without any iterations or any comparisons being made. To verify this, consider the following code example.

static void Main(string[] args)
{
    IEnumerable<string> result = FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" });

    Console.ReadKey();
}

If you run the code above, you will notice that nothing is written to the console. Not a single line of the FindCommonItems method is run. The IEnumerable<string> variable simply holds a reference to a generator which will provide the sequence if it is ever needed. In the code above, the sequence is never enumerated. Hence, its values are not needed, and no CPU cycles are wasted on iterating the arrays and comparing their items. Consider the following code, however.

static void Main(string[] args)
{
    IEnumerable<string> result = FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" });

    foreach (string item in result)
    {
        Console.WriteLine(item);
        if (Console.ReadKey().Key == ConsoleKey.Escape)
        {
            break;
        }
    }

    Console.ReadKey();
}

This time, we enumerate the result. Note that the user is free to cancel the enumeration at any time by pressing the Escape key. If the enumeration is canceled, no further iterations or comparisons between the input arrays are made. Again, no CPU cycles are wasted on performing calculations which are not needed.

To further illustrate the benefit of laziness, consider the following method.

public static IEnumerable<string> SkipShortStrings(
    IEnumerable<string> inputSequence)
{
    foreach (string input in inputSequence)
    {
        if (input.Length > 3)
        {
            yield return input;
        }
    }
}

The SkipShortStrings method shown above takes a sequence of string objects as input and produces a sequence of those string objects from the input sequence which are more than three characters long.

The sequence generator produced by this method can be combined with the sequence generator produced by the FindCommonItems method, chaining their operations as follows.

static void Main(string[] args)
{
    IEnumerable<string> result = SkipShortStrings(FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" }));

    foreach (string item in result)
    {
        Console.WriteLine(item);
        break;
    }
    Console.ReadKey();
}

When running this program, the calculation of the first result from SkipShortStrings only involves fetching two elements from the sequence produced by FindCommonItems (the first element being discarded as too short). An illustration of the execution of this program is shown below. Once again, no items become part of the result until they are actually requested.

Chaining Sequence Generators

The easiest way for any .NET 3.5 developer to take advantage of the lazy nature of sequences is to start using the static methods in the class System.Linq.Enumerable, collectively known as LINQ to objects. These are extension methods which operate on sequences and consequently benefit from their laziness. Consider this simple example.

static void Main(string[] args)
{
    IEnumerable<string> result =
        (new[] { "One", "Two", "Three", "Four" })
        .Where(item => item.StartsWith("T"))
        .Select(item => item.ToUpper());

    foreach (string item in result)
    {
        Console.WriteLine(item);
    }
    Console.ReadKey();
}

This code uses LINQ in order to obtain a sequence of the uppercase variants of all input strings which start with a ‘T’. Before the foreach-loop, LINQ will not perform any iterations or calculations on the string array. This only occurs when the result sequence is enumerated.

As shown in this post, taking advantage of the lazy nature of sequences can lead to a more efficient program execution. In many situations, it will prevent your code from making unnecessary calculations. Also, when chaining several sequence operations, the first result can be produced much sooner than if each operation has to run through an entire collection before passing its result to the next operation.

Lists vs. Sequences: Concatenation

In my previous post, I presented some of the benefits of thinking in sequences rather than lists. This post will discuss a concrete example, concatenation of multiple series of elements.

The case is as follows. You have several series of elements and you would like to aggregate all the contained elements into one concatenated series. This resulting series is for iteration only and does not have to be mutable.

To solve the task using lists, one could write a method like this:

public static IList<T> ConcatenateLists<T>(params IList<T>[] lists)
{
	List<T> retList = new List<T>();
	foreach (var list in lists)
	{
		retList.AddRange(list);
	}
	return retList;
}

The elements of all the lists passed to the method are added to a result list which is ultimately returned to the caller. Semantically, there is absolutely nothing wrong with this method. It produces the expected result.

Consider, however, the performance of this method if the input lists contain a million elements altogether. The result list would need to contain one million elements. If you are lucky, the elements are of a reference type, requiring only storage for one million references. If the elements are of a value type, however, one million objects, including their data, would have to be copied to the result list. If each object holds a kilobyte of data, the result list will need to allocate an entire gigabyte of memory before the iteration can even begin!

Surely, a smarter approach would be beneficial. My proposal is to think in sequences. Since the resulting concatenation only needs to be iterated over, there is no specific need for the result of the method to be a list in the first place. A sequence is sufficient. Further, the input elements can be regarded as a series of sequences as well, yielding a method like follows:

public static IEnumerable<T> ConcatenateEnumerables<T>(params IEnumerable<T>[] enumerables)
{
	foreach (var enumerable in enumerables)
	{
		foreach (var item in enumerable)
		{
			yield return item;
		}
	}
}

Notice the yield return statement. As previously discussed, this facilitates lazy execution. If, for some reason, the caller decides it only needs the first ten elements, then the yield return statement is only executed ten times.

Consider again the scenario of a million kilobyte-sized value type elements being concatenated. This time, the caller will be served the sequence one element at a time, from already allocated memory, reusing a single kilobyte-sized variable through the entire iteration.

Evidently, thinking in sequences can lead to remarkable performance gains. Of course, you need to consider the requirements and determine whether a sequence will meet your needs. If, for any reason, the resulting series has to be manipulated as a list, consider still using sequences and relying on LINQ’s Enumerable.ToList<TSource> extension method to aggregate the elements into a list when needed.

Thinking in Sequences

When dealing with series of objects, it is easy to think of them as lists, or arrays. That is, after all, the first collection types most people get acquainted with when becoming a programmer. And by all means, lists are very versatile and easy to employ in most situations. However, they often provide more functionality than you strictly need. Rather than considering collections of objects as lists, I find it helpful to think of them as sequences.

Since version 1.0, the .NET Framework has provided the IEnumerable interface for iterating over a sequence of objects.  These days, its generic cousin, IEnumerable<T>, introduced in version 2.0, is often preferred. Initially, those interfaces existed mainly to support the foreach statement. However, with the advent of LINQ to Objects, IEnumerable got a morale boost. Suddenly, anyone could easily create advanced queries against any sequence of objects.

In my opinion, the main advantage of IEnumerable is its stream-based nature. Like with streams, the first items of an IEnumerable sequence can be made available for processing without having to collect every single item first. A sequences does not even have to be finite. Consider, for instance, the following code, returning an infinite sequence of all positive integers.

public static IEnumerable<int> EnumerateAllPositiveIntegers()
{
	int integer = 0;
	while(true)
	{
		integer++;
		yield return integer;
	}
}

Note the yield return statement. This is a very handy shortcut which C# provides for creating IEnumerable sequences. Simply yield return every item you wish to include in the sequence. Those who are unfamiliar with such iterator blocks in C# may suspect that the code above will lead to an infinite loop. However, every yield statement will lead to the enumerator’s MoveNext method returning, giving control back to the client object. Of course, if the client iterates the sequence using a regular foreach loop expecting the sequence to terminate, an infinite loop will in fact be the result.

The following code shows how the integer-generating method above can be used in a LINQ query, without risking an infinite loop.

IEnumerable<int> firstFiftyOddPositiveIntegers = EnumerateAllPositiveIntegers()
	.Where(num => num % 2 == 1)
	.Take(50);

Rather than generating an infinite number of integers, consider an operation which needs to complete some time-consuming task for every value it returns. In such cases, exposing the results through an IEnumerable sequence using yield return will allow the client to process each value without having to wait for every value to be produced, collected and returned.

So, when are sequences preferable over lists and other collections? Obviously, if a series of items is potentially infinite, a sequence has to be used; it cannot be represented as a list or collection. Generally, sequences are intended for items to simply be iterated over, while lists are collections you can add items to and remove items from. My rule of thumb is to expose IEnumerable sequences whenever feasible, relying on LINQ to convert the sequence into lists or arrays, should it be necessary.

By thinking in sequences, LINQ makes even more sense than before, and it also makes it easier to spot those situations where a sequence is not sufficient for the task.

Consolas for MSDN Code Samples

For some reason, Microsoft recently decided to use their ancient Courier font for code sample on MSDN. Besides being plain ugly, it doesn’t play well with anti-aliasing technologies like ClearType:

msdn-code-before

If you use Firefox, however, it is possible to apply personal style sheets to web sites. The easiest way to accomplish this is through the Stylish extension. After installing Stylish, visit msdn.microsoft.com and create a new style as shown below:

new-style

Give your new style a descriptive name and paste the following style sheet:

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain("msdn.microsoft.com") 
{
  pre
  {
    font-family: Consolas !important;
  }
}

The result, in my opinion, is much easier on the eyes:

msdn-code-after