The Lazy Nature of Sequences

One of the great things about sequences is their lazy nature. As discussed in previous posts, sequences in .NET are represented by IEnumerable<T> objects, in the same manner lists are represented by IList<T> objects. Unlike lists, however, a sequence does not have to be readily available for an IEnumerable<T> object to be created. Your sequence reference is merely a handle to a generator which promises the ability to enumerate a sequence when requested. Let me illustrate this with a code example.

public static IEnumerable<string> FindCommonItems(
    IEnumerable<string> sequence1,
    IEnumerable<string> sequence2)
{
    Console.WriteLine("Iteration starting");
    foreach (string string1 in sequence1)
    {
        foreach (string string2 in sequence2)
        {
            if (string1 == string2)
            {
                Console.WriteLine("Common item found: " + string1);
                yield return string1;
            }
        }
    }
    Console.WriteLine("Iteration completed");
}

This method simply enumerates two string sequences and returns a sequence of all items which appear in both of them. To be able to determine when items are found equal and consequently added to the result sequence, some Console.WriteLine statements have been added. The lazy nature of sequences implies that calling this FindCommonItems method returns a reference to an IEnumerable<string> object without any iterations or any comparisons being made. To verify this, consider the following code example.

static void Main(string[] args)
{
    IEnumerable<string> result = FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" });

    Console.ReadKey();
}

If you run the code above, you will notice that nothing is written to the console. Not a single line of the FindCommonItems method is run. The IEnumerable<string> variable simply holds a reference to a generator which will provide the sequence if it is ever needed. In the code above, the sequence is never enumerated. Hence, its values are not needed, and no CPU cycles are wasted on iterating the arrays and comparing their items. Consider the following code, however.

static void Main(string[] args)
{
    IEnumerable<string> result = FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" });

    foreach (string item in result)
    {
        Console.WriteLine(item);
        if (Console.ReadKey().Key == ConsoleKey.Escape)
        {
            break;
        }
    }

    Console.ReadKey();
}

This time, we enumerate the result. Note that the user is free to cancel the enumeration at any time by pressing the Escape key. If the enumeration is canceled, no further iterations or comparisons between the input arrays are made. Again, no CPU cycles are wasted on performing calculations which are not needed.

To further illustrate the benefit of laziness, consider the following method.

public static IEnumerable<string> SkipShortStrings(
    IEnumerable<string> inputSequence)
{
    foreach (string input in inputSequence)
    {
        if (input.Length > 3)
        {
            yield return input;
        }
    }
}

The SkipShortStrings method shown above takes a sequence of string objects as input and produces a sequence of those string objects from the input sequence which are more than three characters long.

The sequence generator produced by this method can be combined with the sequence generator produced by the FindCommonItems method, chaining their operations as follows.

static void Main(string[] args)
{
    IEnumerable<string> result = SkipShortStrings(FindCommonItems(
        new string[] { "One", "Two", "Three", "Four" },
        new string[] { "One", "Three", "Four", "Six", "Seven" }));

    foreach (string item in result)
    {
        Console.WriteLine(item);
        break;
    }
    Console.ReadKey();
}

When running this program, the calculation of the first result from SkipShortStrings only involves fetching two elements from the sequence produced by FindCommonItems (the first element being discarded as too short). An illustration of the execution of this program is shown below. Once again, no items become part of the result until they are actually requested.

Chaining Sequence Generators

The easiest way for any .NET 3.5 developer to take advantage of the lazy nature of sequences is to start using the static methods in the class System.Linq.Enumerable, collectively known as LINQ to objects. These are extension methods which operate on sequences and consequently benefit from their laziness. Consider this simple example.

static void Main(string[] args)
{
    IEnumerable<string> result =
        (new[] { "One", "Two", "Three", "Four" })
        .Where(item => item.StartsWith("T"))
        .Select(item => item.ToUpper());

    foreach (string item in result)
    {
        Console.WriteLine(item);
    }
    Console.ReadKey();
}

This code uses LINQ in order to obtain a sequence of the uppercase variants of all input strings which start with a ‘T’. Before the foreach-loop, LINQ will not perform any iterations or calculations on the string array. This only occurs when the result sequence is enumerated.

As shown in this post, taking advantage of the lazy nature of sequences can lead to a more efficient program execution. In many situations, it will prevent your code from making unnecessary calculations. Also, when chaining several sequence operations, the first result can be produced much sooner than if each operation has to run through an entire collection before passing its result to the next operation.