`Capacity`

Is it worthwhile to set capacities? We are able to specify capacity for C# collections. This optional capacity property adjusts the amount of memory allocated.

Capacity is specified in the constructor of List and Dictionary. Some other types, like StringBuilder, can also set a capacity.

Benchmark

Here is a benchmark that tests capacity settings. Many Dictionaries in real C# programs have a string key and around 100-1000 elements.

So We use a 100-element size of the collections, and add 100 elements to each of the instances.

Result It is advantageous to specify capacity. To estimate a capacity, you can print out a collection's Count property at the end.

Tip If you are working with 150 elements, you can simply use the constant 200 for better performance (saving several array resizes).

Tip 2 Make a static class and use const fields called ElementCountEstimate or other fields with the word estimate.

using System;
using System.Collections.Generic;

class Program
{
    const int _m = 100000;
    static List<string> _values = new List<string>();

    public static void Main()
    {
        // Add 100 strings for testing.
        for (int i = 0; i < 100; i++)
        {
            _values.Add("value" + i.ToString());
        }

        long t1 = Environment.TickCount;

        A_Dictionary();

        long t2 = Environment.TickCount;

        B_DictionaryCapacity();

        long t3 = Environment.TickCount;

        C_DictionaryCapacity();

        long t4 = Environment.TickCount;

        D_DictionaryCapacity();

        long t5 = Environment.TickCount;

        E_List();

        long t6 = Environment.TickCount;

        F_ListCapacity();

        long t7 = Environment.TickCount;

        // Write Dictionary times.
        Console.WriteLine("A_Dictionary: " + (t2 - t1) + " ms");
        Console.WriteLine("B_DictionaryCapacity: " + (t3 - t2) + " ms");
        Console.WriteLine("C_DictionaryCapacity: " + (t4 - t3) + " ms");
        Console.WriteLine("D_DictionaryCapacity: " + (t5 - t4) + " ms");

        // Write List times.
        Console.WriteLine("E_List: " + (t6 - t5) + " ms");
        Console.WriteLine("F_ListCapacity: " + (t7 - t6) + " ms");
    }

    static void A_Dictionary()
    {
        // No capacity.
        for (int i = 0; i < _m; i++)
        {
            var d = new Dictionary<string, int>();
            foreach (string k in _values)
            {
                d.Add(k, 0);
            }
        }
    }

    static void B_DictionaryCapacity()
    {
        // Capacity from collection Count.
        for (int i = 0; i < _m; i++)
        {
            var d = new Dictionary<string, int>(_values.Count);
            foreach (string k in _values)
            {
                d.Add(k, 0);
            }
        }
    }

    static void C_DictionaryCapacity()
    {
        // Const capacity.
        for (int i = 0; i < _m; i++)
        {
            var d = new Dictionary<string, int>(100);
            foreach (string k in _values)
            {
                d.Add(k, 0);
            }
        }
    }

    static void D_DictionaryCapacity()
    {
        // Huge capacity (10 times too large).
        for (int i = 0; i < _m; i++)
        {
            var d = new Dictionary<string, int>(1000);
            foreach (string k in _values)
            {
                d.Add(k, 0);
            }
        }
    }

    static void E_List()
    {
        // No capacity.
        for (int i = 0; i < _m * 5; i++)
        {
            var l = new List<string>();
            foreach (string k in _values)
            {
                l.Add(k);
            }
        }
    }

    static void F_ListCapacity()
    {
        // Exact capacity.
        for (int i = 0; i < _m * 5; i++)
        {
            var l = new List<string>(100);
            foreach (string k in _values)
            {
                l.Add(k);
            }
        }
    }
}A_Dictionary: 500 ms
B_DictionaryCapacity: 328 ms
C_DictionaryCapacity: 329 ms
D_DictionaryCapacity: 484 ms
E_List: 547 ms
F_ListCapacity: 437 ms

Some research

When we don't specify a capacity, the buffer will be reallocated as we keep adding elements. This makes populating a Dictionary or List much slower.

Notes, resizing

Should developers bother with Dictionary or List capacity? Having the runtime resize the underlying arrays is expensive.

And Resizing can double the time required to add elements—even in small collections.

Note Another final thing to consider is memory pressure. This increases when reallocations occur. More garbage collections are required.

The memory model of C# programs is complex. But giving programs hints about the sizes of the collections proves to be a substantial optimization.