HomeSearch

C# Capacity for List, Dictionary

This C# page uses capacity on collections. Setting capacity affects the performance of adding elements.
Capacity. Is it worthwhile to set capacities? We are able to specify capacity for C# collections. This optional capacity property adjusts the amount of memory allocated.
Some notes. Capacity is specified in the constructor of List and Dictionary. Some other types, like StringBuilder, can also set a capacity.ListDictionaryStringBuilder
Some research. When we don't specify a capacity, the buffer will be reallocated as we keep adding elements. This makes populating a Dictionary or List much slower.

Quote: The capacity of a Dictionary(TKey, TValue) is the number of elements that can be added to the Dictionary(TKey, TValue) before resizing is necessary.

Quote: As elements are added to a Dictionary(TKey, TValue), the capacity is automatically increased as required by reallocating the internal array.

Dictionary Constructor: Microsoft Docs
Notes, resizing. Should developers bother with Dictionary or List capacity? Having the runtime resize the underlying arrays is expensive.

And: Resizing can double the time required to add elements—even in small collections.

Note: Another final thing to consider is memory pressure. This increases when reallocations occur. More garbage collections are required.

A benchmark. Here is a benchmark that tests capacity settings. Many Dictionaries in real C# programs have a string key and around 100-1000 elements.

So: We used a 100-element size of the collections. We add 100 elements to each of the instances.

C# program that tests capacity using System; using System.Collections.Generic; class Program { const int _m = 100000; static List<string> _values = new List<string>(); public static void Main() { // Add 100 strings for testing. for (int i = 0; i < 100; i++) { _values.Add("value" + i.ToString()); } long t1 = Environment.TickCount; A_Dictionary(); long t2 = Environment.TickCount; B_DictionaryCapacity(); long t3 = Environment.TickCount; C_DictionaryCapacity(); long t4 = Environment.TickCount; D_DictionaryCapacity(); long t5 = Environment.TickCount; E_List(); long t6 = Environment.TickCount; F_ListCapacity(); long t7 = Environment.TickCount; // Write Dictionary times. Console.WriteLine("A_Dictionary: " + (t2 - t1) + " ms"); Console.WriteLine("B_DictionaryCapacity: " + (t3 - t2) + " ms"); Console.WriteLine("C_DictionaryCapacity: " + (t4 - t3) + " ms"); Console.WriteLine("D_DictionaryCapacity: " + (t5 - t4) + " ms"); // Write List times. Console.WriteLine("E_List: " + (t6 - t5) + " ms"); Console.WriteLine("F_ListCapacity: " + (t7 - t6) + " ms"); } static void A_Dictionary() { // No capacity. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(); foreach (string k in _values) { d.Add(k, 0); } } } static void B_DictionaryCapacity() { // Capacity from collection Count. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(_values.Count); foreach (string k in _values) { d.Add(k, 0); } } } static void C_DictionaryCapacity() { // Const capacity. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(100); foreach (string k in _values) { d.Add(k, 0); } } } static void D_DictionaryCapacity() { // Huge capacity (10 times too large). for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(1000); foreach (string k in _values) { d.Add(k, 0); } } } static void E_List() { // No capacity. for (int i = 0; i < _m * 5; i++) { var l = new List<string>(); foreach (string k in _values) { l.Add(k); } } } static void F_ListCapacity() { // Exact capacity. for (int i = 0; i < _m * 5; i++) { var l = new List<string>(100); foreach (string k in _values) { l.Add(k); } } } } Output A_Dictionary: 500 ms B_DictionaryCapacity: 328 ms C_DictionaryCapacity: 329 ms D_DictionaryCapacity: 484 ms E_List: 547 ms F_ListCapacity: 437 ms
Results. It is advantageous to specify capacity in your C# programs. You can modify your program to print out what the Count property of your collections is after they are used.

And: If you are working with 150 elements, you can simply use the constant 200 for better performance (saving several array resizes).

Tip: Make a static class and use const fields called ElementCountEstimate or other fields with the word estimate.

A summary. The memory model of C# programs is complex. But giving programs hints about the sizes of the collections proves to be a substantial optimization.Optimization
© 2007-2019 Sam Allen. Every person is special and unique. Send bug reports to info@dotnetperls.com.
Home
Dot Net Perls