C# HybridDictionary

Squares: different patterns

HybridDictionary attempts to optimize Hashtable. It implements a linked list and hash table data structure, switching over to the second from the first when the number of elements increases past a certain threshold.

Tip:The HybridDictionary type is found in the System.Collections.Specialized namespace.

Performance graph for HybridDictionary

Example

HybridDictionary is used in the same way as Hashtable from System.Collections. The example includes the System.Collections namespace for the DictionaryEntry class, which is an instance of the System.ValueType struct.

DictionaryEntry
Program that uses HybridDictionary: C#

using System;
using System.Collections;
using System.Collections.Specialized;

class Program
{
    static HybridDictionary GetHybridDictionary()
    {
	// Get and return HybridDictionary instance
	HybridDictionary hybrid = new HybridDictionary();
	// Use Add method
	hybrid.Add("cat", 1);
	hybrid.Add("dog", 2);
	// Use indexer
	hybrid["rat"] = 0;
	return hybrid;
    }

    static void Main()
    {
	// Get HybridDictionary
	HybridDictionary hybrid = GetHybridDictionary();

	// Get values from HybridDictionary
	int value1 = (int)hybrid["cat"];
	object value2 = hybrid["notfound"];
	object value3 = hybrid["dog"];
	int count1 = hybrid.Count;

	// Display values
	Console.WriteLine(value1);
	Console.WriteLine(value2 == null);
	Console.WriteLine(value3);
	Console.WriteLine(count1);

	// Enumerate HybridDictionary
	foreach (DictionaryEntry entry in hybrid)
	{
	    Console.Write(entry.Key);
	    Console.Write(": ");
	    Console.WriteLine(entry.Value);
	}
    }
}

Output

1
True
2
3
cat: 1
dog: 2
rat: 0
Main method

The Program class defined in the program has two function members. It has the GetHybridDictionary method (which internally creates a new HybridDictionary type instance) and the Main entry point.

Note:The execution of the program begins in the Main entry point, and then calls into the GetHybridDictionary method.

KeyValuePair: Key and Value properties

Inside GetHybridDictionary, a HybridDictionary is allocated. It is populated with some key-value pairs. You can use the Add method or assign new entries with the string parameter indexer. This behavior is like that of Hashtable.

Hashtable

Getting values by key. When control flow returns to the Main method, the values in the HybridDictionary are tested. You can access the values by using the string indexer. You can assign an object, which aliases System.Object type.

And:After you acquire the object, you must cast the reference or value type. This impacts performance.

Note:When you cast value types, you incur an unboxing instruction, which copies.

Arrow indicates looping

Enumerating HybridDictionary. Finally, the console application loops through the HybridDictionary. The foreach iteration variable is read-only and contains two important properties, the Key and Value members.

Foreach

Also:These are instances of System.Object. They carry no information about their underlying type.

Object

Benchmark

Performance optimization

Here we look at a benchmark of HybridDictionary. The setup of the benchmark tests three collections—the HybridDictionary, the Hashtable, and the Dictionary constructed type—with a series of numbers of elements.

The count of elements tested is the range between 0 and 20 inclusive. After the three tight loops, the results are printed, and the for iteration statement moves to the next test.

Benchmark for HybridDictionary: C#

using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Diagnostics;
using System.IO;

class Program
{
    static void Main()
    {
	// Loop over element count
	for (int length = 0; length <= 20; length++)
	{
	    // Get the array
	    var array = GetStringArray(length);

	    // Get the collections
	    var hybrid = GetHybridDictionary(array);
	    var hash = GetHashtable(array);
	    var dict = GetDictionary(array);

	    // Adjust the maximum iterations
	    int m = 1000000;
	    if (length <= 1)
	    {
		m *= 100;
	    }
	    else if (length <= 5)
	    {
		m *= 10;
	    }

	    // Test HybridDictionary [Blue]
	    Stopwatch s1 = Stopwatch.StartNew();
	    for (int i = 0; i < m; i++)
	    {
		foreach (string value in array)
		{
		    bool flag = (bool)hybrid[value];
		}
	    }
	    s1.Stop();

	    // Test Hashtable [Red]
	    Stopwatch s2 = Stopwatch.StartNew();
	    for (int i = 0; i < m; i++)
	    {
		foreach (string value in array)
		{
		    bool flag = (bool)hash[value];
		}
	    }
	    s2.Stop();

	    // Test Dictionary [Green]
	    Stopwatch s3 = Stopwatch.StartNew();
	    for (int i = 0; i < m; i++)
	    {
		foreach (string value in array)
		{
		    bool flag = dict[value];
		}
	    }
	    s3.Stop();

	    // Output benchmarks to Console.
	    Console.Write(s1.ElapsedMilliseconds);
	    Console.Write(", ");
	    Console.Write(s2.ElapsedMilliseconds);
	    Console.Write(", ");
	    Console.Write(s3.ElapsedMilliseconds);
	    Console.Write(" [");
	    Console.Write(m);
	    Console.Write(", ");
	    Console.Write(length);
	    Console.WriteLine("]");
	}
	Console.Read();
    }

    static string[] GetStringArray(int size)
    {
	// Get array of random strings
	string[] array = new string[size];
	for (int i = 0; i < size; i++)
	{
	    array[i] = Path.GetRandomFileName();
	}
	return array;
    }

    static HybridDictionary GetHybridDictionary(string[] array)
    {
	// Get HybridDictionary filled with all the strings
	HybridDictionary hybrid = new HybridDictionary();
	foreach (string value in array)
	{
	    hybrid.Add(value, true);
	}
	return hybrid;
    }

    static Hashtable GetHashtable(string[] array)
    {
	// Get Hashtable filled with all the strings
	Hashtable table = new Hashtable();
	foreach (string value in array)
	{
	    table.Add(value, true);
	}
	return table;
    }

    static Dictionary<string, bool> GetDictionary(string[] array)
    {
	// Get Dictionary filled with all strings
	Dictionary<string, bool> dictionary = new Dictionary<string, bool>();
	foreach (string value in array)
	{
	    dictionary.Add(value, true);
	}
	return dictionary;
    }
}

Results

175,164,63 [100000000, 0]
2001,5116,4898 [100000000, 1]
544,1101,1072 [10000000, 2]
1007,1552,1494 [10000000, 3]
1737,2170,1931 [10000000, 4]
2422,3124,2395 [10000000, 5]
307,390,288 [1000000, 6]
401,427,349 [1000000, 7]
486,390,386 [1000000, 8]
554,506,469 [1000000, 9]
559,504,482 [1000000, 10]
685,617,579 [1000000, 11]
746,824,622 [1000000, 12]
851,780,757 [1000000, 13]
956,875,726 [1000000, 14]
1072,951,762 [1000000, 15]
1073,972,853 [1000000, 16]
1030,943,892 [1000000, 17]
1096,974,961 [1000000, 18]
1381,1182,924 [1000000, 19]
1233,1112,1048 [1000000, 20]
Note

The benchmark uses an increased number of iterations when testing all element counts equal to or less than 5. This is to provide increased accuracy for the faster loops, but is not ideal.

The GetStringArray method here generates a string array of the specified size, and fills it with random strings. The strings are from Path.GetRandomFileName, which (as expected) returns random file names.

String type

Also, the benchmark defines three more methods which return collections of the specified types. Internally, these methods populate the collections with all the strings in the string array passed as a parameter.

Dictionary

Thus:The end result is that the three collections all have the same number of keys with the same values.

In the tight loops in the benchmark, the bool value from the key of each and every string is looked up. These loops test the lookup speed of the collections. The benchmark here does not test collection fill times.

Internals

Framework: NET

The code next shows the implementation of the Add method on HybridDictionary. You can see that there is some additional logic when adding keys, but this amounts to one or two null checks most of the time.

Note:When the count is 8 and you try to add another key, the third condition will evaluate to true and the ChangeOver method will execute.

And:The ChangeOver method copies the entire contents of the HybridDictionary to a Hashtable.

Add method in HybridDictionary: C#

public void Add(object key, object value)
{
    if (this.hashtable != null)
    {
	this.hashtable.Add(key, value);
    }
    else if (this.list == null)
    {
	this.list = new ListDictionary(this.caseInsensitive ?
				       StringComparer.OrdinalIgnoreCase :
				       null);
	this.list.Add(key, value);
    }
    else if ((this.list.Count + 1) >= 9)
    {
	this.ChangeOver();
	this.hashtable.Add(key, value);
    }
    else
    {
	this.list.Add(key, value);
    }
}

Internally, the ChangeOver method is a private void parameterless method that uses a loop to copy all the entries in the ListDictionary to a Hashtable. The Hashtable is initialized with a capacity of 13 elements.

Generics

Generic type

There are measurable performance improvements with the HybridDictionary on smaller data sets. Despite this, there is no collection that has the same "ChangeOver" logic in the System.Collections.Generic namespace.

Depending on low-level implementation details in the CLR, a generic HybridDictionary could prove to be a valuable collection. It could entirely replace Dictionary in many programs and result in an overall performance improvement.

Generic ClassQuestion

Note:The benefits of HybridDictionary occur with the fastest lookups, particularly those with less than five elements.

However:Few programs will have severe performance problems when using five-element collections. So the true gain here is not dramatic.

Summary

We saw the HybridDictionary in the System.Collections.Specialized namespace. The collection has genuine performance improvements on small collections, but suffers measurably on larger collections due to the overhead of the internal logic.


C#: Collections