C# GetHashCode (String Method)Examine the string GetHashCode method and analyze its performance characteristics.
dot net perls

GetHashCode. This method implements hashing in the .NET Framework. We look at the string GetHashCode virtual method implementation.


Notes, implementation. GetHashCode is located in mscorlib. It uses unsafe code (with shifts and bitwise operations) to compute well-distributed hash codes.


An example. We can call GetHashCode on a string, but this usually doesn't do much good. This program shows how when we append a char to a string, the hash code keeps changing.

So The hash computation appears to be effective—no collisions are seen when we just append a character.

C# program that uses GetHashCode
using System; class Program { static void Main() { string value = ""; for (int i = 0; i < 10; i++) { value += "x"; // This calls the unsafe code listed on this page. Console.WriteLine("GETHASHCODE: " + value.GetHashCode()); } } }
GETHASHCODE: -842352680 GETHASHCODE: -838682664 GETHASHCODE: -126710822 GETHASHCODE: 1988791258 GETHASHCODE: 847333320 GETHASHCODE: 844711880 GETHASHCODE: 1653319342 GETHASHCODE: -1698453842 GETHASHCODE: -795746332 GETHASHCODE: -803610652

Internal code. Here is a disassembled version of the intermediate language instructions that form the GetHashCode override method in the System.String type.

Tip Hash codes are used in Dictionary instances and other associative collections.


And This method is located in mscorlib. It is invoked in every program that uses a Dictionary or Hashtable using C# or VB.NET.

Note It uses several magic constants and low-level pointer arithmetic to achieve its performance requirement.

String GetHashCode method implementation: C#
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)] public override unsafe int GetHashCode() { fixed (char* str = this) { char* chPtr = str; int num = 352654597; int num2 = num; int* numPtr = (int*)chPtr; for (int i = this.Length; i > 0; i -= 4) { num = (((num << 5) + num) + (num >> 27)) ^ numPtr[0]; if (i <= 2) { break; } num2 = (((num2 << 5) + num2) + (num2 >> 27)) ^ numPtr[1]; numPtr += 2; } return (num + (num2 * 1566083941)); } }

Notes, method. The method is decorated with the unsafe keyword, which signals that it can use lower-level instructions such as pointer arithmetic.

Note Using the unsafe keyword also has higher security demands on executables. The built-in GetHashCode method avoids these problems.

Notes, fixed. The fixed keyword indicates the char* data must not be moved by the garbage collector during any possible collections while the fixed block is executing.

Note Once you apply the fixed keyword to the context, you can use pointer arithmetic on the string's internal character vector.


Notes, unrolled loop. The method body loop uses loop unwinding. This is a technique that can skip several places ahead on each loop iteration, rather than just a single place.

For: Unrolled

Notes, bit manipulations. The loop body uses the >> and << operators in several places. This operation does a bitwise manipulation that shifts the bits left or right.

Note By shifting bits in a hash computation, we avoid funneling. This is where the computations made later erase those made previously.


Performance. The GetHashCode method here is virtual and this incurs a small performance overhead. Often the optimizations applied runtime cancel out this inefficiency.

Correction This article previously had an error in the section on bit manipulations. Bret Mulvey wrote in with the correct information.

IEqualityComparer. This class can specify how a class should compare items and compute identity numbers for hash tables. We can customize the hash code based on specific data.


A summary. GetHashCode is called when we use a Hashtable or Dictionary with string keys. It has unsafe code that uses pointer arithmetic, bit shifting and an unwound loop.

© 2007-2021 sam allen. send bug reports to info@dotnetperls.com.