C# String Performance

This article looks at how the length and character composition of a string influence comparison performance in C#. It asks whether shorter strings are faster to compare and what makes a string comparison faster. Only one aspect of string performance is measured here: equality tests with the == operator.

Benchmark of string Equals on different input

    Unequal strings of equal length were compared to each other, 100 million times per loop.
    Pairs whose first differing character was nearer the start compared faster.

String ==, loop 1:  669 ms [fastest]
String ==, loop 2:  763 ms
String ==, loop 3: 1019 ms

String equals performance

First, the code compared here involves two string variables whose values are built at run time. You can use string.Equals or == on these two strings. String literals are avoided because they are interned, so identical literals refer to the same object. For more on interning, see the string.Intern method.
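
The difference between an interned literal and a string built at run time can be sketched as follows. This is a minimal illustration, not part of the original benchmark; the class name InternDemo and the helper BuildAtRuntime are made up for the example. The last line usually reports that the interned copy is the same object as the literal, though that depends on the runtime having pooled the literal.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper: builds "11" at run time, the same way the
    // benchmark builds its strings, so the result is a new string object.
    public static string BuildAtRuntime()
    {
        return string.Format("{0}{1}", 1, 1);
    }

    static void Main()
    {
        string literal = "11";
        string built = BuildAtRuntime();

        // Value equality: both strings hold the same characters.
        Console.WriteLine(literal == built);

        // Reference equality: the built string is a separate object.
        Console.WriteLine(object.ReferenceEquals(literal, built));

        // string.Intern returns the pooled instance holding these characters.
        Console.WriteLine(object.ReferenceEquals(literal, string.Intern(built)));
    }
}
```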

This C# example program tests the performance of comparing strings with different characters.

Test program [C#]

using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // Build every string at run time so that none is an interned literal.
        // Pair a: 2 characters, first difference at the second character.
        string a1 = string.Format("{0}{1}", 1, 1);
        string a2 = string.Format("{0}{1}", 1, 2);

        // Pair b: 6 characters, first difference at the fourth character.
        string b1 = string.Format("{0}{1}{2}{3}{4}{5}", 1, 1, 1, 1, 1, 1);
        string b2 = string.Format("{0}{1}{2}{3}{4}{5}", 1, 1, 1, 2, 2, 2);

        // Pair c: 12 characters, first difference at the seventh character.
        string c1 = string.Format("{0}{1}{2}{3}{4}{5}{6}{7}{8}{9}{10}{11}",
            1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1);
        string c2 = string.Format("{0}{1}{2}{3}{4}{5}{6}{7}{8}{9}{10}{11}",
            1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2);

        const int m = 100000000;

        // Loop 1: compare pair a.
        Stopwatch s1 = Stopwatch.StartNew();
        for (int i = 0; i < m; i++)
        {
            if (a1 == a2)
            {
            }
        }
        s1.Stop();

        // Loop 2: compare pair b.
        Stopwatch s2 = Stopwatch.StartNew();
        for (int i = 0; i < m; i++)
        {
            if (b1 == b2)
            {
            }
        }
        s2.Stop();

        // Loop 3: compare pair c.
        Stopwatch s3 = Stopwatch.StartNew();
        for (int i = 0; i < m; i++)
        {
            if (c1 == c2)
            {
            }
        }
        s3.Stop();

        // Print the three elapsed times in milliseconds.
        Console.WriteLine("{0},{1},{2}", s1.ElapsedMilliseconds,
            s2.ElapsedMilliseconds, s3.ElapsedMilliseconds);
        Console.Read();
    }
}

Benchmark description. The first part of the benchmark builds six example strings in three pairs. The first two strings are each 2 characters long and differ at the second character. The second two are each 6 characters long and first differ at the fourth character. The final two are each 12 characters long and first differ at the seventh character.

Three loops. Each of the three loops runs 100 million times. The first loop applies the == operator, which compiles to a call to the op_Equality method, to the first pair of strings. The second and third loops compare the second and third pairs. The loops differ substantially in running time: 669 ms, 763 ms, and 1019 ms in the results listed at the top.

Interpretation of the results. At first glance, the test indicates that the longer the strings being compared, the longer the comparison takes. However, the results are tied not to the string Length but to the number of characters at the start that are the same in both strings.

Modifying the benchmark. After running the benchmark, I modified it so that in each pair the first differing character comes earlier in the second string. The times dropped.
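
A modification of this kind might look like the sketch below. The original article does not list the modified strings, so the 12-character pairs here are hypothetical, as are the class name EarlyDifferenceBenchmark and the Measure helper. No timings are shown because they vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

static class EarlyDifferenceBenchmark
{
    // Hypothetical helper: times `iterations` equality checks on one pair
    // of strings and returns the elapsed milliseconds.
    public static long Measure(string x, string y, int iterations)
    {
        int equalCount = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (x == y)
            {
                equalCount++; // gives the comparison result a use
            }
        }
        watch.Stop();
        GC.KeepAlive(equalCount);
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int m = 100000000;

        // All strings are built at run time, so none is an interned literal.
        string baseline = new string('1', 12);
        string lateDiff = new string('1', 6) + new string('2', 6); // first difference at index 6
        string earlyDiff = "2" + new string('1', 11);              // first difference at index 0

        Console.WriteLine("late difference:  {0} ms", Measure(baseline, lateDiff, m));
        Console.WriteLine("early difference: {0} ms", Measure(baseline, earlyDiff, m));
    }
}
```

Both pairs have the same length, so any gap between the two timings comes from where the first difference sits rather than from Length.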

The first different character. The conclusion, therefore, is that string equality performance is linked to the position of the first different character. The internal comparison loop starts at the first character (index 0) and moves forward, stopping once it finds a mismatch.
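
The early exit can be illustrated with a simplified character-by-character loop. This is only a sketch of the idea, not the runtime's actual implementation, which may compare several characters at a time. The class name FirstDifference and the method CharsExamined are invented for the example, and the up-front length check is an assumption of this sketch.

```csharp
using System;

static class FirstDifference
{
    // Simplified stand-in for ordinal equality. Returns how many characters
    // were examined before the result was known.
    public static int CharsExamined(string x, string y)
    {
        if (x.Length != y.Length)
        {
            return 0; // lengths differ, so no characters need to be read
        }
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i])
            {
                return i + 1; // stopped at the first mismatch
            }
        }
        return x.Length; // every character matched
    }

    static void Main()
    {
        // The three pairs from the benchmark.
        Console.WriteLine(CharsExamined("11", "12"));                     // 2
        Console.WriteLine(CharsExamined("111111", "111222"));             // 4
        Console.WriteLine(CharsExamined("111111111111", "111111222222")); // 7
    }
}
```

The counts 2, 4, and 7 rise in the same order as the three measured loop times.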

Programming tip

It is faster to compare two unequal strings when the first different character is near the start of the string. Shorter strings are generally faster to compare as well, since fewer characters can match before a difference or the end is reached.

Summary

Here we saw how the contents of strings in C# affect the performance of Equals and ==. The more characters that match at the start of the two strings, the slower the comparison. That said, the impact of string equality comparisons on the performance of most programs is low.
