C# Split String Examples
Use the string.Split method. Call Split with arguments to separate on newlines, spaces and words.
Split. Consider a sentence with several words, where a space separates the words. In C# we can call Split() on this sentence to extract the words into a string array.
Delimiters. This term refers to the separators in string data. We can split lines and words from a string based on chars, strings or newlines.
First example. We examine the simplest Split method. It receives a char array (one that uses the params keyword) but we can specify this with a single char argument.
Part 1 We invoke Split() with a single character argument. The result value is a string array—it contains 2 elements.
Part 2 We use a foreach-loop to iterate over the strings in the array. We display each word.
C# program that splits on character
using System;

class Program
{
    static void Main()
    {
        // Contains a semicolon delimiter.
        string input = "cat;bird";
        Console.WriteLine($"Input: {input}");

        // Part 1: split on a single character.
        string[] array = input.Split(';');

        // Part 2: use a foreach-loop.
        // ... Print each value in the array.
        foreach (string value in array)
        {
            Console.WriteLine($"Part: {value}");
        }
    }
}

Input: cat;bird
Part: cat
Part: bird
Multiple characters. Next we use Split() to separate a string on a delimiter that contains multiple characters. If the Split() call does not compile, try adding a StringSplitOptions argument.
Argument 1 The first argument is the delimiter sequence. We create a string array containing one element.
Argument 2 For the second argument, we specify StringSplitOptions.None to ensure the correct method is called.
C# program that splits on string delimiters
using System;

class Program
{
    static void Main()
    {
        string value = "cat\r\ndog";

        // Split the string on line breaks.
        string[] lines = value.Split(new string[] { "\r\n" }, StringSplitOptions.None);

        // Loop over the array.
        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}

cat
dog
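More than one string delimiter can be placed in the array. The following is a minimal sketch, assuming the input may mix Windows ("\r\n") and Unix ("\n") line breaks. The SplitLines helper is a hypothetical name used only for illustration.

```csharp
using System;

class Program
{
    // Hypothetical helper: splits on Windows or Unix line breaks.
    public static string[] SplitLines(string text)
    {
        // Both delimiters go in one string array.
        return text.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
    }

    static void Main()
    {
        // One Windows line break and one Unix line break.
        string value = "cat\r\ndog\nbird";

        foreach (string line in SplitLines(value))
        {
            Console.WriteLine(line);
        }
    }
}
```

For this input the program should print cat, dog and bird on separate lines.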
TrimEntries. Often when splitting strings, we want to eliminate some whitespace (like newlines or spaces). In .NET 5 and later, we can pass StringSplitOptions.TrimEntries as the second argument to Split.
Warning TrimEntries can help deal with newline sequences, but it will also remove ending and leading spaces.
C# program that uses TrimEntries
using System;

class Program
{
    static void Main()
    {
        // Windows line break.
        string value = "linux\r\nwindows";

        // Split on newline, and trim resulting strings.
        // ... This eliminates the other whitespace sequences.
        string[] lines = value.Split('\n', StringSplitOptions.TrimEntries);

        for (int i = 0; i < lines.Length; i++)
        {
            Console.WriteLine("ITEM: [{0}]", lines[i]);
        }
    }
}

ITEM: [linux]
ITEM: [windows]
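The warning about leading and trailing spaces can be illustrated with a short sketch. The sample string here is an assumption chosen for the demonstration. It also shows TrimEntries combined with RemoveEmptyEntries, which drops parts that contain only whitespace.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Spaces surround some values, and one field holds only spaces.
        string value = " a , b ,   ,c";

        // No options: the spaces are kept in each part.
        string[] raw = value.Split(',');

        // TrimEntries (.NET 5 or later): each part is trimmed.
        string[] trimmed = value.Split(',', StringSplitOptions.TrimEntries);

        // Both flags: whitespace-only parts are trimmed, then removed.
        string[] compact = value.Split(',',
            StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine("RAW: [{0}]", string.Join("][", raw));
        Console.WriteLine("TRIMMED: [{0}]", string.Join("][", trimmed));
        Console.WriteLine("COMPACT: [{0}]", string.Join("][", compact));
    }
}
```

If the surrounding spaces are meaningful in the data, TrimEntries is probably not the right choice.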
RemoveEmptyEntries. Like TrimEntries, this is an enum argument that affects the behavior of Split. In this example, the input string contains 5 commas (delimiters).
Info Two fields between commas are 0 characters long—they are empty. They are treated differently when we use RemoveEmptyEntries.
Result We specify StringSplitOptions.RemoveEmptyEntries. The 2 empty fields are not in the result array.
C# program that uses StringSplitOptions
using System;

class Program
{
    static void Main()
    {
        string value = "x,y,z,,,a";

        // Remove empty strings from result.
        string[] array = value.Split(',', StringSplitOptions.RemoveEmptyEntries);

        foreach (string entry in array)
        {
            Console.WriteLine(entry);
        }
    }
}

x
y
z
a
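To see how the 2 empty fields are treated differently, we can compare the array lengths for both options. This is a small sketch that reuses the same input string.

```csharp
using System;

class Program
{
    static void Main()
    {
        string value = "x,y,z,,,a";

        // With None, the 2 empty fields are kept as empty strings.
        string[] all = value.Split(',', StringSplitOptions.None);

        // With RemoveEmptyEntries, the empty fields are dropped.
        string[] filled = value.Split(',', StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine("NONE: {0}", all.Length);
        Console.WriteLine("REMOVE: {0}", filled.Length);
    }
}
```

The first length should be 6 and the second 4. Keeping empty fields matters when the position of a value in the line is significant.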
Regex.Split, words. We can separate words with Split. Often the best way to separate words in a C# string is to use a Regex that acts upon non-word chars.
Here This example separates words in a string based on non-word characters. It eliminates punctuation and whitespace.
Tip Regex provides more power and control than the string Split methods. But the code is harder to read.
Argument 1 The first argument to Regex.Split is the string we are trying to split apart.
Argument 2 This is a Regex pattern. We can specify any character set (or range) with Regex.Split.
C# program that separates on non-word pattern
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        const string sentence = "Hello, my friend";

        // Split on all non-word characters.
        // ... This returns an array of all the words.
        string[] words = Regex.Split(sentence, @"\W+");

        foreach (string value in words)
        {
            Console.WriteLine("WORD: " + value);
        }
    }
}

WORD: Hello
WORD: my
WORD: friend
@      Special verbatim string syntax.
\W+    One or more non-word characters together.
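One detail to watch: when the pattern matches at the very start or end of the input, Regex.Split includes an empty string at that position. The sketch below assumes a sentence that ends with punctuation, and uses a hypothetical GetWords helper to filter the empty entries.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    // Hypothetical helper: splits on non-word chars and drops empty results.
    public static string[] GetWords(string text)
    {
        return Regex.Split(text, @"\W+")
            .Where(word => word.Length > 0)
            .ToArray();
    }

    static void Main()
    {
        // The trailing "!" matches the pattern at the end of the input.
        const string sentence = "Hello, my friend!";

        // The unfiltered result has a trailing empty string.
        Console.WriteLine("RAW COUNT: " + Regex.Split(sentence, @"\W+").Length);

        // The filtered result has only the words.
        Console.WriteLine("WORD COUNT: " + GetWords(sentence).Length);
    }
}
```

Here the raw count should be 4 and the word count 3.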
Text files. Here we have a text file containing comma-delimited lines of values—a CSV file. We use File.ReadAllLines to read lines, but StreamReader can be used instead.
Then The program displays the values of each line after the line number. The output shows how the file was parsed into the strings.
C# program that splits lines in file
using System;
using System.IO;

class Program
{
    static void Main()
    {
        int i = 0;
        foreach (string line in File.ReadAllLines("TextFile1.txt"))
        {
            string[] parts = line.Split(',');
            foreach (string part in parts)
            {
                Console.WriteLine("{0}:{1}", i, part);
            }
            i++; // For demonstration.
        }
    }
}

Dog,Cat,Mouse,Fish,Cow,Horse,Hyena
Programmer,Wizard,CEO,Rancher,Clerk,Farmer

0:Dog
0:Cat
0:Mouse
0:Fish
0:Cow
0:Horse
0:Hyena
1:Programmer
1:Wizard
1:CEO
1:Rancher
1:Clerk
1:Farmer
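The StreamReader alternative mentioned above could look like the following sketch. It reads one line at a time, which may help with large files. The PrintParts helper and the temporary sample file are assumptions added so the program runs on its own.

```csharp
using System;
using System.IO;

class Program
{
    // Hypothetical helper: prints each field with its line number.
    // Returns the number of lines read.
    public static int PrintParts(string path)
    {
        int i = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            // ReadLine returns null at the end of the file.
            while ((line = reader.ReadLine()) != null)
            {
                string[] parts = line.Split(',');
                foreach (string part in parts)
                {
                    Console.WriteLine("{0}:{1}", i, part);
                }
                i++;
            }
        }
        return i;
    }

    static void Main()
    {
        // Assumption: write a small sample file to the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "TextFile1.txt");
        File.WriteAllLines(path, new string[] { "Dog,Cat,Mouse", "Programmer,Wizard" });

        PrintParts(path);
    }
}
```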
Directory paths. We can split the segments in a Windows local directory into separate strings. Please note that directory paths are complex. This code may not correctly handle all cases.
Tip We could use Path.DirectorySeparatorChar, a char field on the Path class in System.IO, for more flexibility.
C# program that splits Windows directories
using System;

class Program
{
    static void Main()
    {
        // The directory from Windows.
        const string dir = @"C:\Users\Sam\Documents\Perls\Main";

        // Split on directory separator.
        string[] parts = dir.Split('\\');

        foreach (string part in parts)
        {
            Console.WriteLine(part);
        }
    }
}

C:
Users
Sam
Documents
Perls
Main
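The tip about Path.DirectorySeparatorChar can be sketched as follows. This version also splits on Path.AltDirectorySeparatorChar, so it should work with the separators of the current platform. The SplitPath helper and the sample path are assumptions for illustration, and the caveat about complex paths still applies.

```csharp
using System;
using System.IO;

class Program
{
    // Hypothetical helper: splits on the platform's directory separators.
    public static string[] SplitPath(string path)
    {
        return path.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
    }

    static void Main()
    {
        // Path.Combine inserts the separator for the current platform.
        string dir = Path.Combine("Users", "Sam", "Documents");

        foreach (string part in SplitPath(dir))
        {
            Console.WriteLine(part);
        }
    }
}
```

This should print Users, Sam and Documents on separate lines on both Windows and Linux.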
Benchmark, Split. Here we test strings with 40 and 1200 chars. Speed varied with the contents of the strings. The length of blocks, number of delimiters, and total size all factor into performance.
Version 1 This code uses Regex.Split to split the strings. It is tested on both a long string and a short string.
Version 2 This code uses the string.Split method, but with the first argument being a char array. Two chars are in the char array.
Version 3 This version uses string.Split as well, but with a string array argument.
Result On .NET 5 for Linux (in 2021), Regex.Split remains the slowest. Splitting on a char or string is faster.
C# program that tests string.Split performance
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class Program
{
    const int _max = 100000;

    static void Main()
    {
        // Get long string.
        string value1 = string.Empty;
        for (int i = 0; i < 120; i++)
        {
            value1 += "01234567\r\n";
        }

        // Get short string.
        string value2 = string.Empty;
        for (int i = 0; i < 10; i++)
        {
            value2 += "ab\r\n";
        }

        // Put strings in array.
        string[] tests = { value1, value2 };
        foreach (string test in tests)
        {
            Console.WriteLine("Testing length: " + test.Length);

            // Version 1: use Regex.Split.
            var s1 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = Regex.Split(test, "\r\n", RegexOptions.Compiled);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s1.Stop();

            // Version 2: use char array split.
            var s2 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = test.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s2.Stop();

            // Version 3: use string array split.
            var s3 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = test.Split(new string[] { "\r\n" }, StringSplitOptions.None);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s3.Stop();

            Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
            Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
            Console.WriteLine(((double)(s3.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        }
    }
}

Testing length: 1200
7546.61 ns
4483.39 ns
5632.97 ns
Testing length: 40
786.97 ns
357.58 ns
344.27 ns
Benchmark, array argument. Here we examine delimiter performance. It is worthwhile to declare, and allocate, the char array argument as a local variable.
Version 1 This code creates a new char array with 2 elements on each Split call. These must all be garbage-collected.
Version 2 This version uses a single char array, created before the loop. It reuses the cached char array each time.
Result On .NET 5, in 2021 on Linux, caching the array argument to Split() helps performance.
C# program that tests Split, cached char array
using System;
using System.Diagnostics;

class Program
{
    const int _max = 10000000;

    static void Main()
    {
        string value = "a b,c";
        char[] delimiterArray = new char[] { ' ', ',' };

        // Version 1: split with a new char array on each call.
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            string[] result = value.Split(new char[] { ' ', ',' });
            if (result.Length == 0)
            {
                return;
            }
        }
        s1.Stop();

        // Version 2: split using a cached char array on each call.
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            string[] result = value.Split(delimiterArray);
            if (result.Length == 0)
            {
                return;
            }
        }
        s2.Stop();

        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
    }
}

83.70 ns    Split, new char[]
76.83 ns    Split, existing char[]
Arrays. The string Split method can receive a char array as its first parameter. Each char in the array is considered a string delimiter.
And A string array can also be passed to Split(). The string array can be created inline with the Split call.
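Both array forms can be sketched in one short program. The sample strings are assumptions for the demonstration. The second call relies on the params keyword, so the char array is created by the compiler.

```csharp
using System;

class Program
{
    static void Main()
    {
        string value = "a b,c";

        // A char array: each char is treated as a delimiter.
        string[] byChars = value.Split(new char[] { ' ', ',' });

        // The params form: the same delimiters, no explicit array.
        string[] byParams = value.Split(' ', ',');

        // A string array created inline with the Split call.
        string[] byStrings = "one, two; three".Split(
            new string[] { ", ", "; " }, StringSplitOptions.None);

        Console.WriteLine(string.Join("|", byChars));
        Console.WriteLine(string.Join("|", byParams));
        Console.WriteLine(string.Join("|", byStrings));
    }
}
```

The first two lines of output should both be a|b|c, and the third should be one|two|three.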
Performance note. The performance of .NET is undergoing a transformation with .NET 5 and open source in 2021. Now Regex.Split benchmarks closer to other Split methods.
Join. With the string.Join method, we can combine separate strings with a separating delimiter. Join() can be used to round-trip data. It is the opposite of Split.
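A round-trip with Split and string.Join might look like this minimal sketch. The input string is an assumption for the demonstration.

```csharp
using System;

class Program
{
    static void Main()
    {
        string input = "cat;bird;dog";

        // Separate the string into parts.
        string[] parts = input.Split(';');

        // Combine the parts again with the same delimiter.
        string joined = string.Join(";", parts);

        Console.WriteLine(joined);
        Console.WriteLine(joined == input);
    }
}
```

The round-trip is exact here because Split was called without options. With RemoveEmptyEntries or TrimEntries, the joined string could differ from the input.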
Replace. Split does not handle escaped characters. We can instead use Replace on a string input to substitute special characters for any escaped characters.
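One way to apply this idea is sketched below. It assumes that a backslash followed by a comma marks an escaped comma, and that the placeholder character never occurs in the data. The SplitEscaped helper and the Placeholder constant are hypothetical names, and escaped backslashes are not handled.

```csharp
using System;

class Program
{
    // Assumption: this control character never appears in the input.
    const string Placeholder = "\u0001";

    // Hypothetical helper: splits on commas, except escaped ones.
    public static string[] SplitEscaped(string input)
    {
        // Hide each escaped comma behind the placeholder.
        string hidden = input.Replace("\\,", Placeholder);

        // Split on the remaining (unescaped) commas.
        string[] parts = hidden.Split(',');

        // Restore a plain comma where the placeholder was.
        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Replace(Placeholder, ",");
        }
        return parts;
    }

    static void Main()
    {
        // The first comma is escaped and belongs to the first field.
        string input = @"red\, dark,green,blue";

        foreach (string part in SplitEscaped(input))
        {
            Console.WriteLine("PART: " + part);
        }
    }
}
```

This should print 3 parts, the first being "red, dark". For real CSV data with quoting rules, a dedicated parser is likely a better fit.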
IndexOf, Substring. Methods can be combined. Using IndexOf and Substring together is another way to split strings. This is sometimes more effective.
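As a sketch of this approach, the program below separates a string at the first delimiter only, leaving later delimiters in the second part. The key and value input is an assumption for the demonstration.

```csharp
using System;

class Program
{
    static void Main()
    {
        string input = "key=value=more";

        // Find the first delimiter.
        int index = input.IndexOf('=');
        if (index >= 0)
        {
            // Take the text before and after the delimiter.
            string left = input.Substring(0, index);
            string right = input.Substring(index + 1);

            Console.WriteLine("LEFT: " + left);
            Console.WriteLine("RIGHT: " + right);
        }
    }
}
```

Only 2 strings are created here and no array is allocated, which may be why this is sometimes more effective than Split.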
A summary. By invoking the Split method, we separate strings. And we solve problems: split divides (separates) strings, and keeps code as simple as possible.
© 2007-2021 Sam Allen.