C# Split String Examples
Use the string.Split method. Call Split with arguments to separate on newlines, spaces and words.
dot net perls

Split. Bamboo grows in sections. Each part is connected, but also separate. Like bamboo, strings often come in parts—we must separate them with Split().

String parts. Delimiters are the separators in string data. We can split lines and words from a string based on chars, strings or newlines.

First example. We examine the simplest Split method. It receives a char array (one that uses the params keyword) but we can specify this with a single char argument.

Part 1 We invoke Split() with a single character argument. The result value is a string array—it contains 2 elements.


Part 2 We use a foreach-loop to iterate over the strings in the array. We display each word.


C# program that splits on character
using System;

class Program
{
    static void Main()
    {
        // Contains a semicolon delimiter.
        string input = "cat;bird";
        Console.WriteLine($"Input: {input}");

        // Part 1: split on a single character.
        string[] array = input.Split(';');

        // Part 2: use a foreach-loop.
        // ... Print each value in the array.
        foreach (string value in array)
        {
            Console.WriteLine($"Part: {value}");
        }
    }
}

Input: cat;bird
Part: cat
Part: bird
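
Params sketch. Because the separator parameter uses the params keyword, several chars can be passed directly, with no array created by hand. The input string here is a hypothetical example and is not taken from the program above.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Hypothetical input with two kinds of delimiters.
        string input = "cat;bird,frog";

        // The params keyword lets us list the chars as separate arguments.
        string[] parts = input.Split(';', ',');

        foreach (string part in parts)
        {
            Console.WriteLine($"Part: {part}");
        }
    }
}
```

This should print cat, bird and frog, each on its own line after the "Part:" label.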

Multiple characters. Next we use Split() to separate a string based on multiple characters. If a call to Split() with a string delimiter does not compile, try passing a string array together with a StringSplitOptions argument.

Argument 1 The first argument is the delimiter sequence. We create a string array containing one element.

Argument 2 For the second argument, we specify StringSplitOptions.None to ensure the correct method is called.

C# program that splits on string delimiters
using System;

class Program
{
    static void Main()
    {
        string value = "cat\r\ndog";

        // Split the string on line breaks.
        string[] lines = value.Split(new string[] { "\r\n" }, StringSplitOptions.None);

        // Loop over the array.
        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}

cat
dog
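
Mixed line breaks. The string array can hold more than one delimiter. As a minimal sketch, assuming the input mixes Windows (\r\n) and Unix (\n) line breaks, we can list both sequences in the array. The input string is a hypothetical example.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Hypothetical input: one Windows line break and one Unix line break.
        string value = "cat\r\ndog\nbird";

        // Both delimiter strings are placed in the array.
        string[] lines = value.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);

        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}
```

The three words cat, dog and bird are printed on separate lines, and no stray \r characters remain in the results.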

RemoveEmptyEntries. Sometimes Split() can return an array with empty strings in it—this can be unwanted. This can happen when 2 delimiters are adjacent.

StringSplitOptions This is an enum. It does not need to be allocated with a constructor—it is more like a special int value.


Argument 1 Here we pass arrays for the first argument to string Split(). A char array, and string array, are used.

Argument 2 We use RemoveEmptyEntries as the second parameter to avoid empty results. They are not added to the array.

C# program that splits on multiple characters
using System;

class Program
{
    static void Main()
    {
        // ... Parts are separated by Windows line breaks.
        string value = "shirt\r\ndress\r\npants\r\njacket";

        // Use a char array of 2 characters (\r and \n).
        // ... Break lines into separate strings.
        // ... Use RemoveEmptyEntries so empty strings are not added.
        char[] delimiters = new char[] { '\r', '\n' };
        string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine(":::SPLIT, CHAR ARRAY:::");
        for (int i = 0; i < parts.Length; i++)
        {
            Console.WriteLine(parts[i]);
        }

        // ... Same but uses a string of 2 characters.
        string[] partsFromString = value.Split(new string[] { "\r\n" }, StringSplitOptions.None);
        Console.WriteLine(":::SPLIT, STRING:::");
        for (int i = 0; i < partsFromString.Length; i++)
        {
            Console.WriteLine(partsFromString[i]);
        }
    }
}

:::SPLIT, CHAR ARRAY:::
shirt
dress
pants
jacket
:::SPLIT, STRING:::
shirt
dress
pants
jacket

Regex.Split, words. We can separate words with Split. Often the best way to separate words in a C# string is to use a Regex that acts upon non-word chars.


Here This example separates words in a string based on non-word characters. It eliminates punctuation and whitespace.

Tip Regex provides more power and control than the string Split methods. But the code is harder to read.

Argument 1 The first argument to Regex.Split is the string we are trying to split apart.

Argument 2 This is a Regex pattern. We can specify any character set (or range) with Regex.Split.

C# program that separates on non-word pattern
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        const string sentence = "Hello, my friend";

        // Split on all non-word characters.
        // ... This returns an array of all the words.
        string[] words = Regex.Split(sentence, @"\W+");
        foreach (string value in words)
        {
            Console.WriteLine("WORD: " + value);
        }
    }
}

WORD: Hello
WORD: my
WORD: friend

Regex description:
@      Special verbatim string syntax.
\W+    One or more non-word characters together.
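
Empty results. Regex.Split includes an empty string in the result when a match occurs at the very start or end of the input. Here is a small sketch, assuming a hypothetical sentence that begins and ends with punctuation, that filters those entries out with LINQ.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical input that begins and ends with non-word characters.
        const string sentence = "...Hello, my friend!";

        // The raw result has an empty string at each end.
        string[] raw = Regex.Split(sentence, @"\W+");

        // Keep only the non-empty entries.
        string[] words = raw.Where(w => w.Length > 0).ToArray();

        Console.WriteLine("Raw count: " + raw.Length);
        Console.WriteLine("Word count: " + words.Length);
        foreach (string word in words)
        {
            Console.WriteLine("WORD: " + word);
        }
    }
}
```

The raw array should have 5 elements and the filtered array 3, so only Hello, my and friend are printed as words.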

Text files. Here we have a text file containing comma-delimited lines of values—a CSV file. We use File.ReadAllLines to read lines, but StreamReader can be used instead.


Then It displays the values of each line after the line number. The output shows how the file was parsed into the strings.

C# program that splits lines in file
using System;
using System.IO;

class Program
{
    static void Main()
    {
        int i = 0;
        foreach (string line in File.ReadAllLines("TextFile1.txt"))
        {
            string[] parts = line.Split(',');
            foreach (string part in parts)
            {
                Console.WriteLine("{0}:{1}", i, part);
            }
            i++; // For demonstration.
        }
    }
}

Contents of input file: TextFile1.txt
Dog,Cat,Mouse,Fish,Cow,Horse,Hyena
Programmer,Wizard,CEO,Rancher,Clerk,Farmer

0:Dog
0:Cat
0:Mouse
0:Fish
0:Cow
0:Horse
0:Hyena
1:Programmer
1:Wizard
1:CEO
1:Rancher
1:Clerk
1:Farmer
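
StreamReader version. The text notes that StreamReader can be used in place of File.ReadAllLines. This is one possible sketch of that approach. To stay self-contained it first writes a small sample file to the temp directory. The file name SplitExample.txt and its contents are assumptions for illustration only.

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Assumption: create a small sample file so the sketch runs on its own.
        string path = Path.Combine(Path.GetTempPath(), "SplitExample.txt");
        File.WriteAllLines(path, new string[] { "Dog,Cat,Mouse", "Programmer,Wizard" });

        int i = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            // ReadLine returns null at the end of the file.
            while ((line = reader.ReadLine()) != null)
            {
                foreach (string part in line.Split(','))
                {
                    Console.WriteLine("{0}:{1}", i, part);
                }
                i++;
            }
        }

        // Remove the sample file.
        File.Delete(path);
    }
}
```

StreamReader reads one line at a time, which may help with large files because the whole file is not held in an array.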

Directory paths. We can split the segments in a Windows local directory into separate strings. Please note that directory paths are complex. This code may not correctly handle all cases.

Tip We could use Path.DirectorySeparatorChar, a char field in System.IO, for more flexibility.


C# program that splits Windows directories
using System;

class Program
{
    static void Main()
    {
        // The directory from Windows.
        const string dir = @"C:\Users\Sam\Documents\Perls\Main";

        // Split on directory separator.
        string[] parts = dir.Split('\\');
        foreach (string part in parts)
        {
            Console.WriteLine(part);
        }
    }
}

C:
Users
Sam
Documents
Perls
Main
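
Separator sketch. As a rough illustration of the Path.DirectorySeparatorChar idea, we can build a relative path with Path.Combine and split it on the platform separator. The directory names are hypothetical. Like the example above, this may not handle every kind of path.

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Hypothetical relative path, built with the platform separator.
        string dir = Path.Combine("Users", "Sam", "Documents");

        // Split on the same separator that Path.Combine inserted.
        string[] parts = dir.Split(Path.DirectorySeparatorChar);
        foreach (string part in parts)
        {
            Console.WriteLine(part);
        }
    }
}
```

Because the same separator is used to build and to split the path, the three segments are printed on Windows and on Unix-like systems alike.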

StringSplitOptions. This affects the behavior of Split. The values None and RemoveEmptyEntries are enum values (stored as integers) that tell Split how to work. Newer versions (.NET 5 and later) also include TrimEntries.

Note In this example, the input string contains five commas. These commas are the delimiters.

And Two fields between commas are 0 characters long—they are empty. They are treated differently when we use RemoveEmptyEntries.

First call In the first call to Split, these fields are put into the result array. These elements equal string.Empty.

Second call We specify StringSplitOptions.RemoveEmptyEntries. The two empty fields are not in the result array.

C# program that uses StringSplitOptions
using System;

class Program
{
    static void Main()
    {
        // Input string contains separators.
        string value1 = "man,woman,child,,,bird";
        char[] delimiter1 = new char[] { ',' }; // <-- Split on these

        // ... Use StringSplitOptions.None.
        string[] array1 = value1.Split(delimiter1, StringSplitOptions.None);
        foreach (string entry in array1)
        {
            Console.WriteLine(entry);
        }

        // ... Use StringSplitOptions.RemoveEmptyEntries.
        string[] array2 = value1.Split(delimiter1, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine();
        foreach (string entry in array2)
        {
            Console.WriteLine(entry);
        }
    }
}

man
woman
child


bird

man
woman
child
bird
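
Counts. To make the difference easier to see, this short sketch prints the array lengths for the same input, along with the integer values behind the two enum members. The variable names are chosen for illustration.

```csharp
using System;

class Program
{
    static void Main()
    {
        string value = "man,woman,child,,,bird";
        char[] delimiter = new char[] { ',' };

        // Five commas give six fields, and two of them are empty.
        int countNone = value.Split(delimiter, StringSplitOptions.None).Length;
        int countRemove = value.Split(delimiter, StringSplitOptions.RemoveEmptyEntries).Length;

        Console.WriteLine("None: " + countNone);
        Console.WriteLine("RemoveEmptyEntries: " + countRemove);

        // The enum members are backed by integers.
        Console.WriteLine((int)StringSplitOptions.None);
        Console.WriteLine((int)StringSplitOptions.RemoveEmptyEntries);
    }
}
```

The lengths should be 6 and 4, and the integer values 0 and 1.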

Benchmark, Split. Here we test strings with 40 and 1200 chars. Speed varied on the contents of strings. The length of blocks, number of delimiters, and total size factor into performance.

Version 1 This code uses Regex.Split to separate the strings. It is tested on both a long string and a short string.

Version 2 Uses the string.Split method, but with the first argument being a char array. Two chars are in the char array.

Version 3 Uses string.Split as well, but with a string array argument. The 3 versions are compared.

Result Splitting with a char array is the fastest for both short and long strings. Regex.Split is slowest (but has more features).

C# program that tests string.Split performance
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class Program
{
    const int _max = 100000;

    static void Main()
    {
        // Get long string.
        string value1 = string.Empty;
        for (int i = 0; i < 120; i++)
        {
            value1 += "01234567\r\n";
        }

        // Get short string.
        string value2 = string.Empty;
        for (int i = 0; i < 10; i++)
        {
            value2 += "ab\r\n";
        }

        // Put strings in array.
        string[] tests = { value1, value2 };
        foreach (string test in tests)
        {
            Console.WriteLine("Testing length: " + test.Length);

            // Version 1: use Regex.Split.
            var s1 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = Regex.Split(test, "\r\n", RegexOptions.Compiled);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s1.Stop();

            // Version 2: use char array split.
            var s2 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = test.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s2.Stop();

            // Version 3: use string array split.
            var s3 = Stopwatch.StartNew();
            for (int i = 0; i < _max; i++)
            {
                string[] result = test.Split(new string[] { "\r\n" }, StringSplitOptions.None);
                if (result.Length == 0)
                {
                    return;
                }
            }
            s3.Stop();

            Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
            Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
            Console.WriteLine(((double)(s3.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        }
    }
}

Testing length: 1200
21442.64 ns    Regex.Split
 5562.63 ns    Split char[]
 6556.60 ns    Split string[]
Testing length: 40
 2236.22 ns    Regex.Split
  371.55 ns    Split char[]
  423.46 ns    Split string[]

Benchmark, array argument. Here we examine delimiter performance. It is worthwhile to declare, and allocate, the char array argument as a local variable.

Version 1 This code creates a new char array with 2 elements on each Split call. These must all be garbage-collected.

Version 2 This version uses a single char array, created before the loop. It reuses the cached char array each time.

Result By caching a char array (or string array), we can improve split call performance by a small amount.

C# program that tests Split, cached char array
using System;
using System.Diagnostics;

class Program
{
    const int _max = 10000000;

    static void Main()
    {
        string value = "a b,c";
        char[] delimiterArray = new char[] { ' ', ',' };

        // Version 1: split with a new char array on each call.
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            string[] result = value.Split(new char[] { ' ', ',' });
            if (result.Length == 0)
            {
                return;
            }
        }
        s1.Stop();

        // Version 2: split using a cached char array on each call.
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            string[] result = value.Split(delimiterArray);
            if (result.Length == 0)
            {
                return;
            }
        }
        s2.Stop();

        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
    }
}

87.61 ns    Split, new char[]
84.34 ns    Split, existing char[]
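
Static field. One way to apply the caching idea outside a benchmark is a static readonly field. This is a sketch only. The names Delimiters and SplitFields are hypothetical, and the gain is likely small, as the numbers above suggest.

```csharp
using System;

class Program
{
    // Hypothetical field: allocated once and shared by every call.
    static readonly char[] Delimiters = new char[] { ' ', ',' };

    // Hypothetical helper that reuses the cached array.
    static string[] SplitFields(string value)
    {
        return value.Split(Delimiters);
    }

    static void Main()
    {
        foreach (string part in SplitFields("a b,c"))
        {
            Console.WriteLine(part);
        }
    }
}
```

The program prints a, b and c on separate lines. The array must not be modified after creation, because all callers share it.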

Arrays. The string Split method receives a character array as the first parameter. Each char in the array designates a new block in the string data.


And A string array can also be passed to the Split method. The new string array is created inline with the Split call.


Internals. What is inside Split? The logic internal to the .NET Framework for Split is implemented in managed code. Methods call into an overload with three parameters.

Next The parameters are checked for validity. It uses unsafe code to create a separator list, and a for-loop combined with Substring.
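
Simplified sketch. The two steps above (build a separator list, then loop with Substring) can be imitated in plain managed code. This is not the framework's source. SimpleSplit is a hypothetical helper that handles a single char separator and uses no unsafe code.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Hypothetical helper: a simplified imitation, not the real implementation.
    static string[] SimpleSplit(string input, char separator)
    {
        // Step 1: record the index of every separator.
        List<int> separatorIndexes = new List<int>();
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == separator)
            {
                separatorIndexes.Add(i);
            }
        }

        // Step 2: use Substring to copy the text between separators.
        string[] result = new string[separatorIndexes.Count + 1];
        int start = 0;
        for (int i = 0; i < separatorIndexes.Count; i++)
        {
            result[i] = input.Substring(start, separatorIndexes[i] - start);
            start = separatorIndexes[i] + 1;
        }

        // The last part runs to the end of the string.
        result[separatorIndexes.Count] = input.Substring(start);
        return result;
    }

    static void Main()
    {
        foreach (string part in SimpleSplit("man,woman,,bird", ','))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

The output is [man], [woman], [] and [bird]. The empty brackets show that adjacent delimiters produce an empty string, as with StringSplitOptions.None.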


Join. With this method, we can combine separate strings with a separating delimiter. Join() can be used to round-trip data. It is the opposite of split.
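
Round-trip sketch. Here is a minimal illustration of the round-trip idea, assuming a hypothetical semicolon-delimited input. We split the string, then call string.Join with the same delimiter.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Hypothetical input.
        string original = "cat;bird;frog";

        // Split into parts, then join them again with the same delimiter.
        string[] parts = original.Split(';');
        string joined = string.Join(";", parts);

        Console.WriteLine(joined);
        Console.WriteLine(joined == original);
    }
}
```

The joined string equals the original, so the second line printed is True.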


Replace. Split does not handle escaped characters. We can instead use Replace on a string input to substitute special characters for any escaped characters.
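
Placeholder sketch. One possible way to apply this idea is shown here. It assumes that a backslash before a comma marks an escaped comma, and that the placeholder character \u0001 never occurs in the data. Both are assumptions made for illustration.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Assumption: this character never appears in the input.
        const string placeholder = "\u0001";

        // Hypothetical input: the first comma is escaped with a backslash.
        string input = @"red\,green,blue";

        // Hide the escaped commas, then split on the remaining ones.
        string hidden = input.Replace(@"\,", placeholder);
        string[] parts = hidden.Split(',');

        // Restore the commas that were escaped.
        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Replace(placeholder, ",");
            Console.WriteLine(parts[i]);
        }
    }
}
```

Two lines are printed: "red,green" and "blue". A real format may need more rules, such as escaped backslashes.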


IndexOf, Substring. Methods can be combined. Using IndexOf and Substring together is another way to split strings. This is sometimes more effective.
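
First delimiter only. A short sketch of this combination, using a hypothetical key and value string. We find the first equals sign with IndexOf and take the text on each side with Substring.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Hypothetical input: only the first '=' separates key from value.
        string input = "key=value=more";

        int index = input.IndexOf('=');
        if (index >= 0)
        {
            string key = input.Substring(0, index);
            string rest = input.Substring(index + 1);
            Console.WriteLine(key);
            Console.WriteLine(rest);
        }
    }
}
```

This prints "key" and then "value=more". Later delimiters stay in the second part, and no array is allocated.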



A summary. By invoking the Split method, we separate strings into parts. Split divides a string on the delimiters we specify, and it keeps code as simple as possible.

© 2007-2021 sam allen. send bug reports to info@dotnetperls.com.