C#:Regex


Regex groups. Regex.Match returns a Match object. The Groups property on a Match returns the groups captured by the regular expression. It is useful for extracting one part of a matched string, and it also works when the pattern has several captured parts.


Example. To start, IndexOf and LastIndexOf are inflexible when compared to Regex.Match. The Regex type gives more control. It lets you specify substrings with a certain range of characters, such as A-Za-z0-9.

Here: This example has good control over what substring it matches. We find characters between two substrings.

C# program that uses Match Groups

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	// A
	// The input string we are using
	string input = "OneTwoThree";

	// B
	// The regular expression we use to match
	Regex r1 = new Regex(@"One([A-Za-z0-9\-]+)Three");

	// C
	// Match the input and write results
	Match match = r1.Match(input);
	if (match.Success)
	{
	    string v = match.Groups[1].Value;
	    Console.WriteLine("Between One and Three: {0}",
		v);
	}
    }
}

Output

Between One and Three: Two

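In part B, the pattern is written as a verbatim string literal (prefixed with @). In a verbatim literal a backslash is an ordinary character, so \- reaches the Regex unchanged; in a regular string literal it would have to be written as \\-. In part C, we call Match on the Regex we created. This returns a Match object. We extract the capture from this object.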
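The difference between the two literal forms can be checked directly. The following is a minimal sketch that reuses the pattern from part B; the class name VerbatimLiteralSketch is only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

class VerbatimLiteralSketch
{
    static void Main()
    {
        // The same pattern written as a verbatim literal and as a regular literal.
        string verbatim = @"One([A-Za-z0-9\-]+)Three";
        string regular = "One([A-Za-z0-9\\-]+)Three";

        // Both literals produce the identical string, so this prints True.
        Console.WriteLine(verbatim == regular);

        // Either form yields the same capture: Two.
        Console.WriteLine(Regex.Match("OneTwoThree", regular).Groups[1].Value);
    }
}
```

The verbatim form is usually easier to read for patterns with many backslashes.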


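Note: It is important to use the Groups[1] syntax here. Groups[0] holds the entire match. The parts captured by parentheses are numbered from 1, in the order of their opening parentheses.

Note 2: The Groups collection is zero-based like other .NET collections. Index 0 is reserved for the whole match, which trips up developers who expect the first capture there.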
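To see how group indexes line up, here is a small sketch. It reuses the pattern from the first example and adds a second, hypothetical "key=value" pattern with two captures; the input "width=120" and the class name are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupIndexSketch
{
    static void Main()
    {
        Match m = Regex.Match("OneTwoThree", @"One([A-Za-z0-9\-]+)Three");

        // Groups[0] is the entire matched text.
        Console.WriteLine(m.Groups[0].Value); // OneTwoThree

        // Groups[1] is the first parenthesized capture.
        Console.WriteLine(m.Groups[1].Value); // Two

        // Count includes group 0, so one capture gives a count of 2.
        Console.WriteLine(m.Groups.Count); // 2

        // A pattern with two captures (illustrative input).
        Match kv = Regex.Match("width=120", @"(\w+)=(\d+)");
        if (kv.Success)
        {
            // Prints: width -> 120
            Console.WriteLine("{0} -> {1}", kv.Groups[1].Value, kv.Groups[2].Value);
        }
    }
}
```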


Example 2. Alternatively you can solve this problem by using the IndexOf and LastIndexOf methods. There are many small variations on this code pattern. It is more fragile. It normally requires more development effort.

C# program that uses IndexOf and LastIndexOf

using System;

class Program
{
    static void Main()
    {
	// A
	// The input string we are using
	string input = "OneTwoThree";

	// B
	// Find first instance of this string
	int i1 = input.IndexOf("One");
	if (i1 != -1)
	{
	    // C
	    // Find last instance of the right-side string
	    int i2 = input.LastIndexOf("Three");
	    if (i2 != -1)
	    {
		// D
		// Get the substring and print it.
		int start = i1 + "One".Length;
		int len = i2 - start;
		string bet = input.Substring(start, len);
		Console.WriteLine("Between One and Three: {0}",
		    bet);
	    }
	}
    }
}

Output

Between One and Three: Two

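This code uses character positions. It finds the first instance of the left-side string, then the last instance of the right-side string. Unlike the Regex version, it doesn't restrict the text in between to letters, digits and hyphens.

Caution: It can fail on other inputs. For example, if the last "Three" occurs before the end of "One", the computed length is negative and Substring throws an ArgumentOutOfRangeException. But it is likely that this version is faster.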
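One way to make the index-based version less fragile is to check that the right-side string starts at or after the end of the left-side string before calling Substring. The sketch below is one possible guard, not the only one; the helper name TryBetweenByIndex and the extra test inputs are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public class FragileIndexSketch
{
    // Hypothetical helper: returns false instead of throwing when the
    // delimiters are missing or appear in the wrong order.
    public static bool TryBetweenByIndex(string input, out string between)
    {
        between = string.Empty;
        int i1 = input.IndexOf("One", StringComparison.Ordinal);
        if (i1 == -1) return false;

        int start = i1 + "One".Length;
        int i2 = input.LastIndexOf("Three", StringComparison.Ordinal);

        // Covers both "not found" (-1) and "Three" located before the end of "One".
        if (i2 < start) return false;

        between = input.Substring(start, i2 - start);
        return true;
    }

    static void Main()
    {
        string found;

        // Normal case. Prints: [Two]
        Console.WriteLine(TryBetweenByIndex("OneTwoThree", out found) ? "[" + found + "]" : "(no match)");

        // Reversed delimiters: without the guard, Substring would get a negative length.
        // Prints: (no match)
        Console.WriteLine(TryBetweenByIndex("ThreeOne", out found) ? "[" + found + "]" : "(no match)");

        // The index version accepts any characters in between. Prints: [ Two! ]
        Console.WriteLine(TryBetweenByIndex("One Two! Three", out found) ? "[" + found + "]" : "(no match)");

        // The Regex version rejects the same input because of its character class. Prints: False
        Console.WriteLine(Regex.IsMatch("One Two! Three", @"One([A-Za-z0-9\-]+)Three"));
    }
}
```

Whether accepting " Two! " is a bug or a feature depends on the input you expect, which is the trade-off between the two approaches.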


Note: Regex performance is a difficult subject. Often it doesn't matter that much.
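If the same pattern runs many times, a common approach is to build the Regex once and reuse it, optionally with RegexOptions.Compiled. The sketch below shows the shape of that pattern only. It makes no claim about the speedup, which depends on the workload and should be measured; the class name and the extra inputs are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

class CachedRegexSketch
{
    // Built once and reused. RegexOptions.Compiled trades a higher
    // construction cost for potentially faster matching.
    static readonly Regex Between =
        new Regex(@"One([A-Za-z0-9\-]+)Three", RegexOptions.Compiled);

    static void Main()
    {
        string[] inputs = { "OneTwoThree", "One-42-Three", "OneThree" };
        foreach (string input in inputs)
        {
            Match m = Between.Match(input);
            // Prints the capture, or "(no match)" when nothing lies between the delimiters.
            Console.WriteLine("{0}: {1}", input,
                m.Success ? m.Groups[1].Value : "(no match)");
        }
    }
}
```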


Summary. You can find a string between two delimiters of multiple characters. The Split method doesn't give you as much control over delimiters. You can compile these two examples into C# console programs, see their output, and modify as needed.