C# Regex.Match

Regex.Match searches a string for the first occurrence of a pattern and returns it as a Match object. The pattern is written in the regular expression language, a compact text-processing notation. This makes Regex.Match useful for isolating part of a string in many C# programs.

Input and output required for examples

Input string:   /content/some-page.aspx
Required match: some-page

Input string:   /content/alternate-1.aspx
Required match: alternate-1

Input string:   /images/something.png
Required match: -

Example

We first see how to match the file name in a directory path with Regex. The pattern is stricter about acceptable characters than most string methods: only letters, digits and hyphens are allowed in the name. You can see the character range in the pattern, which is the second argument to Regex.Match.

Program that uses Regex.Match: C#

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	// First we see the input string.
	string input = "/content/alternate-1.aspx";

	// Here we call Regex.Match.
	Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
	    RegexOptions.IgnoreCase);

	// Here we check the Match instance.
	if (match.Success)
	{
	    // Finally, we get the Group value and display it.
	    string key = match.Groups[1].Value;
	    Console.WriteLine(key);
	}
    }
}

Output

alternate-1

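In this example, the pattern is written as a verbatim string literal, marked by the @ prefix. In a verbatim literal the C# compiler does not treat the backslash as an escape character, so regex escapes such as \. and \- can be written directly. The pattern starts with "content/". We require that our group, which is in parentheses, comes after the "content/" string.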
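
To make the role of the @ prefix concrete, here is a small sketch. The class name VerbatimDemo and its members are made up for illustration. It writes the same pattern once as a verbatim literal and once as a regular literal with doubled backslashes, and checks that both produce the identical string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original program.
static class VerbatimDemo
{
    // Verbatim literal: backslashes reach the regex engine unchanged.
    public const string Verbatim = @"content/([A-Za-z0-9\-]+)\.aspx$";

    // Regular literal: each backslash must be doubled for the C# compiler.
    public const string Escaped = "content/([A-Za-z0-9\\-]+)\\.aspx$";

    // True when both literals describe the same pattern text.
    public static bool SamePattern()
    {
        return Verbatim == Escaped;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(VerbatimDemo.SamePattern()); // True

        Match match = Regex.Match("/content/some-page.aspx", VerbatimDemo.Escaped);
        Console.WriteLine(match.Groups[1].Value); // some-page
    }
}
```

The verbatim form is usually easier to read, because the pattern in the source code is exactly the pattern the regex engine receives.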

Also: The symbols between "[" and "]" are ranges of characters, or single characters. These are the allowed characters in our group. The + after the brackets requires one or more of them.

What it captures: the pattern captures a Group. The content matched by the part in parentheses is collected in that Group. We first check that the match succeeded, and then we access the value with Groups[1].

Tip: It is important to note that the captured groups in the Groups collection on Match objects start at index 1.

And: The collection itself is zero-based like other C# collections, but Groups[0] always holds the entire matched text. The first parenthesized group is therefore at Groups[1], and we must remember this.
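
The following sketch shows the difference between Groups[0] and Groups[1] on the three input strings from the table at the top. The class name GroupDemo and its methods are hypothetical; the pattern is the one from the first program.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original program.
static class GroupDemo
{
    const string Pattern = @"content/([A-Za-z0-9\-]+)\.aspx$";

    // Returns the entire matched text (Groups[0]), or null when nothing matches.
    public static string WholeMatch(string input)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[0].Value : null;
    }

    // Returns the first parenthesized group (Groups[1]), or null when nothing matches.
    public static string Key(string input)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }
}

class Program
{
    static void Main()
    {
        string[] inputs =
        {
            "/content/some-page.aspx",
            "/content/alternate-1.aspx",
            "/images/something.png"
        };

        foreach (string input in inputs)
        {
            // A dash marks inputs that do not match, as in the table above.
            Console.WriteLine("{0} -> whole: {1}, key: {2}",
                input,
                GroupDemo.WholeMatch(input) ?? "-",
                GroupDemo.Key(input) ?? "-");
        }
    }
}
```

For the first input, Groups[0] is "content/some-page.aspx" and Groups[1] is "some-page". The third input does not match, so both helpers return null.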

ToLower

In my tests, calling ToLower on the input and dropping RegexOptions.IgnoreCase from the Regex made matching 10% or more faster. Since I needed a lowercase result anyway, calling the C# string ToLower method first was also simpler.

Program that also uses Regex.Match: C#

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	// This is the input string.
	string input = "/content/alternate-1.aspx";

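	// Here we lowercase our input first.
	input = input.ToLower();
	Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");

	// Display the captured group if the match succeeded.
	if (match.Success)
	{
	    Console.WriteLine(match.Groups[1].Value);
	}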
    }
}
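
One caveat: ToLower applies the casing rules of the current culture, which can give surprising results for a few letters in some cultures (the Turkish dotless i is the usual example). A possible variation, which is not part of the original program, is to use ToLowerInvariant instead. The helper name LowerKey.Extract below is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; a variation on the program above, not the original code.
static class LowerKey
{
    public static string Extract(string input)
    {
        // Lowercase with culture-independent rules, then match without IgnoreCase.
        string lowered = input.ToLowerInvariant();
        Match match = Regex.Match(lowered, @"content/([a-z0-9\-]+)\.aspx$");
        return match.Success ? match.Groups[1].Value : null;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(LowerKey.Extract("/Content/Alternate-1.ASPX")); // alternate-1
    }
}
```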

Static Regex

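Here we see that using a Regex instance object is typically faster than using the static Regex.Match when the same pattern is used many times. The static method must look the pattern up in an internal cache on each call, while an instance is constructed once. For frequently used patterns, prefer an instance object. It can be stored in a static field and shared throughout the entire project.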

Program that uses static Regex: C#

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	// The input string again.
	string input = "/content/alternate-1.aspx";

	// This calls the static method specified.
	Console.WriteLine(RegexUtil.MatchKey(input));
    }
}

static class RegexUtil
{
    static Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$");
    /// <summary>
    /// This returns the key that is matched within the input.
    /// </summary>
    static public string MatchKey(string input)
    {
	Match match = _regex.Match(input.ToLower());
	if (match.Success)
	{
	    return match.Groups[1].Value;
	}
	else
	{
	    return null;
	}
    }
}

Output

alternate-1

This static class stores a Regex instance that can be used project-wide. We initialize it inline. The class exposes a MatchKey method, which I wrote to return the string that we want from the input value, or null when there is no match.

Pattern description: the pattern uses a letter range. In this code I show the Regex with the "A-Z" range removed, because the string is already lowercased. I found that removing as many options from the Regex as possible boosted performance.

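Tip: With this code, I found that using RegexOptions.RightToLeft made the pattern slightly faster as well.

Note: With RightToLeft the engine starts searching at the end of the input, and because this pattern is anchored with $ it has fewer characters to evaluate. This depends on the pattern and the input, so the option could slow down or speed up your Regex. Measure before relying on it.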
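
As a rough sketch of how the option is applied, the class below repeats the RegexUtil pattern with RegexOptions.RightToLeft added. The class name RightToLeftUtil is hypothetical. The captured key is the same as before; only the search direction changes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant of RegexUtil that searches from the end of the input.
static class RightToLeftUtil
{
    static readonly Regex _regex = new Regex(
        @"/content/([a-z0-9\-]+)\.aspx$",
        RegexOptions.RightToLeft);

    // Returns the key matched within the input, or null when there is no match.
    public static string MatchKey(string input)
    {
        Match match = _regex.Match(input.ToLower());
        return match.Success ? match.Groups[1].Value : null;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(RightToLeftUtil.MatchKey("/content/alternate-1.aspx")); // alternate-1
    }
}
```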

Numbers

One common requirement is extracting a number from a string. We can do this with Regex.Match. Match returns only the first number. If a string has more than one, use Regex.Matches instead.

Next: We extract a group of digit characters and access the Value string representation of that number.

Also: To parse the number, use int.Parse or int.TryParse on the Value here. This will convert it to an int.

Program that uses Match on numbers: C#

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	// ... Input string.
	string input = "Dot Net 100 Perls";

	// ... One or more digits.
	Match m = Regex.Match(input, @"\d+");

	// ... Write value.
	Console.WriteLine(m.Value);
    }
}

Output

100
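
The program above handles a single number and leaves it as a string. For inputs with several numbers, a sketch along the following lines combines Regex.Matches with int.TryParse. The class name NumberExtractor and the sample input are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original program.
static class NumberExtractor
{
    // Returns every run of digits in the input, converted to int.
    public static List<int> ExtractAll(string input)
    {
        var numbers = new List<int>();
        foreach (Match m in Regex.Matches(input, @"\d+"))
        {
            int value;
            // TryParse returns false instead of throwing, for example on overflow.
            if (int.TryParse(m.Value, out value))
            {
                numbers.Add(value);
            }
        }
        return numbers;
    }
}

class Program
{
    static void Main()
    {
        List<int> numbers = NumberExtractor.ExtractAll("Dot 10 Net 200 Perls 3");
        Console.WriteLine(string.Join(",", numbers)); // 10,200,3
    }
}
```

Values that do not fit in an int are skipped here. Whether to skip them or report an error is a design choice that depends on the program.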

Performance

You can add the RegexOptions.Compiled flag for a substantial performance gain at runtime. However, this makes your program start up slower, because the pattern is compiled when the Regex is constructed. With RegexOptions.Compiled, I often see about 30% better matching performance.
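
To check the effect on your own machine, a minimal timing sketch could look like the following. The class name CompiledDemo, its methods and the iteration count are assumptions for illustration. The timings vary with hardware, runtime version and pattern, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical benchmark helper, not part of the original article's code.
static class CompiledDemo
{
    const string Pattern = @"/content/([a-z0-9\-]+)\.aspx$";

    // Interpreted regex: cheaper to construct.
    static readonly Regex _interpreted = new Regex(Pattern);

    // Compiled regex: slower to construct, usually faster to match.
    static readonly Regex _compiled = new Regex(Pattern, RegexOptions.Compiled);

    public static string KeyInterpreted(string input)
    {
        Match match = _interpreted.Match(input);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static string KeyCompiled(string input)
    {
        Match match = _compiled.Match(input);
        return match.Success ? match.Groups[1].Value : null;
    }

    // Runs the matcher the given number of times and returns elapsed milliseconds.
    public static long Time(Func<string, string> matcher, string input, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            matcher(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        string input = "/content/alternate-1.aspx";
        const int iterations = 1000000;

        Console.WriteLine("Interpreted: {0} ms",
            CompiledDemo.Time(CompiledDemo.KeyInterpreted, input, iterations));
        Console.WriteLine("Compiled:    {0} ms",
            CompiledDemo.Time(CompiledDemo.KeyCompiled, input, iterations));
    }
}
```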

Summary

We used Regex.Match. This method extracts a single match from the input string. We can access the matched data with the Value property, and captured groups with the Groups collection. Similar methods, such as IsMatch and Matches, are often helpful.
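
As a brief illustration of those two methods, the sketch below uses Regex.IsMatch for a yes/no test and Regex.Matches to count occurrences. The class name SummaryDemo and its methods are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original programs.
static class SummaryDemo
{
    // True when the input looks like a content page path.
    public static bool IsContentPage(string input)
    {
        return Regex.IsMatch(input, @"/content/([a-z0-9\-]+)\.aspx$");
    }

    // Counts the runs of digits in the input.
    public static int CountNumbers(string input)
    {
        return Regex.Matches(input, @"\d+").Count;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SummaryDemo.IsContentPage("/content/some-page.aspx")); // True
        Console.WriteLine(SummaryDemo.IsContentPage("/images/something.png"));   // False
        Console.WriteLine(SummaryDemo.CountNumbers("1 22 333"));                 // 3
    }
}
```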
