C# - Regex.Match Examples

Regex. Programs read in text and often must process it in some way. Often the easiest way to process text is with regular expressions. The Regex class in C# helps here.

With methods like Match, we pass in a pattern, and receive matches based on that pattern. We can optionally create a Regex instance first.

Simple example. This program introduces the Regex class. Regex, and Match, are found in the System.Text.RegularExpressions namespace.

Step 1 We create a Regex. The Regex uses a pattern that indicates one or more digits.

Step 2 Here we invoke the Match method on the Regex. The characters "55" match the pattern specified in step 1.

Step 3 The returned Match object has a bool property called Success. If it equals true, we found a match.

using System;
using System.Text.RegularExpressions;

// Step 1: create new Regex.
Regex regex = new Regex(@"\d+");

// Step 2: call Match on Regex instance.
Match match = regex.Match("a55a");

// Step 3: test for Success.
if (match.Success)
{
    Console.WriteLine("MATCH VALUE: " + match.Value);
}
MATCH VALUE: 55

Complex example. We do not need to create a Regex instance to use Match: we can invoke the static Regex.Match. This example builds up some complexity—we access Groups after testing Success.

Part 1 This is the string we are testing. Notice how it has a file name part between a directory name and an extension.

Part 2 We use the Regex.Match static method. The second argument is the pattern we wish to match with.

Part 3 We test the result of Match with the Success property. When true, a Match occurred and we can access its Value or Groups.

Part 4 We access Groups when Success is true. Index 0 of this collection holds the entire match, so the first captured group is found at index 1, not zero.

using System;
using System.Text.RegularExpressions;

// Part 1: the input string.
string input = "/content/alternate-1.aspx";

// Part 2: call Regex.Match.
Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
    RegexOptions.IgnoreCase);

// Part 3: check the Match for Success.
if (match.Success)
{
    // Part 4: get the Group value and display it.
    string key = match.Groups[1].Value;
    Console.WriteLine(key);
}
alternate-1
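
Also To make the group indexing concrete, here is a small sketch that reuses the input string with a similar pattern. It reads the same capture by number and by name. The group name "key" is an assumption introduced only for this illustration; it is not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "/content/alternate-1.aspx";

// "key" is a hypothetical group name chosen for this sketch.
Match match = Regex.Match(input, @"content/(?<key>[A-Za-z0-9\-]+)\.aspx$");

string whole = match.Groups[0].Value;      // Entire matched text.
string byIndex = match.Groups[1].Value;    // First capture, by number.
string byName = match.Groups["key"].Value; // Same capture, by name.

Console.WriteLine(whole);
Console.WriteLine(byIndex);
Console.WriteLine(byName);
```

The first line printed is the whole match (content/alternate-1.aspx), and the next two are both alternate-1.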

Start, end matching. We can use metacharacters to match the start and end of strings. This is often done when using regular expressions. Use "^" to match the start, and "$" for the end.

Info Instead of returning a Match object like Regex.Match, IsMatch just returns a bool that indicates success.

Also We can use the special start and end-matching characters in Regex.Match—it will return any possible matches at those positions.

using System;
using System.Text.RegularExpressions;

string test = "xxyy";

// Match the start of a string.
if (Regex.IsMatch(test, "^xx"))
{
    Console.WriteLine("START MATCHES");
}

// Match the end of a string.
if (Regex.IsMatch(test, "yy$"))
{
    Console.WriteLine("END MATCHES");
}
START MATCHES
END MATCHES

NextMatch. More than one match may be found. We can call NextMatch() to search for a match that comes after the current one in the text. NextMatch can be used in a loop.

Step 1 We call Regex.Match. Two matches occur. This call to Regex.Match returns the first Match only.

Step 2 NextMatch returns another Match object—it does not modify the current one. We assign a variable to it.

using System;
using System.Text.RegularExpressions;

string value = "4 AND 5";

// Step 1: get first match.
Match match = Regex.Match(value, @"\d");
if (match.Success)
{
    Console.WriteLine(match.Value);
}

// Step 2: get second match.
match = match.NextMatch();
if (match.Success)
{
    Console.WriteLine(match.Value);
}
4
5
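
Also Since NextMatch can be used in a loop, here is one possible shape for that loop. The input string with three digits and the "found" list are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical input with three single-digit matches.
string value = "4 AND 5 AND 6";
var found = new List<string>();

// Keep calling NextMatch until Success is false.
for (Match m = Regex.Match(value, @"\d"); m.Success; m = m.NextMatch())
{
    found.Add(m.Value);
}

Console.WriteLine(string.Join(",", found));
```

This prints 4,5,6. The loop ends because NextMatch returns a Match whose Success is false once no further match exists.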

Replace. Sometimes we need to replace a pattern of text with some other text. Regex.Replace helps. We can replace patterns with a string, or with a value determined by a MatchEvaluator.

Here We replace every sequence of one or more digits with a string. The two digit sequences, "123" and "456", are each replaced with "bird."

using System;
using System.Text.RegularExpressions;

// Replace each run of one or more digits with a string.
Regex regex = new Regex(@"\d+");
string result = regex.Replace("cat 123 456", "bird");
Console.WriteLine("RESULT: {0}", result);
RESULT: cat bird bird
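
Also The MatchEvaluator form of Replace is not shown above, so here is a minimal sketch of it. The delegate receives each Match and returns the replacement text. Doubling each number is an arbitrary choice for illustration, and the name "doubler" is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// The evaluator is called once per match and returns its replacement.
// Doubling the number is an arbitrary choice for this sketch.
MatchEvaluator doubler = m => (int.Parse(m.Value) * 2).ToString();

string result = Regex.Replace("cat 123 456", @"\d+", doubler);
Console.WriteLine("RESULT: {0}", result);
```

This prints RESULT: cat 246 912. Unlike a fixed replacement string, the evaluator can compute a different value for each match.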

Greedy matching. Some regular expressions want to match as many characters as they can—this is the default behavior. But with the "?" metacharacter, we can change this.

Version 1 Use the lazy "?" character to match as few characters before the slash as possible.

Version 2 Use the default greedy regular expression behavior—the result Value is as long as possible.

using System;
using System.Text.RegularExpressions;

string test = "/bird/cat/";
// Version 1: use lazy (or non-greedy) metacharacter.
var result1 = Regex.Match(test, "^/.*?/");
if (result1.Success)
{
    Console.WriteLine("NON-GREEDY: {0}", result1.Value);
}
// Version 2: default Regex.
var result2 = Regex.Match(test, "^/.*/");
if (result2.Success)
{
    Console.WriteLine("GREEDY:     {0}", result2.Value);
}
NON-GREEDY: /bird/
GREEDY:     /bird/cat/

Static. Often a Regex instance object is faster than the static Regex.Match. For performance, we should usually use an instance object. It can be shared throughout an entire project.

Info If we only need to call Match once in a program's execution, a cached Regex object does not help.

Here A static class stores an instance Regex that can be used project-wide. We initialize it inline.

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The input string again.
        string input = "/content/alternate-1.aspx";

        // This calls the static method specified.
        Console.WriteLine(RegexUtil.MatchKey(input));
    }
}

static class RegexUtil
{
    static Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$");
    /// <summary>
    /// This returns the key that is matched within the input.
    /// </summary>
    static public string MatchKey(string input)
    {
        Match match = _regex.Match(input.ToLower());
        if (match.Success)
        {
            return match.Groups[1].Value;
        }
        else
        {
            return null;
        }
    }
}
alternate-1

Match, parse numbers. A common requirement is extracting a number from a string. We can do this with Regex.Match. To get further numbers, consider Matches() or NextMatch.

Here We extract a group of digit characters and access the Value string representation of that number.

Tip To parse the number, use int.Parse or int.TryParse on the Value here. This will convert it to an int.

using System;
using System.Text.RegularExpressions;

string input = "Dot Net 100 Perls";
Match match = Regex.Match(input, @"\d+");
if (match.Success && int.TryParse(match.Value, out int number))
{
    // Show that we have the number, and can do arithmetic on it.
    Console.WriteLine("NUMBERS: {0}, {1}", number, number + 1);
}
NUMBERS: 100, 101
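
Also For the case of several numbers in one string, a sketch with Matches() might look like the following. The input string and the "numbers" list are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical input containing three numbers.
string input = "10 cats, 20 dogs, 30 birds";
var numbers = new List<int>();

// Matches returns every non-overlapping match in the input.
foreach (Match m in Regex.Matches(input, @"\d+"))
{
    // TryParse guards against digit runs too large for an int.
    if (int.TryParse(m.Value, out int number))
    {
        numbers.Add(number);
    }
}

Console.WriteLine(string.Join(" ", numbers));
```

This prints 10 20 30. Matches() suits the case where all results are wanted at once, while NextMatch suits stepping through them one at a time.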

Value, length, index. A Match object, returned by Regex.Match, has a Value, Length and Index. These describe the matched text (a substring of the input).

Info Value is the matched text, represented as a separate string. This is a substring of the original input.

Next Length is the length of the Value string. Here, the Length of "AXXXXY" is 6.

Finally Index is the index where the matched text begins within the input string. The character "A" starts at index 4 here.

using System;
using System.Text.RegularExpressions;

Match m = Regex.Match("123 AXXXXY", @"A.*Y");
if (m.Success)
{
    Console.WriteLine($"Value  = {m.Value}");
    Console.WriteLine($"Length = {m.Length}");
    Console.WriteLine($"Index  = {m.Index}");
}
Value  = AXXXXY
Length = 6
Index  = 4
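
Also The relationship between the three properties can be checked directly. This sketch assumes the same input and pattern as above; the variable "slice" is introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "123 AXXXXY";
Match m = Regex.Match(input, @"A.*Y");

// Cutting the input at Index for Length characters reproduces Value.
string slice = input.Substring(m.Index, m.Length);
Console.WriteLine(slice == m.Value);
```

This prints True, because Index and Length together locate the Value within the original input.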

IsMatch. This method tests for a matching pattern. It does not return the captured groups from this pattern. It just reports whether the pattern matches anywhere in the input string.

Note IsMatch returns a bool value. Each overload receives the input text that is searched for a match.

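Note 2 When we use the static Regex.IsMatch method, a Regex is built from the pattern internally, in the same way as any instance Regex.

And Recently used patterns are kept in an internal cache (sized by Regex.CacheSize), so the static method does not always rebuild the Regex. Entries that fall out of the cache are cleaned up by the garbage collector.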

using System;
using System.Text.RegularExpressions;

class Program
{
    /// <summary>
    /// Test string using Regex.IsMatch static method.
    /// </summary>
    static bool IsValid(string value)
    {
        return Regex.IsMatch(value, @"^[a-zA-Z0-9]*$");
    }

    static void Main()
    {
        // Test the strings with the IsValid method.
        Console.WriteLine(IsValid("dotnetperls0123"));
        Console.WriteLine(IsValid("DotNetPerls"));
        Console.WriteLine(IsValid(":-)"));
        // Console.WriteLine(IsValid(null)); // Throws an exception
    }
}
True
True
False

RegexOptions. With the Regex type, the RegexOptions enum is used to modify method behavior. Often I find the IgnoreCase value helpful.

Tip Lowercase and uppercase letters are distinct in the Regex text language. IgnoreCase changes this.

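Tip 2 We can change how the Regex type acts upon newlines with the RegexOptions enum. With RegexOptions.Multiline, "^" and "$" match at the start and end of each line, not just the start and end of the whole string.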

using System;
using System.Text.RegularExpressions;

const string value = "TEST";
// ... This ignores the case of the "T" character.
if (Regex.IsMatch(value, "t...", RegexOptions.IgnoreCase))
{
    Console.WriteLine(true);
}
True
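
Also The newline behavior is easier to see side by side. This sketch assumes a two-line input string and compares the same pattern with and without RegexOptions.Multiline; the variable names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-line input.
string text = "cat\ndog";

// Without Multiline, "^" matches only at the start of the whole string.
bool plain = Regex.IsMatch(text, "^dog");

// With Multiline, "^" also matches right after each newline.
bool multiline = Regex.IsMatch(text, "^dog", RegexOptions.Multiline);

Console.WriteLine(plain);
Console.WriteLine(multiline);
```

This prints False and then True, since "dog" begins the second line but not the string as a whole.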

Benchmark, Regex. Consider the performance of Regex.Match. If we use the RegexOptions.Compiled enum value, and use a cached Regex object, we can get a performance boost.

Version 1 In this version of the code, we call the static Regex.Match method, without any object caching.

Version 2 Here we access a cached object and call Match() on this instance of the Regex.

Result By using a static field Regex, and RegexOptions.Compiled, our method completes about twice as fast: 138.78 ns versus 265.90 ns per call (tested on .NET 5 for Linux).

Warning A compiled Regex will cause a program to start up slower, and may use more memory—so only compile hot Regexes.

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class Program
{
    static int Version1()
    {
        string value = "This is a simple 5string5 for Regex.";
        return Regex.Match(value, @"5\w+5").Length;
    }

    static Regex _wordRegex = new Regex(@"5\w+5", RegexOptions.Compiled);

    static int Version2()
    {
        string value = "This is a simple 5string5 for Regex.";
        return _wordRegex.Match(value).Length;
    }

    const int _max = 1000000;
    static void Main()
    {
        // Version 1: use Regex.Match.
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            if (Version1() != 8)
            {
                return;
            }
        }
        s1.Stop();
        // Version 2: use Regex.Match, compiled Regex, instance Regex.
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            if (Version2() != 8)
            {
                return;
            }
        }
        s2.Stop();
        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
    }
}
265.90 ns    Regex.Match
138.78 ns    instanceRegex.Match, Compiled

Benchmark, Regex and loop. Regular expressions can be reimplemented with loops. For example, a loop can make sure that a string only contains a certain range of characters.

Info The string must only contain the characters "a" through "z" lowercase and uppercase, and the ten digits "0" through "9."

Version 1 This method uses Regex.IsMatch to tell whether the string only has the range of characters specified.

Version 2 This uses a for-loop to iterate through the character indexes in the string. It employs a switch on the char.

Result In .NET 5 for Linux (tested in 2021) the regular expression is about 26 times slower than the loop (265.71 ns versus 10.15 ns). But Regex performance has been improved, so the gap may differ on other .NET versions.

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class Program
{
    static bool IsValid1(string path)
    {
        return Regex.IsMatch(path, @"^[a-zA-Z0-9]*$");
    }

    static bool IsValid2(string path)
    {
        for (int i = 0; i < path.Length; i++)
        {
            switch (path[i])
            {
                case 'a':
                case 'b':
                case 'c':
                case 'd':
                case 'e':
                case 'f':
                case 'g':
                case 'h':
                case 'i':
                case 'j':
                case 'k':
                case 'l':
                case 'm':
                case 'n':
                case 'o':
                case 'p':
                case 'q':
                case 'r':
                case 's':
                case 't':
                case 'u':
                case 'v':
                case 'w':
                case 'x':
                case 'y':
                case 'z':
                case 'A':
                case 'B':
                case 'C':
                case 'D':
                case 'E':
                case 'F':
                case 'G':
                case 'H':
                case 'I':
                case 'J':
                case 'K':
                case 'L':
                case 'M':
                case 'N':
                case 'O':
                case 'P':
                case 'Q':
                case 'R':
                case 'S':
                case 'T':
                case 'U':
                case 'V':
                case 'W':
                case 'X':
                case 'Y':
                case 'Z':
                case '0':
                case '1':
                case '2':
                case '3':
                case '4':
                case '5':
                case '6':
                case '7':
                case '8':
                case '9':
                    {
                        continue;
                    }
                default:
                    {
                        return false;
                    }
            }
        }
        return true;
    }

    const int _max = 1000000;
    static void Main()
    {
        // Version 1: use Regex.
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            if (IsValid1("hello") == false || IsValid1("$bye") == true)
            {
                return;
            }
        }
        s1.Stop();
        // Version 2: use for-loop.
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            if (IsValid2("hello") == false || IsValid2("$bye") == true)
            {
                return;
            }
        }
        s2.Stop();
        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) / _max).ToString("0.00 ns"));
    }
}
265.71 ns    Regex.IsMatch
 10.15 ns    for, switch

Regular expressions are a concise way to process text data. We use Regex.Match, and IsMatch, to check a pattern (evaluating its metacharacters) against an input string.