C# TextFieldParser

TextFieldParser reads delimited text files such as CSV files. With it, we specify one or more delimiter strings and then read the fields of each line in a loop. We can use the TextFieldParser instead of string.Split. Here we demonstrate it and measure its performance.

Tip: To access TextFieldParser, go to Add Reference and select Microsoft.VisualBasic. The type is in the Microsoft.VisualBasic.FileIO namespace.

Example

This short example uses the TextFieldParser. You must assign a string array to the Delimiters property. The delimiters tell the parser where one field ends and the next begins on a line.

Further: We show the ReadFields method, which returns null when the end of the file is reached.

And: The returned array contains each field in its own element. We display its length.

Program that uses TextFieldParser: C#

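using System;
using Microsoft.VisualBasic.FileIO;

class Program
{
    static void Main()
    {
        // Open the file; the using statement closes it when we are done.
        using (TextFieldParser parser = new TextFieldParser("C:\\csv.txt"))
        {
            // Fields are separated by commas.
            parser.Delimiters = new string[] { "," };
            while (true)
            {
                // Read one line and split it into fields.
                string[] parts = parser.ReadFields();
                if (parts == null)
                {
                    // No more lines to read.
                    break;
                }
                Console.WriteLine("{0} field(s)", parts.Length);
            }
        }
    }
}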

Input file: csv.txt

a,b,c
d,e,f
gh,ij

Output

3 field(s)
3 field(s)
2 field(s)
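
One case where the two approaches give different results is a field enclosed in quotes. The following is a minimal sketch, not part of the original example. It assumes the text is held in a string and passed through a StringReader instead of a file, and the class name QuotedFieldSketch and the helper ParseLine are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedFieldSketch
{
    // Hypothetical helper: parses a single line of comma-delimited text.
    public static string[] ParseLine(string line)
    {
        // TextFieldParser accepts a TextReader, so no file is needed here.
        using (TextFieldParser parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat a delimiter inside double quotes as part of the field.
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    static void Main()
    {
        // The second field contains a comma inside quotes.
        string line = "a,\"b,c\",d";

        string[] parsed = ParseLine(line);
        string[] split = line.Split(',');

        Console.WriteLine("TextFieldParser: {0} field(s)", parsed.Length);
        Console.WriteLine("string.Split: {0} field(s)", split.Length);
    }
}
```

With this input, TextFieldParser reports 3 fields and string.Split reports 4, because Split also breaks on the comma inside the quotes.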

Benchmark

Next I provide a benchmark for the TextFieldParser and the string.Split method. This benchmark isn't perfect. It only uses a single char delimiter (the comma), and it only tests 3-line and 300-line files. Each call also reopens the file, so file-opening overhead is included in both timings.

However, the results convince me that the TextFieldParser is a poor choice for performance-sensitive work. In the results below, the string.Split method was about 4 times faster on the small file and about 9 times faster on the large file.

Program that benchmarks TextFieldParser: C#

using System;
using System.Diagnostics;
using System.IO;
using Microsoft.VisualBasic.FileIO;

class Program
{
    // Number of times each method is called in the timed loops.
    const int _max = 10000;

    static void Main()
    {
        // Call each method once before timing so that it is already compiled.
        Method1();
        Method2();
        System.Threading.Thread.Sleep(1000);

        // Time the TextFieldParser version.
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            Method1();
        }
        s1.Stop();

        // Time the StreamReader and string.Split version.
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            Method2();
        }
        s2.Stop();

        // Print the total time for all iterations, in milliseconds.
        Console.WriteLine(s1.Elapsed.TotalMilliseconds);
        Console.WriteLine(s2.Elapsed.TotalMilliseconds);
    }

    // Reads the whole file with TextFieldParser.
    static void Method1()
    {
        using (TextFieldParser parser = new TextFieldParser("C:\\csv.txt"))
        {
            parser.Delimiters = new string[] { "," };
            while (true)
            {
                string[] parts = parser.ReadFields();
                if (parts == null)
                {
                    break;
                }
                // Console.WriteLine("{0} field(s)", parts.Length);
            }
        }
    }

    // Reads the whole file with StreamReader.ReadLine and string.Split.
    static void Method2()
    {
        char[] delimiters = new char[] { ',' };
        using (StreamReader reader = new StreamReader("C:\\csv.txt"))
        {
            while (true)
            {
                string line = reader.ReadLine();
                if (line == null)
                {
                    break;
                }
                string[] parts = line.Split(delimiters);
                // Console.WriteLine("{0} field(s)", parts.Length);
            }
        }
    }
}

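Result with 3-line, 8-field file (total for 10000 calls)

 2616 ms    TextFieldParser
  623 ms    StreamReader and string.Split

Result with 300-line, 800-field file (total for 10000 calls)

10762 ms    TextFieldParser
 1186 ms    StreamReader and string.Split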

The TextFieldParser seems to scale worse. With 100 times as many lines, its time grew by a factor of about 4, while the time for the string.Split and StreamReader method grew by a factor of about 2. You can uncomment the Console.WriteLine calls to verify that both methods work correctly.
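
Because each call in the benchmark reopens the file, part of both timings is file-system overhead rather than parsing. One way to separate the two is to parse text that is already in memory. The following is a rough sketch under that assumption, not the benchmark used for the results above. The class name InMemoryBenchmarkSketch, its helper methods, and the generated "a,b,c" test data are all hypothetical, and no timings are claimed for it.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using Microsoft.VisualBasic.FileIO;

public static class InMemoryBenchmarkSketch
{
    // Hypothetical helper: builds CSV text made of repeated "a,b,c" lines.
    public static string BuildCsv(int lines)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < lines; i++)
        {
            builder.Append("a,b,c\n");
        }
        return builder.ToString();
    }

    // Counts all fields using TextFieldParser over an in-memory string.
    public static int CountWithParser(string text)
    {
        int count = 0;
        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.SetDelimiters(",");
            // EndOfData becomes true when no more data lines remain.
            while (!parser.EndOfData)
            {
                count += parser.ReadFields().Length;
            }
        }
        return count;
    }

    // Counts all fields using StringReader.ReadLine and string.Split.
    public static int CountWithSplit(string text)
    {
        int count = 0;
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                count += line.Split(',').Length;
            }
        }
        return count;
    }

    static void Main()
    {
        const int max = 10000;
        string text = BuildCsv(300);

        // Call each method once before timing, as in the benchmark above.
        CountWithParser(text);
        CountWithSplit(text);

        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < max; i++)
        {
            CountWithParser(text);
        }
        s1.Stop();

        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < max; i++)
        {
            CountWithSplit(text);
        }
        s2.Stop();

        Console.WriteLine(s1.Elapsed.TotalMilliseconds);
        Console.WriteLine(s2.Elapsed.TotalMilliseconds);
    }
}
```

Both counting methods return the same total for this data, which gives a quick check that they are doing equivalent work before the timings are compared.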

Summary

We looked at the TextFieldParser type from the Microsoft.VisualBasic.FileIO namespace in the C# language. We saw how the ReadFields method combines reading a line and splitting it into fields.

However: We also measured the performance of TextFieldParser. In this benchmark it was several times slower than string.Split, and the gap widened on the larger file.

