CSV Methods
Handle a CSV text string. Split a CSV file into many separate files.
This page was last reviewed on Jun 19, 2021.
CSV files. A comma-separated values (CSV) file stores data as text. It separates each field with a comma character. We can use built-in methods like Split() to parse simple CSV files.
In more complex situations, we may want to divide a CSV file into 2 or more segments. This can make uploading easier. An example method is shown here.
First example. To begin, we see the Split() method. This approach to handling a CSV file is well-covered in the Split article. But it is worth reviewing.
using System;

class Program
{
    static void Main()
    {
        string text = "field one,field2,description,identity";

        // Split the CSV on a comma.
        string[] parts = text.Split(',');
        foreach (string value in parts)
        {
            Console.WriteLine(value);
        }
    }
}
field one
field2
description
identity
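The same idea extends from one line to several. The sketch below splits a text into rows on the line break, then splits each row into fields on the comma. The CsvRows class, its Parse method and the sample data are hypothetical names introduced for illustration. It assumes that no field contains a comma, a quote or a line break.

```csharp
using System;

static class CsvRows
{
    // Hypothetical helper: splits text into rows on '\n', then each row into fields on ','.
    // Assumes no field contains a comma, a quote or a line break.
    public static string[][] Parse(string text)
    {
        string[] lines = text.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
        string[][] rows = new string[lines.Length][];
        for (int i = 0; i < lines.Length; i++)
        {
            // TrimEnd removes a trailing '\r' left by Windows line endings.
            rows[i] = lines[i].TrimEnd('\r').Split(',');
        }
        return rows;
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample data: a header row and two data rows.
        string text = "id,name,score\n1,apple,10\n2,pear,20\n";
        foreach (string[] fields in CsvRows.Parse(text))
        {
            Console.WriteLine(string.Join(" | ", fields));
        }
    }
}
```

With the sample data, each of the three rows yields three fields. Data with quoted fields needs a real CSV parser rather than Split().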
Separation example. This method separates CSV files. It turns a file into smaller files containing parts of the original data. Sometimes you can only upload 1 MB sections.
Here we see a static class. It divides a large input CSV file, such as example.csv, into smaller files of one megabyte.
Pay attention to the method call in the Main method, which specifies files of 1024 times 1024 bytes, or one megabyte.
We use File.ReadAllLines to read the entire source CSV file into an array. The for-loop adds up the length of each line, plus the length of the newline that follows it.
When adding a line would reach the maximum size in bytes, the method starts a new output file. It generates the file names "split_00.txt", "split_01.txt" and so on.
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Split this CSV file into 1 MB chunks.
        CSVSplitTool.SplitCSV("example.csv", "split", 1024 * 1024);
    }
}

/// <summary>
/// Tool for splitting CSV files at a certain byte size on a line break.
/// </summary>
static class CSVSplitTool
{
    /// <summary>
    /// Split CSV files on line breaks before a certain size in bytes.
    /// </summary>
    public static void SplitCSV(string file, string prefix, int size)
    {
        // Read all lines from the source file.
        string[] arr = File.ReadAllLines(file);

        int total = 0;
        int num = 0;
        var writer = new StreamWriter(GetFileName(prefix, num));
        try
        {
            // Size in bytes of the newline that WriteLine appends.
            int newLineLength = writer.Encoding.GetByteCount(writer.NewLine);

            // Loop through all source lines.
            for (int i = 0; i < arr.Length; i++)
            {
                // Current line.
                string line = arr[i];

                // Length of the current line in bytes, not in characters.
                int length = writer.Encoding.GetByteCount(line);

                // See if adding this line would reach the size threshold.
                // An empty file is never closed, so no zero-byte outputs appear.
                if (total > 0 && total + length >= size)
                {
                    // Create a new file.
                    num++;
                    total = 0;
                    writer.Dispose();
                    writer = new StreamWriter(GetFileName(prefix, num));
                }

                // Write the line to the current file.
                writer.WriteLine(line);

                // Add the line and its newline to the running size.
                total += length + newLineLength;
            }
        }
        finally
        {
            // Close the last file even if a write fails.
            writer.Dispose();
        }
    }

    /// <summary>
    /// Get an output file name based on a number.
    /// </summary>
    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }
}
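A note on measuring size in bytes: the Length property of a string counts UTF-16 code units, while the size of a line on disk depends on the encoding. The two agree for plain ASCII data, such as numeric fields, but they differ once the text contains accented or non-Latin characters. The sketch below illustrates the difference. The LineSize class and its InBytes method are hypothetical names, and UTF-8 is assumed as the output encoding.

```csharp
using System;
using System.Text;

static class LineSize
{
    // Hypothetical helper: bytes one line occupies on disk, including its newline.
    public static int InBytes(string line, Encoding encoding, string newLine)
    {
        return encoding.GetByteCount(line) + encoding.GetByteCount(newLine);
    }
}

class Program
{
    static void Main()
    {
        string ascii = "12,34,56";
        string accented = "caf\u00E9,na\u00EFve"; // Two accented letters.

        // For ASCII text the character count equals the UTF-8 byte count.
        Console.WriteLine(ascii.Length + " chars, " + Encoding.UTF8.GetByteCount(ascii) + " bytes");

        // Each accented letter here takes 2 bytes in UTF-8.
        Console.WriteLine(accented.Length + " chars, " + Encoding.UTF8.GetByteCount(accented) + " bytes");

        // A Windows line break adds 2 more bytes.
        Console.WriteLine(LineSize.InBytes(ascii, Encoding.UTF8, "\r\n") + " bytes with CRLF");
    }
}
```

The first string has 8 characters and 8 bytes. The second has 10 characters but 12 bytes, so a splitter that counts characters would underestimate the size of such lines.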
Verify. Here we verify that the method works correctly. The example CSV file is 6,409,636 bytes and contains 60,000 lines, each with 10 fields.
Each field is a random number. The seven output files total 6.11 MB, which is the same as the input file.
The first six output files are 1024 KB each. This is displayed as 0.99 MB in the file manager. The final file is 116 KB.
The lines in the output files were also checked for accuracy. The first file split occurs after line 9816.
Line 9816 is the final line in the first output file, and line 9817 is the first line in the second output file.
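A check like this can also be automated. The sketch below is one possible approach, not the procedure used for the results above. It compares the lines of the source file against the lines of the parts read in name order, and it sums the sizes of the parts. The SplitVerifier class and its methods are hypothetical names, and Main builds a tiny stand-in data set in a temporary directory so that the program runs on its own.

```csharp
using System;
using System.IO;
using System.Linq;

static class SplitVerifier
{
    // Hypothetical helper: true when the part files, read in name order,
    // contain exactly the lines of the source file.
    // Assumes fewer than 100 parts, so ordinal name order matches numeric order.
    public static bool LinesMatch(string source, string directory, string pattern)
    {
        string[] parts = Directory.GetFiles(directory, pattern);
        Array.Sort(parts, StringComparer.Ordinal);
        var joined = parts.SelectMany(part => File.ReadLines(part));
        return File.ReadLines(source).SequenceEqual(joined);
    }

    // Sum of the sizes of the part files, in bytes.
    public static long TotalBytes(string directory, string pattern)
    {
        return Directory.GetFiles(directory, pattern).Sum(part => new FileInfo(part).Length);
    }
}

class Program
{
    static void Main()
    {
        // Build a tiny stand-in data set in a temporary directory.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string source = Path.Combine(dir, "example.csv");
        File.WriteAllLines(source, new[] { "1,2", "3,4", "5,6" });
        File.WriteAllLines(Path.Combine(dir, "split_00.txt"), new[] { "1,2", "3,4" });
        File.WriteAllLines(Path.Combine(dir, "split_01.txt"), new[] { "5,6" });

        // Both checks should succeed for a correct split.
        Console.WriteLine(SplitVerifier.LinesMatch(source, dir, "split_*.txt"));
        Console.WriteLine(SplitVerifier.TotalBytes(dir, "split_*.txt") == new FileInfo(source).Length);

        Directory.Delete(dir, true);
    }
}
```

The byte totals only match when the source file uses the same encoding and line endings as the output files, and ends with a line break. The line comparison does not depend on those details.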
Summary. This static method splits CSV files based on byte size. You can use it to split your CSV files at any size boundary. This is useful for importing CSV files into a database.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Jun 19, 2021 (simplify).
© 2007-2023 Sam Allen.