C# File



Files. Consider this. In memory, objects cease to exist when a program ends. But files exist until deletion. They are handled with types in System.IO.



Be careful. File operations can fail with exceptions and can cause performance problems, so we must handle them with care. Syntax forms like the using-statement are useful. They ensure resources are cleaned up, even when an exception is thrown.



StreamReader. For text files, StreamReader and StreamWriter are often the most useful types. We use StreamReader in a using block, a special syntax form.

Based on: .NET 4.5.1

Program that uses StreamReader, ReadLine: C#

using System.IO;

class Program
{
    static void Main()
    {
	// Read every line in the file.
	using (StreamReader reader = new StreamReader("file.txt"))
	{
	    string line;
	    while ((line = reader.ReadLine()) != null)
	    {
		// Do something with the line.
		string[] parts = line.Split(',');
	    }
	}
    }
}


Path. Before any file can be opened, it must be addressed. File paths are complex. They include the volume, directory, name and extension.
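
As a rough sketch of those parts, the program below pulls a path apart with static methods on the Path type. The folder "data", the file name "report.txt" and the PathParts helper are made-up names for illustration only.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class PathParts
{
    // Returns the directory, the name without extension, and the extension.
    public static string[] Split(string path)
    {
        return new string[]
        {
            Path.GetDirectoryName(path),
            Path.GetFileNameWithoutExtension(path),
            Path.GetExtension(path)
        };
    }
}

class Program
{
    static void Main()
    {
        // Path.Combine inserts the platform's directory separator.
        string path = Path.Combine("data", "report.txt");

        foreach (string part in PathParts.Split(path))
        {
            Console.WriteLine(part);
        }

        // GetPathRoot returns the volume for a rooted path.
        // ... For a relative path like this one it returns an empty string.
        Console.WriteLine(Path.GetPathRoot(path).Length);
    }
}
```

These methods only inspect the string. They do not check that the file exists.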


Directory. We can manipulate directories on the file system with System.IO. The Directory type and its static methods are used for this.
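
A minimal sketch of those static methods follows. It works in a scratch folder under the temp directory so nothing important is touched. The DirectoryDemo helper and the file names "a.txt" and "b.txt" are assumptions made for the example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class DirectoryDemo
{
    // Creates a scratch directory holding two empty files and returns its path.
    public static string CreateScratch()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        File.WriteAllText(Path.Combine(dir, "a.txt"), "");
        File.WriteAllText(Path.Combine(dir, "b.txt"), "");
        return dir;
    }
}

class Program
{
    static void Main()
    {
        string dir = DirectoryDemo.CreateScratch();

        // List the text files directly inside the directory.
        foreach (string file in Directory.GetFiles(dir, "*.txt"))
        {
            Console.WriteLine(Path.GetFileName(file));
        }

        // Remove the directory and everything in it.
        Directory.Delete(dir, true);
        Console.WriteLine(Directory.Exists(dir));
    }
}
```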


FileInfo. We can get information about a file from the file system with FileInfo. This does not load the entire file into memory. It reads only metadata, such as the length and timestamps, that the file system already stores.
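
Here is a small sketch of that idea. It writes a temporary file and then asks FileInfo for its size, without reading the contents back. The FileStats helper is a hypothetical name used only here.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class FileStats
{
    // Returns the size in bytes as recorded by the file system.
    public static long SizeInBytes(string path)
    {
        FileInfo info = new FileInfo(path);
        return info.Length;
    }
}

class Program
{
    static void Main()
    {
        // Create a small temporary file to inspect.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "abc");

        FileInfo info = new FileInfo(path);
        Console.WriteLine(info.Name);
        Console.WriteLine(FileStats.SizeInBytes(path));
        Console.WriteLine(info.LastWriteTimeUtc);

        File.Delete(path);
    }
}
```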


HTML, XML. Some files have lots of brackets and tags. These are usually HTML or XML files. We could write custom methods for each program, but standardized approaches exist.


ReadAllText. This program uses this method to load in the file "file.txt" on the C: volume. Then it prints the contents of the file. The data is now stored in a string object.

Program that uses ReadAllText: C#

using System;
using System.IO;

class Program
{
    static void Main()
    {
	string file = File.ReadAllText("C:\\file.txt");
	Console.WriteLine(file);
    }
}


ReadAllLines. Here we read all the lines from a file and place them in an array. The code reads lines from "file.txt" and uses a foreach-loop on them. This is simple code, but the entire file is held in memory at once.

Program that uses ReadAllLines: C#

using System.IO;

class Program
{
    static void Main()
    {
	// Read in every line in specified file.
	// ... This will store all lines in an array in memory.
	string[] lines = File.ReadAllLines("file.txt");
	foreach (string line in lines)
	{
	    // Do something with line.
	    if (line.Length > 80)
	    {
		// Important code.
	    }
	}
    }
}


Count lines. We count the number of lines in a file with a few lines of code. The example here is a bit slow because it reads the whole file into an array. But it works. It references the Length property.

Program that counts lines: C#

using System.IO;

class Program
{
    static void Main()
    {
	// Another method of counting lines in a file.
	// ... This is not the most efficient way.
	// ... It counts empty lines.
	int lineCount = File.ReadAllLines("file.txt").Length;
    }
}


Query. Does a line containing a specific string exist in the file? Maybe we want to see if a name or location exists in a line in the file. We use LINQ to find any matching line.

Program that uses LINQ on file: C#

using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
	// See if line exists in a file.
	// ... Uses a query expression to select matching lines.
	// ... If at least one matches, exists is set to true.
	// ... Any stops at the first match instead of counting them all.
	bool exists = (from line in File.ReadAllLines("file.txt")
		       where line == "Some line match"
		       select line).Any();
    }
}


ReadLines. This method does not immediately read in every line. It instead reads lines only as they are needed. We use it in a foreach-loop.
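
A short sketch of File.ReadLines in a foreach-loop is shown next. It counts long lines while holding only one line in memory at a time. The LineScanner helper and the sample words are assumptions for the example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class LineScanner
{
    // Counts lines longer than maxLength without loading the whole file.
    public static int CountLongLines(string path, int maxLength)
    {
        int count = 0;
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length > maxLength)
            {
                count++;
            }
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        // Write a small sample file, scan it, then remove it.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new string[] { "cat", "elephant", "dog" });
        Console.WriteLine(LineScanner.CountLongLines(path, 3));
        File.Delete(path);
    }
}
```

If the loop exits early, the remaining lines are never read from disk.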


WriteAllLines. We can write an array to a file. When we are done with in-memory processing, we often need to write the data to disk.

Program that writes array to file: C#

using System.IO;

class Program
{
    static void Main()
    {
	// Write a string array to a file.
	string[] stringArray = new string[]
	{
	    "cat",
	    "dog",
	    "arrow"
	};
	File.WriteAllLines("file.txt", stringArray);
    }
}

Results: file.txt

cat
dog
arrow

WriteAllText. A simple method, File.WriteAllText receives two arguments. It receives the path of the output file, and the exact string contents of the text file.

Program that uses WriteAllText: C#

using System.IO;

class Program
{
    static void Main()
    {
	File.WriteAllText("C:\\perls.txt",
	    "Dot Net Perls");
    }
}


AppendAllText. We could read in a file, append to that in memory, and then write it out completely again. That is slow. It's more efficient to use an append.
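
The sketch below shows one way an append might look with File.AppendAllText. The Log helper and the messages are hypothetical. The file is placed in the temp directory and deleted at the end.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class Log
{
    // Adds one line to the end of the file, creating the file if needed.
    public static void AppendLine(string path, string message)
    {
        File.AppendAllText(path, message + Environment.NewLine);
    }
}

class Program
{
    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());

        // Each call adds to the end. Existing contents are left alone.
        Log.AppendLine(path, "first");
        Log.AppendLine(path, "second");

        Console.WriteLine(File.ReadAllLines(path).Length);
        File.Delete(path);
    }
}
```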


ReadAllBytes. We use File.ReadAllBytes to read an image (a PNG) into memory. With this code, we could cache an image in memory. It outperforms reading the image from disk each time it is needed.

Program that caches binary file: C#

using System.IO;

static class ImageCache
{
    static byte[] _logoBytes;
    public static byte[] Logo
    {
	get
	{
	    // Returns logo image bytes.
	    if (_logoBytes == null)
	    {
		_logoBytes = File.ReadAllBytes("Logo.png");
	    }
	    return _logoBytes;
	}
    }
}


TextReader. The TextReader and TextWriter types are the base classes that other, more useful types (such as StreamReader and StreamWriter) derive from. Usually they are not useful on their own.


Binary. BinaryReader and BinaryWriter make reading or writing a binary file much easier. These types introduce a level of abstraction over the raw data.
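
As a hedged illustration, this program writes an int and a string with BinaryWriter and reads them back with BinaryReader. The BinaryDemo helper and the values 7 and "logo" are invented for the example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class BinaryDemo
{
    // Writes an int and a string in binary form.
    public static void Write(string path, int id, string name)
    {
        using (BinaryWriter writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            writer.Write(id);
            writer.Write(name);
        }
    }

    // Reads the values back in the same order they were written.
    public static string Read(string path)
    {
        using (BinaryReader reader = new BinaryReader(File.Open(path, FileMode.Open)))
        {
            int id = reader.ReadInt32();
            string name = reader.ReadString();
            return id + ":" + name;
        }
    }
}

class Program
{
    static void Main()
    {
        string path = Path.GetTempFileName();
        BinaryDemo.Write(path, 7, "logo");
        Console.WriteLine(BinaryDemo.Read(path));
        File.Delete(path);
    }
}
```

The reader must request values in the order the writer stored them. The file itself carries no type information.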


Seek. We can seek to a specific location in a file with the Seek method. We demonstrate, and benchmark, this method. It is useful with large binary files.
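
A minimal sketch of Seek on a FileStream is shown here. It jumps to an offset from the start of the file and reads a single byte. The SeekDemo helper and the sample bytes are assumptions. No benchmark is included.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class SeekDemo
{
    // Returns the byte stored at the given offset from the start of the file.
    // ... ReadByte returns -1 if the offset is at or past the end.
    public static int ByteAt(string path, long offset)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            stream.Seek(offset, SeekOrigin.Begin);
            return stream.ReadByte();
        }
    }
}

class Program
{
    static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 10, 20, 30, 40 });

        // Skip the first two bytes and read the third.
        Console.WriteLine(SeekDemo.ByteAt(path, 2));
        File.Delete(path);
    }
}
```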


Actions. We copy, delete, rename or get time information about files. These actions are available through the File type and the FileInfo type.
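
Several of these actions can be sketched together with the File type. The FileActions helper is hypothetical, and the random file names in the temp directory are placeholders.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class FileActions
{
    // Copies a file, renames the copy with File.Move, and returns true
    // ... if only the renamed copy remains.
    public static bool CopyAndRename(string source, string copy, string renamed)
    {
        File.Copy(source, copy);
        File.Move(copy, renamed);
        return File.Exists(renamed) && !File.Exists(copy);
    }
}

class Program
{
    static void Main()
    {
        string dir = Path.GetTempPath();
        string source = Path.Combine(dir, Path.GetRandomFileName());
        string copy = Path.Combine(dir, Path.GetRandomFileName());
        string renamed = Path.Combine(dir, Path.GetRandomFileName());

        File.WriteAllText(source, "data");
        Console.WriteLine(FileActions.CopyAndRename(source, copy, renamed));
        Console.WriteLine(File.GetLastWriteTimeUtc(source));

        // Clean up both files.
        File.Delete(source);
        File.Delete(renamed);
    }
}
```

File.Copy and File.Move, as called here, throw an exception if the destination already exists. That is why fresh names are used.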


Streams take many forms. Sometimes leaving a file on the disk would impact performance or stability in a negative way. In these cases, please consider MemoryStream.
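
As a sketch of that option, the program below writes text into a MemoryStream and reads it back. No file is created on disk. The MemoryDemo helper is a made-up name for the example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class MemoryDemo
{
    // Writes text into a MemoryStream and reads it back from memory.
    public static string RoundTrip(string text)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // The writer and reader are not disposed here.
            // ... The using-statement disposes the stream they share.
            StreamWriter writer = new StreamWriter(stream);
            writer.Write(text);
            writer.Flush();

            // Rewind before reading.
            stream.Position = 0;
            StreamReader reader = new StreamReader(stream);
            return reader.ReadToEnd();
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(MemoryDemo.RoundTrip("cat,dog,arrow"));
    }
}
```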


WebClient. Not every file we want to use is local. A file may be remote. We may need to access the network to download a file from a server.


Office. It is common to need to control Microsoft Excel with C# code. We introduce a fast approach. This material may be outdated, but it still helps on many systems.


CSV files. These are text-based databases. With the System.IO namespace, we can read them into a C# program. Sadly the TextFieldParser type (found in Microsoft.VisualBasic.FileIO) is slow.


Equality. How can we tell if two files are exactly equal? Unfortunately, the file system's metadata is not sufficient. A method that compares each byte is effective.
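
One simple way to compare each byte is sketched here. It loads both files fully with File.ReadAllBytes, so it is only a reasonable fit for files that fit in memory. The FileComparer helper is hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
static class FileComparer
{
    // Returns true if both files have the same length and identical bytes.
    public static bool FilesEqual(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);

        // Different lengths mean the files cannot be equal.
        if (a.Length != b.Length)
        {
            return false;
        }
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i])
            {
                return false;
            }
        }
        return true;
    }
}

class Program
{
    static void Main()
    {
        string first = Path.GetTempFileName();
        string second = Path.GetTempFileName();
        File.WriteAllText(first, "cat");
        File.WriteAllText(second, "cat");

        Console.WriteLine(FileComparer.FilesEqual(first, second));

        File.Delete(first);
        File.Delete(second);
    }
}
```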


Performance. When we access a file in Windows, the operating system puts that file into a memory cache. We provide a benchmark of file system caches.


Research. The performance of file handling is an important part of computer programming. Often, optimizing how files are used is the most effective way to make a program faster.

One of the most significant sources of inefficiency is unnecessary input/output (I/O). McConnell, p. 598

We can build small and fast storage, or large and slow storage, but not storage that is both large and fast. Aho et al., p. 454



File handling is hard. Even with the helpful types provided in the .NET Framework, it is fraught with errors. We must account for disk errors and invalid data. Testing is essential.
