C# WebClient Tutorial

WebClient downloads files. The WebClient class in the System.Net namespace downloads web pages and files from C# programs targeting the .NET Framework. It makes downloading pages for testing easy, so you can automate tests for the web sites you depend on.

These C# examples use WebClient to download files on the Internet.

Example

First, to use WebClient in your C# code, either write the fully qualified name System.Net.WebClient or include the System.Net namespace with a using directive. This example imports the namespace, creates a new WebClient instance, and sets its User-Agent header to an Internet Explorer 6 string. The WebClient then downloads a page, and the server sees the request as coming from Internet Explorer 6, which lets you test that case.

Program that uses client user-agent [C#]

using System;
using System.Net;

class Program
{
    static void Main()
    {
	// Create web client simulating IE6.
	using (WebClient client = new WebClient())
	{
	    client.Headers["User-Agent"] =
		"Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) " +
		"(compatible; MSIE 6.0; Windows NT 5.1; " +
		".NET CLR 1.1.4322; .NET CLR 2.0.50727)";

	    // Download data.
	    byte[] arr = client.DownloadData("http://www.dotnetperls.com/");

	    // Write values.
	    Console.WriteLine("--- WebClient result ---");
	    Console.WriteLine(arr.Length);
	}
    }
}

Output

--- WebClient result ---
6585

User-agent header. You can add an HTTP header to the WebClient's download request by assigning an entry in the Headers collection. You can also take the WebHeaderCollection returned by Headers and call its Add, Remove and Set methods or read its Count property.
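
The collection operations mentioned above can be tried without sending any request. The following is a minimal sketch: the HeaderDemo class, the BuildHeaders helper and the X-Example header name are made up for illustration and are not part of the original examples.

```csharp
using System;
using System.Net;

class HeaderDemo
{
    // Fills the request headers of a WebClient; no request is sent.
    public static WebHeaderCollection BuildHeaders(WebClient client)
    {
        WebHeaderCollection headers = client.Headers;
        headers.Add("Accept-Language", "en-US"); // Add a new header.
        headers.Add("X-Example", "temporary");   // Hypothetical header name.
        headers.Set("Accept-Language", "en-GB"); // Replace the existing value.
        headers.Remove("X-Example");             // Delete a header again.
        return headers;
    }

    static void Main()
    {
        using (WebClient client = new WebClient())
        {
            WebHeaderCollection headers = BuildHeaders(client);

            // Count is a property, not a method.
            Console.WriteLine(headers.Count);
            Console.WriteLine(headers["Accept-Language"]);
        }
    }
}
```

Set overwrites a header that is already present, while Add on an existing name appends another value, so Set is usually the safer choice when a header must have exactly one value.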

Byte arrays. The DownloadData instance method returns a reference to a new byte array, which the program stores in the arr variable. Internally, DownloadData allocates the bytes on the managed heap. Assigning the result copies only the reference, not the downloaded data.

Using: The program wraps the WebClient in a 'using' statement, which calls Dispose when the block exits, even if an exception is thrown. This releases the client's system resources at a known point instead of leaving them for the garbage collector. It matters most in longer-running programs; a short program that exits right away gains little from it.

GZIP client

Here we look at an example that uses two HTTP request headers set on the Headers collection on WebClient, and then it reads in the ResponseHeaders collection. This style of code is ideal when you need to make sure your web server is returning the proper headers for certain clients. This code is independent of ASP.NET and would work on any web server.

Program that uses Headers [C#]

using System;
using System.Net;

class Program
{
    static void Main()
    {
	// Create web client.
	WebClient client = new WebClient();

	// Set user agent and also accept-encoding headers.
	client.Headers["User-Agent"] =
	    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)";
	client.Headers["Accept-Encoding"] = "gzip";

	// Download data.
	byte[] arr = client.DownloadData("http://www.dotnetperls.com/");

	// Get response header.
	string contentEncoding = client.ResponseHeaders["Content-Encoding"];

	// Write values.
	Console.WriteLine("--- WebClient result ---");
	Console.WriteLine(arr.Length);
	Console.WriteLine(contentEncoding);
    }
}

Output

--- WebClient result ---
2040
gzip

Multiple request headers. To set many request headers on the WebClient, you can simply assign the string keys to the string values you want the headers to be set to. Alternatively, you can use the Add method on the WebHeaderCollection.

Content-encoding. This part of the example shows how you can get a response HTTP header using the client.ResponseHeaders collection. You can access this much like a hashtable or dictionary. If there is no header set for that key, the result will be null.
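
When the server answers with Content-Encoding: gzip, the byte array from DownloadData is still compressed, because WebClient does not decompress it by default. The sketch below shows one way such bytes could be turned back into text with GZipStream. It assumes the page is UTF-8 text, and the GzipDemo class with its Compress helper is hypothetical: Compress only stands in for a compressed response so the example runs without a network.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GzipDemo
{
    // Compresses text with GZIP; stands in for a compressed response body.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // The GZIP stream must be closed before the bytes are complete.
            return output.ToArray();
        }
    }

    // Decompresses GZIP bytes and decodes them as UTF-8 text (an assumption).
    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // In the real program, arr would come from client.DownloadData.
        byte[] arr = Compress("<html>example page</html>");
        Console.WriteLine(Decompress(arr));
    }
}
```

Checking the Content-Encoding response header first, as the example above does, tells you whether decompression is needed at all.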

Download strings

Next, you can download a web page from the Internet into a string in your C# program. Create a new WebClient instance and pass the URL you want to download to the DownloadString method, which returns a string containing the HTML source. If no Accept-Encoding header was specified, the server usually returns the page uncompressed.

Program that uses DownloadString [C#]

using System;
using System.Net;

class Program
{
    static void Main()
    {
	// Create web client.
	WebClient client = new WebClient();

	// Download string.
	string value = client.DownloadString("http://www.dotnetperls.com/");

	// Write values.
	Console.WriteLine("--- WebClient result ---");
	Console.WriteLine(value.Length);
	Console.WriteLine(value);
    }
}

Result
    The program prints the page length in characters.
    The program prints the HTML source for the download.

DownloadString. Internally, the DownloadString method calls into lower-level networking routines to fetch the data. It allocates the resulting string on the managed heap and returns a reference to it. Assigning the result to the string variable copies only that reference, so the data is then accessible through the variable.

Request headers

You can set the request HTTP headers on the WebClient class. The examples in this article do this through the indexer on the Headers property, such as Headers["a"] = "b". You can also work with the Headers property as a WebHeaderCollection, which allows you to perform more complex logic on the values. This collection is very useful when you are testing important external web pages.

Response headers

You can access the response HTTP headers on the WebClient class after you invoke the DownloadData or DownloadString methods. If you are working with an ASP.NET web site, the headers added in Response.AddHeader or Response.AppendHeader will be found in the ResponseHeaders collection on WebClient after it is executed. This is extremely useful for testing validity of all responses from your site.

Asynchronous downloads

It is possible to download web pages without blocking the calling thread using WebClient. The WebClient class in System.Net provides the OpenReadAsync, DownloadDataAsync, DownloadFileAsync, and DownloadStringAsync methods. These return void immediately, so the current method keeps running while the download is still in progress; each reports its result through a matching completion event, such as DownloadStringCompleted.
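
As a rough sketch of how one of these methods might be used, the program below starts DownloadStringAsync, handles the DownloadStringCompleted event, and waits on a ManualResetEvent so the console program does not exit early. The AsyncDemo class and the DownloadAndWait helper are illustrative names, not part of WebClient.

```csharp
using System;
using System.Net;
using System.Threading;

class AsyncDemo
{
    // Starts an asynchronous download and blocks until its completion event fires.
    public static string DownloadAndWait(Uri address)
    {
        string result = null;
        Exception error = null;

        using (WebClient client = new WebClient())
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            client.DownloadStringCompleted += (sender, e) =>
            {
                // Reading e.Result throws if the download failed or was cancelled.
                if (e.Error != null) error = e.Error;
                else if (!e.Cancelled) result = e.Result;
                done.Set();
            };

            // Returns immediately; the download continues in the background.
            client.DownloadStringAsync(address);
            Console.WriteLine("Download started...");

            // Other work could run here before waiting for the result.
            done.WaitOne();
        }

        if (error != null) throw error;
        return result;
    }

    static void Main()
    {
        string value = DownloadAndWait(new Uri("http://www.dotnetperls.com/"));
        Console.WriteLine(value.Length);
    }
}
```

Waiting right after starting the download gives up most of the benefit; the wait is only there so the result can be printed before Main returns.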

Threading. Depending on the use of your program, it is sometimes better to put the WebClient code in a BackgroundWorker and access it synchronously on a separate thread. This can allow clearer code and logic for the calling code. For very simple quality analysis tools, it is best to avoid threading entirely, as it will likely cause bugs.

Dispose method

The WebClient class in the .NET Framework holds onto some system resources that are required to access the network stack in Microsoft Windows. The CLR will eventually clean these resources up on its own. However, if you call Dispose manually or use the 'using' statement, the resources are released at a predictable point, which can improve the performance of larger programs.
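
The two approaches can be compared side by side. In this sketch, TrackedResource is a hypothetical type that only records whether Dispose ran; it is used to show that a 'using' block and a hand-written try/finally block both dispose the object. The last lines of Main apply the try/finally form to a WebClient without making a request.

```csharp
using System;
using System.Net;

class DisposeDemo
{
    // Hypothetical resource type that records whether Dispose was called.
    public class TrackedResource : IDisposable
    {
        public bool Disposed;
        public void Dispose() { Disposed = true; }
    }

    // The 'using' statement calls Dispose when the block exits.
    public static TrackedResource WithUsing()
    {
        TrackedResource resource = new TrackedResource();
        using (resource)
        {
            // Work with the resource here.
        }
        return resource;
    }

    // The same cleanup written out by hand.
    public static TrackedResource WithTryFinally()
    {
        TrackedResource resource = new TrackedResource();
        try
        {
            // Work with the resource here.
        }
        finally
        {
            resource.Dispose();
        }
        return resource;
    }

    static void Main()
    {
        Console.WriteLine(WithUsing().Disposed);
        Console.WriteLine(WithTryFinally().Disposed);

        // Manual form applied to a WebClient; no request is made.
        WebClient client = new WebClient();
        try
        {
            client.Headers["User-Agent"] = "ExampleAgent/1.0";
        }
        finally
        {
            client.Dispose();
        }
    }
}
```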

Console program

This console program receives two arguments from the process invocation: the target URL you want to download, and the local file you want to append to. If the local file is not found, it will be created. If the target URL is not found, an exception will be thrown and reported.

Program that downloads web page and saves it [C#]

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
	try
	{
	    Console.WriteLine("*** Log Append Tool ***");
	    Console.WriteLine("    Specify file to download and log file name");
	    Console.WriteLine("Downloading: {0}", args[0]);
	    Console.WriteLine("Appending: {0}", args[1]);
	    // Download url.
	    using (WebClient client = new WebClient())
	    {
		string value = client.DownloadString(args[0]);
		// Append url.
		File.AppendAllText(args[1],
		    string.Format("--- {0} ---\n", DateTime.Now) +
		    value);
	    }
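	}
	catch (Exception ex)
	{
	    // Report a failed download or a missing argument.
	    Console.WriteLine(ex.Message);
	}
	finally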
	{
	    Console.WriteLine("[Done]");
	    Console.ReadLine();
	}
    }
}

Program usage

1. Compile to EXE.
2. Make shortcut to the EXE.
3. Specify the target URL and the local file to append to.
   Such as "http://test/index.html" "C:\test.txt"

Uses. This program can be used for monitoring how a specific text file on the Internet changes with time. For example, if your website exposes some statistics or debugging information at a certain URL, you can configure this program to download that data and log it. It is also possible to use this program on a timer or invoke the program through other programs, with the Process.Start method in the C# language.
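
Invoking the tool from another program might look like the sketch below. The executable name LogAppend.exe is an assumption (the article does not name the compiled EXE), and the LaunchDemo class and BuildStartInfo helper are made up for illustration. The URL and file path are the ones from the usage notes above. Note that the tool as written waits for Enter before exiting, so an unattended run would need that pause removed.

```csharp
using System;
using System.Diagnostics;

class LaunchDemo
{
    // Builds the start information; each argument is quoted separately.
    public static ProcessStartInfo BuildStartInfo(string exePath, string url, string logPath)
    {
        ProcessStartInfo info = new ProcessStartInfo();
        info.FileName = exePath;
        info.Arguments = "\"" + url + "\" \"" + logPath + "\"";
        info.UseShellExecute = false;
        return info;
    }

    static void Main()
    {
        // "LogAppend.exe" is a hypothetical name for the compiled log tool.
        ProcessStartInfo info = BuildStartInfo(
            "LogAppend.exe", "http://test/index.html", @"C:\test.txt");

        using (Process process = Process.Start(info))
        {
            // Block until the tool has finished appending to the log.
            process.WaitForExit();
        }
    }
}
```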

Tip: You can write a console program that accesses a specific URL and then stores it in a log file. The program here is configurable and can be used for many different purposes.

Time downloads

This program implements a console application in the C# language that times downloads of a web page at any URL. It downloads the page a set number of times, and then reports the total and average time required for downloading it. This gives insight into your download speed and whether performance problems exist on a certain site; it can also show how fast your site is relative to other sites.

Program that times web page downloads [C#]

using System;
using System.Diagnostics;
using System.Net;

class Program
{
    const int _max = 5;
    static void Main(string[] args)
    {
	try
	{
	    // Get url.
	    string url = args[0];

	    // Report url.
	    Console.ForegroundColor = ConsoleColor.White;
	    Console.WriteLine("... PageTimeTest: times loads of web page over network");
	    Console.ResetColor();
	    Console.WriteLine("Testing: {0}", url);

	    // Fetch page.
	    using (WebClient client = new WebClient())
	    {
		// Set gzip.
		client.Headers["Accept-Encoding"] = "gzip";

		// Download.
		// ... Do an initial run to prime the cache.
		byte[] data = client.DownloadData(url);

		// Start timing.
		Stopwatch stopwatch = Stopwatch.StartNew();

		// Iterate.
		// ... Cap the number of timed downloads at 100.
		int count = Math.Min(100, _max);
		for (int i = 0; i < count; i++)
		{
		    data = client.DownloadData(url);
		}

		// Stop timing.
		stopwatch.Stop();

		// Report times.
		// ... Divide by the actual download count, not _max.
		Console.WriteLine("Time required: {0} ms", stopwatch.Elapsed.TotalMilliseconds);
		Console.WriteLine("Time per page: {0} ms", stopwatch.Elapsed.TotalMilliseconds / count);
	    }
	}
	catch (Exception ex)
	{
	    Console.WriteLine(ex.ToString());
	}
	finally
	{
	    Console.WriteLine("[Done]");
	    Console.ReadLine();
	}
    }
}

Usage
    Create a shortcut of the EXE of the program.
    Then specify the URL on the command-line in the shortcut.

Description. The body of the Main entry point is wrapped in a try/catch/finally block. In the try block, the program reads the command-line argument and writes it to the screen. It sets the Accept-Encoding HTTP header, which lets you time the compressed version of your web page. Then, after one untimed warm-up request, it downloads the page _max times (5 here, capped at 100). Finally, it divides the total milliseconds elapsed by the number of downloads and prints the average to the screen as well.

GZIP headers. Performance-oriented web sites almost all use GZIP compression for transferring pages, and this is one of the more important performance tasks. For this reason, the program only tests GZIP pages by setting the Accept-Encoding header. You can modify the program to test both GZIP and uncompressed pages.
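
One possible modification along those lines is sketched here. The CompareDemo class and the AverageMilliseconds helper are hypothetical; the helper takes the download as a delegate so the same timing loop serves both cases. The Accept-Encoding header is set or removed before every request as a precaution, rather than relying on it persisting between requests.

```csharp
using System;
using System.Diagnostics;
using System.Net;

class CompareDemo
{
    // Runs the download 'count' times and returns the mean time in milliseconds.
    public static double AverageMilliseconds(Action download, int count)
    {
        if (count <= 0) throw new ArgumentOutOfRangeException("count");

        // Untimed warm-up run, as in the program above.
        download();

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            download();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds / count;
    }

    static void Main(string[] args)
    {
        string url = args[0];
        const int count = 5;

        using (WebClient client = new WebClient())
        {
            // Request the compressed page.
            double gzipMs = AverageMilliseconds(() =>
            {
                client.Headers["Accept-Encoding"] = "gzip";
                client.DownloadData(url);
            }, count);

            // Request the page without asking for compression.
            double plainMs = AverageMilliseconds(() =>
            {
                client.Headers.Remove("Accept-Encoding");
                client.DownloadData(url);
            }, count);

            Console.WriteLine("GZIP: {0} ms per page", gzipMs);
            Console.WriteLine("Uncompressed: {0} ms per page", plainMs);
        }
    }
}
```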

Example run. Let's demonstrate how the program works when benchmarking pages. For the example, we will use Google, because they have lots of bandwidth. This test run shows that the Google homepage was loaded in about 52 milliseconds each time on average. That is extremely fast.

Possible results

... PageTimeTest: times loads of web page over network
Testing: http://www.google.com/
Time required: 259.7351 ms
Time per page: 51.94702 ms
[Done]

Note: This code has many limitations and does not adequately simulate the web browser environment, but it can be helpful for benchmarking the remote database or the network itself, and for comparing different web sites.

Summary

We looked at the WebClient class in the System.Net namespace in the .NET Framework using the C# programming language. This class allows you to download web pages into strings and byte arrays using a very simple and reliable method. It is recommended for testing live web sites or for developing programs that must fetch some external resources.
