Compression


Compression optimizes for space. It represents the bit patterns that occur most often with shorter sequences. As the quantity of data grows over time, compressed formats such as PNG, GZIP and 7Z become even more important. Both executable programs and framework methods can be used to compress data.

"In general, if our messages are such that some symbols appear very frequently and some very rarely, we can encode data more efficiently (i.e., using fewer bits per message) if we assign shorter codes to the frequent symbols." (Abelson & Sussman, p. 161)
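The idea in the quotation can be checked with a little arithmetic. The sketch below compares a fixed-length code with a variable-length one for a short message. The class name CodeLengthDemo is hypothetical, and the code table is the one I recall from the same section of the book, so treat it as an illustration rather than a definitive encoder.

```csharp
using System.Collections.Generic;

public static class CodeLengthDemo
{
    // Variable-length, prefix-free code: the frequent symbol A gets 1 bit,
    // the rare symbols C through H get 4 bits each.
    static readonly Dictionary<char, string> VariableCode = new Dictionary<char, string>
    {
        { 'A', "0" },    { 'B', "100" },  { 'C', "1010" }, { 'D', "1011" },
        { 'E', "1100" }, { 'F', "1101" }, { 'G', "1110" }, { 'H', "1111" }
    };

    // Bits needed when every symbol uses the same number of bits.
    public static int FixedLengthBits(string message, int bitsPerSymbol)
    {
        return message.Length * bitsPerSymbol;
    }

    // Bits needed when each symbol uses its own code from the table above.
    public static int VariableLengthBits(string message)
    {
        int total = 0;
        foreach (char symbol in message)
        {
            total += VariableCode[symbol].Length;
        }
        return total;
    }
}
```

For the message "BACADAEAFABBAAAGAH", where A appears 9 times out of 18 symbols, FixedLengthBits with 3 bits per symbol returns 54, while VariableLengthBits returns 42.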

Expanded file [input] size (typical text file):   26,747 bytes
Compressed file [output] size (72% smaller):       7,388 bytes
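The "72% smaller" figure follows directly from the two sizes. A minimal sketch of that calculation, with a hypothetical helper named CompressionRatio.PercentSaved:

```csharp
using System;

public static class CompressionRatio
{
    // Percentage by which the compressed size is smaller than the original.
    public static double PercentSaved(long originalBytes, long compressedBytes)
    {
        if (originalBytes <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(originalBytes));
        }
        return (1.0 - (double)compressedBytes / originalBytes) * 100.0;
    }
}
```

Calling CompressionRatio.PercentSaved(26747, 7388) gives roughly 72.4, which rounds to the 72% shown above.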

7-Zip


The 7-Zip compression utility is an open-source project developed mainly by Igor Pavlov. It provides excellent compression ratios, typically better than those of most compression utilities, including the one built into Windows. We focus on using the 7-Zip executables in a programmatic way.

7-Zip Command-Line Examples
Input file [folder with 1574 files]

perls      7.89 MB

Output files [compressed]

perls.zip  4.41 MB [Windows compressed folder]
perls.7z   3.20 MB [7-Zip LZMA ultra compression]
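Driving the 7-Zip executable from a program usually comes down to starting a process with the right arguments. The following is a sketch under assumptions: the class SevenZipRunner and its method names are hypothetical, and exePath must point at wherever your copy of the 7-Zip command-line executable lives. The switches used (a, -t7z, -mx=9) are standard 7-Zip options for adding files to a 7z archive at the highest compression level.

```csharp
using System.Diagnostics;

public static class SevenZipRunner
{
    // Builds the argument string for "add to archive" in 7z format
    // at the maximum compression level.
    public static string BuildArguments(string archivePath, string sourcePath)
    {
        // a = add, -t7z = 7z archive type, -mx=9 = ultra compression level.
        return "a -t7z -mx=9 \"" + archivePath + "\" \"" + sourcePath + "\"";
    }

    // Runs the executable and returns its exit code (0 means no error).
    public static int Compress(string exePath, string archivePath, string sourcePath)
    {
        var info = new ProcessStartInfo
        {
            FileName = exePath, // Assumed location of the 7-Zip executable.
            Arguments = BuildArguments(archivePath, sourcePath),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Keeping the argument builder separate from the process call makes it easy to log or test the exact command line before running it.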

DEFLATE benchmark. How can you improve the compression of files you create with DEFLATE in 7-Zip? In this benchmark, I test various command-line options and present an optimal command line.

DEFLATE

PPMd benchmark. PPM stands for Prediction by Partial Matching, and PPMd is a variant of that algorithm available in 7-Zip. It is usually very effective on text-based files. In my testing, it compressed text files containing English very well.

PPMd

C# examples


You can directly compress and decompress data in the C# programming language and .NET Framework. We reveal how you can do this. The code is somewhat more complex than would be ideal, but it is reliable and tested.

Compress Decompress GZipStream 7-Zip Executable Tutorial
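As a rough illustration of in-memory compression with the framework, here is a minimal round trip using GZipStream and MemoryStream. The GZipHelper class is a hypothetical name, and this is a sketch rather than the tested code from the linked articles.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    // Compresses a byte array into GZIP format.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // Disposing the GZipStream writes the final block and trailer.
            return output.ToArray();
        }
    }

    // Expands GZIP data back into the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Note that the compressed bytes are read only after the GZipStream has been disposed. Reading the MemoryStream earlier can return incomplete data.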

Testing GZIP files. Next, there are some articles that help you detect and rewrite GZIP files directly in the C# programming language. They can be used in algorithms that must detect unknown data types, and also for improving compression ratios.

GZIP File Test GZIP Header Flag Byte
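Detecting GZIP data usually means checking the first bytes of the header. The GZIP format (RFC 1952) starts with the two identification bytes 0x1F and 0x8B, followed by a compression method byte, which is 8 for DEFLATE. A minimal sketch, with the hypothetical name GZipDetector:

```csharp
public static class GZipDetector
{
    // Checks the first three header bytes described in RFC 1952:
    // ID1 = 0x1F, ID2 = 0x8B, CM = 8 (DEFLATE).
    public static bool LooksLikeGZip(byte[] data)
    {
        return data != null
            && data.Length >= 10 // The fixed part of a GZIP header is 10 bytes.
            && data[0] == 0x1F
            && data[1] == 0x8B
            && data[2] == 8;
    }
}
```

This only tests the header. It does not prove that the rest of the data is a valid GZIP stream.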

Compression in ASP.NET. You can build GZIP compression directly into your ASP.NET website. You do not need to have IIS compress your data for you. We introduce compression approaches in ASP.NET.

Accept-Encoding GZIP HTTP Compression Overview GZIP Output
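Before sending GZIP output, a site should check that the client advertised support for it in the Accept-Encoding request header. The sketch below shows only that check, using a hypothetical AcceptEncodingHelper class that takes the raw header value as a string. It assumes the caller reads the header from the request and, when the check passes, sets the Content-Encoding response header and compresses the output itself. It does not handle the "*" wildcard.

```csharp
using System;
using System.Globalization;

public static class AcceptEncodingHelper
{
    // Returns true when the header value lists gzip and does not
    // disable it with a quality value of zero (for example "gzip;q=0").
    public static bool AcceptsGZip(string acceptEncoding)
    {
        if (string.IsNullOrEmpty(acceptEncoding))
        {
            return false;
        }
        foreach (string part in acceptEncoding.Split(','))
        {
            string[] pieces = part.Split(';');
            string coding = pieces[0].Trim();
            if (!coding.Equals("gzip", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }
            // Look for an optional q parameter; q=0 means "not acceptable".
            for (int i = 1; i < pieces.Length; i++)
            {
                string param = pieces[i].Trim();
                double q;
                if (param.StartsWith("q=", StringComparison.OrdinalIgnoreCase)
                    && double.TryParse(param.Substring(2), NumberStyles.Float,
                                       CultureInfo.InvariantCulture, out q)
                    && q == 0.0)
                {
                    return false;
                }
            }
            return true;
        }
        return false;
    }
}
```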

System.IO.Compression. Recent versions of the .NET Framework provide the System.IO.Compression namespace as part of their built-in input/output support. Previously, .NET developers had to turn to third-party compression libraries, but now they do not need to distribute that code. The namespace contains very few types, but they provide important functionality.

Classes
    DeflateStream
    GZipStream

Enum
    CompressionMode
        Compress
        Decompress

DeflateStream versus GZipStream. The difference between DeflateStream and GZipStream is that GZipStream is implemented with DeflateStream. When you use GZipStream, you are using a simple wrapper type around an actual DeflateStream. Because of this, they provide essentially the same compression ratios; GZipStream output is only slightly larger because it adds the GZIP header and trailer.

Note: If you have a GZIP file, you need to use GZipStream because otherwise an error will occur. GZipStream provides support for GZIP headers.
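The relationship between the two types can be observed by compressing the same bytes with each and then feeding GZIP data to a DeflateStream. This is a sketch with hypothetical names (DeflateVersusGZip and its methods). It assumes that DeflateStream reports malformed input with InvalidDataException, which is the documented exception for data in an invalid format.

```csharp
using System.IO;
using System.IO.Compression;

public static class DeflateVersusGZip
{
    // Compresses with raw DEFLATE (no header or trailer).
    public static byte[] WithDeflate(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Compresses with GZIP (DEFLATE data plus GZIP header and trailer).
    public static byte[] WithGZip(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Returns true when DeflateStream cannot decode the data, for example
    // because it begins with a GZIP header instead of a DEFLATE block.
    public static bool DeflateRejects(byte[] compressed)
    {
        try
        {
            using (var input = new MemoryStream(compressed))
            using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                deflate.CopyTo(output);
            }
            return false;
        }
        catch (InvalidDataException)
        {
            return true;
        }
    }
}
```

For the same input, I would expect WithGZip to return a slightly longer array than WithDeflate, and DeflateRejects to return true for the GZIP output but false for the DEFLATE output.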


CompressionMode. When you use the DeflateStream or GZipStream constructors, you need to pass in an existing stream, which can be a MemoryStream or FileStream, as well as a CompressionMode enumerated value. The CompressionMode can be CompressionMode.Compress or CompressionMode.Decompress.

Tip: Compression can enhance performance in many situations, including web site loading and backups. The compression ratios provided by System.IO.Compression are not the best, but they are fairly competitive and effective overall.


Strings. In the C# language, strings are encoded with two bytes representing each character. ASCII strings, however, require only one byte per character. By storing ASCII text in byte arrays instead of strings, you can roughly halve the memory used by the character data.

ASCII String Representation
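The size difference can be seen by converting a string with Encoding.ASCII. The AsciiStorage class below is a hypothetical sketch. It assumes the text really is ASCII, because Encoding.ASCII replaces any other character with a question mark.

```csharp
using System.Text;

public static class AsciiStorage
{
    // Stores the text with one byte per character.
    public static byte[] ToAsciiBytes(string value)
    {
        return Encoding.ASCII.GetBytes(value);
    }

    // Restores a string from ASCII bytes.
    public static string FromAsciiBytes(byte[] bytes)
    {
        return Encoding.ASCII.GetString(bytes);
    }

    // Bytes used by the characters of a .NET string (two bytes per char).
    public static int Utf16ByteCount(string value)
    {
        return value.Length * sizeof(char);
    }
}
```

For the 11-character string "compression", ToAsciiBytes returns 11 bytes, while Utf16ByteCount reports 22.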

CSS


Every time a visitor loads your website, the CSS content will be downloaded and processed. You can reduce the amount of time this takes by minifying your CSS text. This article provides some tips.

Minify CSS
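To give a sense of what minifying involves, here is a deliberately naive sketch using regular expressions. The CssMinifier class is hypothetical and is not the method from the linked article. It assumes the style sheet contains no quoted strings with comment markers or significant whitespace, which this approach would damage.

```csharp
using System.Text.RegularExpressions;

public static class CssMinifier
{
    public static string Minify(string css)
    {
        // Remove /* ... */ comments, including multi-line ones.
        string result = Regex.Replace(css, @"/\*.*?\*/", "", RegexOptions.Singleline);
        // Collapse runs of whitespace (spaces, tabs, newlines) to one space.
        result = Regex.Replace(result, @"\s+", " ");
        // Drop the whitespace around braces and semicolons.
        result = Regex.Replace(result, @"\s*([{};])\s*", "$1");
        return result.Trim();
    }
}
```

Spaces around colons and commas are left alone, because removing them can change the meaning of some selectors.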

DeflOpt

The DeflOpt utility is an interesting additional optimization you can apply to files you compress. This program improves compression ratios on most files in GZIP format. Test it and find out if it works for you.

DeflOpt

Images


You can also optimize images, such as PNG and ICO files. These articles describe methods and provide benchmarks for image compression approaches. They are useful for optimizing website performance.

favicon.ico PNG

Summary

Compression of data does not just save space. By representing the data in a more compact way, algorithms can act upon that data while touching fewer memory regions. This results in an increase in spatial locality and overall performance.

Dot Net Perls