Compression

Compression. Data increases at a phenomenal pace. All things are digitized, recorded, stored. This requires more and more storage. In compression, we tame this data dragon.


Ratios. A data compression ratio indicates how well an algorithm works. It compares the original size to the compressed size. Often we trade time for space. A slower, more thorough algorithm usually yields a greater ratio.
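Example. A minimal sketch of the arithmetic, assuming the common definition of a ratio as original size divided by compressed size. The class and method names (RatioSketch, CompressionRatio, SpaceSavings) are hypothetical and chosen only for illustration.

```csharp
public static class RatioSketch
{
    // Ratio: how many times smaller the compressed data is (assumed definition).
    public static double CompressionRatio(long originalBytes, long compressedBytes)
    {
        return (double)originalBytes / compressedBytes;
    }

    // Space savings: the fraction of the original size that was removed.
    public static double SpaceSavings(long originalBytes, long compressedBytes)
    {
        return 1.0 - (double)compressedBytes / originalBytes;
    }
}
```

With these definitions, a 1000-byte file compressed to 250 bytes has a ratio of 4 and space savings of 0.75.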


Strategies. We use 7-Zip to compress files. Then we explore the compression tools in the .NET Framework. We even compress images and minify CSS files.


7-Zip. The 7-Zip compression utility is an open-source project developed by Igor Pavlov. It provides excellent compression ratios, often better than those of other common compression utilities.

7-Zip Command-Line

DEFLATE. We test the DEFLATE algorithm in 7-Zip. DEFLATE is the algorithm used in GZIP. We test various DEFLATE command-line options and present an optimal command line.

DEFLATE

PPMd. This is a variant of PPM, which stands for Prediction by Partial Matching. It is often effective on certain kinds of text-based files. This is a good option if you must compress Shakespeare plays.

PPMd

DeflOpt. The DeflOpt utility applies an additional optimization pass to files you have already compressed with DEFLATE. It improves compression ratios, but the improvements are small.

DeflOpt

C# programs. We can directly compress and decompress data in the C# language. The code is reliable and tested. These examples use the System.IO.Compression namespace.
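Example. A minimal sketch of a round trip through GZipStream in System.IO.Compression. The class and method names (GZipRoundTripSketch, Compress, Decompress) are hypothetical. The data is held in byte arrays, which is an assumption that suits small inputs only.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipRoundTripSketch
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream first so all compressed bytes are flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // Copy the decompressed bytes into the output buffer.
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

For larger data, the same pattern works with a FileStream in place of the MemoryStream.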

Compress, Decompress, GZipStream, 7-Zip Executable

ASP.NET sites. You can build GZIP compression directly into your ASP.NET website. We introduce compression approaches in ASP.NET.

Accept-Encoding, HTTP Compression, GZIP Output

Test GZIP files. GZIP files have specific header bytes. We detect and rewrite GZIP files directly in the C# language. These methods help in programs that handle compressed files.
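Example. A minimal sketch of a header test, assuming the layout in the GZIP specification (RFC 1952): the first two bytes are 0x1F and 0x8B, the third byte is 8 for DEFLATE, and the fourth byte holds the flags. The names GZipHeaderSketch, LooksLikeGZip and ReadFlagByte are hypothetical.

```csharp
using System;

public static class GZipHeaderSketch
{
    // True when the buffer starts with the GZIP magic bytes and the DEFLATE method id.
    public static bool LooksLikeGZip(byte[] data)
    {
        return data != null
            && data.Length >= 10      // the fixed part of the header is 10 bytes
            && data[0] == 0x1F
            && data[1] == 0x8B
            && data[2] == 8;
    }

    // Returns the flag byte (FLG), which follows the compression method byte.
    public static byte ReadFlagByte(byte[] data)
    {
        if (!LooksLikeGZip(data))
        {
            throw new ArgumentException("Not a GZIP header.", nameof(data));
        }
        return data[3];
    }
}
```

This only inspects the header. It does not verify that the rest of the file decompresses correctly.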

GZIP File Test, GZIP Header Flag Byte

Classes. The .NET Framework provides the System.IO.Compression namespace. In early versions, .NET developers had to turn to third-party compression libraries. For GZIP and DEFLATE streams, this is no longer required.


Difference. GZipStream is a simple wrapper type around an actual DeflateStream. Both use the DEFLATE algorithm. GZipStream adds the GZIP header and trailer around the compressed data.


CompressionMode. When we use DeflateStream or GZipStream, we pass in an existing stream. This can be a MemoryStream or FileStream. We also supply a CompressionMode (Compress or Decompress) or, when compressing, a CompressionLevel.
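Example. A minimal sketch that passes a CompressionLevel to the GZipStream constructor and reports the output size. The names CompressionLevelSketch and CompressedSize are hypothetical. Actual sizes depend on the input data and the runtime version.

```csharp
using System.IO;
using System.IO.Compression;

public static class CompressionLevelSketch
{
    // Compresses the data at the given level and returns the compressed size in bytes.
    public static long CompressedSize(byte[] data, CompressionLevel level)
    {
        using (var output = new MemoryStream())
        {
            // Supplying a CompressionLevel implies compression, not decompression.
            using (var gzip = new GZipStream(output, level))
            {
                gzip.Write(data, 0, data.Length);
            }
            // Measure the buffer after the GZipStream has flushed everything.
            return output.ToArray().Length;
        }
    }
}
```

Calling CompressedSize with CompressionLevel.Fastest and then CompressionLevel.Optimal on the same data gives a quick view of the trade between time and size. CompressionLevel.NoCompression stores the data with no size reduction, so its output is slightly larger than the input.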

CompressionLevel

ZipFile. The .NET Framework 4.5 added a class that makes compression easier. The ZipFile class has the methods CreateFromDirectory and ExtractToDirectory. These compress a directory into a ZIP archive and extract it again.
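Example. A minimal sketch of ZipFile.CreateFromDirectory and ZipFile.ExtractToDirectory, using a temporary directory. The class name ZipFileSketch, the RoundTrip method and the file names are hypothetical. On the .NET Framework, this assumes the project references the System.IO.Compression.FileSystem assembly.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipFileSketch
{
    // Writes text to a file, zips its directory, extracts the archive, and reads the text back.
    public static string RoundTrip(string text)
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string source = Path.Combine(root, "source");
        string extracted = Path.Combine(root, "extracted");
        string archive = Path.Combine(root, "archive.zip");
        Directory.CreateDirectory(source);
        try
        {
            File.WriteAllText(Path.Combine(source, "note.txt"), text);
            // The archive is written outside the source directory.
            ZipFile.CreateFromDirectory(source, archive);
            ZipFile.ExtractToDirectory(archive, extracted);
            return File.ReadAllText(Path.Combine(extracted, "note.txt"));
        }
        finally
        {
            // Remove the temporary files created by this sketch.
            Directory.Delete(root, true);
        }
    }
}
```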

ZipFile

Values. Strings in the C# language are stored as UTF-16, which uses two bytes for each char. But ASCII text requires only one byte per character. By storing ASCII text in byte arrays, we can reduce the memory usage of the data.
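Example. A minimal sketch that converts a string to a byte array with Encoding.ASCII and back. The names AsciiBytesSketch, ToAsciiBytes and FromAsciiBytes are hypothetical. It assumes the text contains only ASCII characters, because Encoding.ASCII replaces other characters with a question mark.

```csharp
using System.Text;

public static class AsciiBytesSketch
{
    // One byte per character, valid only for ASCII text.
    public static byte[] ToAsciiBytes(string value)
    {
        return Encoding.ASCII.GetBytes(value);
    }

    // Rebuilds the string when it is needed again.
    public static string FromAsciiBytes(byte[] bytes)
    {
        return Encoding.ASCII.GetString(bytes);
    }
}
```

For the 11-character string "compression", the byte array holds 11 bytes, while the UTF-16 character data takes 22 bytes.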

ASCII Strings

Styles. When a visitor loads your website, the CSS content must be downloaded and processed. You can reduce the amount of time this takes by minifying your CSS text.
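Example. A naive sketch of CSS minification with Regex. The names CssMinifierSketch and Minify are hypothetical, and this is not a full CSS parser. It assumes simple style sheets and does not protect string literals or url() values that contain braces, colons or comment markers.

```csharp
using System.Text.RegularExpressions;

public static class CssMinifierSketch
{
    public static string Minify(string css)
    {
        // Remove /* ... */ comments, including multi-line ones.
        css = Regex.Replace(css, @"/\*.*?\*/", "", RegexOptions.Singleline);
        // Collapse runs of whitespace (including newlines) into one space.
        css = Regex.Replace(css, @"\s+", " ");
        // Drop spaces around structural punctuation.
        css = Regex.Replace(css, @"\s*([{}:;,])\s*", "$1");
        // A semicolon directly before a closing brace is optional.
        css = css.Replace(";}", "}");
        return css.Trim();
    }
}
```

Spaces inside values, such as the one in "margin: 0 auto", are kept because they are not next to structural punctuation.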

Minify CSS

Images. You can also optimize images such as PNG images and ICO images. These articles describe methods and provide benchmarks for image compression approaches.

favicon.ico, PNG

Research. We use an algorithm called Huffman coding for many kinds of lossless data compression. It represents the most frequent symbols with the shortest codes.


In Structure and Interpretation of Computer Programs (SICP), we learn about Huffman encoding. "We can encode data more efficiently... if we assign shorter codes to the frequent symbols" (page 161).
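Example. A minimal sketch of Huffman code construction, assuming the input is a string of symbols. The names HuffmanSketch, BuildCodes and Node are hypothetical. The list is sorted again on every merge to keep the code short. A priority queue would be faster.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HuffmanSketch
{
    // A leaf holds a symbol. An internal node holds two children.
    private sealed class Node
    {
        public char Symbol;
        public int Count;
        public Node Left;
        public Node Right;
    }

    // Returns a bit string (such as "01") for each distinct symbol in the text.
    public static Dictionary<char, string> BuildCodes(string text)
    {
        var codes = new Dictionary<char, string>();
        if (string.IsNullOrEmpty(text)) return codes;

        // One leaf per distinct symbol, weighted by its frequency.
        List<Node> nodes = text.GroupBy(c => c)
            .Select(g => new Node { Symbol = g.Key, Count = g.Count() })
            .ToList();

        // Repeatedly merge the two least frequent nodes into one.
        while (nodes.Count > 1)
        {
            nodes = nodes.OrderBy(n => n.Count).ToList();
            var merged = new Node
            {
                Left = nodes[0],
                Right = nodes[1],
                Count = nodes[0].Count + nodes[1].Count
            };
            nodes.RemoveRange(0, 2);
            nodes.Add(merged);
        }

        Assign(nodes[0], "", codes);
        return codes;
    }

    private static void Assign(Node node, string prefix, Dictionary<char, string> codes)
    {
        if (node.Left == null)
        {
            // A single-symbol input still needs a one-bit code.
            codes[node.Symbol] = prefix.Length > 0 ? prefix : "0";
            return;
        }
        Assign(node.Left, prefix + "0", codes);
        Assign(node.Right, prefix + "1", codes);
    }
}
```

For the input "aaaaaaaabbbc", the frequent symbol 'a' receives a one-bit code, while 'b' and 'c' each receive two-bit codes.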

SICP

Compression of data saves not just space. By representing the data in a more compact way, algorithms acting upon that data touch fewer memory regions. This can make them faster.
