C# ASCII String Representation

This C# example program represents strings with one byte per character.
ASCII strings. A C# string stores each character as a two-byte UTF-16 code unit. If the strings contain only ASCII characters, you can store them as single bytes instead. This reduces memory usage by one byte per character. We change the string representation to be smaller.
Example. The concept behind this benchmark is simple. It allocates an array of 10,000 strings and measures the memory this requires. Then another method (Compress) changes each string into a byte array, and the memory of the resulting array is measured.

Info: With the string[] allocated, total managed memory was about 480,000 bytes. With the byte[][] (a jagged array of byte arrays) allocated instead, it was about 320,000 bytes.

And: There was no data loss in these strings because the strings were ASCII-only.
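The "no data loss" claim depends on every character being in the 7-bit ASCII range. As a minimal sketch, a check like the one below could be run before converting. The IsAscii helper and the sample file names are hypothetical and not part of the original program. Encoding.ASCII substitutes '?' for characters it cannot represent, which is how the loss shows up.

```csharp
using System;
using System.Text;

static class AsciiCheck
{
    // Hypothetical helper: true when every char fits in 7-bit ASCII (0..127).
    public static bool IsAscii(string value)
    {
        foreach (char c in value)
        {
            if (c > 127)
            {
                return false;
            }
        }
        return true;
    }

    static void Main()
    {
        string safe = "report.txt";
        string lossy = "caf\u00E9.txt"; // contains U+00E9, which is not ASCII

        Console.WriteLine(IsAscii(safe));  // True
        Console.WriteLine(IsAscii(lossy)); // False

        // Round-tripping the non-ASCII string replaces U+00E9 with '?'.
        byte[] bytes = Encoding.ASCII.GetBytes(lossy);
        Console.WriteLine(Encoding.ASCII.GetString(bytes)); // caf?.txt
    }
}
```

Strings that fail the check would need to stay as ordinary strings, or use a different encoding.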

Tip: You can convert the byte arrays back into strings by calling ASCIIEncoding.ASCII.GetString.

Warning: Converting back allocates new strings, which costs both time and memory.
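A short round-trip sketch of the tip and warning above. The ToBytes and FromBytes wrappers and the sample value are hypothetical names chosen for illustration. The code uses Encoding.ASCII, which is the same inherited static property that ASCIIEncoding.ASCII refers to.

```csharp
using System;
using System.Text;

static class AsciiRoundTrip
{
    // Hypothetical wrapper: store an ASCII-only string as one byte per character.
    public static byte[] ToBytes(string value)
    {
        return Encoding.ASCII.GetBytes(value);
    }

    // Hypothetical wrapper: rebuild a string; this allocates a new string each call.
    public static string FromBytes(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }

    static void Main()
    {
        string original = "abcd1234.xyz";
        byte[] stored = ToBytes(original);
        string restored = FromBytes(stored);

        Console.WriteLine(stored.Length);        // 12
        Console.WriteLine(restored == original); // True

        // The restored string is a separate object, which is the extra cost.
        Console.WriteLine(object.ReferenceEquals(restored, original)); // False
    }
}
```

Because each call to FromBytes creates a fresh string, repeated conversions of the same data can undo the memory savings.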

C# program that changes string representation

using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        long a = GC.GetTotalMemory(true);
        string[] array = Get();
        long b = GC.GetTotalMemory(true);
        array[0] = null;
        long c = GC.GetTotalMemory(true);
        byte[][] array2 = Compress(Get());
        long d = GC.GetTotalMemory(true);
        array2[0] = null;
        Console.WriteLine(a);
        Console.WriteLine(b);
        Console.WriteLine(c);
        Console.WriteLine(d);
    }

    static string[] Get()
    {
        string[] output = new string[10000];
        for (int i = 0; i < 10000; i++)
        {
            output[i] = Path.GetRandomFileName();
        }
        return output;
    }

    static byte[][] Compress(string[] array)
    {
        byte[][] output = new byte[array.Length][];
        for (int i = 0; i < array.Length; i++)
        {
            output[i] = ASCIIEncoding.ASCII.GetBytes(array[i]);
        }
        return output;
    }
}

Output

39128
479800
39784
320056
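The four output values are total heap sizes, so the cost of each representation is the difference between a reading and the baseline taken just before it. The sketch below works through that arithmetic using the figures printed above. The Delta helper is a hypothetical name, and the numbers come from a single run, so other runtimes and platforms will likely give different values.

```csharp
using System;

static class MemoryDelta
{
    // Hypothetical helper: bytes added between two GC.GetTotalMemory readings.
    public static long Delta(long before, long after)
    {
        return after - before;
    }

    static void Main()
    {
        // Values copied from the program output above (a, b, c, d).
        long a = 39128, b = 479800, c = 39784, d = 320056;
        const int count = 10000;

        long stringCost = Delta(a, b); // string[] and its strings
        long byteCost = Delta(c, d);   // byte[][] and its byte arrays

        Console.WriteLine(stringCost);                                   // 440672
        Console.WriteLine(byteCost);                                     // 280272
        Console.WriteLine(stringCost - byteCost);                        // 160400
        Console.WriteLine(stringCost / count);                           // 44 (bytes per entry, rounded down)
        Console.WriteLine(byteCost / count);                             // 28 (bytes per entry, rounded down)
        Console.WriteLine((stringCost - byteCost) * 100 / stringCost);   // 36 (percent saved, rounded down)
    }
}
```

In this run the byte arrays used roughly a third less memory than the strings.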
Discussion. Is this useful? Probably not. However, if you have a program that stores a huge number of ASCII strings that are rarely needed, but must be stored in memory, this could be a useful optimization.

However: There is an additional cost when you need to convert back into strings.

Summary. We looked at an optimization that can compress ASCII strings to use only one byte per character instead of two bytes. In some cases, this alternate representation could save a significant amount of memory.
© 2007-2019 Sam Allen. Every person is special and unique. Send bug reports to info@dotnetperls.com.