C# Normalize, IsNormalized Methods

Use the Normalize method. Understand the purpose of Normalize and IsNormalized.
Normalize. The Normalize method changes how Unicode character sequences are represented. A string's buffer holds Unicode (UTF-16) data. Normalize affects whether a character such as an accented letter is stored as one composed character or as a base letter followed by a combining mark.
We explore how the representations of string data change. This method is not needed in many C# programs, but when it is needed, it is important.
This program introduces a string with an accent on the lowercase a. We call Normalize with no parameters, and then Normalize with the parameters NormalizationForm.FormD, FormKC, and FormKD.

Then: We print, with Console.WriteLine, the resulting strings to the screen as we go along.


Here: The first call to Normalize uses the NormalizationForm.FormC enum in its implementation. This detail can be seen in IL Disassembler.

Info: The results have 2 forms: a single precomposed "a" with the accent on top, and an ASCII "a" followed by a separate combining accent character, which some consoles display as a single quote.

Finally: In FormD and FormKD, the accent character follows the unaccented letter "a".

C# program that uses Normalize method

    using System;
    using System.Text;

    class Program
    {
        static void Main()
        {
            const string input = "á";

            // No argument: same as NormalizationForm.FormC.
            string val2 = input.Normalize();
            Console.WriteLine(val2);

            string val3 = input.Normalize(NormalizationForm.FormD);
            Console.WriteLine(val3);

            string val4 = input.Normalize(NormalizationForm.FormKC);
            Console.WriteLine(val4);

            string val5 = input.Normalize(NormalizationForm.FormKD);
            Console.WriteLine(val5);
        }
    }

Output

    á
    a '
    á
    a '
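Code points: Console output can hide the difference between the forms, so it can help to print the UTF-16 code units instead. The following is a minimal sketch, not part of the original program. The CodePointDemo class, the Describe helper, and the extra ligature input ("ﬁ", U+FB01) are assumptions added here to show what the compatibility forms FormKC and FormKD change.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CodePointDemo
{
    // Hypothetical helper: formats each UTF-16 code unit as U+XXXX.
    public static string Describe(string value)
    {
        return string.Join(" ", value.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        NormalizationForm[] forms =
        {
            NormalizationForm.FormC,
            NormalizationForm.FormD,
            NormalizationForm.FormKC,
            NormalizationForm.FormKD
        };

        // "\u00E1" is the precomposed accented a; "\uFB01" is the fi ligature.
        foreach (string input in new[] { "\u00E1", "\uFB01" })
        {
            foreach (NormalizationForm form in forms)
            {
                Console.WriteLine(form + ": " + Describe(input.Normalize(form)));
            }
        }
    }
}
```

For the accented letter, FormC and FormKC print U+00E1, while FormD and FormKD print U+0061 U+0301 (the letter "a" plus a combining acute accent). For the ligature, FormC and FormD leave U+FB01 unchanged, while FormKC and FormKD replace it with U+0066 U+0069 (the plain letters "f" and "i").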
IsNormalized. In Unicode strings, there are different normalization forms. With the IsNormalized method you can test for normalized character data.

Example: We declare a string that has an accent in it. With Normalize and IsNormalized, only non-ASCII characters are affected.

And: With no argument, IsNormalized returns true if the string is normalized to FormC. It returns false for the FormD string here, because its decomposed characters are not in FormC.


Note: You can also pass an argument to IsNormalized. In this case, that specific normalization form is checked.

C# program that uses IsNormalized

    using System;
    using System.Text;

    class Program
    {
        static void Main()
        {
            const string input = "á";
            string val2 = input.Normalize();
            string val3 = input.Normalize(NormalizationForm.FormD);

            // No argument: checks against FormC.
            Console.WriteLine(input.IsNormalized());
            Console.WriteLine(val2.IsNormalized());
            Console.WriteLine(val3.IsNormalized());

            // With an argument: checks against that specific form.
            Console.WriteLine(val3.IsNormalized(NormalizationForm.FormD));
        }
    }

Output

    True
    True
    False
    True
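Pattern: One possible use of the two methods together is to test a string first and normalize it only when the test fails. This is a sketch under assumptions: the NormalizationGuard class and its Ensure method are hypothetical names, not part of the framework or the original article.

```csharp
using System;
using System.Text;

public static class NormalizationGuard
{
    // Hypothetical helper: returns the string in the requested form,
    // calling Normalize only when IsNormalized reports it is needed.
    public static string Ensure(string value, NormalizationForm form = NormalizationForm.FormC)
    {
        return value.IsNormalized(form) ? value : value.Normalize(form);
    }

    public static void Main()
    {
        // The letter "a" followed by a combining acute accent (FormD).
        string decomposed = "a\u0301";
        string composed = Ensure(decomposed);

        Console.WriteLine(decomposed.Length);       // 2
        Console.WriteLine(composed.Length);         // 1
        Console.WriteLine(composed.IsNormalized()); // True
    }
}
```

Whether the extra check is worthwhile depends on the data. Treat it as an illustration of the IsNormalized argument rather than a recommended optimization.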
A discussion. Mainly, the Normalize method is useful for interoperability purposes. If you exchange text with another program that expects a particular Unicode normalization form, it is important to call Normalize before sending or comparing strings.
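Example: Text that looks identical can arrive from two systems in different forms, and an ordinal comparison then reports a mismatch. The sketch below assumes both sides are brought to FormC before comparing. The NormalizedComparison class and EqualsNormalized method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class NormalizedComparison
{
    // Hypothetical helper: ordinal equality after normalizing both sides to FormC.
    public static bool EqualsNormalized(string left, string right)
    {
        return string.Equals(
            left.Normalize(NormalizationForm.FormC),
            right.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal);
    }

    public static void Main()
    {
        string composed = "caf\u00E9";    // precomposed accented e
        string decomposed = "cafe\u0301"; // e followed by a combining accent

        // The code units differ, so an ordinal comparison fails.
        Console.WriteLine(string.Equals(composed, decomposed, StringComparison.Ordinal)); // False

        // After normalization the two strings match.
        Console.WriteLine(EqualsNormalized(composed, decomposed)); // True
    }
}
```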

Tip: There is no reason to call Normalize if you are just using ASCII or if you are not interoperating with a system that uses a different normalization form.
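Check: The ASCII case can be confirmed with a short test, since an ASCII-only string is already normalized in every form. This is a minimal sketch, and the AsciiCheck class and IsNormalizedInAllForms method are hypothetical names.

```csharp
using System;
using System.Text;

public static class AsciiCheck
{
    // Hypothetical helper: true if the string is already in all four forms.
    public static bool IsNormalizedInAllForms(string value)
    {
        NormalizationForm[] forms =
        {
            NormalizationForm.FormC,
            NormalizationForm.FormD,
            NormalizationForm.FormKC,
            NormalizationForm.FormKD
        };

        foreach (NormalizationForm form in forms)
        {
            if (!value.IsNormalized(form))
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        // ASCII text is unaffected by normalization.
        Console.WriteLine(IsNormalizedInAllForms("Hello, world 123")); // True

        // The precomposed accented a is in FormC but not in FormD.
        Console.WriteLine(IsNormalizedInAllForms("\u00E1")); // False
    }
}
```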

Discussion, continued. IsNormalized addresses the need to determine the normalization status of a string. Normalization is necessary when interoperating with other systems.

Typically: You can ignore IsNormalized and just leave strings in the form they already have.

A summary. Normalize() provides interoperation with other systems. It is not a commonly needed string method. But it reveals an important detail of the string implementation.
Dot Net Perls
© 2007-2020 Sam Allen.