This directory contains some Unicode normalization routines.

The main function to care about is UtfNormal::toNFC(); this will convert
a given UTF-8 string to Normalization Form C (NFC) if it's not already in
that form. The function assumes that the input string is already valid
UTF-8; if there are corrupt sequences it may produce erroneous results.
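
Since toNFC() trusts its input, callers may want to check validity first.
The following is a minimal sketch and not part of this directory:
toNfcChecked() is a hypothetical helper name, and the normalizer is passed
in as a callable so that the snippet runs on its own. It relies on
preg_match() returning false when the subject is not valid UTF-8 under the
/u modifier.

```php
<?php
// Hypothetical helper (an assumption, not shipped here): refuse input that
// is not valid UTF-8 instead of handing it to the normalizer.
function toNfcChecked( string $text, callable $normalize ): ?string {
    // With the /u modifier PCRE validates the subject as UTF-8;
    // preg_match() returns false when the subject is malformed.
    if ( preg_match( '//u', $text ) !== 1 ) {
        return null;
    }
    return $normalize( $text );
}

// Intended use, assuming the UtfNormal class has already been loaded:
//   $nfc = toNfcChecked( $input, [ 'UtfNormal', 'toNFC' ] );
```

Returning null leaves the decision about bad input (reject, log, repair)
to the caller rather than guessing at it here.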

Performance is kind of stinky in absolute terms, though it should be speedy
on pure ASCII text. ;) On text that can be determined quickly to already be
in NFC it's not too awful, but on text that needs actual normalization it
can quickly get uncomfortably slow, particularly for Korean text (the hangul
decomposition/composition code is extra slow).
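
For reference, the Unicode Standard defines hangul decomposition
arithmetically rather than by table lookup: each precomposed syllable maps
to two or three jamo. The sketch below shows that arithmetic for a single
code point. It is an illustration only and not necessarily how the code in
this directory does it; hangulDecompose() and the constant names are
hypothetical.

```php
<?php
// Constants from the Unicode Standard's hangul syllable algorithm.
const S_BASE  = 0xAC00; // first precomposed syllable
const L_BASE  = 0x1100; // first leading consonant
const V_BASE  = 0x1161; // first vowel
const T_BASE  = 0x11A7; // one before the first trailing consonant
const V_COUNT = 21;
const T_COUNT = 28;
const N_COUNT = V_COUNT * T_COUNT; // 588
const S_COUNT = 19 * N_COUNT;      // 11172 syllables in total

// Hypothetical helper: decompose one code point into its jamo code points.
// Code points outside the syllable block are returned unchanged.
function hangulDecompose( int $cp ): array {
    $sIndex = $cp - S_BASE;
    if ( $sIndex < 0 || $sIndex >= S_COUNT ) {
        return [ $cp ];
    }
    $l = L_BASE + intdiv( $sIndex, N_COUNT );
    $v = V_BASE + intdiv( $sIndex % N_COUNT, T_COUNT );
    $t = T_BASE + $sIndex % T_COUNT;
    // A trailing index of zero means the syllable has no final consonant.
    return $t === T_BASE ? [ $l, $v ] : [ $l, $v, $t ];
}

// hangulDecompose( 0xD55C ) gives [ 0x1112, 0x1161, 0x11AB ]
```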


== Regenerating data tables ==

UtfNormalData.inc is generated from the Unicode Character Database by
the script UtfNormalGenerate.php. On a *nix system, running 'make' should
fetch the necessary files and regenerate it if the scripts have changed or
if you have removed it.


== Testing ==

'make test' will run the conformance test (UtfNormalTest.php), fetching the
data from the net if necessary. If it reports failure, something is
going wrong!