Adapt StringUtils::isUtf8 to the top of Unicode at U+10FFFF

RFC 3629 defines the legal range of characters as U+0000..U+10FFFF
and forbids overlong forms (encodings of a character that use more
bytes than necessary). Let's make StringUtils::isUtf8() match the
specification.

* Changed the maximum value in the pure PHP code path and added a
check for overlong forms.
* Added another check, specific to PHP 5.3's mbstring extension,
for values above U+10FFFF.
* Fixed the mbstring test errors in PHP 5.4 using changes to
StringUtilsTest by Platonides <platonides@gmail.com>.
* Uncommented some other tests that had been disabled because they
failed without a check for overlong forms.
* Added tests for extra continuation bytes, overlong forms, and
values in the UTF-16 surrogate range.

The changes to the function were so extensive that I might as
well say I rewrote it.

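For illustration only, the RFC 3629 rules described above can be
sketched in pure PHP roughly as follows. This is not the code from
this change: sketchIsUtf8() is a hypothetical name, and the sketch
ignores the mbstring code path entirely. It only shows the upper
bound, the overlong-form check, and the surrogate check side by side.

```php
<?php
// Hypothetical sketch, not StringUtils::isUtf8() itself.
// Returns true if $s is well-formed UTF-8 as defined by RFC 3629.
function sketchIsUtf8( $s ) {
	$len = strlen( $s );
	$i = 0;
	while ( $i < $len ) {
		$b = ord( $s[$i] );
		if ( $b < 0x80 ) {
			// Single-byte (ASCII) character.
			$i++;
			continue;
		}
		// Pick the number of continuation bytes and the smallest code
		// point that legitimately needs a sequence of that length.
		if ( $b >= 0xC2 && $b <= 0xDF ) {
			$extra = 1;
			$min = 0x80;
		} elseif ( $b >= 0xE0 && $b <= 0xEF ) {
			$extra = 2;
			$min = 0x800;
		} elseif ( $b >= 0xF0 && $b <= 0xF4 ) {
			$extra = 3;
			$min = 0x10000;
		} else {
			// Stray continuation byte (0x80..0xBF) or a lead byte that
			// is never valid (0xC0, 0xC1, 0xF5..0xFF).
			return false;
		}
		if ( $i + $extra >= $len ) {
			// Sequence is truncated at the end of the string.
			return false;
		}
		// Payload bits of the lead byte: 0x1F, 0x0F or 0x07.
		$cp = $b & ( 0x3F >> $extra );
		for ( $j = 1; $j <= $extra; $j++ ) {
			$c = ord( $s[$i + $j] );
			if ( ( $c & 0xC0 ) !== 0x80 ) {
				// Not a continuation byte.
				return false;
			}
			$cp = ( $cp << 6 ) | ( $c & 0x3F );
		}
		if ( $cp < $min ) {
			// Overlong form: more bytes than the value needs.
			return false;
		}
		if ( $cp > 0x10FFFF ) {
			// Above the top of Unicode.
			return false;
		}
		if ( $cp >= 0xD800 && $cp <= 0xDFFF ) {
			// UTF-16 surrogate range, not encodable in UTF-8.
			return false;
		}
		$i += $extra + 1;
	}
	return true;
}
```

With this sketch, "\xF4\x8F\xBF\xBF" (U+10FFFF) is accepted, while
"\xF4\x90\x80\x80" (U+110000), "\xC0\x80" (overlong U+0000) and
"\xED\xA0\x80" (U+D800) are rejected.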
Bug: 43679
Change-Id: I56ae496d17ffc3747550e06a72dacab3ac55da61