This comment originates from rSVN1420 (9d51f616), dated 2 July 2003,
where it was written as "ISO 8859-* don't allow 0x80-0x9F... But that
breaks interlanguage links at the moment". It was rephrased to the
current form in rSVN2621 (840dee3a).
It is incorrect for two reasons:
* "Theoretically 0x80-0x9F of ISO 8859-1 should be disallowed..."
We cannot disallow 0x80-0x9F here; this config variable actually
specifies the valid ranges of *bytes* rather than characters, and
0x80 to 0x9F can happily appear in valid UTF-8 encodings of other
characters.
If we wanted to disallow the Unicode characters U+0080 to U+009F
(encoded in UTF-8 as 0xC2 0x80 to 0xC2 0x9F), it would probably have
to be done explicitly in MediaWikiTitleCodec::splitTitleString().
(The task for this is T7732.)
* "...but this breaks interlanguage links"
Back then, most wikis were using single-byte ISO encodings rather
than UTF-8, and that is the only configuration this comment applies
to: disallowing the bytes 0x80-0x9F in page titles on wikis using
single-byte ISO encodings would indeed have broken interlanguage
links from them to wikis using UTF-8. However, disallowing the
Unicode characters U+0080 to U+009F today definitely would not break
interlanguage links.
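
The byte/character distinction in the first point can be illustrated with
a minimal Python sketch. It is not MediaWiki code; the helper names
has_c1_range_byte and has_c1_control_char are hypothetical and exist only
for this illustration.

```python
def has_c1_range_byte(text: str) -> bool:
    """True if the UTF-8 encoding of text contains a byte in 0x80-0x9F."""
    return any(0x80 <= b <= 0x9F for b in text.encode("utf-8"))


def has_c1_control_char(text: str) -> bool:
    """True if text contains a Unicode character in U+0080-U+009F."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# U+20AC (euro sign) encodes as 0xE2 0x82 0xAC: the byte 0x82 falls in
# 0x80-0x9F although the character itself is not in U+0080-U+009F.
print("€".encode("utf-8").hex(" "), has_c1_range_byte("€"), has_c1_control_char("€"))

# U+0080 and U+009F encode as 0xC2 0x80 and 0xC2 0x9F respectively.
print("\u0080".encode("utf-8").hex(" "), "\u009f".encode("utf-8").hex(" "))
```

A filter on bytes 0x80-0x9F would therefore reject titles containing
characters such as the euro sign, whereas a filter on the code points
U+0080 to U+009F would not.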
Change-Id: Ic5ba502ccfbb9cf3ff56cc47eb7fe463e7d45959