From d67daf851fcfbba829d47fbe85d7203b485788d6 Mon Sep 17 00:00:00 2001
From: =?utf8?q?Bartosz=20Dziewo=C5=84ski?=
Date: Fri, 26 Jan 2018 00:28:35 -0800
Subject: [PATCH] Remove misleading comment for $wgLegalTitleChars

This comment originates from rSVN1420 (9d51f616), dated 2 July 2003,
where it was written as "ISO 8859-* don't allow 0x80-0x9F... But that
breaks interlanguage links at the moment". It was rephrased to the
current form in rSVN2621 (840dee3a).

It is incorrect for two reasons:

* "Theoretically 0x80-0x9F of ISO 8859-1 should be disallowed..."

  We cannot disallow 0x80-0x9F here; this config variable actually
  specifies the valid ranges of *bytes* rather than characters, and
  0x80 to 0x9F can happily appear in valid UTF-8 encodings of other
  characters.

  In case we wanted to disallow the Unicode characters U+0080 to U+009F
  (encoded in UTF-8 as 0xC2 0x80 to 0xC2 0x9F), it would probably have
  to be done explicitly in MediaWikiTitleCodec::splitTitleString().
  (The task for this is T7732.)

* "...but this breaks interlanguage links"

  Back then, most wikis were using single-byte ISO encodings rather
  than UTF-8, and that is the only configuration this comment applies
  to: disallowing the bytes 0x80-0x9F in page titles on wikis using
  single-byte ISO encodings would indeed have broken interlanguage
  links from them to wikis using UTF-8.

  However, disallowing the Unicode characters U+0080 to U+009F today
  definitely would not break interlanguage links.

Change-Id: Ic5ba502ccfbb9cf3ff56cc47eb7fe463e7d45959
---
 includes/DefaultSettings.php | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/includes/DefaultSettings.php b/includes/DefaultSettings.php
index 8f4c3468b0..2b2695cdf7 100644
--- a/includes/DefaultSettings.php
+++ b/includes/DefaultSettings.php
@@ -3936,9 +3936,6 @@ $wgNamespaceAliases = [];
  * because articles can be created such that they are hard to view or edit.
  *
  * In some rare cases you may wish to remove + for compatibility with old links.
- *
- * Theoretically 0x80-0x9F of ISO 8859-1 should be disallowed, but
- * this breaks interlanguage links
  */
 $wgLegalTitleChars = " %!\"$&'()*,\\-.\\/0-9:;=?@A-Z\\\\^_`a-z~\\x80-\\xFF+";
 
-- 
2.20.1
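
Illustrative note (not part of the patch): the byte-versus-character
distinction described in the commit message can be checked with a short
Python sketch. The helper names bytes_in_c1_range and has_c1_control are
hypothetical and exist only for this illustration; they are not MediaWiki
functions, and the sketch does not show how MediaWikiTitleCodec validates
titles.

```python
from typing import List


def bytes_in_c1_range(text: str) -> List[int]:
    """Return the UTF-8 bytes of `text` that fall in 0x80-0x9F (hypothetical helper)."""
    return [b for b in text.encode("utf-8") if 0x80 <= b <= 0x9F]


def has_c1_control(text: str) -> bool:
    """Return True if `text` contains a character in U+0080..U+009F (hypothetical helper)."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# U+0144 (the letter in "Dziewoński") encodes as 0xC5 0x84, so a byte-level
# ban on 0x80-0x9F would reject it even though it is not a control character.
print([hex(b) for b in bytes_in_c1_range("\u0144")])

# U+0085 is a real C1 control character; it encodes as 0xC2 0x85.
print("\u0085".encode("utf-8").hex(), has_c1_control("\u0085"))

# The character-level check leaves ordinary letters alone.
print(has_c1_control("\u0144"))
```

A byte-range rule such as the \x80-\xFF range in $wgLegalTitleChars cannot
tell these two cases apart, which is why the commit message suggests that
any ban on U+0080 to U+009F would have to operate on decoded characters.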