Remove most named character references from output
author Aryeh Gregor <simetrical@users.mediawiki.org>
Sun, 30 May 2010 17:33:59 +0000 (17:33 +0000)
committer Aryeh Gregor <simetrical@users.mediawiki.org>
Sun, 30 May 2010 17:33:59 +0000 (17:33 +0000)
commit 74a21f3bd1692dac958ddf3e09226a72b7bc65b7
tree ec8e3a5cdcf45e7b10eac5fd5d65a713d5f93207
parent 86572930d98936727e58828e93443eda08c147e0

Recommit of r66254 to trunk.  This was just

find extensions phase3 -iname '*.php' \! -iname '*.i18n.php' \! -iname 'Messages*.php' \! -iname '*_Messages.php' -exec sed -i 's/&nbsp;/\&#160;/g;s/&mdash;/―/g;s/&bull;/•/g;s/&aacute;/á/g;s/&acute;/´/g;s/&agrave;/à/g;s/&alpha;/α/g;s/&auml;/ä/g;s/&ccedil;/ç/g;s/&copy;/©/g;s/&darr;/↓/g;s/&deg;/°/g;s/&eacute;/é/g;s/&ecirc;/ê/g;s/&euml;/ë/g;s/&egrave;/è/g;s/&euro;/€/g;s/&harr;/↔/g;s/&hellip;/…/g;s/&iacute;/í/g;s/&igrave;/ì/g;s/&larr;/←/g;s/&ldquo;/“/g;s/&middot;/·/g;s/&minus;/−/g;s/&ndash;/–/g;s/&oacute;/ó/g;s/&ocirc;/ô/g;s/&oelig;/œ/g;s/&ograve;/ò/g;s/&otilde;/õ/g;s/&ouml;/ö/g;s/&pound;/£/g;s/&prime;/′/g;s/&Prime;/″/g;s/&raquo;/»/g;s/&rarr;/→/g;s/&rdquo;/”/g;s/&Sigma;/Σ/g;s/&times;/×/g;s/&uacute;/ú/g;s/&uarr;/↑/g;s/&uuml;/ü/g;s/&yen;/¥/g' {} +

followed by reading over every single line of the resulting diff and
fixing a whole bunch of false positives.  The reason for this change is
given in <http://lists.wikimedia.org/pipermail/wikitech-l/2010-April/047617.html>.
I cleared it with Tim and Brion on IRC before committing.  It might
cause a few problems, but I tried to be careful; please report any
issues.
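
To see what the substitutions do before running them over a tree, a
minimal sketch like the following applies three of them to one sample
line.  The helper name replace_entities and the sample string are
hypothetical, chosen only for illustration; the sed expressions are
copied from the command above.

```shell
# Hypothetical helper: apply a small subset of the substitutions to stdin.
# In a sed replacement a bare & stands for the whole match, so the literal
# ampersand of &#160; has to be written as \&.
replace_entities() {
    sed 's/&nbsp;/\&#160;/g;s/&rarr;/→/g;s/&copy;/©/g'
}

# Made-up sample line; &amp; and &lt; are not in the list, so they stay.
sample='<td>&copy;&nbsp;2010 &rarr; &amp; &lt;</td>'
result=$(printf '%s\n' "$sample" | replace_entities)
printf '%s\n' "$result"
# prints: <td>©&#160;2010 → &amp; &lt;</td>
```

Note that &nbsp; becomes the numeric reference &#160; rather than a
literal non-breaking space, which keeps it visible in the source.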

I skipped all messages files.  I plan to make a follow-up commit that
alters wfMsgExt() with 'escapenoentities' to sanitize all the entities.
That way, the only messages that will be problems will be ones that
output raw HTML, and we want to get rid of those anyway.
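
The name filters that skip the messages files can be checked with a dry
run: the same find predicates, printing the selected paths instead of
editing them.  This is only a sketch; the scratch directory and the four
file names below are made up for the demonstration.

```shell
# Build a throwaway tree with one ordinary file and three message-style files.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/includes" "$demo_dir/languages/messages"
touch "$demo_dir/includes/Article.php" \
      "$demo_dir/includes/Foo.i18n.php" \
      "$demo_dir/includes/Foo_Messages.php" \
      "$demo_dir/languages/messages/MessagesEn.php"

# Same predicates as the real command, but print instead of running sed -i.
selected=$(cd "$demo_dir" && find . -iname '*.php' \
    \! -iname '*.i18n.php' \! -iname 'Messages*.php' \
    \! -iname '*_Messages.php' | sort)
printf '%s\n' "$selected"
# prints: ./includes/Article.php

rm -rf "$demo_dir"
```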

This should get rid of all named entities everywhere except messages.  I
skipped a few things like &nbsp that I noticed in manual inspection,
because they weren't well-formed XML anyway.
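
One way to look for leftovers is to list every named reference that
still appears in a piece of output.  The sketch below assumes that the
five references predefined by XML (amp, lt, gt, quot, apos) are the ones
allowed to stay, which this message does not spell out; the function
name leftover_entities is hypothetical.  It only matches references
that end in a semicolon, so a malformed &nbsp without one is not
reported.

```shell
# Hypothetical helper: list distinct named references found on stdin,
# ignoring the five that XML predefines (an assumption, see above).
leftover_entities() {
    grep -oE '&[A-Za-z][A-Za-z0-9]*;' \
        | grep -vxE '&(amp|lt|gt|quot|apos);' \
        | sort -u
}

# Made-up input with two leftovers and two predefined references.
leftovers=$(printf '%s\n' 'a&nbsp;b &amp; c &rarr; d &lt; e&nbsp;f' | leftover_entities)
printf '%s\n' "$leftovers"
```

Run against the sample line, this reports &nbsp; and &rarr; once each
and leaves &amp; and &lt; out of the list.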

Also, to everyone who uses non-breaking spaces when they could use a
normal space, or nothing at all, or CSS padding: I still hate you.  Die.
41 files changed:
config/Installer.php
includes/Article.php
includes/ChangeTags.php
includes/ChangesList.php
includes/EditPage.php
includes/GlobalFunctions.php
includes/HTMLForm.php
includes/HistoryPage.php
includes/LogEventsList.php
includes/MessageCache.php
includes/Pager.php
includes/Preferences.php
includes/Sanitizer.php
includes/Skin.php
includes/Xml.php
includes/diff/DifferenceEngine.php
includes/diff/DifferenceInterface.php
includes/installer/WebInstaller.php
includes/parser/Parser.php
includes/specials/SpecialAllmessages.php
includes/specials/SpecialBlockip.php
includes/specials/SpecialBooksources.php
includes/specials/SpecialContributions.php
includes/specials/SpecialExport.php
includes/specials/SpecialIpblocklist.php
includes/specials/SpecialListusers.php
includes/specials/SpecialLockdb.php
includes/specials/SpecialMergeHistory.php
includes/specials/SpecialMovepage.php
includes/specials/SpecialProtectedpages.php
includes/specials/SpecialProtectedtitles.php
includes/specials/SpecialUndelete.php
includes/specials/SpecialUnlockdb.php
includes/specials/SpecialUpload.php
includes/specials/SpecialWatchlist.php
includes/specials/SpecialWhatlinkshere.php
includes/templates/Userlogin.php
languages/Language.php
profileinfo.php
skins/MonoBook.php
skins/Vector.php