Non-word characters don't terminate tag names.
author C. Scott Ananian <cscott@cscott.net>
Tue, 6 Aug 2013 15:17:38 +0000 (11:17 -0400)
committer C. Scott Ananian <cscott@cscott.net>
Tue, 6 Aug 2013 15:46:34 +0000 (11:46 -0400)
commit f8b7cc890d9fa6fbb6c9673391f37e81abde274e
tree 6d2d8d1e4d9941aa9cbe957a44267851a4ac5313
parent d11d0f08b25293453c08c012afd6c7620fa4a1d6

The PHP sanitizer matched only \w+ as the tag name, so the name ended at
the first non-word character.  This meant that <b.foo> and <bä> were
converted to <b> tags (bug 17663), <s.foo> and <s-id> were treated as <s>
tags (bug 40670), and <sub-ID#1> was treated as a <sub> tag (bug 52022).
(But note that <strike> *is* actually a valid synonym for <s>.)
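
The failure mode described above can be reproduced with a small sketch. It is
written in Python rather than PHP and is not the code from
includes/Sanitizer.php. The names OLD_TAG, STRICT_TAG, ALLOWED, old_name and
strict_name are hypothetical, the allow-list is a tiny illustrative subset,
and re.ASCII is used only to approximate a byte-oriented \w that does not
match "ä". The stricter pattern is one possible way to delimit a tag name,
assumed here for illustration and not necessarily the one this commit adopts.

```python
import re

# Hypothetical stand-in for the old behaviour: the tag name is whatever
# \w+ matches, so it stops at the first non-word character.
# re.ASCII approximates a byte-oriented \w that does not match "ä".
OLD_TAG = re.compile(r"<(\w+)", re.ASCII)

# Hypothetical stricter pattern (an assumption, not the actual fix):
# the name runs until whitespace, "/" or ">".
STRICT_TAG = re.compile(r"<([^\s/>]+)")

# Illustrative allow-list; the real sanitizer's list is much longer.
ALLOWED = {"b", "s", "sub", "strike"}


def old_name(tag):
    """Return the tag name as the \\w+ approach would see it."""
    m = OLD_TAG.match(tag)
    return m.group(1) if m else None


def strict_name(tag):
    """Return the full tag name up to whitespace, '/' or '>'."""
    m = STRICT_TAG.match(tag)
    return m.group(1) if m else None


if __name__ == "__main__":
    samples = ["<b.foo>", "<bä>", "<s.foo>", "<s-id>", "<sub-ID#1>", "<strike>"]
    for tag in samples:
        old, strict = old_name(tag), strict_name(tag)
        print(tag, "| old:", old, old in ALLOWED, "| strict:", strict, strict in ALLOWED)
```

Under these assumptions, the \w+ pattern truncates every sample except
<strike> to an allowed name, while the stricter pattern keeps the full name,
so only <strike> still passes the allow-list check.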

Fix the sanitizer so that a non-word character no longer terminates the
tag name.

Bug: 17663
Change-Id: Iceec404f46703065bf080dd2cbfed1f88c204fa5
includes/Sanitizer.php
tests/parser/parserTests.txt