Fix incorrect escaping of nested <em>, <strong>, <q>, <ruby>, and <bdo>
author C. Scott Ananian <cscott@cscott.net>
Wed, 10 Jul 2013 17:03:21 +0000 (13:03 -0400)
committer C. Scott Ananian <cscott@cscott.net>
Wed, 10 Jul 2013 17:07:45 +0000 (13:07 -0400)
commit 1be73506625a5a8db8e0394cdfd8a6e7f000d62d
tree 2cc4125c4a6f8ce489d9377755188154b7ac15fc
parent eba2c630596bc9f1921fa8cb39852c85eb93783f

When given "<em>X<em>Y</em>Z</em>", the parser was emitting
"<p><em>X&lt;em&gt;Y</em>Z&lt;/em&gt;</p>": the inner <em> and the final
</em> were escaped rather than kept as markup.  This is the same problem
as bug 41545, but with a different set of tags.

Note that the HTML spec
(http://www.w3.org/TR/html5/text-level-semantics.html) gives an
explicit meaning for nested <em>, <strong>, <q>, <ruby>, and <bdo>.

There are other nestable tags (<b>, <i>, <s>, <u>, <cite>, <dfn>,
<abbr>, <time>, <code>, <mark>, <rt>, <rp>, <bdi>) which I've chosen
not to fix in this commit, since the spec allows nesting them but does
not give the nesting any semantics.  A Wikipedian authoring content with
these nested tags is probably making an error; the escaped content will
make this obvious.
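
The behavior described above can be illustrated with a rough sketch.
This is not the code in includes/Sanitizer.php: the function name
sanitize, the ALLOWED and NESTABLE sets, and the tokenizer are all
hypothetical, written in Python only to show how a "nestable" set
changes what gets escaped.  It assumes attribute-free lowercase tags
and does not close tags that are left open.

```python
import html
import re

# Hypothetical tag sets for this sketch; not the actual lists in Sanitizer.php.
ALLOWED = {"em", "strong", "q", "ruby", "bdo", "b", "i"}
NESTABLE = {"em", "strong", "q", "ruby", "bdo"}

# Split input into simple tags, runs of text, or a stray "<".
TOKEN = re.compile(r"</?[a-z]+>|[^<]+|<")


def sanitize(text, nestable=NESTABLE):
    """Keep allowed tags as markup; escape everything else."""
    open_tags = []  # stack of tags currently open
    out = []
    for tok in TOKEN.findall(text):
        m = re.fullmatch(r"<(/?)([a-z]+)>", tok)
        if m is None:
            # Plain text (or a stray "<"): always escaped.
            out.append(html.escape(tok, quote=False))
            continue
        closing, name = m.group(1) == "/", m.group(2)
        if closing:
            # A closing tag is kept only if it matches the innermost open tag.
            keep = bool(open_tags) and open_tags[-1] == name
            if keep:
                open_tags.pop()
        else:
            # An opening tag is kept if allowed and either not already
            # open or explicitly permitted to nest inside itself.
            keep = name in ALLOWED and (name not in open_tags or name in nestable)
            if keep:
                open_tags.append(name)
        out.append(tok if keep else html.escape(tok, quote=False))
    return "".join(out)


# With <em> treated as non-nestable, the old output is reproduced
# (minus the <p> wrapper the real parser adds).
print(sanitize("<em>X<em>Y</em>Z</em>", nestable=set()))
# With <em> in the nestable set, the nested markup survives.
print(sanitize("<em>X<em>Y</em>Z</em>"))
# <b> is allowed but not nestable here, so the inner tag is still escaped.
print(sanitize("<b>X<b>Y</b>Z</b>"))
```

In this sketch the escaping of the inner <b> is deliberate, mirroring
the choice above to leave tags without defined nesting semantics
visibly escaped.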

Bug: 51081
Change-Id: Ia940ac54e9527bba7fee75d3bd91babee2f91c57
includes/Sanitizer.php
tests/parser/parserTests.txt