* An open <dt> (;) should be closed when we encounter a new <dd> (:)
char even if it is on a new line that has other nested lists inside.
* Tidy was hiding this PHP parser bug by closing a <dt> and opening
a <dd> when given this HTML: "<dl><dt>a<ul><li>b</li></ul></dt></dl>"
It generates "<dl><dt>a</dt><dd><ul><li>b</li></ul></dd></dl>"
However, a HTML5 parser like RemexHTML, domino (used by Parsoid),
or browsers don't do this fixup.
* So, what I thought was a bug in RemexHTML turned out to be a bug
in the PHP parser that was being hidden by the use of Tidy.
* Added a regression test.
Bug: T175099
Change-Id: I6d5b225b82cecf9a43f23837ed8ec359b31aadad
$output .= $this->nextItem( $prefix[$commonPrefixLength - 1] );
}
+ # Close an open <dt> if we have a <dd> (":") starting on this line
+ if ( $this->DTopen && $commonPrefixLength > 0 && $prefix[$commonPrefixLength - 1] === ':' ) {
+ $output .= $this->nextItem( ':' );
+ }
+
# Open prefixes where appropriate.
if ( $lastPrefix && $prefixLength > $commonPrefixLength ) {
$output .= "\n";
}
while ( $prefixLength > $commonPrefixLength ) {
- $char = substr( $prefix, $commonPrefixLength, 1 );
+ $char = $prefix[$commonPrefixLength];
$output .= $this->openList( $char );
if ( ';' === $char ) {
!! end
+# This is a regression test for T175099
+# html/php+tidy is insufficient since Tidy covers up the bug.
+# But once Tidy is replaced with RemexHTML, html/php+tidy is good enough
+!! test
+Definition Lists: Mixed Lists: Test 11
+!! wikitext
+;a
+:*b
+!! html
+<dl><dt>a</dt>
+<dd>
+<ul><li>b</li></ul></dd></dl>
+
+!! end
+
# The Parsoid team disagrees with the PHP parser's seemingly-random
# rules regarding dd/dt on the next two tests. Parsoid is more
# consistent, and recognizes the shared nesting and keeps the
# (And tidy again converts <dt> to <dd> before 'bar'.)
!! test
-Definition Lists: Mixed Lists: Test 11
+Definition Lists: Mixed Lists: Test 12
!! wikitext
*#*#;*;;foo :bar
*#*#;boo :baz