From: MatmaRex
Date: Wed, 6 Mar 2013 20:06:31 +0000 (+0100)
Subject: (bug 23393) (bug 45803) Parser: Fix whitespace handling within headings
X-Git-Tag: 1.31.0-rc.0~20257^2
X-Git-Url: https://git.cyclocoop.org/%242?a=commitdiff_plain;h=b3dd3881;p=lhc%2Fweb%2Fwiklou.git

(bug 23393) (bug 45803) Parser: Fix whitespace handling within headings

* HTML headings containing line breaks are now handled correctly
  (bug 23393).
* Whitespace within == Headline == syntax and within <hN> headings is
  now non-significant and not preserved in the HTML output (bug 45803).

Change-Id: I0f2d81dd0b2f7742c5cdb6b7d2cc58a15d3f1029
---

diff --git a/RELEASE-NOTES-1.21 b/RELEASE-NOTES-1.21
index 13dc67394b..c77ff832f8 100644
--- a/RELEASE-NOTES-1.21
+++ b/RELEASE-NOTES-1.21
@@ -121,6 +121,10 @@ production.
 * (bug 45526) Add QUnit assertion helper "QUnit.assert.htmlEqual" for asserting
   structual equality of HTML (ignoring insignificant differences like
   quotmarks, order and whitespace in the attribute list).
+* (bug 23393) HTML headings containing line breaks are now handled
+  correctly.
+* (bug 45803) Whitespace within == Headline == syntax and within <hN> headings
+  is now non-significant and not preserved in the HTML output.
 
 === Bug fixes in 1.21 ===
 * (bug 40353) SpecialDoubleRedirect should support interwiki redirects.
diff --git a/includes/parser/Parser.php b/includes/parser/Parser.php
index dea0764045..8209f8a304 100644
--- a/includes/parser/Parser.php
+++ b/includes/parser/Parser.php
@@ -4108,7 +4108,7 @@ class Parser {
 		# Get all headlines for numbering them and adding funky stuff like [edit]
 		# links - this is for later, but we need the number of headlines right now
 		$matches = array();
-		$numMatches = preg_match_all( '/<H(?P<level>[1-6])(?P<attrib>.*?'.'>)(?P<header>.*?)<\/H[1-6] *>/i', $text, $matches );
+		$numMatches = preg_match_all( '/<H(?P<level>[1-6])(?P<attrib>.*?'.'>)\s*(?P<header>[\s\S]*?)\s*<\/H[1-6] *>/i', $text, $matches );
 
 		# if there are fewer than 4 headlines in the article, do not show TOC
 		# unless it's been explicitly enabled.
@@ -4170,7 +4170,7 @@ class Parser {
 				$serial = $markerMatches[1];
 				list( $titleText, $sectionIndex ) = $this->mHeadings[$serial];
 				$isTemplate = ( $titleText != $baseTitleText );
-				$headline = preg_replace( "/^$markerRegex/", "", $headline );
+				$headline = preg_replace( "/^$markerRegex\\s*/", "", $headline );
 			}
 
 			if ( $toclevel ) {
@@ -4416,7 +4416,7 @@ class Parser {
 		}
 
 		# split up and insert constructed headlines
-		$blocks = preg_split( '/<H[1-6].*?'.'>.*?<\/H[1-6]>/i', $text );
+		$blocks = preg_split( '/<H[1-6].*?'.'>[\s\S]*?<\/H[1-6]>/i', $text );
 		$i = 0;
 
 		// build an array of document sections
diff --git a/tests/parser/parserTests.txt b/tests/parser/parserTests.txt
index 03f9659cf5..a5d92f29fb 100644
--- a/tests/parser/parserTests.txt
+++ b/tests/parser/parserTests.txt
@@ -4459,7 +4459,7 @@ List interrupted by empty line or heading
    • bar
-

[edit] A heading

+

[edit] A heading

  • Another list item
@@ -7524,7 +7524,7 @@ More ===Smaller headline=== Blah blah !! result -

[edit] Headline 1

+

[edit] Headline 1

Some text

[edit] Headline 2

@@ -7569,11 +7569,11 @@ Some text -

[edit] Headline 1

-

[edit] Subheadline 1

-
[edit] Skipping a level
-
[edit] Skipping a level
-

[edit] Headline 2

+

[edit] Headline 1

+

[edit] Subheadline 1

+
[edit] Skipping a level
+
[edit] Skipping a level
+

[edit] Headline 2

Some text

[edit] Another headline

@@ -7624,12 +7624,12 @@ Handling of sections up to level 6 and beyond -

[edit] Level 1 Heading

-

[edit] Level 2 Heading

-

[edit] Level 3 Heading

-

[edit] Level 4 Heading

-
[edit] Level 5 Heading
-
[edit] Level 6 Heading
+

[edit] Level 1 Heading

+

[edit] Level 2 Heading

+

[edit] Level 3 Heading

+

[edit] Level 4 Heading

+
[edit] Level 5 Heading
+
[edit] Level 6 Heading
[edit] = Level 7 Heading=
[edit] == Level 8 Heading==
[edit] === Level 9 Heading===
@@ -7666,12 +7666,12 @@ TOC regression (bug 9764) -

[edit] title 1

-

[edit] title 1.1

-

[edit] title 1.1.1

-

[edit] title 1.2

-

[edit] title 2

-

[edit] title 2.1

+

[edit] title 1

+

[edit] title 1.1

+

[edit] title 1.1.1

+

[edit] title 1.2

+

[edit] title 2

+

[edit] title 2.1

!! end @@ -7702,12 +7702,12 @@ wgMaxTocLevel=3 -

[edit] title 1

-

[edit] title 1.1

-

[edit] title 1.1.1

-

[edit] title 1.2

-

[edit] title 2

-

[edit] title 2.1

+

[edit] title 1

+

[edit] title 1.1

+

[edit] title 1.1.1

+

[edit] title 1.2

+

[edit] title 2

+

[edit] title 2.1

!! end @@ -7747,8 +7747,8 @@ Resolving duplicate section names == Foo bar == == Foo bar == !! result -

[edit] Foo bar

-

[edit] Foo bar

+

[edit] Foo bar

+

[edit] Foo bar

!! end @@ -7758,8 +7758,8 @@ Resolving duplicate section names with differing case (bug 10721) == Foo bar == == Foo Bar == !! result -

[edit] Foo bar

-

[edit] Foo Bar

+

[edit] Foo bar

+

[edit] Foo Bar

!! end @@ -7824,9 +7824,9 @@ __TOC__
  • 2 title 2
  • -

    [edit] title 1

    -

    [edit] title 1.1

    -

    [edit] title 2

    +

    [edit] title 1

    +

    [edit] title 1.1

    +

    [edit] title 2

    !! end @@ -7887,19 +7887,19 @@ section 5
  • 5 text " text
  • -

    [edit] text > text

    +

    [edit] text > text

    section 1

    -

    [edit] text < text

    +

    [edit] text < text

    section 2

    -

    [edit] text & text

    +

    [edit] text & text

    section 3

    -

    [edit] text ' text

    +

    [edit] text ' text

    section 4

    -

    [edit] text " text

    +

    [edit] text " text

    section 5

    !! end @@ -7928,6 +7928,45 @@ Headers with excess '=' characters !! end +!! test +HTML headers vs TOC (bug 23393) +(__NOEDITSECTION__ for clearer output, doesn't matter here) +!! input +

    Header 1

    +== Header 1.1 == +== Header 1.2 == + +

    Header 2 +

    +== Header 2.1 == +== Header 2.2 == +__NOEDITSECTION__ +!! result +

    Contents

    + +
    +

    Header 1

    +

    Header 1.1

    +

    Header 1.2

    +

    Header 2

    +

    Header 2.1

    +

    Header 2.2

    + +!! end + !! test BUG 1219 URL next to image (broken) !! input @@ -9211,7 +9250,7 @@ Fuzz testing: Parser14 == onmouseover= == http://__TOC__ !! result -

    [edit] onmouseover=

    +

    [edit] onmouseover=

    http://

    Contents

    • 1 onmouseover=
    • @@ -11150,7 +11189,7 @@ anchorencode encodes like the TOC generator: (bug 18431) {{anchorencode: _ +:.3A%3A&&]] }} __NOEDITSECTION__ !! result -

      _ +:.3A%3A&&]]

      +

      _ +:.3A%3A&&]]

      .2B:.3A.253A.26.26.5D.5D

      !! end @@ -11380,7 +11419,7 @@ language=sr variant=sr-ec !! input == -{Naslov}- == !! result -

      [уреди] Naslov

      +

      [уреди] Naslov

      !! end @@ -12541,7 +12580,7 @@ __TOC__
    • 1 Lost episodes
    -

    [edit] Lost episodes

    +

    [edit] Lost episodes

    !! end @@ -12558,7 +12597,7 @@ __TOC__
  • 1 should be bold then normal text
  • -

    [edit] should be bold then normal text

    +

    [edit] should be bold then normal text

    !! end @@ -12575,7 +12614,7 @@ __TOC__
  • 1 Image
  • -

    [edit] Image Foobar.jpg

    +

    [edit] Image Foobar.jpg

    !! end @@ -12592,7 +12631,7 @@ __TOC__
  • 1 Quote
  • -

    [edit]
    Quote

    +

    [edit]
    Quote

    !! end @@ -12611,7 +12650,7 @@ QED
  • 1 Proof: 2 < 3
  • -

    [edit] Proof: 2 < 3

    +

    [edit] Proof: 2 < 3

    Hanc marginis exiguitas non caperet. QED

    @@ -12631,8 +12670,8 @@ __TOC__
  • 2 Foo Bar
  • -

    [edit] Foo Bar

    -

    [edit] Foo
    Bar

    +

    [edit] Foo Bar

    +

    [edit] Foo
    Bar

    !! end @@ -12650,8 +12689,8 @@ __TOC__
  • 2 b">Evilbye
  • -

    [edit] Hello

    -

    [edit] b">Evilbye

    +

    [edit] Hello

    +

    [edit] b">Evilbye

    !! end @@ -12678,11 +12717,11 @@ __TOC__
  • 5 Attributes after dir on these span tags must be deleted from the TOC
  • -

    [edit] C++

    -

    [edit] זבנג!

    -

    [edit] The attributes on these span tags must be deleted from the TOC

    -

    [edit] All attributes on these span tags must be deleted from the TOC

    -

    [edit] Attributes after dir on these span tags must be deleted from the TOC

    +

    [edit] C++

    +

    [edit] זבנג!

    +

    [edit] The attributes on these span tags must be deleted from the TOC

    +

    [edit] All attributes on these span tags must be deleted from the TOC

    +

    [edit] Attributes after dir on these span tags must be deleted from the TOC

    !! end @@ -12699,7 +12738,7 @@ title=[[Main Page]] !! input {{int:Bug32057}} !! result -

    [edit] Headline text

    +

    [edit] Headline text

    !! end
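
Note on the Parser.php regex change: the following is a minimal standalone sketch, not part of the commit, of why replacing `.*?` with `[\s\S]*?` and adding `\s*` on both sides makes multi-line HTML headings match (bug 23393) and keeps surrounding whitespace out of the captured header (bug 45803). The input string and the variable names `$text`, `$old`, `$new`, `$oldCount` and `$newCount` are assumptions made for illustration. The two patterns mirror the heading-matching regexes before and after the change.

```php
<?php
// Hypothetical input: an HTML heading whose content spans several lines.
$text = "<h2>\nHeader 2\n</h2>";

// Old-style pattern: "." does not match a newline unless the /s modifier
// is set, so a heading that contains line breaks is never matched.
$old = '/<H(?P<level>[1-6])(?P<attrib>.*?' . '>)(?P<header>.*?)<\/H[1-6] *>/i';

// New-style pattern: [\s\S] matches any character, newlines included, and
// the \s* on either side of the capture group absorbs surrounding whitespace.
$new = '/<H(?P<level>[1-6])(?P<attrib>.*?' . '>)\s*(?P<header>[\s\S]*?)\s*<\/H[1-6] *>/i';

$oldCount = preg_match_all( $old, $text, $oldMatches );
$newCount = preg_match_all( $new, $text, $newMatches );

var_dump( $oldCount );                  // number of headings found by the old pattern
var_dump( $newCount );                  // number of headings found by the new pattern
var_dump( $newMatches['header'][0] );   // captured heading text, without the line breaks
```

Under these assumptions, the old pattern should find no heading in the sample input, while the new one finds a single heading and captures its text without the leading and trailing newlines.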