From 00db615f85d8c5900f2e3d20b233162bc31b9c11 Mon Sep 17 00:00:00 2001
From: Arlo Breault
Date: Wed, 28 Aug 2019 12:05:09 -0400
Subject: [PATCH] Sync up with Parsoid parserTests.txt

This now aligns with Parsoid commit 06a41a99d7811a361446b894da7c5c8224398ad1

Change-Id: I9d324b3e6e9167683c15b7fee2a53b598d79a67c
---
 tests/parser/parserTests.txt | 688 +++++++++++++++++++++++++----------
 1 file changed, 504 insertions(+), 184 deletions(-)

diff --git a/tests/parser/parserTests.txt b/tests/parser/parserTests.txt
index 0fa91d4afc..d563235cc0 100644
--- a/tests/parser/parserTests.txt
+++ b/tests/parser/parserTests.txt
@@ -3182,7 +3182,7 @@ Parsoid: Pipes in external links in template parameter

link

!! html/parsoid -

link

+

link

 !! end

 !! test
@@ -3193,7 +3193,7 @@ Parsoid: pipe in transclusion parameter

http://foo.com/a%7Cb

!! html/parsoid -

http://foo.com/a%7Cb

+

http://foo.com/a%7Cb

 !! end

 !! test
@@ -3220,7 +3220,7 @@ Parsoid: Pipe in template with nested template in external link target in templa

bar

!! html/parsoid -

bar

+

bar

 !! end

 !! test
@@ -4049,7 +4049,7 @@ Definition list with news link containing colon
news:alt.wikipedia.rox
This isn't even a real newsgroup!
!! html/parsoid -
news:alt.wikipedia.rox
This isn't even a real newsgroup!
+
news:alt.wikipedia.rox
This isn't even a real newsgroup!
 !! end

 !! test
@@ -4670,7 +4670,7 @@ Definition Lists: Mixed Lists: Test 13
 !! end

 # FIXME: Maybe get rid of this test?
-# From whitelist:
+# From old whitelist description:
 # * The test is wrong, there are two colons where there should be :;
 # * The PHP parser is wrong to close the
after the
containing the !! html/parsoid -

http://[2404:130:0:1000::187:2]/index.php

+

http://[2404:130:0:1000::187:2]/index.php

Examples from RFC 2373, section 2.2:

- +

Examples from RFC 2732, section 2:

- + !! end !! test @@ -5936,24 +5963,24 @@ Examples from RFC 2732, section 2:
  • 6
  • 7
  • !! html/parsoid -

    test

    +

    test

    Examples from RFC 2373, section 2.2:

    - +

    Examples from RFC 2732, section 2:

    - + !! end !! test @@ -5999,7 +6026,7 @@ Non-extlinks in brackets [fool's] errand [fool's errand] [url=foo] -[url=http://example.com] +[url=http://example.com] [http:// bare protocols don't count]

    !! end @@ -6011,7 +6038,7 @@ Percent encoding in external links

    Search

    !! html/parsoid -

    Search

    +

    Search

    !! end !! test @@ -6022,7 +6049,7 @@ http://example.com

    http://example.com

    !! html/parsoid -

    http://example.com

    +

    http://example.com

    !! end !! test @@ -6054,14 +6081,14 @@ http://example.com/a)b

    foo

    !! html/parsoid -

    http://example.com)

    -

    http://example.com/test)

    -

    http://example.com/(test)

    -

    http://example.com/((test)

    -

    (http://example.com/(test))

    -

    (http://example.com/(test)))))

    -

    http://example.com/a)b

    -

    foo

    +

    http://example.com)

    +

    http://example.com/test)

    +

    http://example.com/(test)

    +

    http://example.com/((test)

    +

    (http://example.com/(test))

    +

    (http://example.com/(test)))))

    +

    http://example.com/a)b

    +

    foo

    !! end !! test @@ -6075,9 +6102,9 @@ Parenthesis in external links, w/ transclusion or comment

    (http://example.com)

    !! html/parsoid -

    (http://example.com/hi)

    +

    (http://example.com/hi)

    -

    (http://example.com)

    +

    (http://example.com)

 !! end

 !! test
@@ -6654,6 +6681,8 @@ Allow +/- in 2nd and later cells in a row, in 1st cell when td-attrs are present
 !!end

+# Differences between Parsoid and PHP re: trailing whitespace in a
+# table cell.
 !! test
 Table rowspan
 !! wikitext
@@ -6665,7 +6694,7 @@ Table rowspan
 |Cell 1, row 2
 |Cell 3, row 2
 |}
-!! html
+!! html/php
    Cell 1, row 1 @@ -6679,6 +6708,15 @@ Table rowspan Cell 3, row 2
    +!! html/parsoid + + + + + + + +
    Cell 1, row 1Cell 2, row 1 (and 2)Cell 3, row 1
    Cell 1, row 2Cell 3, row 2
    !! end !! test @@ -6771,7 +6809,7 @@ parsoid=wt2html,html2html !! html/parsoid -
    [ftp://%7Cx]" onmouseover="alert(document.cookie)">test
    +[ftp://%7Cx]" onmouseover="alert(document.cookie)">test !! end !! test @@ -7695,13 +7733,17 @@ Broken link

 !! end

+# The PHP parser strips the hash fragment for non-existent pages, but
+# Parsoid does not. (T227693)
 !! test
 Broken link with fragment
 !! wikitext
 [[Zigzagzogzagzig#zug]]
-!! html
+!! html/php

    Zigzagzogzagzig#zug

    +!! html/parsoid +

    Zigzagzogzagzig#zug

 !! end

 !! test
@@ -7713,13 +7755,16 @@ Special page link with fragment

 !! end

+# Parsoid does not strip fragment from red links: T227693
 !! test
 Nonexistent special page link with fragment
 !! wikitext
 [[Special:ThisNameWillHopefullyNeverBeUsed#anchor]]
-!! html
+!! html/php

    Special:ThisNameWillHopefullyNeverBeUsed#anchor

    +!! html/parsoid +

    Special:ThisNameWillHopefullyNeverBeUsed#anchor

    !! end !! test @@ -8170,7 +8215,7 @@ Plain link to URL

    [[1]]

    !! html/parsoid -

    []

    +

    []

    !! end !! test @@ -8199,7 +8244,7 @@ Plain link to protocol-relative URL

    [[1]]

    !! html/parsoid -

    []

    +

    []

    !! end !! test @@ -8242,7 +8287,7 @@ Piped link to URL: [[http://www.example.com|an example URL]]

    Piped link to URL: [example URL]

    !! html/parsoid -

    Piped link to URL: [example URL]

    +

    Piped link to URL: [example URL]

    !! end !! test @@ -8264,13 +8309,13 @@ parsoid=wt2html

    [http://www.example.com

    !! html/parsoid -

    [http://www.example.com

    +

    [http://www.example.com

    -

    [|123]

    +

    [|123]

    -

    {{echo|[|123}}

    +

    {{echo|[|123}}

    -

    [http://www.example.com

    +

    [http://www.example.com

    !! end !! test @@ -8867,8 +8912,8 @@ Interwiki links that cannot be represented in wiki syntax

    meatball:ok ok with fragment ok ending with ? mark -has query -is just fragment

    +has query +is just fragment

    !! end !! test @@ -11445,7 +11490,7 @@ X[https://tools.ietf.org/html/rfc1234 foo]

    !! html/parsoid

    Xfoo

    -

    Xfoo

    +

    Xfoo

    !! end !! test @@ -11505,14 +11550,44 @@ Template with invalid target containing wikilink

    {{Main Page}}

 !! end

+# The html2html output of this test is currently failing
+# because the html2wt output is broken; see
+# https://phabricator.wikimedia.org/T220018#5123777 for a discussion.
+# Not (yet) including html2wt as a test mode because there are
+# a couple of different correct ways this could be <nowiki>'ed.
 !! test
 Template with just whitespace in it, T70421
 !! wikitext
 {{echo|{{ }}}}
+!! options
+parsoid=wt2html,html2html
+!! html/php+tidy
+

    {{ }} +

    !! html/parsoid

    {{ }}

 !! end

+# This is currently the wikitext output of html2wt on the above test
+# case; note that it is broken!  Adding a <nowiki> around the closing
+# brace changes how the open braces associate, breaking the outer
+# {{echo}} template invocation.  *However* this "broken" wikitext
+# exposed a useful tokenizer bug (T221384) in how the broken_template
+# rule was being backtracked into, so it's a useful test case even
+# if/when the above test case gets its html2wt output fixed.
+!! test
+Template with just whitespace (bad template brace matching)
+!! options
+parsoid=wt2html
+!! wikitext
+{{echo|{{ }}}}
+!! html/php+tidy
+

    {{echo|{{ }}}} +

    +!! html/parsoid +

    {{echo|{{ }}}}

+!! end
+
 !! article
 Template:test
 !! text
@@ -11893,9 +11968,11 @@ Template:loop2
 Template infinite loop
 !! wikitext
 {{loop1}}
-!! html
+!! html/php

    Template loop detected: Template:Loop1

    +!! html/parsoid +

    Template loop detected: Template:Loop1

    !! end !! test @@ -12034,7 +12111,7 @@ Templates with intersecting and overlapping ranges hi !! html/parsoid -

    ha

    +

    ha

    ho

    @@ -12115,9 +12192,11 @@ Template:Includes2 being included !! wikitext {{Includes2}} -!! html +!! html/php+tidy

    Foo

    +!! html/parsoid +

    Foo

    !! end @@ -12131,9 +12210,11 @@ Template:Includes3 and being included !! wikitext {{Includes3}} -!! html +!! html/php+tidy

    Foo

    +!! html/parsoid +

    Foo

    !! end # FIXME: Parsoid's markup for this is quite ugly. @@ -12152,7 +12233,20 @@ Foozarbar Un-closed !! wikitext -!! html +!! html/php+tidy +!! html/parsoid + +!! end + +!! test +Empty +!! wikitext +Hello! +!! html/php+tidy +

    Hello! +

    +!! html/parsoid +

    Hello!

 !! end

 !! test
@@ -12273,6 +12367,7 @@ Un-closed
 ## will normalize the include directives to serialize on their own line.
 ## Selser will take care of preserving formatting in scenarios where they
 ## intermingled with other wikitext.
+## This test also triggered T223411 during Parsoid-PHP porting.
 !! test
 Includes and comments at SOL
 !! options
@@ -12813,9 +12908,9 @@ parsoid=wt2html
  • {{echo|Breaks template, however}}
  • !! html/parsoid !! end @@ -13877,7 +13972,7 @@ language=zh

    hi[1] hi[2] hi[3]

    -
    1. ↑ hi
    2. ↑
    3. ↑
    +
    1. ↑ hi
    2. ↑
    3. ↑
    !! end ### @@ -15867,7 +15962,7 @@ thumbsize=220 !! html/php !! html/parsoid -
    http://example.com
    +
    http://example.com
    !! end !! test @@ -15880,7 +15975,7 @@ parsoid=wt2html,wt2wt,html2html !! html/php !! html/parsoid -
    Alteration
    http://example.com
    +
    Alteration
    http://example.com
    !! end !! test @@ -15967,7 +16062,7 @@ T3887: A mailto link with a thumbnail !! html/php !! html/parsoid -
    Please mailto:nobody@example.com
    +
    Please mailto:nobody@example.com
    !! end # Pending resolution to T2368 @@ -16150,7 +16245,7 @@ T5090: External links other than http: in image captions !! html/php
    This caption has irc and Secure ext links in it.
    !! html/parsoid -
    This caption has irc and Secure ext links in it.
    +
    This caption has irc and Secure ext links in it.
    !! end !! test @@ -16650,7 +16745,7 @@ Render invalid page names as plain text (T53090) [[.]] [[..]] [[foo././bar]] -[[fooxyz]]

    +[[fooxyz]]

    [[./../foo|bar]] [[foo/.|bar]] @@ -16748,6 +16843,18 @@ cat=MediaWiki_User's_Guide sort=MediaWiki User's Guide !! end +!! test +Category with template-generated sort key +!! options +cat +!! wikitext +[[Category:MediaWiki User's Guide|MediaWiki {{echo|Foo}} Guide]] +!! html/php +cat=MediaWiki_User's_Guide sort=MediaWiki Foo Guide +!! html/parsoid + +!! end + !! test Category with empty sort key !! options @@ -17649,7 +17756,7 @@ http://example.com [[File:Foobar.jpg]]

    http://example.com Foobar.jpg

    !! html/parsoid -

    http://example.com

    +

    http://example.com

    !!end # Parsoid doesn't wt2wt this cleanly because it adds s. @@ -17952,7 +18059,7 @@ http://example.com[[File:Foobar.jpg]]

    http://example.comFoobar.jpg

    !! html/parsoid -

    http://example.com

    +

    http://example.com

    !!end !! test @@ -18192,6 +18299,17 @@ I always thought &xacute; was a cute letter.

    !! end +!! test +Text with HTML5 semicolon-less entity (should not decode) +!! wikitext +& +!! html/php+tidy +

    &ampamp; +

    +!! html/parsoid +

    &ampamp;

    +!! end + !! test HTML5 tags !! wikitext @@ -19765,11 +19883,11 @@ mailto:inline@mail.tld

    mailto:inline@mail.tld

    !! html/parsoid -

    -

    ftp://inlineftp

    -

    With target

    -

    -

    mailto:inline@mail.tld

    +

    +

    ftp://inlineftp

    +

    With target

    +

    +

    mailto:inline@mail.tld

    !! end @@ -19817,7 +19935,7 @@ http://

    onmouseover= -

    http://__TOC__

    +

    http://__TOC__

    !! end !! test @@ -19884,12 +20002,31 @@ Fuzz testing: Parser22 http://===r:::https://b {| -!! html +!! html/php

    http://===r:::https://b

    +!! html/parsoid +

    http://===r:::https://b

    + +
+!! end
+
+# The above 'Parser24' fuzz test exposed a tokenizer bug (T221384);
+# this is a minimized version of the above test to catch regressions.
+!! test
+Fuzz testing: Parser24 (minimized)
+!! options
+parsoid=wt2html
+!! wikitext
+{{}}
+!! html/php+tidy
+

    {{}} +

    +!! html/parsoid +

    {{}}

    !! end ## Remex doesn't account for fostered content. @@ -19973,7 +20110,7 @@ http://example.com junk

    http://example.com junk

    !! html/parsoid -

    http://example.com junk

    +

    http://example.com junk

    !! end !!test @@ -19984,7 +20121,7 @@ http://example.comjunk

    http://example.comjunk

    !! html/parsoid -

    http://example.comjunk

    +

    http://example.comjunk

    !! end !! test @@ -19996,7 +20133,7 @@ http://example.com
    junk
    !! html/php+tidy

    http://example.com

    junk
    !! html/parsoid -

    http://example.com

    junk
    +

    http://example.com

    junk
    !! end !! test @@ -20020,7 +20157,7 @@ parsoid=wt2html
    
     !! html/parsoid
     
    
    +" typeof="mw:Extension/pre" about="#mwt2" data-mw='{"name":"pre","attrs":{"dir":""},"body":{"extsrc":""}}'>
     !! end
     
     !! test
    @@ -21049,7 +21186,7 @@ Handling of 
     in URLs
     !! html/php
     
     !! html/parsoid
    -
    +
     !! end
     
     !! test
    @@ -21059,7 +21196,7 @@ Handling of %0A in URLs
     !! html/php
     
     !! html/parsoid
    -
    +
     !! end
     
     # The PHP parser strips the empty tags out for giggles; parsoid doesn't.
    @@ -21258,7 +21395,7 @@ image4    |300px| centre
     
  • -
  • +
  • !! end @@ -21799,6 +21936,30 @@ File:Foobar.jpg !! end +!! test +Gallery in nolines mode +!! wikitext + +File:Foobar.jpg|foo + +!! html/php + +!! html/parsoid + +!! end + !! test Gallery in slideshow mode !! wikitext @@ -21835,7 +21996,55 @@ File:Foobar.jpg !! html/parsoid +!! end + +!! test +Gallery in packed-overlay mode +!! wikitext + +File:Foobar.jpg|foo + +!! html/php + +!! html/parsoid + +!! end + +!! test +Gallery in packed-hover mode +!! wikitext + +File:Foobar.jpg|foo + +!! html/php + +!! html/parsoid + !! end @@ -21895,14 +22104,19 @@ parsoid=wt2html,wt2wt,html2html # See: https://www.w3.org/TR/html5/syntax.html#character-references # Note that U+000C (form feed) is not a valid XML character, so # it is banned even though allowed in HTML5. +# Note there are also weird legacy numeric entities which are mapped +# elsewhere; see T113194 !! test -Illegal character references (T106578) +Illegal character references (T106578, T113194) +!! options +parsoid={ "modes": ["wt2html","html2html"], "normalizePhp": true } !! wikitext ; Null: � ; FF: ; CR: ; Control (low):  ; Control (high):  Ÿ +; Unsupported legacy: € ‚ ƒ – Ÿ ; Surrogate: �� ; This is an okay astral character: 💩 !! html+tidy @@ -21916,6 +22130,8 @@ Illegal character references (T106578)
    
    Control (high)
     Ÿ
    +
    Unsupported legacy
    +
    € ‚ ƒ – Ÿ
    Surrogate
    ��
    This is an okay astral character
    @@ -22009,7 +22225,7 @@ T24905: followed by ISBN followed by

    (fr) ISBN 2753300917 example.com

    !! html/parsoid -

    (fr) ISBN 2753300917 example.com

    +

    (fr) ISBN 2753300917 example.com

    !! end !! test @@ -22151,7 +22367,7 @@ Images with the "|" character in the comment !! html/php
    An external URL
    !! html/parsoid -
    An external URL
    +
    An external URL
    !! end !! test @@ -23381,7 +23597,7 @@ Nested: -{zh-hans:Hi -{zh-cn:China;zh-sg:Singapore;}-;zh-hant:Hello -{zh-tw:Taiw

    Nested: Hello Hong Kong!

    !! html/parsoid -

    Nested: !

    +

    Nested: !

    !! end !! test @@ -23394,7 +23610,7 @@ language=zh variant=zh-cn

    A

    !! html/parsoid -

    +

    !! end !! test @@ -23407,7 +23623,7 @@ language=zh variant=zh-cn

    A

    !! html/parsoid -

    +

    !! end # Parsoid and PHP disagree on how to parse this example: Parsoid @@ -23552,13 +23768,13 @@ gopher://www.google.com www.гоогле.цом

    !! html/parsoid -

    http://www.google.com -gopher://www.google.com -http://www.google.com -gopher://www.google.com -irc://www.google.com -www.google.com/ftp://dir -www.google.com

    +

    http://www.google.com +gopher://www.google.com +http://www.google.com +gopher://www.google.com +irc://www.google.com +www.google.com/ftp://dir +www.google.com

    !! end !! test @@ -24245,7 +24461,7 @@ language=fa

    [Û±]

    !! html/parsoid -

    +

    !! end !! test @@ -25625,7 +25841,7 @@ T36939 - Case insensitive link parsing ([HttP://])

    [1]

    !! html/parsoid -

    +

    !! end !!test @@ -25645,7 +25861,7 @@ HttP://MediaWiki.Org/

    HttP://MediaWiki.Org/

    !! html/parsoid -

    HttP://MediaWiki.Org/

    +

    HttP://MediaWiki.Org/

    !! end !!test @@ -25779,6 +25995,17 @@ parsoid=wt2html,wt2wt
 !! end

+## Just a regression test
+!! test
+Wikilink with only closing tag in target
+!! options
+parsoid=wt2html
+!! wikitext
+[[Test|]]
+!! html/parsoid
+

+!! end
+
 #### ----------------------------------------------------------------
 #### Parsoid-only testing of Parsoid's impl of LST
 #### Not implemented yet, see
@@ -28417,7 +28644,7 @@ parsoid=html2wt
 !!end

 !! test
-Don't block XML namespace declaration
+T72867: Don't block XML namespace declaration
 !! wikitext
 MediaWiki
 !! html/php
@@ -28591,6 +28818,23 @@ parsoid=html2wt
 [[es:Toxine_bactérienne]]
 !! end

+# Regression test for T219023
+!! test
+Emit simple non-piped link where possible
+!! options
+parsoid=html2wt
+!! html/parsoid
+VisualEditor
+visualEditor
+VisualEditor link
+visualEditor link
+!! wikitext
+[[VisualEditor]]
+[[visualEditor]]
+[[VisualEditor link]]
+[[visualEditor link]]
+!! end
+
 !! test
 Image: Modifying size of an image (1)
 !! options
@@ -29604,7 +29848,7 @@ WTS of autolinks with nowikis (round-trip)
 !! wikitext
 xhttp://cscott.netx
 !! html/parsoid
-

    xhttp://cscott.netx

    +

    xhttp://cscott.netx

 !! end

 # this is the "easy" test because it leaves in place all the
@@ -29659,6 +29903,58 @@ parsoid=html2wt
 http://example.com http://example.com is not a link.
 !! end

+!! test
+WTS of an autolink surrounded by square brackets (T220018)
+!! options
+parsoid=html2wt
+!! html/parsoid
+

    [http://example.com]

+!! wikitext
+[http://example.com]
+!! end
+
+!! test
+WTS of edited autolink surrounded by square brackets (T220018)
+!! options
+parsoid={
+  "modes": ["wt2wt"],
+  "changes": [
+    [ "a", "before", "[" ],
+    [ "a", "after", "]" ]
+  ]
+}
+!! wikitext
+http://example.com
+!! wikitext/edited
+[http://example.com]
+!! end
+
+!! test
+WTS of an external link surrounded by square brackets (T220018)
+!! options
+parsoid=html2wt
+!! html/parsoid
+

    [foo]

+!! wikitext
+[[http://example.com foo]]
+!! end
+
+!! test
+WTS of edited external link surrounded by square brackets (T220018)
+!! options
+parsoid={
+  "modes": ["wt2wt"],
+  "changes": [
+    [ "a", "before", "[" ],
+    [ "a", "after", "]" ]
+  ]
+}
+!! wikitext
+[http://example.com foo]
+!! wikitext/edited
+[[http://example.com foo]]
+!! end
+
 !! test
 Magic links inside links (not autolinked)
 !! wikitext
@@ -29687,10 +29983,10 @@ Magic links inside links (not autolinked)
 PMID 1234
 ISBN 123456789x

    -

    http://example.com -RFC 1234 -PMID 1234 -ISBN 123456789x

    +

    http://example.com +RFC 1234 +PMID 1234 +ISBN 123456789x

    !! end !! test @@ -29706,7 +30002,7 @@ Magic links inside image captions (autolinked) !! html/parsoid -
    http://example.com
    +
    http://example.com
    RFC 1234
    PMID 1234
    ISBN 123456789x
    @@ -30136,7 +30432,7 @@ parsoid=wt2html !! wikitext {{echo|hi}}[http://example.com [[ho]]] !! html/parsoid -

    hiho

    +

    hiho

    !! end !! test @@ -30152,7 +30448,7 @@ Use data-parsoid.firstWikitextNode to compute newline constraints for template c !! options parsoid=html2wt !! html/parsoid -a +a
    d
    @@ -30211,9 +30507,9 @@ parsoid={

    123
    !! html/parsoid -

    +

    -
    123
    +
    123
    !! end # -------------------------------------------- @@ -31216,6 +31512,20 @@ parsoid=html2wt {{echo|foo}} !! end +!! test +Only html p-tag is strong indent pre suppressing +!! options +parsoid=html2wt +!! html/parsoid +

    test2 + test3 +

    +!! wikitext +test2 + test3 + +!! end + # ----------------------------------------------------------------- # End of section for Parsoid-only html2wt tests for serialization # of new content @@ -31771,8 +32081,8 @@ T51672: Test for brackets in attributes of elements in external link texts link span

    !! html/parsoid -

    link span -link span

    +

    link span +link span

    !! end !! test @@ -32869,3 +33179,13 @@ header *foo footer !! end + +!! test +Ensure disambiguation links are marked properly +!! options +parsoid=wt2html +!! wikitext +[[Disambiguation]] +!! html/parsoid +

    Disambiguation

+!! end
-- 
2.20.1