Html: Reject </script> from inlineScript() and leave rest unescaped
There are three problems with the CDATA approach:
1. It doesn't work.
HTML5 already interprets the contents of <script> tags as CDATA,
which means escaping of characters like & is not needed. In fact,
in HTML5 mode, a plain script tag with <script>0&1;</script>
would be a syntax error. Indicating it is not interpreted as
text, but as CDATA. Effectively, the only thing an HTML parser
looks for is </script>.
And that's exactly the problem. Producing an inline script
containing the characters "</string>" for legitimate reasons,
is currently broken.
No alternate wrapping or setting can make it work, either.
See also:
https://people.wikimedia.org/~krinkle/200506-html-inlinescript.html
which contains:
<script>/*<![CDATA[*/
if (true && true) {
console.log('This is a <script></script> tag (original)');
}
/*]]>*/</script>
In a browser, the script is terminated by the first "</script>",
leaving the code unfinished, throwing a SyntaxError, and outputting
the rest of the script as plain text on the page.
2. CDATA is only for XML mode, whereas MediaWiki does not support
the XML/XHTML output mode (since MediaWiki 1.22). Instead, we only
output HTML (5). Code that does need to produce XML, should use the
class from Xml.php instead.
3. It gives a false sense of security.
We could just remove the CDATA code as-is and that in itself would be an
improvement per point 2 and 3, and would break nothing per point 1.
However, this commit attempts to address the underlying bug by rejecting
the characters "</script>" from input. If this is needed in a literal,
it is the responsibility of the caller to escape it in a way that is
appropiate for how it is used (string, comment, regex, etc.).
There are two ways this can be used currently in core:
* User input as exported through JSON (e.g. mw.config, or mw.messages).
This is already fine as both FormatJson::encode and json_encode handle
escape either < or / in the string by default already.
* Previews of edits to user scripts. This is currently already broken and
causes the script to end early and produce arbitrary HTML on the page.
This commit limits the impact by refusing to output such script in a
broken way. I will further address that use case in a follow-up.
Bug: T200506
Change-Id: I67ceb34eabf2f62fd3f3841b8f1459289fad28fb