'ArticlePageDataBefore': Before loading data of an article from the database.
&$wikiPage: WikiPage (object) that data will be loaded
&$fields: fields (array) to load from the database
+&$tables: tables (array) to load from the database
+&$joinConds: join conditions (array) to load from the database
'ArticlePrepareTextForEdit': Called when preparing text to be saved.
$wikiPage: the WikiPage being saved
in a Category page. Gives extensions the opportunity to batch load any
related data about the pages.
$type: The category type. Either 'page', 'file' or 'subcat'
-$res: Query result from DatabaseBase::select()
+$res: Query result from Wikimedia\Rdbms\IDatabase::select()
'CategoryViewer::generateLink': Before generating an output link allow
extensions opportunity to generate a more specific or relevant link.
'ChangesListSpecialPageQuery': Called when building SQL query on pages
inheriting from ChangesListSpecialPage (in core: RecentChanges,
RecentChangesLinked and Watchlist).
-
Do not use this to implement individual filters if they are compatible with the
ChangesListFilter and ChangesListFilterGroup structure.
-
Instead, use sub-classes of those classes, in conjunction with the
ChangesListSpecialPageStructuredFilters hook.
-
This hook can be used to implement filters that do not implement that structure,
or custom behavior that is not an individual filter.
$name: name of the special page, e.g. 'Watchlist'
filters for pages inheriting from ChangesListSpecialPage (in core: RecentChanges,
RecentChangesLinked, and Watchlist). Generally, you will want to construct
new ChangesListBooleanFilter or ChangesListStringOptionsFilter objects.
-
When constructing them, you specify which group they belong to. You can reuse
existing groups (accessed through $special->getFilterGroup), or create your own
(ChangesListBooleanFilterGroup or ChangesListStringOptionsFilterGroup).
If you create new groups, you must register them with $special->registerFilterGroup.
+Note that this is called regardless of whether the user is currently using
+the new (structured) or old (unstructured) filter UI. If you want your boolean
+filter to show on both the new and old UI, specify all the supported fields.
+These include showHide, label, and description.
+See the constructor of each ChangesList* class for documentation of supported
+fields.
$special: ChangesListSpecialPage instance
'ChangeTagAfterDelete': Called after a change tag has been deleted (that is,
'GetIP': modify the ip of the current user (called only once).
&$ip: string holding the ip as determined so far
+ 'GetLangPreferredVariant': Called in LanguageConverter#getPreferredVariant() to
+ allow fetching the language variant code from cookies or other such
+ alternative storage.
+ &$req: language variant from the URL (string) or boolean false if no variant
+ was specified in the URL; the value of this variable comes from
+ LanguageConverter#getURLVariant()
+
'GetLinkColours': modify the CSS class of an array of page links.
$linkcolour_ids: array of prefixed DB keys of the pages linked to,
indexed by page_id.
Return false to stop further processing of the tag
$reader: XMLReader object
+'ImportHandleUnknownUser': When a user doesn't exist locally, this hook is called
+to give extensions an opportunity to auto-create it. If the auto-creation is
+successful, return false.
+$name: User name
+
'ImportHandleUploadXMLTag': When parsing a XML tag in a file upload.
Return false to stop further processing of the tag
$reader: XMLReader object
callable here. The callable is passed the ParserOptions object and the option
name.
+'ParserOutputPostCacheTransform': Called from ParserOutput::getText() to do
+post-cache transforms.
+$parserOutput: The ParserOutput object.
+&$text: The text being transformed, before core transformations are done.
+&$options: The options array being used for the transformation.
+
'ParserSectionCreate': Called each time the parser creates a document section
from wikitext. Use this to apply per-section modifications to HTML (like
wrapping the section in a DIV). Caveat: DIVs are valid wikitext, and a DIV
or request state must be added through MakeGlobalVariablesScript instead.
&$vars: array( variable name => value )
-'ResourceLoaderGetLessVars': Called in ResourceLoader::getLessVars after
-variables from $wgResourceLoaderLESSVars are added. Can be used to add
-context-based variables.
+'ResourceLoaderGetLessVars': DEPRECATED! Called in ResourceLoader::getLessVars
+to add global LESS variables. Loaded after $wgResourceLoaderLESSVars is added.
+Global LESS variables are deprecated. Use ResourceLoaderModule::getLessVars()
+instead to expose variables only in modules that need them.
&$lessVars: array of variables already added
'ResourceLoaderJqueryMsgModuleMagicWords': Called in
added to any module.
&$ResourceLoader: object
-'RevisionInsertComplete': Called after a revision is inserted into the database.
-&$revision: the Revision
-$data: the data stored in old_text. The meaning depends on $flags: if external
- is set, it's the URL of the revision text in external storage; otherwise,
- it's the revision text itself. In either case, if gzip is set, the revision
- text is gzipped.
-$flags: a comma-delimited list of strings representing the options used. May
- include: utf8 (this will always be set for new revisions); gzip; external.
+'RevisionRecordInserted': Called after a revision is inserted into the database.
+$revisionRecord: the RevisionRecord that has just been inserted.
+
+'RevisionInsertComplete': DEPRECATED! Use RevisionRecordInserted hook instead.
+Called after a revision is inserted into the database.
+$revision: the Revision
+$data: DEPRECATED! Always null!
+$flags: DEPRECATED! Always null!
'SearchableNamespaces': An option to modify which namespaces are searchable.
&$arr: Array of namespaces ($nsId => $name) which will be used.
$terms: String of the search terms entered
$specialSearch: The SpecialSearch object
&$query: Array of query string parameters for the link representing the search result.
+&$attributes: Array of title link attributes, can be modified by extension.
'SidebarBeforeOutput': Allows to edit sidebar just before it is output by skins.
Warning: This hook is run on each display. You should consider to use
instead.
&$form: UploadForm object
+'UploadForm:getInitialPageText': After the initial page text for file uploads
+is generated, to allow it to be altered.
+&$pageText: the page text
+$msg: array of header messages
+$config: Config object
+
'UploadForm:initial': Before the upload form is generated. You might set the
member-variables $uploadFormTextTop and $uploadFormTextAfterSummary to inject
text (HTML) either before or after the editform.
$add: Array of strings corresponding to groups added
$remove: Array of strings corresponding to groups removed
-'UserSaveOptions': Called just before saving user preferences/options.
-$user: User object
-&$options: Options, modifiable
+'UserSaveOptions': Called just before saving user preferences. Hook handlers can either add or
+manipulate options, or reset one back to its default to block changing it. Hook handlers are also
+allowed to abort the process by returning false, e.g. to save to a global profile instead. Compare
+to the UserSaveSettings hook, which is called after the preferences have been saved.
+$user: The User for which the options are going to be saved
+&$options: The user's options as an associative array, modifiable
-'UserSaveSettings': Called when saving user settings.
-$user: User object
+'UserSaveSettings': Called directly after user preferences (user_properties in the database) have
+been saved. Compare to the UserSaveOptions hook, which is called before.
+$user: The User for which the options have been saved
'UserSetCookies': DEPRECATED! If you're trying to replace core session cookie
handling, you want to create a subclass of MediaWiki\Session\CookieSessionProvider
&$opts: Options to use for the query
&$join: Join conditions
-'WikiPageDeletionUpdates': manipulate the list of DataUpdates to be applied when
+'WikiPageDeletionUpdates': manipulate the list of DeferrableUpdates to be applied when
a page is deleted. Called in WikiPage::getDeletionUpdates(). Note that updates
specific to a content model should be provided by the respective Content's
getDeletionUpdates() method.
$page: the WikiPage
-$content: the Content to generate updates for (or null, if the Content could not be loaded
-due to an error)
-&$updates: the array of DataUpdate objects. Hook function may want to add to it.
+$content: the Content to generate updates for, or null in case the page revision could not be
+ loaded. The delete will succeed despite this.
+&$updates: the array of objects that implement DeferrableUpdate. Hook function may want to add to
+ it.
'WikiPageFactory': Override WikiPage class used for a title
$title: Title of the page
*/
use MediaWiki\MediaWikiServices;
+use MediaWiki\Logger\LoggerFactory;
+
/**
* Base class for language conversion.
* @ingroup Language
*/
static public $languagesWithVariants = [
'en',
+ 'crh',
'gan',
'iu',
'kk',
$req = $this->getURLVariant();
+ Hooks::run( 'GetLangPreferredVariant', [ &$req ] );
+
if ( $wgUser->isSafeToLoad() && $wgUser->isLoggedIn() && !$req ) {
$req = $this->getUserVariant();
} elseif ( !$req ) {
if ( $this->guessVariant( $text, $toVariant ) ) {
return $text;
}
-
/* we convert everything except:
- * 1. HTML markups (anything between < and >)
- * 2. HTML entities
- * 3. placeholders created by the parser
- */
- $marker = '|' . Parser::MARKER_PREFIX . '[\-a-zA-Z0-9]+';
+ 1. HTML markups (anything between < and >)
+ 2. HTML entities
+ 3. placeholders created by the parser
+ IMPORTANT: Beware of failure from pcre.backtrack_limit (T124404).
+ Minimize use of backtracking where possible.
+ */
+ $marker = '|' . Parser::MARKER_PREFIX . '[^\x7f]++\x7f';
// this one is needed when the text is inside an HTML markup
- $htmlfix = '|<[^>]+$|^[^<>]*>';
+ $htmlfix = '|<[^>\004]++(?=\004$)|^[^<>]*+>';
+
+ // Optimize for the common case where these tags have
+ // few or no children. Thus try and possessively get as much as
+ // possible, and only engage in backtracking when we hit a '<'.
// disable convert to variants between <code> tags
- $codefix = '<code>.+?<\/code>|';
+ $codefix = '<code>[^<]*+(?:(?:(?!<\/code>).)[^<]*+)*+<\/code>|';
// disable conversion of <script> tags
- $scriptfix = '<script.*?>.*?<\/script>|';
+ $scriptfix = '<script[^>]*+>[^<]*+(?:(?:(?!<\/script>).)[^<]*+)*+<\/script>|';
// disable conversion of <pre> tags
- $prefix = '<pre.*?>.*?<\/pre>|';
+ $prefix = '<pre[^>]*+>[^<]*+(?:(?:(?!<\/pre>).)[^<]*+)*+<\/pre>|';
+ // The "|.*+)" at the end is in case we missed some part of the HTML syntax;
+ // if so, we fail securely (hopefully) by matching the rest of the string.
+ $htmlFullTag = '<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)|';
- $reg = '/' . $codefix . $scriptfix . $prefix .
- '<[^>]+>|&[a-zA-Z#][a-z0-9]+;' . $marker . $htmlfix . '/s';
+ $reg = '/' . $codefix . $scriptfix . $prefix . $htmlFullTag .
+ '&[a-zA-Z#][a-z0-9]++;' . $marker . $htmlfix . '|\004$/s';
$startPos = 0;
$sourceBlob = '';
$literalBlob = '';
// Guard against delimiter nulls in the input
// (should never happen: see T159174)
$text = str_replace( "\000", '', $text );
+ $text = str_replace( "\004", '', $text );
$markupMatches = null;
$elementMatches = null;
+
+ // We add a marker (\004) at the end of the text to ensure we always match the
+ // entire text (otherwise, pcre.backtrack_limit might cause silent failure).
while ( $startPos < strlen( $text ) ) {
- if ( preg_match( $reg, $text, $markupMatches, PREG_OFFSET_CAPTURE, $startPos ) ) {
+ if ( preg_match( $reg, $text . "\004", $markupMatches, PREG_OFFSET_CAPTURE, $startPos ) ) {
$elementPos = $markupMatches[0][1];
$element = $markupMatches[0][0];
+ if ( $element === "\004" ) {
+ // We hit the end.
+ $elementPos = strlen( $text );
+ $element = '';
+ } elseif ( substr( $element, -1 ) === "\004" ) {
+ // This can sometimes happen if we have
+ // unclosed HTML tags, for example when
+ // converting a title attribute during a
+ // recursive call and its decoded value
+ // contains a "<", e.g. <div title="&lt;">.
+ $element = substr( $element, 0, -1 );
+ }
} else {
- $elementPos = strlen( $text );
- $element = '';
+ // If we hit here, then Language Converter could be tricked
+ // into doing an XSS, so we refuse to translate.
+ // If non-crazy input manages to reach this code path,
+ // we should consider it a bug.
+ $log = LoggerFactory::getInstance( 'languageconverter' );
+ $log->error( "Hit pcre.backtrack_limit in " . __METHOD__
+ . ". Disabling language conversion for this page.",
+ [
+ "method" => __METHOD__,
+ "variant" => $toVariant,
+ "startOfText" => substr( $text, 0, 500 )
+ ]
+ );
+ return $text;
}
-
// Queue the part before the markup for translation in a batch
$sourceBlob .= substr( $text, $startPos, $elementPos - $startPos ) . "\000";
// Translate any alt or title attributes inside the matched element
if ( $element !== ''
- && preg_match( '/^(<[^>\s]*)\s([^>]*)(.*)$/', $element, $elementMatches )
+ && preg_match( '/^(<[^>\s]*+)\s([^>]*+)(.*+)$/', $element, $elementMatches )
) {
+ // FIXME: this decodes entities, so if you have something
+ // like <div title="foo&lt;bar">, the "bar" won't get
+ // translated: after entity decoding the value looks like
+ // unclosed HTML, and we call this method recursively
+ // on attributes.
$attrs = Sanitizer::decodeTagAttributes( $elementMatches[2] );
+ // Ensure self-closing tags stay self-closing.
+ $close = substr( $elementMatches[2], -1 ) === '/' ? ' /' : '';
$changed = false;
foreach ( [ 'title', 'alt' ] as $attrName ) {
if ( !isset( $attrs[$attrName] ) ) {
}
if ( $changed ) {
$element = $elementMatches[1] . Html::expandAttributes( $attrs ) .
- $elementMatches[3];
+ $close . $elementMatches[3];
}
}
$literalBlob .= $element . "\000";
$out = '';
$length = strlen( $text );
$shouldConvert = !$this->guessVariant( $text, $variant );
-
- while ( $startPos < $length ) {
- $pos = strpos( $text, '-{', $startPos );
-
- if ( $pos === false ) {
+ $continue = 1;
+
+ $noScript = '<script.*?>.*?<\/script>(*SKIP)(*FAIL)';
+ $noStyle = '<style.*?>.*?<\/style>(*SKIP)(*FAIL)';
+ // phpcs:ignore Generic.Files.LineLength
+ $noHtml = '<(?:[^>=]*+(?>[^>=]*+=\s*+(?:"[^"]*"|\'[^\']*\'|[^\'">\s]*+))*+[^>=]*+>|.*+)(*SKIP)(*FAIL)';
+ while ( $startPos < $length && $continue ) {
+ $continue = preg_match(
+ // Only match -{ outside of html.
+ "/$noScript|$noStyle|$noHtml|-\{/",
+ $text,
+ $m,
+ PREG_OFFSET_CAPTURE,
+ $startPos
+ );
+
+ if ( !$continue ) {
// No more markup, append final segment
$fragment = substr( $text, $startPos );
$out .= $shouldConvert ? $this->autoConvert( $fragment, $variant ) : $fragment;
return $out;
}
- // Markup found
+ // Offset of the match of the regex pattern.
+ $pos = $m[0][1];
+
// Append initial segment
$fragment = substr( $text, $startPos, $pos - $startPos );
$out .= $shouldConvert ? $this->autoConvert( $fragment, $variant ) : $fragment;
-
- // Advance position
+ // -{ marker found, not in attribute
+ // Advance position up to -{ marker.
$startPos = $pos;
-
// Do recursive conversion
+ // Note: This passes $startPos by reference, and advances it.
$out .= $this->recursiveConvertRule( $text, $variant, $startPos, $depth + 1 );
}
-
return $out;
}
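
The hunk above replaces the strpos() scan with a regex that uses the PCRE verbs (*SKIP)(*FAIL) to step over HTML before looking for "-{". As a minimal sketch of that technique in isolation: the helper name findRuleMarkerOutsideHtml() is hypothetical, and the tag pattern is a deliberately simplified stand-in for $noHtml (it does not handle a ">" inside a quoted attribute value, and it omits the script/style alternatives).

```php
<?php
/**
 * Hypothetical helper, not part of the patch: return the byte offset of the
 * first "-{" at or after $startPos that lies outside an HTML tag, or false.
 */
function findRuleMarkerOutsideHtml( string $text, int $startPos = 0 ) {
	// Simplified stand-in for $noHtml: match a whole tag possessively, then
	// (*SKIP)(*FAIL) rejects that match and resumes the search after the tag,
	// so a "-{" inside the tag can never be reported.
	$noHtml = '<[^>]*+>(*SKIP)(*FAIL)';
	$found = preg_match( "/$noHtml|-\{/", $text, $m, PREG_OFFSET_CAPTURE, $startPos );
	// With PREG_OFFSET_CAPTURE, $m[0] is [ matched string, byte offset ].
	return $found === 1 ? $m[0][1] : false;
}
```

For the input '<a title="-{x}-">-{y}-</a>' this returns 17, the offset of the "-{" in the element content; the one inside the title attribute is skipped. The caller in the patch then hands that offset to recursiveConvertRule(), which advances $startPos by reference.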
*
* @param string $text Text to be converted
* @param string $variant The target variant code
- * @param int $startPos
+ * @param int &$startPos
* @param int $depth Depth of recursion
*
* @throws MWException