Improved partitioning scheme for refreshLinks jobs
author Aaron Schulz <aschulz@wikimedia.org>
Tue, 19 Nov 2013 00:12:12 +0000 (16:12 -0800)
committer Aaron Schulz <aschulz@wikimedia.org>
Wed, 27 Nov 2013 00:02:44 +0000 (16:02 -0800)
commit 721731f43c4d082d4098f1a3f1fd0f350217f084
tree 348d0cfd4f07405a72ef585595ac6ecf2a11c599
parent 8318bde3a8df0b8dad649b4e05c008333b1404b4

* This changes refreshLinks to handle both per-title (leaf) and backlink jobs.
  The base job now splits into some leaf jobs and a remaining partition job.
  The partition job does the same until there are only a small number
  of backlinks in the remaining range (so only leaf jobs are added).
  Since the leaf jobs are pushed first, this works well for FIFO queues
  to avoid bloating the queue. This also improves per-title job
  de-duplication, which isQueueDeprioritized() pretty much killed.
* The refreshLinks2 class is no longer used for new jobs.
* Fix process cache bug with JobQueueGroup::push with empty arrays.
* This adds a BacklinkJobUtils class with helper functions for partitioning.
* RefreshLinksJob jobs now have a simple version parameter.
* Also moved RefreshLinksJob2 to its own file.
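
The leaf-plus-remainder scheme described in the first bullet can be illustrated with a small simulation. This is only a sketch in Python, not the PHP code from this change: the function name `run_partitioned`, the `batch_size` parameter, and the tuple-based job representation are all hypothetical, and the real batch sizes and helper signatures in BacklinkJobUtils are not shown here.

```python
from collections import deque


def run_partitioned(backlinks, batch_size):
    """Simulate the leaf-plus-remainder scheme on a FIFO queue.

    Hypothetical model: a job is a (kind, payload) tuple, where a "leaf"
    carries one title and a "partition" carries the remaining backlinks.
    Returns (titles in processing order, peak queue length).
    """
    queue = deque([("partition", list(backlinks))])
    processed = []
    peak = len(queue)
    while queue:
        kind, payload = queue.popleft()
        if kind == "leaf":
            processed.append(payload)  # stands in for refreshing one title
            continue
        head, rest = payload[:batch_size], payload[batch_size:]
        # Leaf jobs are pushed first, so a FIFO queue runs them
        # before the remainder job splits again.
        for title in head:
            queue.append(("leaf", title))
        # Only push a remainder job while backlinks are left over.
        if rest:
            queue.append(("partition", rest))
        peak = max(peak, len(queue))
    return processed, peak


processed, peak = run_partitioned(range(10), batch_size=3)
```

In this model the queue never holds more than `batch_size + 1` jobs at a time, which is the "avoid bloating the queue" property the message refers to; a scheme that enqueued every leaf job up front would instead peak at one job per backlink.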

Change-Id: Id378d47df17248ae02938d5a54ef7ecd29efadbd
includes/AutoLoader.php
includes/DefaultSettings.php
includes/deferred/LinksUpdate.php
includes/job/JobQueueGroup.php
includes/job/jobs/RefreshLinksJob.php
includes/job/jobs/RefreshLinksJob2.php [new file with mode: 0644]
includes/job/utils/BacklinkJobUtils.php [new file with mode: 0644]
tests/phpunit/includes/jobqueue/RefreshLinksPartitionTest.php [new file with mode: 0644]