From: Brion Vibber Date: Tue, 12 Apr 2005 00:41:38 +0000 (+0000) Subject: Change .doc extension to .txt so people stop asking why we have Word documents. WE... X-Git-Tag: 1.5.0alpha1~296 X-Git-Url: http://git.cyclocoop.org/%24image?a=commitdiff_plain;h=27b500c4aa62a5ea7e60a987a3c43edf4d9db59c;p=lhc%2Fweb%2Fwiklou.git Change .doc extension to .txt so people stop asking why we have Word documents. WE DONT THEY ARE TEXT!!!!111eleven --- diff --git a/docs/deferred.doc b/docs/deferred.doc deleted file mode 100644 index 425395fa63..0000000000 --- a/docs/deferred.doc +++ /dev/null @@ -1,19 +0,0 @@ - -DEFERRED.DOC - -A few of the database updates required by various functions here -can be deferred until after the result page is displayed to the -user. For example, updating the view counts, updating the -linked-to tables after a save, etc. PHP does not yet have any -way to tell the server to actually return and disconnect while -still running these updates (as a Java servlet could), but it -might have such a feature in the future. - -We handle these by creating a deferred-update object (in a real -O-O language these would be classes that implement an interface) -and putting those objects on a global list, then executing the -whole list after the page is displayed. We don't do anything -smart like collating updates to the same table or such because -the list is almost always going to have just one item on it, if -that, so it's not worth the trouble. - diff --git a/docs/deferred.txt b/docs/deferred.txt new file mode 100644 index 0000000000..425395fa63 --- /dev/null +++ b/docs/deferred.txt @@ -0,0 +1,19 @@ + +DEFERRED.DOC + +A few of the database updates required by various functions here +can be deferred until after the result page is displayed to the +user. For example, updating the view counts, updating the +linked-to tables after a save, etc. 
PHP does not yet have any +way to tell the server to actually return and disconnect while +still running these updates (as a Java servlet could), but it +might have such a feature in the future. + +We handle these by creating a deferred-update object (in a real +O-O language these would be classes that implement an interface) +and putting those objects on a global list, then executing the +whole list after the page is displayed. We don't do anything +smart like collating updates to the same table or such because +the list is almost always going to have just one item on it, if +that, so it's not worth the trouble. + diff --git a/docs/design.doc b/docs/design.doc deleted file mode 100644 index 8adff443f0..0000000000 --- a/docs/design.doc +++ /dev/null @@ -1,128 +0,0 @@ -This is a brief overview of the new design. - -Primary source files/objects: - - index.php - Main script. It creates the necessary global objects and parses - the URL to determine what to do, which it then generally passes - off to somebody else (depending on the action to be taken). - - All of the functions to which it might delegate generally do - their job by sending content to the $wgOut object. After returning, - the script flushes that out by calling $wgOut->output(). If there - are any changes that need to be made to the database that can be - deferred until after page display, those happen at the end. - - Note that the order in the includes is touchy; Language uses - some global functions, etc. Likewise with the creation of the - global variables. Don't move them around without some forethought. - - User - Encapsulates the state of the user viewing/using the site. - Can be queried for things like the user's settings, name, etc. - Handles the details of getting and saving to the "user" table - of the database, and dealing with sessions and cookies. - More details in USER.DOC. - - OutputPage - Encapsulates the entire HTML page that will be sent in - response to any server request. 
It is used by calling its - functions to add text, headers, etc., in any order, and then - calling output() to send it all. It could be easily changed - to send incrementally if that becomes useful, but I prefer - the flexibility. This should also do the output encoding. - The system allocates a global one in $wgOut. This class - also handles converting wikitext format to HTML. - - Title - Represents the title of an article, and does all the work - of translating among various forms such as plain text, URL, - database key, etc. For convenience, and for historical - reasons, it also represents a few features of articles that - don't involve their text, such as access rights. - - Article - Encapsulates access to the "cur" table of the database. The - object represents a Wikipedia article, and maintains state - such as text (in Wikitext format), flags, etc. - - Skin - Encapsulates a "look and feel" for the wiki. All of the - functions that render HTML, and make choices about how to - render it, are here, and are called from various other - places when needed (most notably, OutputPage::addWikiText()). - The StandardSkin object is a complete implementation, and is - meant to be subclassed with other skins that may override - some of its functions. The User object contains a reference - to a skin (according to that user's preference), and so - rather than having a global skin object we just rely on the - global User and get the skin with $wgUser->getSkin(). - - Language - Represents the language used for incidental text, and also - has some character encoding functions and other locale stuff. - A global one is allocated in $wgLang. - - LinkCache - Keeps information on existence of articles. See LINKCACHE.DOC. - -Naming/coding conventions: - - These are meant to be descriptive, not dictatorial; I won't - presume to tell you how to program, I'm just describing the - methods I chose to use for myself. 
If you do choose to - follow these guidelines, it will probably be easier for you - to collaborate with others on the project, but if you want - to contribute without bothering, by all means do so (and don't - be surprised if I reformat your code). - - - I have the code indented with tabs to save file size and - so that users can set their tab stops to any depth they like. - I use 4-space tab stops, which work well. I also use K&R brace - matching style. I know that's a religious issue for some, - so if you want to use a style that puts opening braces on the - next line, that's OK too, but please don't use a style where - closing braces don't align with either the opening brace on - its own line or the statement that opened the block--that's - confusing as hell. - - - PHP doesn't have "private" member variables or functions, - so I've used the comment "/* private */" in some places to - indicate my intent. Don't access things marked that way - from outside the class def--use the accessor functions (or - make your own if you need them). Yes, even some globals - are marked private, because PHP is broken and doesn't - allow static class variables. - - - Member variables are generally "mXxx" to distinguish them. - This should make it easier to spot errors of forgetting the - required "$this->", which PHP will happily accept by creating - a new local variable rather than complaining. - - - Globals are particularly evil in PHP; it sets a lot of them - automatically from cookies, query strings, and such, leading to - namespace conflicts; when a variable name is used in a function, - it is silently declared as a new local masking the global, so - you'll get weird errors because you forgot the global declaration; - lack of static class member variables means you have to use - globals for them, etc. Evil, evil. 
- - I think I've managed to pare down the number of globals we use - to a scant few dozen or so, and I've prefixed them all with "wg" - so you can spot errors better (odds are, if you see a "wg" - variable being used in a function that doesn't declare it global, - that's probably an error). - - Other conventions: Top-level functions are wfFuncname(), names - of session variables are wsName, cookies wcName, and form field - values wpName ("p" for "POST"). - - - Be kind to your release manager and don't use CVS keywords (Id, - Revision, etc.) to mark file versions. They make merging code - between different branches a pain for CVS, and are kind of sketchy - for versions after that. (Yes, you can use the '-kk' flag so that - merges ignore keywords, but that messes up binary files. See - https://www.cvshome.org/docs/manual/cvs-1.11.18/cvs_5.html#SEC64). - - - \ No newline at end of file diff --git a/docs/design.txt b/docs/design.txt new file mode 100644 index 0000000000..8adff443f0 --- /dev/null +++ b/docs/design.txt @@ -0,0 +1,128 @@ +This is a brief overview of the new design. + +Primary source files/objects: + + index.php + Main script. It creates the necessary global objects and parses + the URL to determine what to do, which it then generally passes + off to somebody else (depending on the action to be taken). + + All of the functions to which it might delegate generally do + their job by sending content to the $wgOut object. After returning, + the script flushes that out by calling $wgOut->output(). If there + are any changes that need to be made to the database that can be + deferred until after page display, those happen at the end. + + Note that the order in the includes is touchy; Language uses + some global functions, etc. Likewise with the creation of the + global variables. Don't move them around without some forethought. + + User + Encapsulates the state of the user viewing/using the site. + Can be queried for things like the user's settings, name, etc. 
+ Handles the details of getting and saving to the "user" table + of the database, and dealing with sessions and cookies. + More details in USER.DOC. + + OutputPage + Encapsulates the entire HTML page that will be sent in + response to any server request. It is used by calling its + functions to add text, headers, etc., in any order, and then + calling output() to send it all. It could be easily changed + to send incrementally if that becomes useful, but I prefer + the flexibility. This should also do the output encoding. + The system allocates a global one in $wgOut. This class + also handles converting wikitext format to HTML. + + Title + Represents the title of an article, and does all the work + of translating among various forms such as plain text, URL, + database key, etc. For convenience, and for historical + reasons, it also represents a few features of articles that + don't involve their text, such as access rights. + + Article + Encapsulates access to the "cur" table of the database. The + object represents a Wikipedia article, and maintains state + such as text (in Wikitext format), flags, etc. + + Skin + Encapsulates a "look and feel" for the wiki. All of the + functions that render HTML, and make choices about how to + render it, are here, and are called from various other + places when needed (most notably, OutputPage::addWikiText()). + The StandardSkin object is a complete implementation, and is + meant to be subclassed with other skins that may override + some of its functions. The User object contains a reference + to a skin (according to that user's preference), and so + rather than having a global skin object we just rely on the + global User and get the skin with $wgUser->getSkin(). + + Language + Represents the language used for incidental text, and also + has some character encoding functions and other locale stuff. + A global one is allocated in $wgLang. + + LinkCache + Keeps information on existence of articles. See LINKCACHE.DOC. 
+ +Naming/coding conventions: + + These are meant to be descriptive, not dictatorial; I won't + presume to tell you how to program, I'm just describing the + methods I chose to use for myself. If you do choose to + follow these guidelines, it will probably be easier for you + to collaborate with others on the project, but if you want + to contribute without bothering, by all means do so (and don't + be surprised if I reformat your code). + + - I have the code indented with tabs to save file size and + so that users can set their tab stops to any depth they like. + I use 4-space tab stops, which work well. I also use K&R brace + matching style. I know that's a religious issue for some, + so if you want to use a style that puts opening braces on the + next line, that's OK too, but please don't use a style where + closing braces don't align with either the opening brace on + its own line or the statement that opened the block--that's + confusing as hell. + + - PHP doesn't have "private" member variables or functions, + so I've used the comment "/* private */" in some places to + indicate my intent. Don't access things marked that way + from outside the class def--use the accessor functions (or + make your own if you need them). Yes, even some globals + are marked private, because PHP is broken and doesn't + allow static class variables. + + - Member variables are generally "mXxx" to distinguish them. + This should make it easier to spot errors of forgetting the + required "$this->", which PHP will happily accept by creating + a new local variable rather than complaining. 
+ + - Globals are particularly evil in PHP; it sets a lot of them + automatically from cookies, query strings, and such, leading to + namespace conflicts; when a variable name is used in a function, + it is silently declared as a new local masking the global, so + you'll get weird errors because you forgot the global declaration; + lack of static class member variables means you have to use + globals for them, etc. Evil, evil. + + I think I've managed to pare down the number of globals we use + to a scant few dozen or so, and I've prefixed them all with "wg" + so you can spot errors better (odds are, if you see a "wg" + variable being used in a function that doesn't declare it global, + that's probably an error). + + Other conventions: Top-level functions are wfFuncname(), names + of session variables are wsName, cookies wcName, and form field + values wpName ("p" for "POST"). + + - Be kind to your release manager and don't use CVS keywords (Id, + Revision, etc.) to mark file versions. They make merging code + between different branches a pain for CVS, and are kind of sketchy + for versions after that. (Yes, you can use the '-kk' flag so that + merges ignore keywords, but that messes up binary files. See + https://www.cvshome.org/docs/manual/cvs-1.11.18/cvs_5.html#SEC64). + + + \ No newline at end of file diff --git a/docs/globals.doc b/docs/globals.doc deleted file mode 100644 index ac1a609163..0000000000 --- a/docs/globals.doc +++ /dev/null @@ -1,29 +0,0 @@ -GLOBALS.DOC - -PHP loves globals. I hate them. This is not a great -combination, but I manage. I could get rid of most of -them by having a single "HTTP request" object, and using -it to hold everything that's now global (which is exactly -what I'd do in a Java servlet). But that's really -awkward in PHP, and wouldn't really provide much benefit -in readability or maintainability, so I go with the flow -of PHP and use globals. Here's documentation on the -important globals used by the system. 
- -$wgOut - OutputPage object for HTTP response. - -$wgTitle - Title object created from the request URL. - -$wgLang - Language object for this request. - -$wgArticle - Article object corresponding to $wgTitle. - -$wgLinkCache - LinkCache object. - -... - diff --git a/docs/globals.txt b/docs/globals.txt new file mode 100644 index 0000000000..ac1a609163 --- /dev/null +++ b/docs/globals.txt @@ -0,0 +1,29 @@ +GLOBALS.DOC + +PHP loves globals. I hate them. This is not a great +combination, but I manage. I could get rid of most of +them by having a single "HTTP request" object, and using +it to hold everything that's now global (which is exactly +what I'd do in a Java servlet). But that's really +awkward in PHP, and wouldn't really provide much benefit +in readability or maintainability, so I go with the flow +of PHP and use globals. Here's documentation on the +important globals used by the system. + +$wgOut + OutputPage object for HTTP response. + +$wgTitle + Title object created from the request URL. + +$wgLang + Language object for this request. + +$wgArticle + Article object corresponding to $wgTitle. + +$wgLinkCache + LinkCache object. + +... + diff --git a/docs/hooks.doc b/docs/hooks.doc deleted file mode 100644 index 78b66ba31d..0000000000 --- a/docs/hooks.doc +++ /dev/null @@ -1,338 +0,0 @@ -HOOKS.DOC - -This document describes how event hooks work in MediaWiki; how to add -hooks for an event; and how to run hooks for an event. - -==Glossary== - -event - Something that happens with the wiki. For example: a user logs - in. A wiki page is saved. A wiki page is deleted. Often there are - two events associated with a single action: one before the code - is run to make the event happen, and one after. Each event has a - name, preferably in CamelCase. For example, 'UserLogin', - 'ArticleSave', 'ArticleSaveComplete', 'ArticleDelete'. - -hook - A clump of code and data that should be run when an event - happens. 
This can be either a function and a chunk of data, or an - object and a method. - -hook function - The function part of a hook. - -==Rationale== - -Hooks allow us to decouple optionally-run code from code that is run -for everyone. It allows MediaWiki hackers, third-party developers and -local administrators to define code that will be run at certain points -in the mainline code, and to modify the data run by that mainline -code. Hooks can keep mainline code simple, and make it easier to -write extensions. Hooks are a principled alternative to local patches. - -Consider, for example, two options in MediaWiki. One reverses the -order of a title before displaying the article; the other converts the -title to all uppercase letters. Currently, in MediaWiki code, we -would handle this as follows (note: not real code, here): - - function showAnArticle($article) { - global $wgReverseTitle, $wgCapitalizeTitle; - - if ($wgReverseTitle) { - wfReverseTitle($article); - } - - if ($wgCapitalizeTitle) { - wfCapitalizeTitle($article); - } - - # code to actually show the article goes here - } - -An extension writer, or a local admin, will often add custom code to -the function -- with or without a global variable. For example, -someone wanting email notification when an article is shown may add: - - function showAnArticle($article) { - global $wgReverseTitle, $wgCapitalizeTitle, $wgNotifyArticle; - - if ($wgReverseTitle) { - wfReverseTitle($article); - } - - if ($wgCapitalizeTitle) { - wfCapitalizeTitle($article); - } - - # code to actually show the article goes here - - if ($wgNotifyArticle) { - wfNotifyArticleShow($article); - } - } - -Using a hook-running strategy, we can avoid having all this -option-specific stuff in our mainline code. 
Using hooks, the function -becomes: - - function showAnArticle($article) { - - if (wfRunHooks('ArticleShow', array(&$article))) { - - # code to actually show the article goes here - - wfRunHooks('ArticleShowComplete', array(&$article)); - } - } - -We've cleaned up the code here by removing clumps of weird, -infrequently used code and moving them off somewhere else. It's much -easier for someone working with this code to see what's _really_ going -on, and make changes or fix bugs. - -In addition, we can take all the code that deals with the little-used -title-reversing options (say) and put it in one place. Instead of -having little title-reversing if-blocks spread all over the codebase -in showAnArticle, deleteAnArticle, exportArticle, etc., we can -concentrate it all in an extension file: - - function reverseArticleTitle($article) { - # ... - } - - function reverseForExport($article) { - # ... - } - -The setup function for the extension just has to add its hook -functions to the appropriate events: - - function setupTitleReversingExtension() { - global $wgHooks; - - $wgHooks['ArticleShow'][] = 'reverseArticleTitle'; - $wgHooks['ArticleDelete'][] = 'reverseArticleTitle'; - $wgHooks['ArticleExport'][] = 'reverseForExport'; - } - -Having all this code related to the title-reversion option in one -place means that it's easier to read and understand; you don't have to -do a grep-find to see where the $wgReverseTitle variable is used, say. - -If the code is well enough isolated, it can even be excluded when not -used -- making for some slight savings in memory and load-up -performance at runtime. Admins who want to have all the reversed -titles can add: - - require_once('extensions/ReverseTitle.php'); - -...to their LocalSettings.php file; those of us who don't want or need -it can just leave it out. - -The extensions don't even have to be shipped with MediaWiki; they -could be provided by a third-party developer or written by the admin -him/herself. 
- -==Writing hooks== - -A hook is a chunk of code run at some particular event. It consists of: - - * a function with some optional accompanying data, or - * an object with a method and some optional accompanying data. - -Hooks are registered by adding them to the global $wgHooks array for a -given event. All the following are valid ways to define hooks: - - $wgHooks['EventName'][] = 'someFunction'; # function, no data - $wgHooks['EventName'][] = array('someFunction', $someData); - $wgHooks['EventName'][] = array('someFunction'); # weird, but OK - - $wgHooks['EventName'][] = $object; # object only - $wgHooks['EventName'][] = array($object, 'someMethod'); - $wgHooks['EventName'][] = array($object, 'someMethod', $someData); - $wgHooks['EventName'][] = array($object); # weird but OK - -When an event occurs, the function (or object method) will be called -with the optional data provided as well as event-specific parameters. -The above examples would result in the following code being executed -when 'EventName' happened: - - # function, no data - someFunction($param1, $param2) - # function with data - someFunction($someData, $param1, $param2) - - # object only - $object->onEventName($param1, $param2) - # object with method - $object->someMethod($param1, $param2) - # object with method and data - $object->someMethod($someData, $param1, $param2) - -Note that when an object is the hook, and there's no specified method, -the default method called is 'onEventName'. For different events this -would be different: 'onArticleSave', 'onUserLogin', etc. - -The extra data is useful if we want to use the same function or object -for different purposes. For example: - - $wgHooks['ArticleSaveComplete'][] = array('ircNotify', 'TimStarling'); - $wgHooks['ArticleSaveComplete'][] = array('ircNotify', 'brion'); - -This code would result in ircNotify being run twice when an article is -saved: once for 'TimStarling', and once for 'brion'. 
- -Hooks can return three possible values: - - * true: the hook has operated successfully - * "some string": an error occurred; processing should - stop and the error should be shown to the user - * false: the hook has successfully done the work - necessary and the calling function should skip - -The last result would be for cases where the hook function replaces -the main functionality. For example, if you wanted to authenticate -users to a custom system (LDAP, another PHP program, whatever), you -could do: - - $wgHooks['UserLogin'][] = array('ldapLogin', $ldapServer); - - function ldapLogin($username, $password) { - # log user into LDAP - return false; - } - -Returning false makes less sense for events where the action is -complete, and will normally be ignored. - -==Using hooks== - -A calling function or method uses the wfRunHooks() function to run -the hooks related to a particular event, like so: - - class Article { - # ... - function protect() { - global $wgUser; - if (wfRunHooks('ArticleProtect', array(&$this, &$wgUser))) { - # protect the article - wfRunHooks('ArticleProtectComplete', array(&$this, &$wgUser)); - } - } - -wfRunHooks() returns true if the calling function should continue -processing (the hooks ran OK, or there are no hooks to run), or false -if it shouldn't (an error occurred, or one of the hooks handled the -action already). Checking the return value matters more for "before" -hooks than for "complete" hooks. - -Note that hook parameters are passed in an array; this is a necessary -inconvenience to make it possible to pass reference values (that can -be changed) into the hook code. Also note that earlier versions of -wfRunHooks took a variable number of arguments; the array() calling -protocol came about after MediaWiki 1.4rc1. - -==Events and parameters== - -This is a list of known events and parameters; please add to it if -you're going to add events to the MediaWiki code. 
- -'ArticleDelete': before an article is deleted -$article: the article (object) being deleted -$user: the user (object) deleting the article -$reason: the reason (string) the article is being deleted - -'ArticleDeleteComplete': after an article is deleted -$article: the article that was deleted -$user: the user that deleted the article -$reason: the reason the article was deleted - -'ArticleProtect': before an article is protected -$article: the article being protected -$user: the user doing the protection -$protect: boolean whether this is a protect or an unprotect -$reason: Reason for protect -$moveonly: boolean whether this is for move only or not - -'ArticleProtectComplete': after an article is protected -$article: the article that was protected -$user: the user who did the protection -$protect: boolean whether it was a protect or an unprotect -$reason: Reason for protect -$moveonly: boolean whether it was for move only or not - -'ArticleSave': before an article is saved -$article: the article (object) being saved -$user: the user (object) saving the article -$text: the new article text -$summary: the article summary (comment) -$isminor: minor flag -$iswatch: watch flag -$section: section # - -'ArticleSaveComplete': after an article is saved -$article: the article (object) saved -$user: the user (object) who saved the article -$text: the new article text -$summary: the article summary (comment) -$isminor: minor flag -$iswatch: watch flag -$section: section # - -'BlockIp': before an IP address or user is blocked -$block: the Block object about to be saved -$user: the user _doing_ the block (not the one being blocked) - -'BlockIpComplete': after an IP address or user is blocked -$block: the Block object that was saved -$user: the user who did the block (not the one being blocked) - -'EmailUser': before sending email from one user to another -$to: address of receiving user -$from: address of sending user -$subject: subject of the mail -$text: text of the mail - 
-'EmailUserComplete': after sending email from one user to another -$to: address of receiving user -$from: address of sending user -$subject: subject of the mail -$text: text of the mail - -'TitleMoveComplete': after moving an article (title) -$old: old title -$nt: new title -$user: user who did the move -$oldid: old article database ID -$newid: new article database ID - -'UnknownAction': An unknown "action" has occurred (useful for defining - your own actions) -$action: action name -$article: article "acted on" - -'UnwatchArticle': before a watch is removed from an article -$user: user watching -$article: article object to be removed - -'UnwatchArticleComplete': after a watch is removed from an article -$user: user that was watching -$article: article object removed - -'UserLoginComplete': after a user has logged in -$user: the user object that was created on login - -'UserLogout': before a user logs out -$user: the user object that is about to be logged out - -'UserLogoutComplete': after a user has logged out -$user: the user object _after_ logout (won't have name, ID, etc.) - -'WatchArticle': before a watch is added to an article -$user: user that will watch -$article: article object to be watched - -'WatchArticleComplete': after a watch is added to an article -$user: user that watched -$article: article object watched - diff --git a/docs/hooks.txt b/docs/hooks.txt new file mode 100644 index 0000000000..78b66ba31d --- /dev/null +++ b/docs/hooks.txt @@ -0,0 +1,338 @@ +HOOKS.DOC + +This document describes how event hooks work in MediaWiki; how to add +hooks for an event; and how to run hooks for an event. + +==Glossary== + +event + Something that happens with the wiki. For example: a user logs + in. A wiki page is saved. A wiki page is deleted. Often there are + two events associated with a single action: one before the code + is run to make the event happen, and one after. Each event has a + name, preferably in CamelCase. 
For example, 'UserLogin', + 'ArticleSave', 'ArticleSaveComplete', 'ArticleDelete'. + +hook + A clump of code and data that should be run when an event + happens. This can be either a function and a chunk of data, or an + object and a method. + +hook function + The function part of a hook. + +==Rationale== + +Hooks allow us to decouple optionally-run code from code that is run +for everyone. It allows MediaWiki hackers, third-party developers and +local administrators to define code that will be run at certain points +in the mainline code, and to modify the data run by that mainline +code. Hooks can keep mainline code simple, and make it easier to +write extensions. Hooks are a principled alternative to local patches. + +Consider, for example, two options in MediaWiki. One reverses the +order of a title before displaying the article; the other converts the +title to all uppercase letters. Currently, in MediaWiki code, we +would handle this as follows (note: not real code, here): + + function showAnArticle($article) { + global $wgReverseTitle, $wgCapitalizeTitle; + + if ($wgReverseTitle) { + wfReverseTitle($article); + } + + if ($wgCapitalizeTitle) { + wfCapitalizeTitle($article); + } + + # code to actually show the article goes here + } + +An extension writer, or a local admin, will often add custom code to +the function -- with or without a global variable. For example, +someone wanting email notification when an article is shown may add: + + function showAnArticle($article) { + global $wgReverseTitle, $wgCapitalizeTitle, $wgNotifyArticle; + + if ($wgReverseTitle) { + wfReverseTitle($article); + } + + if ($wgCapitalizeTitle) { + wfCapitalizeTitle($article); + } + + # code to actually show the article goes here + + if ($wgNotifyArticle) { + wfNotifyArticleShow($article); + } + } + +Using a hook-running strategy, we can avoid having all this +option-specific stuff in our mainline code. 
Using hooks, the function +becomes: + + function showAnArticle($article) { + + if (wfRunHooks('ArticleShow', array(&$article))) { + + # code to actually show the article goes here + + wfRunHooks('ArticleShowComplete', array(&$article)); + } + } + +We've cleaned up the code here by removing clumps of weird, +infrequently used code and moving them off somewhere else. It's much +easier for someone working with this code to see what's _really_ going +on, and make changes or fix bugs. + +In addition, we can take all the code that deals with the little-used +title-reversing options (say) and put it in one place. Instead of +having little title-reversing if-blocks spread all over the codebase +in showAnArticle, deleteAnArticle, exportArticle, etc., we can +concentrate it all in an extension file: + + function reverseArticleTitle($article) { + # ... + } + + function reverseForExport($article) { + # ... + } + +The setup function for the extension just has to add its hook +functions to the appropriate events: + + function setupTitleReversingExtension() { + global $wgHooks; + + $wgHooks['ArticleShow'][] = 'reverseArticleTitle'; + $wgHooks['ArticleDelete'][] = 'reverseArticleTitle'; + $wgHooks['ArticleExport'][] = 'reverseForExport'; + } + +Having all this code related to the title-reversion option in one +place means that it's easier to read and understand; you don't have to +do a grep-find to see where the $wgReverseTitle variable is used, say. + +If the code is well enough isolated, it can even be excluded when not +used -- making for some slight savings in memory and load-up +performance at runtime. Admins who want to have all the reversed +titles can add: + + require_once('extensions/ReverseTitle.php'); + +...to their LocalSettings.php file; those of us who don't want or need +it can just leave it out. + +The extensions don't even have to be shipped with MediaWiki; they +could be provided by a third-party developer or written by the admin +him/herself. 
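The Rationale example above can be tried end to end with a toy stand-in for the hook runner. This is only a sketch: sketchRunHooks(), $sketchHooks, SketchArticle and sketchReverseTitle() are hypothetical names made up for the illustration, the runner handles plain function names only, and "stop when a handler returns false" is an assumption made so that the if-test in showAnArticle() has something to react to. None of it is MediaWiki's own code.

```php
<?php
// Toy walk-through of the Rationale example. Every "sketch" name here is
// hypothetical; this is not MediaWiki's wfRunHooks() or $wgHooks.

// Registry of event name => list of function names. It is written through
// $GLOBALS so the example also works when included from inside a function.
$GLOBALS['sketchHooks'] = array();

class SketchArticle {
	public $title;

	function __construct($title) {
		$this->title = $title;
	}
}

// Minimal runner: calls each registered function in order and tells the
// caller whether to go on (assumed rule: no handler returned exactly false).
function sketchRunHooks($event, array $params) {
	global $sketchHooks;

	if (!isset($sketchHooks[$event])) {
		return true; // no hooks registered for this event
	}
	foreach ($sketchHooks[$event] as $functionName) {
		if (call_user_func_array($functionName, $params) === false) {
			return false;
		}
	}
	return true;
}

// The optional behaviour lives in the "extension", not in the mainline code.
// PHP 5 and later pass objects by handle, so the change is seen by the caller.
function sketchReverseTitle($article) {
	$article->title = strrev($article->title);
	return true;
}

// Mainline code: no option-specific branches, only the two hook points.
function showAnArticle($article) {
	if (sketchRunHooks('ArticleShow', array($article))) {
		$shown = 'Showing: ' . $article->title;
		sketchRunHooks('ArticleShowComplete', array($article));
		return $shown;
	}
	return null;
}

$GLOBALS['sketchHooks']['ArticleShow'][] = 'sketchReverseTitle';
echo showAnArticle(new SketchArticle('Main Page')), "\n"; // Showing: egaP niaM
```

Taking the registration line out again leaves showAnArticle() untouched and the title unreversed, which is the decoupling the section argues for.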
+ +==Writing hooks== + +A hook is a chunk of code run at some particular event. It consists of: + + * a function with some optional accompanying data, or + * an object with a method and some optional accompanying data. + +Hooks are registered by adding them to the global $wgHooks array for a +given event. All the following are valid ways to define hooks: + + $wgHooks['EventName'][] = 'someFunction'; # function, no data + $wgHooks['EventName'][] = array('someFunction', $someData); + $wgHooks['EventName'][] = array('someFunction'); # weird, but OK + + $wgHooks['EventName'][] = $object; # object only + $wgHooks['EventName'][] = array($object, 'someMethod'); + $wgHooks['EventName'][] = array($object, 'someMethod', $someData); + $wgHooks['EventName'][] = array($object); # weird but OK + +When an event occurs, the function (or object method) will be called +with the optional data provided as well as event-specific parameters. +The above examples would result in the following code being executed +when 'EventName' happened: + + # function, no data + someFunction($param1, $param2) + # function with data + someFunction($someData, $param1, $param2) + + # object only + $object->onEventName($param1, $param2) + # object with method + $object->someMethod($param1, $param2) + # object with method and data + $object->someMethod($someData, $param1, $param2) + +Note that when an object is the hook, and there's no specified method, +the default method called is 'onEventName'. For different events this +would be different: 'onArticleSave', 'onUserLogin', etc. + +The extra data is useful if we want to use the same function or object +for different purposes. For example: + + $wgHooks['ArticleSaveComplete'][] = array('ircNotify', 'TimStarling'); + $wgHooks['ArticleSaveComplete'][] = array('ircNotify', 'brion'); + +This code would result in ircNotify being run twice when an article is +saved: once for 'TimStarling', and once for 'brion'. 
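The calling conventions above can be sketched as a small dispatcher. This is an illustration of the documented forms only, not the code MediaWiki actually uses; resolveHookSketch(), callHookSketch() and SaveLogger are hypothetical names, and event parameters are passed by value here for brevity.

```php
<?php
// Sketch only: turn one $wgHooks entry into a (callable, data) pair,
// following the forms listed above.
function resolveHookSketch($event, $entry) {
    if (!is_array($entry)) {
        $entry = array($entry); // bare function name or bare object
    }
    $first = array_shift($entry);
    if (is_object($first)) {
        // Object hook: the next element, if any, names the method;
        // otherwise the default is 'on' followed by the event name.
        $method = count($entry) > 0 ? array_shift($entry) : 'on' . $event;
        $callable = array($first, $method);
    } else {
        $callable = $first;
    }
    // Whatever remains is the optional data.
    return array($callable, $entry);
}

// The optional data goes first, then the event-specific parameters.
function callHookSketch($event, $entry, $params) {
    list($callable, $data) = resolveHookSketch($event, $entry);
    return call_user_func_array($callable, array_merge($data, $params));
}

function ircNotify($nick, $articleTitle) {
    return "notify $nick about $articleTitle";
}

class SaveLogger {
    function onArticleSaveComplete($articleTitle) {
        return "logged $articleTitle";
    }
}

// prints "notify brion about Main Page"
echo callHookSketch('ArticleSaveComplete', array('ircNotify', 'brion'), array('Main Page')), "\n";
// prints "logged Main Page"
echo callHookSketch('ArticleSaveComplete', new SaveLogger(), array('Main Page')), "\n";
```

Registering the same function twice with different data, as in the ircNotify example, simply yields two entries that resolve to the same callable with different leading arguments.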
+ +Hooks can return three possible values: + + * true: the hook has operated successfully + * "some string": an error occurred; processing should + stop and the error should be shown to the user + * false: the hook has successfully done the work + necessary and the calling function should skip + +The last result would be for cases where the hook function replaces +the main functionality. For example, if you wanted to authenticate +users to a custom system (LDAP, another PHP program, whatever), you +could do: + + $wgHooks['UserLogin'][] = array('ldapLogin', $ldapServer); + + function ldapLogin($username, $password) { + # log user into LDAP + return false; + } + +Returning false makes less sense for events where the action is +complete, and will normally be ignored. + +==Using hooks== + +A calling function or method uses the wfRunHooks() function to run +the hooks related to a particular event, like so: + + class Article { + # ... + function protect() { + global $wgUser; + if (wfRunHooks('ArticleProtect', array(&$this, &$wgUser))) { + # protect the article + wfRunHooks('ArticleProtectComplete', array(&$this, &$wgUser)); + } + } + +wfRunHooks() returns true if the calling function should continue +processing (the hooks ran OK, or there are no hooks to run), or false +if it shouldn't (an error occurred, or one of the hooks handled the +action already). Checking the return value matters more for "before" +hooks than for "complete" hooks. + +Note that hook parameters are passed in an array; this is a necessary +inconvenience to make it possible to pass reference values (that can +be changed) into the hook code. Also note that earlier versions of +wfRunHooks took a variable number of arguments; the array() calling +protocol came about after MediaWiki 1.4rc1. + +==Events and parameters== + +This is a list of known events and parameters; please add to it if +you're going to add events to the MediaWiki code. 
+ +'ArticleDelete': before an article is deleted +$article: the article (object) being deleted +$user: the user (object) deleting the article +$reason: the reason (string) the article is being deleted + +'ArticleDeleteComplete': after an article is deleted +$article: the article that was deleted +$user: the user that deleted the article +$reason: the reason the article was deleted + +'ArticleProtect': before an article is protected +$article: the article being protected +$user: the user doing the protection +$protect: boolean whether this is a protect or an unprotect +$reason: Reason for protect +$moveonly: boolean whether this is for move only or not + +'ArticleProtectComplete': after an article is protected +$article: the article that was protected +$user: the user who did the protection +$protect: boolean whether it was a protect or an unprotect +$reason: Reason for protect +$moveonly: boolean whether it was for move only or not + +'ArticleSave': before an article is saved +$article: the article (object) being saved +$user: the user (object) saving the article +$text: the new article text +$summary: the article summary (comment) +$isminor: minor flag +$iswatch: watch flag +$section: section # + +'ArticleSaveComplete': after an article is saved +$article: the article (object) saved +$user: the user (object) who saved the article +$text: the new article text +$summary: the article summary (comment) +$isminor: minor flag +$iswatch: watch flag +$section: section # + +'BlockIp': before an IP address or user is blocked +$block: the Block object about to be saved +$user: the user _doing_ the block (not the one being blocked) + +'BlockIpComplete': after an IP address or user is blocked +$block: the Block object that was saved +$user: the user who did the block (not the one being blocked) + +'EmailUser': before sending email from one user to another +$to: address of receiving user +$from: address of sending user +$subject: subject of the mail +$text: text of the mail + 
+'EmailUserComplete': after sending email from one user to another +$to: address of receiving user +$from: address of sending user +$subject: subject of the mail +$text: text of the mail + +'TitleMoveComplete': after moving an article (title) +$old: old title +$nt: new title +$user: user who did the move +$oldid: old article database ID +$newid: new article database ID + +'UnknownAction': An unknown "action" has occurred (useful for defining + your own actions) +$action: action name +$article: article "acted on" + +'UnwatchArticle': before a watch is removed from an article +$user: user watching +$article: article object to be removed + +'UnwatchArticleComplete': after a watch is removed from an article +$user: user that was watching +$article: article object removed + +'UserLoginComplete': after a user has logged in +$user: the user object that was created on login + +'UserLogout': before a user logs out +$user: the user object that is about to be logged out + +'UserLogoutComplete': after a user has logged out +$user: the user object _after_ logout (won't have name, ID, etc.) + +'WatchArticle': before a watch is added to an article +$user: user that will watch +$article: article object to be watched + +'WatchArticleComplete': after a watch is added to an article +$user: user that watched +$article: article object watched + diff --git a/docs/language.doc b/docs/language.doc deleted file mode 100644 index 06639f73ba..0000000000 --- a/docs/language.doc +++ /dev/null @@ -1,24 +0,0 @@ -LANGUAGE.DOC - -The Language object handles all readable text produced by the -software. The most used function is getMessage(), usually -called with the wrapper function wfMsg() which calls that method -on the global language object. It just returns a piece of text -given a text key. It is recommended that you use each key only -once--bits of text in different contexts that happen to be -identical in English may not be in other languages, so it's -better to add new keys than to reuse them a lot.
Likewise, -if there is text that gets combined with things like names and -titles, it is better to put markers like "$1" inside a piece -of text and use str_replace() than to compose such messages in -code, because their order may change in other languages too. - -While the system is running, there will be one global language -object, which will be a subtype of Language. The methods in -these objects will return the native text requested if available, -otherwise they fall back to sending English text (which is why -the LanguageEn object has no code at all--it just inherits the -English defaults of the Language base class). - -The names of the namespaces are also contained in the language -object, though the numbers are fixed. diff --git a/docs/language.txt b/docs/language.txt new file mode 100644 index 0000000000..06639f73ba --- /dev/null +++ b/docs/language.txt @@ -0,0 +1,24 @@ +LANGUAGE.DOC + +The Language object handles all readable text produced by the +software. The most used function is getMessage(), usually +called with the wrapper function wfMsg() which calls that method +on the global language object. It just returns a piece of text +given a text key. It is recommended that you use each key only +once--bits of text in different contexts that happen to be +identical in English may not be in other languages, so it's +better to add new keys than to reuse them a lot. Likewise, +if there is text that gets combined with things like names and +titles, it is better to put markers like "$1" inside a piece +of text and use str_replace() than to compose such messages in +code, because their order may change in other languages too. + +While the system is running, there will be one global language +object, which will be a subtype of Language. 
The methods in +these objects will return the native text requested if available, +otherwise they fall back to sending English text (which is why +the LanguageEn object has no code at all--it just inherits the +English defaults of the Language base class). + +The names of the namespaces are also contained in the language +object, though the numbers are fixed. diff --git a/docs/linkcache.doc b/docs/linkcache.doc deleted file mode 100644 index b0afbeec6e..0000000000 --- a/docs/linkcache.doc +++ /dev/null @@ -1,31 +0,0 @@ -LINKCACHE.DOC - -The LinkCache class maintains a list of article titles and -the information about whether or not the article exists in -the database. This is used to mark up links when displaying -a page. If the same link appears more than once on any page, -then it only has to be looked up once. - -In practice, what happens is that the global cache object -$wgLinkCache is consulted and updated every time the function -getArticleID() from Title is called. - -This has a side benefit that we take advantage of. We have -tables "links" and "brokenlinks" which we use to do things -like the Orphans page and Whatlinkshere page. It just so -happens that after we update a page, we display it--and as -we're displaying it, we look up all the links on that page, -causing them to be put into the cache. That information is -exactly what we need to update those two tables. So, we do -something tricky when we update pages: just after the update -and before we display, we clear the cache. Then we display -the updated page. Finally, we put a LinksUpdate object onto -the deferred updates list, which fetches its information from -the cache. - -There's a minor complication: displaying a page also looks up -a few things like the talk page link in the quick bar and the -date links. Since we don't want those in the link tables, we -must take care to suspend the cache while we look those up. -Skin.php does exactly that--see dateLink(), for example. 
- diff --git a/docs/linkcache.txt b/docs/linkcache.txt new file mode 100644 index 0000000000..b0afbeec6e --- /dev/null +++ b/docs/linkcache.txt @@ -0,0 +1,31 @@ +LINKCACHE.DOC + +The LinkCache class maintains a list of article titles and +the information about whether or not the article exists in +the database. This is used to mark up links when displaying +a page. If the same link appears more than once on any page, +then it only has to be looked up once. + +In practice, what happens is that the global cache object +$wgLinkCache is consulted and updated every time the function +getArticleID() from Title is called. + +This has a side benefit that we take advantage of. We have +tables "links" and "brokenlinks" which we use to do things +like the Orphans page and Whatlinkshere page. It just so +happens that after we update a page, we display it--and as +we're displaying it, we look up all the links on that page, +causing them to be put into the cache. That information is +exactly what we need to update those two tables. So, we do +something tricky when we update pages: just after the update +and before we display, we clear the cache. Then we display +the updated page. Finally, we put a LinksUpdate object onto +the deferred updates list, which fetches its information from +the cache. + +There's a minor complication: displaying a page also looks up +a few things like the talk page link in the quick bar and the +date links. Since we don't want those in the link tables, we +must take care to suspend the cache while we look those up. +Skin.php does exactly that--see dateLink(), for example. + diff --git a/docs/memcached.doc b/docs/memcached.doc deleted file mode 100644 index 6752e9c81d..0000000000 --- a/docs/memcached.doc +++ /dev/null @@ -1,132 +0,0 @@ -memcached support for MediaWiki: - -From ca August 2003, MediaWiki has optional support for memcached, a -"high-performance, distributed memory object caching system". 
-For general information on it, see: http://www.danga.com/memcached/ - -Memcached is likely more trouble than a small site will need, but -for a larger site with heavy load, like Wikipedia, it should help -lighten the load on the database servers by caching data and objects -in memory. - -== Requirements == - -* PHP must be compiled with --enable-sockets - -* libevent: http://www.monkey.org/~provos/libevent/ - (as of 2003-08-11, 0.7a is current) - -* optionally, epoll-rt patch for Linux kernel: - http://www.xmailserver.org/linux-patches/nio-improve.html - -* memcached: http://www.danga.com/memcached/download.bml - (as of this writing, 1.1.9 is current) - -Memcached and libevent are under BSD-style licenses. - -The server should run on Linux and other Unix-like systems... you -can run multiple servers on one machine or on multiple machines on -a network; storage can be distributed across multiple servers, and -multiple web servers can use the same cache cluster. - - -********************* W A R N I N G ! ! ! ! ! *********************** -Memcached has no security or authentication. Please ensure that your -server is appropriately firewalled, and that the port(s) used for -memcached servers are not publicly accessible. Otherwise, anyone on -the internet can put data into and read data from your cache. - -An attacker familiar with MediaWiki internals could use this to give -themselves developer access and delete all data from the wiki's -database, as well as getting all users' password hashes and e-mail -addresses. -********************* W A R N I N G ! ! ! ! ! 
*********************** - -== Setup == - -If you want to start small, just run one memcached on your web -server: - - memcached -d -l 127.0.0.1 -p 11000 -m 64 - -(to run in daemon mode, accessible only via loopback interface, -on port 11000, using up to 64MB of memory) - -In your LocalSettings.php file, set: - - $wgUseMemCached = true; - $wgMemCachedServers = array( "127.0.0.1:11000" ); - -The wiki should then use memcached to cache various data. To use -multiple servers (physically separate boxes or multiple caches -on one machine on a large-memory x86 box), just add more items -to the array. To increase the weight of a server (say, because -it has twice the memory of the others and you want to spread -usage evenly), make its entry a subarray: - - $wgMemCachedServers = array( - "127.0.0.1:11000", # one gig on this box - array("192.168.0.1:11000", 2) # two gigs on the other box - ); - - -== PHP client for memcached == - -As of this writing, MediaWiki includes version 1.0.10 of the PHP -memcached client by Ryan Gilfether . -You'll find some documentation for it in the 'php-memcached' -subdirectory under the present one. - -We intend to track updates, but if you want to check for the lastest -released version, see http://www.danga.com/memcached/apis.bml - -If you don't set $wgUseMemCached, we still create a MemCacheClient, -but requests to it are no-ops and we always fall through to the -database. If the cache daemon can't be contacted, it should also -disable itself fairly smoothly. - -== Keys used == - -User: - key: $wgDBname:user:id:$sId - ex: wikidb:user:id:51 - stores: instance of class User - set in: User::loadFromSession() - cleared by: User::saveSettings(), UserTalkUpdate::doUpdate() - -Newtalk: - key: $wgDBname:newtalk:ip:$ip - ex: wikidb:newtalk:ip:123.45.67.89 - stores: integer, 0 or 1 - set in: User::loadFromDatabase() - cleared by: User::saveSettings() # ? 
- expiry set to 30 minutes - -LinkCache: - key: $wgDBname:lc:title:$title - ex: wikidb:lc:title:Wikipedia:Welcome,_Newcomers! - stores: cur_id of page, or 0 if page does not exist - set in: LinkCache::addLink() - cleared by: LinkCache::clearBadLink() - should be cleared on page deletion and rename -MediaWiki namespace: - key: $wgDBname:messages - ex: wikidb:messages - stores: an array where the keys are DB keys and the values are messages - set in: wfMsg(), Article::editUpdates() both call wfLoadAllMessages() - cleared by: nothing - -Watchlist: - key: $wgDBname:watchlist:id:$userID - ex: wikidb:watchlist:id:4635 - stores: HTML string - cleared by: nothing, expiry time $wgWLCacheTimeout (1 hour) - note: emergency optimisation only - -IP blocks: - key: $wgDBname:ipblocks - ex: wikidb:ipblocks - stores: array of arrays, for the BlockCache class - cleared by: BlockCache:clear() - -... more to come ... diff --git a/docs/memcached.txt b/docs/memcached.txt new file mode 100644 index 0000000000..6752e9c81d --- /dev/null +++ b/docs/memcached.txt @@ -0,0 +1,132 @@ +memcached support for MediaWiki: + +From ca August 2003, MediaWiki has optional support for memcached, a +"high-performance, distributed memory object caching system". +For general information on it, see: http://www.danga.com/memcached/ + +Memcached is likely more trouble than a small site will need, but +for a larger site with heavy load, like Wikipedia, it should help +lighten the load on the database servers by caching data and objects +in memory. + +== Requirements == + +* PHP must be compiled with --enable-sockets + +* libevent: http://www.monkey.org/~provos/libevent/ + (as of 2003-08-11, 0.7a is current) + +* optionally, epoll-rt patch for Linux kernel: + http://www.xmailserver.org/linux-patches/nio-improve.html + +* memcached: http://www.danga.com/memcached/download.bml + (as of this writing, 1.1.9 is current) + +Memcached and libevent are under BSD-style licenses. 
+ +The server should run on Linux and other Unix-like systems... you +can run multiple servers on one machine or on multiple machines on +a network; storage can be distributed across multiple servers, and +multiple web servers can use the same cache cluster. + + +********************* W A R N I N G ! ! ! ! ! *********************** +Memcached has no security or authentication. Please ensure that your +server is appropriately firewalled, and that the port(s) used for +memcached servers are not publicly accessible. Otherwise, anyone on +the internet can put data into and read data from your cache. + +An attacker familiar with MediaWiki internals could use this to give +themselves developer access and delete all data from the wiki's +database, as well as getting all users' password hashes and e-mail +addresses. +********************* W A R N I N G ! ! ! ! ! *********************** + +== Setup == + +If you want to start small, just run one memcached on your web +server: + + memcached -d -l 127.0.0.1 -p 11000 -m 64 + +(to run in daemon mode, accessible only via loopback interface, +on port 11000, using up to 64MB of memory) + +In your LocalSettings.php file, set: + + $wgUseMemCached = true; + $wgMemCachedServers = array( "127.0.0.1:11000" ); + +The wiki should then use memcached to cache various data. To use +multiple servers (physically separate boxes or multiple caches +on one machine on a large-memory x86 box), just add more items +to the array. To increase the weight of a server (say, because +it has twice the memory of the others and you want to spread +usage evenly), make its entry a subarray: + + $wgMemCachedServers = array( + "127.0.0.1:11000", # one gig on this box + array("192.168.0.1:11000", 2) # two gigs on the other box + ); + + +== PHP client for memcached == + +As of this writing, MediaWiki includes version 1.0.10 of the PHP +memcached client by Ryan Gilfether . 
+You'll find some documentation for it in the 'php-memcached' +subdirectory under the present one. + +We intend to track updates, but if you want to check for the latest +released version, see http://www.danga.com/memcached/apis.bml + +If you don't set $wgUseMemCached, we still create a MemCacheClient, +but requests to it are no-ops and we always fall through to the +database. If the cache daemon can't be contacted, it should also +disable itself fairly smoothly. + +== Keys used == + +User: + key: $wgDBname:user:id:$sId + ex: wikidb:user:id:51 + stores: instance of class User + set in: User::loadFromSession() + cleared by: User::saveSettings(), UserTalkUpdate::doUpdate() + +Newtalk: + key: $wgDBname:newtalk:ip:$ip + ex: wikidb:newtalk:ip:123.45.67.89 + stores: integer, 0 or 1 + set in: User::loadFromDatabase() + cleared by: User::saveSettings() # ? + expiry set to 30 minutes + +LinkCache: + key: $wgDBname:lc:title:$title + ex: wikidb:lc:title:Wikipedia:Welcome,_Newcomers! + stores: cur_id of page, or 0 if page does not exist + set in: LinkCache::addLink() + cleared by: LinkCache::clearBadLink() + should be cleared on page deletion and rename +MediaWiki namespace: + key: $wgDBname:messages + ex: wikidb:messages + stores: an array where the keys are DB keys and the values are messages + set in: wfMsg(), Article::editUpdates() both call wfLoadAllMessages() + cleared by: nothing + +Watchlist: + key: $wgDBname:watchlist:id:$userID + ex: wikidb:watchlist:id:4635 + stores: HTML string + cleared by: nothing, expiry time $wgWLCacheTimeout (1 hour) + note: emergency optimisation only + +IP blocks: + key: $wgDBname:ipblocks + ex: wikidb:ipblocks + stores: array of arrays, for the BlockCache class + cleared by: BlockCache::clear() + +... more to come ...
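All of the keys listed above share one layout: the database name followed by colon-separated parts. A tiny sketch of building them is shown here; cacheKeySketch() is a hypothetical helper for illustration, not a function MediaWiki provides.

```php
<?php
// Hypothetical helper: joins the database name and the remaining
// key parts with ':' to match the example keys listed above.
function cacheKeySketch($dbName /* , part, part, ... */) {
    $parts = func_get_args(); // includes $dbName as the first element
    return implode(':', $parts);
}

echo cacheKeySketch('wikidb', 'user', 'id', 51), "\n";                // prints "wikidb:user:id:51"
echo cacheKeySketch('wikidb', 'newtalk', 'ip', '123.45.67.89'), "\n"; // prints "wikidb:newtalk:ip:123.45.67.89"
echo cacheKeySketch('wikidb', 'messages'), "\n";                      // prints "wikidb:messages"
```

Prefixing every key with the database name is what lets several wikis share one cache cluster without stepping on each other's entries.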
diff --git a/docs/schema.doc b/docs/schema.doc deleted file mode 100644 index e915dc2d31..0000000000 --- a/docs/schema.doc +++ /dev/null @@ -1,286 +0,0 @@ -SCHEMA.DOC - -The most up-to-date schema for the tables in the database -should always be "tables.sql" in the maintenance directory, -which is called from the installation script. Here are a -few highlights that may be out of date: - -user (Wikipedia users) - - user_id - integer, primary key, autoincrement - user_name - Usernames must be unique, must not be in the form of - an IP address. _Shouldn't_ allow slashes or case - conflicts. Spaces are allowed, and are _not_ converted - to underscores like titles. (Conflicts?) - user_password - Hash of current password. - user_newpassword - Generated for mail-a-new-password feature - user_email - Note -- email should be restricted, not public info. - Same with passwords. ;) - user_options - Newline-separated list of name=value pairs. - user_token - A pseudorandomly generated value that is stored in - a cookie when the "remember password" feature is - used (previously, a hash of the password was used, but - this was vulnerable to cookie-stealing attacks) - - - -cur (Wikipedia "current" articles) - - cur_id - integer, primary key, autoincrement - cur_namespace - integer index into list of namespaces. See the - Namespace class for more details. - cur_title - Title of article (in dbkey form--see Title), without - namespace. The combination of namespace,title should - be unique in this table. - cur_text - Wikitext of the article. - cur_comment - The summary of the last change. - cur_user - User id who made the last change, or 0 if unknown. - cur_user_text - Name of the user above, or IP address. - cur_timestamp - Time of the last change. - cur_minor_edit - Flag: 0 or 1 is last change was a "minor" edit. - cur_restrictions - Who may or may not edit the article. - cur_counter - Number of times this page has been viewed. 
- cur_ind_title - Text version of title for fulltext searches. - cur_ind_text - Plaintext version of text for fulltext searches. - cur_is_redirect - 1 indicates the article is a redirect. - cur_minor_edit - 1 indicates this was a minor edit. - cur_is_new - 1 indicates this is the first revision of a new entry. - cur_random - Random value between 0 and 1, used for - Special:Randompage - - - -old (Historical versions articles. Most fields - correspond to the same fields in "cur") - - old_id - old_namespace - old_title - old_text - old_comment - old_user - old_user_text - old_timestamp - old_minor_edit - old_flags - This last is currently unused. - - - -archive (Temporary storage of deleted articles which may be restored. - Fields correspond to those of "cur" and "old") - ar_namespace - ar_title - ar_text - ar_comment - ar_user - ar_user_text - ar_timestamp - ar_minor_edit - ar_flags - This last is currently unused. - - - -links (Internal links to existing articles) - - l_from - ID of source article. (currently title, may be changed) - l_to - ID of target article. - - - -brokenlinks (Internal links to non-existent articles) - - bl_from - ID of source link. - bl_to - Title of target link. - - - -imagelinks (Internal links to images via [[Image:filename]] syntax) - - il_from - Title of target article. - il_to - Filename of target image. - - - -categorylinks (Track category inclusions) - - cl_from - corresponds to cur_id of the linking page - cl_to - corresponds to cur_title of the category page - cl_sortkey - the title of the linking page, or an optional override - cl_timestampe - when the link was last added - - - -linkscc (Stores (possibly gzipped) serialized objects with - cache arrays to reduce database load slurping up - from links and brokenlinks.) - - lcc_pageid - The ID of the linking page - lcc_cacheobj - A serialized LinkCache object - - - -image (Uploaded images and other files) - - img_name - Filename. - img_size - File size in bytes. 
- img_description - Description field given during upload. - img_user - User ID who uploaded the file. - img_user_text - User name who uploaded the file. - img_timestamp - Timestamp when upload took place. - - - -oldimage (Old versions of images stored for potential revert) - - oi_name - Original filename. - oi_archive_name - Filename of stored old revision; timestamp and - exclaimation point prepended to oi_name - oi_size - File size in bytes. - oi_description - Description field given during upload. - oi_user - User ID who uploaded the file. - oi_user_text - User name who uploaded the file. - oi_timestamp - Timestamp when upload took place. - - - -ipblocks (IP addresses and users blocked from editing) - ipb_id - Primary key, introduced for privacy. - ipb_address - Blocked IP address in dotted-quad form or user name. - ipb_user - Blocked user ID or 0 for IP blocks. - ipb_by - User ID who made the block. - ipb_reason - Text comment made by blocker. - ipb_timestamp - Creation (or refresh) date in standard YMDHMS form. IP - blocks expire automatically. - ipb_auto - Indicates that the IP address was banned because a banned - user accessed a page through it. If this is 1, ipb_address - will be hidden. - - -site_stats (Site-wide statistics) - - ss_row_id - Token for where clauses. There's only one row in - this table. At some point we might want to use a - date here so we can get stats-by-date. - ss_total_views - Number of total views of all pages. - ss_total_edits - Number of total page edits. - ss_good_articles - Number of "countable" articles. - - - -hitcounter (Stores an ID for every time any article is visited; - depending on $wgHitcounterUpdateFreq, it is - periodically cleared and the cur_counter column - in the cur table updated for the all articles - that have been visited.) 
- hc_id - The ID of an article, representing one hit - - - -recentchanges - - (Will document further when working) - - - -watchlist - - wl_user - Foreign key -> user_id - wl_namespace - Namespace -> cur_namespace - Note that these should only include even-numbered - namespaces for regular pages; associated talk pages - (odd numbered namespaces) are folded in. - wl_title - Page title -> cur_title - Note also that the linked page may not exist in page - or talk namespace, or at all. - - -searchindex (Used for MySQL fulltext searching) - - si_page - The ID of an article - si_title - The title of an article, indexed for searching - si_text - The text of an article, indexed for searching - - - -interwiki (Recognized interwiki link prefixes) - iw_prefix - The interwiki prefix, (e.g. "Meatball", or the - language prefix "de") - iw_url - The URL of the wiki, with "$1" as a placeholder - for an article name - iw_local - A boolean value indicating whether the wiki is - in this project (used, for example, to detect - redirect loops) - - diff --git a/docs/schema.txt b/docs/schema.txt new file mode 100644 index 0000000000..e915dc2d31 --- /dev/null +++ b/docs/schema.txt @@ -0,0 +1,286 @@ +SCHEMA.DOC + +The most up-to-date schema for the tables in the database +should always be "tables.sql" in the maintenance directory, +which is called from the installation script. Here are a +few highlights that may be out of date: + +user (Wikipedia users) + + user_id + integer, primary key, autoincrement + user_name + Usernames must be unique, must not be in the form of + an IP address. _Shouldn't_ allow slashes or case + conflicts. Spaces are allowed, and are _not_ converted + to underscores like titles. (Conflicts?) + user_password + Hash of current password. + user_newpassword + Generated for mail-a-new-password feature + user_email + Note -- email should be restricted, not public info. + Same with passwords. ;) + user_options + Newline-separated list of name=value pairs. 
+ user_token + A pseudorandomly generated value that is stored in + a cookie when the "remember password" feature is + used (previously, a hash of the password was used, but + this was vulnerable to cookie-stealing attacks) + + + +cur (Wikipedia "current" articles) + + cur_id + integer, primary key, autoincrement + cur_namespace + integer index into list of namespaces. See the + Namespace class for more details. + cur_title + Title of article (in dbkey form--see Title), without + namespace. The combination of namespace,title should + be unique in this table. + cur_text + Wikitext of the article. + cur_comment + The summary of the last change. + cur_user + User id who made the last change, or 0 if unknown. + cur_user_text + Name of the user above, or IP address. + cur_timestamp + Time of the last change. + cur_minor_edit + Flag: 0 or 1 is last change was a "minor" edit. + cur_restrictions + Who may or may not edit the article. + cur_counter + Number of times this page has been viewed. + cur_ind_title + Text version of title for fulltext searches. + cur_ind_text + Plaintext version of text for fulltext searches. + cur_is_redirect + 1 indicates the article is a redirect. + cur_minor_edit + 1 indicates this was a minor edit. + cur_is_new + 1 indicates this is the first revision of a new entry. + cur_random + Random value between 0 and 1, used for + Special:Randompage + + + +old (Historical versions articles. Most fields + correspond to the same fields in "cur") + + old_id + old_namespace + old_title + old_text + old_comment + old_user + old_user_text + old_timestamp + old_minor_edit + old_flags + This last is currently unused. + + + +archive (Temporary storage of deleted articles which may be restored. + Fields correspond to those of "cur" and "old") + ar_namespace + ar_title + ar_text + ar_comment + ar_user + ar_user_text + ar_timestamp + ar_minor_edit + ar_flags + This last is currently unused. 
+ + + +links (Internal links to existing articles) + + l_from + ID of source article. (currently title, may be changed) + l_to + ID of target article. + + + +brokenlinks (Internal links to non-existent articles) + + bl_from + ID of source link. + bl_to + Title of target link. + + + +imagelinks (Internal links to images via [[Image:filename]] syntax) + + il_from + Title of target article. + il_to + Filename of target image. + + + +categorylinks (Track category inclusions) + + cl_from + corresponds to cur_id of the linking page + cl_to + corresponds to cur_title of the category page + cl_sortkey + the title of the linking page, or an optional override + cl_timestampe + when the link was last added + + + +linkscc (Stores (possibly gzipped) serialized objects with + cache arrays to reduce database load slurping up + from links and brokenlinks.) + + lcc_pageid + The ID of the linking page + lcc_cacheobj + A serialized LinkCache object + + + +image (Uploaded images and other files) + + img_name + Filename. + img_size + File size in bytes. + img_description + Description field given during upload. + img_user + User ID who uploaded the file. + img_user_text + User name who uploaded the file. + img_timestamp + Timestamp when upload took place. + + + +oldimage (Old versions of images stored for potential revert) + + oi_name + Original filename. + oi_archive_name + Filename of stored old revision; timestamp and + exclaimation point prepended to oi_name + oi_size + File size in bytes. + oi_description + Description field given during upload. + oi_user + User ID who uploaded the file. + oi_user_text + User name who uploaded the file. + oi_timestamp + Timestamp when upload took place. + + + +ipblocks (IP addresses and users blocked from editing) + ipb_id + Primary key, introduced for privacy. + ipb_address + Blocked IP address in dotted-quad form or user name. + ipb_user + Blocked user ID or 0 for IP blocks. + ipb_by + User ID who made the block. 
+ ipb_reason + Text comment made by blocker. + ipb_timestamp + Creation (or refresh) date in standard YMDHMS form. IP + blocks expire automatically. + ipb_auto + Indicates that the IP address was banned because a banned + user accessed a page through it. If this is 1, ipb_address + will be hidden. + + +site_stats (Site-wide statistics) + + ss_row_id + Token for where clauses. There's only one row in + this table. At some point we might want to use a + date here so we can get stats-by-date. + ss_total_views + Number of total views of all pages. + ss_total_edits + Number of total page edits. + ss_good_articles + Number of "countable" articles. + + + +hitcounter (Stores an ID for every time any article is visited; + depending on $wgHitcounterUpdateFreq, it is + periodically cleared and the cur_counter column + in the cur table updated for the all articles + that have been visited.) + hc_id + The ID of an article, representing one hit + + + +recentchanges + + (Will document further when working) + + + +watchlist + + wl_user + Foreign key -> user_id + wl_namespace + Namespace -> cur_namespace + Note that these should only include even-numbered + namespaces for regular pages; associated talk pages + (odd numbered namespaces) are folded in. + wl_title + Page title -> cur_title + Note also that the linked page may not exist in page + or talk namespace, or at all. + + +searchindex (Used for MySQL fulltext searching) + + si_page + The ID of an article + si_title + The title of an article, indexed for searching + si_text + The text of an article, indexed for searching + + + +interwiki (Recognized interwiki link prefixes) + iw_prefix + The interwiki prefix, (e.g. 
"Meatball", or the + language prefix "de") + iw_url + The URL of the wiki, with "$1" as a placeholder + for an article name + iw_local + A boolean value indicating whether the wiki is + in this project (used, for example, to detect + redirect loops) + + diff --git a/docs/skin.doc b/docs/skin.doc deleted file mode 100644 index 3b7a74ed08..0000000000 --- a/docs/skin.doc +++ /dev/null @@ -1,48 +0,0 @@ - -SKIN.DOC - -This document describes the overall architecture of MediaWiki's HTML rendering -code as well as some history about the skin system. It is placed here rather -than in comments in the code itself to help reduce the code size. - -== Version 1.4 == - -MediaWiki still use the PHPTal skin system introduced in version 1.3 but some -changes have been made to the file organisation. - -PHP class and PHPTal templates have been moved to /skins/ (respectivly from -/includes/ and /templates/). This way skin designer and end user just stick to -one directory. - -Two samples are provided to start with, one for PHPTal use (SkinPHPTal.sample) -and one without (Skin.sample). - - -== Version 1.3 == - -The following might help a bit though. - -Firstly, there's Skin.php; this file will check various settings, and it -contains a base class from which new skins can be derived. - -Before version 1.3, each skin had its own PHP file (with a sub-class to Skin) -to generate the output. The files are: - * SkinCologneBlue.php - * SkinNostalgia.php - * SkinStandard.php - * SkinWikimediaWiki.php -If you want to change those skins, you have to edit these PHP files. - -Since 1.3 a new special skin file is available: SkinPHPTal.php. It makes use of -the PHPTal template engine and allows you to separate code and layout of the -pages. The default 1.3 skin is MonoBook and it uses the SkinPHPTAL class. - -To change the layout, just edit the PHPTal template (templates/xhtml_slim.pt) -as well as the stylesheets (stylesheets/monobook/*). 
- - -== pre 1.3 version == - -Unfortunately there isn't any documentation, and the code's in a bit of a mess -right now during the transition from the old skin code to the new template-based -skin code in 1.3. diff --git a/docs/skin.txt b/docs/skin.txt new file mode 100644 index 0000000000..3b7a74ed08 --- /dev/null +++ b/docs/skin.txt @@ -0,0 +1,48 @@ + +SKIN.DOC + +This document describes the overall architecture of MediaWiki's HTML rendering +code as well as some history about the skin system. It is placed here rather +than in comments in the code itself to help reduce the code size. + +== Version 1.4 == + +MediaWiki still use the PHPTal skin system introduced in version 1.3 but some +changes have been made to the file organisation. + +PHP class and PHPTal templates have been moved to /skins/ (respectivly from +/includes/ and /templates/). This way skin designer and end user just stick to +one directory. + +Two samples are provided to start with, one for PHPTal use (SkinPHPTal.sample) +and one without (Skin.sample). + + +== Version 1.3 == + +The following might help a bit though. + +Firstly, there's Skin.php; this file will check various settings, and it +contains a base class from which new skins can be derived. + +Before version 1.3, each skin had its own PHP file (with a sub-class to Skin) +to generate the output. The files are: + * SkinCologneBlue.php + * SkinNostalgia.php + * SkinStandard.php + * SkinWikimediaWiki.php +If you want to change those skins, you have to edit these PHP files. + +Since 1.3 a new special skin file is available: SkinPHPTal.php. It makes use of +the PHPTal template engine and allows you to separate code and layout of the +pages. The default 1.3 skin is MonoBook and it uses the SkinPHPTAL class. + +To change the layout, just edit the PHPTal template (templates/xhtml_slim.pt) +as well as the stylesheets (stylesheets/monobook/*). 
+ + +== pre 1.3 version == + +Unfortunately there isn't any documentation, and the code's in a bit of a mess +right now during the transition from the old skin code to the new template-based +skin code in 1.3. diff --git a/docs/title.doc b/docs/title.doc deleted file mode 100644 index 1ae445601d..0000000000 --- a/docs/title.doc +++ /dev/null @@ -1,72 +0,0 @@ -TITLE.DOC - -The Wikipedia software's "Title" class represents article -titles, which are used for many purposes: as the human-readable -text title of the article, in the URL used to access the article, -the wikitext link to the article, the key into the article -database, and so on. The class in instantiated from one of -these forms and can be queried for the others, and for other -attributes of the title. This is intended to be an -immutable "value" class, so there are no mutator functions. - -To get a new instance, call one of the static factory -methods WikiTitle::newFromURL(), WikiTitle::newFromDBKey(), -or WikiTitle::newFromText(). Once instantiated, the -other non-static accessor methods can be used, such as -getText(), getDBKey(), getNamespace(), etc. - -The prefix rules: a title consists of an optional Interwiki -prefix (such as "m:" for meta or "de:" for German), followed -by an optional namespace, followed by the remainder of the -title. Both Interwiki prefixes and namespace prefixes have -the same rules: they contain only letters, digits, space, and -underscore, must start with a letter, are case insensitive, -and spaces and underscores are interchangeable. Prefixes end -with a ":". A prefix is only recognized if it is one of those -specifically allowed by the software. For example, "de:name" -is a link to the article "name" in the German Wikipedia, because -"de" is recognized as one of the allowable interwikis. The -title "talk:name" is a link to the article "name" in the "talk" -namespace of the current wiki, because "talk" is a recognized -namespace. 
Both may be present, and if so, the interwiki must -come first, for example, "m:talk:name". If a title begins with -a colon as its first character, no prefixes are scanned for, -and the colon is just removed. Note that because of these -rules, it is possible to have articles with colons in their -names. "E. Coli 0157:H7" is a valid title, as is "2001: A Space -Odyssey", because "E. Coli 0157" and "2001" are not valid -interwikis or namespaces. Likewise, ":de:name" is a link to -the article "de:name"--even though "de" is a valid interwiki, -the initial colon stops all prefix matching. - -Character mapping rules: Once prefixes have been stripped, the -rest of the title processed this way: spaces and underscores are -treated as equivalent and each is converted to the other in the -appropriate context (underscore in URL and database keys, spaces -in plain text). "Extended" characters in the 0x80..0xFF range -are allowed in all places, and are valid characters. They are -encoded in URLs. Other characters may be ASCII letters, digits, -hyphen, comma, period, apostrophe, parentheses, and colon. No -other ASCII characters are allowed, and will be deleted if found -(they will probably cause a browser to misinterpret the URL). -Extended characters are _not_ urlencoded when used as text or -database keys. - -Character encoding rules: TODO - -Canonical forms: the canonical form of a title will always be -returned by the object. In this form, the first (and only the -first) character of the namespace and title will be uppercased; -the rest of the namespace will be lowercased, while the title -will be left as is. The text form will use spaces, the URL and -DBkey forms will use underscores. Interwiki prefixes are all -lowercase. The namespace will use underscores when returned -alone; it will use spaces only when attached to the text title. 
- -getArticleID() needs some explanation: for "internal" articles, -it should return the "cur_id" field if the article exists, else -it returns 0. For all external articles it returns 0. All of -the IDs for all instances of Title created during a request are -cached, so they can be looked up wuickly while rendering wiki -text with lots of internal links. - diff --git a/docs/title.txt b/docs/title.txt new file mode 100644 index 0000000000..1ae445601d --- /dev/null +++ b/docs/title.txt @@ -0,0 +1,72 @@ +TITLE.DOC + +The Wikipedia software's "Title" class represents article +titles, which are used for many purposes: as the human-readable +text title of the article, in the URL used to access the article, +the wikitext link to the article, the key into the article +database, and so on. The class in instantiated from one of +these forms and can be queried for the others, and for other +attributes of the title. This is intended to be an +immutable "value" class, so there are no mutator functions. + +To get a new instance, call one of the static factory +methods WikiTitle::newFromURL(), WikiTitle::newFromDBKey(), +or WikiTitle::newFromText(). Once instantiated, the +other non-static accessor methods can be used, such as +getText(), getDBKey(), getNamespace(), etc. + +The prefix rules: a title consists of an optional Interwiki +prefix (such as "m:" for meta or "de:" for German), followed +by an optional namespace, followed by the remainder of the +title. Both Interwiki prefixes and namespace prefixes have +the same rules: they contain only letters, digits, space, and +underscore, must start with a letter, are case insensitive, +and spaces and underscores are interchangeable. Prefixes end +with a ":". A prefix is only recognized if it is one of those +specifically allowed by the software. For example, "de:name" +is a link to the article "name" in the German Wikipedia, because +"de" is recognized as one of the allowable interwikis. 
The +title "talk:name" is a link to the article "name" in the "talk" +namespace of the current wiki, because "talk" is a recognized +namespace. Both may be present, and if so, the interwiki must +come first, for example, "m:talk:name". If a title begins with +a colon as its first character, no prefixes are scanned for, +and the colon is just removed. Note that because of these +rules, it is possible to have articles with colons in their +names. "E. Coli 0157:H7" is a valid title, as is "2001: A Space +Odyssey", because "E. Coli 0157" and "2001" are not valid +interwikis or namespaces. Likewise, ":de:name" is a link to +the article "de:name"--even though "de" is a valid interwiki, +the initial colon stops all prefix matching. + +Character mapping rules: Once prefixes have been stripped, the +rest of the title processed this way: spaces and underscores are +treated as equivalent and each is converted to the other in the +appropriate context (underscore in URL and database keys, spaces +in plain text). "Extended" characters in the 0x80..0xFF range +are allowed in all places, and are valid characters. They are +encoded in URLs. Other characters may be ASCII letters, digits, +hyphen, comma, period, apostrophe, parentheses, and colon. No +other ASCII characters are allowed, and will be deleted if found +(they will probably cause a browser to misinterpret the URL). +Extended characters are _not_ urlencoded when used as text or +database keys. + +Character encoding rules: TODO + +Canonical forms: the canonical form of a title will always be +returned by the object. In this form, the first (and only the +first) character of the namespace and title will be uppercased; +the rest of the namespace will be lowercased, while the title +will be left as is. The text form will use spaces, the URL and +DBkey forms will use underscores. Interwiki prefixes are all +lowercase. 
The namespace will use underscores when returned +alone; it will use spaces only when attached to the text title. + +getArticleID() needs some explanation: for "internal" articles, +it should return the "cur_id" field if the article exists, else +it returns 0. For all external articles it returns 0. All of +the IDs for all instances of Title created during a request are +cached, so they can be looked up wuickly while rendering wiki +text with lots of internal links. + diff --git a/docs/user.doc b/docs/user.doc deleted file mode 100644 index ec3949f08e..0000000000 --- a/docs/user.doc +++ /dev/null @@ -1,63 +0,0 @@ - -USER.DOC - -Documenting the Wikipedia User object. - -(DISCLAIMER: The documentation is not guaranteed to be in sync with -the code at all times. If in doubt, check the table definitions -and User.php.) - -Database fields: - - user_id - Unique integer identifier; primary key. Sent to user in - cookie "{$wgDBname}UserID". - - user_name - Text of full user name; title of "user:" page. Displayed - on change lists, etc. Sent to user as cookie "{$wgDBname}UserName". - Note that user names can contain spaces, while these are - converted to underscores in page titles. - - user_rights - Comma-separated list of rights. Right now, only "sysop", - "developer", "bureaucrat", and "bot" have meaning. - - user_password - Salted md5 hash of md5-hashed user login password. If user option to - remember password is set, an md5 password hash is stored in cookie - "{$wgDBname}UserPassword". The original password and the hashed password - can be compared to the salted-hashed-hashed password. - - user_newpassword - Hash for randomly generated password sent on 'send me a new password'. - If a match is made on login, the new password will replace the old one. - - user_email - User's e-mail address. (Optional, used for user-to-user - e-mail and password recovery.) - - user_options - A urlencoded string of name=value pairs to set various - user options. 
- - user_touched - Timestamp updated when the user logs in, changes preferences, alters - watchlist, or when someone edits their user talk page or they clear - the new-talk field by viewing it. Used to invalidate old cached pages - from the user's browser cache. - - user_real_name - "Real name" optionally used in some metadata lists. - -The user object encapsulates all of the settings, and clients -classes use the getXXX() functions to access them. These functions -do all the work of determining whether the user is logged in, -whether the requested option can be satisfied from cookies or -whether a database query is needed. Most of the settings needed -for rendering normal pages are set in the cookie to minimize use -of the database. - -Options - The user_options field is a list of name-value pairs. The - following option names are used at various points in the system: diff --git a/docs/user.txt b/docs/user.txt new file mode 100644 index 0000000000..ec3949f08e --- /dev/null +++ b/docs/user.txt @@ -0,0 +1,63 @@ + +USER.DOC + +Documenting the Wikipedia User object. + +(DISCLAIMER: The documentation is not guaranteed to be in sync with +the code at all times. If in doubt, check the table definitions +and User.php.) + +Database fields: + + user_id + Unique integer identifier; primary key. Sent to user in + cookie "{$wgDBname}UserID". + + user_name + Text of full user name; title of "user:" page. Displayed + on change lists, etc. Sent to user as cookie "{$wgDBname}UserName". + Note that user names can contain spaces, while these are + converted to underscores in page titles. + + user_rights + Comma-separated list of rights. Right now, only "sysop", + "developer", "bureaucrat", and "bot" have meaning. + + user_password + Salted md5 hash of md5-hashed user login password. If user option to + remember password is set, an md5 password hash is stored in cookie + "{$wgDBname}UserPassword". 
The original password and the hashed password + can be compared to the salted-hashed-hashed password. + + user_newpassword + Hash for randomly generated password sent on 'send me a new password'. + If a match is made on login, the new password will replace the old one. + + user_email + User's e-mail address. (Optional, used for user-to-user + e-mail and password recovery.) + + user_options + A urlencoded string of name=value pairs to set various + user options. + + user_touched + Timestamp updated when the user logs in, changes preferences, alters + watchlist, or when someone edits their user talk page or they clear + the new-talk field by viewing it. Used to invalidate old cached pages + from the user's browser cache. + + user_real_name + "Real name" optionally used in some metadata lists. + +The user object encapsulates all of the settings, and clients +classes use the getXXX() functions to access them. These functions +do all the work of determining whether the user is logged in, +whether the requested option can be satisfied from cookies or +whether a database query is needed. Most of the settings needed +for rendering normal pages are set in the cookie to minimize use +of the database. + +Options + The user_options field is a list of name-value pairs. The + following option names are used at various points in the system: