author | Pierre Schmitz <pierre@archlinux.de> | 2013-08-12 09:28:15 +0200 |
---|---|---|
committer | Pierre Schmitz <pierre@archlinux.de> | 2013-08-12 09:28:15 +0200 |
commit | 08aa4418c30cfc18ccc69a0f0f9cb9e17be6c196 (patch) | |
tree | 577a29fb579188d16003a209ce2a2e9c5b0aa2bd /docs/contenthandler.txt | |
parent | cacc939b34e315b85e2d72997811eb6677996cc1 (diff) |
Update to MediaWiki 1.21.1
Diffstat (limited to 'docs/contenthandler.txt')
-rw-r--r-- | docs/contenthandler.txt | 184 |
1 files changed, 184 insertions, 0 deletions
diff --git a/docs/contenthandler.txt b/docs/contenthandler.txt
new file mode 100644
index 00000000..899554af
--- /dev/null
+++ b/docs/contenthandler.txt
@@ -0,0 +1,184 @@
+The ContentHandler facility adds support for arbitrary content types on wiki pages, instead of relying on wikitext
+for everything. It was introduced in MediaWiki 1.21.
+
+Each kind of content ("content model") supported by MediaWiki is identified by a unique name. The content model determines
+how a page's content is rendered, compared, stored, edited, and so on.
+
+Built-in content types are:
+
+* wikitext - wikitext, as usual
+* javascript - user provided javascript code
+* css - user provided css code
+* text - plain text
+
+In PHP, use the corresponding CONTENT_MODEL_XXX constant.
+
+A page's content model is available using the Title::getContentModel() method. A page's default model is determined by
+ContentHandler::getDefaultModelFor($title) as follows:
+
+* The global setting $wgNamespaceContentModels specifies a content model for the given namespace.
+* The hook ContentHandlerDefaultModelFor may be used to override the page's default model.
+* Pages in NS_MEDIAWIKI and NS_USER default to the JavaScript or CSS model if they end in .js or .css, respectively.
+  Pages in NS_MEDIAWIKI default to the wikitext model otherwise.
+* The hook TitleIsCssOrJsPage may be used to force a page to use the CSS or JavaScript model.
+  This is a compatibility feature. The ContentHandlerDefaultModelFor hook should be used instead if possible.
+* The hook TitleIsWikitextPage may be used to force a page to use the wikitext model.
+  This is a compatibility feature. The ContentHandlerDefaultModelFor hook should be used instead if possible.
+* Otherwise, the wikitext model is used.
+
+Note that there is currently no mechanism to convert a page from one content model to another, and there is no guarantee that
+revisions of a page will all have the same content model. Use Revision::getContentModel() to find it.
+
+
+== Architecture ==
+
+Two class hierarchies are used to provide the functionality associated with the different content models:
+
+* The Content interface (and the AbstractContent base class) defines functionality that acts on the concrete content of a page, and
+* The ContentHandler base class provides functionality specific to a content model, but not acting on concrete content.
+
+The most important function of ContentHandler is to act as a factory for the appropriate implementation of Content. These
+Content objects are to be used by MediaWiki everywhere, instead of passing page content around as text. All manipulation
+and analysis of page content must be done via the appropriate methods of the Content object.
+
+For each content model, a subclass of ContentHandler has to be registered with $wgContentHandlers. The ContentHandler
+object for a given content model can be obtained using ContentHandler::getForModelID( $id ). Also, Title, WikiPage and
+Revision now have getContentHandler() methods for convenience.
+
+ContentHandler objects are singletons that provide functionality specific to the content type, but not directly acting
+on the content of some page. ContentHandler::makeEmptyContent() and ContentHandler::unserializeContent() can be used to
+create a Content object of the appropriate type. However, it is recommended to instead use WikiPage::getContent() or
+Revision::getContent() to get a page's content as a Content object. These two methods should be the ONLY way in which
+page content is accessed.
+
+Another important function of ContentHandler objects is to define custom action handlers for a content model; see
+ContentHandler::getActionOverrides(). This is similar to what WikiPage::getActionOverrides() was already doing.
+
+
+== Serialization ==
+
+With the ContentHandler facility, page content no longer has to be text based. Objects implementing the Content interface
+are used to represent and handle the content internally. For storage and data exchange, each content model supports
+at least one serialization format via ContentHandler::serializeContent( $content ). The list of supported formats for
+a given content model can be accessed using ContentHandler::getSupportedFormats().
+
+Content serialization formats are identified using MIME-type-like strings. The following formats are built in:
+
+* text/x-wiki - wikitext
+* text/javascript - for js pages
+* text/css - for css pages
+* text/plain - for future use, e.g. with plain text messages.
+* text/html - for future use, e.g. with plain html messages.
+* application/vnd.php.serialized - for future use with the api and for extensions
+* application/json - for future use with the api, and for use by extensions
+* application/xml - for future use with the api, and for use by extensions
+
+In PHP, use the corresponding CONTENT_FORMAT_XXX constant.
+
+Note that when using the API to access page content, especially action=edit, action=parse and action=query&prop=revisions,
+the model and format of the content should always be handled explicitly. Without that information, interpretation of
+the provided content is not reliable. The same applies to XML dumps generated via maintenance/dumpBackup.php or
+Special:Export.
+
+Also note that the API will provide encapsulated, serialized content - so if the API was called with format=json, and
+contentformat is also json (or rather, application/json), the page content is represented as a string containing an
+escaped json structure. Extensions that use JSON to serialize some types of page content may provide specialized API
+modules that allow access to that content in a more natural form.
+
+
+== Compatibility ==
+
+The ContentHandler facility is introduced in a way that should allow all existing code to keep functioning at least
+for pages that contain wikitext or other text based content. However, a number of functions and hooks have been
+deprecated in favor of new versions that are aware of the page's content model, and will now generate warnings when
+used.
+
+Most importantly, the following functions have been deprecated:
+
+* Revision::getText() and Revision::getRawText() are deprecated in favor of Revision::getContent()
+* WikiPage::getText() is deprecated in favor of WikiPage::getContent()
+
+Also, the old Article::getContent() (which returns text) is superseded by Article::getContentObject(). However, both
+methods should be avoided since they do not provide clean access to the page's actual content. For instance, they may
+return a system message for non-existing pages. Use WikiPage::getContent() instead.
+
+Code that relies on a textual representation of the page content should eventually be rewritten. However,
+ContentHandler::getContentText() provides a stop-gap that can be used to get text for a page. Its behavior is controlled
+by $wgContentHandlerTextFallback; by default it will return the text for text based content, and null for any other
+content.
+
+For rendering page content, Content::getParserOutput() should be used instead of accessing the parser directly.
+ContentHandler::makeParserOptions() can be used to construct appropriate options.
+
+
+Besides some functions, some hooks have also been replaced by new versions (see hooks.txt for details).
+These hooks will now trigger a warning when used:
+
+* ArticleAfterFetchContent was replaced by ArticleAfterFetchContentObject
+* ArticleInsertComplete was replaced by PageContentInsertComplete
+* ArticleSave was replaced by PageContentSave
+* ArticleSaveComplete was replaced by PageContentSaveComplete
+* ArticleViewCustom was replaced by ArticleContentViewCustom (also consider a custom implementation of the view action)
+* EditFilterMerged was replaced by EditFilterMergedContent
+* EditPageGetDiffText was replaced by EditPageGetDiffContent
+* EditPageGetPreviewText was replaced by EditPageGetPreviewContent
+* ShowRawCssJs was deprecated in favor of custom rendering implemented in the respective ContentHandler object.
+
+
+== Database Storage ==
+
+Page content is stored in the database using the same mechanism as before. Non-text content is serialized first. The
+appropriate serialization and deserialization is handled by the Revision class.
+
+Each revision's content model and serialization format are stored in the revision table (or in the archive table, if
+the revision was deleted). The page's (current) content model (that is, the content model of the latest revision) is also
+stored in the page table.
+
+Note however that the content model and format are only stored if they differ from the page's default, as determined by
+ContentHandler::getDefaultModelFor( $title ). The default values are represented as NULL in the database, to save
+space.
+
+Storage of content model and format can be disabled altogether by setting $wgContentHandlerUseDB = false. In that case,
+the page's default model (and the model's default format) will be used everywhere. Attempts to store a revision of a page
+using a model or format different from the default will result in an error.
+
+
+== Globals ==
+
+There are some new globals that can be used to control the behavior of the ContentHandler facility:
+
+* $wgContentHandlers associates content model IDs with the names of the appropriate ContentHandler subclasses.
+
+* $wgNamespaceContentModels maps namespace IDs to a content model that should be the default for that namespace.
+
+* $wgContentHandlerUseDB determines whether each revision's content model should be stored in the database.
+  Default is true.
+
+* $wgContentHandlerTextFallback determines how the compatibility method ContentHandler::getContentText() will behave for
+  non-text content:
+    'ignore' causes null to be returned for non-text content (default).
+    'serialize' causes the serialized form of any non-text content to be returned (scary).
+    'fail' causes an exception to be thrown for non-text content (strict).
+
+
+== Caveats ==
+
+There are some changes in behavior that might be surprising to users:
+
+* JavaScript and CSS pages are no longer parsed as wikitext (though pre-save transform is still applied). Most
+importantly, this means that links, including categorization links, contained in the code will not work.
+
+* With $wgContentHandlerUseDB = false, pages cannot be moved in a way that would change the
+default model. E.g. [[MediaWiki:foo.js]] cannot be moved to [[MediaWiki:foo bar]], but can still be moved to
+[[User:John/foo.js]]. Also, in this mode, changing the default content model for a page (e.g. by changing
+$wgNamespaceContentModels) may cause it to become inaccessible.
+
+* action=edit will fail for pages with non-text content, unless the respective ContentHandler implementation has
+provided a specialized handler for the edit action. This is true for the API as well.
+
+* action=raw will fail for all non-text content. This seems better than serving content in other formats to an
+unsuspecting recipient. This will also cause client-side diffs to fail.
+
+* File pages provide their own action overrides that do not combine gracefully with any custom handlers defined by a
+ContentHandler. If for example a File page used a content model with a custom revert action, this would be overridden by
+WikiFilePage's handler for the revert action.
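
As an illustration of the globals described in the added docs/contenthandler.txt, a LocalSettings.php fragment might combine them roughly as follows. This is a minimal sketch and not part of the commit: the model ID 'my-model' and the class name MyModelContentHandler are hypothetical, and NS_PROJECT is chosen only as an example namespace.

```php
// LocalSettings.php fragment (illustrative sketch only, not taken from the commit).

// Hypothetical: register a handler class for a custom content model.
// 'my-model' and MyModelContentHandler are made-up names for illustration.
$wgContentHandlers['my-model'] = 'MyModelContentHandler';

// Make plain text the default content model for pages in the project namespace.
$wgNamespaceContentModels[NS_PROJECT] = CONTENT_MODEL_TEXT;

// Store each revision's content model and format in the database
// (this is the documented default).
$wgContentHandlerUseDB = true;

// Strict mode: ContentHandler::getContentText() throws an exception for
// non-text content instead of returning null (the 'ignore' default).
$wgContentHandlerTextFallback = 'fail';
```

Choosing 'fail' over the default 'ignore' is a design choice that surfaces legacy code still relying on text access; a wiki that must keep older extensions working would probably leave the default in place.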