Revision as of 04:12, 21 January 2006
There's a lot of talk about this 'policy' which attempts to divine meaning from things other people have said rather than just asking for details.
Complicated templates-within-templates generally ought to be thought twice about before being used, because they can be confusing and fragile. There are some good notes about that on this page; please don't go all willy-nilly with illegible code just because it sort-of works.
There are other notes on this page about server performance which are not necessarily clear or well-supported. In particular, there's no known evidence that moderate usage of meta-templates has any noticeable impact on server performance.
While there are potential issues with cache invalidation, that's a separate issue which can be separately solved -- and is little better with "regular" templates.
I'd like to ask that anyone fighting against ugly, fragile meta templates at this time do so based on their ugliness and fragility. Please don't go around claiming "the developers" laid down the law and said nobody can use meta-templates because they hurt the servers; that just isn't true.
--Brion 03:25, 21 January 2006 (UTC)
Template messages allow certain standard text to be included on many pages, usually with the idea that any future changes to that text block can be made in one place. "Meta-templates" as used in this article are those that are created and used to keep other templates in a standard format. They are extremely useful and convenient for many purposes.
However, note that Wikimedia developers agree that templates within templates can be severely problematic to server performance and that, as such, they should be avoided for technical reasons.
Wikimedia database developer Jamesday has stated that they create a noticeable performance problem in general. However, Brion (lead developer and Wikimedia's Chief Technical Officer) has recently stated on the Village pump: '"Policy" shouldn't really concern itself with server load except in the most extreme of cases; keeping things tuned to provide what the user base needs is our job' and 'If someone's advising against them because they "increase server load", ask them for their benchmarks. I'd love to see them.'
Use of meta-templates
The use of meta-templates should be avoided in almost all circumstances. If you are considering using a meta-template, ask these questions:
- Is the end product essential to Wikipedia, or is it a primarily decorative feature? Meta-templates that are not essential should be avoided.
- Is the template likely to be high-profile? High-profile templates cause more server load, and so are less appropriate for meta-templates.
- Is the desired effect only achievable through a meta template, or can a template of basically the same appearance be made without them? If the same effect can be achieved differently, even if it is more difficult, a meta-template should be avoided.
You should not use meta-templates unless you have a really good reason to.
Harmful effects
The impacts of meta-templates include not only direct server load effects but also indirect effects, such as creating vandalism vulnerabilities.
Server load
The developers have noted that meta-templates increase the number of server calls, and have a noticeable negative effect on server load. When a meta-template is edited, many pages need to be updated in a single instance, rather than spreading out the changes over a longer period of time by editing each template manually. In some cases, editing meta-templates has caused enough server stress to temporarily lock the database.
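To make the "many pages need to be updated" claim concrete, here is a minimal sketch of how a change propagates through nested templates. It is not MediaWiki code: the `uses` map, the page and template names, and the `pages_to_touch` helper are all hypothetical, and the model assumes that every page which directly or indirectly includes the changed template must be marked as changed.

```python
from collections import defaultdict, deque

def pages_to_touch(uses, changed):
    """Return every page or template that directly or indirectly includes `changed`."""
    # Invert the "uses" map: template -> set of things that include it.
    used_by = defaultdict(set)
    for page, included in uses.items():
        for name in included:
            used_by[name].add(page)

    # Breadth-first walk upwards from the changed template.
    touched = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in used_by[current]:
            if parent not in touched:
                touched.add(parent)
                queue.append(parent)
    return touched

# Hypothetical wiki: two daughter templates built on one meta-template.
uses = {
    "Template:Physics-stub": ["Template:Metastub"],
    "Template:Chem-stub": ["Template:Metastub"],
    "Quark": ["Template:Physics-stub"],
    "Lepton": ["Template:Physics-stub"],
    "Benzene": ["Template:Chem-stub"],
}

touched_by_meta = pages_to_touch(uses, "Template:Metastub")
touched_by_daughter = pages_to_touch(uses, "Template:Chem-stub")
```

In this toy model, editing the meta-template touches both daughter templates and all three articles, while editing one daughter template touches only the article that uses it.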
Vandalism
Meta-templates, which would be featured on a very high percentage of pages, are an excellent denial-of-service attack vector, since changing one, or any component used in it, would flush a substantial percentage of the site caches, which are critical to site performance and normally serve some 75–80% of all hits. Even one subtle change, like the addition of a space, causes the effect. This has led administrators to protect some of the highest-risk meta-templates.
Complexity
Some meta-templating schemes are so complex that they are prohibitively difficult for the average editor to grasp. As a result, routine maintenance and changes are often delayed or not done at all, and improper usage can proliferate. The solution, meant to become an easy replacement, becomes more difficult than the function it was meant to improve.
How a page is built and cached
Here's some technical background which may be of use. Jamesday 07:52, 13 Feb 2005 (UTC) and clarified by LarryLACa 00:22, 11 November 2005 (UTC) (clarifications shown in [ ]'s).
[Re: Rendering impact: The first time a page is viewed, i.e. after it has been removed from caches, the page request causes several steps:]
- Each item in the base page portion is requested from the database (images and CSS aren't in the main part). The page you edit, each template, each template included in the template and so on. Two templates, two database records to be retrieved. One template on its own, one read, one template including another, two. Plus the one for the base page.
- Once that and the rest of what is called the parsing is done, the page is saved in the parser cache. That's kept in RAM in memcached.
- Finally the skin is applied and the page is passed on to the Squids, which cache it in RAM and on disk (to get larger capacity but at slower access time) for all who aren't logged in (will only be useful if it's the normal skin) and send it on to the person who originally requested it.
- Whenever any part of a page is changed, be it the page itself or a template or image used in it:
- The page is marked as changed ("touched") and will be regenerated next time it is requested. Both the Squid and parser caches have it removed. Necessary so people see the correct version.
- The marking process involves a database update to every affected page, which for many pages can produce "replication lag", with outdated information displayed to those using the page. This (the marking process) also shows up as slower response times when the lag is less than ten seconds. The effect is minimised by affecting small numbers of pages and maximised by affecting a large number, in part because the wait of up to ten seconds makes batches in the few thousand range effectively invisible except for delay in page load times. Touching about 18,000 pages currently takes a database slave with 4GB of RAM, 6 drive RAID 10 array and write caching disk controller some 90 seconds (that's from a real touch operation).
- Now suppose the same change is split into 8 separate edits. Assuming even distribution across the 18,000 pages, each of the 8 edits would flush one eighth of the pages from cache. On the database side, let's look at that 90 second case:
- 90/8 = 11.25 seconds, call it 12.
- Those who view in the first 2 seconds will wait ten seconds then see out of date information.
- Those who view in the remaining ten seconds will see delay of up to ten seconds and then completely current data.
- So, splitting it has removed most of the visible lag and visible problem.
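The arithmetic in the split example above can be written out as a short sketch. The 90 seconds, 18,000 pages, 8 edits and ten-second wait come from the text; the assumptions are that touch time scales linearly with the number of pages and that the same ten-second rule applies to the unsplit case. The variable names are illustrative only.

```python
import math

TOUCH_SECONDS = 90   # measured time to touch about 18,000 pages (from the text)
PAGES = 18_000
EDITS = 8            # number of separate edits the change is split into
LAG_LIMIT = 10       # seconds a reader waits before being shown stale data

# Assumption: touch time scales linearly with the number of pages touched.
pages_per_edit = PAGES / EDITS              # 2250.0
seconds_per_edit = TOUCH_SECONDS / EDITS    # 11.25
rounded_seconds = math.ceil(seconds_per_edit)  # "call it 12"

# Readers arriving while more than LAG_LIMIT seconds of work remain
# wait the full ten seconds and then see out-of-date information.
stale_window_split = rounded_seconds - LAG_LIMIT    # 2 seconds per edit
stale_window_unsplit = TOUCH_SECONDS - LAG_LIMIT    # 80 seconds (assumed same rule)

print(f"{pages_per_edit:.0f} pages and about {rounded_seconds} s per edit")
print(f"stale window: {stale_window_split} s per split edit, "
      f"{stale_window_unsplit} s for the single large edit")
```

Under these assumptions, each split edit leaves only about two seconds in which readers are shown stale data, against roughly 80 seconds for the single large operation.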
The Squids, because of the limitations in the way they can work, with much less work per page, are inherently the fastest way of serving the pages and just 4 machines can serve some 75% of all hits to the site. But they are restricted in what they can serve. Next step is using the parser cache via the apache web servers. That allows all of the user settings for logged in people but uses more web server CPU so it's much less efficient.
We could switch everyone to using the apaches but that would be far less efficient and would require something like 4-5 times as many apaches and database servers as we have today, far more than the 4 machines gained by not using them as squids. And the page views would be slower, because it's an inherently slower process.
While every template use causes an extra database read, and every template change flushes all pages using it, meta-templates are a special case because they use twice the number of database queries and can cause flushing of many more pages than other templates. If a meta-template is used in only a few or perhaps a few dozen templates which are fundamentally unrelated, it's debatable whether the extra equipment costs are worth it, compared to the relatively modest work involved in updating the individual templates. The replication lag issue can't so easily be addressed; that work is done by however many database servers are purchased. We're trying to reduce the effect but updates affecting many pages are inherently more problematic in this area than those affecting fewer.
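The read counts described above (one read for the base page, one per template, one more for each template nested inside it) can be sketched as a small recursive count. This follows the simplified model in the text, not MediaWiki's actual loader; the `includes` map, the page names and the `count_reads` helper are hypothetical, and the sketch assumes no read is shared or cached between inclusions.

```python
def count_reads(title, includes):
    """Count database reads needed to build `title` on a cache miss.

    One read for the item itself, plus the reads for everything it
    includes, recursively. Assumes the inclusion graph has no cycles.
    """
    return 1 + sum(count_reads(name, includes) for name in includes.get(title, []))

# Hypothetical pages: one uses a plain template, the other a daughter
# template that is itself built on a meta-template.
includes = {
    "Article A": ["Template:Plain"],
    "Article B": ["Template:Daughter"],
    "Template:Daughter": ["Template:Meta"],
}

reads_plain = count_reads("Article A", includes)    # base page + 1 template
reads_nested = count_reads("Article B", includes)   # base page + 2 templates
```

In this model the template-related reads double from one to two when the template is built on a meta-template, which matches the "twice the number of database queries" figure in the text.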
Alternatives
- MediaWiki needs developers. Metatemplates that did not cause the server hit would be of great benefit to the project, if someone can work out how to implement them in such a fashion. m:How to become a MediaWiki hacker
- Please read "The touching process involves a database update to every affected page". There are changes in 1.5 which will help. How much is currently not known. Jamesday 14:55, 6 August 2005 (UTC)
- Design, document, and implement — To give an example in the case of Wikipedia:Sister projects, a proposal was made to use a meta-template. In this area, it is much better to decide on a common look and format, document that standard for easy reference in case new related templates are needed, and then implement it across the few templates being used. When changes are needed, this gives one central place for discussion and revision. After there is consensus for a change, interested editors can quickly apply it. Creating a page which displays each template also helps to locate templates which don't follow the agreed-upon format.
- Make use of CSS — Some meta-templates serve only to produce a specific visual format — such as size, position, color. If these were identified, CSS classes could be added to the site's global stylesheets. The meta-templates could then be replaced with the CSS classes in relevant pages. This would accomplish the same purpose — maintaining uniform style across the site — without placing a burden on the server. This also allows the visual style to vary depending on the skin or user agent.
- This would be very useful. I've previously discussed difficulties with using CSS, notably lack of project and page-specific CSS and the inability to use the span tag, which is part of the blocked HTML set. Jamesday 14:55, 6 August 2005 (UTC)
- Hmm?? Span tags were added to the whitelist in December 2004. — Omegatron 01:47, 17 December 2005 (UTC)
- Use lists, not templates or categories — Some templates and categories are used with the goal of helping editors find articles under a specific topic area of interest. For this use, it is often more valuable to create a list, which can be annotated and prioritized. Many WikiProjects already maintain an area for reporting articles that need work. (Note that categories also heavily load the servers, for similar reasons.) See Wikipedia:Categories, lists, and series boxes.
- Use {{subst: }} when creating the daughter templates. This copies the text message to the daughter template (substitution) rather than causing a transclusion, thus eliminating the second template call. (See the descriptions of the use of subst: here and here.) Of course, this means that when the meta-template is changed, the daughter templates won't update, and won't be tracked by the What links here feature, which probably defeats the point of using a meta-template in the first place: If all of the daughter templates are to be updated, this would then have to be done by hand, expending a considerable amount of time, work and bandwidth. It's very unclear that this is better than just listing the templates and documenting the format in use on a project page for easy copy-and-pasting — see "Design, document and implement", above.
- You can also do this on the end user pages. Where it is unimportant that the latest version is displayed this is a useful approach. Having the very latest version of say a stub message on a page, when that status will change only if the page is edited and the message can then be updated, doesn't seem to matter very much, since the message that it is a stub is the important part. Jamesday 14:55, 6 August 2005 (UTC)
- Actually, using subst with stub templates is a bad idea; see the discussion at Wikipedia talk:subst (archived here). Since changes to the stub categories can require revisiting all articles currently tagged with a particular stub, and for several other reasons, these should not be placed with subst. However, subst can be and is used to generate new stub templates from the metastub template. DES 01:53, 17 December 2005 (UTC)
- Protection — Meta-templates that are used in many instances but rarely changed can be protected, so that they can only be edited by administrators. This prevents vandalism and reduces the server load problem; if the meta-template is never changed, the daughter templates don't need to be updated. Note that this creates a permanently protected page, which is also to be avoided.
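The difference between substitution and transclusion described in the subst: item above can be illustrated with a toy model. This is not MediaWiki's parser: the `templates` dictionary, the `save` and `render` helpers, and the template text are all hypothetical, and the model only handles parameterless templates with simple names.

```python
import re

# Hypothetical template store: name -> current text.
templates = {"metastub": "This article is a stub."}

def save(text):
    """On save, replace each {{subst:name}} with the template's current text."""
    return re.sub(r"\{\{subst:(\w+)\}\}", lambda m: templates[m.group(1)], text)

def render(text):
    """On view, expand each remaining {{name}} from the template's current text."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: templates[m.group(1)], text)

substituted = save("{{subst:metastub}}")   # stored as a plain copy of the text
transcluded = save("{{metastub}}")         # stored as a reference to the template

# The meta-template is edited after both pages were saved.
templates["metastub"] = "This article is a short stub."

view_substituted = render(substituted)   # still shows the old copy
view_transcluded = render(transcluded)   # follows the edit
```

The substituted page needs no template read when viewed and is unaffected by later edits to the template, which is exactly why it also stops tracking the template's changes.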