Misplaced Pages talk:Requests for adminship/ProtectionBot: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 20:36, 7 January 2007 by Royalguard11 (Someone please explain to me...: comment). Latest revision as of 20:23, 1 October 2024 by Legobot (Bot: Fixing lint errors, replacing obsolete HTML tags: <tt> (1x)).
(201 intermediate revisions by 43 users not shown)
:Not sure, I have read it and it seems to be safe releasing the source. ]<small> <sup>(Need help? ])</sup></small> 18:26, 7 January 2007 (UTC)


If Dragons flight released the source, I would withdraw my opposition. My only significant beef is the needless secrecy. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 19:05, 7 January 2007 (UTC)


Dragons flight has stated (see comment under Oppose #1), "The code has been released to trusted members of the community for review, but it will not be made public. I feel the risk of people adapting certain functions to create powerful vandalbots is too great." Perhaps other users who have seen and reviewed the code can comment on this issue. This seems a plausible concern to me but an even bigger concern to me is that releasing the code would allow the vandals to try to reverse-engineer ways around it (compare ]). ] 19:10, 7 January 2007 (UTC)
: There is no WP:BEANS here. This is nothing that couldn't be done with the freely and openly available pywikipedia framework. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 19:21, 7 January 2007 (UTC)


::I agree, the pywikipedia framework, the Perl wikimedia module, or just plain HTML scripting can get the same results. The functions this bot performs are not difficult to reproduce. What's more, the code would not be able to perform admin functions on a non-admin account anyway, so it is really just the recursive unprotected template/image finder. If the bot is functioning, then this list of unprotected pages will not be a threat. I read the source, I see no reason to keep it a secret, but I respect the author's right to do so. ]<small> <sup>(Need help? ])</sup></small> 19:28, 7 January 2007 (UTC)


:::Earlier today, I was thinking the same thing as you, HighinBC, but I've realised the potential issue with releasing the code. I'm going to break WP:BEANS here (on the understanding that the code won't be released), in order to enlighten everyone. The simple matter is that the bot code could be changed to automatically vandalise every unprotected page, perhaps before the bot would be able to protect, and cause the vandalised page to be protected. This is a very serious possibility, allowing vandals to easily impose mass vandalism (especially image vandalism). If anyone thinks that this comment is severely WP:BEANS, blank it. <strong>]]</strong> 20:03, 7 January 2007 (UTC)


*This could be easily done with ANY bot framework - or - so where's the specific risk? Please clarify. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 20:21, 7 January 2007 (UTC)
**We get it. The point is that we want to make it as hard as possible for people to do that. Would you like Tawker to release the source for AVB too? That would be incredibly stupid too. -]<small>(]·]·])</small> 20:36, 7 January 2007 (UTC)
***Peter, I'm sure that perlwikipedia doesn't allow you to find all unprotected pages/files linked from one, does it? <strong>]]</strong> 20:41, 7 January 2007 (UTC)
****Actually, yeah, it does, thanks to the lovely patch the devs made to the transclusion list code. ] ] 22:11, 7 January 2007 (UTC)
****I can get every contribution an editor's made, every edit to an article by '''x''' users - getting all transclusions is trivial, since it's just an api.php hack. I appreciate the security concerns, but I feel they are unwarranted. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 20:53, 7 January 2007 (UTC)
*****You can get the list of pages transcluded on one page using api.php?! Wow - I didn't know that (though I do use api.php a lot for my bots, I tend to stick to the same queries). Can you give me a link to show this (just out of interest)? <strong>]]</strong> 21:30, 7 January 2007 (UTC)
**That is a really simple sub-routine to make for anyone capable of editing existing code. ]<small> <sup>(Need help? ])</sup></small> 20:42, 7 January 2007 (UTC)
***I'm fairly certain that I could write a decent vandalbot in under 5 minutes with perlwikipedia, it's not like this sort of thing requires a rocket scientist <nowiki></cliche></nowiki>. Any fifth grader with a decent knowledge of Perl and a copy of the WWW::Mechanize module can write one. ] ] 22:11, 7 January 2007 (UTC)
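As a rough illustration of the read-only api.php lookup discussed in this thread, the sketch below builds a query for the templates transcluded on one page and reads the titles out of a response. The endpoint URL, the helper names, and the sample response are assumptions for illustration; the parameters (`action=query`, `prop=templates`, `tllimit`) follow the MediaWiki API as currently documented, which may differ from what was available in January 2007. Nothing here contacts a server.

```python
from urllib.parse import urlencode

# Assumed endpoint; any MediaWiki installation exposes its own api.php.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def transclusion_query_url(title, limit=500):
    """Build a read-only query URL listing templates transcluded on `title`."""
    params = {
        "action": "query",
        "prop": "templates",
        "titles": title,
        "tllimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def titles_from_response(response):
    """Collect transcluded page titles from a decoded JSON response."""
    found = []
    for page in response.get("query", {}).get("pages", {}).values():
        for entry in page.get("templates", []):
            found.append(entry["title"])
    return sorted(found)


# Hypothetical response, shaped like the default JSON output of the query above.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "1": {
                "title": "Example page",
                "templates": [
                    {"ns": 10, "title": "Template:Example B"},
                    {"ns": 10, "title": "Template:Example A"},
                ],
            }
        }
    }
}

if __name__ == "__main__":
    print(transclusion_query_url("Main Page"))
    print(titles_from_response(SAMPLE_RESPONSE))
```

The point of the sketch is only that the listing is a single documented query, which is the claim made above; it says nothing about how ProtectionBot itself gathers its list.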

One thing that just came to my mind - Dragons flight noted on the ] that the bot would run at random times etc. to prevent the vandals from predicting its execution and racing to vandalise. I haven't seen the code yet, but this feature (or something similar) may well be the reason that the release of the source code would violate ]. ]] 22:01, 7 January 2007 (UTC)
*It isn't all that hard to design a RNG algorithm such that determining the times from it without having direct access is too hard to be plausible. Video games have managed that for a while, I think that a bot can. -] <small><sup>]</sup><sub>]</sub></small> 22:06, 7 January 2007 (UTC)
**I agree with Amarkov here ... any even somewhat decent implementation of a RNG would not allow anyone to predict its random numbers, even with access to the source. Besides, even if it was a simple timestamp RNG, on-wiki actions are only reported to the nearest second, whereas the script would be using a more fine-grained time seed than that. So there really would be no way to try to predict when it would run again. --] 22:19, 7 January 2007 (UTC)
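To make the point about unpredictable run times concrete, here is a minimal sketch of picking the next delay from the operating system's entropy source rather than from a timestamp-seeded generator. The function name and the default bounds are hypothetical; nothing is known here about how ProtectionBot actually schedules itself.

```python
import secrets


def next_delay_seconds(min_seconds=60, max_seconds=900):
    """Return a delay in [min_seconds, max_seconds] drawn from the OS CSPRNG.

    Because the value comes from the operating system's entropy pool, reading
    this source code does not help anyone predict the next run time.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("need 0 <= min_seconds <= max_seconds")
    span = max_seconds - min_seconds
    # randbelow(n) returns an integer in [0, n), so add 1 to include the maximum.
    return min_seconds + secrets.randbelow(span + 1)


if __name__ == "__main__":
    print(next_delay_seconds())
```

With a generator like this, publishing the source reveals the range of possible delays but not the delays themselves.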

*I don't really think my point got across. If you wanted to build a vandalbot, you could do it from this, grabbing the unprotected pages... or you could just remove the checkpage requirement from AWB, set it on auto mode, and vandalize away. Much easier than introducing editing functionality to a bot that doesn't have it, and plus, as long as you have a user and user talk page, and are careful not to remove or add too much stuff, it won't look any more suspicious than any other AWB fix. While removing the checkpage requirement isn't a trivial matter, anyone who could turn this bot into a vandalbot could manage it. -] <small><sup>]</sup><sub>]</sub></small> 22:32, 7 January 2007 (UTC)

::So far as I can tell, the code of ] isn't public (at least I couldn't find it)... why is no one freaking out about that? It's a much more complicated bot that can make edits to every page on Misplaced Pages. It makes more edits in a '''day''' than the protection bot will in an entire '''year'''. If it 'went berserk' it could require vastly more work to clean up than the proposed protection bot ever would. In short, all the concerns expressed about 'protection bot' are vastly more applicable to 'antivandal bot'... yet the code is not public and no one seems to mind. Why do you suppose that is? Why do you suppose that 'auto wiki browser' isn't just given out to anyone who wants it? My own theory is that most people realize that 'making smarter vandals' is a bad idea. Yes, a vandal ''could'' build their own version of 'anti vandal bot' that instead ''creates'' vandalism... some have. But most of them aren't 'dedicated' enough to figure out the hows of it and eventually go away. Does it really make sense to HAND those people a ready made vandalism tool that just requires a few tweaks to create a massive mess? That's what making 'protection bot' or 'anti vandal bot' code publicly available would do... give general vandals the ability to do ''a lot'' more damage. We can handle the few vandals who are capable of building their own bots. Let's not give ''every'' vandal the ability to make bot attacks. --] 23:01, 7 January 2007 (UTC)

:::AntiVandalBot obviously does have a vandalism problem. It can edit anything already, it can do it fast, and it requires no human intervention. This bot can only edit images and templates, and even then only to add or remove three specific things, so it would take loads more work to convert it into a useful vandalbot. And as I've reiterated a lot already, we already have the full source of AWB, which would be much easier to convert to a vandalbot. (It wouldn't even be conversion, really). -] <small><sup>]</sup><sub>]</sub></small> 23:07, 7 January 2007 (UTC)
::::Alternatively, you can use my ] to write a vandalbot. I just wrote a dirt simple, proof-of-concept one with the framework, 24 lines of code, that uses threading and multiple usernames. Elapsed time: 4 minutes. Just because the bot is open-source doesn't make it an automatic target for vandals trying to create vandalbots. It would probably be harder to convert ProtectionBot into a vandalbot than it would be to write one from scratch using pywikipedia. ] ] 23:31, 7 January 2007 (UTC)

::I agree, bot making is not some secret, anyone can learn it and use existing frameworks. ]<small> <sup>(Need help? ])</sup></small> 23:32, 7 January 2007 (UTC)

:::Absolutely agree. Not that I don't trust HighInBC, but I believe strongly in trust-but-verify. I already know pretty well how Antivandalbot works just by having seen what types of things it's done, and it would ''not'' be hard to write vandalbots from what's already out there. Our anti-vandalism techniques need to be just as open, so that when the vandalbot runners find a way around them (and you believe me, they will), we can respond quickly and improve our own techniques (and perhaps find weaknesses ''before'' they're exploited). Security through obscurity isn't - and if this bot's code is too insecure to post, it's too insecure, period, let alone to trust with an admin flag. ] 00:04, 8 January 2007 (UTC)
:::::This is one of the best comments I've seen for the release of the source code. Just posting this to highlight it... ] <sup>]</sup> 16:08, 9 January 2007 (UTC)
: To echo some comments from other editors that I think are most worthy of consideration: nothing this bot could do is difficult or uniquely complex, there's no good reason not to publish, and publication would facilitate bug discovery and resolution. The bot could be blocked if it ever caused problems. It should also be possible to distribute a version of this bot set to run in semi-automatic attended mode, which would enable the work to be done efficiently without the risk that comes with a fully automatic bot, of being fooled by cleverly written malware or mischievous humans. --] 07:12, 8 January 2007 (UTC)

I believe many people here are grossly underestimating how little modification to the source it would take to turn a bot that looks for vulnerabilities in order to protect them, into a bot that looks for vulnerabilities in order to vandalize them. Changing fewer than 5 lines would turn this into effective malware. Changing a few more than that would be enough to let it rampage all over the place. If you are unwilling to accept this as private source, then by all means kill it, but I have no intention of making the source public. ] 07:34, 8 January 2007 (UTC)

:I said this in my oppose !vote - if this bot code, through error or malice, is that dangerous (and if danger exists, either error or malice can lead into it), and would be that dangerous if a ''non-admin'' had possession of it, it is more, not less, critical that the code be open to continuous review - not just now but during its operation. It's not like we've never seen a vandalbot, but if this code is suddenly released we'll have a flood of them. (Please note - you certainly have the ''right'' to keep your code secret, but even if most seem alright with that, I think it's a bad idea and will in the end ''decrease'' the effectiveness of the response against vandalism. And for myself, I can't support it without seeing it.) ] 07:50, 8 January 2007 (UTC)

:Um... with AWB, just removing checkpage functionality, which should be much easier, leaves you with a pretty effective vandalbot. You won't be guaranteed to hit the unprotected things transcluded on main page articles, no, but you could just vandalise the pages themselves, and I do not see how that's worse. -] <small><sup>]</sup><sub>]</sub></small> 15:50, 8 January 2007 (UTC)

Everyone does know that ] was written for a reason, right? FOLLOW IT! It is absolute nonsense to argue over who can write a vandal-bot faster. You've got a response above on how fast someone could turn this into a vandal-bot. There's no point in making it easier for someone to do it. Yes, people can make vandal-bots, but let's make them write them ''entirely themselves''. Let's not hand them one that's already mostly pre-written! -]<small>(]·]·])</small> 04:23, 9 January 2007 (UTC)

:You're missing my point. There ''already is one'', and it's patently obvious that removing the checkpage would leave you with a vandalbot. Thus, it is obviously not all that much of a problem. -] <small><sup>]</sup><sub>]</sub></small> 05:11, 9 January 2007 (UTC)

::Is there really anything in the code that cannot be gotten from ]? ]<small> <sup>(Need help? ])</sup></small> 14:44, 9 January 2007 (UTC)


== Current status question ==
(cross-posted to bot approval page) With the RfA now pending, is ProtectionBot currently operating during the RfA period? I hope that it is, at least on an ongoing trial basis. ] 20:21, 7 January 2007 (UTC)


:A member of the BAG ended the trial after one day and instructed DF to shut down the bot ], and DF did as he requested, so no, it's not running. - ] ] 20:30, 7 January 2007 (UTC)


===Suggest continued trial operation during RfA period===
If Dragons flight is willing I would like to see this bot continue operating on a trial basis during the RfA period, both so we have the benefit of its services during the next week and so that in the unlikely event of an issue arising the RfA !voters could consider it. Comments? ] 20:32, 7 January 2007 (UTC)
:I think BAG shut it down. In the meantime we have ], which, as stated on the RFA page, is fixed and will perform correctly. Cheers! -- ] <sup>(])</sup> 23:16, 7 January 2007 (UTC)
::Probably best to just wait, I know I am checking shadowbot2's mailings. ]<small> <sup>(Need help? ])</sup></small> 23:18, 7 January 2007 (UTC)
:::Suggestion: Might it be possible to authorise the continued running of ProtectionBot for as long as this RfA maintains a suitable level of consensus for the Bot? e.g. 80 or 85%? That would combine practicality with respect for the views of the community... ]&nbsp;<sup><small>]</small></sup> 23:39, 7 January 2007 (UTC)
:::*That's a very nice idea, and I'd commend you for lateral thinking, but I don't think it's feasible. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 23:41, 7 January 2007 (UTC)
*No need to be bureaucratic about it, this bot is useful (and the RFA has overwhelming support so far) so there's no reason why it shouldn't keep running for a few more days. ] 12:28, 8 January 2007 (UTC)
::*My point is it's ''not'' running right now. ] 23:05, 8 January 2007 (UTC)

== Buffer overflow ==

I see a few people concerned about buffer overflow exploits. My understanding is that this type of vulnerability can only be used on a bot that can be given binary input. Since this script gets all of its input from MediaWiki, which stores its data in text form, I see no way to insert such an attack. Python does not allow for run-time compiling. You cannot fool such a bot into running arbitrary code given such input restrictions, as the precompiled code needed for such an attack cannot be stored as text.

I may be wrong, so correct me if I am, but it seems a buffer overflow vulnerability is not an issue for technical reasons. ]<small> <sup>(Need help? ])</sup></small> 23:50, 7 January 2007 (UTC)

*Mostly correct. It could still malfunction on malformed input, or if the input format changes. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 23:52, 7 January 2007 (UTC)

:Malfunction on malformed input is far from executing arbitrary code, and would lead to a parsing failure. And changing input formats would exceed the approval it is seeking. ]<small> <sup>(Need help? ])</sup></small> 23:53, 7 January 2007 (UTC)

:*I know - hence "mostly correct" :) I just felt it necessary to point out that there still ways in which this kind of thing may happen, if without the severe consequences. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 00:00, 8 January 2007 (UTC)

:I see, I agree that we cannot discount the possibility of the bot being intentionally screwed with, but I think the threat of arbitrary code execution is not an issue. ]<small> <sup>(Need help? ])</sup></small> 00:04, 8 January 2007 (UTC)

:: Arbitrary code execution on the bot's machine? No. Arbitrary command execution on wikipedia? Depends on how the bot sends information back to the servers. I have not reviewed the code, nor has, to my knowledge, any Python security expert, nor can I rely on distributed error checking to review the code, and as such, you just don't know what happens if a page to be protected has the following on it -> <!--{{malicioustemplate%2%6%9deleteallcontributions}} -->. Does it load and recursively protect everything on malicioustemplate%2%6%9deleteallcontributions? Does it attempt to load and recursively protect the page deleteallcontributions, which has now resulted in the protection of the entire encyclopedia (oops!)? Does it load http://en.wikipedia.com/w/deleteallcontributions and fail? I don't know! I can think of more ways to beat the bot, but I'm just shooting in the dark. Real security audits involve reviewing the code. ] - ] 20:15, 8 January 2007 (UTC)

:::Interesting point - do you know any available python security experts who might be willing? ] 21:35, 8 January 2007 (UTC)

:::: ], to be a bit snarky but not too much. ] - ] 21:54, 8 January 2007 (UTC)

::::Yep, I'm aware of that, and I read a good chunk of the RFA - I see your point and I see Robert's. He isn't likely to change his position, and the bot fills a real need. I respect both his position and yours. I trust both his judgement and yours. I only see one of two outcomes - either the RFA fails (with the result that the main page remains vulnerable), or the bot is approved with the code secret (or semi-secret). So rather than arguing about what ''should'' be, I am wondering how to create the best outcome. ] 22:03, 8 January 2007 (UTC)
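The malformed-input question raised in this section can be illustrated with a small sketch of defensive parsing: strip HTML comments before looking for transclusions, and refuse any name containing characters a cautious bot might not want to pass back to the server. This is not ProtectionBot's parser, which has not been published. The regular expressions, the function names, and the choice to reject every `%` are assumptions, and the rejection rule is deliberately stricter than MediaWiki's own title rules.

```python
import re

# Non-greedy match so that each closed <!-- ... --> comment is removed on its own.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Capture the text between "{{" and the first "|" or "}}".
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\||\}\})")
# Conservative assumption: treat all of these as grounds for refusal.
SUSPICIOUS_CHARS = set("#<>[]%\n")


def template_names(wikitext):
    """Return names transcluded in visible wikitext, ignoring closed comments."""
    visible = COMMENT_RE.sub("", wikitext)
    return [match.group(1) for match in TEMPLATE_RE.finditer(visible)]


def is_plausible_title(name):
    """Reject empty names and names containing any suspicious character."""
    return bool(name) and not (set(name) & SUSPICIOUS_CHARS)


if __name__ == "__main__":
    sample = "<!--{{malicioustemplate%2%6%9deleteallcontributions}} -->{{POTD|x}}"
    for name in template_names(sample):
        print(name, is_plausible_title(name))
```

A sketch like this only shows one way the example above could be made harmless; whether the real bot behaves this way is exactly what a code review would have to establish.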

== Some concerns ==

While correcting a misunderstanding I wrote the following to express some of my concerns (despite supporting the RfA). Comments welcomed.

I was reading the ProtectionBot discussion, and I noticed in one of the oppose votes discussions someone said ''"remember this bot only protects images and templates on the Main Page"''. This is incorrect. The bot is also intended to protect templates being used on the featured article, the actual featured article ''page'', not the introduction to it that appears on the Main Page. Thus anyone can add a template to the featured article, vandalise the template, and sit back and watch as ProtectionBot protects the vandalised template. The good thing though, is that the featured article is (normally) freely editable, and so anyone can remove the protected vandalised template. This situation is a bit more problematic when the featured article is in a state of protection or semi-protection due to high levels of vandalism (someone always seems to protect the featured article at some point in any given day), and if the protected vandalised template is in widespread use in other articles. However, the discussions at ] may change all this. Thus the interaction of all these proposals needs to be carefully considered. Not too much change too fast. Also, no-one seems to have picked up yet on the comment I made . That can be summed up by: ] needs to be '''actively''' watched '''every''' day and a button clicked to show that someone has checked it, otherwise, as I said in that comment I linked to: ''"...the vandalism (possibly not very visible) remains undetected for a whole day, and then silently switches over on the main page, at which point all hell will break loose."'' ] 10:55, 8 January 2007 (UTC)
:Incidentally, ] is unprotected, for some reason. This should probably change. ] 11:02, 8 January 2007 (UTC)
:: Actually it was semi-protected. Now it's fully protected. Though since nothing that's directly on that page is ever included in the Main Page, I'm not entirely sure what the problem was – ] 12:33, 8 January 2007 (UTC)
:::The problem was that templates on that page are being protected by ProtectionBot, so this is a way for someone to get something protected by ProtectionBot. That something might be something that we wouldn't want to be protected and then freely added to other pages. Hence '''all''' pages scanned by ProtectionBot should be protected. Today's featured article page is a notable exception to this, and one that will need to be watched very closely. ] 12:46, 8 January 2007 (UTC)
::::Is the bot really checking ] itself rather than the relevant component templates? If so, I'm not sure that makes sense. The 'tomorrow' template isn't actually 'copied over' to the Main Page each day. When I first mocked up a 'tomorrow' version of the Main Page I used the ''then current'' formats of the ] and just added in the {{]}} template where appropriate... and someone then took that to create the 'tomorrow' page. However, since then there have been numerous small changes to the Main Page which have not necessarily been kept up on the 'tomorrow' version. That process will continue over time and eventually there may be templates and/or images on the 'tomorrow' version which are no longer used on the actual Main Page and which thus would not need to be protected. We could always 'recopy' the current Main Page formats from time to time (and would anyway), but the bot could be a bit smarter by checking the specific sub-templates which vary on the Main Page a day in advance. --] 13:19, 8 January 2007 (UTC)
:::::Good points. The bot description says: ''"In addition, it will protect the predictable elements (such as the next Picture of the Day) a day before they appear on the main page."'' - so I think you are right. What we need is for that description to be expanded so it says exactly what pages it protects in advance (probably just the TFA, SA and PoTD transcluded pages). On the other hand, the ] '''must''' show what will actually appear (and so needs to remain protected as I suggested), otherwise people watching that page might miss 'sleeper' vandalism.
:::::On another point, can we clarify terminology here? Does it make sense to distinguish between images, transclusion of pages from template namespace, and transclusion of pages from other namespaces? When people refer to templates, they can mean either pages in template namespace, or (more widely) anything that appears in the <nowiki>{{</nowiki> and <nowiki>}}</nowiki> curly brackets. ] 13:38, 8 January 2007 (UTC)
::::::I did find by the bot programmer, who said (on 30 December): ''"As described it would be looking at Main Page/Tomorrow and Tomorrow's Featured Article as well as the current ones, so predictable elements will be protected ''before'' they actually reach high profile status."'' - though possibly things have changed since then. I've asked Dragons flight to comment here. ] 13:49, 8 January 2007 (UTC)
::::::Another point. If the bot tries to predict what the 'next day' templates are, there needs to be a note that changing that system (eg. changing the format of the dates, or using different templates - as recently happened with PoTD) would confuse and probably break that part of the bot's function. But then that would break ] as well! So another note for the human oversight section below. ] 13:58, 8 January 2007 (UTC)

:::::The answer is that yes, the current implementation relies on ] to predict the upcoming content, and I apologize if that was unclear. So yes, at the present time that would need to be kept updated and potentially full-protected if it becomes a problem. One could imagine an implementation that uses Main Page alone to predict future content, but that also would have problems. At present, the rotating elements rely on three different nomenclatures "{current month name} {current day}", "{current month name} {current day}, {current year}", "{current year}-{current month number}-{current day 2 digit number}" and only 2 of the 3 are on the Main Page itself; one of the rotating elements is in a subtemplate. Trying to write something that would be robust against the variations in placement and nomenclature that people might devise in the future would represent a hard problem (and I would note that POTD has already changed twice in the last week). My present "solution" is to encourage any modifications to the main page to also maintain the day+1 state of Tomorrow. I realize this isn't really a solution, but it is something that people can do that will work predictably, as opposed to my trying to guess at potential future main page changes, which seems likely to fail. ] 14:42, 8 January 2007 (UTC)

An example is where the editor who redesigned the PotD template system updated the ] page. If this step had been forgotten, the system might have broken down. ] 14:00, 8 January 2007 (UTC)
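The three date nomenclatures that Dragons flight lists above can be written out as a short sketch, which also shows why a format change on the Main Page would silently break any prediction built on them. The function names are hypothetical, English month names are hard-coded to avoid locale effects, and the zero-padded month in the third form is an assumption (the comment only says "month number").

```python
from datetime import date, timedelta

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def date_nomenclatures(day):
    """Return the three title suffixes described in the thread for `day`."""
    month = MONTH_NAMES[day.month - 1]
    return [
        f"{month} {day.day}",                         # {month name} {day}
        f"{month} {day.day}, {day.year}",             # {month name} {day}, {year}
        f"{day.year}-{day.month:02d}-{day.day:02d}",  # {year}-{month number}-{2 digit day}
    ]


def tomorrow_nomenclatures(today):
    """Same three forms for the day after `today` (the 'day+1' state)."""
    return date_nomenclatures(today + timedelta(days=1))


if __name__ == "__main__":
    print(tomorrow_nomenclatures(date(2007, 1, 8)))
```

Any fourth format introduced by a redesign would simply not appear in this list, which is the fragility described in the comments above.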

== Please don't forget that human oversight is still needed ==

Just to avoid complacency, and to remind those saying that this bot will "deal with the problems of Main Page vandalism", a reminder that the bot will deal with some methods of vandalism, but human administrators still need to be alert to the following, which, however unlikely, will probably happen at some point in the future. I've given examples below. ] 12:43, 8 January 2007 (UTC)

===Human error===
*'''Administrators unprotecting stuff and forgetting to re-protect''' (ProtectionBot will not override another administrator). The fix is to reprotect and politely ask the administrator not to make this mistake in the future. ] 12:43, 8 January 2007 (UTC)
**''Query'' - <s>if something is protected by an administrator, will ProtectionBot still ''unprotect'' the page in question once it leaves the sensitive areas?</s> This is not good for high-risk templates that should remain protected even when off the main page. ] 12:43, 8 January 2007 (UTC)
*** No. It remembers what it's protected and only unprotects things it protected itself. This may result in things being protected for longer than they should be, but that's infinitely preferable to things being ''un''protected when they shouldn't be – ] 12:45, 8 January 2007 (UTC)
**** I agree that having some things protected for longer than they should be is better than the alternative, but one of the advantages of having ProtectionBot unprotect things, was that admins would no longer have to do this chore. Admins will need to learn that if they protect something, they can't rely on ProtectionBot to unprotect it. Probably a separate bot is needed to unprotect any selected anniversary pages that remain protected after leaving the main page. The Picture of the Day and Today's Featured Article daily templates remain protected, I believe, as a record of what that bit of that day's main page looked like. The random stuff going on and off the featured article page and the DYK and ITN templates are the admins responsibility to protect and unprotect as needed, so I am happy that the query is not a problem, and have struck it out. The human error bit remains, of course, and not a lot we can do about that. ] 12:55, 8 January 2007 (UTC)
*'''Administrators forgetting to protect something in the first place''' before adding to the main page or to that day's featured article. ProtectionBot will protect a short while later, but a small window of opportunity remains for vandalism. Administrators should not be complacent and should still remember to protect and unprotect DYK and ITN templates/images (ITN is the most common update area, other areas less so as DYK should be done through the DYK update area, though the image on the Featured article blurb sometimes gets wrangled over) and featured article templates/images that they add to the featured article or main page '''on the day''' (if added a day beforehand, or to the DYK update area, ProtectionBot will protect for you the day before, via ]). ] 13:02, 8 January 2007 (UTC)
**Possible solution - if the ITN editors feel they might still forget to protect images, then they could move to an update area like DYK and lag a day behind the news. Just for the image, maybe, and have the other ITN lines updated throughout the day. ] 13:06, 8 January 2007 (UTC)
*'''Major redesigns of the main page or its templates.''' Any major (or even minor) redesign of the main page and its various template systems may impact the operation of the bot. Tread carefully before carrying out redesigns, and drop a note off at ]. This is an argument for having the actual step-by-step processes (if not the actual code) described as fully as possible. ie. a log of what it does, like an annotated version of its protection/edit contributions list. ] 14:06, 8 January 2007 (UTC)
**It should actually be quite robust against anything that you could do (though I should never doubt the potential for people to surprise me). More troubling I think is the potential for changes to Mediawiki to break it. Relevant changes would probably be quite infrequent but are at least possible. ] 14:19, 8 January 2007 (UTC)
*'''Changes in Mediawiki''' could affect the way the bot operates. The bot programmer has said (see above): ''"More troubling I think is the potential for changes to Mediawiki to break it. Relevant changes would probably be quite infrequent but are at least possible."'' (User:Dragons flight, 08/01/2007). ] 14:58, 8 January 2007 (UTC)
**This problem actually hit Shadowbot2 recently (see ]). --] 10:15, 10 January 2007 (]]])
*'''No-one watching ] for vandalism''' that then gets frozen in place by ProtectionBot. What is needed here is a way for any admin to 'sign off' on the Tomorrow page and confirm it is not in a vandalised state, and for ProtectionBot (preferably, or possibly another bot) to squeal if such a check hasn't been performed. This could be similar to the breakdown alert system currently in place for ProtectionBot. ] 15:25, 8 January 2007 (UTC)
**It doesn't take an admin to look at and call attention to problems with that page, anyone could do it. ] 15:41, 8 January 2007 (UTC)
***The thing I had in mind was not so much calling attention to the problem, as having a box ticked to '''confirm''' that someone had checked the page. If this is not done, you can end up with everyone or no-one checking the page. By sod's law, and as people get bored doing this check, the one time no-one checks will be when the page (through one of its templates) is in a vandalised state. Everyone is away at various times, so you can't rely on a single person to carry out this single check. The reason an admin is needed to check the box (or turn a big red light green), is that if anyone can 'tick the box', then a vandal will do it. I suggest the sequence should go: (1) ProtectionBot protects all templates etc. on 'Tomorrow' at the beginning of a day. (2) An admin makes a change to a protected page (call it the checkpage) that indicates that the 'Tomorrow' page has been checked by a human, and indicates to others that this change has been done. (3) ProtectionBot checks the checkpage and if the change hasn't been made that indicates a human has checked the page, e-mails the admins on its list. (4) At the end of the day, ProtectionBot changes the checkpage back to its "unchecked" status. Put this checkpage on a ProtectionBot subpage if need be, and then transclude as a little red/green light at the top right of ]. Does this sound workable or too complicated? ] 16:38, 8 January 2007 (UTC)
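The four-step sequence proposed above can be sketched roughly as follows. This is only an illustration of the workflow, not ProtectionBot's code: the class names, the <code>admins</code> set and the alert list (standing in for the e-mail to admins) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CheckPage:
    """Hypothetical protected page recording whether a human reviewed 'Tomorrow'."""
    checked: bool = False
    checked_by: Optional[str] = None


@dataclass
class DailyCycle:
    admins: Set[str]
    checkpage: CheckPage = field(default_factory=CheckPage)
    alerts: List[str] = field(default_factory=list)  # stands in for e-mails to admins

    def sign_off(self, user: str) -> bool:
        # Step 2: the checkpage is protected, so only an admin can flip it.
        if user not in self.admins:
            return False
        self.checkpage.checked = True
        self.checkpage.checked_by = user
        return True

    def bot_verify(self) -> None:
        # Step 3: the bot raises an alert if nobody has signed off yet.
        if not self.checkpage.checked:
            self.alerts.append("Tomorrow page has not been checked by a human")

    def end_of_day(self) -> None:
        # Step 4: reset the checkpage to its "unchecked" state.
        self.checkpage = CheckPage()
```

Step 1 (protecting everything on 'Tomorrow') is left out because it is the bot's existing job. The point of the sketch is that a non-admin cannot turn the light green.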

===Bot error===
*'''Protecting a vandalised transclusion added to the featured article.''' The bot cannot check whether an image or template is in a vandalised state before it protects it. If a vandal strikes lucky and vandalises a template just before it gets protected (unlikely but possible), then an admin is required to unprotect and undo the vandalism. If no-one is watching closely, then a vandal could do this on the featured article and then remove the newly protected vandalised template or image and add it to lots of pages. The template/image in question will be unprotected by ProtectionBot after two passes, but a lot of damage could be done in this time interval. ] 12:43, 8 January 2007 (UTC)
**Such a race condition is possible (]). I don't see any way around it. However, since the vandal has to guess at when the bot will run, I'd guess that on average he would be blocked even before he succeeded at getting the timing right to protect something. Any other suggestions? ] 14:12, 8 January 2007 (UTC)
***If this one is too bean-y, please remove it. But this has been discussed elsewhere as well. I think the problem (of a malicious user indirectly using ProtectionBot to get something protected) may be resolved if the featured article and main page functions of ProtectionBot are separated. Then it becomes a question of whether ] ever gets resolved. ] 14:58, 8 January 2007 (UTC)
****Could the bot at least post a message somewhere, after the protection, if the page had a recent edit (i.e. more recent than the last time it scanned)? The bot wouldn't be able to tell if the page had been vandalised, but it would be able to call in a human who could tell. --] 17:19, 8 January 2007 (]]])
*'''Protecting a vandalised state of a rotating main page transcluded page.''' A similar example to the above is when ProtectionBot protects the rotating transcluded pages that use date parsing to queue the main page templates for the featured article and the picture of the day and selected anniversaries. This is done in advance by using ], but unless humans watch this page, vandalism may pass un-noticed here for a day until it flips over onto the Main Page. ] 12:43, 8 January 2007 (UTC)
**Yes, humans will still have to pay some attention. But looking at a single page to see if it looks right ought to be a much easier task than checking the protection state of everything. ] 14:12, 8 January 2007 (UTC)
**''Conclusion'' - cannot be detected by ProtectionBot. Requires human oversight. Reliable human checking system needs to be implemented, allowing humans to tell ProtectionBot that the page has been checked. ] 17:09, 8 January 2007 (UTC)
*<s>'''The bot may unprotect a page that should remain protected.'''</s> The bot is unable to make the necessary judgement, though it could be programmed to look at whether the page is already in a high-risk category. Merely having it return the page to the state it was in before arriving on the main page or featured article is not enough, as some pages are protected by admins beforehand, but should be unprotected once they leave the sensitive area. ] 12:43, 8 January 2007 (UTC)
**It will only unprotect pages that it has protected. If a high-risk template is added to the Main Page it will already be protected, so the bot won't do anything when it is removed. If an administrator protects a template/image themselves, and they add to the Main Page, the bot won't touch it at all, no matter what – ] 12:47, 8 January 2007 (UTC)
***OK, thanks. I've struck that one out. ] 12:57, 8 January 2007 (UTC)
*<s>'''Transcluding featured article onto itself.'''</s> Does the bot protect all templates transcluded on the daily FA, or all pages? If a vandal transcludes the featured article into itself, would the bot end up protecting the featured article and any vandalized content? ] 13:05, 8 January 2007 (UTC)
** (after trying it out on one of my user subpages) this is indeed possible. Strange, and well-spotted. And yes, I believe it does protect any transcluded pages. The rotating date pages for the main page featured article, picture of the day and selected anniversaries are actually page transclusions, not transclusions from template namespace. ] 13:13, 8 January 2007 (UTC)
**I've added a line to prevent this eventuality. ] 14:12, 8 January 2007 (UTC)
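Two of the behaviours described in this section (the bot only unprotects what it protected itself, and the host page is skipped even if it transcludes itself) can be illustrated with a small sketch. This is not the bot's actual code, which has not been published; the class and parameter names are invented for illustration, and the "two passes" delay before unprotection is not modelled.

```python
class ProtectionLedger:
    """Remembers which pages the bot protected, so it never unprotects anything else."""

    def __init__(self):
        self._protected_by_bot = set()

    def sync(self, transcluded, protected_on_wiki, host_page):
        """Return (to_protect, to_unprotect) for one pass over a host page.

        transcluded       -- pages currently transcluded on the host page
        protected_on_wiki -- pages currently protected by anyone (bot or admin)
        host_page         -- the page being watched, e.g. the daily featured article
        """
        # Guard against the host page transcluding itself.
        candidates = set(transcluded) - {host_page}
        # Protect only what is not already protected by someone.
        to_protect = candidates - set(protected_on_wiki)
        # Unprotect only pages from the bot's own ledger that have left the host page.
        to_unprotect = self._protected_by_bot - candidates
        self._protected_by_bot |= to_protect
        self._protected_by_bot -= to_unprotect
        return to_protect, to_unprotect
```

Because admin-protected pages never enter the ledger, they are never unprotected by the bot, which is the behaviour described in the struck-out item above.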

Please add any more examples you can think of needing human oversight. ] 12:43, 8 January 2007 (UTC)

== Shrubberies ==
We are the ]!

;1. The bot must not be sysopped until we can see that the bot does only that which is stated
;2. The bot may not be run under Dragons Flight's own account because that violates bot rules
;3. The bot must therefore only be run under its own assigned account
;4. The bot's assigned purpose requires sysop privileges
;5. Goto 1

And there you have it. <b>]</b> <small>(])</small> 16:38, 8 January 2007 (UTC)

:Indeed. Lots of ] being thrown about all over the place as well. Sad. —] ] 16:44, 8 January 2007 (UTC)
::What about the discussion above, which is actively trying to lay out possible problems and solutions. Contributing or linking to that could help. ] 16:55, 8 January 2007 (UTC)
:::I was talking more about the votes that have no explanation, or that are POINT violations (my favorite so far is the one opposing because the bot did not sign accepting the nomination, then proceeding to chastise everyone else involved for not knowing the rules), or that list issues that are either not factual or have already been addressed; as is my mantra, discussion is never a bad thing. Administrative oversight will always still be required, and laying out exactly what will be required above is wonderful. —] ] 17:05, 8 January 2007 (UTC)
::::Adding it to ] is ''my'' favourite. :-) ] 17:33, 8 January 2007 (UTC)
:::::I'm all for adding the bot to that category, just as soon as it expresses its willingness to be added. ;) ] 17:35, 8 January 2007 (UTC)
::::::The bot will agree to stand for reconfirmation upon the request of any six other bots. :) ] 17:41, 8 January 2007 (UTC)
:::::::I'll make sure no other bot will clerk for it (I was born in Detroit, we have ''ways'' to influence bots...) and thus the recall will fail procedurally. ++]: ]/] 23:06, 8 January 2007 (UTC)
:*Would it be POINTy to post this under my ] I wonder? :) Nice to see some good humour in here after some of the clashes on both sides, though. :) ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 17:44, 8 January 2007 (UTC)

:"The Knights have a weakness in that a number of words, when spoken to them, cause them pain and agony." (from the article ]). Thanks for that nice pointer, Guy. :-) --] 22:55, 8 January 2007 (UTC)

::Alas! there is away out! Go look thee, to the wonderful land of the . Just follow that yellow brick to find all sorts of wonderful things! </]>Anyway, this bot can easily test on the testwiki, and a sysop bit should be easy to come by over there. The wikimedia framework is very similar, so what works there should work here as well. Cheers! —— ] <sup>(])</sup> 23:12, 8 January 2007 (UTC)

:::I know a way out of that shrubbery, it removes number 1! ''Release the ****ing source code''. Sorry for the starred language, but I'm really, really, annoyed that such a simple solution seems to be randomly overlooked.
:::And for humor, I think we need to stick this thing on ArbCom, it'll go well with AntiVandalBot. I wonder when it'll be programmed well enough to arbitrate? -] <small><sup>]</sup><sub>]</sub></small> 06:07, 9 January 2007 (UTC)
::::You mean best publish the source code before starting the RfA then? To give the vandals some advantage in case the RfA should succeed - <chuckle>. Looks like we should modify Guy's "Knights who say Ni" program loop then :-). Nice RfA though. --] 10:25, 9 January 2007 (UTC)

:::::Yeah, ]... No, really, such a simple way out. If anything the request for shrubbery is in obtaining access to the source code. "Drop me a note, I'll decide if you're trusted, and get back to you." ] 15:17, 9 January 2007 (UTC)

::::(in reply to Ligulem) Decent source is secure, no matter who reads it. Anyone who feels like it can go read the Mediawiki code; go on, you can right now! What you'll note, however, is that this ability really doesn't make things any easier for the vandals, because the code is good. One hopes that this bot has certain features built in (for example, altering the ''exact'' times at which it performs its checks by random intervals); these types of safeguards would not be compromised or harmed by anyone reading the code, and the bot could be compromised through simple observation if they're not. That being said, it seems like this bot's coder is pretty competent. Here's my problem. An adminbot must be ''exceptionally good''. If the source code cannot be released without danger that ''this'' bot will be compromised, it's not properly coded. If Dragons Flight believes that it'll make anything easier for blackhats than having the code for AWB out there, this thing is either exceptionally dangerous or he's unaware of the real situation. ''Any'' of those scenarios make me nervous enough to oppose this. It's really too bad: I saw one of those nasty incidents, I think this is a good way to solve it, and I'd like to be able to support it. But basically, what's being asked here is for the community to support a person for admin because someone trustworthy nominated them. We don't do that. We go look through ''that person's'' history. For the same reason, I can't support here because I trust the coder; I'd have to ''see the code'', and know that everyone can do so. ] 15:29, 9 January 2007 (UTC)

:::::One must also remember that there are a LOT more wikis than just en, but this bot will only protect on en. So, if DF openly releases the source code, vandals now have a bot handed to them where they can have it search out every single un-protected template and image on every Misplaced Pages in every language, change 10 lines and have it vandalize those pages instead. Could vandals program a bot to do this anyway? Sure... but let's not make it any easier for them and just hand them the absolute perfect vandal tool. As it stands now, the code has already been reviewed by ''numerous'' qualified programmers, and is available to anyone in good standing (including those on other language wikipedias who want to implement it on theirs). I'll be honest, I'd prefer if it were open-source. I love open-source and will always prefer it to closed-source. But opposing simply because it is not open-source is acting ideologically rather than practically. The "security through obscurity" charge doesn't work, because it's been reviewed by numerous people, and can continue to be reviewed by anyone who wants to. It's not closed-source because there's a possibility of it being compromised, but because there's a possibility of the code being abused. My $0.02. —] ] 16:12, 9 January 2007 (UTC)
::::::The others I'm aware of do not have fully protected main pages anyway. I do believe the question is moot for all wikis except this one. I am also amused to find myself accused of being an open-source ideologue. ] 19:32, 9 January 2007 (UTC)
:::::::I've already received a request for ProtectionBot to be used on the Italian wiki, for which I am happy to provide the code provided that I can find an English-speaking Italian bot operator who is willing to take responsibility for it. ] 19:59, 9 January 2007 (UTC)
::::::::Huh? I can edit some of the main page templates on Italian wikipedia. It's not even semi-protected, and it doesn't look like it's supposed to be. ] 20:24, 9 January 2007 (UTC)
::Is there really anything in the code that cannot be gotten from ]? ]<small> <sup>(Need help? ])</sup></small> 16:34, 9 January 2007 (UTC)
::::::::Great, so you'll share with more "trusted users" in an entirely different wiki, but still can't be bothered to release the code so any wiki can use it, or so that we can review it. WTF, man? --] <small>]</small> 00:43, 10 January 2007 (UTC)
(undent) ''The others I'm aware of do not have fully protected main pages anyway. I do believe the question is moot for all wikis except this one.'' You appear to totally misunderstand the situation. This is not about vulnerabilities in the ''Main Page'', it is about vulnerabilities in the ''Main Page Article'', the most widely viewed page on all of en.wikipedia.org. And that article is ''never'' fully protected.

The bot is designed to find weaknesses (unprotected templates and images) ''and fix them''. It is trivial for anyone familiar with that type of code to modify the program so that it finds weaknesses and ''lists them''. And that list is a list of targets for vandalism.

Do you really believe that other wikis don't use templates and images, or that (the only way to avoid this problem) those other wikis fully protect templates and images so that only admins can modify them? Because those are the only two ways that this issue is "moot" for other wikis. ] | ] 01:34, 10 January 2007 (UTC)

:Actually, just as a quick note, it does both. It does what you describe, but it also searches for and protects vulnerable templates and images on the Main Page, which is what has been a bigger problem as of late. —] ] 02:10, 10 January 2007 (UTC)

=== Italian Misplaced Pages ===
Okay, if I have misunderstood, please educate me: aren't there templates and images on the italian main page (and, of course, on the featured articles) which are editable ''on purpose''? What would ProtectionBot do on the italian WP? And what would access to ProtectionBot's code do for an italian WP vandal that he could not already do with the "edit" links right under his nose on the main page? I'm pretty sure this goes for most, if not all, of the others -- with the notable exception of english WP. These concerns are misplaced. In any case, isn't Shadowbot2's code already written to do just what John fears: report vulnerable vandalism targets? You'd only need to change who it notifies. ] 14:38, 10 January 2007 (UTC)

:I have no idea (and no familiarity with it.wiki), but the point you raise that it doesn't seem protected is valid. The request was raised by ], so perhaps you should go ask him. ] 15:55, 10 January 2007 (UTC)

== Idea for a compromise about the source code ==

I see considerable worries about the code of the bot not being open source and if we fail to address these the whole thing will not fly.

As being one of the supporters of giving the bot admin rights and letting the bot do what it was intended for, I would like to try to work towards consensus and see if we can move a little bit on all sides.

Would it be acceptable, that Dragons flight gets a startup time where he can establish the bot and tweak it and iron out all early bugs ''without publishing'' the code '''and publish the code later, when the bot is running and already doing its job'''?

Part of this my proposal here is that those who give their RfA support as soon as the code is published would move a bit and give their approval for the RfA <u>based on the mere promise of Dragons flight to publish the code in - let's say - a month</u>?

Of course my proposal only works if Dragons flight would agree to publish the code in a month. Deal or no deal?

If this is not acceptable, what can be tweaked to make it acceptable?

Could we keep parts of the code unpublished? For example the code part(s) that determine the exact time when the bot will protect a specific page. Maybe this could be refactored out into a call of a random function so that the complete code could be published without giving the knowledge exactly when it is going to protect a page.
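To illustrate the "call of a random function" idea: if the time of the next pass is drawn at run time, publishing the code reveals only the distribution, not the individual run times. A minimal sketch, assuming a base interval and a jitter range that are made up here and are not the bot's real settings:

```python
import random


def next_run_delay(base_seconds: float, jitter_seconds: float, rng: random.Random) -> float:
    """Seconds to wait before the next pass: the base interval plus uniform jitter.

    Both parameters are illustrative assumptions, not ProtectionBot's real values.
    """
    return max(0.0, base_seconds + rng.uniform(-jitter_seconds, jitter_seconds))


# random.SystemRandom draws from the operating system's entropy source, so the
# sequence cannot be reproduced from a known seed.
delay = next_run_delay(300, 120, random.SystemRandom())
```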

Please help work towards ], everybody! --] 10:40, 10 January 2007 (UTC)

: Personally, I would like to see at least an outline of the algorithm used, in order to be able to identify potential exploits and have them addressed ahead of time. Perhaps this is already written somewhere, but between this RfA and BRfA I can't find it. For instance, how is the template recursion handled? If template A transcludes B which transcludes A, does the recursion stop? How does it treat noinclude and includeonly parts of templates? ] 13:53, 10 January 2007 (UTC)
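For what it is worth, the A-transcludes-B-transcludes-A case is usually handled by remembering which pages have already been visited. The sketch below shows that general technique only; how ProtectionBot actually does it is not stated here. The <code>get_transclusions</code> callable is a hypothetical stand-in for whatever query returns the pages transcluded on a given page, and the noinclude/includeonly question is not addressed.

```python
def collect_transclusions(root, get_transclusions):
    """Return every page transitively transcluded from root, visiting each page once.

    get_transclusions is a hypothetical callable: page title -> iterable of titles.
    """
    seen = set()
    stack = [root]
    while stack:
        page = stack.pop()
        for child in get_transclusions(page):
            # Skip the root itself and anything already visited, so cycles terminate.
            if child != root and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```

With a graph where A transcludes B and B transcludes A, the walk visits each template once and stops.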

:A month? Well... let's face it -- contrary to what you say, the whole thing is going to fly without addressing our concerns. So if that was the offer I'd take it. ] 14:54, 10 January 2007 (UTC)

== Access rights for protection bot ==

This RfA is about giving the ''whole'' admin group ] to ]. Since the bot only does protections, could we increase the support for this bot by the community '''by limiting the rights we give to this bot to "registered"+"protect" (per ])'''? I believe any ] can enact this (See "userrights" in ]). --] 12:19, 10 January 2007 (UTC)

:If I'm correct, we can deposit a request at ] for this, if we have consensus. --] 12:23, 10 January 2007 (UTC)

:I've asked the stewards for confirmation about the procedure . --] 12:33, 10 January 2007 (UTC)
::You can see the screenshot of steward version of ] with assignable groups. Because not every possible option is visible, here is the full list:
<pre>
Bots
Sysops
Bureaucrats
checkuser
Stewards
boardvote
import
developer
oversight</pre>
::] 13:04, 10 January 2007 (UTC)
:::If I remember correctly, a developer can invent a new user group with any set of permissions they like (in this case, a 'protectionbot' group with bot + protect + all autoconfirmed rights could be created); it's unlikely that the developers would do this without consensus that it's needed first. --] 13:10, 10 January 2007 (]]])
::::"Protector" would probably be better in case for whatever reason they wanted to give a non-bot protect rights. --] 13:35, 10 January 2007 (UTC)
:::::I've posted a question on wikitech-l . --] 13:41, 10 January 2007 (UTC)
::::::One response there, from Gmaxwell, is: ''"If the operator of the bot is not trusted enough to have access to deletion or blocking, why do you trust him enough to have access to protection?"'' - my response to this would be that the bot will be running unsupervised, and there are concerns that defects in the code could lead to the other admin powers being used by a vandal/hacker who exploits said defects to start deleting and blocking, rather than protecting and unprotecting. It is a request designed to limit the potential damage that could be done by the other admin tools if an unsupervised adminbot account was compromised. If the bot ran awry and was manipulated to do stuff, undoing mass protections and unprotections would be easier to fix and less damaging than the other stuff (such as deleting and blocking). I've posted to ] to see if he wants to respond here. ] 14:35, 10 January 2007 (UTC)
:::::(In response to WikiSlasher) That would be a rejected ]. By lumping 'bot' in with it, it would prevent people complaining that the consensus there was ignored. (Personally, I don't see why we don't have separate delete/protect/block, etc., as it would help to remove some of the political notions of adminship, but that's a different discussion.) --] 16:01, 10 January 2007 (]]])
:As I've said in my oppose, I would be willing to support bots with certain delimited administrative 'powers' such as an auto-protection bot, so yes, if you can find a developer willing to finally do this, that might help. -- '']']'' 16:32, 10 January 2007 (UTC)
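The argument for a narrower group can be shown with a toy model of groups as sets of rights. This is not MediaWiki configuration: the 'protectionbot' group is the hypothetical one suggested above, and the sysop set is abbreviated for illustration.

```python
# Toy model only: group names map to sets of rights.
GROUP_RIGHTS = {
    "sysop": {"protect", "delete", "block", "rollback"},    # abbreviated
    "protectionbot": {"bot", "protect", "autoconfirmed"},   # hypothetical group
}


def allowed(groups, action):
    """True if any of the account's groups grants the given right."""
    return any(action in GROUP_RIGHTS.get(group, set()) for group in groups)
```

Under this model a compromised account in the narrower group could still protect and unprotect, but a delete or block request would be refused.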

I don't like the idea of creating new access group, '']''. The theory of possibility of exploits seems to me extremely unlikely in this case. ] 16:45, 10 January 2007 (UTC)

:We can establish a lot of principles and all obey them. But how does that help in finding consensus about how to protect the main page against sneaky template vandalism? --] 17:21, 10 January 2007 (UTC)
::Current tally of 183/37/13 can already be considered consensus. And even if not - making a new feature intended for use by a single bot is unproductive IMO. Sure, Werdna's solution, if implemented soon enough, would be the best. ] 18:41, 10 January 2007 (UTC)
:::Please see the section below about Werdna's solution. I would want to see the queries answered before agreeing that it is the best solution. If it is not, those working on the ground will have to cover any gaps in the defence. ] 18:45, 10 January 2007 (UTC)
:::Also, I agree that the current tally can be considered consensus, but I also think that in the remaining four days of this RfA, the percentage will drift down towards 80%. I know that percentages are not ultimately what decides an RfA, but having it drifting steadily, if ever so slowly, downwards from 85% towards 80% does make the closing 'crat's decision slightly more difficult. ] 22:12, 10 January 2007 (UTC)

If a developer wants to make it happen, I have no problem with this. I don't think it's actually necessary (and probably sets a poor precedent of having to have developers get involved with any future admin bot approval), but I do understand the sentiment of those who would be comforted by this. ] 16:47, 10 January 2007 (UTC)

::Ok, I will say this again, you cannot run arbitrary code on this bot because of two things: 1. It does not take any binary input so buffer overflow is out of the question, 2. Python does not support run-time encoding.

::It is possible to fool this bot into protecting something it should not; I cannot imagine how, but it is possible. Short of getting the bot's password, there is no way to get the bot account to do something it is not programmed to do. Limited access to protection only will only serve to assuage baseless fears. ]<small> <sup>(Need help? ])</sup></small> 16:54, 10 January 2007 (UTC)

== Newbie kind of idea ==

Erm... '''please''' don't shoot me down, cos I'm not a techie and I'm also not overly familiar with process here (yet) and I'm feeling my way with this message:

Is this correct, that the Bot needs admin powers only to be able to automatically protect and unprotect images? Am I also right that the real urgency is only the need to protect images, not the unprotecting again?

Surely a simple workaround would be to give all logged-in non admins the power to protect images (something an admin can easily over-ride if abused, and something not particularly attractive as a method of vandalism). The bot can then do the urgent protection and (also on an automation) send an unprotect message to an admin backlog page, like when us non admins flag something for speedy deletion (because obviously vandals should not have the chance to unprotect).

I presume that making the protection function available to non admins would be controversial (I'm not ''that'' wet behind the ears!) but if we pretend for a minute that that could go through "on the nod", have I made any mistakes in my understanding of what the Bot needs to do, that means my suggestion wouldn't work? --] 17:02, 10 January 2007 (UTC)

:It would be used in content disputes, that is why a level of trust must be established before such powerful tools are given out. This will not fly I am afraid. Maybe in the future when the world is perfect, I can see it now.... ]<small> <sup>(Need help? ])</sup></small> 17:07, 10 January 2007 (UTC)

::Hmmm... you mean where two editors edit war about the inclusion of an image, one might protect it? I get your point. --] 17:27, 10 January 2007 (UTC)

:Exactly. Of course, bots don't have a pride to lead them to edit war. ]<small> <sup>(Need help? ])</sup></small> 17:43, 10 January 2007 (UTC)

:Also, it needs to protect templates, not just images. ] 18:01, 10 January 2007 (UTC)

== Rumours that ProtectionBot will be redundant!! ==

I followed the rest of the wikitech mailing list thread linked to above, and (and the previous one from Werdna himself) makes clear that possibly ProtectionBot will be redundant soon anyway. Of course, we won't know for sure until Werdna reveals what he has been working on, but it seems the vandalism and this RfA might have got things happening developer-side. Would be nice if a developer dropped in on this discussion and told us what is happening, though... ] 17:59, 10 January 2007 (UTC)

:Like me? Yeah, I'm overhauling the protection system. The most visible of my changes is a "cascading protection" option - essentially, it protects anything in the Template: namespace that's transcluded onto a page, as well as the page itself, if a certain checkbox is ticked on the protect page screen. Any other issues like this, '''please''' come to developers and ask for a fix before hacking together a bot. I realise that Dragons Flight had great intentions, but really, this kind of thing needs to be implemented as part of MediaWiki, along with most of the stuff that gets proposed for adminbots and, indeed, some of the regular bots. &mdash; ''']''' '']'' 18:06, 10 January 2007 (UTC)

:::This effort is very laudable and much appreciated. But I fear it will take quite some time to iron out all the quirks on this. For example you will have to separate direct protections to a template from "remote" protections (a template should remain protected if it was directly protected even if the cause for a "remote" protection goes away). And you need to track the protectors (a protector is a page that caused protecting a template) for each template. I would be surprised if you manage to track all these relations properly and in a stable manner. Let alone thinking of the user interface of this: for example we should know why a template is "remote" protected. And last but not least there are pages which are transcluded into protected other pages that should not be protected (see ]). Just a few possible complications, I'm sure there are more. "overhauling" always sounds nice, until it comes to the details (just my experience as dumb software developer - not a MediaWiki developer though). But don't get me wrong: I really hope this can be implemented and I wish you all the best on this. In the mean time, we still have to solve the transclusion vandalism problem... --] 00:12, 11 January 2007 (UTC)

::::I'll reply to these objections in point form.

#There is a need to separate direct protections from remote protections.
::''No, there is not, because cascaded protections '''are not stored in the database'''''
#Tracking all the protectors (and displaying them to the user)
::''This will come in time: There's no need for "tracking" them, they're found on-the-spot; and are already retrieved with the current SQL queries anyway. As for displaying them to the user, we'll work on it, but I'm sure you understand this isn't a core feature for the modifications.''

... I didn't see any more objections. &mdash; ''']''' '']'' 01:57, 11 January 2007 (UTC)

::What about transcluded items not in the Template: namespace? ]<small> <sup>(Need help? ])</sup></small> 18:12, 10 January 2007 (UTC)

:::The point is that protection on page X is automatically extended to all items included in it, for the duration that they are there. So yes. ] 18:39, 10 January 2007 (UTC)

::What about the Image namespace? Also, the main page 'templates' are not actually in the Template namespace. They are pages transcluded across from (I think) the Misplaced Pages namespace. For example, the featured article blurb is put on the main page as <nowiki>{{Misplaced Pages:Today's featured article/{{CURRENTMONTHNAME}} {{CURRENTDAY}}, {{CURRENTYEAR}}}}</nowiki>, which resolves (today) as ], which is then transcluded across. Which leads me to ask (and please don't get offended), have the developers communicated with the people who dealt with the vandalism and who created and who maintain the current main page and its protection processes? You not mentioning images and focusing on the template namespace leads me to think that something might slip through the cracks if all the bases aren't covered (to mix metaphors). ] 18:17, 10 January 2007 (UTC)
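As an illustration of how the date-parsed title in the comment above resolves, here is a small sketch that builds the subpage title for a given day. It assumes <code>CURRENTDAY</code> is not zero-padded and <code>CURRENTMONTHNAME</code> is the English month name; the function name is made up.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def featured_article_subpage(day: date) -> str:
    """Build the dated subpage title that the main page transcludes for the given day."""
    return (
        "Misplaced Pages:Today's featured article/"
        f"{MONTHS[day.month - 1]} {day.day}, {day.year}"
    )
```

Any protection scheme has to follow this resolved title, which sits outside the Template: namespace.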

::Not to malign you or anyone else, but in my experience the response time on feature requests requiring new code to be written is lousy. Which I assume is a side effect of there being a small number of developers and a large number of priorities to take care of. Interesting question though, did the existence of the bot proposal make you more or less interested in patching this? ] 18:20, 10 January 2007 (UTC)

::To be fair, Werdna, there's been many issues in the past that we asked the developers about and were promptly ignored or put indefinitely on hold, so we did have to come up with a hack on our own. For the record, I'm still waiting on single user login, non-vandalized version tagging, and non-linking date syntax. --] 18:28, 10 January 2007 (UTC)
:::All of which are coming in the near future. Brion's currently working on SUL, stable versions is supposed to start on dewiki shortly after SUL is finished, and Robchurch is currently working on unlinked dates. --] 20:57, 10 January 2007 (UTC)
::::Oh wow. I even saw mentions of Category Intersection on the wikitech mailing list. This does all sound very promising. Let's all give the developers a big cheer, and look forward to playing with the new tools very soon. :-) ] 22:16, 10 January 2007 (UTC)

::Another problem: ''"as well as the page itself"'' - this protection of "the page itself" is not viable for the featured article page, which needs to remain unprotected to allow editing, but the transcluded material (images, templates, other pages, whatever) needs to be protected. ] 18:31, 10 January 2007 (UTC)

::Also, does it unprotect templates once they've left sensitive areas? Does it at least add the protected templates to a category so others can tidy up after it if needed? Note that not all templates should be unprotected once they leave the sensitive area, some need to remain protected as they are high-risk templates. (copied from Werdna's talk page) ] 18:34, 10 January 2007 (UTC)
:::ie. The process needs to protect ''unprotected'' stuff, and unprotect them later, while ignoring protected stuff. Can that really be done through Mediawiki coding? ] 18:39, 10 January 2007 (UTC)

::::Items should automatically revert to their previous protection status. The point is that the protection applied at any given time should automagically be determined as the greatest of the protection applied to the item itself plus the protection applied to any pages that include the item (and have cascading protection enabled). ] 18:43, 10 January 2007 (UTC)
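The rule described above (effective protection is the greatest of the item's own level and the level of any cascading page that includes it) can be sketched roughly as follows. This is only an illustration, not MediaWiki's actual code: the `Protection` levels, the function name and the three lookup structures are all hypothetical.

```python
from enum import IntEnum


class Protection(IntEnum):
    # Hypothetical ordering of protection levels, weakest to strongest.
    NONE = 0
    SEMI = 1
    FULL = 2


def effective_protection(title, own_level, included_by, cascading):
    """Greatest of the item's own level and the level of every
    cascading-enabled page that currently includes it."""
    levels = [own_level.get(title, Protection.NONE)]
    for parent in included_by.get(title, ()):
        if parent in cascading:
            levels.append(own_level.get(parent, Protection.NONE))
    return max(levels)


# Illustrative data only (assumed, not taken from the wiki).
own_level = {"Main Page": Protection.FULL, "Template:HighRisk": Protection.FULL}
cascading = {"Main Page"}
on_main_page = {"Template:Foo": ["Main Page"]}
off_main_page = {}

while_included = effective_protection("Template:Foo", own_level, on_main_page, cascading)
after_removal = effective_protection("Template:Foo", own_level, off_main_page, cascading)
```

Because the level is computed on demand and nothing is written back, an item falls back to its own stored level as soon as it stops being included, which is the "automatic revert" behaviour asked about.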

:Ahh, nothing like a wiki community to ask a bazillion questions. I'll respond to the objections/comments/questions in turn:
:#But it will only protect in templatespace!
:#:Myself, Domas and Tim Starling decided later on that this restriction was unnecessary. I also forgot to mention that it protects images, too. &mdash; ''']''' '']'' 22:12, 10 January 2007 (UTC)
:#But it will protect the page itself, and not ONLY the page's transclusion templates: &mdash; ''']''' '']'' 22:01, 10 January 2007 (UTC)
:#:If this is a requested feature, I will implement it. It's a fairly trivial fix, and I'm glad you let me know about it - because it's far easier to add the fields into the database now. &mdash; ''']''' '']'' 22:01, 10 January 2007 (UTC)
:#Does it unprotect templates when it's left sensitive areas?
:::The templates are never protected, according to the database. Similarly to the protection afforded to CSS/JS user subpages, the protection is a special case in the code. &mdash; ''']''' '']'' 22:01, 10 January 2007 (UTC)
:::Addendum: I see that it's also been asked whether or not MediaWiki will unprotect the pages once they are off the main page &mdash; it will not. The new functionality actually never modifies the database. Another myth: '''busted'''. &mdash; ''']''' '']'' 22:25, 10 January 2007 (UTC)
:That's all I can see so far, please feel free to add more! &mdash; ''']''' '']'' 22:01, 10 January 2007 (UTC)
::You've answered all my questions, and admirably so! Thanks for answering and explaining things. I guess the only question now is whether these changes occur before the ProtectionBot RfA finishes (on 14 January 2006). If not, hopefully the closing bureaucrat will be aware of all this. If yes, then the question of withdrawal raises its head, and whether people will squabble over whether Dragons flight or ProtectionBot should do the honours... I can just see the ] headlines now. ''"First-ever adminbot set to be promoted at RfA! Last-minute developer patch squashes bot's dreams of glory!! Bot refuses to be interviewed and sulks in corner!!!"'' ] 22:33, 10 January 2007 (UTC)

::Please also have a way of identifying which page is responsible for the cascading protection applied to a given item. ] 22:45, 10 January 2007 (UTC)

::I have a feeling that I may be repeating a question here. Would it be possible to have the cascading protection of en's Main Page protect both the images locally and at Commons. Currently admins must upload local copies or, in the cases of some admins who aren't familiar with the process, they just link to the Commons image, resulting in a mad scramble to upload a local protected copy before a vandal either overwrites the Commons copy or uploads an inappropriate image to the local image location. - ]] 03:33, 11 January 2007 (UTC)

'''Fix has been committed''' in . The only use I can see for the bot ''at this point'' is re-uploading commons images prior to them appearing on the main page (which can be done by a regular user account); once they appear on the main page, the normal cascading protection will apply. The changes will become live in the next few days, depending on how long before a developer gets off their ass and does the required database updates :-). &mdash; ''']''' '']'' 23:39, 10 January 2007 (UTC)

:Your link gives me a 403 error. ] 23:43, 10 January 2007 (UTC)

::Try instead. ] 23:52, 10 January 2007 (UTC)

:::Or, indeed, ]. —] <small>(])</small> 00:24, 11 January 2007 (UTC)

::::I doubt solves the problems I've listed above. --] 00:30, 11 January 2007 (UTC)

:::::Per the "Field for future support of per-user restriction", I once thought about a text page under each user's space (with a reserved name like "access", containing DENY and ALLOW clauses for that user, editable only by admins with wildcards for page names). So we could allow XYuser to edit Common.css by specifying "ALLOW MediaWiki:Common.css" in ]. Just an idea... :-) --] 00:45, 11 January 2007 (UTC)

:Did a quick testinstall. Looks damn sexy. Per ''"cascaded protections are not stored in the database"'' I just hope this can take the load of en.wikipedia then and thus passes brion (..., which refers to ). --] 10:00, 11 January 2007 (UTC)

===Problem?===
<small>(header added by ] 22:53, 10 January 2007 (UTC))</small>

Following a brief discussion on <code>#wikimedia-tech</code>, it was pointed out that protecting every page transcluded into the featured article, but not the featured article itself, could pose a serious security hole, allowing a vandal to protect arbitrary pages. Strongly discourage the existence of this feature in the bot, and the development team will not be implementing this as a feature. Human eyes are needed to check for this vulnerability. Cheers, &mdash; ''']''' '']'' 22:23, 10 January 2007 (UTC)
:Ah. This possibility was pointed out with ProtectionBot as well. The risk is less here because ProtectionBot protection is a random action, not an immediate effect. With the Mediawiki coding, surely if a vandal transcludes a page to the Featured Article, thus getting that page protected, won't just removing the page reverse the protection? What about having the cascading full protection feature available for article s-protection as well as full protection? No, that won't work either, as vandals can easily use sleeper accounts. Hmm. Looks like discussion will have to go to ]. ] 22:43, 10 January 2007 (UTC)

:This has been raised in my discussions as well. A suggested (but not yet implemented) response was to require an item to appear as needing protection during multiple passes of the featured article separated by some interval (e.g. 15 minutes) so that a single act of vandalism was less likely to result in protection and to have a hard limit on the total number of items the bot would protect at any one time. This ameliorates (but does not entirely prevent) the situation you suggest. This approach is of course not practical for Mediawiki. Since the vandalism required to promote this sort of spurious protection should be easily revertable, I'm not sure how seriously I would rate it even in the Mediawiki case (though self-inclusion could have strange consequences if not explicitly exempted). ] 22:45, 10 January 2007 (UTC)
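The suggested (and, per the comment above, not yet implemented) mitigation could look something like the sketch below: an item must be seen on several passes spaced by a minimum interval, and there is a hard cap on how many items are protected at once. The class name, parameter names and default values are assumptions made for illustration; they are not taken from ProtectionBot's private source.

```python
class ProtectionQueue:
    """Hypothetical sketch: confirm candidates over several spaced passes
    and never hold more than max_protected items at one time."""

    def __init__(self, min_interval=900, required_passes=2, max_protected=50):
        self.min_interval = min_interval      # seconds between counted passes
        self.required_passes = required_passes
        self.max_protected = max_protected    # hard limit on protected items
        self.seen = {}                        # title -> (count, time last counted)
        self.protected = set()

    def observe(self, candidates, now):
        """Record one pass; return the titles that should be protected now."""
        candidates = set(candidates)
        # A candidate that disappears (e.g. the vandalism was reverted)
        # loses its count and has to start again.
        for title in list(self.seen):
            if title not in candidates:
                del self.seen[title]
        newly_protected = []
        for title in sorted(candidates - self.protected):
            count, last = self.seen.get(title, (0, None))
            if last is None or now - last >= self.min_interval:
                count, last = count + 1, now
            self.seen[title] = (count, last)
            confirmed = count >= self.required_passes
            if confirmed and len(self.protected) < self.max_protected:
                self.protected.add(title)
                del self.seen[title]
                newly_protected.append(title)
        return newly_protected


queue = ProtectionQueue(min_interval=900, required_passes=2, max_protected=1)
first = queue.observe(["Template:A"], now=0)
too_soon = queue.observe(["Template:A"], now=60)
confirmed = queue.observe(["Template:A", "Template:B"], now=900)
capped = queue.observe(["Template:B"], now=1800)
```

With these settings a template added and reverted within one interval is never protected, and the cap bounds the damage if many spurious candidates appear at once.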

== PseudoCode? ==

Why not publish pseudocode? ---] <small>(]/]/])</small> 18:47, 10 January 2007 (UTC)

:I am willing to publish pseudocode as a compromise, provided a couple bits can have a high level of abstraction. It would probably take me at least a couple days before I can find time to write it out though. Also, I'm not sure how many of the people objecting on the basis of open source concerns would actually accept pseudocode. ] 19:04, 10 January 2007 (UTC)

::Well, it might be an acceptable compromise for some people on the fence. Just tossing out ideas:) ---] <small>(]/]/])</small> 19:05, 10 January 2007 (UTC)

*Sorry, but this is unacceptable. Either release the code, or don't. Don't give me a half measure of which the accuracy cannot be verified. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 21:03, 10 January 2007 (UTC)

:*The accuracy could be verified by any of the other nearly dozen people who have seen the code. ] 21:06, 10 January 2007 (UTC)
::*I would be willing to review and vouch for the accuracy of pseudo code. I have received the code to the bot, which was sent promptly and politely. ++]: ]/] 22:46, 10 January 2007 (UTC)
:*Peter, have you requested the code from Dragons flight? If so, did he send you the code? ] 21:22, 10 January 2007 (UTC)

::*I can't speak for Peter, but for me, I won't waste Dragons Flight's time - picking and choosing who the code is sent to is not transparency, whether or not that list happens to include me, and even if he sent me the code right away it wouldn't change my mind. Someone else might spot a bug the handpicked list would miss-that's why code should be open to ''all'' eyes. Nothing less would sway me. I ''already'' trust Dragons' Flight that, to the best of his ability, the code works and does what he says it does. But there's ''always'' bugs and vulnerabilities in any non-trivial code, and the more people can look at it, the more get found. Our alternative is to wait until a vandal figures out a weakness by observation or "plinking", and the handpicked group scrambles to figure out how it happened. Secrecy is not a good way to write code, not a good way to audit code, and realistically, against the spirit of how we do things around here. ] 22:55, 10 January 2007 (UTC)
:::*Well said. I am in full agreement. Cheers, ✎ <span style="font-family: Verdana">] ( ] &bull; ] )</span> 00:28, 11 January 2007 (UTC)

== New MediaWiki feature "Cascading protection" now enabled ==

See (just for the record). --] 10:39, 14 January 2007 (UTC)

Latest revision as of 20:23, 1 October 2024

Someone please explain to me...

Why can't the source code be revealed? AWB would require much less modification to be an effective vandalbot, and its source is freely available to anyone who cares. -Amarkov edits 18:18, 7 January 2007 (UTC)

Not sure, I have read it and it seems to be safe releasing the source. HighInBC 18:26, 7 January 2007 (UTC)

If Dragons flight released the source, I would withdraw my opposition. My only significant beef is the needless secrecy. Cheers, ✎ Peter M Dodge ( Talk to MeNeutrality Project ) 19:05, 7 January 2007 (UTC)

Dragons flight has stated (see comment under Oppose #1), "The code has been released to trusted members of the community for review, but it will not be made public. I feel the risk of people adapting certain functions to create powerful vandalbots is too great." Perhaps other users who have seen and reviewed the code can comment on this issue. This seems a plausible concern to me but an even bigger concern to me is that releasing the code would allow the vandals to try to reverse-engineer ways around it (compare WP:BEANS). Newyorkbrad 19:10, 7 January 2007 (UTC)

There is no WP:BEANS here. This is nothing that couldn't be done with the freely and openly available pywikipedia framework. Cheers, ✎ Peter M Dodge ( Talk to MeNeutrality Project ) 19:21, 7 January 2007 (UTC)
I agree, pywikipedia framework, the perl wikimedia module, or just plain html scripting can get the same results. The functions this bot performs are not difficult to reproduce. What's more, the code would not be able to perform admin functions on a non-admin account anyways, so it is really just the recursive unprotected template/image finder. If the bot is functioning, then this list of unprotected pages will not be a threat. I read the source, I see no reason to keep it a secret, but I respect the author's right to do so. HighInBC 19:28, 7 January 2007 (UTC)
Earlier today, I was thinking the same thing as you, HighinBC, but I've realised the potential issue with releasing the code. I'm going to break WP:BEANS here (on the understanding that the code won't be released), in order to enlighten everyone. The simple matter is that the bot code could be changed to automatically vandalise every unprotected page, perhaps before the bot would be able to protect, and cause the vandalised page to be protected. This is a very serious possibility, allowing vandals to easily impose mass vandalism (esp image vandalism). If anyone thinks that this comment is severely WP:BEANS, blank it. Martinp23 20:03, 7 January 2007 (UTC)

One thing that just came to my mind - Dragons flight noted on the BRFA that the bot would run at random times etc. to prevent the vandals from predicting its execution and racing to vandalise. I haven't seen the code yet, but this feature (or something similar) may well be the reason that the release of the source code would violate WP:BEANS. Миша13 22:01, 7 January 2007 (UTC)

  • It isn't all that hard to design a RNG algorithm such that determining the times from it without having direct access is too hard to be plausible. Video games have managed that for a while, I think that a bot can. -Amarkov edits 22:06, 7 January 2007 (UTC)
    • I agree with Amarkov here ... any even somewhat decent implementation of a RNG would not allow anyone to predict its random numbers, even with access to the source. Besides, even if it was a simple timestamp RNG, on-wiki actions are only reported to the nearest second, whereas the script would be using a more fine-grained time seed than that. So there really would be no way to try to predict when it would run again. --Cyde Weys 22:19, 7 January 2007 (UTC)
  • I don't really think my point got across. If you wanted to build a vandalbot, you could do it from this, grabbing the unprotected pages... or you could just remove the checkpage requirement from AWB, set it on auto mode, and vandalize away. Much easier than introducing editing functionality to a bot that doesn't have it, and plus, as long as you have a user and user talk page, and are careful not to remove or add too much stuff, it won't look any more suspicious than any other AWB fix. While removing the checkpage requirement isn't a trivial matter, anyone who could turn this bot into a vandalbot could manage it. -Amarkov edits 22:32, 7 January 2007 (UTC)
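On the random-number point raised in the bullets above, here is a minimal sketch of a schedule that cannot be predicted from the source alone. It draws each delay from the operating system's entropy source via Python's standard `secrets.SystemRandom`. The function name and the bounds are assumptions; how ProtectionBot actually picks its run times has not been published.

```python
import secrets

# SystemRandom reads from the operating system's entropy source, so knowing
# this code (or the wall-clock time) does not reveal the next value.
_rng = secrets.SystemRandom()


def next_delay(min_seconds=60.0, max_seconds=600.0):
    """Return an unpredictable delay, in seconds, before the next pass."""
    return _rng.uniform(min_seconds, max_seconds)


delay = next_delay()
```

A timestamp-seeded generator would be weaker than this, although, as noted above, on-wiki timestamps are only reported to the nearest second.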
So far as I can tell, the code of User:AntiVandalBot isn't public (at least I couldn't find it)... why is no one freaking out about that? It's a much more complicated bot that can make edits to every page on Misplaced Pages. It makes more edits in a day than the protection bot will in an entire year. If it 'went berserk' it could require vastly more work to clean up than the proposed protection bot ever would. In short, all the concerns expressed about 'protection bot' are vastly more applicable to 'antivandal bot'... yet the code is not public and no one seems to mind. Why do you suppose that is? Why do you suppose that 'auto wiki browser' isn't just given out to anyone who wants it? My own theory is that most people realize that 'making smarter vandals' is a bad idea. Yes, a vandal could build their own version of 'anti vandal bot' that instead creates vandalism... some have. But most of them aren't 'dedicated' enough to figure out the hows of it and eventually go away. Does it really make sense to HAND those people a ready made vandalism tool that just requires a few tweaks to create a massive mess? That's what making 'protection bot' or 'anti vandal bot' code publicly available would do... give general vandals the ability to do a lot more damage. We can handle the few vandals who are capable of building their own bots. Let's not give every vandal the ability to make bot attacks. --CBD 23:01, 7 January 2007 (UTC)
AntiVandalBot obviously does have a vandalism problem. It can edit anything already, it can do it fast, and it requires no human intervention. This bot can only edit images and templates, and even then only to add or remove three specific things, so it would take loads more work to convert it into a useful vandalbot. And as I've reiterated a lot already, we already have the full source of AWB, which would be much easier to convert to a vandalbot. (It wouldn't even be conversion, really). -Amarkov edits 23:07, 7 January 2007 (UTC)
Alternatively, you can use my perlwikipedia framework to write a vandalbot. I just wrote a dirt simple, proof-of-concept one with the framework, 24 lines of code, that uses threading and multiple usernames. Elapsed time: 4 minutes. Just because the bot is open-source doesn't make it an automatic target for vandals trying to create vandalbots. It would probably be harder to convert ProtectionBot into a vandalbot than it would be to write one from scratch using pywikipedia. Shadow1 (talk) 23:31, 7 January 2007 (UTC)
I agree, bot making is not some secret, anyone can learn it and use existing frameworks. HighInBC 23:32, 7 January 2007 (UTC)
Absolutely agree. Not that I don't trust HighInBC, but I believe strongly in trust-but-verify. I already know pretty well how Antivandalbot works just by having seen what types of things it's done, and it would not be hard to write vandalbots from what's already out there. Our anti-vandalism techniques need to be just as open, so that when the vandalbot runners find a way around them (and you believe me, they will), we can respond quickly and improve our own techniques (and perhaps find weaknesses before they're exploited). Security through obscurity isn't-and if this bot's code is too insecure to post, it's too insecure period, let alone to trust with an admin flag. Seraphimblade 00:04, 8 January 2007 (UTC)
This is one of the best comments I've seen for the release of the source code. Just posting this to highlight it... Mathmo 16:08, 9 January 2007 (UTC)
To echo some comments from other editors that I think are most worthy of consideration: nothing this bot could do is difficult or uniquely complex, there's no good reason not to publish, and publication would facilitate bug discovery and resolution. The bot could be blocked if it ever caused problems. It should also be possible to distribute a version of this bot set to run in semi-automatic attended mode, which would enable the work to be done efficiently without the risk that comes with a fully automatic bot, of being fooled by cleverly written malware or mischievous humans. --Tony Sidaway 07:12, 8 January 2007 (UTC)

I believe many people here are grossly underestimating how little modification to the source it would take to turn a bot that looks for vulnerabilities in order to protect them, into a bot that looks for vulnerabilities in order to vandalize them. Changing fewer than 5 lines would turn this into effective malware. Changing a few more than that would be enough to let it rampage all over the place. If you are unwilling to accept this as private source, then by all means kill it, but I have no intention of making the source public. Dragons flight 07:34, 8 January 2007 (UTC)

I said this in my oppose !vote-if this bot code, through error or malice, is that dangerous (and if danger exists, either error or malice can lead into it), and would be that dangerous if a non-admin had possession of it, it is more, not less, critical that the code be open to continuous review-not just now but during its operation. It's not like we've never seen a vandalbot, but if this code is suddenly released we'll have a flood of them. (Please note-you certainly have the right to keep your code secret, but even if most seem alright with that, I think it's a bad idea and will in the end decrease the effectiveness of the response against vandalism. And for myself, I can't support it without seeing it.) Seraphimblade 07:50, 8 January 2007 (UTC)
Um... with AWB, just removing checkpage functionality, which should be much easier, leaves you with a pretty effective vandalbot. You won't be guaranteed to hit the unprotected things transcluded on main page articles, no, but you could just vandalise the pages themselves, and I do not see how that's worse. -Amarkov edits 15:50, 8 January 2007 (UTC)

Everyone does know that WP:BEANS was written for a reason, right? FOLLOW IT! It is absolute nonsense to argue over who can write a vandal-bot faster. You've got a response above how fast someone could turn this into a vandal-bot. There's no point in making it easier for someone to do it. Yes, people can make vandal-bots, but let's make them write them entirely themselves. Let's not hand them one that's already mostly pre-written! -Royalguard11(Talk·Desk·Review Me!) 04:23, 9 January 2007 (UTC)

You're missing my point. There already is one, and it's patently obvious that removing the checkpage would leave you with a vandalbot. Thus, it is obviously not all that much of a problem. -Amarkov edits 05:11, 9 January 2007 (UTC)
Is there really anything in the code that cannot be gotten from User:Shadowbot2/Source? HighInBC 14:44, 9 January 2007 (UTC)

Current status question

(cross-posted to bot approval page) With the RfA now pending, is ProtectionBot currently operating during the RfA period? I hope that it is, at least on an ongoing trial basis. Newyorkbrad 20:21, 7 January 2007 (UTC)

A member of the BAG ended the trial after one day and instructed DF to shut down the bot here, and DF did as he requested, so no, it's not running. —bbatsell ¿? 20:30, 7 January 2007 (UTC)

Suggest continued trial operation during RfA period

If Dragons flight is willing I would like to see this bot continue operating on a trial basis during the RfA period, both so we have the benefit of its services during the next week and so that in the unlikely event of an issue arising the RfA !voters could consider it. Comments? Newyorkbrad 20:32, 7 January 2007 (UTC)

I think BAG shut it down. In the meantime we have User:Shadowbot2, which, as stated on the RFA page, is fixed and will perform correctly. Cheers! —— Eagle 101 23:16, 7 January 2007 (UTC)
Probably best to just wait, I know I am checking shadowbot2's mailings. HighInBC 23:18, 7 January 2007 (UTC)
Suggestion: Might it be possible to authorise the continued running of ProtectionBot for as long as this RfA maintains a suitable level of consensus for the Bot? e.g. 80 or 85%? That would combine practicality with respect for the views of the community... WJBscribe  23:39, 7 January 2007 (UTC)
  • No need to be bureaucratic about it, this bot is useful (and the RFA has overwhelming support so far) so there's no reason why it shouldn't keep running for a few more days. >Radiant< 12:28, 8 January 2007 (UTC)

Buffer overflow

I see a few people concerned about buffer overflow exploits; my understanding is that this type of vulnerability can only be used on a bot that can be given binary input. Since this script gets all of its input from MediaWiki, which stores its data in text form, I see no way to insert such an attack. Python does not allow for run-time compiling. You cannot fool such a bot into running arbitrary code given such input restrictions, as the precompiled code needed for such an attack cannot be stored as text.

I may be wrong, so correct me if I am, but it seems a buffer overflow vulnerability is not an issue for technical reasons. HighInBC 23:50, 7 January 2007 (UTC)

Malfunction on malformed input is far from executing arbitrary code, and would lead to a parsing failure. And changing input formats would exceed the approval it is seeking. HighInBC 23:53, 7 January 2007 (UTC)
I see, I agree that we cannot discount the possibility of the bot being intentionally screwed with, but I think the threat of arbitrary code execution is not an issue. HighInBC 00:04, 8 January 2007 (UTC)
Arbitrary code execution on the bot's machine? No. Arbitrary command execution on wikipedia? Depends on how the bot sends information back to the servers. I have not reviewed the code, nor has, to my knowledge, any python security expert, nor can I rely on distributed error checking to review the code, and as such, you just don't know what happens if a page to be protected has the following on it -> . Does it load and recursively protect everything on malicioustemplate%2%6%9deleteallcontributions? Does it attempt to load and recursively protect the page deleteallcontributions, which has now resulted in the protection of the entire encyclopedia (oops!) Does it load http://en.wikipedia.com/w/deleteallcontributions and fail? I don't know! I can think of more ways to beat the bot, but I'm just shooting in the dark. Real security audits involve reviewing the code. Hipocrite - «Talk» 20:15, 8 January 2007 (UTC)
Interesting point - do you know any available python security experts who might be willing? Guettarda 21:35, 8 January 2007 (UTC)
Open_source_versus_closed_source#Security, to be a bit snarky but not too much. Hipocrite - «Talk» 21:54, 8 January 2007 (UTC)
Yep, I'm aware of that, and I read a good chunk of the RFA - I see your point and I see Robert's. He isn't likely to change his position, and the bot fills a real need. I respect both his position and yours. I trust both his judgement and yours. I only see one of two outcomes - either the RFA fails (with the result that the main page remains vulnerable), or the bot is approved with the code secret (or semi-secret). So rather than arguing about what should be, I am wondering how to create the best outcome. Guettarda 22:03, 8 January 2007 (UTC)
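As a rough illustration of the "how does the bot send information back to the servers" question raised above: one common defence is to pass a page title only as an encoded parameter value, so that characters in the title cannot add or change request parameters. The sketch below uses Python's standard `urllib.parse`. The helper name and the sample title are hypothetical, and nothing here is known to match what ProtectionBot actually does.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "https://en.wikipedia.org/w/index.php"


def raw_page_url(title):
    """Build a URL for fetching a page's raw wikitext.

    The title is percent-encoded as a single parameter value, so '&', '='
    or '%' inside it cannot be interpreted as extra parameters.
    """
    return BASE + "?" + urlencode({"title": title, "action": "raw"})


# A deliberately awkward, made-up title that tries to smuggle in a parameter.
url = raw_page_url("Template:Example&action=delete")
params = parse_qs(urlsplit(url).query)
```

Decoding the query string gives back exactly two parameters, with the whole awkward string as the title. This addresses only one class of problem and is no substitute for the code review being asked for.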

Some concerns

While correcting a misunderstanding I wrote the following to express some of my concerns (despite supporting the RfA). Comments welcomed.

I was reading the ProtectionBot discussion, and I noticed in one of the oppose votes discussions someone said "remember this bot only protects images and templates on the Main Page". This is incorrect. The bot is also intended to protect templates being used on the featured article, the actual featured article page, not the introduction to it that appears on the Main Page. Thus anyone can add a template to the featured article, vandalise the template, and sit back and watch as ProtectionBot protects the vandalised template. The good thing though, is that the featured article is (normally) freely editable, and so anyone can remove the protected vandalised template. This situation is a bit more problematic when the featured article is in a state of protection or semi-protection due to high levels of vandalism (someone always seems to protect the featured article at some point in any given day), and if the protected vandalised template is in widespread use in other articles. However, the discussions at Misplaced Pages:Main Page featured article protection may change all this. Thus the interaction of all these proposals needs to be carefully considered. Not too much change too fast. Also, no-one seems to have picked up yet on the comment I made here. That can be summed up by: Main_Page/Tomorrow needs to be actively watched every day and a button clicked to show that someone has checked it, otherwise, as I said in that comment I linked to: "...the vandalism (possibly not very visible) remains undetected for a whole day, and then silently switches over on the main page, at which point all hell will break loose." Carcharoth 10:55, 8 January 2007 (UTC)

Incidentally, Main_Page/Tomorrow is unprotected, for some reason. This should probably change. Carcharoth 11:02, 8 January 2007 (UTC)
Actually it was semi-protected. Now it's fully protected. Though since nothing that's directly on that page is ever included in the Main Page, I'm not entirely sure what the problem was – Gurch 12:33, 8 January 2007 (UTC)
The problem was that templates on that page are being protected by ProtectionBot, so this is a way for someone to get something protected by ProtectionBot. That something might be something that we wouldn't want to be protected and then freely added to other pages. Hence all pages scanned by ProtectionBot should be protected. Today's featured article page is a notable exception to this, and one that will need to be watched very closely. Carcharoth 12:46, 8 January 2007 (UTC)
Is the bot really checking Main Page/Tomorrow itself rather than the relevant component templates? If so, I'm not sure that makes sense. The 'tomorrow' template isn't actually 'copied over' to the Main Page each day. When I first mocked up a 'tomorrow' version of the Main Page I used the then current formats of the Main Page and just added in the {{day+1}} template where appropriate... and someone then took that to create the 'tomorrow' page. However, since then there have been numerous small changes to the Main Page which have not necessarily been kept up on the 'tomorrow' version. That process will continue over time and eventually there may be templates and/or images on the 'tomorrow' version which are no longer used on the actual Main Page and which thus would not need to be protected. We could always 'recopy' the current Main Page formats from time to time (and would anyway), but the bot could be a bit smarter by checking the specific sub-templates which vary on the Main Page a day in advance. --CBD 13:19, 8 January 2007 (UTC)
Good points. The bot description says: "In addition, it will protect the predictable elements (such as the next Picture of the Day) a day before they appear on the main page." - so I think you are right. What we need is for that description to be expanded so it says exactly what pages it protects in advance (probably just the TFA, SA and PoTD transcluded pages). On the other hand, the Main Page/Tomorrow must show what will actually appear (and so needs to remain protected as I suggested), otherwise people watching that page might miss 'sleeper' vandalism.
On another point, can we clarify terminology here. Does it make sense to distinguish between images, transclusion of pages from template namespace, and transclusion of pages from other namespaces? When people refer to templates, they can mean either pages in template namespace, or (more widely) anything that appears in the {{ and }} curly brackets. Carcharoth 13:38, 8 January 2007 (UTC)
I did find this edit by the bot programmer, who said (on 30 December): "As described it would be looking at Main Page/Tomorrow and Tomorrow's Featured Article as well as the current ones, so predictable elements will be protected before they actually reach high profile status." - though possibly things have changed since then. I've asked Dragons flight to comment here. Carcharoth 13:49, 8 January 2007 (UTC)
Another point. If the bot tries to predict what the 'next day' templates are, there needs to be a note that changing that system (eg. changing the format of the dates, or using different templates - as recently happened with PoTD) would confuse and probably break that part of the bot's function. But then that would break Main Page/Tomorrow as well! So another note for the human oversight section below. Carcharoth 13:58, 8 January 2007 (UTC)
The answer is that yes, the current implementation relies on Main Page/Tomorrow to predict the upcoming content, and I apologize if that was unclear. So yes, at the present time that would need to be kept updated and potentially full-protected if it becomes a problem. One could imagine an implementation that uses Main Page alone to predict future content, but that also would have problems. At present, the rotating elements rely on three different nomenclatures: "{current month name} {current day}", "{current month name} {current day}, {current year}", "{current year}-{current month number}-{current day 2 digit number}", and only 2 of the 3 are on the Main Page itself; one of the rotating elements is in a subtemplate. Trying to write something that would be robust against the variations in placement and nomenclature that people might devise in the future would represent a hard problem (and I would note that POTD has already changed twice in the last week). My present "solution" is to encourage any modifications to the main page to also maintain the day+1 state of Tomorrow. I realize this isn't really a solution, but it is something that people can do that will work predictably, as opposed to my trying to guess at potential future main page changes, which seems likely to fail. Dragons flight 14:42, 8 January 2007 (UTC)
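To make the three nomenclatures concrete, here is a small sketch that produces tomorrow's date in each form. It is not the bot's code: the function names are hypothetical, and zero-padding the month number in the third form is an assumption (the comment above only specifies a two-digit day).

```python
from datetime import date, timedelta

# Spelled out explicitly so the result does not depend on the system locale.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def rotating_names(day):
    """Return the three date strings used by the rotating elements."""
    month_name = MONTH_NAMES[day.month - 1]
    return [
        f"{month_name} {day.day}",
        f"{month_name} {day.day}, {day.year}",
        # Assumption: the month number is zero-padded like the day.
        f"{day.year}-{day.month:02d}-{day.day:02d}",
    ]


def tomorrow_names(today):
    """The same three strings for the day after `today`."""
    return rotating_names(today + timedelta(days=1))


names = tomorrow_names(date(2007, 1, 8))
```

Any change to these formats on the Main Page would silently break a predictor like this, which is the fragility described above and the reason for relying on Main Page/Tomorrow instead.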

An example is this edit where the editor who redesigned the PotD template system updated the Main Page/Tomorrow page. If this step had been forgotten, the system might have broken down. Carcharoth 14:00, 8 January 2007 (UTC)

Please don't forget that human oversight is still needed

Just to avoid complacency, and to remind those saying that this bot will "deal with the problems of Main Page vandalism", a reminder that the bot will deal with some methods of vandalism, but human administrators still need to be alert to the following, which, however unlikely, will probably happen at some point in the future. I've given examples below. Carcharoth 12:43, 8 January 2007 (UTC)

Human error

  • Administrators unprotecting stuff and forgetting to re-protect (ProtectionBot will not override another administrator). The fix is to reprotect and politely ask the administrator not to make this mistake in the future. Carcharoth 12:43, 8 January 2007 (UTC)
    • Query - if something is protected by an administrator, will ProtectionBot still unprotect the page in question once it leaves the sensitive areas? This is not good for high-risk templates that should remain protected even when off the main page. Carcharoth 12:43, 8 January 2007 (UTC)
      • No. It remembers what it's protected and only unprotects things it protected itself. This may result in things being protected for longer than they should be, but that's infinitely preferable to things being unprotected when they shouldn't be – Gurch 12:45, 8 January 2007 (UTC)
        • I agree that having some things protected for longer than they should be is better than the alternative, but one of the advantages of having ProtectionBot unprotect things was that admins would no longer have to do this chore. Admins will need to learn that if they protect something, they can't rely on ProtectionBot to unprotect it. Probably a separate bot is needed to unprotect any selected anniversary pages that remain protected after leaving the main page. The Picture of the Day and Today's Featured Article daily templates remain protected, I believe, as a record of what that bit of that day's main page looked like. The random stuff going on and off the featured article page and the DYK and ITN templates are the admins' responsibility to protect and unprotect as needed, so I am happy that the query is not a problem, and have struck it out. The human error bit remains, of course, and there is not a lot we can do about that. Carcharoth 12:55, 8 January 2007 (UTC)
  • Administrators forgetting to protect something in the first place before adding to the main page or to that day's featured article. ProtectionBot will protect a short while later, but a small window of opportunity remains for vandalism. Administrators should not be complacent and should still remember to protect and unprotect DYK and ITN templates/images (ITN is the most common update area, other areas less so as DYK should be done through the DYK update area, though the image on the Featured article blurb sometimes gets wrangled over) and featured article templates/images that they add to the featured article or main page on the day (if added a day beforehand, or to the DYK update area, ProtectionBot will protect for you the day before, via Main Page/Tomorrow). Carcharoth 13:02, 8 January 2007 (UTC)
    • Possible solution - if the ITN editors feel they might still forget to protect images, then they could move to an update area like DYK and lag a day behind the news. Just for the image, maybe, and have the other ITN lines updated throughout the day. Carcharoth 13:06, 8 January 2007 (UTC)
  • Major redesigns of the main page or its templates. Any major (or even minor) redesign of the main page and its various template systems may impact the operation of the bot. Tread carefully before carrying out redesigns, and drop a note off at User talk:Dragons flight. This is an argument for having the actual step-by-step processes (if not the actual code) described as fully as possible. ie. a log of what it does, like an annotated version of its protection/edit contributions list. Carcharoth 14:06, 8 January 2007 (UTC)
    • It should actually be quite robust against anything that you could do (though I should never doubt the potential for people to surprise me). More troubling I think is the potential for changes to Mediawiki to break it. Relevant changes would probably be quite infrequent but are at least possible. Dragons flight 14:19, 8 January 2007 (UTC)
  • Changes in Mediawiki could affect the way the bot operates. The bot programmer has said (see above): "More troubling I think is the potential for changes to Mediawiki to break it. Relevant changes would probably be quite infrequent but are at least possible." (User:Dragons flight, 08/01/2007). Carcharoth 14:58, 8 January 2007 (UTC)
  • No-one watching Main Page/Tomorrow for vandalism that then gets frozen in place by ProtectionBot. What is needed here is a way for any admin to 'sign off' on the Tomorrow page and confirm it is not in a vandalised state, and for ProtectionBot (preferably, or possibly another bot) to squeal if such a check hasn't been performed. This could be similar to the breakdown alert system currently in place for ProtectionBot. Carcharoth 15:25, 8 January 2007 (UTC)
    • It doesn't take an admin to look at and call attention to problems with that page, anyone could do it. Dragons flight 15:41, 8 January 2007 (UTC)
      • The thing I had in mind was not so much calling attention to the problem, as having a box ticked to confirm that someone had checked the page. If this is not done, you can end up with everyone or no-one checking the page. By sod's law, and as people get bored doing this check, the one time no-one checks will be when the page (through one of its templates) is in a vandalised state. Everyone is away at various times, so you can't rely on a single person to carry out this single check. The reason an admin is needed to check the box (or turn a big red light green), is that if anyone can 'tick the box', then a vandal will do it. I suggest the sequence should go: (1) ProtectionBot protects all templates etc. on 'Tomorrow' at the beginning of a day. (2) An admin makes a change to a protected page (call it the checkpage) that indicates that the 'Tomorrow' page has been checked by a human, and indicates to others that this change has been done. (3) ProtectionBot checks the checkpage and if the change hasn't been made that indicates a human has checked the page, e-mails the admins on its list. (4) At the end of the day, ProtectionBot changes the checkpage back to its "unchecked" status. Put this checkpage on a ProtectionBot subpage if need be, and then transclude as a little red/green light at the top right of Main Page/Tomorrow. Does this sound workable or too complicated? Carcharoth 16:38, 8 January 2007 (UTC)
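The four-step checkpage sequence proposed above can be sketched as a small state machine. This is only an illustration of the proposal, not anything ProtectionBot is known to implement. The class and method names are hypothetical, and the e-mail alert is modelled as a list of messages.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CheckPage:
    checked: bool = False              # the red/green light
    checked_by: Optional[str] = None   # which admin signed off

@dataclass
class TomorrowCheck:
    admins: set                        # accounts able to edit the protected checkpage
    page: CheckPage = field(default_factory=CheckPage)
    alerts: list = field(default_factory=list)

    def sign_off(self, user: str) -> bool:
        """Step 2: only an admin can mark 'Tomorrow' as checked."""
        if user not in self.admins:
            return False               # the checkpage is protected, so the edit fails
        self.page = CheckPage(checked=True, checked_by=user)
        return True

    def bot_pass(self) -> None:
        """Step 3: the bot alerts its admin list if nobody has signed off."""
        if not self.page.checked:
            self.alerts.append("Main Page/Tomorrow has not been checked by a human")

    def end_of_day(self) -> None:
        """Step 4: the bot resets the checkpage to 'unchecked'."""
        self.page = CheckPage()
```

Step 1 (protecting everything on 'Tomorrow') is outside this sketch. The point of keeping the flag on a protected page is visible in `sign_off`: a non-admin cannot turn the light green.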

Bot error

  • Protecting a vandalised transclusion added to the featured article. The bot cannot check whether an image or template is in a vandalised state before it protects it. If a vandal strikes lucky and vandalises a template just before it gets protected (unlikely but possible), then an admin is required to unprotect and undo the vandalism. If no-one is watching closely, then a vandal could do this on the featured article and then remove the newly protected vandalised template or image and add it to lots of pages. The template/image in question will be unprotected by ProtectionBot after two passes, but a lot of damage could be done in this time interval. Carcharoth 12:43, 8 January 2007 (UTC)
    • Such a race condition is possible (beans anyone?). I don't see any way around it. However, since the vandal has to guess at when the bot will run, I'd guess that on average he would be blocked even before he succeeded at getting the timing right to protect something. Any other suggestions? Dragons flight 14:12, 8 January 2007 (UTC)
      • If this one is too bean-y, please remove it. But this has been discussed elsewhere as well. I think the problem (of a malicious user indirectly using ProtectionBot to get something protected) may be resolved if the featured article and main page functions of ProtectionBot are separated. Then it becomes a question of whether Misplaced Pages:Main Page featured article protection ever gets resolved. Carcharoth 14:58, 8 January 2007 (UTC)
        • Could the bot at least post a message somewhere, after the protection, if the page had a recent edit (i.e. more recent than the last time it scanned)? The bot wouldn't be able to tell if the page had been vandalised, but it would be able to call in a human who could tell. --ais523 17:19, 8 January 2007 (UTC)
  • Protecting a vandalised state of a rotating main page transcluded page. A similar example to the above is when ProtectionBot protects the rotating transcluded pages that use date parsing to queue the main page templates for the featured article and the picture of the day and selected anniversaries. This is done in advance by using Main Page/Tomorrow, but unless humans watch this page, vandalism may pass un-noticed here for a day until it flips over onto the Main Page. Carcharoth 12:43, 8 January 2007 (UTC)
    • Yes, humans will still have to pay some attention. But looking at a single page to see if it looks right ought to be a much easier task than checking the protection state of everything. Dragons flight 14:12, 8 January 2007 (UTC)
    • Conclusion - cannot be detected by ProtectionBot. Requires human oversight. Reliable human checking system needs to be implemented, allowing humans to tell ProtectionBot that the page has been checked. Carcharoth 17:09, 8 January 2007 (UTC)
  • The bot may unprotect a page that should remain protected. The bot is unable to make the necessary judgement, though it could be programmed to look at whether the page is already in a high-risk category. Merely having it return the page to the state it was in before arriving on the main page or featured article is not enough, as some pages are protected by admins beforehand, but should be unprotected once they leave the sensitive area. Carcharoth 12:43, 8 January 2007 (UTC)
    • It will only unprotect pages that it has protected. If a high-risk template is added to the Main Page it will already be protected, so the bot won't do anything when it is removed. If an administrator protects a template/image themselves, and they add to the Main Page, the bot won't touch it at all, no matter what – Gurch 12:47, 8 January 2007 (UTC)
  • Transcluding featured article onto itself. Does the bot protect all templates transcluded on the daily FA, or all pages? If a vandal transcludes the featured article into itself, would the bot end up protecting the featured article and any vandalized content? Gimmetrow 13:05, 8 January 2007 (UTC)
    • Apparently (after trying it out on one of my user subpages) this is indeed possible. Strange, and well-spotted. And yes, I believe it does protect any transcluded pages. The rotating date pages for the main page featured article, picture of the day and selected anniversaries are actually page transclusions, not transclusions from template namespace. Carcharoth 13:13, 8 January 2007 (UTC)
    • I've added a line to prevent this eventuality. Dragons flight 14:12, 8 January 2007 (UTC)
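The rule described in the replies above (the bot never overrides an administrator and only unprotects what it protected itself) can be sketched as a ledger of bot-owned protections. This is an illustration of the stated behaviour, not the bot's actual code. `ProtectionLedger` and its methods are hypothetical names.

```python
class ProtectionLedger:
    """Remembers which pages the bot itself protected."""

    def __init__(self) -> None:
        self.protected_by_bot: set = set()

    def on_enter(self, title: str, already_protected: bool) -> str:
        """A page has appeared in a sensitive area."""
        if already_protected:
            return "skip"              # an admin protected it; leave it alone
        self.protected_by_bot.add(title)
        return "protect"

    def on_leave(self, title: str) -> str:
        """A page has left the sensitive area."""
        if title in self.protected_by_bot:
            self.protected_by_bot.discard(title)
            return "unprotect"         # the bot only undoes its own protection
        return "skip"                  # admin-protected or high-risk pages stay protected
```

The trade-off noted in the discussion follows directly: a page an admin protected by hand is never in the ledger, so it stays protected until a human unprotects it.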

Please add any more examples you can think of needing human oversight. Carcharoth 12:43, 8 January 2007 (UTC)

Shrubberies

We are the Knights who say Ni!

1. The bot must not be sysopped until we can see that the bot does only that which is stated
2. The bot may not be run under Dragons Flight's own account because that violates bot rules
3. The bot must therefore only be run under its own assigned account
4. The bot's assigned purpose requires sysop privileges
5. Goto 1

And there you have it. Guy (Help!) 16:38, 8 January 2007 (UTC)

Indeed. Lots of FUD being thrown about all over the place as well. Sad. —bbatsell ¿? 16:44, 8 January 2007 (UTC)
What about the discussion above, which is actively trying to lay out possible problems and solutions. Contributing or linking to that could help. Carcharoth 16:55, 8 January 2007 (UTC)
I was talking more about the votes that have no explanation, or that are POINT violations (my favorite so far is the one opposing because the bot did not sign accepting the nomination, then proceeding to chastise everyone else involved for not knowing the rules), or that list issues that are either not factual or have already been addressed; as is my mantra, discussion is never a bad thing. Administrative oversight will always still be required, and laying out exactly what will be required above is wonderful. —bbatsell ¿? 17:05, 8 January 2007 (UTC)
Adding it to Category:Administrators open to recall is my favourite. :-) Carcharoth 17:33, 8 January 2007 (UTC)
I'm all for adding the bot to that category, just as soon as it expresses its willingness to be added. ;) SuperMachine 17:35, 8 January 2007 (UTC)
The bot will agree to stand for reconfirmation upon the request of any six other bots. :) Newyorkbrad 17:41, 8 January 2007 (UTC)
I'll make sure no other bot will clerk for it (I was born in Detroit, we have ways to influence bots...) and thus the recall will fail procedurally. ++Lar: t/c 23:06, 8 January 2007 (UTC)
"The Knights have a weakness in that a number of words, when spoken to them, cause them pain and agony." (from the article Knights who say Ni). Thanks for that nice pointer, Guy. :-) --Ligulem 22:55, 8 January 2007 (UTC)
Alas! there is a way out! Go look thee, to the wonderful land of the test wiki. Just follow that yellow brick to find all sorts of wonderful things! </Wizard of oz>Anyway, this bot can easily test on the testwiki, and a sysop bit should be easy to come by over there. The wikimedia framework is very similar, so what works there should work here as well. Cheers! —— Eagle 101 23:12, 8 January 2007 (UTC)
I know a way out of that shrubbery, it removes number 1! Release the ****ing source code. Sorry for the starred language, but I'm really, really, annoyed that such a simple solution seems to be randomly overlooked.
And for humor, I think we need to stick this thing on ArbCom, it'll go well with AntiVandalBot. I wonder when it'll be programmed well enough to arbitrate? -Amarkov edits 06:07, 9 January 2007 (UTC)
You mean best publish the source code before starting the RfA then? To give the vandals some advantage in case the RfA should succeed - <chuckle>. Looks like we should modify Guy's "Knights who say Ni" program loop then :-). Nice RfA though. --Ligulem 10:25, 9 January 2007 (UTC)
Yeah, security through obscurity... No, really, such a simple way out. If anything the request for shrubbery is in obtaining access to the source code. "Drop me a note, I'll decide if you're trusted, and get back to you." Abu-Fool Danyal ibn Amir al-Makhiri 15:17, 9 January 2007 (UTC)
(in reply to Ligulem) Decent source, is secure, no matter who reads it. Anyone can go read the Mediawiki code that feels like it-go on, you can right now! What you'll note, however, is that this ability really doesn't make things any easier for the vandals, because the code is good. One hopes that this bot has certain features built in (for example, altering the exact times at which it performs its checks by random intervals)-these types of safeguards would not be compromised or harmed by anyone reading the code, and the bot could be compromised through simple observation if they're not. That being said, it seems like this bot's coder is pretty competent. Here's my problem. An adminbot must be exceptionally good. If the source code cannot be released without danger that this bot will be compromised, it's not properly coded. If Dragons Flight believes that it'll make anything easier for blackhats than having the code for AWB out there, this thing is either exceptionally dangerous or he's unaware of the real situation. Any of those scenarios make me nervous enough to oppose this. It's really too bad-I saw one of those nasty incidents, I think this is a good way to solve it, and I'd like to be able to support it. But basically, what's being asked here is for the community to support a person for admin because someone trustworthy nominated them. We don't do that. We go look through that person's history. For the same reason, I can't support here because I trust the coder-I'd have to see the code, and know that everyone can do so. Seraphimblade 15:29, 9 January 2007 (UTC)
One must also remember that there are a LOT more wikis than just en, but this bot will only protect on en. So, if DF openly releases the source code, vandals now have a bot handed to them where they can have it search out every single un-protected template and image on every Misplaced Pages in every language, change 10 lines and have it vandalize those pages instead. Could vandals program a bot to do this anyway? Sure... but let's not make it any easier for them and just hand them the absolute perfect vandal tool. As it stands now, the code has already been reviewed by numerous qualified programmers, and is available to anyone in good standing (including those on other language wikipedias who want to implement it on theirs). I'll be honest, I'd prefer if it were open-source. I love open-source and will always prefer it to closed-source. But opposing simply because it is not open-source is acting ideologically rather than practically. The "security through obscurity" charge doesn't work, because it's been reviewed by numerous people, and can continue to be reviewed by anyone who wants to. It's not closed-source because there's a possibility of it being compromised, but because there's a possibility of the code being abused. My $0.02. —bbatsell ¿? 16:12, 9 January 2007 (UTC)
The others I'm aware of do not have fully protected main pages anyway. I do believe the question is moot for all wikis except this one. I am also amused to find myself accused of being an open-source ideologue. Abu-Fool Danyal ibn Amir al-Makhiri 19:32, 9 January 2007 (UTC)
I've already received a request for ProtectionBot to be used on the Italian wiki, for which I am happy to provide the code provided that I can find an English speaking Italian bot operator who is willing to take responsibility for it. Dragons flight 19:59, 9 January 2007 (UTC)
Huh? I can edit some of the main page templates on Italian wikipedia. Like this one. It's not even semi-protected, and it doesn't look like it's supposed to be. Abu-Fool Danyal ibn Amir al-Makhiri 20:24, 9 January 2007 (UTC)
Is there really anything in the code that cannot be gotten from User:Shadowbot2/Source? HighInBC 16:34, 9 January 2007 (UTC)
Great, so you'll share with more "trusted users" in an entirely different wiki, but still can't be bothered to release the code so any wiki can use it, or so that we can review it. WTF, man? --badlydrawnjeff talk 00:43, 10 January 2007 (UTC)

(undent) "The others I'm aware of do not have fully protected main pages anyway. I do believe the question is moot for all wikis except this one." You appear to totally misunderstand the situation. This is not about vulnerabilities in the Main Page, it is about vulnerabilities in the Main Page Article, the most widely viewed page on all of en.wikipedia.org. And that article is never fully protected.

The bot is designed to find weaknesses (unprotected templates and images) and fix them. It is trivial for anyone familiar with that type of code to modify the program so that it finds weakness and lists them. And that list is a list of targets for vandalism.

Do you really believe that other wikis don't use templates and images, or that (the only way to avoid this problem) those other wikis fully protect templates and images so that only admins can modify them? Because those are the only two ways that this issue is "moot" for other wikis. John Broughton | Talk 01:34, 10 January 2007 (UTC)

Actually, just as a quick note, it does both. It does what you describe, but it also searches for and protects vulnerable templates and images on the Main Page, which is what has been a bigger problem as of late. —bbatsell ¿? 02:10, 10 January 2007 (UTC)

Italian Misplaced Pages

Okay, if I have misunderstood, please educate me: aren't there templates and images on the italian main page (and, of course, on the featured articles) which are editable on purpose? What would ProtectionBot do on the italian WP? And what would access to ProtectionBot's code do for an italian WP vandal that he could not already do with the "edit" links right under his nose on the main page? I'm pretty sure this goes for most, if not all, of the others -- with the notable exception of english WP. These concerns are misplaced. In any case, isn't Shadowbot2's code already written to do just what John fears: report vulnerable vandalism targets? You'd only need to change who it notifies. Abu-Fool Danyal ibn Amir al-Makhiri 14:38, 10 January 2007 (UTC)

I have no idea (and no familiarity with it.wiki), but the point you raise that it doesn't seem protected is valid. The request was raised by User:Dario vet, so perhaps you should go ask him. Dragons flight 15:55, 10 January 2007 (UTC)

Idea for a compromise about the source code

I see considerable worries about the code of the bot not being open source and if we fail to address these the whole thing will not fly.

As being one of the supporters of giving the bot admin rights and letting the bot do what it was intended for, I would like to try to work towards consensus and see if we can move a little bit on all sides.

Would it be acceptable, that Dragons flight gets a startup time where he can establish the bot and tweak it and iron out all early bugs without publishing the code and publish the code later, when the bot is running and already doing its job?

Part of this my proposal here is that those who give their RfA support as soon as the code is published would move a bit and give their approval for the RfA based on the mere promise of Dragons flight to publish the code in - let's say - a month?

Of course my proposal only works if Dragons flight would agree to publish the code in a month. Deal or no deal?

If this is not acceptable, what can be tweaked to make it acceptable?

Could we keep parts of the code unpublished? For example the code part(s) that determine the exact time when the bot will protect a specific page. Maybe this could be refactored out into a call of a random function so that the complete code could be published without giving the knowledge exactly when it is going to protect a page.

Please help work towards consensus, everybody! --Ligulem 10:40, 10 January 2007 (UTC)
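The idea of refactoring the timing into a call to a random function could look something like the sketch below. The base interval and jitter values are invented for illustration. The operating system's entropy source is used, so reading the published code would not reveal when the next run happens.

```python
import random

def next_delay(base_seconds: float, jitter_seconds: float,
               rng: random.Random) -> float:
    """Return the wait before the next scan: base plus a uniform random offset.

    The offset lies in [-jitter_seconds, +jitter_seconds]; the result is
    clamped so it can never be negative.
    """
    return max(0.0, base_seconds + rng.uniform(-jitter_seconds, jitter_seconds))

# SystemRandom draws from the OS entropy pool and cannot be seeded, so an
# observer who has the source still cannot predict the schedule.
scheduler_rng = random.SystemRandom()
delay = next_delay(base_seconds=600, jitter_seconds=120, rng=scheduler_rng)
```

With these made-up numbers each wait falls somewhere between 8 and 12 minutes. Only the range, not the exact moment, would be visible to someone reading the code.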

Personally, I would like to see at least an outline of the algorithm used, in order to be able to identify potential exploits and have them addressed ahead of time. Perhaps this is already written somewhere, but between this RfA and BRfA I can't find it. For instance, how is the template recursion handled? If template A transcludes B which transcludes A, does the recursion stop? How does it treat noinclude and includeonly parts of templates? Gimmetrow 13:53, 10 January 2007 (UTC)
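One common way to make the recursion in the A-transcludes-B-transcludes-A case stop is to keep a set of pages already visited. The sketch below shows that technique on a toy transclusion map. It is not a description of what ProtectionBot does, the function name is hypothetical, and it does not model noinclude or includeonly sections. Excluding the starting page from the result is an assumption made here to cover the self-transclusion case raised earlier.

```python
from collections import deque

def transclusion_closure(root: str, transcludes: dict) -> set:
    """Return every page reachable from `root` through transclusion.

    `transcludes` maps a page title to the titles it directly transcludes.
    The `seen` set guarantees termination even when templates form a cycle.
    """
    seen = set()
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for child in transcludes.get(page, []):
            # Never return the root itself, and never revisit a page.
            if child != root and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

For a map where the featured article transcludes itself and A, A transcludes B, and B transcludes A, the function returns just A and B and then stops.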
A month? Well... let's face it -- contrary to what you say, the whole thing is going to fly without addressing our concerns. So if that was the offer I'd take it. Abu-Fool Danyal ibn Amir al-Makhiri 14:54, 10 January 2007 (UTC)

Access rights for protection bot

This RfA is about giving the whole admin group access rights to user:ProtectionBot. Since the bot only does protections, could we increase the support for this bot by the community by limiting the rights we give to this bot to "registered"+"protect" (per WP:UAL)? I believe any steward can enact this (See "userrights" in WP:UAL). --Ligulem 12:19, 10 January 2007 (UTC)

If I'm correct, we can deposit a request at m:Requests_for_permissions#Miscellaneous_requests for this, if we have consensus. --Ligulem 12:23, 10 January 2007 (UTC)
I've asked the stewards for confirmation about the procedure. --Ligulem 12:33, 10 January 2007 (UTC)
Possible access levels
You can see the screenshot of steward version of Special:Makesysop with assignable groups. Because not every possible option is visible, here is the full list:
Bots
Sysops
Bureaucrats
checkuser
Stewards
boardvote
import
developer
oversight
MaxSem 13:04, 10 January 2007 (UTC)
If I remember correctly, a developer can invent a new user group with any set of permissions they like (in this case, a 'protectionbot' group with bot + protect + all autoconfirmed rights could be created); it's unlikely that the developers would do this without consensus that it's needed first. --ais523 13:10, 10 January 2007 (UTC)
"Protector" would probably be better in case for whatever reason they wanted to give a non-bot protect rights. --WikiSlasher 13:35, 10 January 2007 (UTC)
I've posted a question on wikitech-l. --Ligulem 13:41, 10 January 2007 (UTC)
One response there, from Gmaxwell, is: "If the operator of the bot is not trusted enough to have access to deletion or blocking, why do you trust him enough to have access to protection?" - my response to this would be that the bot will be running unsupervised, and there are concerns that defects in the code could lead to the other admin powers being used by a vandal/hacker who exploits said defects to start deleting and blocking, rather than protecting and unprotecting. It is a request designed to limit the potential damage that could be done by the other admin tools if an unsupervised adminbot account was compromised. If the bot ran awry and was manipulated to do stuff, undoing mass protections and unprotections would be easier to fix and less damaging than the other stuff (such as deleting and blocking). I've posted to User talk:Gmaxwell to see if he wants to respond here. Carcharoth 14:35, 10 January 2007 (UTC)
(In response to WikiSlasher) That would be a rejected perennial proposal. By lumping 'bot' in with it, it would prevent people complaining that the consensus there was ignored. (Personally, I don't see why we don't have separate delete/protect/block, etc., as it would help to remove some of the political notions of adminship, but that's a different discussion.) --ais523 16:01, 10 January 2007 (UTC)
As I've said in my oppose, I would be willing to support bots with certain delimited administrative 'powers' such as an auto-protection bot, so yes, if you can find a developer willing to finally do this, that might help. -- nae'blis 16:32, 10 January 2007 (UTC)

I don't like the idea of creating new access group, entities should not be multiplied beyond necessity. The theory of possibility of exploits seems to me extremely unlikely in this case. MaxSem 16:45, 10 January 2007 (UTC)

We can establish a lot of principles and all obey them. But how does that help in finding consensus about how to protect the main page against sneaky template vandalism? --Ligulem 17:21, 10 January 2007 (UTC)
Current tally of 183/37/13 can already be considered consensus. And even if not - making a new feature intended for use by a single bot is unproductive IMO. Sure, Werdna's solution, if implemented soon enough, would be the best. MaxSem 18:41, 10 January 2007 (UTC)
Please see the section below about Werdna's solution. I would want to see the queries answered before agreeing that it is the best solution. If it is not, those working on the ground will have to cover any gaps in the defence. Carcharoth 18:45, 10 January 2007 (UTC)
Also, I agree that the current tally can be considered consensus, but I also think that in the remaining four days of this RfA, the percentage will drift down towards 80%. I know that percentages are not ultimately what decides an RfA, but having it drifting steadily, if ever so slowly, downwards from 85% towards 80% does make the closing 'crat's decision slightly more difficult. Carcharoth 22:12, 10 January 2007 (UTC)

If a developer wants to make it happen, I have no problem with this. I don't think it's actually necessary (and probably sets a poor precedent of having to have developers get involved with any future admin bot approval), but I do understand the sentiment of those who would be comforted by this. Dragons flight 16:47, 10 January 2007 (UTC)

Ok, I will say this again, you cannot run arbitrary code on this bot because of two things: 1. It does not take any binary input so buffer overflow is out of the question, 2. Python does not support run-time encoding.
It is possible to fool this bot into protecting something it should not; I cannot imagine how, but it is possible. Short of getting the bot's password, there is no way to get the bot account to do something it is not programmed to do. Limiting access to protection only will only serve to assuage baseless fears. HighInBC 16:54, 10 January 2007 (UTC)

Newbie kind of idea

Erm... please don't shoot me down, cos I'm not a techie and I'm also not overly familiar with process here (yet) and I'm feeling my way with this message:

Is this correct, that the Bot needs admin powers only to be able to automatically protect and unprotect images? Am I also right that the real urgency is only the need to protect images, not the unprotecting again?

Surely a simple workaround would be to give all logged-in non admins the power to protect images (something an admin can easily over-ride if abused, and something not particularly attractive as a method of vandalism). The bot can then do the urgent protection and (also on an automation) send an unprotect message to an admin backlog page, like when us non admins flag something for speedy deletion (because obviously vandals should not have the chance to unprotect).

I presume that making the protection function available to non admins would be controversial (I'm not that wet behind the ears!) but if we pretend for a minute that that could go through "on the nod", have I made any mistakes in my understanding of what the Bot needs to do, that means my suggestion wouldn't work? --Dweller 17:02, 10 January 2007 (UTC)

It would be used in content disputes, that is why a level of trust must be established before such powerful tools are given out. This will not fly I am afraid. Maybe in the future when the world is perfect, I can see it now.... HighInBC 17:07, 10 January 2007 (UTC)
Hmmm... you mean where two editors edit war about the inclusion of an image, one might protect it? I get your point. --Dweller 17:27, 10 January 2007 (UTC)
Exactly. Of course, bots don't have a pride to lead them to edit war. HighInBC 17:43, 10 January 2007 (UTC)
Also, it needs to protect templates, not just images. Carcharoth 18:01, 10 January 2007 (UTC)

Rumours that ProtectionBot will be redundant!!

I followed the rest of the wikitech mailing list thread linked to above, and this post (and the previous one from Werdna himself) makes clear that possibly ProtectionBot will be redundant soon anyway. Of course, we won't know for sure until Werdna reveals what he has been working on, but it seems the vandalism and this RfA might have got things happening developer-side. Would be nice if a developer dropped in on this discussion and told us what is happening, though... Carcharoth 17:59, 10 January 2007 (UTC)

Like me? Yeah, I'm overhauling the protection system. The most visible of my changes is a "cascading protection" option - essentially, it protects anything in the Template: namespace that's transcluded onto a page, as well as the page itself, if a certain checkbox is ticked on the protect page screen. Any other issues like this, please come to developers and ask for a fix before hacking together a bot. I realise that Dragons Flight had great intentions, but really, this kind of thing needs to be implemented as part of MediaWiki, along with most of the stuff that gets proposed for adminbots and, indeed, some of the regular bots. — Werdna talk 18:06, 10 January 2007 (UTC)
This effort is very laudable and much appreciated. But I fear it will take quite some time to iron out all the quirks on this. For example you will have to separate direct protections to a template from "remote" protections (a template should remain protected if it was directly protected even if the cause for a "remote" protection goes away). And you need to track the protectors (a protector is a page that caused protecting a template) for each template. I would be surprised if you manage to track all these relations properly and in a stable manner. Let alone thinking of the user interface of this: for example we should know why a template is "remote" protected. And last but not least there are pages which are transcluded into protected other pages that should not be protected (see Misplaced Pages:Template doc page pattern). Just a few possible complications, I'm sure there are more. "overhauling" always sounds nice, until it comes to the details (just my experience as dumb software developer - not a MediaWiki developer though). But don't get me wrong: I really hope this can be implemented and I wish you all the best on this. In the mean time, we still have to solve the transclusion vandalism problem... --Ligulem 00:12, 11 January 2007 (UTC)
I'll reply to these objections in point form.
  1. There is a need to separate direct protections from remote protections.
No, there is not, because cascaded protections are not stored in the database.
  2. Tracking all the protectors (and displaying them to the user)
This will come in time. There's no need for "tracking" them: they're found on the spot, and are already retrieved with the current SQL queries anyway. As for displaying them to the user, we'll work on it, but I'm sure you understand this isn't a core feature for the modifications.

... I didn't see any more objections. — Werdna talk 01:57, 11 January 2007 (UTC)

What about transcluded items not in the Template: namespace? HighInBC 18:12, 10 January 2007 (UTC)
The point is that protection on page X is automatically extended to all items included in it, for the duration that they are there. So yes. Dragons flight 18:39, 10 January 2007 (UTC)
What about the Image namespace? Also, the main page 'templates' are not actually in the Template namespace. They are pages transcluded across from (I think) the Misplaced Pages namespace. For example, the featured article blurb is put on the main page as {{Misplaced Pages:Today's featured article/{{CURRENTMONTHNAME}} {{CURRENTDAY}}, {{CURRENTYEAR}}}}, which resolves (today) as Misplaced Pages:Today's featured article/January 10, 2007, which is then transcluded across. Which leads me to ask (and please don't get offended), have the developers communicated with the people who dealt with the vandalism and who created and who maintain the current main page and its protection processes? You not mentioning images and focusing on the template namespace leads me to think that something might slip through the cracks if all the bases aren't covered (to mix metaphors). Carcharoth 18:17, 10 January 2007 (UTC)
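The date-based transclusion described above can be illustrated with a small sketch. It only mimics how the three magic words expand into a page title; the function name and the month tuple are illustrative assumptions, not MediaWiki code.

```python
from datetime import date

# English month names, spelled out so the result does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

def featured_article_title(day: date) -> str:
    """Mimic {{CURRENTMONTHNAME}} {{CURRENTDAY}}, {{CURRENTYEAR}} for a given date.

    Hypothetical helper: it only shows which page name the main page
    ends up transcluding on a particular day.
    """
    return (
        "Misplaced Pages:Today's featured article/"
        f"{MONTHS[day.month - 1]} {day.day}, {day.year}"
    )

print(featured_article_title(date(2007, 1, 10)))
```

The point for protection purposes is that the transcluded page changes every day and lives outside the Template: namespace, so any rule keyed on namespace alone would miss it.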
Not to malign you or anyone else, but in my experience the response time on feature requests requiring new code to be written is lousy. Which I assume is a side effect of there being a small number of developers and a large number of priorities to take care of. Interesting question though, did the existence of the bot proposal make you more or less interested in patching this? Dragons flight 18:20, 10 January 2007 (UTC)
To be fair, Werdna, there have been many issues in the past that we asked the developers about, only to be promptly ignored or put indefinitely on hold, so we did have to come up with a hack on our own. For the record, I'm still waiting on single user login, non-vandalized version tagging, and non-linking date syntax. --Cyde Weys 18:28, 10 January 2007 (UTC)
All of which are coming in the near future. Brion's currently working on SUL, stable versions is supposed to start on dewiki shortly after SUL is finished, and Robchurch is currently working on unlinked dates. --Rory096 20:57, 10 January 2007 (UTC)
Oh wow. I even saw mentions of Category Intersection on the wikitech mailing list. This does all sound very promising. Let's all give the developers a big cheer, and look forward to playing with the new tools very soon. :-) Carcharoth 22:16, 10 January 2007 (UTC)
Another problem: "as well as the page itself" - this protection of "the page itself" is not viable for the featured article page, which needs to remain unprotected to allow editing, but the transcluded material (images, templates, other pages, whatever) needs to be protected. Carcharoth 18:31, 10 January 2007 (UTC)
Also, does it unprotect templates once they've left sensitive areas? Does it at least add the protected templates to a category so others can tidy up after it if needed? Note that not all templates should be unprotected once they leave the sensitive area, some need to remain protected as they are high-risk templates. (copied from Werdna's talk page) Carcharoth 18:34, 10 January 2007 (UTC)
ie. The process needs to protect unprotected stuff, and unprotect them later, while ignoring protected stuff. Can that really be done through Mediawiki coding? Carcharoth 18:39, 10 January 2007 (UTC)
Items should automatically revert to their previous protection status. The point is that the protection applied at any given time should automagically be determined as the greatest of the protection applied to the item itself plus the protection applied to any pages that include the item (and have cascading protection enabled). Dragons flight 18:43, 10 January 2007 (UTC)
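A minimal sketch of the "greatest of" rule described above, assuming a simple ordering of protection levels. The `Page` class, the `LEVELS` table and the `transcludes` field are all hypothetical stand-ins, not MediaWiki's actual schema. `transcludes` is assumed to list everything a page pulls in, and self-inclusion is explicitly exempted.

```python
from dataclasses import dataclass, field

# Hypothetical ordering of protection levels (an assumption for this sketch).
LEVELS = {"none": 0, "semi": 1, "full": 2}

@dataclass
class Page:
    title: str
    protection: str = "none"      # protection applied directly to this page
    cascading: bool = False       # whether its protection extends to inclusions
    transcludes: set = field(default_factory=set)  # titles this page includes

def effective_protection(title: str, pages: dict) -> str:
    """Return the strongest of the page's own protection and the protection
    of any cascading page that currently includes it.

    Nothing is written back: the result is recomputed on every call, so an
    item falls back to its own protection once it is no longer included.
    """
    best = pages[title].protection
    for page in pages.values():
        if page.title == title:
            continue  # a page including itself does not protect itself
        if page.cascading and title in page.transcludes:
            if LEVELS[page.protection] > LEVELS[best]:
                best = page.protection
    return best
```

Because the cascaded level is derived rather than stored, there is no separate "unprotect" step to get wrong: removing the inclusion is enough.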
Ahh, nothing like a wiki community to ask a bazillion questions. I'll respond to the objections/comments/questions in turn:
  1. But it will only protect in templatespace!
    Myself, Domas and Tim Starling decided later on that this restriction was unnecessary. I also forgot to mention that it protects images, too. — Werdna talk 22:12, 10 January 2007 (UTC)
  2. But it will protect the page itself, and not ONLY the page's transclusion templates: — Werdna talk 22:01, 10 January 2007 (UTC)
    If this is a requested feature, I will implement it. It's a fairly trivial fix, and I'm glad you let me know about it - because it's far easier to add the fields into the database now. — Werdna talk 22:01, 10 January 2007 (UTC)
  3. Does it unprotect templates once they've left sensitive areas?
The templates are never protected, according to the database. Similarly to the protection afforded to CSS/JS user subpages, the protection is a special case in the code. — Werdna talk 22:01, 10 January 2007 (UTC)
Addendum: I see that it's also been asked whether or not MediaWiki will unprotect the pages once they are off the main page — it will not. The new functionality actually never modifies the database. Another myth: busted. — Werdna talk 22:25, 10 January 2007 (UTC)
That's all I can see so far, please feel free to add more! — Werdna talk 22:01, 10 January 2007 (UTC)
You've answered all my questions, and admirably so! Thanks for answering and explaining things. I guess the only question now is whether these changes occur before the ProtectionBot RfA finishes (on 14 January 2007). If not, hopefully the closing bureaucrat will be aware of all this. If yes, then the question of withdrawal raises its head, and whether people will squabble over whether Dragons flight or ProtectionBot should do the honours... I can just see the Signpost headlines now. "First-ever adminbot set to be promoted at RfA! Last-minute developer patch squashes bot's dreams of glory!! Bot refuses to be interviewed and sulks in corner!!!" Carcharoth 22:33, 10 January 2007 (UTC)
Please also have a way of identifying which page is responsible for the cascading protection applied to a given item. Dragons flight 22:45, 10 January 2007 (UTC)
I have a feeling that I may be repeating a question here. Would it be possible to have the cascading protection of en's Main Page protect the images both locally and at Commons? Currently admins must upload local copies or, in the cases of some admins who aren't familiar with the process, they just link to the Commons image, resulting in a mad scramble to upload a local protected copy before a vandal either overwrites the Commons copy or uploads an inappropriate image to the local image location. - BanyanTree 03:33, 11 January 2007 (UTC)

Fix has been committed in r19095. The only use I can see for the bot at this point is re-uploading commons images prior to them appearing on the main page (which can be done by a regular user account); once they appear on the main page, the normal cascading protection will apply. The changes will become live in the next few days, depending on how long before a developer gets off their ass and does the required database updates :-). — Werdna talk 23:39, 10 January 2007 (UTC)

Your link gives me a 403 error. Dragons flight 23:43, 10 January 2007 (UTC)
Try 19095 instead. Carcharoth 23:52, 10 January 2007 (UTC)
Or, indeed, rev:19095. —Ilmari Karonen (talk) 00:24, 11 January 2007 (UTC)
I doubt this solves the problems I've listed above. --Ligulem 00:30, 11 January 2007 (UTC)
Per the "Field for future support of per-user restriction", I once thought about a text page under each user's space (with a reserved name like "access", containing DENY and ALLOW clauses for that user, editable only by admins with wildcards for page names). So we could allow XYuser to edit Common.css by specifying "ALLOW MediaWiki:Common.css" in user:XYuser/access. Just an idea... :-) --Ligulem 00:45, 11 January 2007 (UTC)
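As a rough illustration of the idea above, here is one way such an "access" page could be parsed and evaluated. Everything here is hypothetical: the clause syntax beyond ALLOW/DENY, the shell-style wildcards, and the "last matching clause wins" rule are assumptions made for the sketch, not an existing MediaWiki feature.

```python
from fnmatch import fnmatchcase

def parse_access(text: str) -> list:
    """Parse lines such as 'ALLOW MediaWiki:Common.css' into (verb, pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        verb, _, pattern = line.partition(" ")
        pattern = pattern.strip()
        if verb not in ("ALLOW", "DENY") or not pattern:
            raise ValueError(f"unrecognised clause: {line!r}")
        rules.append((verb, pattern))
    return rules

def decide(rules: list, title: str):
    """Return True (allowed), False (denied), or None if no clause matches.

    Assumption: later clauses override earlier ones. None means the normal
    protection rules apply unchanged.
    """
    verdict = None
    for verb, pattern in rules:
        if fnmatchcase(title, pattern):
            verdict = (verb == "ALLOW")
    return verdict
```

For example, a page containing only `ALLOW MediaWiki:Common.css` would make `decide(rules, "MediaWiki:Common.css")` return `True` and leave every other title at `None`.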
Did a quick testinstall. Looks damn sexy. Per "cascaded protections are not stored in the database" I just hope this can take the load of en.wikipedia then and thus passes brion (..., which refers to ). --Ligulem 10:00, 11 January 2007 (UTC)

Problem?

(header added by Carcharoth 22:53, 10 January 2007 (UTC))

Following a brief discussion on #wikimedia-tech, it was pointed out that protecting every page transcluded into the featured article, but not the featured article itself, could pose a serious security hole, allowing a vandal to protect arbitrary pages. Strongly discourage the existence of this feature in the bot, and the development team will not be implementing this as a feature. Human eyes are needed to check for this vulnerability. Cheers, — Werdna talk 22:23, 10 January 2007 (UTC)

Ah. This possibility was pointed out with ProtectionBot as well. The risk is less here because ProtectionBot protection is a random action, not an immediate effect. With the Mediawiki coding, surely if a vandal transcludes a page to the Featured Article, thus getting that page protected, won't just removing the page reverse the protection? What about having the cascading full protection feature available for article s-protection as well as full protection? No, that won't work either, as vandals can easily use sleeper accounts. Hmm. Looks like discussion will have to go to Misplaced Pages:Main Page featured article protection. Carcharoth 22:43, 10 January 2007 (UTC)
This has been raised in my discussions as well. A suggested (but not yet implemented) response was to require an item to appear as needing protection during multiple passes of the featured article separated by some interval (e.g. 15 minutes) so that a single act of vandalism was less likely to result in protection, and to have a hard limit on the total number of items the bot would protect at any one time. This ameliorates (but does not entirely prevent) the situation you suggest. This approach is of course not practical for Mediawiki. Since the vandalism required to promote this sort of spurious protection should be easily revertible, I'm not sure how seriously I would rate it even in the Mediawiki case (though self-inclusion could have strange consequences if not explicitly exempted). Dragons flight 22:45, 10 January 2007 (UTC)
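The suggested safeguard (several passes plus a hard cap) might look roughly like the following. This is only a sketch of the idea as described in the comment above, not ProtectionBot's code, which was never published here; the class name, defaults and method are invented for illustration. The caller is assumed to invoke `run_pass` once per interval (e.g. every 15 minutes) with the set of items currently transcluded and unprotected.

```python
class ProtectionQueue:
    """Protect an item only after it is seen on several consecutive passes,
    and never hold more than a fixed number of protections at once."""

    def __init__(self, required_passes: int = 2, max_protected: int = 50):
        self.required_passes = required_passes
        self.max_protected = max_protected
        self.seen = {}          # title -> consecutive passes on which it appeared
        self.protected = set()  # titles this queue has protected

    def run_pass(self, needing_protection) -> list:
        """Record one scan and return the titles newly protected by it."""
        needing = set(needing_protection)
        # An item that vanished between passes (e.g. the vandalism was
        # reverted) loses its count and must start over.
        self.seen = {t: n for t, n in self.seen.items() if t in needing}
        newly = []
        for title in sorted(needing):
            if title in self.protected:
                continue
            self.seen[title] = self.seen.get(title, 0) + 1
            ready = self.seen[title] >= self.required_passes
            if ready and len(self.protected) < self.max_protected:
                self.protected.add(title)
                newly.append(title)
        return newly
```

With `required_passes=2`, a page that a vandal transcludes and that is reverted before the next pass is never protected, which is the mitigation the comment describes. The cap bounds the damage if the check is defeated anyway.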

PseudoCode?

Why not publish pseudocode? ---J.S (T/C/WRE) 18:47, 10 January 2007 (UTC)

I am willing to publish pseudocode as a compromise, provided a couple of bits can have a high level of abstraction. It would probably take me at least a couple of days before I can find time to write it out though. Also, I'm not sure how many of the people objecting on the basis of open source concerns would actually accept pseudocode. Dragons flight 19:04, 10 January 2007 (UTC)
Well, it might be an acceptable compromise for some people on the fence. Just tossing out ideas:) ---J.S (T/C/WRE) 19:05, 10 January 2007 (UTC)
  • I would be willing to review and vouch for the accuracy of pseudo code. I have received the code to the bot, which was sent promptly and politely. ++Lar: t/c 22:46, 10 January 2007 (UTC)
  • I can't speak for Peter, but for me, I won't waste Dragons flight's time - picking and choosing who the code is sent to is not transparency, whether or not that list happens to include me, and even if he sent me the code right away it wouldn't change my mind. Someone else might spot a bug the handpicked list would miss; that's why code should be open to all eyes. Nothing less would sway me. I already trust Dragons flight that, to the best of his ability, the code works and does what he says it does. But there are always bugs and vulnerabilities in any non-trivial code, and the more people can look at it, the more get found. Our alternative is to wait until a vandal figures out a weakness by observation or "plinking", and the handpicked group scrambles to figure out how it happened. Secrecy is not a good way to write code, not a good way to audit code, and realistically, it is against the spirit of how we do things around here. Seraphimblade 22:55, 10 January 2007 (UTC)

New MediaWiki feature "Cascading protection" now enabled

See (just for the record). --Ligulem 10:39, 14 January 2007 (UTC)