Revision as of 12:46, 8 January 2007
Someone please explain to me...
Why can't the source code be revealed? AWB would require much less modification to be an effective vandalbot, and its source is freely available to anyone who cares. -Amarkov edits 18:18, 7 January 2007 (UTC)
- Not sure, I have read it and it seems to be safe releasing the source. HighInBC 18:26, 7 January 2007 (UTC)
If Dragons flight released the source, I would withdraw my opposition. My only significant beef is the needless secrecy. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 19:05, 7 January 2007 (UTC)
Dragons flight has stated (see comment under Oppose #1), "The code has been released to trusted members of the community for review, but it will not be made public. I feel the risk of people adapting certain functions to create powerful vandalbots is too great." Perhaps other users who have seen and reviewed the code can comment on this issue. This seems a plausible concern to me but an even bigger concern to me is that releasing the code would allow the vandals to try to reverse-engineer ways around it (compare WP:BEANS). Newyorkbrad 19:10, 7 January 2007 (UTC)
- There is no WP:BEANS here. This is nothing that couldn't be done with the freely and openly available pywikipedia framework. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 19:21, 7 January 2007 (UTC)
- I agree, the pywikipedia framework, the perl wikimedia module, or just plain html scripting can get the same results. The functions this bot performs are not difficult to reproduce. What's more, the code would not be able to perform admin functions on a non-admin account anyway, so it is really just the recursive unprotected template/image finder. If the bot is functioning, then this list of unprotected pages will not be a threat. I read the source and I see no reason to keep it a secret, but I respect the author's right to do so. HighInBC 19:28, 7 January 2007 (UTC)
- Earlier today, I was thinking the same thing as you, HighinBC, but I've realised the potential issue with releasing the code. I'm going to break WP:BEANS here (on the understanding that the code won't be released), in order to enlighten everyone. The simple matter is that the bot code could be changed to automatically vandalise every unprotected page, perhaps before the bot would be able to protect, and cause the vandalised page to be protected. This is a very serious possibility, allowing vandals to easily impose mass vandalism (esp. image vandalism). If anyone thinks that this comment is severely WP:BEANS, blank it. Martinp23 20:03, 7 January 2007 (UTC)
- This could be easily done with ANY bot framework - including my own or perlwikipedia - so where's the specific risk? Please clarify. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 20:21, 7 January 2007 (UTC)
- We get it. The point is that we want to make it as hard as possible for people to do that. Would you like Tawker to release the source for AVB too? That would be incredibly stupid too. -Royalguard11(Talk·Desk·Review Me!) 20:36, 7 January 2007 (UTC)
- Peter, I'm sure that perlwikipedia doesn't allow you to find all unprotected pages/files linked from one, does it? Martinp23 20:41, 7 January 2007 (UTC)
- Actually, yeah, it does, thanks to the lovely patch the devs made to the transclusion list code. Shadow1 (talk) 22:11, 7 January 2007 (UTC)
- I can get every contribution an editor's made, every edit to an article by x users - getting all transclusions is trivial, since it's just an api.php hack. I appreciate the security concerns, but I feel they are unwarranted. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 20:53, 7 January 2007 (UTC)
- You can get the list of pages transcluded on one page using api.php?! Wow - I didn't know that (though I do use api.php a lot for my bots, I tend to stick to the same queries). Can you give me a link to show this (just out of interest)? Martinp23 21:30, 7 January 2007 (UTC)
- That is a really simple sub-routine to make for anyone capable of editing existing code. HighInBC 20:42, 7 January 2007 (UTC)
- I'm fairly certain that I could write a decent vandalbot in under 5 minutes with perlwikipedia, it's not like this sort of thing requires a rocket scientist </cliche>. Any fifth grader with a decent knowledge of Perl and a copy of the WWW::Mechanize module can write one. Shadow1 (talk) 22:11, 7 January 2007 (UTC)
One thing that just came to my mind - Dragons flight noted on the BRFA that the bot would run at random times etc. to prevent the vandals from predicting its execution and racing to vandalise. I haven't seen the code yet, but this feature (or something similar) may well be the reason that the release of the source code would violate WP:BEANS. Миша13 22:01, 7 January 2007 (UTC)
- It isn't all that hard to design a RNG algorithm such that determining the times from it without having direct access is too hard to be plausible. Video games have managed that for a while, I think that a bot can. -Amarkov edits 22:06, 7 January 2007 (UTC)
- I agree with Amarkov here ... any even somewhat decent implementation of a RNG would not allow anyone to predict its random numbers, even with access to the source. Besides, even if it was a simple timestamp RNG, on-wiki actions are only reported to the nearest second, whereas the script would be using a more fine-grained time seed than that. So there really would be no way to try to predict when it would run again. --Cyde Weys 22:19, 7 January 2007 (UTC)
- I don't really think my point got across. If you wanted to build a vandalbot, you could do it from this, grabbing the unprotected pages... or you could just remove the checkpage requirement from AWB, set it on auto mode, and vandalize away. Much easier than introducing editing functionality to a bot that doesn't have it, and plus, as long as you have a user and user talk page, and are careful not to remove or add too much stuff, it won't look any more suspicious than any other AWB fix. While removing the checkpage requirement isn't a trivial matter, anyone who could turn this bot into a vandalbot could manage it. -Amarkov edits 22:32, 7 January 2007 (UTC)
- So far as I can tell, the code of User:AntiVandalBot isn't public (at least I couldn't find it)... why is no one freaking out about that? It's a much more complicated bot that can make edits to every page on Wikipedia. It makes more edits in a day than the protection bot will in an entire year. If it 'went berserk' it could require vastly more work to clean up than the proposed protection bot ever would. In short, all the concerns expressed about 'protection bot' are vastly more applicable to 'antivandal bot'... yet the code is not public and no one seems to mind. Why do you suppose that is? Why do you suppose that 'auto wiki browser' isn't just given out to anyone who wants it? My own theory is that most people realize that 'making smarter vandals' is a bad idea. Yes, a vandal could build their own version of 'anti vandal bot' that instead creates vandalism... some have. But most of them aren't 'dedicated' enough to figure out the hows of it and eventually go away. Does it really make sense to HAND those people a ready made vandalism tool that just requires a few tweaks to create a massive mess? That's what making 'protection bot' or 'anti vandal bot' code publicly available would do... give general vandals the ability to do a lot more damage. We can handle the few vandals who are capable of building their own bots. Let's not give every vandal the ability to make bot attacks. --CBD 23:01, 7 January 2007 (UTC)
- AntiVandalBot obviously does have a vandalism problem. It can edit anything already, it can do it fast, and it requires no human intervention. This bot can only edit images and templates, and even then only to add or remove three specific things, so it would take loads more work to convert it into a useful vandalbot. And as I've reiterated a lot already, we already have the full source of AWB, which would be much easier to convert to a vandalbot. (It wouldn't even be conversion, really). -Amarkov edits 23:07, 7 January 2007 (UTC)
- Alternatively, you can use my perlwikipedia framework to write a vandalbot. I just wrote a dirt simple, proof-of-concept one with the framework, 24 lines of code, that uses threading and multiple usernames. Elapsed time: 4 minutes. Just because the bot is open-source doesn't make it an automatic target for vandals trying to create vandalbots. It would probably be harder to convert ProtectionBot into a vandalbot than it would be to write one from scratch using pywikipedia. Shadow1 (talk) 23:31, 7 January 2007 (UTC)
- I agree, bot making is not some secret, anyone can learn it and use existing frameworks. HighInBC 23:32, 7 January 2007 (UTC)
- Absolutely agree. Not that I don't trust HighInBC, but I believe strongly in trust-but-verify. I already know pretty well how Antivandalbot works just by having seen what types of things it's done, and it would not be hard to write vandalbots from what's already out there. Our anti-vandalism techniques need to be just as open, so that when the vandalbot runners find a way around them (and you believe me, they will), we can respond quickly and improve our own techniques (and perhaps find weaknesses before they're exploited). Security through obscurity isn't security, and if this bot's code is too insecure to post, it's too insecure period, let alone to trust with an admin flag. Seraphimblade 00:04, 8 January 2007 (UTC)
- To echo some comments from other editors that I think are most worthy of consideration: nothing this bot could do is difficult or uniquely complex, there's no good reason not to publish, and publication would facilitate bug discovery and resolution. The bot could be blocked if it ever caused problems. It should also be possible to distribute a version of this bot set to run in semi-automatic attended mode, which would enable the work to be done efficiently without the risk that comes with a fully automatic bot, of being fooled by cleverly written malware or mischievous humans. --Tony Sidaway 07:12, 8 January 2007 (UTC)
I believe many people here are grossly underestimating how little modification to the source it would take to turn a bot that looks for vulnerabilities in order to protect them, into a bot that looks for vulnerabilities in order to vandalize them. Changing fewer than 5 lines would turn this into effective malware. Changing a few more than that would be enough to let it rampage all over the place. If you are unwilling to accept this as private source, then by all means kill it, but I have no intention of making the source public. Dragons flight 07:34, 8 January 2007 (UTC)
- I said this in my oppose !vote-if this bot code, through error or malice, is that dangerous (and if danger exists, either error or malice can lead into it), and would be that dangerous if a non-admin had possession of it, it is more, not less, critical that the code be open to continuous review-not just now but during its operation. It's not like we've never seen a vandalbot, but if this code is suddenly released we'll have a flood of them. (Please note-you certainly have the right to keep your code secret, but even if most seem alright with that, I think it's a bad idea and will in the end decrease the effectiveness of the response against vandalism. And for myself, I can't support it without seeing it.) Seraphimblade 07:50, 8 January 2007 (UTC)
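For reference on the transclusion point raised by Peter M Dodge and Shadow1 above, here is a minimal sketch, assuming the present-day MediaWiki query API (action=query with prop=templates), of how a request for the templates transcluded on one page could be put together. The helper name build_transclusion_query and the default endpoint are illustrative assumptions; nothing here is taken from ProtectionBot, perlwikipedia, or pywikipedia. The sketch only builds the URL and makes no network request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint for illustration; substitute the wiki being queried.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_transclusion_query(title, endpoint=API_ENDPOINT):
    """Return a URL asking the API which templates are transcluded on `title`."""
    params = {
        "action": "query",      # generic query module
        "prop": "templates",    # templates transcluded on the given page(s)
        "titles": title,
        "tllimit": "max",       # as many results per request as allowed
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


url = build_transclusion_query("Main Page")
print(url)

# Round-trip the query string to confirm the parameters survived encoding.
query = parse_qs(urlsplit(url).query)
print(query["prop"], query["titles"])
```

The point of the sketch is only that the list of transclusions is ordinary public data; whether that changes the risk assessment is the question being argued above.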
Current status question
(cross-posted to bot approval page) With the RfA now pending, is ProtectionBot currently operating during the RfA period? I hope that it is, at least on an ongoing trial basis. Newyorkbrad 20:21, 7 January 2007 (UTC)
- A member of the BAG ended the trial after one day and instructed DF to shut down the bot here, and DF did as he requested, so no, it's not running. —bbatsell ¿? 20:30, 7 January 2007 (UTC)
Suggest continued trial operation during RfA period
If Dragons flight is willing I would like to see this bot continue operating on a trial basis during the RfA period, both so we have the benefit of its services during the next week and so that in the unlikely event of an issue arising the RfA !voters could consider it. Comments? Newyorkbrad 20:32, 7 January 2007 (UTC)
- I think BAG shut it down. In the meantime we have User:Shadowbot2, which, as stated on the RFA page, is fixed and will perform correctly. Cheers! —— Eagle 101 23:16, 7 January 2007 (UTC)
- Probably best to just wait, I know I am checking shadowbot2's mailings. HighInBC 23:18, 7 January 2007 (UTC)
- Suggestion: Might it be possible to authorise the continued running of ProtectionBot for as long as this RfA maintains a suitable level of consensus for the Bot? e.g. 80 or 85%? That would combine practicality with respect for the views of the community... WJBscribe 23:39, 7 January 2007 (UTC)
- That's a very nice idea, and I'd commend you for lateral thinking, but I don't think it's feasible. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 23:41, 7 January 2007 (UTC)
- No need to be bureaucratic about it, this bot is useful (and the RFA has overwhelming support so far) so there's no reason why it shouldn't keep running for a few more days. >Radiant< 12:28, 8 January 2007 (UTC)
Buffer overflow
I see a few people concerned about buffer overflow exploits. My understanding is that this type of vulnerability can only be used on a bot that can be given binary input. Since this script gets all of its input from MediaWiki, which stores its data in text form, I see no way to insert such an attack. Python does not allow for run-time compiling. You cannot fool such a bot into running arbitrary code given such input restrictions, as the precompiled code needed for such an attack cannot be stored as text.
I may be wrong, so correct me if I am, but it seems a buffer overflow vulnerability is not an issue for technical reasons. HighInBC 23:50, 7 January 2007 (UTC)
- Mostly correct. It could still malfunction on malformed input, or if the input format changes. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 23:52, 7 January 2007 (UTC)
- Malfunction on malformed input is far from executing arbitrary code, and would lead to a parsing failure. And changing input formats would exceed the approval it is seeking. HighInBC 23:53, 7 January 2007 (UTC)
- I know - hence "mostly correct" :) I just felt it necessary to point out that there are still ways in which this kind of thing may happen, if without the severe consequences. Cheers, ✎ Peter M Dodge ( Talk to Me • Neutrality Project ) 00:00, 8 January 2007 (UTC)
- I see, I agree that we cannot discount the possibility of the bot being intentionally screwed with, but I think the threat of arbitrary code execution is not an issue. HighInBC 00:04, 8 January 2007 (UTC)
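To illustrate the distinction drawn above between a parsing failure and code execution, here is a small sketch, not ProtectionBot's actual parser, of a scanner that treats page text purely as data. The function name template_names and the regular expression are assumptions made for this example, and the pattern deliberately ignores nested templates. The text is only ever matched against a pattern and is never evaluated, so malformed or hostile input yields an empty result or an exception rather than running anything.

```python
import re

# Matches simple {{Name}} or {{Name|args}} calls; nested templates are
# out of scope for this sketch.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\|[^{}]*)?\}\}")


def template_names(wikitext):
    """Return the names of simple template calls found in `wikitext`."""
    if not isinstance(wikitext, str):
        # Anything that is not text is rejected outright.
        raise ValueError("expected text input")
    return [match.group(1) for match in TEMPLATE_RE.finditer(wikitext)]


print(template_names("{{Foo|bar}} text {{ Baz }}"))
print(template_names("{{Foo|bar"))                        # unterminated call
print(template_names("__import__('os').system('true')"))  # code-like text
```

A scanner written this way can still miss things or stop on input it does not expect, which is the "mostly correct" caveat made above.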
Some concerns
While correcting a misunderstanding I wrote the following to express some of my concerns (despite supporting the RfA). Comments welcomed.
I was reading the ProtectionBot discussion, and I noticed in one of the oppose votes discussions someone said "remember this bot only protects images and templates on the Main Page". This is incorrect. The bot is also intended to protect templates being used on the featured article, the actual featured article page, not the introduction to it that appears on the Main Page. Thus anyone can add a template to the featured article, vandalise the template, and sit back and watch as ProtectionBot protects the vandalised template. The good thing though, is that the featured article is (normally) freely editable, and so anyone can remove the protected vandalised template. This situation is a bit more problematic when the featured article is in a state of protection or semi-protection due to high levels of vandalism (someone always seems to protect the featured article at some point in any given day), and if the protected vandalised template is in widespread use in other articles. However, the discussions at Wikipedia:Main Page featured article protection may change all this. Thus the interaction of all these proposals needs to be carefully considered. Not too much change too fast. Also, no-one seems to have picked up yet on the comment I made here. That can be summed up by: Main_Page/Tomorrow needs to be actively watched every day and a button clicked to show that someone has checked it, otherwise, as I said in that comment I linked to: "...the vandalism (possibly not very visible) remains undetected for a whole day, and then silently switches over on the main page, at which point all hell will break loose." Carcharoth 10:55, 8 January 2007 (UTC)
- Incidentally, Main_Page/Tomorrow is unprotected, for some reason. This should probably change. Carcharoth 11:02, 8 January 2007 (UTC)
- Actually it was semi-protected. Now it's fully protected. Though since nothing that's directly on that page is ever included in the Main Page, I'm not entirely sure what the problem was – Gurch 12:33, 8 January 2007 (UTC)
- The problem was that templates on that page are being protected by ProtectionBot, so this is a way for someone to get something protected by ProtectionBot. That something might be something that we wouldn't want to be protected and then freely added to other pages. Hence all pages scanned by ProtectionBot should be protected. Today's featured article page is a notable exception to this, and one that will need to be watched very closely. Carcharoth 12:46, 8 January 2007 (UTC)
Please don't forget that human oversight is still needed
Just to avoid complacency, and to remind those saying that this bot will "deal with the problems of Main Page vandalism", a reminder that the bot will deal with some methods of vandalism, but human administrators still need to be alert to the following, which, however unlikely, will probably happen at some point in the future. I've given examples below. Carcharoth 12:43, 8 January 2007 (UTC)
Human error
- Administrators unprotecting stuff and forgetting to re-protect (ProtectionBot will not override another administrator). The fix is to reprotect and politely ask the administrator not to make this mistake in the future. Carcharoth 12:43, 8 January 2007 (UTC)
- Query - if something is protected by an administrator, will ProtectionBot still unprotect the page in question once it leaves the sensitive areas? This is not good for high-risk templates that should remain protected even when off the main page. Carcharoth 12:43, 8 January 2007 (UTC)
- No. It remembers what it's protected and only unprotects things it protected itself. This may result in things being protected for longer than they should be, but that's infinitely preferable to things being unprotected when they shouldn't be – Gurch 12:45, 8 January 2007 (UTC)
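The behaviour described in the reply above (the bot remembers what it protected and only unprotects those pages) can be sketched as a simple ledger. This is a hypothetical illustration under stated assumptions, not the bot's code: the class name ProtectionLedger, its method names, and the use of in-memory sets are all invented for the example, and the sketch does not model the delay between passes.

```python
class ProtectionLedger:
    """Tracks which pages the bot itself protected (illustrative only)."""

    def __init__(self):
        self._protected_by_bot = set()

    def protect_pass(self, sensitive_pages, already_protected):
        """Return pages the bot should protect now.

        Pages already protected by a human administrator are left alone
        and are never recorded in the ledger.
        """
        newly = set(sensitive_pages) - set(already_protected) - self._protected_by_bot
        self._protected_by_bot |= newly
        return newly

    def unprotect_pass(self, sensitive_pages):
        """Return pages to unprotect: only ones the bot protected itself
        that are no longer in the sensitive set."""
        to_release = self._protected_by_bot - set(sensitive_pages)
        self._protected_by_bot -= to_release
        return to_release


ledger = ProtectionLedger()
first = ledger.protect_pass(
    {"Template:A", "Template:B"}, already_protected={"Template:B"}
)
released = ledger.unprotect_pass(sensitive_pages=set())
print(first, released)
```

In this sketch Template:B, protected by an administrator beforehand, is never touched in either direction, which matches the trade-off stated above: a page may stay protected longer than needed, but the bot never removes protection it did not apply.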
Bot error
- The bot cannot check whether an image or template is in a vandalised state before it protects it. If a vandal strikes lucky and vandalises a template just before it gets protected (unlikely but possible), then an admin is required to unprotect and undo the vandalism. If no-one is watching closely, then a vandal could do this on the featured article and then remove the newly protected vandalised template or image and add it to lots of pages. The template/image in question will be unprotected by ProtectionBot after two passes, but a lot of damage could be done in this time interval. Carcharoth 12:43, 8 January 2007 (UTC)
- A similar example to the above is when ProtectionBot protects the rotating templates that use date parsing to queue the main page templates for the featured article and the picture of the day and selected anniversaries. This is done in advance by using Main Page/Tomorrow, but unless humans watch this page, vandalism may pass un-noticed here for a day until it flips over onto the Main Page. Carcharoth 12:43, 8 January 2007 (UTC)
- The bot may unprotect a page that should remain protected. The bot is unable to make the necessary judgement, though it could be programmed to look at whether the page is already in a high-risk category. Merely having it return the page to the state it was in before arriving on the main page or featured article is not enough, as some pages are protected by admins beforehand, but should be unprotected once they leave the sensitive area. Carcharoth 12:43, 8 January 2007 (UTC)
Please add any more examples you can think of needing human oversight. Carcharoth 12:43, 8 January 2007 (UTC)