Wikipedia:Bot requests

This is an old revision of this page, as edited by Jmax- (talk | contribs) at 06:56, 12 January 2007 (→hot bot action for test wiki). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


This is a page for requesting work to be done by a bot. It is also an appropriate place to simply put ideas for bots. If you need a piece of software written for a specific article you may get a faster response time at the computer help desk. You might also check Wikipedia:Bots to see if the bot you are looking for already exists. Please add your bot requests to the bottom of this page.

If you are a bot operator and you complete a request, note what you did, and archive it. Requests that are no longer relevant should also be archived in a timely fashion.

This talk page is automatically archived by Werdnabot. Any sections older than 14 days are automatically archived to Wikipedia:Bot requests/Archive 9. Sections without timestamps are not archived.

Archives
August 2004 – September 2005; June 2005 – November 2005; August 2004 – January 2006; February 2006 – April 2006; November 2005 – February 2006; February 2006 – April 2006; May 2006 – July 2006; August 2006 – December 2006; January 2007

Punctuation Bot

Hey how about a bot that will put all the commas, periods (all punctuation except semi-colons, in fact) inside quotation marks; it looks quite unprofessional to see articles written with punctuation outside quotations. - Unsigned comment added by User:165.82.156.110

Not quite sure what you are proposing. Perhaps you could provide a sample of correct and incorrect punctuation within the context of quotations? - PocklingtonDan 21:17, 20 December 2006 (UTC)

I think he's referring to punctuation within quotations, which is not really an English "rule", more of a matter of style. See Wikipedia:Manual of Style#Quotation marks. There's no real way to automate this, and there's no real reason to, in my opinion -- Jmax- 21:22, 20 December 2006 (UTC)

Does he mean just the trailing full stop/period? If so, then it should always go outside the closing quotation mark, but he seems to suggest you should never use punctuation within quotation marks, which I don't understand - quotation marks are used to quote somebody. If the words you are quoting would reasonably be punctuated when written in prose, then that punctuation is included, regardless of whether it is in quotation marks. The following is perfectly valid punctuation, regardless of style:

My friend said to me "The cat sat on the mat, then it bit me. I don't think it likes me".

- PocklingtonDan 21:34, 20 December 2006 (UTC)

And what about a single word in quotation marks, followed by a comma - the comma obviously shouldn't go inside the quotation marks. John Broughton | Talk 01:17, 28 December 2006 (UTC)

Wow. I was taught in school that the trailing punctuation should go inside the quotations. For example, 'That bastard at the movie said "shhh!" Can you believe it?' (even with the assumption that "shhh" wasn't being exclaimed.) Then again, maybe I wasn't paying good enough attention in class. ---J.S 00:12, 29 December 2006 (UTC)

Having the punctuation always inside quotation marks is the American style; British English uses punctuation inside if it belongs to the quotation, and otherwise outside. (http://en.wikipedia.org/British_and_american_english_differences#Punctuation) Therefore I don't think this bot would be a good idea, since it only represents the American usage. - The preceding unsigned comment was added by CJHung (talk • contribs) 03:16, 1 January 2007 (UTC).

Anyone willing to analyse some bot pseudocode?

I'm building a research (i.e., no edit) bot in C++... since I'm not really that experienced in programming I was wondering if someone would be willing to check my pseudocode?

The basic concept behind the bot is to identify when a particular string of text was added to an article using a binary search method. In theory it could search through the history of a page with 10,000 edits with fewer than 15 page-requests.

A research program like this will be a helpful tool in tracking down subtle vandals and spammers. So.. I've kinda drifted. Anyone more experienced with OOP languages want to audit my pseudocode? ---J.S 23:59, 28 December 2006 (UTC)

Here's the link... User:J.smith/pseudocode. ---J.S 00:06, 29 December 2006 (UTC)

I've written a Perl interpretation of your pseudocode, but am having trouble understanding precisely the context of that block. How will it be used? Is that the 'main' function? Where is the return value used? -- Jmax- 08:41, 29 December 2006 (UTC)

I'm not certain, but I believe the idea is the user provides the wikipedia page and a string that's in the current version of the article. The function returns the diff of when that string was added. So, I would say that no this wouldn't be main, this would probably be 'search'. Vicarious 09:26, 29 December 2006 (UTC)

Does it recurse? Where should it recurse, if it does? -- Jmax- 09:31, 29 December 2006 (UTC)

No I don't think so, Main would take the user's input, run the search function then either link to or redirect the user to diff page. Vicarious 09:34, 29 December 2006 (UTC)

Here is a Perl implementation, less the essential bits (which could easily be added). I'm not entirely sure if the algorithm will even work properly, actually. Something seems off about it. -- Jmax- 10:09, 29 December 2006 (UTC)

Well, it does have limitations. If the text was added in and taken out multiple times it won't necessarily find the -first- time the string was added, but it will find one of the times the string was added. There are a number of elements I haven't designed yet so the code is incomplete. ---J.S 17:38, 29 December 2006 (UTC)

The basic idea here is that the user would input the name of the article to search and the string of text they were looking for, and then the program would output a link to the first version of the page with that particular string. ---J.S 17:43, 29 December 2006 (UTC)

As was hinted at above, a binary search skips over many alterations. A binary search will find one alteration where the string appeared, but it might not be the first time the string appeared. The bot might look at versions 128, 64, 32, 48, 56, 60, 58, 59 and identify version 59 as having the string while 58 does not. But the string might have been inserted in version 34 and deleted in version 35, as well as several other times. (SEWilco 05:45, 30 December 2006 (UTC))
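The probe sequence described above can be sketched in Python. This is a minimal illustration only, not J.S's pseudocode or Jmax-'s Perl version: `find_insertion` and the `contains` callback are hypothetical names, revisions are assumed to be numbered from 0 (oldest), and each call to `contains` stands in for one page request.

```python
from typing import Callable, Optional


def find_insertion(num_revisions: int,
                   contains: Callable[[int], bool]) -> Optional[int]:
    """Return a revision that has the text while the revision before it does not.

    `contains(i)` is a hypothetical callback standing in for "fetch
    revision i and check whether the search string is in it".
    """
    if num_revisions == 0 or not contains(num_revisions - 1):
        return None  # the text is not in the current version
    # Invariant: lo lacks the text (or is before revision 0); hi has it.
    lo, hi = -1, num_revisions - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if contains(mid):
            hi = mid
        else:
            lo = mid
    return hi


# SEWilco's scenario: inserted in 34, deleted in 35, added again in 59.
history = lambda i: i == 34 or i >= 59
print(find_insertion(129, history))
```

With this history the sketch reports revision 59 and never requests revision 34, which is the limitation being described. For 10,000 revisions it makes at most 15 calls to `contains`: one for the current version and up to 14 halving steps.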

Yes, but even that can be useful information when tracking stuff down...

It occurs to me, this might be useful for tracking down an unsigned post on a talk-page when the date is completely unknown. Hmmm... ---J.S 05:48, 30 December 2006 (UTC)

It at least can help in many situations. You wanted comments on the method, and now you know some of the limitations. If you really want to find the first insertion of a string you could examine the article-with-history format which is used in data dumps. (SEWilco 16:00, 30 December 2006 (UTC))

That could be done, but a db dump is quite huge :( Maybe I should chat with the toolserver people on that when they get replication up and running? ---J.S 09:35, 31 December 2006 (UTC)

Is the full-with-history available through Export? (SEWilco 15:03, 31 December 2006 (UTC))

Help:Export says the full history for a page is available, but at bottom of page is a note that it has been disabled for performance reasons. If the history was available you'd have a single file where you'd just have to recognize the version header (and a few others such as Talk page) and by remembering the earliest version with the desired text be able to find the version in a single read of one file. At present that's only relevant if you search a mirror with export history enabled. (SEWilco 06:39, 3 January 2007 (UTC))

Although I don't think this is as big an issue as you guys do, I have a relatively elegant solution to the problem of finding the first insertion. Run the exact same search again on only the preceding versions. Have it include a case where, if it never finds the string, it'll let the first search know it found the right one. This method won't work if the string has been absent from most of the versions, but by far the most common reason the original search won't work is that it'll find page blankings and attribute the sentence to the person who reverts them; this solution solves that problem. Vicarious 01:08, 1 January 2007 (UTC)
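One way to read this suggestion, sketched in Python under stated assumptions: after each hit, the same binary search is run again over only the older revisions, and a pass that never sees the string confirms the previous hit. The names `search_before`, `find_earlier_insertion` and the `contains` callback (one page request per call, revisions numbered from 0) are hypothetical and not taken from anyone's actual code.

```python
from typing import Callable, Optional


def search_before(upper: int, contains: Callable[[int], bool]) -> Optional[int]:
    """Binary-search revisions 0..upper-1 for an insertion point.

    Returns the boundary revision, or None if no probe ever saw the text.
    """
    lo, hi, found = -1, upper, False
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if contains(mid):
            hi, found = mid, True
        else:
            lo = mid
    return hi if found else None


def find_earlier_insertion(num_revisions: int,
                           contains: Callable[[int], bool]) -> Optional[int]:
    """Repeat the search on preceding revisions until a pass finds nothing."""
    if num_revisions == 0 or not contains(num_revisions - 1):
        return None  # the text is not in the current version
    best = num_revisions - 1
    while True:
        earlier = search_before(best, contains)
        if earlier is None:
            return best  # no earlier probe saw the text: accept this hit
        best = earlier   # strictly older each pass, so the loop terminates


# Text added in revision 10, page blanked in 31, blanking reverted in 32.
blanked = lambda i: i >= 10 and i != 31
print(find_earlier_insertion(129, blanked))
```

In this page-blanking example the first pass lands on revision 32 (the revert) and the second pass moves the answer back to revision 10. As noted in the comment above, it can still miss a short-lived earlier insertion that none of the probes happen to hit.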

That's a brilliant solution! I'll certainly include a function for this. ---J.S 19:06, 2 January 2007 (UTC)

Ad-stopping bot

In theory, the bot will look through new articles to try and find key phrases like "our products" and "we are a". It then places a template on the page like this:

AdBot suspects this page of being blatant advertising, otherwise known as spam.

Please check this page conforms to the neutral point-of-view policy before nominating for speedy deletion, deleting or removing this template.

And places it in a relevant category. A human (or other intelligent individual) would then look through the list and nominate any articles that are blatant ads for WP:SPEEDY.

What do you think? --///Jrothwell /// 13:15, 29 December 2006 (UTC)

Sure, why not? But "suspects that this page contains" rather than "suspects this page of being" might be a little more neutral, as well as more grammatically correct. And it would be an ad-flagging bot, not an ad-stopping bot. (I'm quibbling, I know.) John Broughton | Talk 15:13, 29 December 2006 (UTC)

Sounds good. There might be some changes in implementation (EG. That flag might cause concern), but I've found phrases like those to be dead giveaways to both commercial intent and notorious copyright infringements.

It's a good idea. Other phrases you could search for include "our company", "visit our website/site/home page", and "we provide". You might also want to add "fixing the article" to the list of suggested options. Proto::► 01:51, 31 December 2006 (UTC)
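The phrase check being discussed could look something like the following Python sketch. The phrase list comes from this thread; the names `PATTERNS` and `suspicious_phrases` are hypothetical, and how the bot would obtain the text of new articles is left out entirely.

```python
import re
from typing import List

# Phrases suggested in this thread; each is matched on word boundaries,
# ignoring case.
PATTERNS = [
    ("our products", re.compile(r"\bour products\b", re.IGNORECASE)),
    ("we are a", re.compile(r"\bwe are a\b", re.IGNORECASE)),
    ("our company", re.compile(r"\bour company\b", re.IGNORECASE)),
    ("visit our website/site/home page",
     re.compile(r"\bvisit our (?:website|site|home page)\b", re.IGNORECASE)),
    ("we provide", re.compile(r"\bwe provide\b", re.IGNORECASE)),
]


def suspicious_phrases(text: str) -> List[str]:
    """Return the labels of every listed phrase found in the article text."""
    return [label for label, pattern in PATTERNS if pattern.search(text)]


sample = "We are a family firm. Visit our home page to see our products."
print(suspicious_phrases(sample))
```

A bot built on this would only flag a page when the list is non-empty; whether one match is enough, or several should be required, is a design choice the thread has not settled.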

I've altered the template slightly to fit in with everyone's suggestions. Here's the revised template:

AdBot suspects this article contains blatant advertising, otherwise known as spam.

If the subject of the article complies with the Wikipedia notability guidelines, please fix the article if it doesn't conform to the neutral point-of-view policy. If the subject is not notable, please nominate the article for speedy deletion.

I'm also making a template for user pages of people whose pages have been flagged. Any other thoughts? --///Jrothwell /// 16:06, 31 December 2006 (UTC)

The user-talk template is at User:Jrothwell/Templates/Adbot-note. Is there anyone who'd be willing to code the bot? --///Jrothwell /// 17:13, 31 December 2006 (UTC)

Sounds like a great idea for a bot. If you haven't found anyone yet, I'd be willing to code it. Best, Hagerman 19:10, 31 December 2006 (UTC)

Shouldn't it be "If the article does not assert the notability of the subject, please nominate the article for speedy deletion." OR "If the subject is not notable, please nominate the article for deletion."? Please see WP:CSD#A7. --WikiSlasher 04:23, 1 January 2007 (UTC)

I don't know if this is a good idea. Googling "we are a" gives mostly legitimate pages where the phrase appears in a quotation. I think at the least there should be a human signing off on each flagging. --24.193.107.205 06:10, 2 January 2007 (UTC)

(undent) The issue of false positives is important. Certainly if a large majority of flaggings related to a particular phrase are in fact in error, that phrase shouldn't be used by the bot. But keep in mind that this flagging will only be used for new articles, which are much more likely to be spam than existing ones, so drawing conclusions from your search of existing articles isn't necessarily a good idea.

In any case, the bot should be tested by seeing what happens using a given phrase for (say) the first ten articles it finds. For example, our products looks like a good phrase to use. A google search on that found Enterprise Engineering Center (user who created article has done nothing else), plus several others (in the top 10 results) that were tagged as appearing to be advertisements.

Finally, the bot is only doing flagging. A human has to actually nominate an article for deletion (and it's easy to remove a template). But your comment does raise a point about there being a link to click on to complain about the bot. John Broughton | Talk 15:40, 2 January 2007 (UTC)

It strikes me that the Bayesian approach commonly used to detect e-mail spam could work here as well. All we'd need (besides a simple matter of programming) is a way to train the bot. I suppose, if the bot is watching the RC feed anyway, that deletion of a tagged article could be seen as a confirmation that it was spam, while removal of the tag could be taken as a sign that it was not (until and unless the article is deleted after all). But there would still need to be a manual training interface, if only for the initial training before the bot is started. —Ilmari Karonen (talk) 03:23, 3 January 2007 (UTC)
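For readers unfamiliar with the approach, here is a minimal naive Bayes sketch in Python with add-one smoothing. It is only an illustration of the general technique: the class name `SpamFilter`, the two labels, and the tiny training sentences are all made up for the example, and a real bot would need far more training data plus the manual training interface mentioned above.

```python
import math
import re
from collections import Counter
from typing import List

LABELS = ("spam", "ok")


class SpamFilter:
    def __init__(self) -> None:
        self.words = {label: Counter() for label in LABELS}
        self.docs = {label: 0 for label in LABELS}

    @staticmethod
    def tokens(text: str) -> List[str]:
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text: str, label: str) -> None:
        """Record one manually classified example ('spam' or 'ok')."""
        self.words[label].update(self.tokens(text))
        self.docs[label] += 1

    def spam_log_odds(self, text: str) -> float:
        """Positive means the text looks more like the spam examples."""
        # Vocabulary size plus one slot for unseen words (add-one smoothing).
        vocab = len(set(self.words["spam"]) | set(self.words["ok"])) + 1
        totals = {label: sum(self.words[label].values()) + vocab
                  for label in LABELS}
        score = math.log((self.docs["spam"] + 1) / (self.docs["ok"] + 1))
        for word in self.tokens(text):
            p_spam = (self.words["spam"][word] + 1) / totals["spam"]
            p_ok = (self.words["ok"][word] + 1) / totals["ok"]
            score += math.log(p_spam / p_ok)
        return score


bot = SpamFilter()
bot.train("visit our website to buy our products", "spam")
bot.train("we provide the best products for our company", "spam")
bot.train("the town is a commune in northern france", "ok")
bot.train("the river flows through the town", "ok")
print(bot.spam_log_odds("we provide our products") > 0)
```

Feeding deletions and tag removals back in would amount to calling `train` automatically, which is where the quality of the training signal starts to matter.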

I like the idea of a Bayesian approach because of the simplicity. However, the bot training would always have to be manual in my opinion. Having the bot treat the deletion of a tagged article as spam will likely result in it learning some behaviors outside of its design scope. For instance, if it tags an article with patent nonsense that happens to trip the filter and that article is removed while the template is still intact, it will start gobbling up patent nonsense like there is no tomorrow. While that's not a bad thing, the template we'd be leaving on the page wouldn't accurately describe what's wrong with the page.

So... either a manual interface would be necessary to make sure that the bot stays on target or we'd need to change the scope of the bot to encompass every kind of problem there is with a new page (spam, attack pages, patent nonsense, etc.) I think either approach would be good, but would anyone care to offer their feedback? Best, Hagerman 03:31, 7 January 2007 (UTC)

I suggest this template:

AdBot suspects this article (or parts of this article) are blatant advertising, otherwise known as spam.

Cocoaguy_contribs 03:42, 3 January 2007 (UTC)

forced autoarchive

I know there's already autoarchive bots running such as the one archiving this page, but I think a bot that operates a little differently could be effectively used to archive all article talkpages. First off, it would only archive talk pages that are very long, so 3 year old comments on a tiny talk page would be left untouched. When the bot runs across a very long talk page it will archive similarly to current bots, but with a high threshold, for example all sections older than 28 days (rather than the typical 7 days). Also, unlike current bots I'd suggest we make this opt out rather than opt in, although very busy talk pages or talk pages that are manually archived wouldn't be touched anyway because they'd either be short enough or would have no inactive sections. Vicarious 03:56, 1 January 2007 (UTC)
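The selection rule proposed here can be written down in a few lines of Python. This is a sketch under assumptions: the function name, the `(heading, last_comment_time, size_bytes)` section tuples, and the 50,000-byte length threshold are hypothetical, and only the 28-day age comes from the proposal.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Section = Tuple[str, datetime, int]  # (heading, last comment time, size in bytes)


def sections_to_archive(sections: List[Section], now: datetime,
                        min_page_bytes: int = 50_000,
                        max_age_days: int = 28) -> List[str]:
    """Pick stale sections, but only when the page as a whole is very long."""
    if sum(size for _, _, size in sections) < min_page_bytes:
        return []  # short pages are left untouched, however old the comments
    cutoff = now - timedelta(days=max_age_days)
    return [heading for heading, last, _ in sections if last < cutoff]


page = [
    ("Old dispute", datetime(2006, 11, 1), 40_000),
    ("Current discussion", datetime(2007, 1, 10), 20_000),
]
print(sections_to_archive(page, now=datetime(2007, 1, 12)))
```

An opt-out mechanism would be an extra check before this function is called, for example skipping pages that carry a particular template.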

If there is interest in this and someone can code it up and get it approved, I'll volunteer to host it and run it under the EssjayBot banner. Essjay (Talk) 03:58, 1 January 2007 (UTC)

Werdnabot is customizable as to how many days of no replies in a section before it archives - If the only feature you want is to only archive after a certain page length is reached, wouldn't it just be easier to put in a feature request to Werdna, rather than re-inventing the wheel? ShakingSpirit 04:04, 1 January 2007 (UTC)

Unless I've missed something, Werdnabot is like the EssjayBots, it's an opt-in. Archiving all article talk pages on Wikipedia would require a bit more than just a new feature; it's going to have several hundred thousand talk pages to parse, it's going to need a lot more efficient code than the current opt-in code. Essjay (Talk) 04:08, 1 January 2007 (UTC)

My apologies, I missed that part. Though that does bring up a new point - wouldn't this cause very unnecessary stress on the servers? Crawling through every single article talk page on Wikipedia must be very bandwidth-intensive, even using Special:Export or an API, or some such. It would also create a huge number of new pages - again, putting strain on the database server. Does the small convenience of having a shorter talk page to look through justify this? Maybe I'm playing devil's advocate, but I'm sure this has been debated before and wasn't found to be such a good idea ^_^ ShakingSpirit

Well, it'll be a strain on the server that hosts it, parsing all those pages. However, if it's done right, it will only archive pages of a certain length, which should avoid most of the one or two line talk pages that are out there, thus reducing any server load. At this point, with 6,930,307 articles, 62,139,876 pages, and tens of thousands of edits a minute, one little bot archiving pages (and set on a delay, to avoid any problems) is hardly likely to bring the site down. As long as it is given a reasonable delay time on its editing, it should be fine. The real problem will be getting the community signed on to the idea. Essjay (Talk) 04:28, 1 January 2007 (UTC)

As for bandwidth, I don't think it would be an issue. First off it could run once on a database dump to get the ball rolling, then it could patrol recent changes looking only at "talk:" changes. If it still seems like it could hog bandwidth I can think of many more ways to cut down the number of pages it checks. First off, ignore any pages that just had characters removed instead of added. Secondly, only check every third page (or so); this operates under the premise that big talk pages get big because they're edited often, so a page will pop up again soon if it's going to need archiving. Thirdly, the bot could store a local hash table of page lengths, so rather than loading the page each time it would add (or subtract) the number of characters listed on Special:Recentchanges and would only load the page if it needs archiving. This wouldn't be as hard on a bot as it sounds; the storage space would only be a few megs because all it needs is the page's hash and size. Also the computation would be easy, because it would hash, not search for the page, so the lookup time is O(1) and the calculations are all real simple. Vicarious 04:34, 1 January 2007 (UTC)
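The local table of page lengths could be as simple as a dictionary keyed by page title. The Python sketch below is illustrative only: `TalkPageSizes`, `seed`, `apply_change` and the threshold value are hypothetical names and numbers, and reading the size changes from Special:Recentchanges is assumed to happen elsewhere.

```python
from typing import Dict


class TalkPageSizes:
    """Track talk page lengths from recent-changes deltas (O(1) per edit)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold       # hypothetical "very long page" size
        self.sizes: Dict[str, int] = {}  # title -> last known length

    def seed(self, title: str, size: int) -> None:
        """Record a length taken from the initial database dump."""
        self.sizes[title] = size

    def apply_change(self, title: str, delta: int) -> bool:
        """Update the stored length; return True if the page should be loaded.

        Pages not seen before are assumed to start at length 0. Edits that
        only remove characters never trigger a load.
        """
        new_size = max(0, self.sizes.get(title, 0) + delta)
        self.sizes[title] = new_size
        return delta > 0 and new_size > self.threshold


table = TalkPageSizes(threshold=100)
table.seed("Talk:Example", 90)
print(table.apply_change("Talk:Example", -5))   # removal only: ignored
print(table.apply_change("Talk:Example", 20))   # grows past the threshold
```

Only when `apply_change` returns True would the bot actually request the page and look for stale sections.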

Ok, the archive bots are great and seem to work really well... but forced archiving of one particular style on a project-wide scale? I'd so rather we keep the opt-in system and some active "recruiting" of large talk pages. ---J.S 19:04, 2 January 2007 (UTC)

congratulations bot

I suspect this idea isn't even remotely feasible, but I thought I'd suggest it in case I was wrong. A bot that posts a note on a user's talk page when they reach a milestone edit count (1k, 5k, whatever). It'd say congrats and maybe have a time and link for the thousandth edit. Vicarious 05:26, 1 January 2007 (UTC)

Probably not realistically possible. It would be too much of a strain looking through all users' contributions and counting them. We currently have 3,140,639 registered users, and loading Special:Contributions 3,140,639 times would just be a killer, and continuing to do that over time would just be even worse. —Mets501 (talk) 06:37, 1 January 2007 (UTC)

Users could opt-in by posting their count somewhere, or on irc, and the bot could watch the irc RC channel and just count edits. While it would work, is there enough of a point? ST47Talk 19:48, 2 January 2007 (UTC)

It sounds a bit counterproductive actually. Except for making the distinction between a new editor and a regular editor there isn't much value in an edit count... and focusing on edit count has a negative impact. ---J.S 00:04, 3 January 2007 (UTC)

I think you've missed the point a little. This isn't about telling editors their worth, it's a tiny pat on the back. I enjoy seeing the odometer on my car roll over to an even 10,000 even though it has no significance; this was supposed to be similarly cute and lighthearted. Accordingly because it would be difficult it's not worth it. Vicarious 05:33, 3 January 2007 (UTC)

I think that by having a bot to do this, we'd be giving legitimacy to making edit count matter, which many people feel it does not. ^demon 01:24, 5 January 2007 (UTC)

MessageBot

I suspect this idea has been thought of before but I don't see its fruit, so here goes. When I first discovered the talk page here I couldn't for the life of me understand why Wikipedia couldn't have a normal message box interface, even if it need be public. This simply means showing the thread of an exchange on different talk pages. It would save us having to keep an eye out for a reply on the page where we left a message the day before, etc. A bot can simply thread a talk exchange; of course this would require tagging our talk as we reply to a message. This is more a navigational issue, but since it's not been integrated into the main wiki OS it seems to be left for a bot. I don't know how it would run though. Suggestions? frummer 17:57, 1 January 2007 (UTC)

Why not a TalkBot? User A posts a message on User B's talk page. User B responds on his/her own talk page, and this posting invokes (somehow) the TalkBot (maybe a template in the section, like {{TalkBot}}?). The TalkBot determines that the original posting from A isn't on A's talk page, and so (a) copies the section heading on B's talk page to A's talk page as a new section; (b) adds You wrote: and the text of A's posting, to that page, and (c) copies B's response on B's page to A's talk page (all with proper indentation). John Broughton

It gets trickier if A responds on A's talk page and isn't a subscriber to the TalkBot service, but perhaps the bot could insert a hidden comment in the heading of the new section on A's talk page, such as <!-- Section serviced by TalkBot -->, and then watch for that text string in the data stream?

Thats a good clarification. frummer 14:15, 2 January 2007 (UTC)

Here is an idea... Why not have a bot that can automatically move the conversation to a (new) sub-page and then include the conversation into the talk page? Anyone else who wants the conversation on their talk page can include the conversation as well. A new template can be made to "trigger" the bot. Hmmm ---J.S 19:21, 2 January 2007 (UTC)

The Oregon Trail (computer game) anti-vandal bot

I was wondering if it would be possible to create a bot that would serve solely to revert the addition of a link to the Oregon Trail article. Once every other week or so a user adds a link for a free game download that we delete from the article. The bot would just have to monitor the External links section, removing the link: http://www.spesw.com/pc/educational_games/games_n_r/oregon_trail_deluxe.html Oregon Trail Deluxe download whenever it appears. Thanks, please let me know on my talk page if this is a possibility that anyone could take up. Thanks again, b_cubed 17:00, 2 January 2007 (UTC)

You might want to see WP:SPAM. Blacklisting the link might be an option...

If it's not, you might want to contact the user who runs User:AntiVandalBot to have that added to the list of things it watches for. ---J.S 19:09, 2 January 2007 (UTC)

I've added the link to Shadowbot's spam blacklist. Thanks for the link! Shadow1 (talk) 21:48, 2 January 2007 (UTC)

No, thank you :) b_cubed 21:58, 2 January 2007 (UTC)

DarknessBot

May someone please operate this for me? It's already been userpaged, accounted, and flagged. D•a•r•k•n•e•s•s•L•o•r•d•i•a•n•••CCD••• 22:12, 2 January 2007 (UTC)

Operate it for you? As in execute? -- Jmax- 14:14, 3 January 2007 (UTC)

Yes, you can change the name even, but give me a little credit for creating it before my bot malfunctioned. :( D•a•r•k•n•e•s•s•L•o•r•d•i•a•n•••CCD••• 00:44, 4 January 2007 (UTC)

Why can't you operate it? -- Jmax- 02:46, 4 January 2007 (UTC)

Children Page Protection Bot

I would like to suggest the creation of a bot to defend articles for children's shows. For some reason these pages appeal to vandals, and I think something needs to help protect them. I'll use an example: before the Dora the Explorer page was put back under protection it was vandalized a lot. The one time that sticks in my mind the most was by a user named Oddanimals, who stated Dora was 47 and had a sex change, along with a few other sex-related comments, and replaced the word Banana in Boots' article with the S curse word. This is not proper, to say the least, and one of the users I talked to said that the Backyardigans article is also vandalized a lot. Parents, kids, and people like me who just enjoy those shows look them up, and this kind of thing should NOT be allowed. Thank you. Superx 23:18, 2 January 2007 (UTC)

Bots are already watching those pages... but bots are dumb and can't catch all types of vandalism. ---J.S 00:00, 3 January 2007 (UTC)

True, but those bots are checking other pages as well. That vandalism stuck out like a sore thumb and none of those bots caught it, except for one after I fixed it myself. I think that just one bot whose job it is to check those pages would be better than several others that are checking a bunch of other pages as well. Superx 01:10, 3 January 2007 (UTC)

Wikipedia is not censored. alphachimp. 01:15, 3 January 2007 (UTC)

Yes, but that would only apply here if the stuff I mentioned ACTUALLY HAD SOMETHING TO DO WITH THE SHOW! Curse words and other such stuff are only allowed if they are relevant to the article, and none of that is, so the point you mentioned doesn't apply in this situation. Superx 12:00, 5 January 2007 (UTC)

Finishing a template migration

Need to migrate all the existing transclusions of {{CopyrightedFreeUse}} to {{PD-release}} per discussion here. BetacommandBot started on this a few weeks ago and then mysteriously quit about 7/8ths of the way through, and I haven't been able to get a response from Betacommand since then. Could someone else finish this so that we can finally delete that template? Thanks. Kaldari 01:27, 3 January 2007 (UTC)

Alphachimpbot is on it. alphachimp. 01:35, 3 January 2007 (UTC)

All done. alphachimp. 08:25, 3 January 2007 (UTC)

American television series by decade cleanup

Cleaning up from this category move: Wikipedia:Categories for deletion/Log/2006 December 19#American television series by decade, where the meaning of the category was changed. There should be no overlap with Category:Anime by date of first release, because by the English definition no US-originated series that we know of is anime.

I'd like a bot to re-categorize with the following rule: If article in Category:Anime series and in Category:XXXXs American television series then remove from Category:XXXXs American television series and add to Category:Anime of the XXXXs instead. (The latter category includes both films and series.) --GunnarRene 05:37, 3 January 2007 (UTC)

If I wanted to run a bot of my own to do this, which one would be appropriate?--GunnarRene 21:01, 9 January 2007 (UTC)

I'd use WP:AWB. You'd need to do a WP:RFBA and create a bot account, and come up with a script to do it. I'd recommend the find and replace be set to:

Category:(....)s American television series

Category:Anime of the $1s

To generate a list, just use Category:Anime series, don't worry about the other, and tell it to skip if no replacement made. I'd do it, but I don't think I'm approved for that outside of WP:CFD, but the procedure is pretty easy. ST47Talk 21:56, 9 January 2007 (UTC)
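For anyone scripting this outside AWB, the same find-and-replace can be tried out in Python. This is only a sketch: the function name `recategorize` is hypothetical, the pattern is copied as written above, and Python spells the back-reference `\1` where the replacement above uses `$1`. Returning `None` mirrors the "skip if no replacement made" setting.

```python
import re
from typing import Optional

FIND = re.compile(r"Category:(....)s American television series")
REPLACE = r"Category:Anime of the \1s"


def recategorize(wikitext: str) -> Optional[str]:
    """Return the rewritten page text, or None when nothing matched (skip)."""
    new_text, count = FIND.subn(REPLACE, wikitext)
    return new_text if count else None


page = "[[Category:Anime series]]\n[[Category:1990s American television series]]"
print(recategorize(page))
```

The sketch does not check whether the page is already in the target category, so a real run would want to guard against adding a duplicate.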

Popes interwiki

Please add ro interwiki to all popes pages. Just created, Romihaitza 12:31, 3 January 2007 (UTC)

WikiProject France Bot

We need to add the {{WikiProject France}} template to all the articles belonging to Category:France and its subcategories. It would be nice if someone could do it for us or tell me how to do it. STTW (talk) 09:45, 4 January 2007 (UTC)

I can do this, please put a list here of the categories and indicate whether subcategories should be included. ST47Talk 11:15, 4 January 2007 (UTC)

Category:France and all its subcategories, thanks in advance. STTW (talk) 15:21, 4 January 2007 (UTC)

4 levels deep, categories with France or French in the name only, 23313 hits, converted to talk, prepending template, skipping if it contains {{WikiProject France ST47Talk 20:22, 5 January 2007 (UTC)

hot bot action for test wiki

Can someone please go over to http://test.wikipedia.org and with a bot populate Category:Really big category with anything, it doesn't matter what. Just dump every page and every image into the category please to test how the category system works when it is pushed to its limit. Testing man 22:53, 4 January 2007 (UTC)

I started, but then noticed that even if I categorized every single page on the wiki, that only comes to around 600, and we already have categories far larger than that. I can't see that you'd get a very useful stress-test when the wiki is so small ^_^ ShakingSpirit 06:30, 5 January 2007 (UTC)

Then let the bot create ten or twenty thousand new pages containing random junk and add them. Neon Merlin 20:12, 11 January 2007 (UTC)

Why is this necessary? -- Jmax- 06:56, 12 January 2007 (UTC)

Page-protecting sysop-bot

People usually do a good job of protecting the templates on the Main Page, but there have been some that slip through the cracks and the results can be disastrous. I propose a bot that would be given sysop status. I know this is controversial, and there was a big discussion about a similar request at the AFD page a while back. As such, anyone allowed to know the password must have already been approved for adminship through conventional means, and it should be open-source. It will protect the next day's templates in advance of them being on the Main Page (say, 24 hours) and then unprotect them afterwards. Preferably, it would make sure the pages stay protected until off the Main Page, and even be able to work with the pictures for POTD, but they'd have to be specified in advance, whereas the templates would run on the {{CURRENTDAY}} magic word system. This would be a big help in reducing the possibility of Main Page vandalism (believe me, it happens).--Here T oHelp 03:52, 5 January 2007 (UTC)

See Wikipedia:Bots/Requests for approval/ProtectionBot. Betacommand 05:54, 5 January 2007 (UTC)

Oh. I feel stupid now.--Here T oHelp 03:30, 6 January 2007 (UTC)

deletion bot

I have the feeling I'm gonna get yelled at for this one, but how about a bot that deletes articles that have a clear consensus on Wikipedia:Articles for deletion? For example, it's quite obvious that Wikipedia:Articles for deletion/Myspacephobia is going to get deleted, but it's currently waiting for an admin to do the work. Yes I know this would mean an admin bot, but that's not without precedent. Also, this bot would ONLY work on articles with a very obvious consensus. As for vandals abusing the bot, I don't think it would be an issue. First off it'd ignore IPs, secondly it'd have a minimum amount of time for voting, and there are too many legitimate voters to contest a bad faith deletion for the bot to touch it. Btw, this bot would also close candidates that are clearly keep as well. Vicarious 07:39, 5 January 2007 (UTC)

Absolutely not. AFD is not a vote, it's a discussion to achieve consensus. A bot will never be in a position to properly determine whether or not consensus is achieved. I'd strongly oppose both the creation and sysop status of such an account. (Coincidentally, from a purely technical angle, such a bot would probably not be difficult to create...) alphachimp 07:45, 5 January 2007 (UTC)

Although I understand your position, I'm not sure I agree with your argument. I agree that a computer couldn't tell who was winning a debate, but it could if both people were arguing the same side. Similarly, this bot couldn't decide what consensus was concluded in an opposed discussion, but I don't see why it couldn't take advantage of the fact that everyone is on the same side of the discussion and that the consensus has already been reached. Vicarious 07:58, 5 January 2007 (UTC)

So you're proposing that we break deletion debates down into purely mechanical decisions? There's a clear difference between achieving consensus and simply "counting the votes". Administrators use discretion to evaluate the weight and strength of the arguments presented, making a decision based not only on those facts that they have surmised, but also on the strength of those arguments. It's quite possible to achieve "no consensus" even with an overwhelming "vote" for deletion. alphachimp 08:08, 5 January 2007 (UTC)

But is it possible to achieve no consensus with 10 votes to delete and 0 to keep? Vicarious 08:14, 5 January 2007 (UTC)

Absolutely, because AFD is not a vote. It's possible that the arguments could be entirely baseless, and all of the "votes" placed afterwards could be founded on those arguments. alphachimp 08:16, 5 January 2007 (UTC)

I understand that it's a discussion not a vote, but I would be astonished if that scenario had happened ever, let alone with any frequency. I confess I don't spend a lot of time on WP:AFD but I've spent a little and I find your argument specious. In fact, I think if that were to happen then even the admin that came along to close the debate would likely miss the same fallacy that the other 10 editors had. Vicarious 08:26, 5 January 2007 (UTC)

I'd certainly hope not, but that's a possibility. It's still a lot more comforting to leave such important decisions up to human judgment. alphachimp 08:34, 5 January 2007 (UTC)

What would be nice is an auto AfD relisting bot that relists articles w/ less than say 5 comments on it -- 64.180.84.87 09:41, 5 January 2007 (UTC)

musical artist template

{{Infobox musical artist 2
->
{{Infobox musical artist

86.201.106.176 13:23, 5 January 2007 (UTC)

Why? ST47Talk 19:23, 5 January 2007 (UTC)

Images on commons with different name

I do not have any programming skills for writing bots, but I can run one if someone writes the code. The task is to replace image links in articles with the name of the identical image on Commons when it exists there under a different name. I suppose this type of bot could be useful on Wikipedias other than the English one as well in some cases. Many examples of such images can be found in this category. Shyam 19:52, 5 January 2007 (UTC)

Tagging closed FAC nominations

I proposed this to Raul654 on his talk page, but he'd rather not add it to his workload, though he supported using a bot instead.

I'd like a bot to watch the Featured log (for successful noms) and Featured archive (for failed noms) and automatically tag each one with a line that indicates when they were closed (i.e. added to the archive) and the result. That way, it'll be possible to determine from the page itself what happened.

I'm thinking it should add

Promoted ~~~~~

Not Promoted ~~~~~

at the bottom of each, in line with WP:FPC. Night Gyr (talk/Oy) 20:59, 5 January 2007 (UTC)

Nice to see someone working on this idea. Why not have it archive the page like in the XFDs? That way, just looking at it tells you if it is done or not, and the summary is at the top. The Placebo Effect 21:01, 5 January 2007 (UTC)

I figured we should have a more consistent FxC style independent of xFD style. Those big boxes and colored backgrounds make sense if the pages are still going to be transcluded along side live debates, like xFD, but it's a lot of excess formatting to add when people are less likely to be confused. Night Gyr (talk/Oy) 21:07, 5 January 2007 (UTC)

Personally, I think we should add a template at the top that says what day the article passed or failed and mention that the debate is closed. The Placebo Effect 21:10, 5 January 2007 (UTC)

Yeah, top or bottom isn't really a big issue for me, and top (immediately below the section head) is probably better for quick reference. FPC uses {{FPCresult}}, so it needn't be a complicated template. Night Gyr (talk/Oy) 21:15, 5 January 2007 (UTC)

I've thought about this before. A bot could run about once a day and do a number of tasks:

Check the promotion and non-promotion logs for any updates from Raul654
Add a note to the candidate sub page indicating promotion or not
Remove the page from WP:FAC if it is still there (this might simplify one of Raul654's mundane tasks)
Update the {{fac}} on the article talk page for non-promotions
- This could even be done in a way that would make future fac submissions easy, eliminating quite a bit of needless work the other FA clerks currently handle
Possibly verify/update wikiproject assessments on the article talk page for promoted FAs

Has anyone else set up an account to develop a bot along this line yet? Gimmetrow 04:10, 6 January 2007 (UTC)

Would like to have a similar bot do the same (in reverse) for Featured article review; rather than Promoted or Not Promoted, the bot would return Kept or Removed Featured status, based on the Featured article review archive. SandyGeorgia (Talk) 05:53, 7 January 2007 (UTC)

I will volunteer to write a bot that performs these tasks, presuming no one else would like to or has already started. -- Jmax- 09:19, 7 January 2007 (UTC)

I've already started. See Misplaced Pages:Bots/Requests_for_approval#GimmeBot. Gimmetrow 11:06, 7 January 2007 (UTC)

Let us know if either will require separation of the FAR archives similar to the FAC archives, or if they can be handled as is. SandyGeorgia (Talk) 13:31, 7 January 2007 (UTC)

Misplaced Pages:Translation

Hello, we are finishing putting in place a new translation project.

There are two things I would greatly appreciate being done by a bot.

First, we had to make a small modification to the format of the translation pages which are used for every translation request. The task is: for every page in Category:Translation sub-pages version 1, this kind of change needs to be done.

Second, there are a lot of categories to initialize with very simple wikicode: 7 for each language, and there are 50 of them. All red links of the array on Misplaced Pages:Translation/*/Lang (except the first column of the array, which has a different syntax) should be initialized with the syntax explained on this page.

Let me know if you need any further info.

Jmfayard 18:46, 6 January 2007 (UTC)

Abandoned Article bot

There is now a project dealing with articles which have not been modified or viewed recently at Misplaced Pages:WikiProject Abandoned Articles. Would there be any way to generate a bot which might list only articles which haven't been modified since, say, 2005 (or some other really long time, maybe by year), for the use of this project to help find the most overlooked articles? Badbilltucker 20:35, 6 January 2007 (UTC)

See Special:Ancientpages. —Mets501 (talk) 22:23, 6 January 2007 (UTC)

Ancientpages has two problems. First, it only lists the oldest 1000 articles (that is, 1000 articles with the oldest "most recent edit"). That, of course, is plenty to work on for any project. But the second problem is that at least 90% of the 1000 articles are disambiguation pages - not what members of the project are really interested in. An ideal bot would be able to screen out disambiguation pages from its results.

Alternatively, I guess, a special page (database listing) similar to ancientpages, but excluding disambiguation pages, would suffice. John Broughton | Talk 02:17, 7 January 2007 (UTC)

I'm in the process of importing a database dump and I'll gather these statistics for you. To be clear, you want a list of pages that have the oldest most recent edit, are in the main namespace, and are not disambiguation pages; correct? -- Jmax- 07:33, 7 January 2007 (UTC)

Correct, we'd like a list; statistics aren't needed. Ideally it would look like Special:Ancientpages, which does exactly what we need, except that it includes disambiguation pages, which we don't want, and turns out (see above) to be quite problematical. Thanks. John Broughton | Talk 20:37, 7 January 2007 (UTC)
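The filter being asked for here can be sketched in a few lines. This is only an illustration under stated assumptions: the page records, their field names, and the marker templates used to recognise disambiguation pages are all hypothetical, and a real run would read them from a database dump.

```python
from datetime import datetime

# Hypothetical marker templates; the real set of disambiguation templates may differ.
DISAMBIG_MARKERS = ("{{disambig", "{{disambiguation")


def is_disambiguation(wikitext):
    """Guess whether a page is a disambiguation page from its wikitext."""
    lowered = wikitext.lower()
    return any(marker in lowered for marker in DISAMBIG_MARKERS)


def oldest_articles(pages, limit=1000):
    """Titles of main-namespace, non-disambiguation pages, oldest last edit first."""
    candidates = [
        p for p in pages
        if p["namespace"] == 0 and not is_disambiguation(p["text"])
    ]
    candidates.sort(key=lambda p: p["last_edit"])
    return [p["title"] for p in candidates[:limit]]


# Made-up sample records standing in for rows from a dump.
pages = [
    {"title": "Mercury", "namespace": 0, "last_edit": datetime(2004, 3, 1), "text": "{{disambig}}"},
    {"title": "Old stub", "namespace": 0, "last_edit": datetime(2004, 6, 1), "text": "A stub."},
    {"title": "Talk:Old stub", "namespace": 1, "last_edit": datetime(2003, 1, 1), "text": "Hi."},
    {"title": "Fresh article", "namespace": 0, "last_edit": datetime(2007, 1, 5), "text": "New."},
]
print(oldest_articles(pages))
```

Because disambiguation pages are dropped before the limit is applied, the list stays at full length instead of being 90% disambiguation pages.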

DRVbot

I need a bot to do some routine maintenance tasks for deletion review. Possible tasks would include:

Create daily and monthly log pages
Remove headers and archive a closed daily log
Add <noinclude> tags to the text after archiving.

Any help with these tasks is appreciated. ~ trialsanderrors 09:07, 7 January 2007 (UTC)

replace "often times" and "oftentimes" with "often"

Can someone run a bot to replace all instances of "oftentimes" and "often times" with "often", which is shorter, means exactly the same thing, and doesn't sound so awkward in formal written English? I tried to start doing it manually, but there's too much of it.

You can find the instances via Google:

Obviously, perhaps ignoring all the talk and user namespaces might be a good idea.

Please. Paul Silverman 12:19, 7 January 2007 (UTC)

Hmm, this could be done fairly easily with AWB or pywikipedia, but personally I'm not sure it should be done with a fully automated bot for the same reason that misspellings shouldn't be automatically 'fixed' by a bot - quite often times come up where it would make no sense to change "often times" to "often"...like in that sentence! ^_^ I suggest using AWB and checking each edit yourself before confirming the change. ShakingSpirit 22:08, 7 January 2007 (UTC)
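A rough sketch of the "check each edit yourself" approach, in plain Python rather than AWB or pywikipedia. The function name and the `approve` callback are made up for illustration; the callback stands in for a human looking at the surrounding text.

```python
import re

# Matches "oftentimes" and "often times" (any capitalisation) as whole words.
PATTERN = re.compile(r"\boften\s?times\b", re.IGNORECASE)


def replace_reviewed(text, approve, context=30):
    """Replace a match with "often" only if approve(snippet) returns True."""
    def substitute(match):
        lo = max(0, match.start() - context)
        hi = min(len(text), match.end() + context)
        if approve(text[lo:hi]):
            # Keep the capitalisation of the first letter.
            return "Often" if match.group(0)[0].isupper() else "often"
        return match.group(0)

    return PATTERN.sub(substitute, text)


# A reviewer would approve the first sentence and reject the second.
print(replace_reviewed("Oftentimes it rains.", lambda snippet: True))
print(replace_reviewed("quite often times come up", lambda snippet: False))
```

The second sentence is exactly the kind of false positive mentioned above: the pattern matches it, so only the reviewer can keep it from being changed.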

Heading-Bot

There are many headings that do not follow Misplaced Pages:Manual of Style (headings) in Misplaced Pages articles. --Jamesshin92 22:14, 7 January 2007 (UTC)

And...? —Mets501 (talk) 22:38, 7 January 2007 (UTC)

I want to create a program to make sure that every heading in each article has the proper "Capitalization" (did I spell it right?). It is faster for bots to do that job.--Jamesshin92 22:43, 7 January 2007 (UTC)

For instance, the headings "Train Overview" and "USA" should be changed. The bot would notice and report to the owner. The owner decides if the heading is valid under Misplaced Pages:Manual of Style. --Jamesshin92 22:51, 7 January 2007 (UTC)

Also, headings like "The Dodos" should be flagged by this bot by noticing the word "the", "a" or "an" at the front. However, the alteration must be done manually, since there are exceptions like "The United States of America." Also, headings like "About Dodos" will be ignored although they do not follow Misplaced Pages:Manual of Style (it is like a bot correcting spellings).--Jamesshin92 22:59, 7 January 2007 (UTC)

So, to simplify: the bot will look through every heading one by one and judge whether it follows WP:MSH. The user will make the alterations.

You might think that this bot is useless since the alteration is done by humans. However, clicking into long articles and scrolling down to search for headings is a waste of time for humans. This bot will do it.--Jamesshin92 23:05, 7 January 2007 (UTC)

The job for this bot is straightforward and useful.--Jamesshin92 23:07, 7 January 2007 (UTC)

WP:AWB allows for the generation of a list of non-conformist headings from a database dump. ST47Talk 00:41, 8 January 2007 (UTC)

Capitalization of section headings is a problem, but it's not clear a bot can handle it well - I'd much rather see a popups approach where an editor makes the final decision. Consider this:

      === White House communications ===

or this:

      === U.S. Senator ===

Humans know that "White House" is where the U.S. President lives, and that "Senator" is a title and is capitalized. A bot doesn't. John Broughton | Talk 03:12, 8 January 2007 (UTC)

I agree. --Jamesshin92 04:02, 8 January 2007 (UTC)

However, I still think that we can take this idea in a different direction. For example, detecting repeated headings and special characters (such as %+^@~).

We also might think of changing headings to standard headings, such as "Also see" into "See Also," "Links" into "External Links," and so on. Jamesshin92 04:10, 8 January 2007 (UTC)

Yes, that could be useful. John Broughton | Talk 17:23, 8 January 2007 (UTC)
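A report-only checker along the lines discussed in this thread might look like the sketch below. It changes nothing; it only lists headings for an editor to look at. The function name, the report wording, and the two-entry table of standard headings (taken from the examples above) are illustrative assumptions.

```python
import re

# A wikitext heading: the same run of "=" on both sides of the title.
HEADING = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Suggested replacements taken from the examples in this thread.
STANDARD = {"also see": "See Also", "links": "External Links"}
SPECIAL = set("%+^@~")


def review_headings(wikitext):
    """Return (heading, reason) pairs for a human to review; nothing is edited."""
    reports = []
    seen = set()
    for match in HEADING.finditer(wikitext):
        title = match.group(2)
        key = title.lower()
        if key in seen:
            reports.append((title, "repeated heading"))
        seen.add(key)
        if any(ch in SPECIAL for ch in title):
            reports.append((title, "special character"))
        if key in STANDARD:
            reports.append((title, "suggest: " + STANDARD[key]))
        # Capitalised words after the first may be proper nouns, so only flag them.
        if any(word[:1].isupper() for word in title.split()[1:]):
            reports.append((title, "capitalized word after the first"))
    return reports


sample = (
    "== Train Overview ==\ntext\n== Links ==\n"
    "=== White House communications ===\n== Links ==\n"
)
for heading, reason in review_headings(sample):
    print(heading, "->", reason)
```

Note that "White House communications" is flagged just like "Train Overview"; telling them apart is left to the editor, which is the point made above about proper nouns.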

UserCheck-Bot

Shouldn't we delete users who have not contributed to Misplaced Pages for a long period of time?--Jamesshin92 22:32, 7 January 2007 (UTC)

No, as per WP:USERNAME ShakingSpirit 22:34, 7 January 2007 (UTC)

No. Visit the Misplaced Pages:Usurpation page to support the policy. ST47Talk 00:45, 8 January 2007 (UTC)

or Misplaced Pages:Delete unused username after 90 days ST47Talk 00:45, 8 January 2007 (UTC)

Repeated article-link bot

Could a bot that detects and removes repeats of links to other articles be created? For example the article on Lions might have the word "Africa" in the first paragraph which is linked to an article about Africa and then further down the page there is another link where the word "Africa" appears. The bot could detect this and de-link the second occurrence of the word. —The preceding unsigned comment was added by Mutley (talk • contribs) 06:21, 8 January 2007 (UTC).

I believe WP:AWB can detect this, perhaps you can use it? ST47Talk 11:55, 8 January 2007 (UTC)

While I think the concept is a good idea, it's useful to a reader not to have to go way back (up) in an article to find a wikilink. I suggest that if the separation is more than 3000 characters (that's roughly 40 lines, i.e., almost a full page), then the second occurrence should not be dewikilinked. John Broughton | Talk 17:22, 8 January 2007 (UTC)

I agree with John, delink only if it's a large separation. Also, it'd be useful if it could look out for links caused by templates too, for example template:main. Vicarious 05:14, 9 January 2007 (UTC)

Also, if the repeated instance is in the "see also" section of the article it should be removed completely (or left alone). Vicarious 09:01, 9 January 2007 (UTC)
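A minimal sketch of the delinking rule with the 3000-character threshold suggested above. It works on raw wikitext only; links produced by templates and the special handling of "see also" sections are not covered, and the function name is made up for illustration.

```python
import re

# [[Target]] or [[Target|label]]
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def delink_repeats(wikitext, max_gap=3000):
    """Unlink a repeated link when the previous kept link to the same
    target is at most max_gap characters earlier."""
    last_kept = {}  # normalised target -> offset of the most recent kept link

    def substitute(match):
        target = match.group(1).strip()
        # Treat the first letter as case-insensitive, as in article titles.
        key = target[:1].upper() + target[1:]
        label = match.group(2) if match.group(2) is not None else match.group(1)
        previous = last_kept.get(key)
        if previous is not None and match.start() - previous <= max_gap:
            return label
        last_kept[key] = match.start()
        return match.group(0)

    return LINK.sub(substitute, wikitext)


print(delink_repeats("Lions live in [[Africa]]. Much of [[Africa]] is savanna."))
```

Offsets are measured in the original text, so a link that is kept because it is far from the previous one becomes the new reference point for later repeats.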

Invariable symbols bot

Unit symbols are invariable (unlike abbreviations), but there are nevertheless hundreds if not thousands of articles where "s" has been added improperly. A bot could fix this automatically, as the risk of confusion with correct English is about zero. Specifically:

Replace "cms" with "cm" (centimetre)
Replace "ins" with "in" (inch) <-- This one needs care (could be e.g. "ins and outs", "sit-ins", etc.)
Replace "kms" with "km" (kilometre)
Replace "mins" with "min" (minute)
Replace "mms" with "mm" (millimetre)
Replace "secs" with "s" (second)
Replace "yds" with "yd" (yard)

The sought strings should be case sensitive, and the bot should leave instances immediately followed by a period alone (they could be legitimate abbreviations).

Other cases than those listed above probably exist.

Urhixidur 18:55, 8 January 2007 (UTC)

This bot would have to be manually assisted. —Mets501 (talk) 20:24, 8 January 2007 (UTC)

Although it might be prudent to manually assist or at least review, I don't think it's necessary. If it only changes when it's all lower case, not followed by a period, and is preceded by a number, then I think the incidence of incorrect change would be 0. On a side note I'm lukewarm about changing secs to s though. Vicarious 05:11, 9 January 2007 (UTC)
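The three conditions just described (lower case only, preceded by a number, not followed by a period) fit in a single regular expression. The sketch below is an illustration of those rules, not a tested bot; the function name is made up.

```python
import re

# Pluralised symbol -> correct invariable symbol, from the list above.
UNITS = {
    "cms": "cm", "ins": "in", "kms": "km", "mins": "min",
    "mms": "mm", "secs": "s", "yds": "yd",
}

# Case sensitive; requires a digit (optionally one space) before the symbol,
# a word boundary after it, and no period immediately following.
PATTERN = re.compile(r"(\d\s?)(cms|ins|kms|mins|mms|secs|yds)\b(?!\.)")


def fix_units(text):
    """Replace pluralised unit symbols that directly follow a number."""
    return PATTERN.sub(lambda m: m.group(1) + UNITS[m.group(2)], text)


print(fix_units("a 5 kms walk that took 90 mins"))
```

Requiring a preceding number is what keeps "sit-ins" and "ins and outs" safe; a phrase like "10 ins and outs" would still be changed, so some review is still sensible.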

Conversion bot

Would it be possible to create a bot that could automatically convert the U.S. Standard System of Measurement into the European Metric System of Measurement? I think a number of articles on here could benefit from such a bot if we do not already have one. Note that I know nothing about operating a bot; this is simply an idea of mine which I got while working on the article USS Wisconsin. —The preceding unsigned comment was added by TomStar81 (talk • contribs) 06:06, 9 January 2007 (UTC).

If I understand your idea correctly, I think it's a good one. As I understand it, you want to find all standalone imperial measurements and add a parenthesised metric measurement too, and vice versa, i.e. if you find "1 mile" replace it with "1 mile (1.6 km)" (or whatever). The bot would have to be very careful/intelligent, but I think it can be done and is a good idea. - PocklingtonDan 08:16, 9 January 2007 (UTC)

Yeah, that's exactly what I have in mind. It seems to me that us Americans will never convert to the metric system, and the rest of the world is not going to change measurements to accommodate us, so I figure it would be best to build a bot that can automatically handle the conversions rather than pester people to consistently make the conversions themselves. TomStar81 (Talk) 08:23, 9 January 2007 (UTC)

I'm in favor of the idea. I have some ideas on how to make this bot so that it doesn't break anything. I'd be willing to write the pseudocode or review someone else's code on the bot. Vicarious 08:33, 9 January 2007 (UTC)

I really wouldn't go there with a bot. Bobblewik was doing just that, and judging from his block log, I don't think the community really liked that. —Mets501 (talk) 12:02, 9 January 2007 (UTC)

All that fuss seems to be to do with date unlinking, not this idea. If the bot is properly set up with intelligent enough rules this would be genuinely useful, I don't see what objections people could have to this bot??? - PocklingtonDan 12:34, 9 January 2007 (UTC)

(undent) It doesn't seem that controversial. Obviously some test runs, and starting slowly, would be appropriate. In general, I think Misplaced Pages needs more bots like this - human beings just aren't that consistent (much of which comes from not knowing everything in detail), and bots like this can compensate for that. John Broughton | Talk 15:30, 9 January 2007 (UTC)
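To make the "1 mile" to "1 mile (1.6 km)" idea concrete, here is a small sketch for a test run. It handles only three units in one direction, rounds to one decimal place regardless of the precision of the source figure, and skips anything that already has a parenthesis after it or uses thousands separators. The names are illustrative; the conversion factors are the exact definitions.

```python
import re

# Exact definitions: 1 mile = 1.609344 km, 1 foot = 0.3048 m, 1 inch = 2.54 cm.
CONVERSIONS = {"mile": (1.609344, "km"), "foot": (0.3048, "m"), "inch": (2.54, "cm")}
PLURALS = {"miles": "mile", "feet": "foot", "inches": "inch"}

# A plain number, one space, a unit word, and no " (" straight after it.
# The lookbehind skips numbers such as "1,000" rather than misreading them.
PATTERN = re.compile(
    r"(?<![\d.,])(\d+(?:\.\d+)?) (miles?|foot|feet|inch(?:es)?)\b(?! \()"
)


def add_metric(text):
    """Append a parenthesised metric value to standalone imperial measurements."""
    def substitute(match):
        value = float(match.group(1))
        unit = PLURALS.get(match.group(2), match.group(2))
        factor, symbol = CONVERSIONS[unit]
        return "{} ({:.1f} {})".format(match.group(0), value * factor, symbol)

    return PATTERN.sub(substitute, text)


print(add_metric("The target was about 1 mile away."))
```

Choosing a sensible number of significant figures is the hard part a real bot would need to solve; fixed one-decimal rounding is only a placeholder here.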

Link Normalization

Bot request: Change article text of the form "[[foo bar|foo]] bar", "foo [[foo bar|bar]]", and "[[foo bar|foo]] [[foo bar|bar]]" to "[[foo bar]]". I call this link normalization. The bot should make one pass of the whole database every month or so.

These oddities exist due to disambiguation. The first editor writes "[[foo]] bar", a disambiguator uses a tool to replace "[[foo]]" with "[[foo bar|foo]]", leaving "[[foo bar|foo]] bar". Clearly "[[foo bar]]" is the better form. It makes the link more sensible if it covers both words. The tools could be altered, but there's more than one, they're already complex, and thousands of these links already exist. -- Randall Bart 07:29, 11 January 2007 (UTC)

Considered a waste of resources. I tried much the same thing, see my talk page, archive 5. ST47Talk 11:18, 11 January 2007 (UTC)
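For reference, the proposed rewrite can be sketched with three regular expressions. The link markup in this request did not survive in this copy of the page, so the sketch assumes the three forms were "[[foo bar|foo]] bar", "foo [[foo bar|bar]]" and "[[foo bar|foo]] [[foo bar|bar]]"; the function name is hypothetical, and only a single adjacent word is handled.

```python
import re

PIPED = r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]"   # [[target|label]]
BOTH = re.compile(PIPED + " " + PIPED)         # two piped links side by side
TRAILING = re.compile(PIPED + r" (\w+)")       # piped link, then a plain word
LEADING = re.compile(r"(\w+) " + PIPED)        # plain word, then a piped link


def normalize_links(text):
    """Merge a piped link with its neighbour when together they spell the target."""
    def both(match):
        target1, label1, target2, label2 = match.groups()
        if target1 == target2 == label1 + " " + label2:
            return "[[" + target1 + "]]"
        return match.group(0)

    def trailing(match):
        target, label, word = match.groups()
        return "[[" + target + "]]" if target == label + " " + word else match.group(0)

    def leading(match):
        word, target, label = match.groups()
        return "[[" + target + "]]" if target == word + " " + label else match.group(0)

    text = BOTH.sub(both, text)
    text = TRAILING.sub(trailing, text)
    return LEADING.sub(leading, text)


print(normalize_links("He played [[bass guitar|bass]] guitar for years."))
```

The comparison is case sensitive on purpose, so a link whose visible text differs in capitalisation from the target is left alone rather than having its displayed text changed.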

Automated (non-assisted) speling bot

I am aware that such bots have been proposed before and rejected, so before proposing this in an RFA, I'm going to try and get something hammered out here instead before requesting approval. I think it would be handy to have an automated (non-assisted) spelling bot.

The Why: Everybody makes spelling mistakes; it's not a problem that is going to go away. Human editors have enough to do correcting typos etc. without having to patrol for "systematic" spelling mistakes too. Spelling mistakes in main article namespace affect the professional image of Wikipedia.

The Why Not: In the past people seem to have objected that such a bot would be indiscriminate and cause too many false positives, "correcting" spelling mistakes that were not spelling mistakes at all.

What I propose: All objections can be negated by applying sufficient conditions of operation to the bot, I believe:

The bot would monitor text submitted in recent edits
The bot would monitor edits to main article namespace only, not talk or user
The bot would not edit words embedded in wikilinks, assuming these are special cases.
The bot would not edit words with multiple spellings in different languages or language variations.
The bot would not edit words that are ambiguous and could be intended to be more than one word.
The bot would not edit words in html links, assuming these are special cases.
The bot would not edit words with first-letter capitalization (to ignore all proper nouns)
The bot would not edit words without a word boundary on both sides
The bot would not edit words with "(sic)" in next 100 chars
The bot would not edit words if the same word appeared in that spelling in the article title
The list of words matched would be available on the bot page.
No new words to be added without second bot RFA
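Several of the conditions above are mechanical enough to sketch. The example below covers the wikilink and external-link exclusions, the capitalization rule, word boundaries, the "(sic)" window and the article-title check. The three-word correction table and the function name are illustrative assumptions, and the rules about language variants and ambiguous words are matters of choosing the word list, so they are not shown.

```python
import re

# Tiny illustrative sample of a misspelling -> correction table.
CORRECTIONS = {"accomodate": "accommodate", "definately": "definitely", "recieve": "receive"}

# Spans the bot must not touch: wikilinks, bracketed external links, bare URLs.
PROTECTED = re.compile(r"\[\[.*?\]\]|\[https?://[^\]]*\]|https?://\S+")

# Whole lower-case words only, so capitalised words are never matched.
WORD = re.compile(r"\b[a-z]+\b")


def correct_spelling(title, text):
    """Apply CORRECTIONS to text, skipping every case the safeguards exclude."""
    protected = [m.span() for m in PROTECTED.finditer(text)]
    title_words = set(re.findall(r"[a-z]+", title.lower()))

    def substitute(match):
        word = match.group(0)
        fix = CORRECTIONS.get(word)
        if fix is None or word in title_words:
            return word
        if any(start <= match.start() < end for start, end in protected):
            return word
        if "(sic)" in text[match.end():match.end() + 100]:
            return word
        return fix

    return WORD.sub(substitute, text)


print(correct_spelling("Hotel", "They accomodate guests."))
```

Every safeguard returns the word unchanged, so the bot's default action on any doubt is to do nothing.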

Proposed list of changes: Rather than trying to catch all typos, the bot would watch for the approximately top 100 most common "systematic" spelling errors, based on wikipedia:list_of_common_misspellings and others. The proposed list is (note, there might be some mistakes in this list currently that need correcting):

accelarate -> accelerate
accessable -> accessible
accessery -> accessory
accidently -> accidentally
accomodate -> accommodate
accomodation -> accommodation
acustomed -> accustomed
annoint -> anoint
aquaintance -> acquaintance
adress -> address
antinatal -> antenatal
assasination -> assassination
batallion -> battalion
cemetary -> cemetery
changable -> changeable
committment -> commitment
concensus -> consensus
cooly -> coolly
corollory -> corollary
definately -> definitely
dessicate -> desiccate
dessicated -> desiccated
dispair -> despair
desparate -> desperate
developement -> development
disippate -> dissipate
dissappointed -> disappointed
differense -> difference
drunkeness -> drunkenness
ecstacy -> ecstasy
elevater -> elevator
embarassment -> embarrassment
excede -> exceed
existance -> existence
febuary -> february
grammer -> grammar
garantee -> guarantee
harrass -> harass
harrassment -> harassment
heros -> heroes
independant -> independent
idiosyncracy -> idiosyncrasy
inadvertant -> inadvertent
indispensible -> indispensable
innoculate -> inoculate
irresistable -> irresistible
irritible -> irritable
insistant -> insistent
judgment -> judgement
liason -> liaison
libary -> library
liquify -> liquefy
momento -> memento
millenium -> millennium
mischievious -> mischievous
miniscule -> minuscule
noticable -> noticeable
ocassion -> occasion
ocassional -> occasional
occurence -> occurrence
paralell -> parallel
persue -> pursue
pityful -> pitiful
posess -> possess
procesed -> processed
priviledge -> privilege
privelege -> privilege
reccomend -> recommend
recieve -> receive
refered -> referred
relavant -> relevant
repitition -> repetition
sacreligious -> sacrilegious
sieze -> seize
seperate -> separate
spacial -> spatial
subpena -> subpoena
supercede -> supersede
transfered -> transferred
tyrrany -> tyranny
unparalelled -> unparalleled
wastefull -> wasteful
wieght -> weight
wierd -> weird
yeild -> yield
Zeroes -> Zeros

Can I get some measured response to this proposal please? I want to try and hammer the proposal into shape such that it stands a chance of getting through the bot RFA process. Please suggest anything you can think of to improve its operation etc. Cheers - PocklingtonDan 15:41, 11 January 2007 (UTC)

An automated spelling bot will never be approved by the BAG as per bot policy Betacommand 17:57, 11 January 2007 (UTC)

Bot policy is not handed down from the gods; it is something that someone or some group of people once decided, i.e. it was the consensus when it was written. It can be changed if the consensus now is that it is acceptable under certain strict conditions. Are you stating that it is necessary to go through the process of changing the policy first, and then going through the justification all over again when getting authorisation for the bot? - PocklingtonDan 18:37, 11 January 2007 (UTC)

I have started a request to change this policy here, please comment if you have any views on this - PocklingtonDan 18:43, 11 January 2007 (UTC)

What I am saying is that this is not a job for a bot. Ask to have the words added to AWB, as a fully automated bot has too many issues and risks. Also, the community has said that they don't want spellchecking bots. This is a task for humans to do, as the room for error is very small. Betacommand 18:49, 11 January 2007 (UTC)

What are the issues and risks involved in such a bot? At worst, it would match a false positive and change the spelling of a word that was deliberately misspelt. Based on the outline above, I believe the chances of that happening are incredibly small. It would only be operating on 50-100 of the most commonly misspelt words and I have outlined various safeguards above to prevent false positive matches. Worst case scenario, it corrects a word it shouldn't have. I don't see why that is so dire, given that AVB's cockups when they occur are far worse, actually restoring vandalism in some cases. I agree with the basic premise that a bot cannot catch all typos, so the bot doesn't try to do this; it tries to catch just the 50-100 most common systematic spelling errors.

What I am really after is comments on improving the rules/restrictions to reduce the possibility of false positive matches.

Thanks - PocklingtonDan 19:51, 11 January 2007 (UTC)
