Misplaced Pages

User talk:ClueBot Commons: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 08:45, 5 January 2013 by NaomiAmethyst (ClueBot III just archived everything into the same month instead of splitting it: Reply.)
Revision as of 09:08, 5 January 2013 by Ahnoneemoos (ClueBot III just archived everything into the same month instead of splitting it)

Revision as of 09:08, 5 January 2013

ClueBot NG Links! Report False Positives • Review edits for the dataset • Frequently Asked Questions

False Positives
If you believe that ClueBot NG has made a mistake, please follow the directions in the warning it gave or click here. Please do not report them here. It takes less time to report them to the correct location, and we can handle it more effectively if reported in the correct location.

Purpose of this Page
This page is for comments on or questions about the ClueBots.

The current status of ClueBot NG is: Running
Praise should go on the praise page. Barnstars and other awards should go on the awards page.
Use the "new section" button at the top of this page to add a new section. Use the link above each section to edit that section.
This page is automatically archived by ClueBot III.
The ClueBots' owner or someone else who knows the answer to your question will reply on this page.


ClueBots
ClueBot NG/Anti-vandalism · ClueBot II/ClueBot Script
ClueBot III/Archive · Talk page for all ClueBots
Beware! This user's talk page is monitored by talk page watchers. Some of them even talk back.

FalsePositives page for inactive ClueBot

I followed the False positive? Report it link from the log message for an old edit by ClueBot (not NG). It took me to User:ClueBot/FalsePositives. That page's Click here reporting link fails with "server not found".

It has taken me quite some time to figure out that the bot in question is inactive. Judging by the page's edit history, several other people have been confused too. Adding to my confusion was that the corresponding page for ClueBot NG looks so similar; I didn't realize for quite a while that I was looking at two different pages. (I got a distinct "maze of twisty little passages, all different" feeling when that finally dawned on me :-) )

I've taken the liberty of copying the wikibreak template from User:ClueBot to User:ClueBot/FalsePositives, and rewording the latter's introductory sentence. I'm not comfortable doing more than that, but I suggest that a ClueBot maintainer edit the page more extensively, to make it clearer than my (intentionally minimal) changes can do that:

  1. ClueBot is out of service
  2. So is its false-positive reporting mechanism
  3. Reporting false positives against it would be pointless anyway (except in the unlikely circumstance that ClueBot is ever revived)

Or whatever subset of those points is actually true...

Thanks. Erics (talk) 20:15, 1 January 2013 (UTC)

I've tidied it up.--5 albert square (talk) 22:07, 1 January 2013 (UTC)
Looks good. Thanks much. Erics (talk) 05:17, 2 January 2013 (UTC)

Feeding ClueBot false-positives into ClueBot NG

Actually, thinking more about my point #3 above: mightn't it make sense to feed ClueBot's false positives into ClueBot NG's dataset? On the one hand, NG presumably makes different mistakes from the old ClueBot, so I have no idea how useful feedback from one tool would be for improving the other's accuracy. On the other hand, data is data; if people are willing to provide it, why not make use of it? Or would that in fact do more harm than good? Erics (talk) 20:20, 1 January 2013 (UTC)

I'm not sure that would actually be possible. It's been some time now since ClueBot was active, and I'm fairly certain ClueNet have undergone a lot of change in that time (various upgrades, etc.). Also, I would imagine that merging the two databases would take some time and possibly make the database too large. After all, the original ClueBot made close to 1.6 million edits, I think, before he took his Wikibreak.
I'll swing by Rich's page though and let him know about this.--5 albert square (talk) 22:06, 1 January 2013 (UTC)
Maybe Cobi will correct me, but I think they are 2 totally different databases... I don't think it can be done - Rich(MTCD) 23:28, 1 January 2013 (UTC)
Actually Rich, I think you're correct. Thinking about it I'm sure I now remember ClueBot NG having some downtime earlier in the year because of it.--5 albert square (talk) 00:18, 2 January 2013 (UTC)
Sorry if I was unclear. I didn't mean to suggest that the databases be merged. Rather, the idea was just that the old ClueBot's false-positive page could take me to a tool that submits my report to ClueBot NG instead of to the old ClueBot. That assumes (and I've only now realized this, and that the assumption might well be incorrect) that a report of the form "text FOO on page BAR was flagged as vandalism but is really OK" is sufficient to feed into NG's learning machinery. But maybe it isn't useful, without other context as to how the decision was made in the first place -- and of course that context, from old ClueBot's database, would be meaningless to NG. Anyway, it's probably not worth a whole lot of effort; on further thought, I don't know how many people would be looking far enough back in history, at this point, to be reporting old-ClueBot's false positives in the first place. Erics (talk) 05:16, 2 January 2013 (UTC)

ClueBot III just archived everything into the same month instead of splitting it

See Talk:Monsanto. ClueBot III just archived everything into Talk:Monsanto/Archives/2012/October. It also created an index even though the parameters told the bot not to do so (default value of index). —Ahnoneemoos (talk) 05:26, 2 January 2013 (UTC)

Because the bot does not read timestamps, the initial archival will do this: the bot assumes you are current on archival when the bot tags are added. It works based on diffs: if a section is unchanged between the revision from age hours ago and the current revision, the bot archives it into the page named after the date age hours ago. This also means that the bot is more accurate about what is archived and what is not. For example, a faked timestamp won't affect the bot, and it will also pick up non-timestamped changes. Subsequent archivals will go into the correct archive (the one dated based on the last activity in that section). The easiest fix is to copy-paste the old sections into the relevant archives yourself (if none existed before you added CB3).
Furthermore, per the docs,

The index parameter should be set to yes if you wish the bot to dump an index in place of the template. This is useful if you have wrapped the template with an {{archive box}}. Otherwise it should be set to no.

index doesn't have a specified default value, and should be set explicitly to yes or no. It will still create an index, just not transclude it. If you don't want it even to create an index in its own userspace, you need to use nogenerateindex=1:

nogenerateindex

Type: unsigned integer (boolean)
Default: 0

Description: If this is set to 1, the bot will not generate an index under User:ClueBot III/Indices/. There are very few times this option should be used. If this option is used, the index option will no longer work right.

-- Cobi 19:37, 3 January 2013 (UTC)
Not sure. I don't understand what exactly I have to do so that CB3 archives each entry into its corresponding month-year if no archive existed before adding CB3. Could you please tone down the technical jargon a bit and explain it as if I were a 5 year old? Thanks for everything that you do bro, appreciate your time, patience, and effort. Cheers! —Ahnoneemoos (talk) 20:03, 3 January 2013 (UTC)
You have to manually create the archive pages and archive the sections before adding CB3 if you care about the history before CB3 was added. Otherwise CB3 will dump it all in the same archive when it creates it.
The reason is that the bot doesn't look at the timestamps, and instead looks at the history of the page. In this way, it is fundamentally different from MiszaBot, and this is one of the drawbacks of doing it this way -- that it archives everything into the current archive on the first run. The benefit is that it is more accurate in detecting changes to sections. To be clear, it will work correctly after the first archive has happened.
To recap, the easiest way is to make sure your archives are current before switching to CB3. -- Cobi 20:23, 3 January 2013 (UTC)
Idea: can't I just hack it by creating the seed used by CB3 to perform subsequent archives? Is it not the indexes? I can manually create the index then, no? That way CB3 will believe that it already performed an archive and operate as if it were performing a routine task. —Ahnoneemoos (talk) 20:37, 3 January 2013 (UTC)
The bot does not distinguish between initial and "routine" archivals. That is the issue. The very first archival it performs it performs as if it were already current (and thus, a "routine" archival). The solution is to make your page current before adding the bot. The bot will keep it current, but won't fix (beyond moving it out of the way into the current archive) a page that needs cleaning up first. -- Cobi 08:45, 5 January 2013 (UTC)
OK, I understand now. The bot assumes that the content to be archived belongs to the current month. I wonder if there's a way to tell the bot what the correct month should be by looking at the diffs or history instead of the signatures like MiszaBot does. —Ahnoneemoos (talk) 09:08, 5 January 2013 (UTC)
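For readers following this thread, here is a minimal Python sketch of the diff-based rule Cobi describes: a section is archived only if it is identical in the revision from age hours ago and in the current revision. It is not ClueBot III's actual code. The function names, the level-2 heading parsing, the Year/Month archive naming and the 2160-hour age are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Fixed English month names, so the result does not depend on the locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def split_sections(wikitext: str) -> dict[str, str]:
    """Split page text into {heading: body} on level-2 '== Heading ==' lines."""
    sections: dict[str, str] = {}
    heading = None
    for line in wikitext.splitlines():
        stripped = line.strip()
        is_level2 = (stripped.startswith("==") and stripped.endswith("==")
                     and not stripped.startswith("==="))
        if is_level2:
            heading = stripped.strip("=").strip()
            sections[heading] = ""
        elif heading is not None:
            sections[heading] += line + "\n"
    return sections


def sections_to_archive(old_text: str, current_text: str) -> list[str]:
    """Headings whose body is identical in the old and the current revision."""
    old = split_sections(old_text)
    current = split_sections(current_text)
    return [h for h, body in current.items() if old.get(h) == body]


def archive_title(prefix: str, now: datetime, age_hours: int) -> str:
    """Name the archive after the date `age_hours` ago (hypothetical Y/F layout)."""
    cutoff = now - timedelta(hours=age_hours)
    return f"{prefix}{cutoff.year}/{MONTHS[cutoff.month - 1]}"


if __name__ == "__main__":
    old_rev = "== Stale thread ==\nold comment\n== Active thread ==\nfirst post\n"
    new_rev = "== Stale thread ==\nold comment\n== Active thread ==\nfirst post\nreply\n"
    print(sections_to_archive(old_rev, new_rev))
    print(archive_title("Talk:Monsanto/Archives/", datetime(2013, 1, 2, 5, 0), 2160))
```

Note that nothing in this rule reads the signatures inside a section. On the first run every untouched section, however old, passes the same test and lands in the same archive page, which matches the behaviour reported at Talk:Monsanto.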

Is ClueBot III restricted to same NAMESPACE? What about subpages and parent targets?

See Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests. I specified the prefix as Misplaced Pages talk:, which is different from where the original content resided (the original content was in the Misplaced Pages: NAMESPACE). The bot ignored the specified NAMESPACE and performed the move into the same NAMESPACE where the original content resided; i.e., it was not able to switch to a NAMESPACE different from that of the origin:

{{User:ClueBot III/ArchiveThis
| archiveprefix=Misplaced Pages talk:WikiProject Puerto Rico/Archives/
| format=Y/F
}}

EXPECTED RESULT
Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests archived into Misplaced Pages talk:WikiProject Puerto Rico/Assessment/Requests/Archives/December/2012

REAL RESULT
Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests archived into Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests/Archives/December/2012


Also, how can the bot archive a subpage to an archive in a parent target? For example, how can the bot archive Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests into Misplaced Pages:WikiProject Puerto Rico/Archives?

Ahnoneemoos (talk) 09:32, 2 January 2013 (UTC)
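As a side note on the format=Y/F setting above, here is a small Python sketch of how such a format string might expand into an archive subpage name. It assumes the letters follow PHP date() conventions (Y = four-digit year, F = full month name, m = two-digit month); that reading is an assumption, and expand_format is a hypothetical helper, not part of the bot.

```python
from datetime import datetime

# Fixed English month names, so the result does not depend on the locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def expand_format(fmt: str, when: datetime) -> str:
    """Expand a small subset of PHP date()-style letters; copy other characters as-is."""
    tokens = {
        "Y": f"{when.year:04d}",       # four-digit year
        "F": MONTHS[when.month - 1],   # full month name
        "m": f"{when.month:02d}",      # two-digit month
    }
    return "".join(tokens.get(ch, ch) for ch in fmt)


if __name__ == "__main__":
    prefix = "Misplaced Pages talk:WikiProject Puerto Rico/Archives/"
    print(prefix + expand_format("Y/F", datetime(2012, 12, 15)))
```

Under that assumption, Y/F puts the year first (2012/December); getting the month first would need F/Y.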

This is to prevent the bot clobbering over pages it shouldn't. If you need a specific instance fixed, let me know and I can override it. -- Cobi 05:46, 3 January 2013 (UTC)
Misplaced Pages talk:WikiProject Puerto Rico/Archive        → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages talk:WikiProject Puerto Rico/Archive 1      → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages talk:WikiProject Puerto Rico/Archive 2      → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages talk:WikiProject Puerto Rico/Archive 3      → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages talk:WikiProject Puerto Rico/Assessment     → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages talk:WikiProject Puerto Rico/templates      → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests → Misplaced Pages talk:WikiProject Puerto Rico/Archives/<year>/<month>
Is there any way that you can implement some sort of privilege for advanced users? Similar to MiszaBot's key parameter in order to override this?
Ahnoneemoos (talk) 06:15, 3 January 2013 (UTC)
It has a key parameter. Though, only I can generate them, and they are specific to a certain page and archiveprefix. I have updated the last one to have a key -- it should be fixed now. The others didn't have a CB3 template on them, and I am not entirely sure why the first 4 should be auto-archived anyway. If you add a CB3 template (commented out if you prefer so it doesn't immediately activate) to them to do what you want, I can add a key. -- Cobi 19:21, 3 January 2013 (UTC)
 Done Will this archive each entry into its corresponding month-year or will it dump everything into Misplaced Pages talk:WikiProject Puerto Rico/Archives/2013/January like it happened at ? —Ahnoneemoos (talk) 19:45, 3 January 2013 (UTC)
See my comment above about that behavior. -- Cobi 19:56, 3 January 2013 (UTC)
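To illustrate the safeguard and the key override discussed in this section, here is a hedged Python sketch. It assumes the bot accepts an archiveprefix only when it is a subpage of the page being archived, and that a key is tied to one page and one archiveprefix, as Cobi says. The HMAC scheme, the secret and both function names are invented for illustration; how ClueBot III really generates and checks keys is not stated in this thread.

```python
import hashlib
import hmac
from typing import Optional


def make_key(secret: bytes, page: str, archiveprefix: str) -> str:
    """Hypothetical key: an HMAC over the page title and the archive prefix."""
    message = f"{page}\n{archiveprefix}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def archive_allowed(page: str, archiveprefix: str,
                    key: Optional[str] = None, secret: bytes = b"") -> bool:
    """Allow archiving below the page itself, or anywhere with a matching key."""
    if archiveprefix.startswith(page + "/"):
        return True   # the target is a subpage of the source page
    if key is None:
        return False  # foreign target and no override supplied
    return hmac.compare_digest(key, make_key(secret, page, archiveprefix))


if __name__ == "__main__":
    secret = b"known-only-to-the-bot-operator"  # placeholder value
    page = "Misplaced Pages:WikiProject Puerto Rico/Assessment/Requests"
    prefix = "Misplaced Pages talk:WikiProject Puerto Rico/Archives/"
    print(archive_allowed(page, prefix))
    print(archive_allowed(page, prefix, make_key(secret, page, prefix), secret))
```

Binding the key to both the page and the prefix means a key issued for one page cannot be copied onto another page or pointed at a different archive location, which fits the stated goal of keeping the bot from clobbering pages it shouldn't.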

Weird db output

So, according to the recentchanges table, ClueBot's edits are made by a non-bot ;p. Anything at your end that could be causing this? Ironholds (talk) 15:21, 3 January 2013 (UTC)

This is intentional, and it applies to all anti-vandal bots, not just ClueBot NG. See User:ClueBot NG/FAQ#Edits. – Wdchk (talk) 15:43, 3 January 2013 (UTC)
Neat. Pain in the arse for db analysis, mind ;p. Ironholds (talk) 04:33, 4 January 2013 (UTC)

Captcha code broken?

More than 10 tries and still no success with ClueBot NG Report Interface. The Captcha says "two words" but one is always a non-word! (have tried closest real word also)

page info: http://report.cluebot.cluenet.org/?page=View&id=1425713
ID: 1425713
User: 165.121.80.205
Article: 2012 Delhi gang rape case
Friday, the 4th of January 2013 at 04:03:18 AM #90238 Anonymous (anonymous) "When I originally edited this paragraph, I made an edit comment that I thought a video source was less reliable, as I could not check it. It has been several days and no one has added a better source. {cn} is not justified as the source is listed, just uncheckable. Is there a better way to handle this situation?"

P.S. the Captcha on Edit pages does not have this coding or problem. — Preceding unsigned comment added by 165.121.80.205 (talk) 09:19, 4 January 2013 (UTC)

Yeah, I'm (very slowly) working on changing the CAPTCHA system on the report interface; ReCaptcha is becoming somewhat of an annoyance - Rich(MTCD) 00:17, 5 January 2013 (UTC)

WP:DABS and Cluebot II

Any chance the bot could be run once a month to update WP:DABS (or the transcluded page, User:ClueBot II/dino) the way it did until December 2009? This would be really useful. Firsfron of Ronchester 06:41, 5 January 2013 (UTC)