Revision as of 14:02, 2 October 2024 editLeaderboard (talk | contribs)489 edits →Update the global bots section: ReplyTag: Reply← Previous edit | Latest revision as of 22:55, 16 October 2024 edit undoIsaacl (talk | contribs)Extended confirmed users23,424 edits →Reuse for bots and tools: agree that knowing some basic usage info would be helpful; extra overhead, though, is a challenge | ||
(43 intermediate revisions by 13 users not shown) | |||
{{User:MiszaBot/config
|maxarchivesize = 150K
|counter = 30
|minthreadsleft = 4
|minthreadstoarchive = 1
|archive = Misplaced Pages talk:Bot policy/Archive %(counter)d
}}
== Mass creation section == | |||
{{ping|BilledMammal}} Would you mind explaining further? I don't understand what you wrote in your edit summary. – ] <small>(])</small> 09:05, 7 July 2024 (UTC) | |||
:My understanding is that ] applies to all mass creations, even when automation is not used - for example, ] mass creation. | |||
:I also think it is a little redundant. I can't think of any circumstances where true mass creation can occur without some level of automation - for example, the use of boilerplate text. H ] (]) 09:13, 7 July 2024 (UTC) | |||
::Thanks. I think it pretty clearly does not apply to all mass creations though, for the following reasons: | |||
::* It's part of the bot policy, not the editing policy | |||
::* It says so: {{tq|any large-scale ''automated or semi-automated'' content page creation task }}, not {{!xt|any large-scale content page creation task }} | |||
::* The (only) requirement created by this section is to seek permission at ]. If I went to BRFA and said I wanted to write a series of 50 articles on a subject without automation, I think they'd tell me to move along. | |||
::* A 2022 ] to {{tq|clarify that mass-creation through repetitive editing by hand is not different for policy purposes to automated/semi-automated mass-creation}} and {{tq|make getting consensus for creation prior to mass creation per WP:MASSCREATE mandatory}} ]. | |||
::It does apply to ], yes, but that is still within the limits of the bot policy. It is possible to create large numbers of articles without automation; I see editors doing it every day at NPP. For example, writing a stub on a species or location from scratch could take as little as ten minutes. So if you sit down and crank them out all day, you could break 50 and still have time for a long lunch. – ] <small>(])</small> 09:47, 7 July 2024 (UTC) | |||
:::It also says {{tq|all mass-created articles}}, and the final paragraph, as the exception that proves the rule, demonstrates that ] applies to the creation of content pages and that such creations are required to go through BRFA. However, I also agree that BRFA isn't the right place for various reasons, not least that per ] they shouldn't be approving mass creation. I suggest we reword the policy to direct editors first to the village pump, and clarify that once consensus has been obtained there only bot operators need to go through BRFA. | |||
:::{{tqb| A 2022 ] to {{tq|clarify that mass-creation through repetitive editing by hand is not different for policy purposes to automated/semi-automated mass-creation}} and {{tq|make getting consensus for creation prior to mass creation per WP:MASSCREATE mandatory}} ].}} | |||
:::It also failed to get a consensus against the proposal. Given that, I don't think it's appropriate to amend ] to exclude that interpretation when there wasn't a consensus that it is the wrong interpretation. | |||
:::{{tqb|It is possible to create large numbers of articles without automation; I see editors doing it every day at NPP. For example, writing a stub on a species or location from scratch could take as little as ten minutes.}} | |||
:::I might be wrong, but I believe those tend to use boilerplate text - which I consider semi-automation as the boilerplate is a primitive tool. ] (]) 10:09, 7 July 2024 (UTC) | |||
::::I think you are wrong, yes, but it's kind of beside the point. If you consider all forms of mass creation to be semi-automated, then what is the problem with amending the title of the section to read "Mass automated and semi-automated creation"? What's left out? – ] <small>(])</small> 11:37, 7 July 2024 (UTC) | |||
:::::It's redundant, and will make it harder to enforce the policy as editors have previously claimed that their mass creations are manual, even when there is clear evidence to the contrary such as them admitting to using scripts. ] (]) 11:41, 7 July 2024 (UTC) | |||
::::::I agree that it's redundant. I'm suggesting we add it anyway for clarity, because many people come here via a section link and do not realise that this section is part of the bot policy – that's all. I'm not sure I follow how that would make it harder to enforce the policy against people who are lying? – ] <small>(])</small> 11:53, 7 July 2024 (UTC) | |||
::: {{tq|It's part of the bot policy, not the editing policy}} This always seems to get in the way when people start arguing over ]. It's fairly clear the community wants to consider mass creation in general, not just automated mass creation, but for historical raisins ] is in ] and so it has to be "about" bots in some manner. I ], but few were interested in discussing it. ]] 10:50, 7 July 2024 (UTC) | |||
::::{{tq|It's fairly clear the community wants to consider mass creation in general, not just automated mass creation}} – is it? How? As you said yourself, you didn't get support for that interpretation when you proposed it just last year. And as I said, in the 2022 RfC, a proposal to change MASSCREATE to say this explicitly failed, with ] specifically noting opposition on the basis that {{tq|human editing falls outside of the scope of bot policy}}. | |||
::::I think if you guys want MASSCREATE to apply to all articles you should obtain a consensus and then move it to the editing policy. In the mean time, what is wrong with clarifying in the title that a section of the ''bot policy'' applies to automated edits, using words copied verbatim from that section? – ] <small>(])</small> 11:29, 7 July 2024 (UTC) | |||
:::::{{tqb|I think if you guys want MASSCREATE to apply to all articles you should obtain a consensus}} | |||
:::::Doesn't that apply equally in the opposite direction? If you don't want it to apply to manual mass creations, you should obtain a consensus? ] (]) 11:41, 7 July 2024 (UTC) | |||
::::::I'm not proposing to change anything. You just said yourself that my edit was "redundant", i.e. it merely restates what is already there (verbatim). – ] <small>(])</small> 11:43, 7 July 2024 (UTC) | |||
:::::::I think you have misinterpreted what I am saying. I see it as redundant because I see it as a tautology. The current text acknowledges that the policy applies to manual mass creation - regardless of my personal views on whether such a thing is possible - through the final paragraph which, as the exception that proves the rule, makes it clear that the ] mass creation of content pages is required to go through BRFA. ] (]) 11:44, 7 July 2024 (UTC) | |||
::::::::The final paragraph says that automated, semi-automated or bot-like creation of non-content pages do not need to go through BRFA. I don't see how that's relevant? – ] <small>(])</small> 11:51, 7 July 2024 (UTC) | |||
:::::::::]. ] (]) 12:00, 7 July 2024 (UTC) | |||
::::::] applies to all mass creation of articles, both from bots and from ]. If you edit in a bot-like manner, it does not matter if you are actually a bot or just a random person making articles quickly from boilerplate text.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 11:45, 7 July 2024 (UTC) | |||
:::::::{{ping|Headbomb}} Yep, there's absolutely no disagreement on that point. My edit added the words "automated and semi-automated" to the section heading, and as ] defines bot-like editing is equivalent to automated/semi-automated editing, the meaning remains unaltered. – ] <small>(])</small> 11:47, 7 July 2024 (UTC) | |||
::::::::In that case, can I propose a compromise? Title the section {{tq|Mass automated, semi-automated, or meatbot page creation}}. "meatbot" could perhaps be replaced with "bot-like". ] (]) 11:52, 7 July 2024 (UTC) | |||
:::::::::Sure, I'm good with that. I'd avoid 'meatbot' – it's not the most dignified piece of wikislang. Although actually, since it's getting a bit of a mouthful, do we really need the word "mass"? Nobody's using bots, silicon or flesh, to create one or two articles, right? – ] <small>(])</small> 11:55, 7 July 2024 (UTC) | |||
::::::::: {{ec}} Ugh. Please let's not make a long and confusing heading. ]] 11:57, 7 July 2024 (UTC) | |||
::::::::::Agreed. If the meaning remains unaltered, then there is no reason to make the change. ] (]) 12:39, 7 July 2024 (UTC) | |||
:::::::::::The reason I've suggested above is that because many people come here via a section link, they don't realise that this section is part of the bot policy and so end up reading it out of context. Most other sections already have the word "bot" in their title or shortcut, which ameliorates that. – ] <small>(])</small> 12:46, 7 July 2024 (UTC) | |||
::::: {{ec|2}} {{tq|As you said yourself, you didn't get support for that interpretation when you proposed it just last year}} No, I said I didn't get much discussion at all. More specifically, ] and ] supported, ] refused to consider it outside the context of a full rewrite, and ] took it on a bit of a tangent. No one else replied. {{tq|is it? How?}} Have you read through the actual discussions, with an eye for how ] being in ] restricts how people can consider ways of handling mass creation? Look at the very close you linked as "]", three of the seven oppose bullets hinge on "] can't regulate non-bot behavior". {{tq|I think if you guys want MASSCREATE to apply to all articles}} Personally I don't care. I'm just sick of ] and ] getting bent out of shape when people like you and BilledMammal argue over non-bot creations. ]] 11:57, 7 July 2024 (UTC) | |||
::::::::The main issue here is that people are trying to solve a problem that isn't a problem. If you've got an idiot on a stub-creating campaign using a boilerplate {{xt|"X is a fictional small village in the Chronicles of Narnia.<ref>CS LEWIS "The Chronicles of Narnia"</ref>}}, you can block them under ], ], ], ], ]... and per ] the method they use to create these undesired stubs is irrelevant. If it's disruptive, it must stop. This should be straightforward to understand.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 12:26, 7 July 2024 (UTC) | |||
:::::::::The problem is enforcement. While it's clear that editors who wish to create significant numbers of nearly-identical articles are required to get approval from the community, it is difficult to determine an action when they fail to do so, and the articles they created are usually accepted as fait accompli - Lugnuts is the clearest example of this. We need a streamlined process to stop editors who are engaged in mass creation without approval, and to remove the articles created in violation of this policy. ] (]) 12:35, 7 July 2024 (UTC) | |||
::::::::::I think you'd probably want to start by making it more obvious where and how they're supposed to get that approval. Above you imply that people falsely claim to be creating articles by hand to evade this policy. Maybe that happens sometimes. But I think there's a larger group of editors who are, say, creating stubs on similar topics by copying and pasting the last stub and changing the details, who genuinely don't think that the "bot policy" has anything relevant to them. Even if they did find their way to making a BRFA, as directed by ], they'd certainly conclude they were in the wrong place when asked to "create an account for your bot", specify "the computer language that this bot will be written in. E.g. Python, Java, C, VB, AutoWikiBrowser", provide "a link to the source code", and so on. The cruellest thing we do on this project is punish people for doing things that we never told them were forbidden. You have to set out the process before you can expect people to follow it. – ] <small>(])</small> 13:19, 7 July 2024 (UTC) | |||
::::::::::"The problem is enforcement." If the problem is enforcement, fix enforcement. As for Lugnuts, he was banned in 2021 from creating stubs under 500 words. And with Lugnuts, the problem never was policy, but ]. And that's why he's now indef banned.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 14:03, 7 July 2024 (UTC) | |||
:::::::::::I'm not sure that there's anything to enforce. When was the last time you saw someone creating more than 25 articles per day? It's unusual for anyone to even create 25 articles per week, and I don't think anyone has created 20–50 articles per day for any sustained, uninterrupted period of time. Even when ] was creating 15,000+ articles per year, it was often 150 this day and 200 the next, but then nothing (or very little) for the next several days. ] (]) 03:11, 8 July 2024 (UTC) | |||
* {{ping|Anomie}} You've reverted the compromise suggested by BilledMammal. Could you please explain why? This is not an RfC – nobody is being asked to !vote support/oppose. I can see that you said "Please let's not make a long and confusing heading", which I tried to do, and Primefac asked (after I had made the edit) what the reason for it would be, which I answered. – ] <small>(])</small> 13:34, 7 July 2024 (UTC) | |||
** Exactly because Primefac and I opposed the change, and I've made the counterproposal in the section below to address the concerns you have. ]] 13:52, 7 July 2024 (UTC) | |||
**:As you know, ], and just saying you oppose something doesn't get us any closer to that. I think your idea to split the section is a good one but we needn't wait to see whether it gets consensus to fix this title. Do I understand correctly that your objection to "Automated, semi-automated or bot-like page creation" is that it's too long? In which case, how about "Bot or bot-like page creation", which aligns with other sections on this page and is no longer than most of them. – ] <small>(])</small> 14:08, 7 July 2024 (UTC) | |||
*:The 'compromise' is bad and inaccurate. The issue is mass-creation, not "Automated, semi-automated or bot-like page creation" because that literally means any page creation whatsoever.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 14:09, 7 July 2024 (UTC) | |||
*::BilledMammal's original suggestion was indeed "''Mass'' automated, semi-automated or bot-like page creation", which I'm also fine with. I dropped the "mass" to try and address Anomie's complaint that it was too wordy, but he reverted anyway. – ] <small>(])</small> 14:13, 7 July 2024 (UTC) | |||
*:::FWIW I'm not so sure about that addition, having read the context. Too easy to make it seem like "automated and semi-automated" includes "bot-like editing", and I don't see why the addition has any benefit. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 14:22, 7 July 2024 (UTC) | |||
*The policy is currently that automated/semi-automated creation has to get authorization, plus a line that MEATBOT applies. MEATBOT, in turn, is almost entirely about making mistakes while editing quickly. The only clue in MEATBOT that it could extend beyond holding people accountable for their mass-mistakes is {{tq|processes which operate at higher speeds, with a higher volume of edits, or with less human involvement are more likely to be treated as bots}}. That seems reasonable to me. It doesn't prohibit any manual creation (and, BM, can we just stop with this argument that you alone make that "semi-automated editing tools" extends to include things like Microsoft Word or a boilerplate stored in notepad?), but if you go really fast and hard despite urges to slow down -- and especially if you make mistakes -- you may be asked to go through the bot authorization process. The problem is we seem to have a handful of "this 100% applies to everyone making more than a couple articles" folks on the front line, so it would help to have some additional clarity as to when going fast turns into bot-like editing of the sort that needs preauthorization. Unfortunately, ]. :/ In light of all this, I don't quite understand the purpose of the heading change or its reversion. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 14:06, 7 July 2024 (UTC) | |||
*:See the Narnia example. And if it's not clear, MEATBOT is clear. We don't care how you do it, if it's disruptive, stop.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 14:11, 7 July 2024 (UTC) | |||
*::{{tq|if it's disruptive}} - The problem is, some users consider ''any'' fast article creation disruptive. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 14:15, 7 July 2024 (UTC) | |||
*:::Exactly, that's why I changed the heading – sometimes you get people badgering other users for creating more than 25 articles at once manually, citing ], and either missing or deliberately overlooking the fact that it's in the bot policy and therefore can only be read within the context of bot or bot-like editing. As for why it was reverted, I'm stumped too. First it was because it changed the meaning, then it was because it didn't change the meaning, then it was because it was too long, now it's because we should split it instead. I think. It's hard to keep up. – ] <small>(])</small> 14:20, 7 July 2024 (UTC) | |||
*::::{{ping|Joe Roe}} Regarding {{tq|sometimes you get people badgering other users for creating more than 25 articles at once manually, citing ]}}, can you give some examples? ] (]) 02:56, 8 July 2024 (UTC) | |||
*:::::Or even three articles: ]. ] (]) 03:08, 8 July 2024 (UTC) | |||
*:::::: , and looking at a few of those they weren’t manual - they were boilerplate. ] (]) 03:20, 8 July 2024 (UTC) | |||
*:::::::I specifically refer to the statement that "creating <u>three articles</u> between 18.44 and 18.47 is a much a higher frequency than 25-50 per day". | |||
*:::::::Manual edits can be boilerplate, just like automated edits don't have to be boilerplate. ] (]) 04:33, 8 July 2024 (UTC) | |||
*:::::::I went through all of ] back to 2020. There was only one day in which that tool counts 25 articles (21 November 2022). They never exceeded that level, and rarely came close to it. However, for that date finds only 22, and six of those are redirects. ] (]) 04:53, 8 July 2024 (UTC) | |||
*::::::::{{ec}} I think their point with that statement was that if you are creating three articles in three minutes, you're obviously not doing it manually. | |||
*::::::::However, we're getting off topic here. Examples of editors being badgered for genuine manual creations would be helpful to see, if you have them. ] (]) 04:57, 8 July 2024 (UTC) | |||
*:::::::::Is your definition of "genuine manual creations" approximately "using completely different wording and sources in each article"? ] (]) 05:05, 8 July 2024 (UTC) | |||
*::::::::::No, but I don't think us discussing this is going to be productive, so I will step back now. If editors like Joe have examples or want to discuss further, I will happily do so. ] (]) 05:25, 8 July 2024 (UTC) | |||
{{reflist-talk}} | |||
=== Kicking it out of botpol? === | |||
{{discussion top|reason=RFC posted below. ]] 23:16, 9 July 2024 (UTC)}} | |||
I've drafted an RFC at ]. Anyone have comments before I post it somewhere? Opinions as to whether we should do it here or ]? ]] 13:26, 7 July 2024 (UTC) | |||
:I think that's a good idea. Even if the content doesn't change, this discussion illustrates the difficulty of relying on a local consensus of technically-focused editors to manage a policy on article creation. I don't think you need to do an RfC, though. A consensus of editors on this page that the section is no longer in scope would be sufficient, since we're only moving accepted policy around, not significantly changing it. – ] <small>(])</small> 13:42, 7 July 2024 (UTC) | |||
::I don't see what's to be gained by separating it from botpol. It's clearly bot-related.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 13:59, 7 July 2024 (UTC) | |||
:::Except that people above are insisting that it also applies to creations that don't involve bots or bot-like edits, and therefore accurately titling it as part of the bot policy is unacceptable. – ] <small>(])</small> 14:15, 7 July 2024 (UTC) | |||
::: It's not always bot related. Various discussions have wanted to consider more manual mass creations as well, but have had to struggle against it being part of ]. So some have used that as an objection, and others try to stretch ] to somehow make it apply. Even in ] there was concern over restricting it to bots. ]] 16:38, 7 July 2024 (UTC) | |||
:Thanks for getting it started. Not opposed to this in principle, but in this draft while the moved policy seems like it retains the same meaning, the summary text left behind says something different. Mainly, you've created a new page that applies to "automated and semiautomated content page creation" and summarized it with a line saying {{tq| Mass page creation requires approval by the community}}. Probably an assumption is built in because of the scope of the BOTPOL, but it would be good to spell out. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 14:20, 7 July 2024 (UTC) | |||
:: The controlling policy on that is the new mass-creation page rather than botpol anyway, the point is to note the mass creation policy exists rather than to restate it in every particular. I'm wary of putting too much in here that may easily become obsolete once people have the opportunity to discuss just how much they want it to cover non-bot mass creations, but if others want to nitpick it to that extent too then 🤷. ]] 16:38, 7 July 2024 (UTC) | |||
:::Regardless, right now the summary left behind in the draft changes policy. If your intent isn't to run such an RfC, that line would need to change. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 12:37, 8 July 2024 (UTC) | |||
:::: I still think you're over-interpreting it, but I adjusted the wording slightly to try to make you happy. ]] 23:43, 8 July 2024 (UTC) | |||
:It’s a good idea. Let’s get this done, and then we can discuss other changes, such as the one proposed below. ] (]) 02:14, 8 July 2024 (UTC) | |||
:@], I think that the RFC question may be so long that many editors won't read it. ] (]) 01:11, 8 July 2024 (UTC) | |||
:: I know you think no one will read more than the headline of anything, although why you think even "Should ] be severed from ]?" is too long I have no idea. That's the question, which I bolded to make it easy to pick out. The part before is background and everything after is defining what ''exactly'' that means because experience tells me that otherwise people will start arguing over how to rewrite the whole thing and we'll wind up with no consensus for anything. ]] 11:10, 8 July 2024 (UTC) | |||
:::Your sandbox contains 730 words, which is more than most editors will read. | |||
:::Even if you place your signature (everything before the timestamp is "the RFC question") after the bold-face question, that's 138 words. I'd rate that as being possible, but still being longer than the average RFC question. ] (]) 17:09, 8 July 2024 (UTC) | |||
{{discussion bottom}} | |||
===Defining mass creation as >50 articles per day=== | |||
:Separately, I've been wondering whether the way to address BilledMammal's (specifically his) ongoing concerns about MASSCREATE is to explain it in specific, unambiguous detail. When you picked that quotation from @] out of the 2009 RFC, there were other options: | |||
:* "anything more than 25 or 50" | |||
:* "rapid creation" | |||
:* "in a rapid manner" | |||
:* "25-50 articles per day" | |||
:* "25–50<code>+</code> articles per day" | |||
:* "clicking "save" every 5-10 seconds" | |||
:* "more than 50 articles in a short period" | |||
:* "more than 50 articles in a short amount of time". | |||
:Thinking back on BilledMammal's multiple attempts to get rid of articles or prevent future creations, the general themes seem (to me) to be: | |||
:* He interprets "25 to 50" as having no time limit whatsoever. If you create one article a week, a year from now, you may be guilty of "mass creating" articles. | |||
:* He is primarily concerned about very short, very similar fill-in-the-blank articles, especially if it cites the same source as all the others, and most especially if that source is a database. For example, "_____ is a British cricket player" or "_____ is a fungus in the genus ______". | |||
:I think if we replaced the quotation with a more detailed summary, that would resolve quite a lot of this. | |||
{{difftext|While no specific definition of "large-scale" was decided, a suggestion of "anything more than 25 or 50" was not opposed.|While no specific definition of "large-scale" was decided, editors who want to create more than 50 articles in any 24-hour period should obtain prior approval. | |||
}} | |||
:I do not expect this to make the anti-stub editors happy, but it would provide clarity about when creating a lot of articles is actually a WP:MASSCREATION matter, and when it's just creating a lot of articles. | |||
:BTW, to the best of my knowledge, there have never been any actual mass creation attempts that were not automated or semi-automated. The idea that someone could manually write 50+ articles per day is not realistic. ] (]) 01:11, 8 July 2024 (UTC) | |||
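:As an illustration only, the proposed wording "more than 50 articles in any 24-hour period" can be read as a sliding-window count over page-creation timestamps. The sketch below assumes that reading; the function name <code>exceeds_limit</code> and its parameters are hypothetical and are not part of the policy or of any existing tool.

```python
from datetime import datetime, timedelta


def exceeds_limit(creation_times, limit=50, window=timedelta(hours=24)):
    """Return True if more than `limit` creations fall inside any span shorter than `window`.

    Hypothetical helper: `limit` and `window` mirror the proposed wording
    ("more than 50 articles in any 24-hour period"), not any agreed policy.
    """
    times = sorted(creation_times)
    start = 0
    for end, current in enumerate(times):
        # Move the left edge forward until the span from times[start]
        # to `current` is strictly shorter than the window.
        while current - times[start] >= window:
            start += 1
        # Count of creations inside the current window.
        if end - start + 1 > limit:
            return True
    return False


base = datetime(2024, 7, 8)
burst = [base + timedelta(minutes=i) for i in range(51)]  # 51 pages in under an hour
weekly = [base + timedelta(weeks=i) for i in range(60)]   # one page a week for 60 weeks

print(exceeds_limit(burst))
print(exceeds_limit(weekly))
```

:Under this reading, a burst of 51 creations in one hour trips the check, while 60 creations spread one per week do not, which is the distinction between "per day" and "no time limit" drawn in the comment above.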
::More than 50 per day would be more than 18,250 per year. For context, it was very rare for Lugnuts to exceed fifty articles per day. | |||
::This change would result in the policy endorsing mass creation, not requiring it to get community approval. You’ve also misunderstood my interpretation of this; only similar articles created using mass creation techniques count towards the limit. | |||
::I’ve also split this into a separate section, to avoid derailing Anomie’s proposal. ] (]) 01:25, 8 July 2024 (UTC) | |||
:::Lugnuts' problem was IDHT, not that MASSCREATION was unclear. And 50 a day is too high a limit. 50 in a short term is better. 25 in a short term is also OK by me. We can leave that part undefined per "you know it when you see it" because as soon as you set a precise number, someone will go "but I made sure to edit at "X-1/time period", so MASSCREATION doesn't apply!"  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 01:51, 8 July 2024 (UTC) | |||
::::The Lugnuts situation came from two issues; ], and because the lack of clarity in ] made it hard for the community to enforce and thus address the IDHT issue. | |||
::::Largely agree on leaving it undefined; if an editor is creating 30 boilerplate articles a week for many months, then that’s obviously mass creation that requires community review and approval. ] (]) 01:57, 8 July 2024 (UTC) | |||
:::::MASSCREATE doesn't have anything to do with "boilerplate articles". That's your idea. It's never been part of the policy. ] (]) 02:01, 8 July 2024 (UTC) | |||
::::::Falls under MEATBOT and/or semi-automated. ] (]) 02:08, 8 July 2024 (UTC) | |||
:::::::Automated and semi-automated article creation does not have to use a boilerplate. (See also ].) | |||
:::::::MEATBOT applies to "high-speed or large-scale edits that a) are contrary to consensus or b) cause errors an attentive human would not make". It has nothing to do with the edits being "boilerplate" or repetitive in any way. ] (]) 02:57, 8 July 2024 (UTC) | |||
::::::::If you're creating boilerplate articles, you're behaving like a bot. MASSCREATE doesn't prohibit boilerplate articles, but it does say that if you want to do that on a large scale, i.e. more than 25-50, you need consensus to do so.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 08:02, 8 July 2024 (UTC) | |||
:::::::::No, you're behaving like a human who used a boilerplate. Ditto if I take the time to write 20 totally different articles but publish them all at the same time. Even if, against the odds, you found consensus for calling the use of a boilerplate to manually create articles without errors a ] issue, it still doesn't fall under that 25-50 rule, which is specifically about automated or semiautomated editing. — <samp>] <sup style="font-size:80%;">]</sup></samp> \\ 12:42, 8 July 2024 (UTC) | |||
::::::::::"behaving like a human who used a boilerplate", that's exactly what a ] is. Again, it does not matter if you use an actual bot, semi-automation, or do things fully manually, if what you are doing is disruptive, you must stop.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 14:38, 8 July 2024 (UTC) | |||
:::::::::::Headbomb, I quoted the relevant sentence from MEATBOT for you. MEATBOT does not actually ''say'' anything about boilerplates or repetitive tasks. It might be typical to interpret it that way, but it does not actually ''say'' that. | |||
:::::::::::It does say that editing against consensus (e.g., being disruptive) is unacceptable regardless of the method used to edit against consensus. ] (]) 17:18, 8 July 2024 (UTC) | |||
::::::::::::]: A human (made of meat, unlike a robot) editor that makes a large amount of repetitive edits from their own account, often with semi-automated tools, much like a bot would. For the purpose of dispute resolution, it is irrelevant if edits are made by actual bots or by meatbots. See also WP:MEATBOT. | |||
::::::::::::Boilerplate editing is bot-like editing. Which, again, for the purpose of dispute resolution, is irrelevant, because if what you're doing is disruptive, you must stop and discuss and get consensus for what you're doing. I don't know why that's so hard to understand. <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 17:24, 8 July 2024 (UTC) | |||
:::::::::::::] is not the policy. | |||
:::::::::::::Nobody claims MEATBOT if you find and fix the same typo once a day for a year, because that's ''not'' bot-like editing. | |||
:::::::::::::Nobody claims MEATBOT if you use the same format ("a boilerplate") to write a single article each day for a year, because that's ''not'' bot-like editing. ] (]) 17:27, 8 July 2024 (UTC) | |||
::::::::::::::That is ''exactly'' what bot-like editing is. That it's happening slowly is irrelevant.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 18:31, 8 July 2024 (UTC) | |||
:::::::: ] was created specifically because we had trouble with an editor trying to claim that the bot policy didn't apply to their bot-like editing because they were completely manually filling in a boilerplate with no automation at all. The intent was to cut the knot with a ], and "or large-scale" was specifically included to apply to "slow and steady" bot-like editing. See ] for the original discussion. Possibly some of the arguments in here have gotten things reversed or taken it too far (I'm not feeling up to reading through all of it in enough detail to work that out), but {{tq|q=y|It has nothing to do with the edits being "boilerplate" or repetitive in any way}} is wrong. ]] 23:39, 8 July 2024 (UTC) | |||
:::BilledMammal, this change would result in the policy more precisely representing what the 2009 RFC (the one that eventually resulted in its creation) actually said. I grant that this would make it more difficult for editors to make up their own claims about what it says (e.g., that it was intended to prevent editors from creating more than 50 articles ever – a limit you're coming up on, by the way). | |||
:::It would be hardly surprising if Lugnuts usually complied with MASSCREATE in at least some minimal fashion, since about 90% of his article creations were after the RFC that led to the MASSCREATE rule. MASSCREATE was not about Lugnuts; it was primarily about an editor who was creating more than a thousand articles per month, and sometimes hundreds per day, with only a few seconds in between each article, and the effect that this volume had on review processes. <small>Also, he's written more FAs than you've written articles of any kind, so please don't assume that he's a bad editor or doesn't know what he's doing.</small> | |||
:::If you want to make MASSCREATE stricter, then you could make such a proposal, but a sound basis for that future discussion would first be understanding what the long-standing rule actually says (25–50 per day, not per month/year/lifetime), what it was supposed to do (avoid overwhelming review processes and give admins a chance to stop CSD-worthy problems before there were hundreds or thousands of articles to deal with), and how it has or hasn't worked for us (e.g., it has stopped flooding review queues, but it hasn't stopped the creation of low-quality articles). ] (]) 01:54, 8 July 2024 (UTC)
::::I also highly question the need to do anything with our mass creation policy if the primary objective is to retroactively prevent an indef banned editor from IDHT behaviour.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 01:57, 8 July 2024 (UTC) | |||
:::::I use Lugnuts as a convenient benchmark to determine whether a proposal is non-viable; because the community considers his creations to be mass creations, any proposal that would redefine MASSCREATE in such a way that his creations would not be covered is very likely to be rejected. ] (]) 02:06, 8 July 2024 (UTC) | |||
::::{{tqb| (e.g., that it was intended to prevent editors from creating more than 50 articles ever – a limit you're coming up on, by the way)}} | |||
::::I don’t think anyone - who isn’t making a ] - interprets it that way, so can we please stop using that interpretation as a reason it’s problematic? It’s a straw man. | |||
::::In any case, the 2009 RfC was 15 years ago. It is too late to contest that close; if you think the wording is wrong, then open a new RfC. ] (]) 02:00, 8 July 2024 (UTC) | |||
:::::We don't need an RFC to choose a different quotation from the 2009 RFC, or to re-word it so that it accurately represents the 2009 RFC without using a direct quotation. ] (]) 02:03, 8 July 2024 (UTC) | |||
::::::We do, because any substantial ] change to MASSCREATE is certain to be reverted - and what you propose is going to be seen by many editors as substantial, even if you disagree. ] (]) 02:06, 8 July 2024 (UTC) | |||
:::::::There's a stage in between ] and an RFC, called "forming consensus on the talk page". | |||
:::::::BTW, if you want to talk about strawman arguments, I suggest looking at the one implying that if the MASSCREATE approval process kicks in at 50 per day, then someone might actually create 50 articles per day, 365 days per year, and that the community would be helpless to stop them (assuming we wanted to, which is not always the case). ] (]) 02:10, 8 July 2024 (UTC) | |||
:: I think you're misinterpreting the 2009 RFC. As I read it, the "anything more than 25 or 50" suggestion was not strictly time limited, but was limited to a "task". Extremely fast creation is one problem, but "slow and steady" can also add up to a problem. Some of the replies focused on speed, while others did not. An advantage of choosing a quote from the proposal rather than some other comment is that ''it was the proposal'' that everyone should have read (even if they didn't).{{pb}}Also you probably shouldn't be ignoring the ], where ] (with much more nuance than this proposal) was rejected. ]] 11:35, 8 July 2024 (UTC) | |||
:::@], that quote wasn't from the proposal. The proposal was very short: "Proposal: Any large-scale semi-/automated article creation task require ]". | |||
:::You quoted part of the OP's vote in favor of his own proposal, from a sentence that said ""Large-scale" is up for discussion, but I would say anything more than 25 or 50." Every single !vote that mentioned those numbers afterwards specified that this was to be interpreted per day or in a short time period. I think this was specified precisely because "slow and steady" does ''not'' cause the problems that they were trying to solve (e.g., too many articles for the review processes to handle in a single day or bot-like editing). | |||
:::Nobody minds if someone creates one article a day for a month. That never overwhelms review processes. That is never considered bot-like editing. ] (]) 17:25, 8 July 2024 (UTC) | |||
:::: {{tq|You quoted part of the OP's vote in favor of his own proposal}} You see it as the vote. I see it as a clarification of the proposal. Not everything has to be in the headline. 🤷{{pb}}{{tq|Every single !vote that mentioned those numbers afterwards specified that this was to be interpreted per day or in a short time period}} Considering there were only three such !votes, two of which were opposes, I don't find that a very convincing argument. Meanwhile, a not insignificant number of the comments talk about mass creation needing review without reference to the rate of that creation.{{pb}}{{tq| is never considered bot-like editing.}} Did I say it was? ]] 23:09, 8 July 2024 (UTC) | |||
:::::{{xt|Considering there were only three such !votes}}, I count four: | |||
:::::# "25-50 articles per day" | |||
:::::# "25–50<code>+</code> articles per day" | |||
:::::# "more than 50 articles in a short period" | |||
:::::# "more than 50 articles in a short amount of time". | |||
:::::plus several more specifically talking about "rapid" editing (e.g., "clicking "save" every 5-10 seconds"). One article per day does not involve clicking "save" every 5 or 10 seconds, even if @] says ] that something as small and slow as fixing one typo each day {{xt|is ''exactly'' what bot-like editing is. That it's happening slowly is irrelevant.}} | |||
:::::Note, too, the section heading you linked above, which was "]", not "Clarification of <u>slow and steady</u> human editing" or even "Clarification of human editing that is <u>repetitive but happening at an average manual speed</u>". The problem with high-speed editing, no matter what method is being used for it, is that someone can make a huge mess before anyone has a chance to notice. Slow and steady, no matter what method is being used for it, does not have that same risk. ] (]) 00:09, 10 July 2024 (UTC) | |||
:::::: Your #2 is not a !vote, it's a comment in reply to someone else's !vote. You're the one who limited it to !votes mentioning those specific numbers. AnomieBOT has several tasks that only edit once per day, and in the past had one that only edited once per month, if that can illustrate for you that bot editing doesn't have to be fast. As for the section heading, are you trying to prove that ''you're'' one of the people who only reads the heading and not the discussion? And I note people can also make a huge mess while editing slowly to ]. ]] 00:22, 10 July 2024 (UTC) | |||
:::::::The purpose of MEATBOT is not to prevent people from editing. It's to prevent people from editing so quickly, in such enormous volume, that the rest of us are at risk of having a huge mess to clean up later. One edit per day does not have that risk. Hundreds of edits per hour does. ] (]) 00:32, 10 July 2024 (UTC) | |||
:::::::: 🤷 Well, you're free to believe whatever you want, no matter how wrong it may be. ]] 01:15, 10 July 2024 (UTC) | |||
:::IMO it's also factored in that if editors are doing some real work on each article that they create, we don't want to discourage that, '''''and vice versa'''''. No exact minimum, but something more than a stub from a database or database-like source. <b style="color: #0000cc;">''North8000''</b> (]) 19:51, 8 July 2024 (UTC) | |||
::::This factor is not mentioned in MEATBOT or MASSCREATE, but I assume it would be considered by the community if someone made a proposal under either policy provision. ] (]) 00:37, 10 July 2024 (UTC) | |||
:::I think that two reasons for attention to this are:
:::#Even though it's on the bot page, it's our main or only real rules regarding even non-bot mass creation. (unless I'm a real dummy in that area) | |||
:::#There have been many discussions at different wp:notability pages where a common sentiment was avoiding mass creation, but it then gets said "but that is covered elsewhere", so it's important that it really is effectively "covered elsewhere". So another reason is that it's really needed to enable evolution of wp:notability guidelines. | |||
:::<b style="color: #0000cc;">''North8000''</b> (]) 19:51, 8 July 2024 (UTC) | |||
::::It doesn't cover mass creation that doesn't involve a bot or bot-like editing, because it's in the bot policy, and because it explicitly says so ({{tq|Any '''large-scale automated or semi-automated''' content page creation task }}). This has been discussed at great length recently; see above. – ] <small>(])</small> 11:08, 9 July 2024 (UTC) | |||
::I also don't think that a hard limit is the way to go here. It's practically guaranteed to encourage gaming, i.e. posting pregenerated articles exactly every 28 minutes. It also doesn't address what I think Headbomb is trying to get at, which is that it's the disruptive outcome that is the problem, not exactly how it happened.
::If anything, I'd go in the other direction. Instead of trying to define mass creation, identify the problems it causes, and shift the guideline to address those. So you'd say that we don't want people to create articles so fast that they overwhelm the ability of other editors to patrol them, or create articles without checking that the individual contents and formatting are correct, or create articles from a single source without a strong expectation of notability, that kind of thing.
::That also fully detaches it from the bot policy, which I'm more and more convinced it should be. Using bots without approval is already forbidden, for anything. Why do we need to restate that it is extra forbidden for creating lots of articles? – ] <small>(])</small> 11:27, 9 July 2024 (UTC) | |||
:::The problems are: | |||
:::# The articles might be bad (e.g., non-notable, even hoaxes). | |||
:::# And while we get bad articles every hour of the day, we don't want hundreds more bad articles all at once. | |||
:::Posting one article every 28 minutes would actually be great. We would actually prefer to have someone post one new article every 28.8 minutes round the clock than 49 articles at 12:01 a.m. and then disappear. Why? Because if the first few turn out to be really bad, we can block you before you've posted any more. We'd have to clean up (e.g., delete) the handful you've already posted, but we wouldn't have to delete dozens or hundreds. | |||
:::The goal with MEATBOT is to give the community a chance to intervene. Yes, please, be bold and post some articles. But don't dump hundreds on us; trickle them in at a rate that we can actually manage – and by "manage", we mean "determine whether you're making a mess and we need to stop you". Also, if you want to dump hundreds in one go, then please determine whether there's consensus first. In theory, if you're going to dump hundreds of articles in one go, and we both want those topics and approve of your content, we'll approve those. ] created a lot of articles with the consent of the community; of the fully automated output from its first day: the lead plus two sections amounts to 374 words, and every single fact taken from ]. ] (]) 00:30, 10 July 2024 (UTC) | |||
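For reference, the 28.8-minute figure above is just 24 hours divided by 50 creations. A minimal sketch of that arithmetic follows; the function name is hypothetical and does not come from any policy text, and it assumes creations are spread evenly over the day.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_between_posts(articles_per_day: int) -> float:
    """Average gap in minutes if creations are spread evenly over one day."""
    if articles_per_day <= 0:
        raise ValueError("articles_per_day must be positive")
    return MINUTES_PER_DAY / articles_per_day


# 50 per day is the upper figure discussed in this thread.
print(minutes_between_posts(50))  # 28.8
```

The same function gives 57.6 minutes for 25 per day, the lower figure discussed here.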
::I agree (and I think few would disagree) that there must be guidance on non-bot mass creation of articles, whether we acknowledge that this section in the bot policy also applies to non-bot activity or have a separate guideline or policy for that. <b style="color: #0000cc;">''North8000''</b> (]) 13:22, 9 July 2024 (UTC)
== RFC: Sever ] from ] ==
Line 323: | Line 158: | ||
** ]: code available
-->
See: ] | |||
* {{Botlinks|DYKToolsBot}} | |||
** Code | |||
====Discussion====
Line 334: | Line 168: | ||
:A ''mandatory'' policy is a recipe for "consensus" to shut bots down or worse remove bot privs for being a rouge operator. What else could "mandatory" mean? Which is like that Vietnam War saying, "We had to destroy a village to save it" (variations of this quote). Isaacl is exactly right that dumping a bunch of source to GitHub is meaningless for anyone trying to install and operate it. And for some bots, operation requires a lot of training that is not easy to document. -- ]] 23:47, 30 September 2024 (UTC)
::{{tq|shut bots down}}. I imagine any new requirements would have an exception for bots approved before the new requirement. –] <small>(])</small> 00:44, 1 October 2024 (UTC)
:::Where in my post did I say mandatory? I did not, so to disagree with something I didn't say is a little odd. This sub-thread is about taking the first steps - right now we don't even know which bots ''have'' open-source or freely-available code bases, or where they're hosted, etc. ] (]) 12:38, 5 October 2024 (UTC) | |||
:I get where everyone is coming from with the desire to make sure bots keep running smoothly, but it's not clear to me that there's consensus for making open source mandatory. I'm concerned that:
:*The focus is open source and adding extra requirements instead of having succession plans.
Line 339: | Line 174: | ||
:*Grandfathering some bots might mean we end up with all of the downsides that will discourage future development without significantly improving continuity.
:I've only written one bot so far, one that's trivial to set up and also open source, but it likely wouldn't exist if I had to produce open-source code before getting project approval or had been required to use Toolforge for my first project. The problem that {{u|Protection Helper Bot}} solves has been a Phabricator ticket since 2012 and proposed multiple times before and since then (such as ]). We should be encouraging new developers to help solve long-standing problems rather than throwing up roadblocks, even if they seem like low bars to most experienced Misplaced Pages developers. ] (]) 01:03, 1 October 2024 (UTC)
::I never said anything about anything being mandatory. ] (]) 12:38, 5 October 2024 (UTC) | |||
:::You're right, "mandatory" was used by another commenter. However, I do actually believe setting the expectation that core functions {{tpq|should}} make their code available would likely turn that expectation into a requirement in practice. The policy already recommends it and that seems to be interpreted aggressively at times in BRFA discussions. I would also like to understand the current situation before changing the policy. ] (]) 01:21, 6 October 2024 (UTC) | |||
::::Your bot was the exception as it's an adminbot that touches protection of articles. Most BRFAs have no requirement or request to release their source, and don't in practice. ] (]) 21:13, 14 October 2024 (UTC) | |||
:::::I agree with the bot policy that source code for adminbots should be open or the developer {{tpq|must present such code for review upon request from any BAG member or administrator}}. My previous comments should not be interpreted as contradicting that. I designed my bot to be easy for anyone to run by releasing the code as open source and ensuring it's easy to set up. However, I believe it's fair to say that some of the additional requirements that have been discussed would have likely deterred me from submitting a BRFA. ] (]) 23:11, 14 October 2024 (UTC) | |||
:Mandatory or whatever aside, I think there is merit to us having a list of what we think are "essential" bots, along some idea of what the succession strategy for these bots is. (any of: is source available? are they hosted on Toolforge with multiple maintainers?) ] (]) 21:12, 14 October 2024 (UTC) | |||
::I added a couple bots to the list started above. I also included how esoteric their tech stack would be considered these days (ie: how easily could someone take over maintaining it with updates/fixes). Bots using pywikibot or mwbot-rs for example I think are quite accessible. Custom C++ code or even Perl code I'd say is not particularly easy to take over. Realistically there's nothing we can do about these, but it's worth remaining aware of our ].{{pb}}I'm loosely defining "core" as the bot disappearing causing noticeable disruption to the encyclopaedia, some significant process, or otherwise meaningfully impacting the quality of articles. ] (]) 21:46, 14 October 2024 (UTC) | |||
:::I suggest creating a parent list of key English Misplaced Pages processes/ongoing work items, and under them listing the essential automated tasks for those processes/work items. (I understand that some bots may be grouped under multiple processes/work items.) At the very least, it would be helpful to those not familiar with all the bots if the list could include a brief summary of their essential tasks. ] (]) 21:54, 14 October 2024 (UTC)
::::Do you mean something like this? ] ] (]) 22:04, 14 October 2024 (UTC) | |||
:::::Yes. Basically a breakdown by workflow: here's an important process (which might be doing a set of ongoing work items), and here're the key elements that are automated in order to make this sustainable. I was thinking that depending on the size of the lists, or the number of bots that support multiple workflows, it might be worthwhile to keep the bot list with its details separate, and just have the workflow list point to the bots in the bot list. I feel this makes it easier to think about what workflows are absolutely necessary to keep running (and think of ones that are missing from the list), and to know what they rely on. ] (]) 22:23, 14 October 2024 (UTC) | |||
::::::Thanks {{u|isaacl}}, I think this is a good idea. I'd like to suggest a single list for now, unless it transpires that it's common for single bot accounts to do multiple core tasks? I think it's easier than correlating entries across two lists, if we can avoid it. | |||
::::::I am thinking there are a few things we should understand about each bot, rather than just asking "is the source available". I've tried to summarise these in the lead of ]. I give the example of ClueBot NG there - I think the fact that it's an ANN model using C++ means someone outside the core development team is unlikely to be able to pick up that bot as-is and realistically maintain it, as opposed to just running it.
::::::With that in mind, I'm wondering if it might be good to develop a ''simple'' set of criteria to assess a bot against, to serve as a decent ] compared to raw-text comments. e.g. categories like: "source available and executable?" / "multiple maintainers?" / "maintainable tech stack?", on which a bot can get a binary score (good/bad). These categories are mainly just to illustrate the idea. I'm not fixed on what kind of framework we should assess bots against for a realistic 'operational resiliency' strategy. ] (]) 11:06, 16 October 2024 (UTC)
:::::::Could also do a scale of 1-5. Being hosted on Toolforge could add a point, source code published could add a point, active maintainers could add a point, etc. –] <small>(])</small> 13:25, 16 October 2024 (UTC) | |||
::::::::Point systems are pointless (pun unintended). Let's not have a metric that serves no actionable purpose for sake of having one. | |||
::::::::That said, what's the RFC bot? It should go on the core list.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 14:26, 16 October 2024 (UTC) | |||
:::::::::{{replyto|Headbomb}} There are two RFC bots: | |||
:::::::::*{{user|Legobot}} once an hour (i) detects {{tlx|rfc}} transclusions that lack a {{para|rfcid}} parameter, and adds one; (ii) ensures that the next valid timestamp after every existing {{tlx|rfc}} tag is less than thirty days in the past, and if not, removes the {{tlx|rfc}} tag and also removes the RfC statement from all of the listings (such as ]); (iii) checks the ] for each {{tlx|rfc}} transclusion, such as {{para||bio}}, and ensures that the RfC is listed on corresponding pages such as ] | |||
:::::::::*{{user|Yapperbot}} (also once an hour, but half an hour after Legobot) sends messages to user talk pages concerning RfCs where Legobot has recently added a {{para|rfcid}} parameter, see ] | |||
:::::::::HTH. --] 🌹 (]) 15:27, 16 October 2024 (UTC) | |||
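A rough sketch of the expiry check described in (ii) above, assuming signature timestamps in the usual talk-page form, for example 15:27, 16 October 2024 (UTC). This is not Legobot's actual code: the function names, the regular expressions and the handling of malformed timestamps are all illustrative assumptions, and only the Python standard library is used.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed signature format: "HH:MM, D Month YYYY (UTC)".
TIMESTAMP_RE = re.compile(r"(\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4}) \(UTC\)")
# Assumed tag form: "{{rfc" followed by a non-word character, e.g. "{{rfc|bio}}".
RFC_TAG_RE = re.compile(r"\{\{\s*rfc\b", re.IGNORECASE)
RFC_LIFETIME = timedelta(days=30)


def first_valid_timestamp_after_tag(wikitext):
    """Return the first parseable timestamp after the first rfc tag, or None."""
    tag = RFC_TAG_RE.search(wikitext)
    if tag is None:
        return None
    for stamp in TIMESTAMP_RE.finditer(wikitext, tag.end()):
        try:
            parsed = datetime.strptime(stamp.group(1), "%H:%M, %d %B %Y")
        except ValueError:
            continue  # looks like a timestamp but is not a real date or time
        return parsed.replace(tzinfo=timezone.utc)
    return None


def rfc_has_expired(wikitext, now):
    """True if the RfC's opening timestamp is thirty or more days before now."""
    opened = first_valid_timestamp_after_tag(wikitext)
    if opened is None:
        return False  # nothing to judge; leave the tag alone
    return now - opened >= RFC_LIFETIME
```

A real bot would also have to edit the page and update the listings, which this sketch deliberately leaves out.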
::::::::::Was thinking of Legobot. Not sure the other is core, but Legobot is IMO.  <span style="font-variant:small-caps; whitespace:nowrap;">] {] · ] · ] · ]}</span> 15:30, 16 October 2024 (UTC) | |||
:::::::I don't see a lot of utility in a single summary score. I think the number of core bots should remain below the threshold where a group of people could go through them and determine relative priorities for attention. Plus, in a volunteer environment, who works on helping with what bot is going to be highly influenced by personal interest in the associated workflow, in any case. For an individual characteristic like "maintainable tech stack", there could be some usefulness in having a score, to help those not familiar with the details of the related technology to make relative comparisons. I would consider it to be more descriptive than analytic, though, to avoid getting bogged down in its precision. ] (]) 15:55, 16 October 2024 (UTC) | |||
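To make the binary-criteria idea from this sub-thread concrete, here is a minimal sketch that records each criterion separately and reports which ones are unmet, rather than collapsing them into one number. The class name, field names and the example bot are hypothetical; the three criteria are the illustrative categories suggested above, not an agreed framework.

```python
from dataclasses import dataclass

# Illustrative criteria taken from the discussion; not an agreed standard.
CRITERIA = ("source_available", "multiple_maintainers", "maintainable_stack")


@dataclass(frozen=True)
class ResilienceCheck:
    bot: str
    source_available: bool
    multiple_maintainers: bool
    maintainable_stack: bool

    def gaps(self):
        """Names of the criteria this bot does not currently meet."""
        return [name for name in CRITERIA if not getattr(self, name)]


# "ExampleBot" is a made-up bot used only to show the output shape.
check = ResilienceCheck("ExampleBot", source_available=True,
                        multiple_maintainers=False, maintainable_stack=True)
print(check.gaps())  # ['multiple_maintainers']
```

Listing the gaps keeps the result descriptive, which seems closer to the preference expressed above than a 1-5 total.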
===Reuse for bots and tools=== | |||
Somewhat orthogonal to <s>all this</s> the above thread, in an ideal world, I'd love to see greater standardization across bots and tools. I've written a few of my own tools, but I spent a lot of time reinventing a lot of wheels to make them work. I know there's been some progress in this area (pywikibot is certainly a step in the right direction) but there is still a lot of effort expended by people running in different directions. Which in turn makes it harder to have people pick up other people's projects. ] ] 16:35, 16 October 2024 (UTC)
:{{ping|RoySmith}} as I imagine you've already heard, frameworks are wonderful: everyone should have one :-). It's a big challenge to create one that is usable by others, with sufficient documentation. There's {{section link|Help:Creating a bot|Programming languages and libraries}}, but it's mostly just a big list, without much guidance to help someone decide on what to use. I feel there should be a location for programmers to share experiences, but I'm not sure where that is. ] redirects to ], whose header makes it sound more like a co-ordination spot than somewhere to collaborate on development. ] (]) 21:31, 16 October 2024 (UTC) | |||
::Some statistics would go a long way. ] lists a lot of options that nobody would recommend nowadays. While I don’t believe we should mandate any language or toolkit, it would help to inform new developers which languages and toolkits are used in bots in active use, especially more recent bots. At least half of the current BRFAs are Python with Pywikibot. ] (]) 22:18, 16 October 2024 (UTC) | |||
:::I agree that knowing some basic usage info would be helpful. Whenever I look at a third-party library/framework/tool, I want to know how popular it is and how actively it is maintained, in order to get a sense of how likely it is to continue to be maintained in future, how easy it will be to find answers to questions, and how useful others have found it. But circling back to the problem of overhead scaring away developers, this also applies to those creating code and tools for reuse. Tracking this info and keeping it up to date is extra work, and it might be less interesting for a one-person team than working on their project. ] (]) 22:55, 16 October 2024 (UTC)
== Update the global bots section ==
Line 348: | Line 211: | ||
::Agreed. ] (]) 13:16, 12 September 2024 (UTC)
:Why is this change needed, and which specific global bots would help improve the English Misplaced Pages under a more permissive policy? ] (]) 00:18, 30 September 2024 (UTC)
::@] This is more for consistency, because I would normally not request local bot rights on a wiki that's not on the global bots opt-out set unless I know that it does not accept all kinds of global bots. I do not have any specific global bots in mind for this reason. Other wikis in this group include the Russian Misplaced Pages, where global bots are allowed but appear to reference old policies. ] (]) 14:02, 2 October 2024 (UTC)
:::{{tq|I would normally not request local bot rights on a wiki that's not on the global bots opt-out}}. Good point. Maybe we should add our wiki to ] as "not allowed" so that global bot operators don't accidentally run global bots here. –] <small>(])</small> 00:23, 3 October 2024 (UTC) | |||
:The only relevant change I see from 2021 is , which does not say what you say it says. Is there another change somewhere else? ] (]) 19:52, 30 September 2024 (UTC)
::@] That's the one actually - why do you think that it "does not say what you say it says"? ] (]) 14:03, 2 October 2024 (UTC) | |||
:::The change on meta changed the requirements for meta. It did not change the requirements here, so when you say {{tq|I think this requires a change on this wiki as well}}, it reads as if you think we must be consistent with global policy. "Confusing" it is not: we simply have different requirements. ] (]) 17:21, 2 October 2024 (UTC) | |||
::::@] Actually, that is what I was hinting at - "fixing double redirects" is just a single task that I am not convinced is needed as a specific exemption. And I was also saying that en-wiki is not the only wiki in this group, where it appears that the rules were created when the global bot policy was more restrictive. Also the policy change in Meta ''did'' change the rules for every wiki allowing global bots that did not explicitly have a restriction. ] (]) 06:03, 3 October 2024 (UTC) | |||
:::::If I'm understanding correctly, meta has a policy allowing global bots, but that policy doesn't mandate that the bots can be run on the individual wikis without their consent. Each wiki can make its own decisions on what bots it wants to accept. ] (]) 06:56, 3 October 2024 (UTC) | |||
::::::@] Yes and no. The global policies are global and individual wikis cannot "opt-out" of it. However, individual wikis can set preferences in terms of how they want global bots to use their bot flag, but I've seen a few wikis (eg this one) that reference ''old'' Meta policies in doing so (which is what I want to correct) - I've not seen a single wiki that explicitly sets such restrictions ''while'' referencing the updated 2021 rules - nor have I seen any wiki yet have restrictions other than allowing fixing double-redirects/interwiki language links.
::::::And also, regarding "policy doesn't mandate that the bots can be run on the individual wikis without their consent", yes it isn't a "mandate", but the whole point of global bots is to avoid having operators request bot flags on every wiki, and hence if I were a global bot operator, I wouldn't go around asking for permission on global bot-approved wikis, unless I already know that the said wiki does not allow global bots for any approved purpose. And I cannot do this for all of the 800+ wikis that allow global bots either. | |||
::::::TLDR; yes "wiki can make its own decisions on what bots it wants to accept", but to do so would kind of defeat the purpose of global bots. ] (]) 07:31, 3 October 2024 (UTC) | |||
::::::: You're making a false distinction in {{tq|q=y|I wouldn't go around asking for permission on global bot-approved wikis, unless I already know that the said wiki does not allow global bots for any approved purpose}}, which IMO makes your argument unconvincing. If you (as a global-bot operator) know enwiki only allows interwiki-fixing global bots without separate approval, why would you ''not'' ask for permission if you want your double-redirect-fixing bot to run here? If your only complaint is that ] isn't clear enough, an easier solution might be to improve that page. {{tq|q=y|The global policies are global and individual wikis cannot "opt-out" of it.}} Except this one they can. Even if the global bot policy didn't explicitly say so, I think you'd find that we don't necessarily accept here any random "policy" that someone on Meta declares is "global". {{tq|q=y|I've not seen a single wiki that explicitly sets such restrictions ''while'' referencing the updated 2021 rules}} You have, this one. ]] 12:12, 3 October 2024 (UTC) | |||
::::::::@] The bot page links to a discussion from 2008, not 2021. Put it this way: I ''know'' I have to apply for local bot rights. Would someone else not familiar with en.wiki? ] (]) 21:15, 3 October 2024 (UTC) | |||
::::::::: They should if they read ]. ]] 11:27, 4 October 2024 (UTC) | |||
:::::::The global policy says {{tq|The operator should make sure to adhere to the wiki's preference as related to the use of the bot flag.}} It explicitly allows each wiki to make its own decisions on what bots it wants to accept. ] (]) 15:58, 3 October 2024 (UTC) | |||
::::::::@] I don't dispute that, and I also don't dispute that en.wiki is ''not'' doing anything wrong per se. However, I do believe that en.wiki created this exemption in 2008 when Meta rules were different, and I believe that it needs at least a relook in 2024. How and what I'm not too bothered with - the other contributors have a lot more experience than I. Put it this way: I would rather have en.wiki put itself in the global bot opt-out set so that it's clear to everyone that you ''must'' apply for a local bot flag, rather than this weird one-task exception which isn't obvious unless you actually go to the bot policy (and it's not like it's any more difficult for bot operators fixing double-redirects to file a local bot flag request than a global bot operator for any other task). ] (]) 21:21, 3 October 2024 (UTC)
::::::::: As I mentioned back at the beginning of this, if you want a reexamination then creating an RFC is the way to go. If you want to start drafting one (I recommend a draft to reduce the chance of confusing wording issues), feel free. ]] 11:27, 4 October 2024 (UTC)
::::::::::I'm not motivated enough to do it - you all have more experience than I. I just posted here as a suggestion for improvement - it appears to me that the community does not feel this to be worth it. ] (]) 19:37, 4 October 2024 (UTC)
== "]" listed at ] ==
]
The redirect <span class="plainlinks"></span> has been listed at ] to determine whether its use and function meets the ]. Readers of this page are welcome to comment on this redirect at '''{{slink|Misplaced Pages:Redirects for discussion/Log/2024 October 12#Bot policy}}''' until a consensus is reached. <!-- Template:RFDNote --> <span style=white-space:nowrap;>] <span style="background-color:#e6e6fa;padding:2px 5px;border-radius:5px;font-family:Arial black">]</span></span> 20:40, 12 October 2024 (UTC)
Latest revision as of 22:55, 16 October 2024
This is not the place to request a bot, request approval to run a bot, or to complain about an individual bot
This is the talk page for discussing improvements to the Bot policy page.
The project page associated with this talk page is an official policy on Misplaced Pages. Policies have wide acceptance among editors and are considered a standard for all users to follow. Please review policy editing recommendations before making any substantive change to this page. Always remember to keep cool when editing, and don't panic.
RFC: Sever WP:MASSCREATE from WP:BOTPOL
- The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
Should WP:MASSCREATE be severed from WP:Bot policy? 23:15, 9 July 2024 (UTC)
Back in 2009, the community first enacted a restriction on mass creation of articles. The resulting policy was placed in WP:Bot policy since the impetus was mass creation using automated tooling. Even then concern was raised over whether WP:BRFA was the right forum for this, but at the time "good enough" carried the day.
In the years since we've had various discussions where this has become an issue. Possibly the most prominent was Misplaced Pages:Arbitration Committee/Requests for comment/Article creation at scale/Closing statement#Question 17: Amend WP:MASSCREATE, where three of the seven "oppose" bullets hinge on WP:BOTPOL being the wrong place.
Personally I'm tired of seeing WP:BOTPOL and WP:MEATBOT being bent out of shape in the arguments over what sorts of not-entirely-bot mass creations are or can be "covered" by WP:MASSCREATE. Thus I propose the question:
Should WP:MASSCREATE be severed from WP:Bot policy?
Should the answer be yes, the following changes to the text of the policy will be made. The intention is to keep the current meaning as far as possible while removing the bits specific to WP:BOTPOL:
See also: Misplaced Pages:Bot-created articles
Any large-scale automated or semi-automated content page creation task must be approved <del>at Misplaced Pages:Bots/Requests for approval</del> by the community. This requirement initially applied to articles, but has since been expanded to include all "content pages", broadly meaning pages designed to be viewed by readers through the mainspace. These include articles, most visible categories, files hosted on Misplaced Pages, mainspace editnotices, and portals. While no specific definition of "large-scale" was decided, a suggestion of "anything more than 25 or 50" was not opposed. <del>It is also strongly encouraged (and may be required by BAG) that c</del>Community input may be solicited at WP:Village pump (proposals) and the talk pages of any relevant WikiProjects. <del>Bot operators</del> Creators must ensure that all creations are strictly within the terms of their approval.
Per a 2022 RfC, all mass-created articles (except those not required to meet WP:GNG) must cite at least one source which would plausibly contribute to GNG, that is, which constitutes significant coverage in an independent reliable secondary source.
Alternatives to simply creating mass quantities of content pages include creating the pages in small batches or creating the content pages as subpages of a relevant WikiProject to be individually moved to public facing space after each has been reviewed by human editors. While use of these alternatives does not remove the need for <del>a BRFA</del> approval, it may garner more support from the community at large.
Mass creation by automated means may additionally require approval as specified by Misplaced Pages:Bot policy. Approval of a bot for mass creation does not override the need for community consensus for the creation itself, nor does community consensus for a creation override the need for approval of the bot itself.
<del>Note that while</del> Should the answer be yes, I don't much care if the destination is a new standalone policy page, WP:Editing policy, or some other existing policy. In the interest of this not failing due to lack of consensus for where to put it, if there's not consensus for a specific destination then we'll default to "a new standalone policy page" at Misplaced Pages:Mass page creation and people can start a separate merge discussion later if they want.
The bot policy will retain a stub referring to the new policy. The existing redirects such as WP:MASSCREATE will be retargeted.
Main article: Misplaced Pages:Mass page creation (or whatever)
Mass page creation may require approval by the community, in addition to a BRFA if the method of that creation falls under this Bot policy. BAG may require that community approval for any mass content creation exists before considering bot approval.
Approval of a bot for mass creation does not override the need for community consensus for the creation itself, nor does community consensus for a creation override the need for approval of the bot itself. Bot operators must ensure that all creations are strictly within the terms of their approvals.
Poll (sever MASSCREATE from BOTPOL)
- Support As proposer. Anomie⚔ 23:15, 9 July 2024 (UTC)
- Not against but I don't see the point, personally. Headbomb {t · c · p · b} 23:34, 9 July 2024 (UTC)
- Support. The current location of this policy is causing issues and this seems like a very simple way to fix that. Thryduulf (talk) 23:38, 9 July 2024 (UTC)
- Support as I think Anomie gives three practical reasons below. I lean slightly towards adding it as a new section in Misplaced Pages:Editing policy – perhaps one called ==Creating articles== that mostly contains a sentence about Misplaced Pages:Notability, and then the current wording (with these minor adjustments) as a subsection. WhatamIdoing (talk) 04:02, 10 July 2024 (UTC)
- Support moving it to Misplaced Pages:Editing policy, per WAID, Anomie, and what I've said above. – Joe (talk) 12:21, 10 July 2024 (UTC)
- Support as a step towards making this clearer. — Rhododendrites \\ 13:53, 10 July 2024 (UTC)
- Support I wonder whether moving the top half of MEATBOT should be done at the same time, both are about editing rather than bot policy. -- LCU ActivelyDisinterested «@» °∆t° 18:24, 10 July 2024 (UTC)
- I don't suggest doing this at the same time. (Also, I think it would have to be something like the first and third sentences from the first paragraph, which is a level of complexity that should probably be discussed separately.) WhatamIdoing (talk) 04:44, 15 July 2024 (UTC)
- Support Now comes the hard part.....writing it. North8000 (talk) 20:18, 15 July 2024 (UTC)
- This proposal already includes a written-out version of the split policy. – Joe (talk) 11:09, 18 July 2024 (UTC)
- Support per the three reasons given by Anomie below. The current situation always seemed like a strange compromise to me. Pinguinn 🐧 06:59, 16 July 2024 (UTC)
- Support per Anomie, tho I'm not sure I agree with the wording of the stub, however that can be wordsmithed later. Sohom (talk) 20:08, 21 July 2024 (UTC)
- Support per nom. Makes more sense. The status quo doesn't always have to stay just because it technically works. C F A 💬 22:18, 28 July 2024 (UTC)
Discussion (sever MASSCREATE from BOTPOL)
Please don't start trying to discuss any more sweeping changes here. Save those for a separate RFC you can hold after this passes. I ask uninvolved editors to hat any such discussions if people try to start them here, and closers to disregard any !votes calling for such changes. Anomie⚔ 23:15, 9 July 2024 (UTC)
- Reading this again, my concern is that the wording you use, and the removal from BOTPOL, will mean WP:MEATBOT no longer applies, and thus there will be no restrictions on the mass creation of articles by methods such as boilerplates, which some editors argue aren’t covered by semi-automated.
- To prevent this proposal from changing the meaning of this section, can we insert "bot-like" into the first sentence? BilledMammal (talk) 23:20, 9 July 2024 (UTC)
- @BilledMammal where specifically do you propose to add it? Thryduulf (talk) 23:36, 9 July 2024 (UTC)
- "Any large-scale automated, or semi-automated content, or bot-like page creation task" BilledMammal (talk) 00:17, 10 July 2024 (UTC)
- (edit conflict) I disagree that this proposal changes the meaning in that way. The first sentence already maintains the existing wording about "Any large-scale automated or semi-automated content page creation task".
- Also WP:MEATBOT is really just a duck test, it's supposed to stop people from claiming that a policy about automated edits doesn't apply because their edits are manual boilerplate-filling or whatever by saying that if it looks automated then we can treat it as such regardless. It doesn't actually do anything to make boilerplate-driven manual edits fall under WP:MASSCREATE where they aren't already against consensus or are otherwise disruptive.
- Also, IMO you'd probably do better to support this, because if this goes through then "The bot policy can't regulate human behavior" and "it makes no sense for human edits to be approved through BRFA" will no longer be valid objections to a proposal to strike "automated or semi-automated" from the first sentence (because it will no longer be part of the bot policy), and if you can get that through then you won't have to abuse WP:MEATBOT at all. Anomie⚔ 23:43, 9 July 2024 (UTC)
- At the moment, we have a policy that applies to manual bot-like mass creation, while your proposed change removes that aspect.
- Considering the "intention is to keep the current meaning as far as possible while removing the bits specific to WP:BOTPOL", it would make more sense to remove "automated or semi-automated". These are bits specific to BOTPOL, and by removing them you ensure that the section you are removing from BOTPOL actually has applicability outside BOTPOL. BilledMammal (talk) 00:32, 10 July 2024 (UTC)
- I think you're trying to sneak in a wording change that tries to make your existing arguments easier to support. As I asked above, let's do this simple RFC first, then you can try to convince the community at large to accept your changes. Anomie⚔ 01:18, 10 July 2024 (UTC)
- At the moment, you're "trying to sneak in a wording change" that makes the argument harder to support. Because of that, this isn't the simple RfC that I thought it was. BilledMammal (talk) 01:19, 10 July 2024 (UTC)
- Well, you too are free to believe what you want, no matter how wrong it may be. Anomie⚔ 01:22, 10 July 2024 (UTC)
- I don't think you're actually trying to sneak anything in, but I was a little annoyed by you suggesting I was.
- What I do think is that this is a change from the status quo - I think the language I proposed to Thryduulf would maintain the status quo, while the language I proposed to you would change it in the opposite direction. BilledMammal (talk) 01:32, 10 July 2024 (UTC)
- You've convinced yourself that WP:MEATBOT means that if you can squint hard enough to convince yourself that something is "bot-like" then you can expand the scope of WP:BOTPOL to cover clearly human actions, so you want to add "bot-like" to try to bolster that. That's no more correct than WhatamIdoing insisting above that WP:MEATBOT is about preventing high-speed editing and nothing else; she might hypothetically say that removing WP:MASSCREATE from WP:BOTPOL removes an implication that it only applies to high-speed editing (since only rapid editing, in her view, is "bot-like") and so want it to say "Any high-speed large-scale automated or semi-automated content page creation task" to "preserve" that interpretation. Anomie⚔ 02:00, 10 July 2024 (UTC)
- @BilledMammal, I don't think you need to worry about this. MEATBOT applies to all edits that are "high-speed or large-scale edits that a) are contrary to consensus or b) cause errors an attentive human would not make". Even if MASSCREATE ends up on another page, or even if MASSCREATE didn't exist at all, MEATBOT would still apply to the same edits that it does now.
- One way of looking at this is that if this passes, you'd have two Official™ Policies that you could argue were being violated. WhatamIdoing (talk) 03:57, 10 July 2024 (UTC)
@Headbomb: I see at least three points to this:
- Ending the fiction that BAG approves the mass creations. Most of the time we already say "go get consensus at WP:VPR first", and then rubber-stamp it if the bot itself passes trials.
- Getting arguments about how WP:MASSCREATE should apply to non-bots off of this page, which is supposed to be about the bot policy.
- Stopping BilledMammal from having to abuse WP:MEATBOT to argue that WP:MASSCREATE should cover non-bot mass creations, by letting them argue for changing the policy to say that directly instead.
HTH. Anomie⚔ 23:55, 9 July 2024 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Open source bots
I would like to do a temperature check (explicitly not an RfC) as a follow-up to the discussion at this BRFA, of how people feel about changing the source code requirements. Currently the language in the bot policy is:
Authors of bot processes are encouraged, but not required, to publish the source code of their bot.
and for adminbots: It is recommended that the source code for adminbots be open, but should the operator elect to keep all or part of the code not publicly visible, they must present such code for review upon request from any BAG member or administrator.
I would like to replace it with something like: "Authors of bot processes are expected to publish the source code of their bot in a public manner under an open source license to facilitate collaboration and forking. Should an author wish to keep the source code private, they must request an exemption from BAG during the bot approval process. BAG members may decline requests solely on the basis of source code not being open source." And some kind of grandfathering clause for current closed source bots.
The rationale being 1) allowing people to suggest improvements for bots or point out possible bugs as a technical review step, and 2) when bots/maintainers inevitably disappear, mandate that there is a path for someone else to take over the bot without starting from scratch. I think the Wikimedia movement has moved in this direction, with requiring open source licenses for bots run on Toolforge (the vast majority of our bots) and there's an abundance of places to post your code. Or in other words, open source should be the norm, private should be an exception.
Some carve outs: I'm sure people can come up with edge cases where publishing code isn't desired; if those turn into actual BRFAs, I'm happy to defer the decision to BAG on whether the exception is justified or not. As a practical matter, I think it would be fine if people make changes, test their code, and then publish it shortly after. I don't think we need a hard requirement that code must be open source before you run it against Misplaced Pages.
So temperature check: how do people feel about this? Is this a reasonable proposal? Or if you would not support something like this being formalized, why not? Legoktm (talk) 03:54, 5 September 2024 (UTC)
- Looks ok to me. I expect open source codes for bots as well. It helped with the takeover of the AdminStats task (although we ended up hunting for a copy of the codes in Toolforge itself). However, can the grandfather clause be extended into new task requests of current bots as the new tasks may still utilise the closed-source codes. – robertsky (talk) 04:03, 5 September 2024 (UTC)
- I don't see why we should bake 'they must be open, unless you ask for an exception, which we may deny' into policy, rather than the current 'we encourage, but don't mandate, open source bots'. Headbomb {t · c · p · b} 10:00, 5 September 2024 (UTC)
- @Headbomb: Could you expand on why you don't think it should be done? (I personally don't think we need an exception, I just expected people would oppose it without one) Legoktm (talk) 15:35, 5 September 2024 (UTC)
- I'm also curious about this, but I do see that allowing for exceptions is a reasonable thing to do. Imagine if somebody came to us and said, "The company I work for has an abuse-detection system that's 10x better than what you're using now. We're willing to let you use it at no cost, but unfortunately I cannot make the code available". Having that exception carve-out gives us the ability to accept or refuse that offer as we see fit at the time. Not having that carve-out forces our hand. I think that's reasonable.
- We already have agreements like that with some IP proxy detection vendors (I'm being cagey here about the vendor names only because I'm not sure of the status of these relationships). Because they are only (to the best of my knowledge) used as back-ends to some interactive tools, they don't fall under BAG's remit. But I could certainly imagine somebody wanting to build a BAGgy tool which uses one of those services as a back end. As much as I believe in FOSS everywhere, I also wouldn't want us to shoot ourselves in the foot to stand on principle. RoySmith (talk) 16:04, 5 September 2024 (UTC)
- You might not want to put your code up because it's crude/inelegant. You could also be doing things that are "OK" with private code but aren't OK with public code, like having "if password = sw0rdf!sh continue, else fail" instead of whatever you should be doing with passwords and logins. Or you might be using code from someone else that you got permission to use, but didn't get permission to distribute. Or you may be using closed source code that you purchased, but don't have rights to distribute.
- Like, I'll agree it's unequivocally better to have things open sourced. Hence why it should be encouraged. But volunteer coders are an extremely limited resource, so the fewer barriers to entry/participation we have, the better, IMO. Headbomb {t · c · p · b} 19:12, 5 September 2024 (UTC)
<del>Strong support</del> Requiring bots to be open source seems like a good idea to me for reasons ranging from cultural (<del>supporting the goals</del> promoting the ideals of the Wikimedia movement) to security (code review) to disaster recovery (being able to continue operation of critical services should the original developer disappear). RoySmith (talk) 12:15, 5 September 2024 (UTC)
- I mainly pushed back against this in the above BRFA because I felt it violated current norms. But I am not opposed to it in general if we make a change to BOTPOL. There are major maintainability advantages to having bot code open sourced. Volunteers that write critical code lose interest or go inactive all the time. Honestly maybe a proposal to require Toolforge for non-AWB bots might also be worth considering. The combination of open sourced code plus Toolforge would be the ideal situation for rescuing abandoned bots. Finally, we also had a situation recently where an operator passed away and their bot was immediately blocked and globally locked. Avoiding blocking working bots ASAP, giving time for us to properly fork and replace them, might be worth adding to BOTPOL as well. –Novem Linguae (talk) 16:18, 5 September 2024 (UTC)
- This is a good point; as anybody who has ever tried to port anything knows, just having the source code is only half the battle. Moving things to a new operating environment can be a pain too; requiring that everything runs in Toolforge (or Cloud VPS) would be a good thing IMHO. I'm not sure where you draw the line, however. Some people would insist that everything run in a Docker container. That would drive me nuts. Some people would insist that we only use phab, gitlab, and so on. That would also drive me nuts. RoySmith (talk) 17:01, 5 September 2024 (UTC)
- I think you're underestimating how successive additional requirements hinder attracting new volunteers. It's no big deal to experienced developers, but there is already a lot to navigate as a new developer. Of course, if the goal is to reduce the number of abandoned bots, discouraging new bots will definitely help. I think it would be better to encourage practices like source code availability, succession plans, etc. with a dashboard, recognition, and other approaches. Daniel Quinlan (talk) 20:14, 5 September 2024 (UTC)
- I'm strongly opposed to the proposed change. The current policy encourages open source without being overly restrictive or discouraging of people submitting requests. As an open source developer, I think that's a good thing. But requiring all bots to be open source could discourage some potential projects, especially if they use proprietary code or need to use non-free components. For some projects, there are also security-related reasons to not open source code for the same reasons we have private edit filters. Finally, if there are specific bots that are truly critical and not open source, we should identify those bots and solicit for replacement bots that would be open source, or ask the WMF to write, maintain, and operate replacement bots for those functions. The current policy is well-written. Daniel Quinlan (talk) 21:04, 5 September 2024 (UTC)
- Part of my concern is that if it's necessary to grandfather existing bots, it strongly implies that there would be a chilling effect on future proposals, for both existing and new bots. I’d prefer to start with a review of existing bots to assess their criticality and succession plans, and then consider improvements based on the assessment. Policy changes might be one approach, but I believe that providing encouragements that won't discourage future projects, and can be applied to all bots, would be more effective. Daniel Quinlan (talk) 22:25, 5 September 2024 (UTC)
- I appreciate the feedback that people have given and will reply a bit later after digesting them, but please, can we avoid the bold votes? I would like to focus on the discussion and rationales not ... voting. Legoktm (talk) 21:39, 5 September 2024 (UTC)
- Sorry if the bold came across as harsh, I was following the format of an earlier comment. I appreciate you following up on the discussion to discuss it out in the open which will reduce the odds this is rediscussed in random future BRFAs. Daniel Quinlan (talk) 22:10, 5 September 2024 (UTC)
- I think the circumstances have to be taken into account. If the bot is going to be a one-time simple task, it's probably less critical to have a succession plan in place. If it's a bot that's going to underpin key workflow processes, then having a plan for ensuring that the task can be handed off is more important. I agree that ideally all code would be open source (keeping any necessary private configuration closed) and with a relatively uniform development and runtime environment, but practically speaking, I don't think English Misplaced Pages can afford to limit its potential pool of developers to that degree. isaacl (talk) 23:48, 5 September 2024 (UTC)
- I would unequivocally support something stronger for bots expected to be a continuing run as "we require open source" and would definitely encourage something like the OP even for shorter runs. I'm sorry, but we cannot continue to depend on closed source bots and private source. (I have said as much multiple times now.) And sorry, if your code is shitty, that's how open source works. Either you can get over your fear of publishing something that is hacked together (as if no one else has hacked together code into production, know how modern MediaWiki started?) or you can do something else with your time (which I'm sure will be productive for wiki goals as well). Izno (talk) 18:00, 29 September 2024 (UTC)
- While having code available is a good start, if we're going to introduce some requirements for succession planning, I don't think it should stop there. Too many people think if the source is available, all is good. But if it's Haskell code, for instance, the pool of potential contributors is significantly smaller. So if we start on this route, I think we need to also include things like the bot has to run on the toolforge servers, the language is highly recommended to be from a list of supported languages, and there is at least one other maintainer actively involved. isaacl (talk) 21:34, 29 September 2024 (UTC)
- Before we consider a stricter policy for some cases, we need to be specific about which bots are actually critical. Requiring people to code in specific languages, or get up to speed on Toolforge on top of learning MediaWiki APIs for a bot that quietly does its own thing, isn't mission critical, etc. is going to be counterproductive. We need more people, especially newcomers to Misplaced Pages coding, writing more helpful bots, and we should endeavor to keep the cost of entry as low as reasonably possible. The ideal case is always going to be easily portable and well maintained code, but no requirements are going to keep us from avoiding some inevitable realities of the software lifecycle and the perfect is the enemy of good. Daniel Quinlan (talk) 00:13, 30 September 2024 (UTC)
- Let's tackle one thing at a time. I happen to agree that it makes sense to state an increased expectation for these other qualities, but I think we have to start from "someone could even theoretically pick this up and run with it". Izno (talk) 00:52, 30 September 2024 (UTC)
- I think a little bit more is needed to lay the base to enable someone else to theoretically operate a given bot. It could be one of the following: tool runs on toolforge, there are multiple active maintainers, or there is sufficient up-to-date written documentation that describes the software stack and execution environment. And, unfortunately for aficionados of more obscure languages, I think there should be a list of highly recommended languages. isaacl (talk) 01:28, 30 September 2024 (UTC)
- On the one hand, I endorse all of these as obvious good things. And to that I'd add that the source has to be publicly available in a standard source control system (which these days basically means git). It's one thing to say the code is under an FOSS license, but if the distribution mechanism is to download a ROT-13'd shar file from a gopher server, it might as well not exist. And it should have a comprehensive test suite. And an issue tracking system. And code reviews. Well, you see where this is going. All of these things are essential good software engineering practices, but each one is also a barrier to entry for a lot of people, and at some point we need to make an intelligent decision about where we want to draw the line. If we chased away every potential code contributor with onerous requirements, we'd certainly solve the problem of tool migration because we wouldn't have any tools. RoySmith (talk) 01:58, 30 September 2024 (UTC)
- Yes, I already stated I don't think English Misplaced Pages can afford to limit its potential developer pool for all its tools. I think when a tool or bot is planned for deployment, we need to decide how important it is to have some form of succession plan in place. In many cases, we may just live with the risk. For some key processes, we may want to plan for future transition to different maintainers. isaacl (talk) 02:33, 30 September 2024 (UTC)
- I have nothing against publishing my code. I acknowledge that the code quality is far from ideal, since I don't get a lot of time to work on it, and my decision to use C# may not have been the best. The bots have worked their way into some of our processes, so it might be good if someone could at least see what they do in case something happens to me. I would have to go through the files and add the appropriate copyright notices. I am unsure what counts as an open licence; my intent was always to use GPLv3. Once a licence is applied though, it will be hard to change, so a grandfather clause would be necessary. And yes, moving things to a new operating environment is a real pain; Toolforge now insists that everything run in a Docker container. Getting that to work has been taxing, and would have been impossible without a lot of help from the Toolforge admins. Hawkeye7 (discuss) 20:25, 29 September 2024 (UTC)
- Toolforge currently requires an OSI-approved license. That seems like the right expectation to me. Comparison of free and open-source software licenses might be useful for your own review of what you might license your code as. Izno (talk) 00:54, 30 September 2024 (UTC)
- I found the policy at Help:Toolforge/Right to fork policy. I haven't actually implemented it yet. Hawkeye7 (discuss) 04:56, 2 October 2024 (UTC)
- I think the docker container part is just picking an image on toolforge. You don't need to install docker on your local computer, nor do you need to know much about docker except for the one CLI command to run and which image you are going to pick from the list of images. My bots are written in PHP and I use xampp locally to run them when I am doing coding and manual tests. –Novem Linguae (talk) 15:59, 30 September 2024 (UTC)
- Not an image, but a cloud native build pack. The build pack creates the container image. I had to get them to install a dotnet build pack for me. Running docker locally was no problem; getting the application to work properly in that operating environment took more effort. Hawkeye7 (discuss) 21:30, 30 September 2024 (UTC)
- I previously looked at getting a .NET app to run on Toolforge but they didn't have any "official" support so I couldn't be bothered to figure it out and I concluded I would need to get in touch with them to add this support. This is in addition to all the hoops one has to jump through to figure out how to do things there. None of it is user-friendly, that's for sure, speaking of barrier to entry and all that. Anyway, this did not exist at the time. I might ping you at some point to ask you how exactly you did it if I can't figure out stuff from the documentation. I don't suppose you recorded the steps you took to get it working? — HELLKNOWZ ∣ TALK 22:27, 30 September 2024 (UTC)
- Yes, I recorded the steps I took. Hawkeye7 (discuss) 04:53, 2 October 2024 (UTC)
Bots with available source code
List
See: Misplaced Pages:Core bots
Discussion
Based on the above, there appears to be a reasonable consensus that most (if not all) bots that do "core functions" (my phrasing) should have their code posted so that if the operator disappears (intentionally or not) the functionality can be quickly/easily/efficiently ported onto a new bot/operator who can take over the task. I have started a (currently empty) list above, and would invite editors to add and discuss the list so that we can start asking operators to provide the code if deemed necessary. Personally speaking I think this list should focus on open-approval tasks (i.e. not one time runs) to start, but if someone wants the code for OTRs feel free to ask. Primefac (talk) 19:30, 29 September 2024 (UTC)
- I think there's a tendency for the general editing population to think if the code is available, it should be easy for anyone to step into the void and quickly get a bot running, but that's a fallacy. I think if we're going to introduce requirements for succession planning, they should cover a bit more (as I discussed in an earlier comment). isaacl (talk) 21:37, 29 September 2024 (UTC)
- I don't see that consensus. We all agree it's preferable. But mandatory is different. Headbomb {t · c · p · b} 00:51, 30 September 2024 (UTC)
- I think the consensus is for a stronger policy than saying it's "preferable" (which we already do). We could make some exceptions for long-standing bots, such as AAlertBot, which is I think what you are primarily concerned about. – SD0001 (talk) 12:04, 30 September 2024 (UTC)
- A mandatory policy is a recipe for "consensus" to shut bots down or worse remove bot privs for being a rouge operator. What else could "mandatory" mean? Which is like that Vietnam War saying, "We had to destroy a village to save it" (variations of this quote). Isaacl is exactly right that dumping a bunch of source to GitHub is meaningless for anyone trying to install and operate it. And for some bots, operation requires a lot of training that is not easy to document. -- GreenC 23:47, 30 September 2024 (UTC)
- "shut bots down". I imagine any new requirements would have an exception for bots approved before the new requirement. –Novem Linguae (talk) 00:44, 1 October 2024 (UTC)
- Where in my post did I say mandatory? I did not, so to disagree with something I didn't say is a little odd. This sub-thread is about taking the first steps - right now we don't even know which bots have open-source or freely-available code bases, or where they're hosted, etc. Primefac (talk) 12:38, 5 October 2024 (UTC)
- I get where everyone is coming from with the desire to make sure bots keep running smoothly, but it's not clear to me that there's consensus for making open source mandatory. I'm concerned that:
- The focus is open source and adding extra requirements instead of having succession plans.
- There aren't clear definitions for terms like "critical" or "core" and we don't have a list of the bots that would be impacted.
- Grandfathering some bots might mean we end up with all of the downsides that will discourage future development without significantly improving continuity.
- I've only written one bot so far, one that's trivial to set up and also open source, but it likely wouldn't exist if I had to produce open-source code before getting project approval or had been required to use Toolforge for my first project. The problem that Protection Helper Bot solves has been a Phabricator ticket since 2012 and proposed multiple times before and since then (such as this discussion in 2017). We should be encouraging new developers to help solve long-standing problems rather than throwing up roadblocks, even if they seem like low bars to most experienced Misplaced Pages developers. Daniel Quinlan (talk) 01:03, 1 October 2024 (UTC)
- I never said anything about anything being mandatory. Primefac (talk) 12:38, 5 October 2024 (UTC)
- You're right, "mandatory" was used by another commenter. However, I do actually believe setting the expectation that core functions should make their code available would likely turn that expectation into a requirement in practice. The policy already recommends it and that seems to be interpreted aggressively at times in BRFA discussions. I would also like to understand the current situation before changing the policy. Daniel Quinlan (talk) 01:21, 6 October 2024 (UTC)
- Your bot was the exception as it's an adminbot that touches protection of articles. Most BRFAs have no requirement or request to release their source, and don't in practice. ProcrastinatingReader (talk) 21:13, 14 October 2024 (UTC)
- I agree with the bot policy that source code for adminbots should be open or the developer "must present such code for review upon request from any BAG member or administrator". My previous comments should not be interpreted as contradicting that. I designed my bot to be easy for anyone to run by releasing the code as open source and ensuring it's easy to set up. However, I believe it's fair to say that some of the additional requirements that have been discussed would have likely deterred me from submitting a BRFA. Daniel Quinlan (talk) 23:11, 14 October 2024 (UTC)
- Mandatory or whatever aside, I think there is merit to us having a list of what we think are "essential" bots, along with some idea of what the succession strategy for these bots is. (any of: is source available? are they hosted on Toolforge with multiple maintainers?) ProcrastinatingReader (talk) 21:12, 14 October 2024 (UTC)
- I added a couple bots to the list started above. I also included how esoteric their tech stack would be considered these days (i.e. how easily could someone take over maintaining it with updates/fixes). Bots using pywikibot or mwbot-rs for example I think are quite accessible. Custom C++ code or even Perl code I'd say is not particularly easy to take over. Realistically there's nothing we can do about these, but it's worth remaining aware of our bus factor. I'm loosely defining "core" as the bot disappearing causing noticeable disruption to the encyclopaedia, some significant process, or otherwise meaningfully impacting the quality of articles. ProcrastinatingReader (talk) 21:46, 14 October 2024 (UTC)
- I suggest creating a parent list of key English Misplaced Pages processes/ongoing work items, and under them listing the essential automated tasks for those processes/work items. (I understand that some bots may be grouped under multiple processes/work items.) At the very least, it would be helpful to those not familiar with all the bots if the list could include a brief summary of their essential tasks. isaacl (talk) 21:54, 14 October 2024 (UTC)
- Do you mean something like this? User:ProcrastinatingReader/Core bots ProcrastinatingReader (talk) 22:04, 14 October 2024 (UTC)
- Yes. Basically a breakdown by workflow: here's an important process (which might be doing a set of ongoing work items), and here're the key elements that are automated in order to make this sustainable. I was thinking that depending on the size of the lists, or the number of bots that support multiple workflows, it might be worthwhile to keep the bot list with its details separate, and just have the workflow list point to the bots in the bot list. I feel this makes it easier to think about what workflows are absolutely necessary to keep running (and think of ones that are missing from the list), and to know what they rely on. isaacl (talk) 22:23, 14 October 2024 (UTC)
- Thanks isaacl, I think this is a good idea. I'd like to suggest a single list for now, unless it transpires that it's common for single bot accounts to do multiple core tasks? I think it's easier than correlating entries across two lists, if we can avoid it.
- I am thinking there's a few things we should understand about each bot, rather than just asking "is the source available". I've tried to summarise these in the lead of Misplaced Pages:Core bots. I give the example of ClueBot NG there - I think the fact that it's an ANN model using C++ and an uncommon C++ framework means someone outside the core development team is unlikely to be able to pick up that bot as-is and realistically maintain it, as opposed to just running it.
- With that in mind, I'm wondering if it might be good to develop simple criteria to assess a bot against, to serve as a decent summary statistic compared to raw-text comments. e.g. categories like: "source available and executable?" / "multiple maintainers?" / "maintainable tech stack?", on which a bot can get a binary score (good/bad). These categories are mainly just to illustrate the idea. I'm not fixed on what kind of framework we should assess bots against for a realistic 'operational resiliency' strategy. ProcrastinatingReader (talk) 11:06, 16 October 2024 (UTC)
- Could also do a scale of 1-5. Being hosted on Toolforge could add a point, source code published could add a point, active maintainers could add a point, etc. –Novem Linguae (talk) 13:25, 16 October 2024 (UTC)
- Point systems are pointless (pun unintended). Let's not have a metric that serves no actionable purpose for sake of having one.
- That said, what's the RFC bot? It should go on the core list. Headbomb {t · c · p · b} 14:26, 16 October 2024 (UTC)
- @Headbomb: There are two RFC bots:
- Legobot (talk · contribs) once an hour (i) detects {{rfc}} transclusions that lack a |rfcid= parameter, and adds one; (ii) ensures that the next valid timestamp after every existing {{rfc}} tag is less than thirty days in the past, and if not, removes the {{rfc}} tag and also removes the RfC statement from all of the listings (such as WP:RFC/BIO); (iii) checks the RfC category parameters for each {{rfc}} transclusion, such as |bio, and ensures that the RfC is listed on corresponding pages such as WP:RFC/BIO
- Yapperbot (talk · contribs) (also once an hour, but half an hour after Legobot) sends messages to user talk pages concerning RfCs where Legobot has recently added a |rfcid= parameter, see WP:FRS
- HTH. --Redrose64 🌹 (talk) 15:27, 16 October 2024 (UTC)
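For readers who want the hourly checks described above in concrete terms, here is a minimal Python sketch of steps (i) and (ii). It is not Legobot's actual code: the regular expressions, the function names, and the format of the generated identifier are all assumptions made for illustration, and it works on a plain wikitext string rather than through the MediaWiki API.

```python
import re
import secrets
from datetime import datetime, timedelta, timezone

# Assumed patterns: a bare {{rfc}} tag with optional parameters, and a
# standard signature timestamp such as "15:27, 16 October 2024 (UTC)".
RFC_TAG = re.compile(r"\{\{\s*rfc\s*(\|[^{}]*)?\}\}", re.IGNORECASE)
TIMESTAMP = re.compile(r"(\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4}) \(UTC\)")
MAX_AGE = timedelta(days=30)


def add_missing_rfcid(wikitext, make_id=lambda: secrets.token_hex(4).upper()):
    """Step (i): add |rfcid= to every {{rfc}} tag that lacks one.

    The identifier format produced by make_id is hypothetical.
    """
    def fix(match):
        params = (match.group(1) or "").rstrip()
        if re.search(r"\|\s*rfcid\s*=", params):
            return match.group(0)  # already has an identifier, leave untouched
        return "{{rfc" + params + "|rfcid=" + make_id() + "}}"

    return RFC_TAG.sub(fix, wikitext)


def first_timestamp_after(wikitext, pos):
    """Return the first parseable signature timestamp at or after pos."""
    for match in TIMESTAMP.finditer(wikitext, pos):
        try:
            stamp = datetime.strptime(match.group(1), "%H:%M, %d %B %Y")
        except ValueError:
            continue  # looks like a timestamp but is not a valid date
        return stamp.replace(tzinfo=timezone.utc)
    return None


def expired_rfc_tags(wikitext, now):
    """Step (ii): list the {{rfc}} tags whose next timestamp is 30+ days old."""
    expired = []
    for tag in RFC_TAG.finditer(wikitext):
        stamp = first_timestamp_after(wikitext, tag.end())
        if stamp is not None and now - stamp >= MAX_AGE:
            expired.append(tag.group(0))
    return expired
```

A real bot would then remove the expired tags and update the listing pages (step iii); that part depends on the page-editing library in use and is left out here.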
- I don't see a lot of utility in a single summary score. I think the number of core bots should remain below the threshold where a group of people could go through them and determine relative priorities for attention. Plus, in a volunteer environment, who works on helping with what bot is going to be highly influenced by personal interest in the associated workflow, in any case. For an individual characteristic like "maintainable tech stack", there could be some usefulness in having a score, to help those not familiar with the details of the related technology to make relative comparisons. I would consider it to be more descriptive than analytic, though, to avoid getting bogged down in its precision. isaacl (talk) 15:55, 16 October 2024 (UTC)
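As an illustration only, the criteria floated in this subthread could be recorded per bot roughly as follows. The class name, the field names, and the example bot are hypothetical, and the choice of four criteria is an assumption drawn from the comments above. The sketch keeps each criterion visible individually, with the point total as a secondary convenience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotAssessment:
    """Hypothetical per-bot resiliency record; field names are illustrative."""
    name: str
    on_toolforge: bool
    source_published: bool
    multiple_maintainers: bool
    maintainable_stack: bool

    def criteria(self):
        """Return each criterion with its good/bad (True/False) result."""
        return {
            "hosted on Toolforge": self.on_toolforge,
            "source code published": self.source_published,
            "multiple active maintainers": self.multiple_maintainers,
            "maintainable tech stack": self.maintainable_stack,
        }

    def points(self):
        """One point per criterion met, following the 'add a point' idea."""
        return sum(self.criteria().values())

    def gaps(self):
        """List the criteria that are not met."""
        return [label for label, ok in self.criteria().items() if not ok]


# "ExampleBot" is a made-up bot, not an assessment of any real one.
example = BotAssessment(
    name="ExampleBot",
    on_toolforge=True,
    source_published=True,
    multiple_maintainers=False,
    maintainable_stack=True,
)
print(example.points(), example.gaps())
```

Listing the unmet criteria alongside the total keeps the record descriptive, which may matter more than the number itself given the reservations about summary scores.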
Reuse for bots and tools
Somewhat orthogonal to the above thread: in an ideal world, I'd love to see greater standardization across bots and tools. I've written a few of my own tools, but I spent a lot of time reinventing a lot of wheels to make them work. I know there's been some progress in this area (pywikibot is certainly a step in the right direction) but there is still a lot of effort expended by people running in different directions, which in turn makes it harder for people to pick up other people's projects. RoySmith (talk) 16:35, 16 October 2024 (UTC)
- @RoySmith: as I imagine you've already heard, frameworks are wonderful: everyone should have one :-). It's a big challenge to create one that is usable by others, with sufficient documentation. There's Help:Creating a bot § Programming languages and libraries, but it's mostly just a big list, without much guidance to help someone decide on what to use. I feel there should be a location for programmers to share experiences, but I'm not sure where that is. Wikipedia_talk:Bots redirects to Misplaced Pages:Bots/Noticeboard, whose header makes it sound more like a co-ordination spot than somewhere to collaborate on development. isaacl (talk) 21:31, 16 October 2024 (UTC)
- Some statistics would go a long way. Help:Creating a bot lists a lot of options that nobody would recommend nowadays. While I don’t believe we should mandate any language or toolkit, it would help to inform new developers which languages and toolkits are used in bots in active use, especially more recent bots. At least half of the current BRFAs are Python with Pywikibot. Daniel Quinlan (talk) 22:18, 16 October 2024 (UTC)
- I agree that knowing some basic usage info would be helpful. Whenever I look at a third-party library/framework/tool, I want to know how popular it is and how actively it is maintained, in order to get a sense of how likely it is to continue to be maintained in future, how easy it will be to find answers to questions, and how useful others have found it. But circling back to the problem of overhead scaring away developers, this also applies to those creating code and tools for reuse. Tracking this info and keeping it up to date is extra work, and it might be less interesting for a one-person team than working on their project. isaacl (talk) 22:55, 16 October 2024 (UTC)
Update the global bots section
Hi, currently the global bots policy says that (Misplaced Pages:Global_rights_policy#Global_bots) global bots can only run on this wiki for the purpose of fixing double-redirects. I believe that is outdated, because it links to a discussion from 2008 and Meta policies have changed global bots in 2021 to allow running any task that is approved. I think this requires a change on this wiki as well, as otherwise it can be rather confusing. I propose allowing global bots to run here for any approved task, especially since en.wikipedia will be notified when anyone submits a global bot request. After all, this wiki can still instruct any bot not to run here, if required. If not, I recommend adding this wiki to the opt-out set so that global bots are completely disabled (rather than enabled for just one purpose). Leaderboard (talk) 06:49, 12 September 2024 (UTC)
- Applying for local bot approval here should still be required for bots here; our community and project are huge, and we expect our bot operators to be engaged here. As for whether we should kick out all global bots that are doing the task they are already approved for, that doesn't seem to be necessary. — xaosflux 09:32, 12 September 2024 (UTC)
- Any change to the policy would need consensus for that change, here on the English Misplaced Pages. That discussion could be held here, but would need to be an actual RFC or other widely-advertised and widely-participated in discussion. Personally, I'd want to see more reason to change the policy than given so far. Anomie⚔ 12:07, 12 September 2024 (UTC)
- Agreed. Primefac (talk) 13:16, 12 September 2024 (UTC)
- Why is this change needed, and which specific global bots would help improve the English Misplaced Pages under a more permissive policy? Daniel Quinlan (talk) 00:18, 30 September 2024 (UTC)
- @Daniel Quinlan This is more for consistency, because I would normally not request local bot rights on a wiki that's not on the global bots opt-out set unless I know that it does not accept all kinds of global bots. I do not have any specific global bots in mind for this reason. Other wikis in this group include the Russian Misplaced Pages, where global bots are allowed but appear to reference old policies. Leaderboard (talk) 14:02, 2 October 2024 (UTC)
- "I would normally not request local bot rights on a wiki that's not on the global bots opt-out". Good point. Maybe we should add our wiki to meta:Bot policy/Implementation#Where it is policy as "not allowed" so that global bot operators don't accidentally run global bots here. –Novem Linguae (talk) 00:23, 3 October 2024 (UTC)
- The only relevant change I see from 2021 is this one, which does not say what you say it says. Is there another change somewhere else? Izno (talk) 19:52, 30 September 2024 (UTC)
- @Izno That's the one actually - why do you think that it "does not say what you say it says"? Leaderboard (talk) 14:03, 2 October 2024 (UTC)
- The change on meta changed the requirements for meta. It did not change the requirements here, so when you say "I think this requires a change on this wiki as well", it reads as if you think we must be consistent with global policy. "Confusing" it is not: we simply have different requirements. Izno (talk) 17:21, 2 October 2024 (UTC)
- @Izno Actually, that is what I was hinting at - "fixing double redirects" is just a single task that I am not convinced is needed as a specific exemption. And I was also saying that en-wiki is not the only wiki in this group, where it appears that the rules were created when the global bot policy was more restrictive. Also the policy change in Meta did change the rules for every wiki allowing global bots that did not explicitly have a restriction. Leaderboard (talk) 06:03, 3 October 2024 (UTC)
- If I'm understanding correctly, meta has a policy allowing global bots, but that policy doesn't mandate that the bots can be run on the individual wikis without their consent. Each wiki can make its own decisions on what bots it wants to accept. isaacl (talk) 06:56, 3 October 2024 (UTC)
- @Isaacl Yes and no. The global policies are global and individual wikis cannot "opt-out" of it. However, individual wikis can set preferences in terms of how they want global bots to use their bot flag, but I've seen a few wikis (e.g. this one) that reference old Meta policies in doing so (which is what I want to correct) - I've not seen a single wiki that explicitly sets such restrictions while referencing the updated 2021 rules - nor have I seen any wiki yet have restrictions other than allowing fixing double-redirects/interwiki language links.
- And also, regarding "policy doesn't mandate that the bots can be run on the individual wikis without their consent", yes it isn't a "mandate", but the whole point of global bots is to avoid having operators request bot flags on every wiki, and hence if I were a global bot operator, I wouldn't go around asking for permission on global bot-approved wikis, unless I already know that the said wiki does not allow global bots for any approved purpose. And I cannot do this for all of the 800+ wikis that allow global bots either.
- TLDR; yes "wiki can make its own decisions on what bots it wants to accept", but to do so would kind of defeat the purpose of global bots. Leaderboard (talk) 07:31, 3 October 2024 (UTC)
- You're making a false distinction in "I wouldn't go around asking for permission on global bot-approved wikis, unless I already know that the said wiki does not allow global bots for any approved purpose", which IMO makes your argument unconvincing. If you (as a global-bot operator) know enwiki only allows interwiki-fixing global bots without separate approval, why would you not ask for permission if you want your double-redirect-fixing bot to run here? If your only complaint is that meta:Bot policy/Implementation#Where it is policy isn't clear enough, an easier solution might be to improve that page.
- "The global policies are global and individual wikis cannot "opt-out" of it." Except this one they can. Even if the global bot policy didn't explicitly say so, I think you'd find that we don't necessarily accept here any random "policy" that someone on Meta declares is "global".
- "I've not seen a single wiki that explicitly sets such restrictions while referencing the updated 2021 rules" You have, this one. Anomie⚔ 12:12, 3 October 2024 (UTC)
- @Anomie The bot page links to a discussion from 2008, not 2021. Put it this way: I know I have to apply for local bot rights. Would someone else not familiar with en.wiki? Leaderboard (talk) 21:15, 3 October 2024 (UTC)
- They should if they read Misplaced Pages:Global rights policy#Global bots. Anomie⚔ 11:27, 4 October 2024 (UTC)
- The global policy says "The operator should make sure to adhere to the wiki's preference as related to the use of the bot flag." It explicitly allows each wiki to make its own decisions on what bots it wants to accept. isaacl (talk) 15:58, 3 October 2024 (UTC)
- @Isaacl I don't dispute that, and I also don't dispute that en.wiki is not doing anything wrong per se. However, I do believe that en.wiki created this exemption in 2008 when Meta rules were different, and believe that it needs at least a relook in 2024. How and what I'm not too bothered with - the other contributors have a lot more experience than I. Put it this way: I would rather have en.wiki put itself in the global bot opt-out set so that it's clear to everyone that you must apply for a local bot flag, rather than this weird one-task exception which isn't obvious unless you actually go to the bot policy (and it's not like it's any more difficult for bot operators fixing double-redirects to file a local bot flag request than a global bot operator for any other task). Leaderboard (talk) 21:21, 3 October 2024 (UTC)
- As I mentioned back at the beginning of this, if you want a reexamination then creating an RFC is the way to go. If you want to start drafting one (I recommend a draft to reduce the chance of confusing wording issues), feel free. Anomie⚔ 11:27, 4 October 2024 (UTC)
- I'm not motivated enough to do it - you all have more experience than I. I just posted here as a suggestion for improvement - it appears to me that the community does not feel this to be worth it. Leaderboard (talk) 19:37, 4 October 2024 (UTC)
"Bot policy" listed at Redirects for discussion
The redirect Bot policy has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Misplaced Pages:Redirects for discussion/Log/2024 October 12 § Bot policy until a consensus is reached. C F A 💬 20:40, 12 October 2024 (UTC)