This page has a backlog that requires the attention of willing editors. Please remove this notice when the backlog is cleared.
Commonly Requested Bots
This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).
You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.
Before making a request, please see the list of frequently denied bots, either because they are too complicated to program, or do not have consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) be added to all pages in a particular category, please be careful to check the category tree for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).
- Alternatives to bot requests
- WP:AWBREQ, for simple tasks that involve a handful of articles and/or only need to be done once (e.g. adding a category to a few articles).
- WP:URLREQ, for tasks involving changing or updating URLs to prevent link rot (specialized bots deal with this).
- WP:USURPREQ, for reporting a domain that has been usurped, e.g. |url-status=usurped
- WP:SQLREQ, for tasks which might be solved with an SQL query (e.g. compiling a list of articles according to certain criteria).
- WP:TEMPREQ, to request a new template written in wiki code or Lua.
- WP:SCRIPTREQ, to request a new user script. Many useful scripts already exist, see Wikipedia:User scripts/List.
- WP:CITEBOTREQ, to request a new feature for WP:Citation bot, a user-initiated bot that fixes citations.
Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).
Please add your bot requests to the bottom of this page.
Bot-related archives
- Noticeboard: 1–19
- Bots (talk): 1–22 (newer discussions at WP:BOTN since April 2021)
- Bot policy (talk): 19–30 (pre-2007 archived under Bots (talk))
- Bot requests: 1–87
- Bot requests (talk): 1–2 (newer discussions at WP:BOTN since April 2021)
- BRFA: old format 1–4; new format: Categorized Archive (All subpages)
- BRFA (talk): 1–15 (newer discussions at WP:BOTN since April 2021)
- Bot Approvals Group (talk): 1–9; BAG Nominations
Copy coordinates from lists to articles
Virtually every one of the 3000-ish places listed in the 132 sub-lists of National Register of Historic Places listings in Virginia has an article, and with very few exceptions, both lists and articles have coordinates for every place, but the source database has lots of errors, so I've gone through all the lists and manually corrected the coords. As a result, the lists are a lot more accurate, but because I haven't had time to fix the articles, tons of them (probably over 2000) now have coordinates that differ between article and list. For example, the article about the John Miley Maphis House says that its location is 38°50′20″N 78°35′55″W, but the manually corrected coords on the list are 38°50′21″N 78°35′52″W. Like most of the affected places, the Maphis House has coords that differ only a small bit, but (1) ideally there should be no difference at all, and (2) some places have big differences, and either we should fix everything, or we'll have to have a rather pointless discussion of which errors are too little to fix.
Therefore, I'm looking for someone to write a bot to copy coords from each place's NRHP list to the coordinates section of {{infobox NRHP}} in each place's article. A few points to consider:
- Some places span county lines (e.g. bridges over border streams), and in many of these cases, each list has separate coordinates to ensure that the marked location is in that list's county. For an extreme example, Skyline Drive, a long scenic road, is in eight counties, and all eight lists have different coordinates. The bot should ignore anything on the duplicates list; this is included in citation #4 of National Register of Historic Places listings in Virginia, but I can supply a raw list to save you the effort of distilling a list of sites to ignore.
- Some places have no coordinates in either the list or the article (mostly archaeological sites for which location information is restricted), and the bot should ignore those articles.
- Some places have coordinates only in the list or only in the article's {{Infobox NRHP}} (for a variety of reasons), but not in both. Instead of replacing information with blanks or blanks with information, the bot should log these articles for human review.
- Some places might not have {{infobox NRHP}}, or in some cases (e.g. Newport News Middle Ground Light) it's embedded in another infobox, and the other infobox has the coordinates. If {{infobox NRHP}} is missing, the bot should log these articles for human review, while embedded-and-coordinates-elsewhere is covered by the previous bullet.
- I don't know if this is the case in Virginia, but in some states we have a few pages that cover more than one NRHP-listed place (e.g. Zaleski Mound Group in Ohio, which covers three articles); if the bot produced a list of all the pages it edits, a human could go through the list, find any entries with multiple appearances, and check them for fixes.
- Finally, if a list entry has no article at all, don't bother logging it. We can use WP:NRHPPROGRESS to find what lists have redlinked entries.
I've copied this request from an archive three years ago; an off-topic discussion happened, but no bot operators offered any opinions. Neither then nor now has any discussion been conducted for this idea; it's just something I've thought of. I've come here basically just to see if someone's willing to try this route, and if someone says "I think I can help", I'll start the discussion at WT:NRHP and be able to say that someone's happy to help us. Of course, I wouldn't ask you actually to do any coding or other work until after consensus is reached at WT:NRHP. Nyttend (talk) 15:53, 12 February 2020 (UTC)
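For anyone weighing up the coding effort, here is a minimal sketch of the core copy step, assuming pywikibot and mwparserfromhell. The template and parameter names used ("NRHP row" with |article=/|coord=, {{infobox NRHP}} with |coordinates=) are illustrative assumptions rather than checked facts about the live templates, and the duplicate-site exclusions and logging described above would need to wrap around it.

```python
# Sketch only: copy a human-checked coordinate from an NRHP county list
# into the article's infobox. Parameter names are assumptions.
import mwparserfromhell
import pywikibot

site = pywikibot.Site("en", "wikipedia")

def get_list_coord(list_title, article_title):
    """Return the coordinate wikitext for one entry on a county list."""
    code = mwparserfromhell.parse(pywikibot.Page(site, list_title).text)
    for tpl in code.filter_templates(matches="NRHP row"):
        if tpl.has("article") and str(tpl.get("article").value).strip() == article_title:
            return str(tpl.get("coord").value).strip() if tpl.has("coord") else None
    return None  # entry not found on this list

def sync_article(article_title, coord_wikitext):
    """Copy the list coordinate into the infobox, or return a reason to log."""
    page = pywikibot.Page(site, article_title)
    code = mwparserfromhell.parse(page.text)
    boxes = code.filter_templates(matches=r"[Ii]nfobox NRHP")
    if not boxes:
        return "no {{infobox NRHP}} - log for human review"
    box = boxes[0]
    if not box.has("coordinates"):
        return "coords in list but not article - log, don't fill blanks"
    box.get("coordinates").value = " " + coord_wikitext + "\n"
    page.text = str(code)
    page.save(summary="Syncing coordinates with the human-checked NRHP county list")
    return None
```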
- You could use {{Template parameter value}} to pull the coordinate values out of the {{NRHP row}} template. It would still likely take a bot to do the swap but it would mean less updating in the future. Of course, if the values are 100% accurate on the lists then I suppose it wouldn't be necessary. Primefac (talk) 16:55, 12 February 2020 (UTC)
- Never heard of that template before. It sounds like an Excel =whatever function, e.g. in cell L4 you type =B4 so that L4 displays whatever's in B4; is that right? If so, I don't think it would be useful unless it were immediately followed by whatever's analogous to Excel's "Paste Values". Is that what you mean by having a bot doing the swap? Since there are 3000+ entries, I'm sure there are a few errors somewhere, but I trust they're over 99% accurate. Nyttend (talk) 02:57, 13 February 2020 (UTC)
- That's a reasonable analogy, actually. Check out the source of Normani#Awards_and_nominations: it pulls the wins and nominations values from the infobox at the "list of awards", which means the main article doesn't need to be updated every time the list is changed.
- As far as what the bot would do, it would take one value of {{coord}} and replace it with a call to {{Template parameter value}}, pointing in the direction of the "more accurate" data. If the data is changed in the future, it would mean not having to update both pages.
- Now, if the data you've compiled is (more or less) accurate and of the not-likely-to-change variety (I guess I wouldn't expect a monument to move locations) then this is a silly suggestion – since there wouldn't be a need for automatic syncing – and we might as well just have a bot do some copy/pasting. Primefac (talk) 21:27, 14 February 2020 (UTC)
- Y'know, this sort of situation is exactly what Wikidata is designed for... --AntiCompositeNumber (talk) 22:29, 14 February 2020 (UTC)
- Primefac, thank you for the explanation. The idea sounds wonderful for situations like the list of awards, but yes these are rather accurate and unlikely to change (imagine someone picking up File:Berry Hill near Orange.jpg and moving it off site), so the bot copy/paste job is probably best. Nyttend (talk) 02:23, 15 February 2020 (UTC)
- By the way, Primefac, are you a bot operator, or did you simply come here to offer useful input as a third party? Nyttend (talk) 03:12, 20 February 2020 (UTC)
- I am both botop and BAG, but I would not be offering to take up this task as it currently stands. Primefac (talk) 11:24, 20 February 2020 (UTC)
- Thank you for helping me understand. "as it currently stands" Is there something wrong with it, i.e. if changes were made you'd be offering, or do you simply mean that you have other interests (WP:VOLUNTEER) and don't feel like getting involved in this one? This question might sound like I'm being petty; I'm writing with a smile and not trying to complain at all. Nyttend (talk) 00:27, 21 February 2020 (UTC)
- I came here to say what AntiCompositeNumber said. It's worth emphasising: this is exactly what Wikidata is designed for. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:25, 17 March 2020 (UTC)
- Actually not. A not-so-small fraction of articles need to have different coordinates in lists and infoboxes, as I already noted here. If we consistently rely on the lists to inform Wikidata, it's going to end up with a good number of self-contradictions due to lists that appropriately don't provide coordinates that make sense in articles (e.g. multi-county listings). Moreover, you can't rely on the infoboxes to inform Wikidata, because there's a consistently unacceptable error rate in coordinates unchecked by humans, and very few infoboxes are checked by humans; they're derived from the National Register database, and it would be pointless to ignore or trash the human-corrected Virginia coordinates. Literally all that needs to be done is a bot doing some copy/pasting; it would greatly be appreciated if someone were to spend a few minutes on this, instead of passing the buck. Nyttend backup (talk) 19:36, 28 April 2020 (UTC)
- The coordinates in the lists are often incorrect too. Let me know if you want help manually correcting them. Abductive (reasoning) 02:46, 22 June 2020 (UTC)
2019-20 coronavirus pandemic updater bot
If any bot could take data from https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6 and https://www.worldometers.info/coronavirus/#countries and edit Template:Cases in 2019–20 coronavirus pandemic and Template:Territories affected by the 2019-20 coronavirus pandemic automatically with the latest information that would be great. Sam1370 (talk) 00:21, 8 April 2020 (UTC)
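For whoever picks this up, here is a rough sketch of the daily-batch approach discussed below, assuming the public JHU CSSE daily-report CSV on GitHub (the worldometers table would need scraping instead); the target page title, table format, and CSV column names are assumptions to verify before any real run.

```python
# Rough sketch: aggregate a JHU CSSE daily-report CSV by country and
# rebuild a wikitext table. Column names follow the 2020-era CSV layout;
# treat them, and the target page, as assumptions.
import csv
import io
import urllib.request
import pywikibot

CSV_URL = ("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
           "csse_covid_19_data/csse_covid_19_daily_reports/05-07-2020.csv")

def build_rows():
    with urllib.request.urlopen(CSV_URL) as resp:
        reader = csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8"))
        totals = {}
        for row in reader:  # sum province-level rows into country totals
            t = totals.setdefault(row["Country_Region"],
                                  {"Confirmed": 0, "Deaths": 0, "Recovered": 0})
            for key in t:
                t[key] += int(float(row[key] or 0))
    return "\n".join(f"|-\n| {c} || {v['Confirmed']} || {v['Deaths']} || {v['Recovered']}"
                     for c, v in sorted(totals.items()))

site = pywikibot.Site("en", "wikipedia")
page = pywikibot.Page(site, "User:ExampleBot/COVID-19 table")  # hypothetical target
page.text = ('{| class="wikitable"\n! Country !! Cases !! Deaths !! Recoveries\n'
             + build_rows() + "\n|}")
page.save(summary="Updating table from JHU CSSE daily report (sketch)")
```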
- Paging Wugapodes, who's working on a similar bot. Enterprisey (talk!) 01:51, 8 April 2020 (UTC)
- I'm a little backlogged at the moment, but will try to get the worldometers dataset working asap. The first link uses the same dataset that WugBot does, and an interim solution would be to write a Lua module that reads the on-wiki CSV files and writes a wikitable. — Wug·a·po·des 05:12, 8 April 2020 (UTC)
- @Wugapodes: I did a little work on this myself, and found that there’s an additional complication: the GitHub dataset updates only daily, while the actual interactive website updates every few hours. I tried fooling around with some web-scrapers that support JavaScript, but ran into a lot of problems, probably due to my very small amount of programming experience. Perhaps you can find a working solution? Sam1370 (talk) 10:04, 8 April 2020 (UTC)
- Updating more than once a day is likely unnecessary. The source data for each administrative unit doesn't really update more than once a day anyway, the website just shows the data as it comes in and the GitHub export combines it into a batch update. --AntiCompositeNumber (talk) 23:15, 8 April 2020 (UTC)
- However, it is likely that even if this bot is implemented which updates it once per day, there are still going to be people who, in the interest of providing the most up-to-date information, will manually edit in the correct numbers, and bringing us back to where we started. I think that we should try to keep the info as accurate and recent as possible. I have contacted JHU on their email about this subject, asking him to either make the GitHub update along with the site or provide an easy way for a bot to get the most up to date data, but have received no response so far. Sam1370 (talk) 06:32, 9 April 2020 (UTC)
- Perhaps this could be useful for any developers who want to take up the task: https://apify.com/covid-19 Sam1370 (talk) 06:37, 9 April 2020 (UTC)
- Any potential problems caused by manual changes may be resolved by the bot building the page instead of amending it, just as Legobot (talk · contribs) does with the RfC listings. For example, go to WP:RFC/BIO and alter it in any way you like - move the requests around, delete some, add others. Then wait for the next bot run (1 min past the hour) and see what happens. --Redrose64 🌹 (talk) 07:58, 9 April 2020 (UTC)
- However, do we really want to sacrifice accuracy for automation? Personally I would rather have manual, but the most accurate, case readings instead of automated, but slightly inaccurate, readings. As for the bot building the page, that just seems weird to me — removing helpful edits in favor of outdated data? I think we should either find a way to deliver the information right along with the JHU site, or leave it to be updated manually. Sam1370 (talk) 09:35, 9 April 2020 (UTC)
- It would be best to use mw:Help:Tabular Data files on Commons, that way other wikis can benefit from the updating data as well. Tabular data can also be used to create graphs and charts using Extension:Graph. --AntiCompositeNumber (talk) 23:21, 8 April 2020 (UTC)
- Oh come on, the JHU data isn't freely licensed and they're actively claiming copyright over it (which has no basis in US law). Copying it to Commons would not be a great idea in that case, unless the Commons community has decided to ignore their claims. --AntiCompositeNumber (talk) 23:30, 8 April 2020 (UTC)
- The JHU data had a specific discussion at Wikipedia:Village_pump_(technical)/Archive_180#Let's_update_all_our_COVID-19_data_by_bot_instead_of_manually; Enterprisey/Wugapodes, you need to stop the bot task at earliest convenience. Thanks. --Izno (talk) 23:43, 8 April 2020 (UTC)
- @Izno: I've been in touch with WMF Legal regarding this specific bot task and the response from Jrogers (WMF) was "I don't see any reason for the Foundation to remove these templates or any of the map pages linked from them". Johns Hopkins can claim copyright only on the specific presentation and selection of the data, not the data itself (which is public domain) per Feist v. Rural: "Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement". The data on the wiki have a different presentation and selection of data and therefore represent a valid use of the public domain component of the Johns Hopkins dataset, so I see no need to stop the bot task nor does WMF's senior legal counsel see a reason to remove its output. — Wug·a·po·des 03:14, 9 April 2020 (UTC)
- The data's not acceptable on Commons because Commons cares about source country and US copyright. However, enwiki only cares about US copyright law, which doesn't recognize any copyrightable authorship in data like this. --AntiCompositeNumber (talk) 13:28, 9 April 2020 (UTC)
- Heather Houser (May 5, 2020). "The Covid-19 'Infowhelm'". The New York Review of Books.
Covid-19 is undoubtedly testing our public health, medical, and economic systems. But it's also testing our ability to process so much frightening and imminently consequential data. All these data add up to the Covid-19 "infowhelm," the term I use to describe the phenomenon of being overwhelmed by a constant flow of sometimes conflicting information.
-- GreenC 16:56, 6 May 2020 (UTC)
Cleanup of {{harv}}-like templates
If you have short citations like
{{harvnb|Smith|2001|pp=13}}
{{harvnb|Smith|2001|p=1-3}}
Those will appear like
- Smith 2001, pp. 13
- Smith 2001, p. 1-3
Those are obviously wrong, and should be fixed so they would appear like this
{{harvnb|Smith|2001|p=13}}
{{harvnb|Smith|2001|pp=1–3}}
Those will appear like
- Smith 2001, p. 13
- Smith 2001, pp. 1–3
These should be an easy fix for an AWB bot or similar, and should cover all {{harv}}/{{sfn}}-like templates. Headbomb {t · c · p · b} 13:02, 11 April 2020 (UTC)
- Also, the same for:
{{harvnb|Smith|2001|p=p. 13}} → {{harvnb|Smith|2001|p=13}}
{{harvnb|Smith|2001|p=p. 1–3}} → {{harvnb|Smith|2001|pp=1–3}}
{{harvnb|Smith|2001|pp=pp. 13}} → {{harvnb|Smith|2001|p=13}}
{{harvnb|Smith|2001|pp=pp. 1–3}} → {{harvnb|Smith|2001|pp=1–3}}
{{harvnb|Smith|2001|p=pp. 13}} → {{harvnb|Smith|2001|p=13}}
{{harvnb|Smith|2001|p=pp. 1–3}} → {{harvnb|Smith|2001|pp=1–3}}
{{harvnb|Smith|2001|pp=p. 13}} → {{harvnb|Smith|2001|p=13}}
{{harvnb|Smith|2001|pp=p. 1–3}} → {{harvnb|Smith|2001|pp=1–3}}
Headbomb {t · c · p · b} 13:05, 11 April 2020 (UTC)
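For a bot operator, the two normalisations above reduce to a small substitution; here is a minimal sketch, deliberately ignoring the valid-hyphen edge case raised just below, and in a real run you would restrict it to the {{sfn}}/{{harvnb}} family rather than the whole page text.

```python
# Minimal sketch: strip stray "p."/"pp." prefixes from |p=/|pp= values,
# then use |p= for single pages and |pp= with an endash for ranges.
# Apply only inside harv/sfn-family templates in production; this naive
# version would also touch |p=/|pp= in any other template.
import re

PARAM = re.compile(r"\|\s*pp?\s*=\s*(?:pp?\.\s*)?([^|}]+?)\s*(?=[|}])")

def fix_pages(wikitext: str) -> str:
    def repl(match):
        value = match.group(1).strip()
        if re.fullmatch(r"\d+\s*[-–]\s*\d+", value):      # a page range
            first, last = re.split(r"\s*[-–]\s*", value)
            return f"|pp={first}–{last}"
        return f"|p={value}"                              # a single page

    return PARAM.sub(repl, wikitext)

print(fix_pages("{{harvnb|Smith|2001|pp=13}}"))      # {{harvnb|Smith|2001|p=13}}
print(fix_pages("{{harvnb|Smith|2001|p=pp. 1-3}}"))  # {{harvnb|Smith|2001|pp=1–3}}
```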
- @Headbomb: Before doing this, would it be reasonable to ask if the template source could be tweaked to display the right info even when the parameter is incorrect? GoingBatty (talk) 04:12, 23 April 2020 (UTC)
- A bot or script taking on this task would somehow have to account for the edge case where a single page number contains a valid hyphen, like p=3-1, for a document where page 1 of part 3 is called "3-1". – Jonesey95 (talk) 05:00, 23 April 2020 (UTC)
- Those have, IMO, acceptable false-positive rates (after all, this type of stuff is part of AWB genfixes, and no one is calling for heads to roll), and that's why the standard is to explicitly set |page=3{{hyphen}}1 in those cases in CS1/CS2 templates. But if that's somehow not an acceptable solution here, the bot could take care of the rest. Or assume that |p=p. 3-4 should be converted to |p=3-4 and not |pp=3–4. Headbomb {t · c · p · b} 15:21, 2 May 2020 (UTC)
WikiProject United States files on Commons
There are thousands of file talk pages in Category:File-Class United States articles for files that were moved to Commons and deleted in 2011 or 2012. These talk pages contain no content except a transclusion of {{WikiProject United States}} (or one of its redirects) and should have been deleted long ago. These transclusions are of no use to the WikiProject and should be removed; however, simply removing them would leave these talk pages blank and mislead a viewer seeing a blue link into thinking there is something there. More broadly, there is no reason for these Commons files to be project-tagged on en.wikipedia—local talk pages for Commons files generally lead to split discussions or invite occasional comments that no one sees or answers.
I asked about these talk pages at the WikiProject's talk page (see Wikipedia talk:WikiProject United States#Categorizing files on Commons), and was told to "go with own instincts on this". Any page in Category:File-Class United States articles that (1) does not have a corresponding file on en.wikipedia and (2) contains no content other than a transclusion of {{WikiProject United States}} (or a redirect), should be speedily deleted under criterion G6 (routine housekeeping). Given the sheer number of pages involved, I am hoping a bot could take on the task. Thanks, -- Black Falcon 23:21, 19 April 2020 (UTC)
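For sizing the job, the detection half is straightforward; here is a sketch under the assumption that "WPUS"/"WPUSA" are the banner's redirects (the deletion half would need an adminbot with the consensus discussed below):

```python
# Sketch: list File-talk pages whose only content is the WikiProject
# United States banner and whose File: page does not exist locally.
# The redirect names in the pattern are illustrative assumptions.
import re
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("en", "wikipedia")
cat = pywikibot.Category(site, "Category:File-Class United States articles")
BANNER_ONLY = re.compile(
    r"^\s*\{\{\s*(WikiProject United States|WPUS|WPUSA)\s*(\|[^{}]*)?\}\}\s*$",
    re.IGNORECASE)

for talk in pagegenerators.CategorizedPageGenerator(cat):
    if talk.namespace() != 7:           # only File talk: pages
        continue
    if talk.toggleTalkPage().exists():  # a local File: page exists; skip
        continue
    if BANNER_ONLY.match(talk.text):
        print(talk.title())             # candidate for the housekeeping list
```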
- I don't think it's this clear cut. Even if the files are on Commons, they do appear and are used on the English Wikipedia, and I can see reasons to tag them for a WikiProject. The misplaced comments are an actual problem, but I don't think their occurrence has any correlation with the presence of a WikiProject template. Jo-Jo Eumerus (talk) 08:23, 20 April 2020 (UTC)
- Jo-Jo Eumerus, you may be right in general, but this WikiProject does not have such reasons. Certainly, the fact that CSD G8 exempts talk pages for files that exist on Commons suggests it would be wrong to assume that no WikiProject could tag files on Commons (although that is my preference). However, I am not looking to take on that broader issue right now, and my focus is just on WikiProject United States, which does not need these pages to be tagged. -- Black Falcon 16:32, 26 April 2020 (UTC)
- I think I'm going to go with Needs wider discussion. WP:PROJSCOPE is pretty clear that if the members of a wikiproject agree that a page is outside of their scope, it should not be tagged. However, anything related to mass deletion requires strong consensus to implement, which I do not see here. Under WP:G8, simply being a talk page for a Commons file is not a sufficient reason to delete, and this task isn't clearly covered by the text of G6. According to this query, there are over 11,000 file talk pages with only one revision and a {{WikiProject United States}} tag. (I unfortunately can't reliably filter for "has more than one template" without doing wikitext parsing. However, most of the WP:USA file tags appear to have been added in single-project AWB runs, so the total number is likely to be fairly close. Any bot that would implement this task would need to parse the wikitext to ensure that the WP:USA tag is the only page content.) Bot tagging 11,000 pages for deletion is also not exactly polite, so this task would be best implemented by an adminbot that can just do the deletions (which again, requires demonstrated strong community approval). --AntiCompositeNumber (talk) 16:25, 22 April 2020 (UTC)
- AntiCompositeNumber, thank you for responding. The challenge is that there are two issues which are intermingled here: (1) removing {{WikiProject United States}} from talk pages of files on Commons; and (2) mass-deleting the resulting empty talk pages. (1) is the WikiProject's decision and does not require a wider discussion. I was hoping that (2) would be uncontroversial housekeeping (CSD G6), but am willing to seek a wider discussion if that is not the case. -- Black Falcon 16:32, 26 April 2020 (UTC)
- @AntiCompositeNumber: Making an edit with the sole purpose of bringing the page within the scope of a speedy deletion criterion is not acceptable behaviour for a human or a bot. You will need explicit consensus that these pages should be deleted. Thryduulf (talk) 09:57, 19 May 2020 (UTC)
- @AntiCompositeNumber: fixing the ping. Thryduulf (talk) 09:58, 19 May 2020 (UTC)
- That's what I said. @Thryduulf:, did you mean to ping Black Falcon instead? --AntiCompositeNumber (talk) 14:52, 19 May 2020 (UTC)
- Whoops, I did indeed mean to ping Black Falcon. Sorry. Thryduulf (talk) 15:41, 19 May 2020 (UTC)
- Thryduulf, that misses the point. The edits would not be for the sole purpose of speedily deleting the pages; instead, they would be for the purpose of removing an unneeded project banner. Deletion would be incidental to the pages becoming blank, and I am just trying to save time by skipping an intermediate step. However, in light of the hesitation expressed above, I will seek a wider discussion related to this request. -- Black Falcon 16:35, 25 May 2020 (UTC)
Challenge bot
I hope that someone can help me by making a bot add the template showing that an article has been added to Wikipedia:WikiProject Europe/The 10,000 Challenge, for example. There are several Challenge pages, and there are templates to be added to the talk pages of articles that have been added to a Challenge project page, but the bot that did this task stopped a long time ago. Please ping me if this can be done. BabbaQ (talk) 17:17, 27 April 2020 (UTC)
- @BabbaQ: Is this still an active request? Hasteur (talk) 23:06, 6 June 2020 (UTC)
- @Hasteur: - If it can be done. Sure.--BabbaQ (talk) 10:03, 7 June 2020 (UTC)
Coding... I did a quick sample/proof of concept of going in and reviewing pages for eligibility; here's a random sampling of pages that appear to be eligible. Adding the template to the talk page is easy compared to unwinding the list (a sketch of that tagging step follows the sample below).
- Talk:Pihtsusköngäs
- Talk:Otto Evens
- Talk:Royal Pawn (Denmark)
- Talk:Nyhavn 1
- Talk:Nyhavn 31
- Talk:Nyhavn 11
- Talk:Ludvig Ferdinand Rømer
- Talk:Nyhavn 51
- Talk:Eva Eklund
- Talk:Nyhavn 18
- Talk:Hostrups Have
- Talk:Nyhavn 12
- Talk:Nyhavn 20
- Talk:Verrayon House
- Talk:Lis Mellemgaard
- Talk:Sophia Bruun
- Talk:Inger-Lena Hultberg
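If consensus holds, the tagging step itself is only a few lines; a minimal sketch, assuming the eligibility list is already built and that {{WPEUR10k}} needs no parameters:

```python
# Minimal sketch: prepend {{WPEUR10k}} to each eligible talk page,
# skipping pages that already carry it. The eligible list comes from
# the proof-of-concept sampling above.
import pywikibot

site = pywikibot.Site("en", "wikipedia")
eligible = ["Talk:Pihtsusköngäs", "Talk:Otto Evens"]  # etc., from the sample

for title in eligible:
    page = pywikibot.Page(site, title)
    if "WPEUR10k" in page.text:
        continue  # already tagged
    page.text = "{{WPEUR10k}}\n" + page.text
    page.save(summary="Tagging article listed at the WikiProject Europe 10,000 Challenge")
```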
@BabbaQ: Was there a consensus discussion about applying {{WPEUR10k}} to these talk pages? I suspect this isn't controversial, but it might be needed when I go to file the BRFA. Hasteur (talk) 19:45, 7 June 2020 (UTC)
- @Hasteur: - I did the request based on this being uncontroversial. A few years back the template was added to all new articles joining the projects, and I was surprised to notice that was not done anymore. BabbaQ (talk) 08:11, 8 June 2020 (UTC)
- BRFA filed. Done. It's doing a first round of adding the templates. Hasteur (talk) 17:10, 21 June 2020 (UTC)
Removal of "w:en:" prefix from wikilinks
About 106 pages in article and template space contain wikilinks that begin with w:en:, which is redundant, and VPT consensus was that this extra code can interfere with various tools and scripts that expect links to be in a certain form. Would it be possible for an AWB-wielding editor to go through and remove those prefixes, at least in article space? The edits in template space would need manual inspection to see if they are intentional for some reason. Pinging @Redrose64, Trialpears, Xaosflux, Johnuniq, and BrownHairedGirl:, who attended that VPT discussion. – Jonesey95 (talk) 03:55, 1 May 2020 (UTC)
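The substitution itself is tiny; a sketch of the article-space pass follows (the exact pattern is an assumption to refine, and as noted just below, piped links then need a genfixes-style consolidation pass):

```python
# Sketch: strip redundant "w:en:" (with or without a leading colon)
# from wikilink targets. Piped duplicates like [[Foo|Foo]] left behind
# still need a second, genfixes-style pass.
import re

PREFIX = re.compile(r"\[\[\s*:?\s*w\s*:\s*en\s*:\s*", re.IGNORECASE)

def strip_prefix(wikitext: str) -> str:
    return PREFIX.sub("[[", wikitext)

print(strip_prefix("See [[w:en:Apollo 11|Apollo 11]] and [[:w:en:Moon]]."))
# -> See [[Apollo 11|Apollo 11]] and [[Moon]].
```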
- @Jonesey95, I'm doing it now for article space only. --BrownHairedGirl (talk) • (contribs) 04:36, 1 May 2020 (UTC)
- I have completed a first pass, in these 87 edits.
- In that run I turned off genfixes, so that I could focus clearly on this precise issue. Some of the links fixed were of the form [[w:en:Foo]], which has now been changed to [[Foo]]. That's fine ... however, many of the links were of the form [[w:en:Foo|Foo]], and that first run has left them as [[Foo|Foo]], which needs to be consolidated as [[Foo]]. So I will do a second run through the set, just applying genfixes. --BrownHairedGirl (talk) • (contribs) 05:08, 1 May 2020 (UTC)
- Second pass complete, in these 70 edits. A further 17 pages needed no genfixes at all, so were skipped. Note that some of the pages had only genfixes unrelated to the first pass.
- That leaves only the 19 pages in template-space with wikilinks that begin with w:en:. I will leave to others the manual inspection and possible cleanup of those templates. @Jonesey95, Redrose64, Trialpears, Xaosflux, and Johnuniq: do any of you want to do the templates? --BrownHairedGirl (talk) • (contribs) 05:31, 1 May 2020 (UTC)
- I got all of the trivial ones for you. The remainder either require advanced permissions, or are Template:En-WP attribution notice, which appears to have the w:en: on purpose. The Squirrel Conspiracy (talk) 06:15, 1 May 2020 (UTC)
- I did a couple of "w:en:" removals but left it in these:
- Template:Non-free symbol (it specifically says "the English-language Wikipedia")
- Template:En-WP attribution notice (does it need ":" in case the parameter is File/Category? any other bad side effects?)
- Template:Checkip ("intentional so that the links work in m:Special:CentralAuth")
- Template:Checkuser ("intentional so that the links work in m:Special:CentralAuth")
- My thanks to JJMC89 for the information regarding the last two.
- Johnuniq (talk) 07:49, 1 May 2020 (UTC)
- Thanks, all! I did not search other namespaces initially, but there are apparently 100,000+ instances across all namespaces. Many are in user signatures and other things that should not be modified, but detail-oriented editors may find links worth changing in some namespaces. – Jonesey95 (talk) 13:27, 1 May 2020 (UTC)
Updating DANFS links to ship articles
While working on a stub recently, I noticed the US Navy's Naval History and Heritage Command has updated the syntax of links to entries in the important reference Dictionary of American Naval Fighting Ships. This means that many outside links to the dictionary and tools like Template:DANFS (which is transcluded on hundreds if not thousands of US Navy ship articles) now have incorrect html targets. Here are three examples of repairs I've performed personally. As those examples reveal, the new webpage structure isn't complicated and while I suppose I could go through all the articles by hand and rapidly improve my edit count, this is exactly the sort of thing that an automated performer of edits would be best to solve. I've never before requested a bot, so I'm asking meekly for advice. BusterD (talk) 15:54, 2 May 2020 (UTC)
- BusterD, what is the change in syntax between the old URL and the new one? Primefac (talk) 16:00, 2 May 2020 (UTC)
- Thanks for the speedy reply. As I mouseover the links I created in my request, I see
1. the site is now secure ("https", not "http")
2. after the page address www.history.navy.mil/ they've added a new location for the entire collection: "research/histories/ship-histories/"
3. the new addresses all end in .html, not .htm
4. In addition, they've changed the reference structure so that the page link no longer directs to a sub-page; for example, in the USS Minnesota example, the old link referenced the 11th "m" page, rendered as "m11". The new link just uses "m".
- The first three are simply direct replacement edits (copy and paste); the fourth one requires the deletion of ANY digit or digits directly following the only letter in that sector of the address. Does that make sense? I'm certain my use of terminology is inexpert. BusterD (talk) 16:15, 2 May 2020 (UTC)
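Mechanically, the four changes reduce to one URL rewrite; a sketch follows, with the old path layout assumed from the example above. Given the inconsistencies noted later in this thread, each rewritten URL should be checked for a 200 response before saving.

```python
# Sketch of the DANFS URL rewrite described above, e.g.
#   http://www.history.navy.mil/danfs/m11/minnesota.htm
#     -> https://www.history.navy.mil/research/histories/ship-histories/danfs/m/minnesota.html
# The old path layout is an assumption; verify that rewritten links
# resolve before saving, since the new site is not fully consistent.
import re

OLD = re.compile(r"http://www\.history\.navy\.mil/danfs/([a-z])\d*/([\w-]+)\.htm\b")
NEW = r"https://www.history.navy.mil/research/histories/ship-histories/danfs/\1/\2.html"

def rewrite(text: str) -> str:
    return OLD.sub(NEW, text)

print(rewrite("http://www.history.navy.mil/danfs/m11/minnesota.htm"))
```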
- I've converted your list to numbers just for my own ease of use. I've got a few other projects I'm working on, but I'll take a look if and when I can.
- Small update, looks to me from a quick LinkSearch that we're looking at around 18k links to http://www.history.navy.mil. Primefac (talk) 16:21, 2 May 2020 (UTC)
- Thanks for the refactoring. I suspected the number of entries must be large. Testing the success of the first few attempts would be a simple matter. Thanks for any help you can offer. Perhaps there are some MilHist or WPShips people who'd do this, but as opposed to starting a talkpage discussion, I just thought I'd request automated help. I'd be glad to monitor or help in any way necessary. Coding isn't my thing. BusterD (talk) 16:26, 2 May 2020 (UTC)
- The major part of the change isn't recent - see the earlier discussion, where the finding was that the changes weren't completely consistent so couldn't easily be done automatically. Nigel Ish (talk) 20:41, 2 May 2020 (UTC)
- Thanks for linking that discussion. BusterD (talk) 21:55, 2 May 2020 (UTC)
- A lot of the links have been fixed manually by users in the meantime as part of normal editing.Nigel Ish (talk) 09:13, 3 May 2020 (UTC)
- Many of my criticisms in that old discussion stand but a few of the historical documents have come back. Some of those are not in the original report format preserving all context but in new transcribed form. Meanwhile the Vandals (Homeland Security with no interest in "Service history"?) sacked the USCG Historian's site with the old cutter histories "burned" and instead of a good index a pile of "stuff" one has to click through in hopes of finding what was once well organized. (Fingers crossed Army holds the anti Vandal defense line!) One has to realize providing excellent historical libraries for the public (that paid for everything) is not high on the mission priority or budget list and contracting out has eliminated subject matter expert librarians from intimate involvement and oversight regarding on line collections. Palmeira (talk) 15:04, 3 May 2020 (UTC)
"2019–20 coronavirus pandemic" title changing
As the result of a move request, 2019–20 coronavirus pandemic was moved to COVID-19 pandemic. Unfortunately, there are a metric tonne of articles (and templates and categories) that have "2019–20 coronavirus pandemic" (or "2020 coronavirus pandemic") in the name. Accordingly, it would be appreciated if we could get a bot that would move all of these to the consistent "COVID-19 pandemic" name. This matter was briefly discussed in the move request, with unanimous support for consistency, and it's quite obvious that all these titles should be in line with the main article, named so only because of the previous name.
While this is a one-time request, I believe this is too time-consuming with AWB as these are title changes. But happy to be told otherwise. -- tariqabjotu 03:14, 4 May 2020 (UTC)
- If implemented, can appropriate rcats please be added via the bot, as outlined at User:J947/sandbox/16. — J947 03:22, 4 May 2020 (UTC)
- As a note, it seems like enough people are attempting to do this manually that this may not be necessary as a bot. But, I'll leave this up anyway. -- tariqabjotu 03:31, 4 May 2020 (UTC)
- I'd like to see this done through a bot, just so we don't miss any and save ourselves some work. I started some discussion about general implementation here. {{u|Sdkb}} 04:40, 4 May 2020 (UTC)
- Care should also be taken to ensure all talk page archives (or any other subpages if they exist) are moved. Nil Einne (talk) 06:17, 4 May 2020 (UTC)
- Per an intitle search, it looks like all the relevant pages have been moved already. Galobtter (pingó mió) 23:25, 4 May 2020 (UTC)
- Wow, well done, gnomes! I found a few stragglers. Template:2019–20 coronavirus pandemic data/styles.css; Template:2019–20 coronavirus pandemic data/Benin medical cases chart; Template:Territories affected by the 2019-20 coronavirus pandemic; Template:2019–20 coronavirus pandemic data/India/Punjab medical cases chart; Charitable activities related to the 2019-20 coronavirus pandemic; maybe 2020 Philippine coronavirus testing controversy. – Jonesey95 (talk) 00:02, 5 May 2020 (UTC)
- Yes, it was mostly taken care of with AWB and the MassMove tool. Template:2019–20 coronavirus pandemic data/styles.css just needs to be deleted (done). Template:2019–20 coronavirus pandemic data/Benin medical cases chart was recently created; I moved it. Template:Territories affected by the 2019-20 coronavirus pandemic is a deprecated template that should just be deleted. Template:2019–20 coronavirus pandemic data/India/Punjab medical cases chart should just be a redirect; I reverted a change that removed it. Charitable activities related to the 2019-20 coronavirus pandemic was missed because it didn't have an endash; it's been moved. 2020 Philippine coronavirus testing controversy... yeah, I'm going to leave that alone. -- tariqabjotu 03:25, 5 May 2020 (UTC)
- If someone wants to take on the task of a bot that tags a few hundred COVID-19 categories for moving, by all means; there is a long list at Wikipedia:Categories for discussion/Speedy#Current requests. -- tariqabjotu 03:43, 5 May 2020 (UTC)
Checking Category:Wikipedia images in SVG format
- Namespace: File:
Hello!
I checked this category, which is for so-called valid SVG files tagged with {{Valid SVG}}; however, I noticed that many were in fact invalid. I would like a bot to check all files in the category to see if they are actually valid or if the files are mistagged. Steps:
1. Check if the file is valid at http://validator.w3.org/check?uri=http:{{urlencode:{{filepath:{{#titleparts:{{PAGENAME}}}}}}}}; if yes, ignore; if no, see step 2.
2. Replace {{Valid SVG}} with {{Invalid SVG|<number of errors>}}; the <number of errors> can be retrieved from the same validator page.
Pinging @JJMC89: who is familiar with the File: namespace.
I think this is quite important to do, since probably hundreds of files are now lying about their validity, which isn't good.
Thanks! Jonteemil (talk) 07:21, 7 May 2020 (UTC)
- @Jonteemil: I looked at doing this but couldn't find an API that matches that validator. There is one for https://validator.w3.org/nu/ that I could use though. — JJMC89 (T·C) 01:25, 28 May 2020 (UTC)
- @JJMC89: I don't think using another validator should be a problem, as long as they both output the same amount of errors. If that's not the case, I think the valid/invalid SVG templates should be updated with the new validator as well.Jonteemil (talk) 20:56, 29 May 2020 (UTC)
- @JJMC89 and Jonteemil: Both validators shared the same number of warnings/errors for a few files I put through them, which makes sense, because, well, they're following the same spec to validate off. That being said, whilst there is a nice, easy API to use for the nu validator, it's still possible to use the old validator just by parsing the HTML it outputs - although that'd be slower to run and a bit more of a pain. Naypta ☺ | ✉ talk page | 21:06, 29 May 2020 (UTC)
- @Jonteemil and Naypta: They don't always give the same errors. File:NuclearPore.svg: 60 for check vs 4 for nu. Yes, the HTML could be parsed, but I'm not going to do it, especially when I can get JSON from nu. — JJMC89 (T·C) 01:39, 30 May 2020 (UTC)
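Given that, here is a sketch of the per-file check against the nu validator's JSON API (the User-Agent and error counting are assumptions; counts can differ from the old check validator, as JJMC89's example shows):

```python
# Sketch: count validation errors for a file URL via the nu validator's
# JSON output. 0 errors -> keep {{Valid SVG}}; otherwise switch the tag
# to {{Invalid SVG|<count>}}.
import json
import urllib.parse
import urllib.request

def count_errors(file_url: str) -> int:
    api = ("https://validator.w3.org/nu/?out=json&doc="
           + urllib.parse.quote(file_url, safe=""))
    req = urllib.request.Request(api, headers={"User-Agent": "svg-check-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        messages = json.load(resp)["messages"]
    return sum(1 for m in messages if m.get("type") == "error")

# count_errors("https://upload.wikimedia.org/wikipedia/commons/x/xx/Example.svg")
```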
Enlist help to clear Category:Harv and Sfn no-target errors
- Fetch all articles in Category:Harv and Sfn no-target errors
- Compile a list of who created what article
- Compile a list of which Wikiproject covers what article
- Send each user and each WikiProject a personalized report about which articles they created have errors in them, e.g.
== List of your created articles that are in Category:Harv and Sfn no-target errors ==
A few articles you created are in need of some reference cleanup. Basically, some short references created via {{tl|sfn}} and {{tl|harvnb}} and similar templates have missing full citations or have some other problems. This is ''usually'' caused by copy-pasting a short reference from another article without adding the full reference, or because a full reference is not making use of citation templates like {{tl|cite book}} (see ]) or {{tl|citation}} (see ]). See ].
To easily see which citation is in need of cleanup, you can check ''']''' to enable error messages ('''Svick's script''' is the simplest to use, but '''Trappist the monk's script''' is a bit more refined if you're interested in doing deeper cleanup). The following articles could use some of your attention
{{columns-list|colwidth=30em|
#]
#]
...
}}
If you could add the full references to those articles, that would be great. Again, the easiest way to deal with those is to install Svick's script per ]. If after installing the script, you do not see an error, that means it was either taken care of, or was a false positive, and you don't need to do anything else. Also note that the use of {{para|ref|harv}} is no longer needed to generate anchors. ~~~~
- Skip user talk pages with links to "List of your created articles that are in Category:Harv and Sfn no-target errors" in headers, since they already have such a report
Headbomb {t · c · p · b} 23:18, 18 May 2020 (UTC)
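A sketch of steps 1 and 2 follows (the per-user and per-WikiProject delivery would reuse the boilerplate above, sent in the batches discussed below); it assumes pywikibot, with the page's oldest revision taken as the creator:

```python
# Sketch of steps 1-2: group articles in the error category by creator,
# ready to build one personalised report per user.
from collections import defaultdict
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("en", "wikipedia")
cat = pywikibot.Category(site, "Category:Harv and Sfn no-target errors")

by_creator = defaultdict(list)
for page in pagegenerators.CategorizedPageGenerator(cat, namespaces=[0]):
    creator = page.oldest_revision.user  # may be an IP or a renamed account
    by_creator[creator].append(page.title())

for user, titles in sorted(by_creator.items(), key=lambda kv: -len(kv[1])):
    print(user, len(titles))  # feed into the report template, in batches
```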
- I think the message needs to provide a link to a discussion page where people can go for help. Keep in mind that most requests for help will be of the form "What is this message? I didn't do anything or ask for this. I don't understand it. Help me." – Jonesey95 (talk) 23:31, 18 May 2020 (UTC)
- Agree it would be a good idea to point to a help page. Where would that be? Help talk:CS1 perhaps? Headbomb {t · c · p · b} 00:21, 19 May 2020 (UTC)
- Maybe Template talk:Sfn? If this goes through, I'd like to see these messages go out in batches, in case a potential help system (run by you and me, presumably) gets a lot of traffic. – Jonesey95 (talk) 01:44, 19 May 2020 (UTC)
- Doesn't really matter much to me where things go. Module talk:Footnotes could be a place. Messages could be sent in batches too. Maybe top 25 users, then next 25, and so on each day for the first week. And then see what the traffic is and adjust rates if it's nothing crazy. Headbomb {t · c · p · b} 11:28, 19 May 2020 (UTC)
- Maybe Template talk:Sfn? If this goes through, I'd like to see these messages go out in batches, in case a potential help system (run by you and me, presumably) gets a lot of traffic. – Jonesey95 (talk) 01:44, 19 May 2020 (UTC)
- Agree it would be a good idea to point to a help page. Where would that be? Help talk:CS1 perhaps? Headbomb {t · c · p · b} 00:21, 19 May 2020 (UTC)
- @Headbomb: as an intermediate step, I think it would be helpful to include articles in the category in Bamyers99's WikiProject Cleanup Listings. – Finnusertop (talk ⋅ contribs) 00:16, 17 June 2020 (UTC)
- I have added both categories to the WikiProject Cleanup Listings. They will appear in the next run on June 23. --Bamyers99 (talk) 01:24, 17 June 2020 (UTC)
A bot to add missing instances of padlocks
Following up from this conversation, I think it would be helpful to have a bot automatically apply the appropriate padlock icon to pages after they become protected. {{u|Sdkb}} 09:39, 21 May 2020 (UTC)
- Worth noting that TheMagikBOT 2 previously had a successful BRFA to do this, but no longer appears to be functional. If there's consensus that it's still a good idea, I'm happy to make this task 3 for Yapperbot - it's not that hard to do. Naypta ☺ | ✉ talk page | 09:43, 21 May 2020 (UTC)
- I think it's still a good idea. The Squirrel Conspiracy (talk) 15:15, 21 May 2020 (UTC)
- courtesy pinging Redrose64 who I see was involved in the previous discussion Sdkb linked Naypta ☺ | ✉ talk page | 15:17, 21 May 2020 (UTC)
Could I also mention that it would be useful to have a bot which fixes incorrect protection templates? MusikBot removes incorrect ones, as I pointed out here, but it doesn't replace them (and could be the cause of some of these missing templates). This seems like a related subject. RandomCanadian (talk / contribs) 18:29, 21 May 2020 (UTC)
- MusikBot is capable of fixing incorrect protection templates, that feature just didn't get through the BRFA. I am willing to give it another push though, if there's demand for it. Similarly it can apply missing protection templates, I just didn't enable that since there was talk to revive the Lowercase sigmabot task that did this and we didn't want the bots to clash. When that didn't happen, TheMagickBOT came through, but alas it has retired now too. I don't mind one way or the other, so if Naypta wants to take it on don't let me stop you, just know that the code is largely written in MusikBot. — MusikAnimal 18:49, 21 May 2020 (UTC)
- @MusikAnimal: If the code's already written in MusikBot, seems to me to make a whole lot more sense to just push to use MusikBot for it then if there's consensus to do this now - the lazier I can be, the better! Naypta ☺ | ✉ talk page | 18:52, 21 May 2020 (UTC)
- It definitely makes sense to have one bot do all the work regarding protection templates rather than a hodge podge of different bots. Galobtter (pingó mió) 18:45, 23 May 2020 (UTC)
- Yeah, I've been doing a fair amount of batch protects now (which are easier using p-batch or manually using the mediawiki interface in some cases). I'm not mass adding the ECP template though, as that seems like a waste of my time for nice but optional templates. Anyway, I thought this was still happening via another bot, so add a +1 to bring some bot back to do it (cc: MusikAnimal if you're still willing to give this a go). TonyBallioni (talk) 18:57, 7 June 2020 (UTC)
- Insert reference to ancient task on the point. --Izno (talk) 19:32, 7 June 2020 (UTC)
Coding... — MusikAnimal 17:34, 8 June 2020 (UTC)
- @Sdkb, RandomCanadian, Naypta, TonyBallioni, and Izno: Code is largely ready. A few questions:
- Should the bot add templates to fully-protected pages, too? The bot will be exclusion compliant, so if admins for whatever reason didn't want to advertise full protection they can use the {{bots}} template to keep MusikBot out (MusikBot II to be precise, since it already has admin rights).
- Should it do this for "move" and "autoreview" (pending changes) actions in addition to "edit"? Cyberbot II used to handle PC-protected pages but it appears that task has been disabled.
- I'm going to hold off on fixing existing protection templates for the time being, just to keep it simple. We'll get to that with a follow-up BRFA. — MusikAnimal 21:06, 9 June 2020 (UTC)
- MusikAnimal, adding it for full-protected pages sounds fine to me. I'm not sure about move and autoreview. Thanks for working on this! {{u|Sdkb}} 21:31, 9 June 2020 (UTC)
- Be cognizant of different content types e.g. CSS/sanitized CSS. --Izno (talk) 14:51, 10 June 2020 (UTC)
BRFA filed — MusikAnimal 01:04, 11 June 2020 (UTC)
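While the BRFA runs, the detection half of the task looks roughly like the sketch below; the template mapping is a simplifying assumption, and a production bot must be exclusion-compliant and handle non-wikitext content models, per the discussion above.

```python
# Sketch: scan the protection log for pages lacking a padlock template
# and add one. The template mapping is an assumed simplification; a real
# bot must honour {{bots}} exclusion and special content models.
import pywikibot

site = pywikibot.Site("en", "wikipedia")
TEMPLATE = {"sysop": "{{pp|small=yes}}",
            "extendedconfirmed": "{{pp-extended|small=yes}}",
            "autoconfirmed": "{{pp-semi-indef|small=yes}}"}

for entry in site.logevents(logtype="protect", total=50):
    page = pywikibot.Page(site, entry.page().title())
    if not page.exists() or page.content_model != "wikitext":
        continue
    edit_prot = page.protection().get("edit")   # e.g. ("sysop", "infinity")
    tpl = TEMPLATE.get(edit_prot[0]) if edit_prot else None
    if tpl and not page.text.lstrip().startswith("{{pp"):
        page.text = tpl + "\n" + page.text
        page.save(summary="Adding protection template (sketch)")
```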
A bot to develop a mass of short stubs and poorly built articles for Brazilian municipalities
I propose a bot along the lines of {{Brazil municipality}} be created to develop our stubs like Jacaré dos Homens, which have been lying around for up to 14 years in some cases. There are 5,570 municipality articles, mostly poorly developed or inconsistent in data and formatting even within different states. A bot would bring much-needed information and consistency to the articles and leave them in a half-decent state for the time being. Igaci, which Aymatth2 expanded, is an example of what is planned and of what would happen to stubs like Jacaré dos Homens. Some municipalities have infoboxes and some information, but hopefully this bot will iron out the current inconsistencies and dramatically improve the average article quality. It would be far too tedious to do manually and would take years, and they've already been like this for up to 14 years! So support on this would be appreciated. † Encyclopædius 12:09, 20 May 2020 (UTC)
- Support. See User talk:Aymatth2#Brazil municipalities for more discussion. It looks like a straightforward (but far from trivial) screenscraper. Aymatth2 (talk) 12:35, 20 May 2020 (UTC)
- @Encyclopædius: and @Aymatth2: Where's the community-endorsed consensus from WikiProject Brazil/WikiProject Latin America/Village Pump? Where's your driver list of proposed articles? How are you proposing to improve the pages so that these aren't perma-stubs with no chance at improvement? Per WP:FAIT and WP:MASSCREATION it's expected that there will be a very large and well-attended consensus that this bulk creation is appropriate. In short, Not a good task for a bot. Table this until you have a community consensus in hand, as very few bot operators will roll the dice on doing this task in exchange for having their bot revoked. Hasteur (talk) 17:40, 23 June 2020 (UTC)
Updating the dates on the maps on COVID-19 pandemic
Can someone create a bot that will look at the latest date of the maps when the maps are updated and update the date automatically? I tried putting in the TODAY template, but I was reverted by Boing! said Zebedee, who said it would not work. I was hoping someone could work on a bot to save editors' time updating the dates on the maps. Interstellarity (talk) 19:45, 26 May 2020 (UTC)
- No, the "as of" dates should be updated only when the actual data is updated, not any time the map file is updated (which could be for many reasons). Do we update the "as of" if someone adjusts the colour of a map? No. Do we update it if someone modifies a geographical border? No. We would only do it when a map is updated to reflect new data - and I can't think of how that could be done other than manually. Incidentally, I reverted your use of TODAY as it's obviously wrong for every map to say it's up to date as of today. Boing! said Zebedee (talk) 19:52, 26 May 2020 (UTC)
- This seems like a doable task. I'm not sure if it's for a bot so much as a template, though. I imagine that it would work similarly to
{{Cases in the COVID-19 pandemic|date}}
, fetching a value that would be stored at the Commons file and updated by the map updater whenever they upload a new version. As an aside, thank you, Interstellarity, for all the work you've put in updating map date captions; I recognize it's a tedious task. {{u|Sdkb}} 19:59, 26 May 2020 (UTC)
- Some kind of template like that might work, but whoever updates the map would still have to update the data field at the Commons file manually - it couldn't just use the upload date as the "as of" date. Boing! said Zebedee (talk) 20:03, 26 May 2020 (UTC)
- While we're here, if anyone wants to work on a bot for updating maps themselves, that's something that ought to be done at some point, but I imagine it'll be a much more complex task. Still, we have the data stored in templates already, so it'd just need to be mapped onto the various maps. It could help with some of the standardization we've been discussing at the WikiProject. {{u|Sdkb}} 20:05, 26 May 2020 (UTC)
- This is a very good idea. Currently, for the maps File:COVID-19 Outbreak World Map per Capita.svg and File:COVID-19 Outbreak World Map.svg, the date is accessible in the first sentence, e.g. "Map of the COVID-19 verified number of infected per capita as of 28 May 2020.". It takes me a lot of time to then go modify the date on every page, and even more so across the many languages. We would gain a lot by having a way of entering this value only once. Raphaël Dunant (talk) 15:15, 28 May 2020 (UTC)
- @Sdkb and Raphaël Dunant: It looks like you two might be talking about different things - either that or I'm misunderstanding one or both of you. Raphaël, it sounds like what you want is basically just an AWB run for pages that contain the map, to replace the associated date when appropriate; Sdkb, it sounds like what you're after is software that constructs the actual map. Both of these things are eminently possible; the world map SVG is such that making a bot to update the colours from a dataset given a scale ought to be trivial. That being said, if that bot wanted to update the Commons file, it would need to be a Commons bot, not a bot on enwiki. Let me know if I've got what you're both looking for wrong though! Naypta ☺ | ✉ talk page | 16:06, 28 May 2020 (UTC)
- Update: actually, on looking, seems there's already a bot that's making a version of the map! See c:File:Covid19DataBot-Case-Data-World.svg and c:User:Covid19DataBot. Naypta ☺ | ✉ talk page | 16:08, 28 May 2020 (UTC)
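(For anyone exploring the recolouring idea above, a rough sketch. It assumes, purely for illustration, that each country shape in the SVG carries its ISO code as an id and a plain fill attribute; the real Commons file is not necessarily structured that way:)

```python
import re

# hypothetical colour scale: (minimum cases, fill colour)
SCALE = [(10000, "#67001f"), (1000, "#d6604d"), (100, "#f4a582"), (0, "#fddbc7")]

def colour_for(cases):
    for threshold, colour in SCALE:
        if cases >= threshold:
            return colour
    return "#e0e0e0"  # no data

def recolour(svg_text, cases_by_iso):
    for iso, cases in cases_by_iso.items():
        colour = colour_for(cases)
        # rewrite the fill of the path whose id is the ISO code (an assumption)
        svg_text = re.sub(rf'(<path[^>]*id="{iso}"[^>]*fill=")[^"]*(")',
                          rf'\g<1>{colour}\g<2>', svg_text)
    return svg_text
```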
- This bot request is about updating dates automatically, not about the map colour. But it would be nice to adapt the map bot for the COVID-19 map; @Sdkb: if you can open a discussion about this subject, I'll happily participate. @Naypta: if you could explain how to automatically update dates, I'll be delighted! Raphaël Dunant (talk) 16:42, 28 May 2020 (UTC)
- @Raphaël Dunant: So one option here might be using the {{Wikidata}} template to pull in a record from Wikidata. That would mean that you could replace each iteration of the date with
{{wikidata|qualifier|Q81068910|P1846|P585}}
, which produces the date stored there - and you'd then only have to update the single point-in-time qualifier on Wikidata (wikidata:Q81068910#P1846) for it to update on all wikis. I've had a chat with a couple of admins about this and the general consensus is that it's okay to do performance-wise, but be careful with how you use this - using the wikidata template in this way can be taxing on the server, so try to use it as few times as you can! If you're happy with that method, I can run through and update the relevant bits on enwiki - you'll know better than I will where the bits are on the other wikis. Naypta ☺ | ✉ talk page | 18:26, 28 May 2020 (UTC)
- @Naypta: The main places where the date is needed are, in English Misplaced Pages, COVID-19 pandemic and COVID-19 pandemic by country and territory (performance-wise, it's a total of ~1.5 million page views per week). How doable is the use of this bot in other languages? As of now, there are 56 different languages using the map, with the date to update on each of them. Raphaël Dunant (talk) 10:36, 30 May 2020 (UTC)
- @Raphaël Dunant: Well, this would be a way of doing it without a bot. The {{wd}} template pulls directly from Wikidata, so there's no need for a bot to update the page wikitext then. Assuming that the other wikis also have a similar template for Wikidata, which I think most do, they'd be able to use the same code. I will just ping in here the creator of the template, Thayts - do you think it'd be okay to use this method on high traffic pages in this way? The general consensus I've had seems to be "yes", but I've not spoken to anyone directly involved in the Wd module. Naypta ☺ | ✉ talk page | 13:52, 30 May 2020 (UTC)
- Sure, why not. :) Thayts ••• 15:20, 31 May 2020 (UTC)
- Awesome! Raphaël Dunant, if you're happy with this solution, I can get it working on the relevant pages on enwiki at least. I can also have a crack at the other language wikis - it's clear that the template is available on the other wikis too, so this kind of a centralised approach should work. The only problem might come in terms of needing to purge the page caches when the Wikidata item changes - but that should happen when any part of the page changes anyway, and can be done manually if need be. Naypta ☺ | ✉ talk page | 16:34, 31 May 2020 (UTC)
- @Naypta: Thank you very much for the solution! I applied it to the English Misplaced Pages pages. It would be amazing if you can apply it here and on other Wikipedias, as I am not quite sure on how to apply the template to Commons and other languages. Thanks again, I hope this solution works. Raphaël Dunant (talk) 19:00, 31 May 2020 (UTC)
- @Raphaël Dunant: Doing... - just FYI, to make it compatible for inclusion on Commons and on some other language Wikipedias, I've changed the Wikidata page it links into. It's now wikidata:Q95963597 - so when updating the date, update it on the P585 "point in time" property there, and it'll update everywhere else automatically. Naypta ☺ | ✉ talk page | 20:02, 31 May 2020 (UTC)
- @Naypta: The solution works well for most pages, thanks! However, it does not automatically update this page, which is problematic (maybe because the date is updated only when there is a page update?). Do you have any solution to make it update this page as well? Raphaël Dunant (talk) 22:01, 1 June 2020 (UTC)
- @Raphaël Dunant: Sure thing. So the cache expires on the sooner of the next edit, a manual purge being requested, or seven days from the last cache time. I've manually purged the cache of that page, and you can see it's now updated, but you can also purge it at this link whenever you like. You may wish to do so after updating the Wikidata item - just click that link and then click "yes", it'll automatically update the date :) Naypta ☺ | ✉ talk page | 22:16, 1 June 2020 (UTC)
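(For reference, that manual purge can also be scripted: the API's standard action=purge module does the same thing as the purge link. A minimal sketch against the two enwiki pages named above:)

```python
import requests

# action=purge must be sent as a POST; forcelinkupdate also refreshes link tables
resp = requests.post("https://en.wikipedia.org/w/api.php", data={
    "action": "purge",
    "titles": "COVID-19 pandemic|COVID-19 pandemic by country and territory",
    "forcelinkupdate": 1,
    "format": "json",
})
print(resp.json())
```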
DYKN image resize bot
Greetings. At WP:DYKN, the image size is based on the orientation of the image; vertical images at 120px, square at 140, and horizontal at 160. However, there is no way to set the resolution during nomination, which means that even experienced editors often forget to fix the size of the image, and new editors don't know that they should.
I am proposing that a bot do a daily check and update the resolution where needed. In order to cut down on the amount of resources required, it needs only look at recent additions.
It would, I'm guessing, work something like this:
- Generate a list of all DYK nominations added to Template talk:Did you know since the task was last run. (It can't use the nomination date because there's a 7-day window to nominate.)
- Determine if they contain {{main page image}}.
- For nominations where that template is present, determine the aspect ratio of the image.
- If the ratio is between 5:6 and 6:5, change the field width= from 120 to 140.
- If the ratio is greater than 6:5, change the field width= from 120 to 160.
Sincerely, The Squirrel Conspiracy (talk) 00:31, 7 June 2020 (UTC)
- Since nominations can be reviewed quite quickly and moved to the Template talk:Did you know/Approved page, the bot would need to check there as well. While the main Nominations page has a "Current nominations" section comprising the current date and the previous seven days—this is updated at midnight every day—the Approved page doesn't have an equivalent section. Depending on how often it runs, the bot may need to check earlier on the page, because the dates are not when the nomination was added, but rather when work on the article began, which is supposed to be no more than seven days before nominating. (But is sometimes a little late.) BlueMoonset (talk) 01:44, 7 June 2020 (UTC)
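(A rough sketch of the ratio check proposed above; the standard imageinfo API returns dimensions without downloading the file. Finding the nominations, editing the width= field, and the Approved-page handling BlueMoonset mentions are all left out:)

```python
import requests

def image_dimensions(file_title):
    # prop=imageinfo with iiprop=dimensions returns width/height only
    r = requests.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query", "titles": file_title,
        "prop": "imageinfo", "iiprop": "dimensions", "format": "json"})
    page = next(iter(r.json()["query"]["pages"].values()))
    return page["imageinfo"][0]["width"], page["imageinfo"][0]["height"]

def dyk_width(width, height):
    ratio = width / height
    if ratio > 6 / 5:
        return 160  # horizontal
    if ratio >= 5 / 6:
        return 140  # roughly square
    return 120      # vertical: the default, so no edit needed

w, h = image_dimensions("File:Example.jpg")
print(dyk_width(w, h))
```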
Remove deprecated parameter "w/l" in Template:CBB schedule entry
There is already an existing score parameter that determines whether a team won or lost a match. The w/l parameter is deemed dubious and redundant; hence, the score parameter should be used to assess the win-loss logic instead.
Scenario description, sample parameter usage, and requested bot action:
- Both w/l and score parameters are empty (|w/l= |score=): remove the w/l parameter usage.
- w/l value is empty, and score value is only a dash (|w/l= |score=- using a minus sign, or |w/l= |score=– using an en dash): remove the w/l parameter usage.
- w/l is either W or L, and score contains dash-separated numbers (|w/l=w |score=100-90 using a minus sign, |w/l=l |score=90–100 using an en dash, |w/l=w |score=, or |w/l=l |score=]): remove the w/l parameter usage.
- w/l is either W or L, and score contains the HTML code &ndash; between the scores (|w/l=w |score=100&ndash;90, |w/l=l |score=90&ndash;100, and the wikilinked equivalents): remove the w/l parameter usage.
- w/l value is either w or l, and score contains any other value or is empty, i.e. the winner/loser of the match is known but the final score is not available (|w/l=w or |w/l=l with |score=Default, |score=Forfeit, or |score=): rename the parameter to status.
- w/l value is p (|w/l=p): rename the parameter to status.
- w/l parameter not found: do nothing.
Let me know if I missed any other scenarios. – McVahl (talk) 07:09, 11 June 2020 (UTC)
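(A rough sketch of the decision logic in the list above; the names and structure are illustrative, not PrimeBOT's actual code:)

```python
import re

DASH = r"(?:-|–|&ndash;)"  # hyphen/minus, en dash, or the HTML code

def requested_action(wl, score):
    """Return 'remove', 'rename' (w/l -> status), or None (leave for a human)."""
    wl, score = wl.strip().lower(), score.strip()
    if wl == "":
        return "remove" if score in ("", "-", "–") else None
    if wl in ("w", "l"):
        # dash-separated numbers, possibly inside a wikilink
        if re.search(rf"\d+\s*{DASH}\s*\d+", score):
            return "remove"
        return "rename"  # winner/loser known, final score not available
    if wl == "p":
        return "rename"
    return None
```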
- With over 7k transclusions, it sounds like this would fall under User:PrimeBOT/Task 30. Won't have time to start it until probably next week, but that has the benefit of allowing for discussion here about any issues with the above, and/or implementation. Primefac (talk) 12:48, 11 June 2020 (UTC)
- Sounds great. Meanwhile, I'll keep updating the above table as necessary. – McVahl (talk) 20:06, 12 June 2020 (UTC)
- McVahl, I've done 25 changes just to check everything's working appropriately - mind taking a look and seeing if I'm screwing anything up too badly? Primefac (talk) 19:52, 19 June 2020 (UTC)
- Hi Primefac, I reviewed all 25 amendments and I don't see any issues. Thanks. – McVahl (talk) 04:44, 20 June 2020 (UTC)
- Primefac, I added a new case where score uses the HTML-based code &ndash; instead of "–" (for example, |score=90&ndash;100). Sorry for the late notice; I only noticed it today when PrimeBot made some edits, as this case was not covered in the initial 25 amendments the other day. – McVahl (talk) 06:27, 21 June 2020 (UTC)
- Thanks, added in. Primefac (talk) 13:54, 21 June 2020 (UTC)
Convert comma separated values into List
Comma-separated values like A, B, C can instead be converted into
- A
- B
- C
or
{{hlist|A|B|C}}
This is usually found in infoboxes. Additionally, values separated by a
<br/>
can also be converted into a list.
I'mFeistyIncognito 16:39, 14 June 2020 (UTC)
- @I'mfeistyincognito: Is there a particular reason for doing this? It looks cosmetic to me, and per WP:FLATLIST, either style is acceptable for the MOS. Naypta ☺ | ✉ talk page | 17:25, 14 June 2020 (UTC)
- @Naypta: I always try to turn the data carried by the infoboxes into a more structured form (I know it'll never get there completely). It would make it easier to export data from infoboxes into WikiData. I'mFeistyIncognito 20:01, 16 June 2020 (UTC)
- This is a context-sensitive task. To give just one example, {{hlist}}, because it uses
<div>...</div>
tags, cannot be wrapped by any tags or templates that use <span>...</span>
tags, like {{nowrap}}. If an infobox wraps a parameter with {{nowrap}}, converting that parameter's contents to use {{hlist}} will lead to invalid HTML output. – Jonesey95 (talk) 22:17, 14 June 2020 (UTC)
- @Jonesey95: You are probably right. Nevertheless, using {{Comma separated entries}} shouldn't be a problem. I'mFeistyIncognito 20:13, 16 June 2020 (UTC)
DYK bot
Can someone make a bot to automatically update Misplaced Pages:List of Wikipedians by number of DYKs, just like Misplaced Pages:List of Wikipedians by featured list nominations and Misplaced Pages:List of Wikipedians by featured article nominations? Thanks. ~~ CAPTAIN MEDUSA 18:37, 15 June 2020 (UTC)
- This is something I'd be interested to work on, if it would be useful. Do you know how this list is currently updated? Pi (Talk to me!) 06:08, 20 June 2020 (UTC)
- Manually, by its participants.
- As an aside, two users on that list are combining totals from old and new accounts. I'm not on the list because I never bothered, but I would also be combining from two accounts. Is there a way for your proposed bot to handle this? The Squirrel Conspiracy (talk) 17:49, 21 June 2020 (UTC)
- That shouldn't be a problem; I'd just have to put the list of all the accounts that need adding up somewhere. I'll look into the feasibility of it tomorrow Pi (Talk to me!) 05:28, 22 June 2020 (UTC)
Coding... - I'm just making the script to get the data. Once that's working I'll look at making the bot to update the table Pi (Talk to me!) 17:53, 22 June 2020 (UTC)
- @CAPTAIN MEDUSA: I've made some progress with getting the list of nominations, and getting the article creators is relatively simple, but I'm not sure where to find the data for who is credited with the expansion of the article or promotion to GA. Does DYK as a process have a policy on this, and is the data recorded anywhere? Pi (Talk to me!) 22:54, 22 June 2020 (UTC)
- Pi, here, but you have to manually search for a user.
- You can also find the user by going through nominations. It would usually say, Created, Improved to GA, Moved to main space, 5x expanded, and nominated by..... ~~ CAPTAIN MEDUSA 23:32, 22 June 2020 (UTC)
- This is coming along OK, I should have a prototype in a couple of days Pi (Talk to me!) 04:32, 23 June 2020 (UTC)
- Category:DYK/Nominations: this category is quite useful. ~~ CAPTAIN MEDUSA 12:26, 23 June 2020 (UTC)
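(A minimal sketch of enumerating that category via the standard categorymembers API; extracting the "created/improved to GA/nominated by" credits from each nomination subpage would still need bespoke parsing:)

```python
import requests

def category_members(cmtitle):
    # list=categorymembers pages through a category 500 titles at a time
    session = requests.Session()
    params = {"action": "query", "list": "categorymembers",
              "cmtitle": cmtitle, "cmlimit": "500", "format": "json"}
    while True:
        data = session.get("https://en.wikipedia.org/w/api.php",
                           params=params).json()
        yield from data["query"]["categorymembers"]
        if "continue" not in data:
            return
        params.update(data["continue"])

for member in category_members("Category:DYK/Nominations"):
    print(member["title"])
```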
Misplaced Pages:Categories for discussion/Archive debates
Let's make a bot that creates each day's page at midnight. 95.49.166.194 (talk) 13:10, 17 June 2020 (UTC)
- Looks like it's mostly ProveIt who does this normally; they've previously mentioned that they have a script to do it that they then copy and paste from. I've pinged them in here - ProveIt, is this botreq something you're interested in having? Naypta ☺ | ✉ talk page | 13:38, 17 June 2020 (UTC)
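(For illustration, a pywikibot-style sketch of such a midnight job, assuming the daily-log title format; the preload template shown is hypothetical and the real boilerplate may differ:)

```python
import datetime
import pywikibot

site = pywikibot.Site("en", "wikipedia")
today = datetime.datetime.utcnow()
title = f"Wikipedia:Categories for discussion/Log/{today:%Y} {today:%B} {today.day}"
page = pywikibot.Page(site, title)
if not page.exists():
    page.text = "{{subst:Cfd log}}"  # hypothetical preload, not the actual boilerplate
    page.save(summary="Creating today's CfD log page")
```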
Remove malformed obsolete Template:Infobox drug field
Per Misplaced Pages talk:WikiProject Pharmacology#Molecular weights in drugboxes, I am requesting bot attention to remove the following regexp line:
/\| *molecular_weight *= *[0-9.]+ *g\/mol\n/
in articles that transclude Template:Infobox drug. There are a few rare variations that I can remove by hand, or that require a manual decision on whether to remove, but this seems to cover the vast majority, and it is a conservative regex for it. This is a one-time cleanup pass that I started doing with WP:JWB before I realized it was possibly the majority of the 12K articles in that transcluders list. DMacks (talk) 19:18, 17 June 2020 (UTC)
- Am I correct in that the parameter itself has not been deprecated, just the usage where a value and units are given? Primefac (talk) 19:38, 17 June 2020 (UTC)
- Mostly correct. The units should not be given with the number...that's a mistake that needs to be fixed. In the majority of cases, even the number does not need to be given (it's a deprecated use-case of the field, not the field deprecated as a whole). One detail I had in my offline note and forgot to paste (yikes! sorry!) is to limit the scope to pages where there is a:
/\| *C *= *\d/
- as those are pages where the value can be automatically calculated, so the field is not needed. In terms of regex, this is almost always on the line immediately preceding the /molecular_weight/ if it would be useful to have a single regex rather than "one regex to filter the pages, another to replace". Rather than simply fixing the units across the board, this is an opportunity to upgrade the usage wherever easily possible. There are a bunch of special cases, where the field contains other than a single number or where the number really does need to be manually specified, but I'm setting those aside for now...once the majority of mindless fixes are done, individual decisions about each remaining case can be made. DMacks (talk) 00:37, 18 June 2020 (UTC)
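(A rough sketch of that filtered pass, combining the two regexes; illustrative only, since the special cases DMacks mentions still need a human:)

```python
import re

HAS_FORMULA = re.compile(r"\|\s*C\s*=\s*\d")
MW_LINE = re.compile(r"\|\s*molecular_weight\s*=\s*[\d.]+\s*g/mol\n")

def clean(wikitext):
    # only touch pages where the mass can be auto-calculated from |C=
    if HAS_FORMULA.search(wikitext):
        return MW_LINE.sub("", wikitext)
    return wikitext
```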
- Might be useful to set up some tracking categories, then; those that don't need the param, and those that need the units removed. Primefac (talk) 00:40, 18 June 2020 (UTC)
- Category:Chem-molar-mass both hardcoded and calculated tracks where the param is redundant (or will, as soon as the job queue catches up), so we can use that rather than looking for "transclusion of {{Infobox drug}} ∧
|C=\d
". DMacks (talk) 05:10, 18 June 2020 (UTC)- ...has stabilized around 5600 pages. Next step is to filter the ones whose field is malformed (mistake to fix) rather than just redundant (deprecated but valid format). DMacks (talk) 14:40, 19 June 2020 (UTC)
- Deferred. I'm JWB'ing it, with a looser regex and manual oversight...manually annoying but still scratches the itch. DMacks (talk) 13:43, 21 June 2020 (UTC)
Populate tracking category for CS1|2 cite templates missing "}}"
Example. Missing "}}" is a not too uncommon problem. They can't be tracked by CS1|2 itself because the template is never invoked. I would caution attempting an automated fix because when "}}" doesn't exist there are often other structural problems, and there might be embedded templates etc.. -- GreenC 15:29, 18 June 2020 (UTC)
- Another problem with automated fixes is that edits that result in unclosed templates often need to be reverted entirely. – Jonesey95 (talk) 16:03, 18 June 2020 (UTC)
- @GreenC: Where would you envisage this putting categorisation markup? Tracking categories are a MediaWiki internal feature that would have to be added by a MediaWiki extension, not a bot. A bot could add a hidden category to the wikitext of the page, it could add articles with issues to a list, or it could tag articles with a maintenance template, like {{Invalid citation template}} perhaps - which could then categorise the page. Naypta ☺ | ✉ talk page | 16:19, 18 June 2020 (UTC)
- That is a good point. I think your idea for {{Invalid citation template}} (or universal {{Invalid template}} or
{{malformed template}}
) is great because it could be visible in the wikitext, produce a red warning message, allow for a tracking cat, and have argument options for the bot name and date, plus whatever future requirements. -- GreenC 17:05, 18 June 2020 (UTC)
- @GreenC: Well, there's my next project then! Let me look into it. Naypta ☺ | ✉ talk page | 17:27, 18 June 2020 (UTC)
- The problem here is going to be that, because there's no invocation of the template, it's tricky to find an appropriate set of pages to check over without doing some god-awful regex search and crashing Elasticsearch in the process. One way to do it might be to implement an edit filter that finds matching regexes and tags them, but I'm not sure if that'd necessarily be the best way. Any thoughts, ideas or suggestions are appreciated! Naypta ☺ | ✉ talk page | 09:16, 19 June 2020 (UTC)
- For other bots, I have a system on Toolforge that downloads a list of all 6 million article titles, then goes through the list; when done, it recreates the list and starts over. It's brute force but effective, and not as terrible as it sounds when running on Toolforge, since the Misplaced Pages servers are on the local LAN. Another possibility is to generate a backlink list for the CS1|2 templates and only target those, which would reduce it down to a few million. I have a unix command-line tool that does both of these (generate the full list, or the backlink list) if you want to use it, on git. -- GreenC 13:48, 19 June 2020 (UTC)
- @GreenC: Going through a backlink list would be easy enough through the API, and there's already a list of all article titles in the DB dumps that are stored on drives accessible through Toolforge anyhow. My concern had been that doing that a) introduces quite a fair bit of server load, and b) seems like it would take about five hundred years to complete - have you found it works better? Naypta ☺ | ✉ talk page | 14:30, 19 June 2020 (UTC)
- Generally speaking, in a shared environment like this, slowing things down is the nicer way, as it doesn't cause a spike in demand. Downloading every article sequentially would be like a 15-30k steady stream, which is a blip on a gigabit LAN. And the CPU/memory to regex a single article is nothing. It's about as low as one can get resource-wise, while running a SQL query across 6 million can cause a resource spike, but it's hidden from view. My guess is 10-15 days to complete 6 million articles based on previous experience. I have processes doing this continually, as do other bots. If there were a way to regex the target articles with Elasticsearch we could try that, but I suspect ES will bail on the query if it's too complex (it limits to 10,000 results, but that should not be a problem here). -- GreenC 15:11, 19 June 2020 (UTC)
- Sure - I'll give it a crack and see what happens. Assuming all goes well, I'll put up a BRFA for the task soon(ish). Naypta ☺ | ✉ talk page | 15:41, 19 June 2020 (UTC)
- BRFA filed - it does a hell of a lot more than just solve this problem, but it definitely solves this problem too! Naypta ☺ | ✉ talk page | 15:31, 20 June 2020 (UTC)
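(For illustration, a brace-counting sketch of the detection step discussed in this thread; the title enumeration around it, and the tagging with {{Invalid citation template}}, are omitted:)

```python
import re

CITE_OPEN = re.compile(r"\{\{\s*cite\s", re.IGNORECASE)

def has_unclosed_cite(wikitext):
    # from each {{cite opening, count {{ / }} pairs; if the braces never
    # return to balance, the template was left unclosed
    for match in CITE_OPEN.finditer(wikitext):
        depth, i = 0, match.start()
        while i < len(wikitext):
            if wikitext.startswith("{{", i):
                depth += 1
                i += 2
            elif wikitext.startswith("}}", i):
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        if depth > 0:
            return True
    return False
```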
wp:SQLREQ COVID 19 data compiler
Could a bot run a SQL query or similar to compile COVID-19 data into an editable data sheet that another (or the same) bot could then import to update the Misplaced Pages COVID-19 pandemic map/graph? — Preceding unsigned comment added by 80.41.138.48 (talk) 16:24, 19 June 2020 (UTC)
- Hi, what database would the bot be querying, and can you elaborate on what you had in mind by an 'editable data sheet'? Also, who would edit this sheet prior to the bot importing it into the map/graph? Pi (Talk to me!) 06:03, 20 June 2020 (UTC)
Finding artwork for missing pages
After editing a lot of music articles that had no album cover in the Template:Infobox_album (Category:Album_infoboxes_lacking_a_cover), I realized that it was a very repetitive process that could be streamlined by having a bot that:
- Checks that the article in Category:Album_infoboxes_lacking_a_cover is not about a single (as many singles don't have their own artwork, looking only at EPs/albums/mixtapes would streamline the process)
- Uses Last.fm's album.getinfo request to obtain the "small" artwork, to abide by the size guidelines regarding uploading album artwork to wikipedia.
- Uploads said cover to the wiki, and edits it into the album infobox
I looked in the rejected ideas and bots, and it seems like none really tried to attack this. My programming knowledge is okay at best, but I couldn't get any of the Java frameworks working so I'm out of luck doing this myself. ⠀TOMÁSTOMÁS⠀TALK⠀ 00:49, 22 June 2020 (UTC)
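(For what it's worth, the Last.fm lookup step alone might look like the sketch below; YOUR_API_KEY is a placeholder, and the upload side is deliberately not shown:)

```python
import requests

API = "https://ws.audioscrobbler.com/2.0/"

def small_cover_url(artist, album, api_key):
    # album.getinfo returns an "image" list whose entries carry a "size" label
    r = requests.get(API, params={
        "method": "album.getinfo", "artist": artist, "album": album,
        "api_key": api_key, "format": "json"})
    r.raise_for_status()
    for img in r.json().get("album", {}).get("image", []):
        if img.get("size") == "small" and img.get("#text"):
            return img["#text"]
    return None

print(small_cover_url("Radiohead", "In Rainbows", "YOUR_API_KEY"))
```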
- Strongest possible oppose to any automation that adds non-free content to the project. WP:NFCC criterion 8 has to be decided on a case-by-case basis, and while there are some who believe that there is an inherent justification for using a non-free image in the infobox of a media work, that is not what the NFCC says. No offense to the proposer themselves, but this is a dangerous idea that goes against the third pillar, and would set a dangerous precedent if allowed. The Squirrel Conspiracy (talk) 00:00, 23 June 2020 (UTC)
- @The Squirrel Conspiracy: Thanks for the response. I understand, but one question, just for my clarification more than anything: wouldn't criterion 8 be applicable to any specific album page though? Wouldn't the addition of artwork in album articles "significantly increase readers' understanding of the article topic"? I can't think of a case where album artwork doesn't do that (unless it's a soundtrack for a film or TV show). As well, to @Redrose64:'s point, since the inherent format of album articles gives consistency, and thus consistency in the reasons to use non-free content, wouldn't boilerplated text be applicable, since what is generally true for one similarly formatted article carries over? Again, I don't mean to be contrarian here or anything, but I just want to genuinely better familiarize myself with the policy. ⠀TOMÁSTOMÁS⠀TALK⠀ 15:28, 23 June 2020 (UTC)
- Oppose WP:NFCCP#10c requires
... a separate, specific non-free use rationale for each use of the item, as explained at Misplaced Pages:Non-free use rationale guideline. The rationale is presented in clear, plain language and is relevant to each use.
This is not possible for a bot to do except by means of boilerplated text, and that would imply that little or no thought has been put into the wording of the FUR. --Redrose64 🌹 (talk) 11:21, 23 June 2020 (UTC)