:Bots/Requests for approval/Lightbot 4 - Misplaced Pages

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
This is an old revision of this page, as edited by Carcharoth (talk | contribs) at 12:14, 17 July 2010 (Discussion: qualify). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Lightbot 4

Operator: Lightmouse (talk · contribs)

Automatic or Manually assisted: Automatic

Programming language(s): AWB, monobook, vector, manual

Source code available: Source code for monobook or vector is available. Source code for AWB will vary but versions are often also kept as user pages.

Function overview: Janitorial edits to units

Links to relevant discussions (where appropriate):
This request duplicates the 'units of measure' section of Misplaced Pages:Bots/Requests for approval/Lightbot 3. That BRFA was very similar to the two previous approvals: Misplaced Pages:Bots/Requests for approval/Lightbot and Misplaced Pages:Bots/Requests for approval/Lightbot 2.

Edit period(s): Continuous

Estimated number of pages affected: Individual runs of tens, or hundreds, or thousands.

Exclusion compliant (Y/N): Yes, will comply with 'nobots'

Already has a bot flag (Y/N): Yes

Function details:
I would like to make it explicit that I will be editing units of measure in a variety of forms.

  • A 'unit of measure' is any sequence of characters that relates to measurement of things. This includes but is not limited to units defined by the BIPM SI, the US NIST or any other weights and measures organisation or none at all. This includes but is not limited to time, length, area, volume, mass, speed, power.
  • Edits may add or modify metric or non-metric units.
  • Edits may modify the format.
  • Edits may add, remove or modify templates that involve units.
  • Edits may add, remove or modify links to units.

Discussion

I'm sorry, did I miss something or are you not "indefinitely prohibited from using any automation whatsoever on Misplaced Pages"? - EdoDodo 13:30, 13 July 2010 (UTC)

Currently, yes. I have just applied to have the restriction lifted. However, an arbitrator said I need to come here first. So here I am. :) Lightmouse (talk) 14:09, 13 July 2010 (UTC)
Okay. - EdoDodo 14:14, 13 July 2010 (UTC)

Given your past history, I have a number of concerns:

  • What will you do to prevent any repeat of the behavior that led to the ArbCom finding of fact "Lightbot repeats its own errors"?
  • I recall much drama over how you used to respond to talk page comments (and even remove the "stop" command) using your bot account. Do you commit to never using your bot account to respond to talk page comments or make any other edits besides those for approved tasks, as required in WP:BOTPOL#Bot accounts?
  • You state that your bot will honor {{nobots}}. Will it also comply with {{bots|deny=Lightbot}} and variations?
  • As stated, this request is far too broad and far too vague; Lightbot 3 was quite controversial for that very reason, and things have become more strict since then. Please specify exactly the types of changes the bot will be doing rather than vaguely stating "may add, remove or modify". I understand this may be a long list, and I note that explicitly listing each change does mean that adding a new type of change will require a new BRFA.
    • Note that ArbCom also asks for a statement "indicating specifically which functions you will be performing".
  • "Units of measure" should similarly be more defined. Template:Convert/list of units has an extensive list of units, which may be incorporated by reference. Are there other units on which you intend to work?
  • The edit summary you used previously, "unit/dates/other", does not fit with WP:BOTPOL's requirement that the bot "uses informative messages, appropriately worded, in any edit summaries or messages left for users". Please address this.
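The exclusion question above ({{nobots}} versus {{bots|deny=Lightbot}} and variations) can be illustrated with a rough sketch. This is not AWB's or Lightbot's actual check; the function name `bot_may_edit` and the simplified parsing are assumptions made for illustration, and the real template supports more options than are handled here.

```python
import re

def bot_may_edit(wikitext: str, bot_name: str) -> bool:
    """Return False if the page text opts out of edits by bot_name.

    Simplified sketch: handles {{nobots}} and the allow=/deny= forms of
    {{bots}}; any other parameters of the real template are ignored.
    """
    # A bare {{nobots}} excludes every bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False

    name = bot_name.strip().lower()
    pattern = r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)"
    for match in re.finditer(pattern, wikitext, re.IGNORECASE):
        kind = match.group(1).lower()
        names = {n.strip().lower() for n in match.group(2).split(",") if n.strip()}
        if kind == "deny":
            # deny=all or deny=<this bot> blocks the edit.
            if "all" in names or name in names:
                return False
        else:
            # allow=none blocks everyone; otherwise the bot must be listed.
            if "none" in names:
                return False
            if "all" not in names and name not in names:
                return False
    return True
```

For example, `bot_may_edit("{{bots|deny=Lightbot}}", "Lightbot")` returns `False`, while a page with no exclusion template is editable.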

Given the controversy Lightbot's activities generated both before and during the ArbCom case, we must be particularly careful here to ensure that the community wants this done and wants Lightmouse to be doing it. I have posted notices at WP:AN, WP:VPR, WP:BON, and User talk:Lightbot to attempt to gather community input. If anyone knows of other pages where Lightbot's previous activities were extensively discussed (e.g. MOS, WikiProject, or template talk pages), please post a similar notice in those places and mention that you did so here. Please keep in mind WP:CANVAS. Anomie 17:32, 13 July 2010 (UTC)

I'd be happy to provide a list of units of measure. It might take a little effort to compile but I can do it. Would that help? Lightmouse (talk) 19:42, 13 July 2010 (UTC)
Done. I used various 'units of measure' related categories on Misplaced Pages to create a list at User:Lightmouse/list_created_from_categories_referring_to_units_of_measure. As suggested, there is also the list of units of measure addressed by Template:Convert/list of units. There are also lists maintained by the official SI authority, by the British and US weights and measures authorities and by others. In the event of a dispute about whether something is a 'unit of measure', I'm sure the knowledgeable people at wp:mosnum can arbitrate. Has anyone ever seen any such disputes? Lightmouse (talk) 20:58, 13 July 2010 (UTC)
I asked for the units of measure you intended to work with, not every unit of measure you could possibly think of; do you really intend to do anything with amagats, almudes, agates, or adowlies? Several items on your list don't even seem to be units of measure, for example Active daylighting or Active Resistance to Metrication. And quite frankly, I don't have any confidence in your fan club at WT:MOSNUM. Anomie 23:45, 13 July 2010 (UTC)
It's entirely unclear to me from this request what, exactly, the bot would do. Can you give an example of what you think a typical edit would be like? Beyond My Ken (talk) 19:55, 13 July 2010 (UTC)

Please remember that Lightbot was already approved to edit units of measure. This approval request is a word-for-word copy of the units of measure section of Lightbot 3. So there are thousands of examples in the contributions. For example:

If anyone knows of a better edit summary, please feel free to suggest it.
This application stands on its own merit. It doesn't require anyone to read other BRFAs if they don't wish to.
AWB has a method of addressing bot exclusions that is used by other bot owners. I'll do the same.
Feel free to look at one example of AWB code that was used in the past. Lightmouse (talk) 20:23, 13 July 2010 (UTC)

So, to be clear, is doing conversions such as the examples you post the sole task of the bot? Beyond My Ken (talk) 20:54, 13 July 2010 (UTC)
No. It will do more than just add conversions.
  • Edits may add or modify metric or non-metric units. For example, this may add a conversion or fix an error in an existing conversion.
  • Edits may modify the format. For example, this may change 'KW' into 'kW' or 'kmph' into 'km/h'.
  • Edits may add, remove or modify templates that involve units. The diff examples given above show it adding templates. It may also remove or modify templates as part of maintenance e.g. if the templates themselves need updating.
  • Edits may add, remove or modify links to units. For example, it might add a link to obscure units, remove a link from a common unit, or correct a wrong or misdirected link.
I hope that helps. Regards Lightmouse (talk) 21:10, 13 July 2010 (UTC)
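As a minimal sketch of the format changes mentioned above ('KW' into 'kW', 'kmph' into 'km/h'), the following assumes a fix is applied only when the abbreviation directly follows a number. The table `FORMAT_FIXES` and the function name are hypothetical and are not taken from the bot's AWB code.

```python
import re

# Hypothetical table of wrong -> right unit abbreviations (from the examples above).
FORMAT_FIXES = {"KW": "kW", "kmph": "km/h"}

# A number, an optional single space, then one of the known wrong abbreviations.
# The trailing \b stops "KW" from matching inside "KWh".
_PATTERN = re.compile(
    r"(\d(?:[\d,.]*\d)?)(\s?)("
    + "|".join(map(re.escape, FORMAT_FIXES))
    + r")\b"
)

def normalize_unit_format(text: str) -> str:
    """Rewrite known wrong abbreviations that follow a numeric value."""
    return _PATTERN.sub(
        lambda m: m.group(1) + m.group(2) + FORMAT_FIXES[m.group(3)],
        text,
    )
```

Requiring a preceding number is a deliberate choice in this sketch: it leaves strings such as "KW Radio" alone, where "KW" is probably not a unit.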
(edit conflict) Note that Lightbot 3 probably shouldn't have been approved given the lack of consensus evident in that discussion, and in reaction to criticism from the community (in part due to the approval of Lightbot 3) we have tried to become more careful about requiring a strong consensus and about not approving vague or overly-broad tasks. Although to be fair most (but by no means all) of the controversy there was related to dates rather than units.
While examples of edits are nice, we need to know exactly what types of edits are being approved here. For example, "Wrap measurements using US customary units with {{convert}} to also display them with the corresponding metric units" could appropriately describe the above 4 edits.
As for an edit summary, it should reflect what the bot is actually doing. Ideally the summaries for the 4 edits you link above would be "adding metric conversion for mph using {{convert}}", "adding metric conversion for inches using {{convert}}", "adding metric conversion for square feet using {{convert}}", and "adding metric conversion for miles, acres using {{convert}}", although just "automatically adding metric conversions using {{convert}}" would be ok. Anomie 21:12, 13 July 2010 (UTC)
Thanks. The code is sophisticated enough to parse a page for lots of units. I gave those examples because they were simple to understand. In reality, one edit might do 2 conversions of miles, 3 conversions of feet, 1 modification of a link, and a change of format from KW to kW. The next edit might do a completely different combination. The next edit might do a different combination again. That's why the edit summary is generic. Lightmouse (talk) 21:20, 13 July 2010 (UTC)
If the code is that sophisticated, shouldn't it also be able to determine an appropriate edit summary? Anomie 23:45, 13 July 2010 (UTC)
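One way an edit summary could be derived from the changes actually made is sketched below. It assumes the bot records each change as a hypothetical `(kind, unit)` pair; the kinds and phrasings are illustrative, modelled on the summaries suggested above, and do not describe Lightbot's real output.

```python
# Hypothetical phrasing per kind of change: (prefix, suffix).
SUMMARY_TEMPLATES = {
    "convert": ("adding metric conversion for ", " using {{convert}}"),
    "format": ("correcting unit format for ", ""),
    "link": ("modifying unit links for ", ""),
}

def build_edit_summary(changes):
    """Build a summary from (kind, unit) pairs describing one edit."""
    units_by_kind = {}
    for kind, unit in changes:
        if kind not in SUMMARY_TEMPLATES:
            raise ValueError("unknown change kind: " + kind)
        units = units_by_kind.setdefault(kind, [])
        if unit not in units:  # mention each unit once per kind
            units.append(unit)

    parts = []
    for kind, (prefix, suffix) in SUMMARY_TEMPLATES.items():
        if kind in units_by_kind:
            parts.append(prefix + ", ".join(units_by_kind[kind]) + suffix)
    return "; ".join(parts)
```

With the changes `[("convert", "miles"), ("convert", "acres")]` this yields "adding metric conversion for miles, acres using {{convert}}", matching the fourth summary suggested above.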

I'm not satisfied with the "for examples" and "mights" above, because they do nothing to make the request less vague. So far, it sounds like the types of changes being considered for this task are:

  • Add {{convert}} to measurements using non-metric units to also display appropriate metric units.
  • Add {{convert}} to measurements using metric units to also display appropriate customary units.
  • When appropriate, use {{convert}} to display multiple alternative units (e.g. 640 acres (2.6 km²; 1.00 sq mi))
  • Correct broken invocations of {{convert}}.
  • Correct incorrect manually-formatted conversions, e.g. "100 miles (10 km)".
    • How does the bot determine whether 100 mi or 10 km is actually correct? What is the threshold between "inexact" and "incorrect"?
  • Correct spelling, abbreviation, or capitalization of existing units in measurements to match the applicable standards, e.g. 100 KW → 100 kW. This may be a side effect of applying {{convert}}, but may also be done on its own.
  • Add links to uncommon units in measurements (e.g. "100 furlong" → "100 furlong"), remove links to common units in measurements (e.g. "100 mi" → "100 mi"), or correct incorrect links to units in measurements (e.g. "1 atm" → "1 atm").
    • Would these also be done for units not in measurements?

Is that correct? Are there other specific changes being considered? Anomie 23:45, 13 July 2010 (UTC)
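The threshold question above ("inexact" versus "incorrect") can be made concrete with a small sketch. The 5% relative tolerance is an arbitrary assumption, not a value proposed by anyone in this discussion, and the function name is hypothetical. Note that a check like this can only flag a pair as inconsistent; it cannot tell whether the miles figure or the kilometres figure is the wrong one.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

def classify_mile_km_pair(miles: float, km: float, tolerance: float = 0.05) -> str:
    """Classify a manual 'X miles (Y km)' pair as consistent or inconsistent.

    tolerance is the allowed relative error; 0.05 is an assumed default.
    """
    if miles <= 0:
        raise ValueError("miles must be positive")
    expected_km = miles * KM_PER_MILE
    relative_error = abs(km - expected_km) / expected_km
    return "consistent" if relative_error <= tolerance else "inconsistent"
```

Under these assumptions "100 miles (10 km)" is flagged as inconsistent, while "100 miles (160 km)" passes as a rounded but acceptable conversion.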

Thanks for your suggestions, Anomie. They are helpful. I think we are now moving forward.
The aim is to improve the use of units on Misplaced Pages. The aim can be achieved by the use of the convert template or by use of text. I'll try and rework your text as follows:

  1. Add {{convert}} to metric units so they display non-metric units.
  2. Add {{convert}} to non-metric units so they display metric units.
  3. Add text to metric units so they display non-metric units.
  4. Add text to non-metric units so they display metric units.
  5. Modify existing text conversions of units. This will be to correct errors, improve the conversion, improve appearance, improve consistency, change abbreviation, change spelling.
  6. Modify existing template conversions of units. This will be to correct errors, improve the conversion, update the template, improve appearance, improve consistency, change abbreviation, change spelling.
  7. Remove existing text conversions of units in order to replace them with a better template.
  8. Remove existing template conversions of units in order to replace them with better text.
  9. Remove existing template conversions of units in order to replace them with a better template.
  10. Add links to uncommon units.
  11. Modify links to units. This will be to correct errors, make it more direct, improve appearance, improve consistency, change abbreviation, change spelling.
  12. Remove links to common units.
  13. It is not intended to add templates other than {{convert}} but if a better template exists, it will be considered.
  14. For this BRFA, the scope of the term 'conversion' includes more than one unit e.g. 60 PS (44 kW; 59 hp).

I hope that helps. Lightmouse (talk) 21:46, 14 July 2010 (UTC)
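Item 2 of the list above (adding {{convert}} to non-metric units) might look roughly like the following. This is a sketch under stated assumptions, not the bot's AWB rules: the unit table `CONVERT_TARGETS` covers only three units chosen for illustration, numbers with thousands separators are deliberately skipped, and any measurement already followed by a parenthesis is left alone on the assumption that it may already carry a conversion.

```python
import re

# Hypothetical source -> target unit codes for this sketch.
CONVERT_TARGETS = {"mi": "km", "ft": "m", "mph": "km/h"}

_MEASUREMENT = re.compile(
    r"(?<![\w|{.,])"       # not inside a word, template, decimal or 1,500-style number
    r"(\d+(?:\.\d+)?)"     # the numeric value
    r"\s(mph|mi|ft)\b"     # one whitespace character, then a known unit
    r"(?!\s*\()"           # skip if a parenthesis (possible conversion) follows
)

def add_convert(text: str) -> str:
    """Wrap bare 'value unit' measurements in a {{convert}} call."""
    return _MEASUREMENT.sub(
        lambda m: "{{convert|%s|%s|%s}}"
        % (m.group(1), m.group(2), CONVERT_TARGETS[m.group(2)]),
        text,
    )
```

For instance, "The road is 100 mi long." becomes "The road is {{convert|100|mi|km}} long.", while "100 mi (161 km)" is left unchanged.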

Quite helpful. I hope you don't mind, I changed it to a numbered list to facilitate discussion.
  • When might you do the text-related items (3, 4, 5, and 8) rather than the corresponding template-related items (1, 2, 6, 7, 9), besides when the conversion isn't supported by {{convert}} or another template? How often do you anticipate that happening, and/or how often did it happen when Lightbot was running previously?
  • Re #11, how might links be edited to "improve appearance, improve consistency, change abbreviation, change spelling" in a way that is not covered by #5 or #6? Or is that portion just a bit of CYA in case someone tries to whine that changing the visible text of a link (e.g. KW → kW) somehow isn't covered by #5 or #6?
  • Re #10 and #11: When linking an abbreviation, would you just link the abbreviation, pipe it to the written-out name, or pipe it to a section of the article on the base unit? For a really bad example, "kW" to [[kW]], [[Kilowatt|kW]], or [[Watt#Kilowatt|kW]]? I personally would prefer [[Kilowatt|kW]], as it makes the link's tooltip maximally useful to readers.
  • Re #12: Obviously if someone comes around complaining about Lightbot removing links to Mile or Metre they're just trolling. If someone complains about removal of links to a unit that reasonable people might disagree on the commonness of, will you stop and seek input from reasonable people, people much more familiar with units than the average reader, or just ignore them and keep editing?
Anomie 00:46, 15 July 2010 (UTC)

Can you define what a "common unit" is? Peachey88 06:19, 15 July 2010 (UTC)

  • Text conversions. By default, I try to use the 'convert' template. I think it must be the most frequent conversion method I've implemented. I think it's a 'good thing'. I started adding text conversions by hand before the convert template even existed. I might still use text because the template isn't able to do the conversion (or I'm unaware of how), or I might find it simpler to code the text version. I think I've used text for rare combinations of units - I can't think of an example but "x hp/ton" or "a gradient of two feet per mile" are the sorts of thing I mean. I've also used text when the numeric value is in words e.g. "three miles" (again, feel free to let me know if the template now supports this). Even when the template is an option, I'd like the flexibility to use text if I think it's a better method of conversion.
  • improve appearance, improve consistency, change abbreviation, change spelling. It is a bit of a catch-all clause to eliminate doubt. That's why I was previously keen on sweeping statements. Now that we're using specific language, we need specific clauses. I didn't think of mentioning upper and lower case - thanks - I think that should probably be stated.
  • Linking abbreviations. By default, I'll link to the actual article ]. I see what you mean about that being an odd one. It doesn't matter much to me if I ended up using ]. However, I'm not sure if I'd use Lightbot to add links to 'kW' anyway.
  • Common units. I'd be happy for a debate about common units, either now or later. It's interesting to me that 'Square kilometer' ranks 8th from top in Misplaced Pages's most linked articles but doesn't even feature in a list of the 1000 most viewed articles. There used to be a list of common units defined in wp:link. I wasn't aware that it had been removed. Here is a quote from the archive version:
  Examples of common measurements include:
  * units of time (second, minute, hour, day, week, month, year)
  * metric units of mass (milligram, gram, kilogram), length (millimetre, centimetre, metre,  
  kilometre), area (mm², etc.) and volume (millilitre, litre, mm³)
  * imperial and US units (inch, foot, yard, mile)
  * composite units (m/s, ft/s)
  Links may sometimes be helpful where there is ambiguity in the measurement system
  (such as Troy weight vs Avoirdupois weight) but only if the distinction is relevant.
  In an article specifically on units of measurement or measurement, such links can be useful.
The current version is more generic:
  Units of measurement which are common only in some parts of the English-speaking world
  need not be linked if they are accompanied by a conversion to units common in the rest of it,
  as in 18 °C (64 °F), as almost all readers of the English Misplaced Pages would be able to understand
  at least one of the two measures. Some units of measure, like "ounce" or "pound" can be misinterpreted
  because they are ambiguous. A link might serve, if a simple statement, "troy ounce", does not.
  Do not use a link for an ambiguous unit of measure unless a thorough explanation would help the article's context.
  You would then link "Ounce" or "pound" to the Troy weight or Avoirdupois weight article.
  For example, in an article specifically on measurement or on units of measurement, links to common units of measurement are useful.
I agree with almost all of both versions, although I'd reword the bit on ambiguity: the weight of a person in lb is unambiguous when accompanied by kg, and I think few people would think it was a troy pound even without a conversion.
Regards Lightmouse (talk) 19:04, 15 July 2010 (UTC)
I'm not worried about whether you choose to use text instead of {{convert}}, I was just wondering why you might. As for "common units", I'm really not interested in debating here whether any specific unit is common enough to be linked or not. But I do want to know what you intend to do if someone brings a legitimate dispute over commonness to your user talk page as a result of the bot's edits. Anomie 20:17, 15 July 2010 (UTC)
Disputers always think they have a 'legitimate dispute'. We can stop it becoming a dispute if any question about 'commonness' can be redirected to an arbiter. Lightmouse (talk) 20:53, 15 July 2010 (UTC)
  • Comment: I believe that Lightbot should not link units of measurements at all. This would simplify this definition phase and the bot's operation – by avoiding the need to parse the conversions to see whether it is the first (which may need linking) or nth (which will not be linked) occurrence. Any units can be linked manually as the cases arise. Ohconfucius 01:56, 16 July 2010 (UTC)
    • There is no automatic rule that only the first instance of something should be linked. Sometimes it is none, sometimes once, sometimes more. Deciding how many instances of something on a page should be linked (if any) is something best done by a human, not a bot. My view is that this bot tries to do too much in one go. It may be that if people took the time to understand what is being done here, they would support it, but it is easier to support smaller, simpler tasks, than complex ones. I do recognise that this would increase the overall number of edits done, but it is simpler when an edit pops up on a watchlist to check an edit that changes a few things (say five) than to try and work out whether an edit making 50 or so changes in one article is OK or not. Essentially, what I am saying is that the scope of what automated editors (using AWB and the like) and bots do, should not outstrip the ability of ordinary mortals to check each edit (or a selection of them). I recently found three examples of automated edits that included errors. Can you easily spot the mistake made in each of these edits? , , . My suspicion is that if a human checked any of those edits, they failed to realise the mistakes made in those edits. In other words, it is possible to write complex bots that make lots of changes to a page in one edit, but if that makes it difficult for a human to check that edit, then you have problems. Carcharoth (talk) 12:08, 17 July 2010 (UTC)

Please note that this is not a list of things that it will do. It's a list of things it's permitted to do. The original wording was "Edits may add, remove or modify links to units." It was another way of wording the software analysis term 'Create, read, update and delete' which is applied to each object class unless there is a reason not to. I don't want addition of links to be forbidden but I can't give you a scenario right now. Lightmouse (talk) 07:45, 16 July 2010 (UTC)

  • Lightmouse, is it possible to limit the bot to making no more than five separate changes on a single page (or whatever number seems reasonable)? And to queue the changes to be done in subsequent edits if more are needed? Also, is it possible to estimate how many pages this will affect during the initial run and how long it might take to complete? Thousands, tens of thousands of pages? Days or months to complete? You say above "continuous" and "tens, hundreds, thousands", but that seems a bit vague. Carcharoth (talk) 12:13, 17 July 2010 (UTC)
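The per-edit limit suggested above could be sketched as a simple batching step. The function name and the default of five are taken from the suggestion, not from any existing bot code, and the sketch assumes the pending changes for a page are already available as a list. In practice, changes queued for a later edit would probably need to be recomputed against the page text as it stands after the earlier edit.

```python
def batch_changes(changes, max_per_edit=5):
    """Split a page's pending changes into groups of at most max_per_edit.

    Each group would be saved as one edit; later groups are the queue
    for subsequent edits to the same page.
    """
    if max_per_edit < 1:
        raise ValueError("max_per_edit must be at least 1")
    return [
        changes[i:i + max_per_edit]
        for i in range(0, len(changes), max_per_edit)
    ]
```

A page with twelve pending changes would then be handled in three edits of five, five and two changes.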