Misplaced Pages

talk:Contributor copyright investigations: Difference between revisions - Misplaced Pages

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 18:14, 1 February 2012 by SandyGeorgia. Edit summary: Just pulled Bozeman Carnegie Library from the main page -- input, please?: another
Latest revision as of 18:33, 2 November 2024 by Lowercase sigmabot III (minor edit). Edit summary: Archiving 2 discussion(s) to Misplaced Pages talk:Contributor copyright investigations/Archive 3) (bot
(423 intermediate revisions by 100 users not shown)
Line 2: Line 2:
|archiveheader = {{talkarchive}}{{atn}} |archiveheader = {{talkarchive}}{{atn}}
|maxarchivesize = 130K |maxarchivesize = 130K
|counter = 1 |counter = 3
|minthreadsleft = 2
|algo = old(14d) |algo = old(60d)
|archive = Misplaced Pages talk:Contributor copyright investigations/Archive %(counter)d |archive = Misplaced Pages talk:Contributor copyright investigations/Archive %(counter)d
}} }}
{{talkheader}} {{talkheader}}
{{AutoArchivingNotice|bot=MiszaBot|small=yes|age=14|dounreplied=yes}}
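
The parameters in this hunk can be read roughly as follows: `%(counter)d` in the archive title is filled in with the current counter, and `old(60d)` sets an age threshold of 60 days. A minimal sketch, assuming the title uses Python-style named `%` formatting and the `algo` value always has the form `old(<N>d)`; the helper names are hypothetical and this is not the archiving bot's actual code.

```python
import re
from datetime import timedelta


def archive_title(pattern: str, counter: int) -> str:
    """Fill the %(counter)d placeholder in an archive title pattern."""
    return pattern % {"counter": counter}


def parse_algo(algo: str) -> timedelta:
    """Turn a value such as 'old(60d)' into an age threshold in days."""
    match = re.fullmatch(r"old\((\d+)d\)", algo.strip())
    if match is None:
        raise ValueError(f"unsupported algo value: {algo!r}")
    return timedelta(days=int(match.group(1)))


pattern = "Misplaced Pages talk:Contributor copyright investigations/Archive %(counter)d"
print(archive_title(pattern, 3))
print(parse_algo("old(60d)"))
```

With the new values in the right-hand column, the counter of 3 points at Archive 3 and the threshold moves from 14 to 60 days.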


== Authentication is now required for search engine checks on Earwig's Copyvio Tool ==
== A new template, presumptive deletion ==


Hello! As of ''right now'', Earwig's Copyvio Tool requires logging in with your Wikimedia account for search engine checks. This is an attempt to curb bot scraping of the site, which rapidly depletes the available quota we have for Google searches. New checks will require you to log in before running. You will also keep getting "429: Too Many Requests" errors until the quota resets, around midnight Pacific Time, as we've run out of search engine checks for the day. If this broke something for you or if you're having issues authenticating, please let The Earwig or me know. Thanks! <span style="background:#ffff55">'''''Chlod'''''</span>&nbsp;<small style="font-size:calc(1em - 2pt)">(say hi!)</small> 00:06, 5 October 2024 (UTC)
In conjunction with ], I have created {{tls|CCId}} for articles which are tagged for deletion ''without'' verification of copyright infringement. Current ] supports this presumptive deletion in cases where it has been verified that an individual has violated copyright in multiple points. The template presumes listing at ] and advises interested contributors how to help verify the copyright status of the material or to rewrite the content if interested in its preservation. It cautions against use in cases where previous contents can be restored (where the contributor was not the creator) and recommends instead verifying infringement where other contributors have invested time (and creative content) into the article. --] <sup>]</sup>


== Integrating Misplaced Pages Library search into Earwig's Copyvio Detector ==
== Becoming involved with this process ==


Hi - we recently spoke with EBSCO, who host the search tool that The Misplaced Pages Library uses to browse content available through the library, and they granted us API credentials + permission to integrate EBSCO Discovery Service with Earwig's Copyvio Detector. This would enable copyvio searches of more paywalled PDFs (it's not currently clear how much additional coverage this gives beyond what Turnitin provides, but we'll be able to investigate further once this is integrated). We've put some basic mockups of what this might look like at T378077 - thoughts and questions welcome! Samwalton9 (WMF) (talk) 14:55, 28 October 2024 (UTC)
A user popped by my talk page, after my unsuccessful RFA, with some suggestions about how I can become involved in this area. I gather that there are not a whole lot of active users that are interested in doing copyright work, so I would like to lend a hand. However, this does appear to be a bit of a walled garden, and I can't quite figure out what the procedure is for checking contributions. Is it just a simple Google search for the added content, combined with an evaluation of the current state of the article, mixed with a bit of common sense and copyright policy? I think more people would become involved if the system itself were easier to understand. --]]<sub>(])</sub> 15:31, 4 April 2011 (UTC)
: Take a look at ], and don't miss ] at the bottom. ] (]) 04:46, 8 April 2011 (UTC)
:Note that some of our copyright violators extensively use print and/or subscription only sources. ] 05:22, 16 April 2011 (UTC)
::Do we need to revise our CCI instructions? Certainly we could stand to link to ]. I'm all for encouraging assistance any way we can. :) --] <sup>]</sup> 11:36, 16 April 2011 (UTC)

== How to update? ==

I've been working sporadically on ]. A few new contributors/socks have been identified; the contributions checked and scrubbed. Do they need to be added to the report? If so, how exactly is that done? Thanks. ] ] 19:19, 25 May 2011 (UTC)

== Question ==

I know that I am rather new to this process, but is there a way that we could help eliminate the backlog that is currently on some of these cases that are lasting two years or more? It doesn't seem as though there are a lot of people who are active in this process in terms of clearing the backlog, but I would be willing to help get people on board should there be a general consensus to create a drive of sorts. I personally have an investigation ongoing against me and I really don't want to see this open in a few years as it will just be a pain should I want to run for something and it isn't even half done. It's just a thought, but I feel like doing this will be quite a good thing for increasing the credibility of a process where things just seem to initially be worked on, and then languish for a number of years. ] (]) 23:23, 30 May 2011 (UTC)
:If you can figure out how to get people involved in any way, shape or form, that would be fantastic. I hate the backlog here. :/ --] <sup>]</sup> 23:32, 30 May 2011 (UTC)
::I don't know if I could commit to it completely, but is there a way that we can bring the process into more of a mainstream thing so that it would garner more attention? It might even be good to implement a clerk process where users who want to can help to not only clear the backlog but maintain the pages. ] (]) 19:41, 4 June 2011 (UTC)
:::Thanks; CCI ''has'' clerks, though. What we really need are people to do the necessary work of checking articles. --] <sup>]</sup> 21:16, 4 June 2011 (UTC)
::::Ah, I forgot this even though a year ago I knew the answer. I know asking the clerks to do this wouldn't be good but I wonder if we could advertise on a noticeboard, see what kind of response we get, and go from there. Considering there are between 20 and 35 pages that need clearing, it would be good to at least attempt some sort of action at this point. What are your ideas for going about this? ] (]) 22:56, 4 June 2011 (UTC)
:::::I've tried advertising at AN a couple of times, but so far have not seen much response to it. We did a Signpost piece, although we didn't focus that much on CCI but on copyright in general. Every time I open a new CCI personally I advertise it at relevant noticeboards, and sometimes that has gotten us assistance. What noticeboard did you have in mind? --] <sup>]</sup> 23:03, 4 June 2011 (UTC)
::::::Well, there is always the option of posting on the ] page as well as canvassing on the main IRC channel. It might be a stretch, but involving editors over at the ] might also be a good start since the people over there are doing work in a similar area as this one. ] (]) 00:20, 5 June 2011 (UTC)
:::::::There's nobody who works at ] who doesn't know about this. :) (More specifically, ] is very busy with SCV and CP; ] is only here part time.) --] <sup>]</sup> 00:22, 5 June 2011 (UTC)
::::::::Maybe I'll try to see if Sonia could help us out in her spare time. Maybe we should explore working with the ambassador program to see what they think of involving new students with copyright issues. The idea with that is it would help to show new users how to correct copyright issues, and therefore help prevent them from slipping up. ] (]) 01:16, 5 June 2011 (UTC)
:::::::::I think we have better streamlined the system at CP and as always, more "manpower" is needed. I want to help at CCI but I get bogged down at CP and SCV with the time I have. ] recently burned me out. I have been thinking of ways to advertise but if MRG gets lukewarm responses, I don't know how much more I can help. Maybe advertising the space as a great way to gain admin experience can be an advantage.--] <small>]</small> 01:37, 5 June 2011 (UTC)
::::::::::I'm all in for advertising on IRC should we need people. Isn't there a centralized copyright channel somewhere or am I just imagining things? ] (]) 15:08, 5 June 2011 (UTC)
:::::::::::I think ] is a good hub.--] <small>]</small> 22:32, 5 June 2011 (UTC)

::::::::::::Rutherford- we don't need extra people. We're staring at the obvious- why don't we who are under investigation '''do each other's investigations''' instead of searching for people who don't want to do it? Most copyright violators barely know each other, so it would be neutral if we each read up on the policy, then dedicated ourselves to clearing at least one single investigation on another user. If every single copyright violator did it, then we would be clear in no time. ] (]) 02:57, 7 June 2011 (UTC)
: Just to throw an idea out there: How about we have a bot post a template to every article talkpage, notifying editors that material may have been copied to the page and listing the relevant diffs, asking people to examine it and report their findings on the CCI page or ]. We would need to ensure the messages are not archived until the concern is addressed and remove them (or replace with {{tl|cclean}}) once it has been. '''Yoenit''' (]) 07:08, 7 June 2011 (UTC)
::A lot of articles aren't viewed at all by many people. The key is to actually get people to the article, and to read it, perhaps by putting them at the top of some wikiproject list or something, like have them tagged with a template "possible copyvio", and that automatically gets it moved up on some sort of list at the wikiproject page. ] (]) 01:18, 8 June 2011 (UTC)
::: It is not the solution for everything, but it would definitely attract more attention than we get now. We have perhaps over 100,000 articles currently in CCI, and a significant number of those are going to have active talkpages. Even if it helps clear out only 10% of the backlog it is still a big help. Putting a tag in the article itself rather than the talkpage is rather heavy-handed and might be an idea if it turns out that talkpage notices are not working. With regards to wikiprojects, my experience with them (mainly ]) is that lists like that do not work. '''Yoenit''' (]) 06:58, 8 June 2011 (UTC)
::::I think talk-page notices are a better route. I am not a fan of tags on articles unless the problem with that article itself is visibly plausible or apparent, not just based off of probable cause. Also, editors interested enough to put time into reviewing the article and fixing a potential problem would monitor the talk page well.--] <small>]</small> 11:25, 8 June 2011 (UTC)
:::::Thousands of articles have gotten zero edits for years, and it's safe to say that they've been viewed by almost no one. Tagging them will not help at all since there is no one to find the tag... you need to get people to actually reach the article first. ] (]) 20:44, 10 June 2011 (UTC)

== One more finished ==

One more finished: ]. Can someone do the closing of this? ] (]) 14:26, 6 October 2011 (UTC)
:Done! --] <sup>]</sup> 14:36, 6 October 2011 (UTC)
::And another small one: ]. ] (]) 18:59, 14 October 2011 (UTC)
:::{{done}} ] 04:48, 15 October 2011 (UTC)

Another here: ]. ] <sub>]</sub> 03:19, 3 November 2011 (UTC)
:{{done}} ] 07:58, 3 November 2011 (UTC)

On a roll, here's another: ]. ] <sub>]</sub> 04:17, 8 November 2011 (UTC)
:{{done}} ] 02:56, 9 November 2011 (UTC)

== Little help needed ==

Another user has just substantially rewritten ], and a lot of the phrasing suggests to me there's a ''lot'' of copy and paste or insufficiently distant paraphrasing going on. Where do I even begin? →&nbsp;]&nbsp;]<small>&nbsp;16:06, 27 January 2012 (UTC)</small>

]
:By asking him? The research and writing of the article took me about one year, on and off. You can see for yourself at this sandbox's history: ], going back to February 2011. The main text used for most of the article is Sir Anthony Wagner's brilliant, but exceedingly large ''Heralds of England''. The "insufficiently distant paraphrasing" is I'm afraid down to the fact that English is my second language after Thai. No copying and pasting has been done; most of my sources are in book form. There is no definitive text on the subject, especially not from a modern viewpoint. Wagner is brilliant but very lengthy, so only snippets were prized out. Mark Noble is another good one, but being published in 1805 is limiting. Finally the College of Arms's own website is very informative and is in itself encyclopedic, so the structure of many parts of the article follows that set out on the website. All of these are cited and referenced to the appropriate source; in fact not a single paragraph of the article is uncited. If there is an issue with the content and research of the article I am happy to go through it sentence by sentence. But if this baseless suggestion is the reason why, then I can't help but think that those discussions we had on the talk page were not in good faith. Unless you have definitive proof of a copyright violation we should refrain from any more discussions on the article, because only one of us would be doing so in good faith. ] (]) 17:55, 27 January 2012 (UTC)

::Don't you dare accuse me of not acting in good faith. The article as rewritten by you is riddled with grammatical errors, is largely unreadable in parts, and given the tenuousness of your grasp of the difference between 'evidence' and 'opinion,' I am concerned about how well your information is sourced, and how well the text you wrote is actually supported by the sources given. My concern about copying wholesale from sources is a valid one given the archaic language used in most texts about heraldry and the similarly archaic phrasing you have used. Frankly I don't ''care'' what your sources are; I care that copyright is not being infringed. Thus I asked for help here, as history on Misplaced Pages indicates that the overwhelming majority of people either don't understand when they have infringed copyright (which may be the case here) and are therefore unable to even understand how to help, or they know full well they have copied and therefore do not ''want'' to help. →&nbsp;]&nbsp;]<small>&nbsp;18:05, 27 January 2012 (UTC)</small>

:::How can I not? You have made your opinion very clear. The accusation of copyright infringement is very serious, and it has been made very swiftly by you. It took me a year to complete the article, I put it out for two days and this is what is seriously being considered? The key here is proof; like I said, I am happy to go through it line by line. I am quite aware of the difference between evidence and opinion; that case was a very bad demonstration from me, but I still have full faith in the rest of the article and will stand by it. Funnily enough, those reasons you cited are part of my proof that I wrote it: the mistakes and the errors. The article as I wrote it is not perfect and it still needs a lot of work, I know that. The community will deal with that. ] (]) 18:25, 27 January 2012 (UTC)

::::You're quite right. When I am concerned about an article I shouldn't ask the experts for help in either validating or alleviating my concern. How ''stupid'' of me. →&nbsp;]&nbsp;]<small>&nbsp;18:27, 27 January 2012 (UTC)</small>

:::::You are right, you have every right to ask the experts, but clearly this is ''your'' issue to sort out and not mine. I was just a little offended you didn't ask me first before you decided to raise this concern. ] (]) 18:31, 27 January 2012 (UTC)

== Just pulled ] from the main page -- input, please? ==

Hi,<br>I just pulled ] from the DYK section of the mainpage. I had looked at two phrases from the article and , and noticed:
* "The building is also opened for special events such as Historic Preservation Week." vs.
* "its new owners have opened the building to the public on numerous occasions for special events, such as Historic Preservation Week."
and
* "Their plan worked as the red-light district and Chinese population steadily dwindled away." vs.
* "the local Chinese population gradually dwindled and Bozeman's red light district soon withered and disappeared."
I haven't looked at the rest yet. Can I get a quick opinion on whether the phrasing is distinctive enough for this to be considered plagiarism?
If I'm overreacting, feel free to put it back.
I have more articles from the same editor that concern me, e.g. in ] "and was designated a State Natural Area in 1986" is copied word-for-word from .
Opinions? ] 10:04, 1 February 2012 (UTC)
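
One rough way to put a number on overlap like the first pair quoted above is to find the longest run of consecutive words the two sentences share. The sketch below uses Python's standard `difflib`; the helper name `longest_shared_run` is made up for illustration, and a long shared run is only a prompt for human review, not proof of plagiarism.

```python
import re
from difflib import SequenceMatcher


def longest_shared_run(first: str, second: str) -> str:
    """Return the longest run of consecutive words found in both texts."""
    # Lower-case and drop punctuation so "events," and "events" compare equal.
    words_first = re.findall(r"[a-z0-9']+", first.lower())
    words_second = re.findall(r"[a-z0-9']+", second.lower())
    matcher = SequenceMatcher(None, words_first, words_second, autojunk=False)
    match = matcher.find_longest_match(0, len(words_first), 0, len(words_second))
    return " ".join(words_first[match.a:match.a + match.size])


first = "The building is also opened for special events such as Historic Preservation Week."
second = ("its new owners have opened the building to the public on numerous "
          "occasions for special events, such as Historic Preservation Week.")
print(longest_shared_run(first, second))
```

Word-level matching is used rather than character-level so that small punctuation differences do not break up an otherwise identical phrase.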

:Without even looking to see who it is, I think that - while some of it is well-paraphrased - this contributor seems to cut corners a bit on rewriting content in his or her own language. For another example, the source says "local librarian Bell Chrisman urged the city to seek Carnegie funding", and the lead sentence in the article says "city librarian Bell Chrisman urged the city to seek funding from Andrew Carnegie". With something like this, I'd usually review his other work. If it seems to be a pattern, I'd consider using the {{tl|close paraphrasing}} tag on the article, putting some examples on the talk page of the article, and explaining at the user's talk page where I see issues. I'd tweak the language of one of my form letters (]) and offer several examples from multiple articles. I'd try to add something encouraging in there about the well-paraphrased sections (since some of them are well done) and acknowledge that it can be a pain in the neck to have to rewrite what seems to be serviceable language. Unless there are more extensive issues in other articles, I'd emphasize that this does not seem to be a major problem and could be easily resolved with just a bit more attention to potential issues. But I'd try to say that more diplomatically. I'm just up for the day. :D --] <sup>]</sup> 11:30, 1 February 2012 (UTC)
::<small>I'm up for the day, too. Yay day! --] ] 13:17, 1 February 2012 (UTC)</small>
:::LOL! :D --] <sup>]</sup> 13:52, 1 February 2012 (UTC)

: I just mostly blanked ], but I don't know if the history needs to be scrubbed? ] (]) 18:14, 1 February 2012 (UTC)

Latest revision as of 18:33, 2 November 2024

This is the talk page for discussing improvements to the Contributor copyright investigations page.
Archives: 1, 2, 3
Auto-archiving period: 2 months

Authentication is now required for search engine checks on Earwig's Copyvio Tool

Hello! As of right now, Earwig's Copyvio Tool requires logging in with your Wikimedia account for search engine checks. This is an attempt to curb bot scraping of the site, which rapidly depletes the available quota we have for Google searches. New checks will require you to log in before running. You will also keep getting "429: Too Many Requests" errors until the quota resets, around midnight Pacific Time, as we've run out of search engine checks for the day. If this broke something for you or if you're having issues authenticating, please let The Earwig or me know. Thanks! Chlod (say hi!) 00:06, 5 October 2024 (UTC)
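
A client that receives a 429 could estimate how long to wait before retrying. This is only a sketch: it assumes "midnight Pacific Time" means 00:00 in the `America/Los_Angeles` zone, the reset is described above as approximate, and the function name `seconds_until_quota_reset` is hypothetical rather than part of the tool.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: "Pacific Time" is taken to be the America/Los_Angeles zone.
PACIFIC = ZoneInfo("America/Los_Angeles")


def seconds_until_quota_reset(now: datetime) -> float:
    """Seconds from `now` (a timezone-aware datetime) to the next Pacific midnight."""
    local_now = now.astimezone(PACIFIC)
    next_midnight = datetime.combine(
        local_now.date() + timedelta(days=1), time.min, tzinfo=PACIFIC
    )
    # Subtract in UTC so a daylight-saving change between the two instants
    # does not skew the result.
    delta = next_midnight.astimezone(timezone.utc) - now.astimezone(timezone.utc)
    return delta.total_seconds()


posted = datetime(2024, 10, 5, 0, 6, tzinfo=timezone.utc)
print(seconds_until_quota_reset(posted))
```

The example input is the timestamp of the announcement itself; any retry logic built on this should still treat the returned value as an estimate.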

Integrating Misplaced Pages Library search into Earwig's Copyvio Detector

Hi - we recently spoke with EBSCO, who host the search tool that The Misplaced Pages Library uses to browse content available through the library, and they granted us API credentials + permission to integrate EBSCO Discovery Service with Earwig's Copyvio Detector. This would enable copyvio searches of more paywalled PDFs (it's not currently clear how much additional coverage this gives beyond what Turnitin provides, but we'll be able to investigate further once this is integrated). We've put some basic mockups of what this might look like at T378077 - thoughts and questions welcome! Samwalton9 (WMF) (talk) 14:55, 28 October 2024 (UTC)