MediaWiki talk:Spam-blacklist

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.

This is an old revision of this page, as edited by Jéské Couriano (talk | contribs) at 23:42, 5 December 2007 (freelancer: Why?). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This is the talk page for discussing improvements to the Spam-blacklist page.
MediaWiki:Spam-blacklist is a protected page in the MediaWiki namespace, which only administrators may edit.
To request a change to it, please follow the directions at Misplaced Pages:Spam blacklist.

MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist only affects pages on the English Misplaced Pages. Any administrator may edit the spam blacklist. Any developer may use $wgSpamRegex, another method to prevent the addition of spam links. However, $wgSpamRegex should rarely be used.

See Misplaced Pages:Spam blacklist for more information about the spam blacklist.

Dealing with requests here

Any admin unfamiliar with this page should probably read this first, thanks.
  1. Does the site have any validity to the project?
  2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Is there a Spam project report? If so, a permanent link would be helpful.
  3. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regex - the disruption that can be caused is substantial.
  4. Close the request entry on here using either {{done}} or {{not done}} as appropriate. Requests should be left for about a week, as there will often be further related sites or an appeal in that time.
  5. Log the entry. Warning: if you do not log an entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry you will need this number - 176039866 - after you have closed the request. See here for more info on logging.
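
To illustrate why step 3 asks for regex familiarity, here is a minimal Python sketch of how a carelessly written entry can block unrelated sites. The wrapper pattern in blocks() is an assumption made for illustration, not the extension's actual code, and badsite.com and the other domains are hypothetical.

```python
import re

def blocks(entry: str, url: str) -> bool:
    """Return True if a blacklist entry, used as a regex fragment, matches the URL."""
    # Assumed wrapper: scheme, any run of host characters, then the entry itself.
    pattern = re.compile(r"https?://[a-z0-9.\-]*" + entry, re.IGNORECASE)
    return pattern.search(url) is not None

careless = r"badsite.com"     # unescaped dot, no word boundary
careful = r"\bbadsite\.com"   # literal dot, must start at a word boundary

# The careless entry also catches unrelated (hypothetical) hosts.
print(blocks(careless, "http://www.notbadsite.com/"))       # True
print(blocks(careless, "http://badsite-com.example.org/"))  # True

# The careful entry leaves them alone but still catches the target.
print(blocks(careful, "http://www.notbadsite.com/"))        # False
print(blocks(careful, "http://www.badsite.com/page"))       # True
```

The unescaped dot matches any character and the missing \b lets the entry match in the middle of a longer name, which is the kind of disruption the warning refers to.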

Those interested in contributing to this page may find it helpful to watchlist this page or create their own if they want to watch other pages as well. It effectively watches threads rather than pages.

There are 4 sections for posting comments below. Please make comments in the appropriate section. They are Proposed additions, Proposed removals, Troubleshooting and problems, and Discussion. Each section will have a message box explaining it. In addition, please sign your posts with ~~~~ after your comment.

Requests which have been completed are archived. All additions and removals will be logged.

snippet for logging: {{/request|176039866#section_name}}

Proposed additions

Please add new entries to the bottom of this section. Please only use the basic URL (google.com not http://www.google.com). Please provide diffs to prove that there has been spamming! Completed requests should be marked with {{Done}} or {{Notdone}} then archived.
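
For anyone preparing a request, the "basic URL" form asked for above can be derived mechanically. The following is a small Python sketch, not an official tool; the function name basic_url is hypothetical, and it assumes that dropping the scheme, a leading "www." and any path is what is wanted.

```python
from urllib.parse import urlsplit

def basic_url(link: str) -> str:
    """Reduce a link to its bare host, e.g. http://www.google.com -> google.com."""
    link = link.strip()
    # urlsplit only recognises the host when the link has a scheme or starts with "//".
    if "://" not in link and not link.startswith("//"):
        link = "//" + link
    host = urlsplit(link).hostname or ""
    # Drop a leading "www." so the request names the domain itself.
    return host[4:] if host.startswith("www.") else host

print(basic_url("http://www.google.com"))    # google.com
print(basic_url("itsleaked.blogspot.com/"))  # itsleaked.blogspot.com
```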

mesothelioma.pl

Multiple anon IP linkspamming of common mesothelioma targets. --Mdwyer 15:25, 30 November 2007 (UTC)

Thanks, warnings have been issued and it appears to have stopped. I'd block next, cheers --Herby 09:37, 1 December 2007 (UTC)
It escalated! Now done on Meta --Herby 09:49, 3 December 2007 (UTC)

thenettimes.com

Links added from multiple IPs and single- or low-edit accounts.

and many more. Gimmetrow 21:05, 21 November 2007 (UTC)

Agreed that this one is an issue. Was it all on the 21st? I'm inclined to see if they try again, in which case I'd list it straight away I think but it is possible that they have got the message? Cheers --Herby 12:22, 23 November 2007 (UTC)

Given no further info/spamming I'll close this as Not done, we can always return to it --Herby 09:51, 3 December 2007 (UTC)

silanis.com

Please see Misplaced Pages:Conflict of interest/Noticeboard#Silanis to see why I propose the blacklisting of silanis.com. --Gerry Ashton (talk) 18:33, 28 November 2007 (UTC)

See also - Conflict of interest/Noticeboard case
See also - WikiProject Spam case


Agreed & Done. Make sure to remove the http:// portion on talk pages where the URLs are located; do this for both www.esignrecords.org and www.silanis.com. Archiving bots won't be able to properly archive if it remains hyperlinked. Thanks--Hu12 (talk) 18:47, 28 November 2007 (UTC)

megadry.com, excessivesweating-treatment.com

Added to many articles by User:Whynotthestars, and continued to this day by various anons.

Diffs (most recent edits, in most cases):

—Preceding unsigned comment added by Voyagerfan5761 (talkcontribs) 16:30, 30 Nov 2007

I think we should watch this one. The anons and the user have had final spam warnings and seemed to have stopped. I'd probably try a block next - thanks --Herby 09:35, 1 December 2007 (UTC)

sikhzone.net dhangurunanak.com cyarena.com

Persistent dynamic IP spam (latest two incidents: ). Continuing to spam despite warnings. See also WT:WPSPAM#spam.sikhzone.net spam.dhangurunanak.com spam.cyarena.com. MER-C 12:05, 1 December 2007 (UTC)

Done thanks --Herby 12:09, 1 December 2007 (UTC)

itsleaked.blogspot.com/

This persistent spammer seems to go around claiming to have album/video game links, which are false, as an attempt to scam people. Previous incarnations of the page included instructions to download a Google program and click the ad banners on his site. Zopwx2 22:22, 3 December 2007 (UTC)

See MediaWiki talk:Spam-blacklist#blogspot.com below. -Jéské 04:52, 4 December 2007 (UTC)
Yeah - good catch & Done, thanks --Herby 09:31, 4 December 2007 (UTC)

playturks.com/

Repeatedly spammed on Age of Empires related articles by Special:Contributions/Playturks. Dihydrogen Monoxide 07:41, 4 December 2007 (UTC)

Done SQL 07:44, 4 December 2007 (UTC)

maxmytest.com

Persistent spamming to Graduate Record Examination and similar subjects from multiple IPs.

Yep, Done & thanks --Herby 19:23, 5 December 2007 (UTC)

Proposed removals

Use this section to request that a URL be unlisted. Please add new entries to the bottom of this section. You should show where the link can be useful and give arguments as to why it should be unlisted. Completed requests should be marked with {{Done}} or {{Notdone}} then archived.

sikhzone.net Unlist Request

The link sikhzone.net/guru_nanak.php that I added to the external links section of Guru Nanak Dev provides detailed information about Guru Nanak Dev. It was added so that users can get more detailed information, but it was removed, and my domain sikhzone.net has been added to the spam list. I kindly request that you remove sikhzone.net from the spam list and allow the specified link to be added. --Sikh zone 12:07, 2 December 2007 (UTC)

Note: the original WikiProject Spam entry has been updated with new information since this domain was blacklisted. See:
--A. B. 00:19, 3 December 2007 (UTC)

www.cais-soas.com - is this a bug?

The site "Circle of Ancient Iranian Studies" looks like a typical academic site. I was looking for an article describing possible concepts shared between Zoroastrianism and ancient Judaism for a background sentence in a Zoroastrian section I just added to Religion and homosexuality, and the content generally matched my impression of at least one POV about what is generally believed (though I'm out of my field here). In short I see nothing even slightly spammy about the site. The local blacklist contains no string "cais" at all; the global blacklist includes only a site www.bcais-soas.com - I have no idea why my link is tripping the blacklist at all, actually, but perhaps there's some bug in the code that ignores an extra prefix character??? The message was "The following link has triggered our spam protection filter: http:*/www.cais-soas.com" (no "b"). 70.15.116.59 21:00, 3 December 2007 (UTC)

P.S. This site is currently linked to from Godzareh depression and Gore Ouseley. 70.15.116.59 21:07, 3 December 2007 (UTC)
found these;
--Hu12 21:37, 3 December 2007 (UTC)
Thanks - I'm surprised those didn't turn up in my MediaWiki search. I see why there was concern about one user, but I still doubt the site should be blacklisted - I didn't see any indication that the user was evading blocks and continuing to "spam" Misplaced Pages; he sounds more like an overzealous (and perhaps legally incautious) editor than a spammer. And where copyright is concerned, I don't think the spam blacklist should be trying to act as a nanny filter for every potential copyright violation on the Web based on hearsay accusations. That's not its stated purpose and it's not a "necessary evil" for legal reasons either. We cannot possibly know where permission is given and where it isn't, and if the site actually is committing some kind of copyright violation the holder has quick recourse available to him that targets the site directly. I think Misplaced Pages probably would face a more serious risk of getting hit with libel accusations if we go around blocking people's websites based on accusations that they are copyright violators and plagiarists (especially from countries like England and Australia with ridiculously oversensitive laws) than copyright violation charges when we're supposedly protected by the DMCA. The most relevant argument detracting from the site is that it is no longer actually associated with the SOAS and therefore is a low-grade source - I suppose small groups of like-minded academics from a particular field straddle the boundary between what we count as a "self-published" source vs. a "primary" source. Bearing this in mind I won't say that substituting a better source is a bad idea, but I don't see reason to retract my request for removal from the blacklist either. And, I still don't understand why the blacklist even affects the link when the blacklist item has an extra "b" at the beginning. 70.15.116.59 23:32, 3 December 2007 (UTC)
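
On the extra "b" question: one possible explanation, offered as an assumption since the global entry is not quoted verbatim in this thread, is that the entry is written as the regex \bcais-soas\.com, in the same style as the \bblogspot\.com rule mentioned elsewhere on this page. In regex syntax \b is a word-boundary assertion, not a literal letter. A short Python sketch of the difference:

```python
import re

# Assumed form of the blacklist entry; \b is a word boundary, not the letter "b".
entry = re.compile(r"\bcais-soas\.com")

# The reported link: the boundary falls between "." and "c", so the entry matches.
matches_reported_link = entry.search("http://www.cais-soas.com") is not None

# A site literally named "bcais-soas.com" has no boundary before "cais", so no match.
matches_literal_b = entry.search("http://www.bcais-soas.com") is not None

print(matches_reported_link)  # True
print(matches_literal_b)      # False
```

Under that reading, the filter message without a "b" is the expected behaviour rather than a bug in prefix handling.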

Troubleshooting and problems

This section is to report problems with the blacklist. Old entries are archived.

Discussion

This section is for other discussions involving the blacklist. Old entries are archived.

archive script

Eagle 101 said he had one running on meta, is it possible to get it up and going here?--Hu12 10:27, 15 November 2007 (UTC)

Would be good - Eagle hasn't been working on Meta for a while though & I've not seen anything (there was supposed to be a logging script too!) --Herby 12:10, 15 November 2007 (UTC)

blogspot.com

See also: Wikipedia talk:WikiProject Spam § Time for blacklisting blogspot.com, with whitelisting of specific domains?

I added countingcrowsnew.blogspot.com, freemodlife.blogspot.com, and googlepackdownload.blogspot.com to the blacklist. I made a previous report about the blogspot sites and they're being spammed by the same blocked sockpuppet who I filed a report about here. Spellcast (talk) 22:03, 28 November 2007 (UTC)

Update: I've also added b5050-raffle.blogspot.com, gpd2008.blogspot.com, and itsleaked.blogspot.com. They were being spammed by the same blocked sock in that report. Spellcast (talk) 05:18, 29 November 2007 (UTC)

I'm inclined to blacklist the domain then whitelist where needed but some heavy flak is likely to arrive? --Herby 08:06, 29 November 2007 (UTC)
From an en:Misplaced Pages mission perspective (though possibly not your personal perspective:) a bigger issue than the flak that will be generated is the disruption to editing. I believe a lot of pages, particularly biographies of living people, contain legitimate links to the subject's blog - many of which are hosted on blogspot. Simply blacklisting and then waiting for whitelisting requests will likely
  1. overwhelm the whitelist page here and on meta (which given you are one of the most active admins on both, may not be ideal for you!)
  2. be confusing and frustrating to a lot of editors especially newbies, but also any who are not familiar with the blacklist/whitelist set up
  3. lead to a loss of legitimate links and legitimate edits as people struggle to work out whether to keep their edit and lose the link or the other way round while any whitelist request is ongoing.
I think a move like that will take some careful planning and preparation to avoid these issues (might also help cut down some of the heat). One way or another, I think we need human editors to assess the current blogspot links on article pages and enter appropriate ones on the whitelist before the blacklisting goes into effect. I don't think such a move will cut out most of the flak though, so we might want to ensure there are other admins involved to help spread the weight, and a nicely presented page of evidence of the issues the domain causes to point people to.
Blogspot certainly gets spammed a lot more than most domains, and I support blacklisting. But it's still a domain that has a lot of good links and I think it's important to think through how a move like that will impact people, and to adjust to the situation. -- SiobhanHansa 13:54, 29 November 2007 (UTC)
Briefly - needs quite a bit of thought but equally is worth that amount of thought --Herby 13:55, 29 November 2007 (UTC)
There are many, many legitimate links to the domain, not only to blogs belonging to article subjects but to blogs belonging to Misplaced Pages contributors. Better to blacklist individual blogs as needed. --bainer (talk) 16:23, 29 November 2007 (UTC)
Not sure why Misplaced Pages contributors would be adding their own blogs? A very limited number of blogs actually meet WP:RS, and even fewer meet the requirements of WP:EL or are a blog that is the subject of the article or an official page of the article's subject. There are currently 32,916 blogspot.com blog links on Misplaced Pages; even if that means whitelisting a thousand "legitimate links", it's worth it.--Hu12 (talk) 17:03, 29 November 2007 (UTC)
You've presented some convincing reasons to leave certain blog links out of Misplaced Pages, but not a reason to leave all blog links out. Misplaced Pages contributors might want to link to their blogs because, you know, it is possible for said contributors to frequent websites on the internet other than Misplaced Pages :P See WP:COMMUNITY. There is also a performance cost to whitelisting and blacklisting; as far as I can tell, 1000 whitelisted entries costs more computationally than 1000 blacklisted entries (instead of using one large regex, which is how the blacklist works, you're doing 1000 individual regex replacements). Gracenotes § 18:04, 29 November 2007 (UTC)
I was under the impression server load was something we were supposed to leave up to the developers to worry about. If they see an issue and ask for a reassessment that would be one thing, but it's not a good argument against a tactic without their weight behind it.
The suggestion isn't that all blogs should be banned. The suggestion is that this particular domain gets spammed so much it would be beneficial to the project to blacklist it and only whitelist the ones that are appropriate. -- SiobhanHansa 18:13, 29 November 2007 (UTC)
Hu12 I think it's important not to overstate the case here. Not all of the ~32,000 links (assuming the 1K of good links estimate) that are not legitimate external links or citations will actually be harmful to Misplaced Pages. While editors' own blogs on their user pages aren't necessary to the project, in the vast majority of cases they do no harm and may help editors feel a bond that connects them to the project. Many more will be links from discussions and projects. While I don't think that's a reason for keeping a domain that is also being spammed so much - it's not the case that we do 32,000 links worth of "good" by removing them. For the most part we only really benefit from the spam and poorly placed article links that go. -- SiobhanHansa 18:08, 29 November 2007 (UTC)
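
The "blacklist the domain, then whitelist where needed" idea discussed in this thread can be sketched in Python as follows. This is only an illustration under stated assumptions: the wrapper pattern, the function names build_blacklist and is_blocked, and the whitelisted blog example-artist.blogspot.com are all hypothetical, and the real extension may combine the two lists differently.

```python
import re

def build_blacklist(entries):
    """Join all blacklist entries into one alternation and compile it once."""
    body = "|".join(entries)
    return re.compile(r"https?://[a-z0-9.\-]*(?:" + body + ")", re.IGNORECASE)

def is_blocked(url, blacklist, whitelist):
    """A URL is blocked if it hits the blacklist and no whitelist pattern covers it."""
    # Assumption: whitelist patterns are tried one at a time, as the thread suggests.
    if any(pattern.search(url) for pattern in whitelist):
        return False
    return blacklist.search(url) is not None

blacklist = build_blacklist([r"\bblogspot\.com"])
# Hypothetical whitelisted blog, standing in for an article subject's official blog.
whitelist = [re.compile(r"\bexample-artist\.blogspot\.com", re.IGNORECASE)]

print(is_blocked("http://itsleaked.blogspot.com/", blacklist, whitelist))       # True
print(is_blocked("http://example-artist.blogspot.com/", blacklist, whitelist))  # False
print(is_blocked("http://example.org/", blacklist, whitelist))                  # False
```

The blacklist is checked in a single pass regardless of how many entries it holds, while each whitelist entry adds one more pattern to try, which is the cost difference raised above.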

(unindent, crosspost my post from WT:WPSPAM)

The rule \bblogspot\.com is (currently) not on COIBot's monitorlist. Some of the sub-domains have been added via WT:WPSPAM, or have been caught by the automonitoring of COIBot (mainly because the name of the editor is the same as the name of the subdomain on blogspot.com).

Still, a linksearch on the resolved IP of blogspot.com (72.14.207.191) results in a mere 118 results (all COIBot linkreports)! Often the multiple use of the single subdomains is not a cause for blacklisting, as they may only have been used once or twice. Also, I suspect there are tens of thousands of blogspot sub-domains out there, but these are only the links that are caught because the wiki username overlaps with the domainname of the subdomain (or have been reported here). Would this cumulative behaviour warrant blacklisting of \bblogspot\.com .. here, or even on meta? --Dirk Beetstra 12:37, 30 November 2007 (UTC)

Appropriate links may indeed be a problem, though the majority will fail some or many of the policies and guidelines here (or don't even have to be a notable fact, or do not need to be a working link while being mentioned; "Mr. X has a blog on Blogspot.<ref>primary reliable source stating that the blog is the official blog</ref>"; we are not a linkfarm), and I would argue that the spam/coi part of the problem becomes a bit difficult to control... --Dirk Beetstra 14:23, 30 November 2007 (UTC)
Crosspost spamlink template for blogspot.com to link this discussion to the linkreports from COIBot. --Dirk Beetstra 10:31, 3 December 2007 (UTC)
Please try to remember how frustrating generic, unexpected spam blocks can be for new and incautious editors. Last time I "checked", if you make an edit with Internet Explorer and you post it directly without preview (two things you should never do), then if the spam blacklist comes up your text is gone. Back arrow gets you the original text of the article. Edits that die that way may not get remade, and they may sour the editor on further contributions. I don't think there should be any blocks on top-level domains or large general purpose Internet sites. 70.15.116.59 23:46, 3 December 2007 (UTC)
I have to disagree in this case - there's concern that the dynamic IP spamming it is using it to perpetrate scams or send out computer bugs. -Jéské 04:55, 4 December 2007 (UTC)

freelancer

I'd like freelancer.com.ar to be removed. —Preceding unsigned comment added by 200.127.47.3 (talk) 17:57, 5 December 2007 (UTC)

Why? -Jéské 23:42, 5 December 2007 (UTC)