Misplaced Pages:Requests for comment/Archive.is RFC 3


This is an old revision of this page, as edited by Magioladitis (talk | contribs) at 07:34, 19 July 2015 (merged multiple bot tags). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Ran across this as stale and in need of closure at WP:AN. The headings are a bit confusing, but there are essentially four questions in what feels like a logic puzzle:

Header: "Don't prohibit additions and don't continue removing"
Translation: Adding archive.is links should be allowed, and the continued removal of them should not be ok

Result: No consensus — Given the prior RFC, this presumably means the stare decisis would be to continue prohibiting additions and continue removing.
Header: "Continue with removal of existing links"
Translation: Continued removal is OK.

Result: No consensus — Combined with the lack of consensus affirming the inverse of this item (i.e., #1), again we have to look to the prior RFC, which presumably means the stare decisis would still be to continue removal, unless I'm missing something.
Header: "Require that another archive alternative exists before removing link (oppose/support)"
Translation: Prohibit removals where there is no archive alternative.

Result: No consensus - with disclaimer asterisk — This one trends slightly toward support, but the absolute count (low 60s support) is outside the normal range for saying "yeah, that's consensus" given the turnout (at least, for things like WP:RFA). The disclaimer asterisk here, then, is that logically the large enough support for this point could also be presumed to affect consensus against supporting a BRFA for automated blanket removal where no checks for archive alternatives or checks for the contextual appropriateness of removal are in place. That's really more an issue for the BRFA, however. Might be an idea for another RFC or other discussion on this issue?
Header: "Look at referencing templates that support links to multiple archiving sites"
Result: Supported and/or Improper venue, most likely. Several, including myself in trying to close this, wondered whether this is the appropriate venue. The subject of the RFC is archive.is, not archive sites as a whole (nor cite templates, etc...). As such, the likely-more-limited audience probably shouldn't be assumed to be reflective of community consensus on that subject, so I'd advise against some future discussion referencing this from a template talk page and saying "that RFC said there's consensus on this, so there!"

Given the confusing wording and the lack of a full range of options, however, it might be an idea to run a new RFC with individual, single-item propositions stating the exact, single-action change to the current state / existing consensus / existing actions being taken (e.g., "Stop blanket removal of links" followed by "Re-allow addition of links" instead of oddly worded, compound propositions that overlap). Doing so will more likely yield a consensus in a specific direction (e.g., "yes, there was consensus that supported stopping" vs "yes, there was consensus that opposed stopping" vs "there was no consensus on the stopping issue").

Cheers =)

--slakr 11:16, 15 October 2014 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Background

Archive.is is an archiving service similar to sites like Webcite and the Wayback Machine. It offers different levels of service, up to and including snapshots that are retained regardless of later changes to a site's robots.txt file. The Wayback Machine can abandon snapshots after such changes (potentially delaying rather than removing the potential for LinkRot), while Webcite has presented itself as having an uncertain long-term future tied to funding. No issues have been found with the quality of the snapshots provided at archive.is.

In August 2013, a bot called User:RotlinkBot, created by Rotlink, began linking Misplaced Pages articles to the new Archive.is service. This bot was not approved, and was therefore subsequently blocked. This block was procedural, and made based on the lack of approval, not the quality of the RotlinkBot's edits.

Following this block, edits matching edits from the bot, including the edit summaries, were made from hundreds of IPs, residential and business, from three different Indian states, Italy, Hong Kong, Vietnam, Bulgaria, Qatar, Latvia, Hungary, Slovakia, Romania, Brazil, Argentina, Portugal, Spain, France, Mexico, Austria, and South Africa. Based on fears that the IPs were not being used legally, these IPs, and User:Rotlink, ~~self-identified as the owner of archive.is~~ (Note: struck 3 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion in 3.7), were subsequently blocked. Rotlink has not commented on any of the blocks.

The previous RFC regarding Archive.is concluded that the site should be added to the blacklist and that all existing links to archive.is should be removed. In a few cases, the removal of archive.is links has resulted in LINKROT.

Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link.
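The difference between the two mechanisms described above can be illustrated with a minimal Python sketch. It is not the actual edit filter or blacklist implementation; the regular expression and the function names are hypothetical, chosen only to show "block new additions" versus "block any save while the link is present", as the paragraph above describes them.

```python
import re

# Hypothetical pattern for links to archive.is / archive.today (an assumption,
# not the pattern used by the real edit filter).
ARCHIVE_LINK = re.compile(
    r"https?://(?:www\.)?archive\.(?:is|today)/[^\s\]|}<>]*", re.IGNORECASE
)


def added_archive_links(old_text, new_text):
    """Return the archive links present in new_text but not in old_text."""
    old_links = set(ARCHIVE_LINK.findall(old_text))
    new_links = set(ARCHIVE_LINK.findall(new_text))
    return new_links - old_links


def edit_filter_allows(old_text, new_text):
    """Edit-filter behaviour as described above: reject only edits that add a link."""
    return not added_archive_links(old_text, new_text)


def blacklist_allows(new_text):
    """Blacklist behaviour as described above: reject any save that still contains a link."""
    return ARCHIVE_LINK.search(new_text) is None
```

Under these assumptions, an unrelated edit to an article that already contains an archive.is link passes `edit_filter_allows` but fails `blacklist_allows`, which is the reason given above for preferring the edit filter.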

The concerns about the potential for malware raised in the RFC have not materialized at this point, leading to arguments as to whether those fears were well-founded. An effort to get a bot approved to implement the RFC result stalled, indicating that the community may no longer believe the block to be warranted.

~~Archive.is does use advertising.~~ (Note: struck 1 Oct 2014 because this is an unverifiable claim with no presented evidence supporting it, per discussion on talk page.) The previous discussion showed that some editors considered this to be a major issue, but there was no strong consensus either for or against the site based on this.

Based on the questions of consensus raised during Misplaced Pages:Bots/Requests for approval/Archivedotisbot, the community should discuss whether the previous consensus is still in force.

Options

Don't prohibit additions and don't continue removing (oppose/support)

Support per above argument as RFC creator, the Archive.is service itself has shown no negative effects and provides a net beneficial service to our project. The site has presented no malware or evidence of ill intent beyond the actions of RotlinkBot and RotLink. Sanctions should remain undoubtedly on the implementation of the bot and spam linking as per usual, but Wikipedians using the service as an archive and adding links to the site should no longer be prohibited, nor should existing links added by individual users be removed, as nothing negative has been tied to them.Darkwarriorblake / SEXY ACTION TALK PAGE! 22:53, 26 June 2014 (UTC)
Further comments The opening statement was a collaboration between myself and the creator of the original RFC, Kww. It mentions advertising but even with adblocker off I fail to see any advertising on the site, certainly if it is there it isn't significant enough to notice, but given that I have been to the main page and back, I'm either blind or there are no adverts on the site. An IP on my talk page here also informs me that any adverts there are google custom advertising, and so I fail to see the issue anyway. This isn't the Pirate Bay filled to the brim with suspect advertising. Darkwarriorblake / SEXY ACTION TALK PAGE! 17:48, 29 June 2014 (UTC)
Further example of web.archive's uselessness, that now no longer exists because the original site is gone and web.archive abandons snapshots. It does little more than delay the inevitable, we need better alternatives. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:57, 24 July 2014 (UTC)
Oppose. While the RFC was phrased in a circumspect manner, there's no reasonable doubt that the owner of archive.is used an illegal botnet to add links to Misplaced Pages, and I mean illegal in the sense of contravening actual law, not Misplaced Pages policies. We should not use our status as the sixth largest website to provide links to someone that has demonstrated that he will use compromised computers to achieve his goals. That places our users in unnecessary peril.
Further, the use of advertising on a site that takes snapshots of other people's contents raises substantial copyright questions: it's hard to justify taking a complete copy of someone's work and using it to attract people to ads under current copyright law.—Kww(talk) 20:07, 27 June 2014 (UTC)
Hm, I'm fighting SPAM etc. myself, but things are not necessarily as easy as they might look at first glance, Kww. There is still a lot of doubt that the owner of archive.is has anything to do with Rotlink(bot) and the IP bot nets. It could also have been a competitor trying to kill archive.is' reputation and thereby eliminate competition. That would be illegal, but it is nonetheless a common practice in the ad business. (BTW. Not all bot nets are illegal. I'm not defending them at all, but I don't like premature conclusions.) So far, you have not provided any proof, therefore instead of trying to let assumptions look like facts, please provide (links to) actual facts that the bots were/are run by the owners of archive.is (or people related to them), otherwise your whole argument is void. --Matthiaspaul (talk) 22:40, 28 June 2014 (UTC)
Whether or not Rotlink(bot) and the IP bot nets are operated by the owners of archive.is is actually irrelevant - Rotlink(bot) and the IP bot nets were abusing Wikimedia servers with their link-additions, consensus was that that behaviour should be stopped and that the links would be blacklisted. If the owner of archive.is has a problem with that, they should take that up with the person behind Rotlink(bot) and the IP bot nets, that is then outside of Wikimedia's hands. --Dirk Beetstra 05:31, 3 July 2014 (UTC)
Oppose Per Kww's points Werieth (talk) 20:44, 27 June 2014 (UTC)
Oppose Archive.is has demonstrated multiple times that they have no scruples in regards to respecting robots.txt rules, injection of malware, replacement of ads, violation of Misplaced Pages Terms of Service, and abuse of process. Archive.is and its advocates should demonstrate by acceptance elsewhere that the internet as a whole trusts them before they can earn our trust back. Hasteur (talk) 22:28, 27 June 2014 (UTC)
Can you please point me (and other editors) to anything providing proof to your statements? I mean actual facts, not speculation and hearsay, as so far I have only seen the latter. --Matthiaspaul (talk) 22:40, 28 June 2014 (UTC)
"Why does archive.today not obey robots.txt?"

The distributed botnet nature of how once their bot was blocked suddenly many IP editors showed up with the same editing characteristics as the bot from a great many distributed sites terminating at user based endpoints suggests that somehow they used malware to get their toolset on many different computers

Again I refer you to their FAQ which suggests that the pages are being rewritten to inject Ads of Archive.today's choosing

Terms of Use, but specifically I charge:
Engaging in Disruptive and Illegal Misuse of Facilities

Paid contributions without disclosure

Misplaced Pages:Bots/Requests for approval/RotlinkBot, Misplaced Pages:Archive.is RFC (and its linked history)

I shouldn't have needed to enumerate these but since you appear to be going for the "intentionally dense" argument, I feel the need to prove it Hasteur (talk) 12:53, 30 June 2014 (UTC)
Hasteur, your insults are not required, Matthias asked a legitimate question and you've responded with more hearsay and speculation. Whatever Rotlink and Rotlinkbot were doing, that has no bearing on what Archive.is is doing and what functionality it serves. It is a far more flexible, powerful, and robust archiving service than the other two major contributors. Anytime you wish to retract your insult directed at Matthias would be a good time to do so. Darkwarriorblake / SEXY ACTION TALK PAGE! 17:47, 30 June 2014 (UTC)
Wow Darkwarriorblake, you really must be blind. It's not a matter of hearsay and what Rotlink did is relevant. Rotlink is the owner of archive.is. His use of a bot net (I've never seen a legal use of a bot net spanning such a wide geographic area) is proven. It's not hearsay; just because we don't have the person admitting it doesn't mean that it's not fact. Usage of malware to propagate itself and its willingness to ignore website's policies is very questionable at best. If you really want a good solution just ask the WMF for one. Taking over website or going into partnership with archive.org is a far better solution than using a site that is a known source of abusive behavior. Werieth (talk) 17:54, 30 June 2014 (UTC)
Even if any of it were provable beyond speculation, the site has been sanctioned since the original RFC and the first time it attempted to do something of a similar nature it would be gone forever from Misplaced Pages. The important part is that it does what is necessary, and it is a far superior archiving service to anything else on offer, and it continues to operate without Misplaced Pages. It isn't some waiting harbinger looking for the slightest weakness in our defenses to inundate us with useful archives. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:13, 30 June 2014 (UTC)
~~So lets ignore the fact that they use illegal bot nets, probable malware and abusive behavior, just because you think that they are useful? Werieth (talk) 18:18, 30 June 2014 (UTC)~~
Again you prove my point "probable" malware, not sure what you mean by abusive, I assume you mean mass link addition, which is spam and those links should have been reverted, the issue is that it didn't stop at spammed links. Bot nets and illegal, again both are theoretical, nothing proven and nothing to do with archive.is. If a bot started spamming Misplaced Pages with links to foxnews would we ban foxnews or blame the botnet? Darkwarriorblake / SEXY ACTION TALK PAGE! 18:28, 30 June 2014 (UTC)
Usage of an illegal bot net is not "probable" or "theoretical"; if you know anything about cyber security you could easily identify it as such. As Hasteur states below, if a site administrator used similar tactics, regardless of who it was (CNN/FOX/MSNBC/BBC/whoever): A) any reputable source wouldn't do that. B) our response would be the same. Had this been Fox, the fallout of such an action would mean massive problems for them and probable lawsuits, investigations by the FBI, FCC, and probably a dozen more agencies due to misuse of electronic communications. The fallout of the negative publicity alone would probably mean that anyone involved including the CTO and CEO would be terminated, and possibly sued. I use the term probable malware because I don't know of any other method to distribute software to people's computers without their permission, which is what happened in this case. If you try to deny that, I've got some nice ocean front property in Kansas for you real cheap. Werieth (talk) 19:19, 30 June 2014 (UTC)

Your analogy is incorrect. If a editor registered, made a bot that started swapping out references going to ABC/CBS/NBC/CNN/etc to a similar article at MyLiberalNews.com that was caught operating outside the bot's own userspace, that started having a Bot Requests for Approval (but was withdrawn), that in the process of the BRFA it was discovered that the editor has a significant interest in MyLiberalNews.com (owner/writer/etc), that later when a horde of IP addresses start doing a replacement of links again in the same style as the original bot, that the wikipedia community comes to a consensus that MyLiberalNews.com should be prevented from being added to any new pages, then yes we'll hold the site responsible if it appears that the actions which caused the editor/bot to be blocked in the first place are still continuing in evasion of the block/ban. Hasteur (talk) 19:02, 30 June 2014 (UTC)
Support Archive.is is not my preferred archiving site for various reasons, but for as long as no one brings forward any proof that Rotlink(bot) and the possible IP bot net was/is actually controlled by the owners of archive.is (or people related to them), or that contents archived at archive.is were altered or manipulated which would make archive.is an unreliable site, I feel bound by the basic rule of "in dubio pro reo". All I have seen brought forward so far "as facts" were not actual facts but speculative assumptions and suspicions. Based on the little evidence we have so far, Rotlink and the bots could just as well have been controlled by a competitor in the archiving or ad business trying to harm archive.is' reputation and thereby eliminate a competitor. Therefore deleting all archive.is links right now is a harmful over-reaction. It not only harms archive.is, which so far has been a rare/valuable resource for us, but it harms us as well, as it destroys valuable contributions by editors and makes verifying facts more difficult. Of course, we should continue to block Rotlink(bot) and remove links actually added by bot nets. If archive.is would actually turn out to be a malicious site at some point in the future, we could nuke them in a split-second and disable any immediate threats simply by muting the archiveurl= parameter and then run a bot to remove the archive.is url before reenabling archiveurl=. Alternatively, we could create separate archive parameters for the few common archive sites, so that we can centrally enable or disable them on an individual basis. We could also pass the archive urls through another template in order to have a chance to filter them out in case of any actual problems arising. Doing this would also turn out to be useful if an archive site's structure of semi-static links would change in the future. I consider this to be a much more reasonable approach to a - so-far - purely hypothetical problem.
While we should keep an eye on it, no immediate actions are required. --Matthiaspaul (talk) 22:40, 28 June 2014 (UTC)
There are a few legitimate bot nets, however given the distribution and number of nodes it was not a legal bot net. The person who ran the bot was the website owner. I don't recall exactly where/when that happened but I know before the mass bot attack there was a discussion with the website owner and some improvements were made to archive.is as a result. But the fact that Rotlink operates archive.is is not in dispute. Preventing archive.is isn't as simple as disabling the archiveurl parameter for that link. The bot attack used some very sneaky/devious methods. Often it obfuscates the original url, also the website owner took snapshots of the wayback machine's archive and then removed both the original URL and the link to the wayback, leaving only the archive.is url. Thirdly I have seen a lot of cases where links to a valid known and trusted site were hijacked to point to archive.is without notifying the reader. The only way you would know that a Link to the BBC was hijacked to point to a different location, with trust issues, instead of the valid target BBC is to mouse over the link and look for the target of the URL. Werieth (talk) 00:10, 29 June 2014 (UTC)
Oppose per Hasteur. Chris Troutman (talk)
Oppose per Kww and Hasteur. --Stefan2 (talk) 14:32, 30 June 2014 (UTC)
Oppose - There are too many uncertainties, even as to what law the server is subject to. It is not subject to US law. The domain is registered in Iceland, but the servers appear to be in Prague, and one of the name servers is in Liechtenstein. There are too many uncertainties as to what the purpose of the archive is. Robert McClenon (talk) 22:53, 30 June 2014 (UTC)
Oppose - Per Kww's arguments in particular. The problems with the archive, even if only half of them are only slightly true, far outweigh the benefits. - Aoidh (talk) 00:16, 1 July 2014 (UTC)
Oppose If Rotlink is willing to use botnets to spam here, there's no telling what he could decide to make his site do someday. Jackmcbarn (talk) 00:59, 1 July 2014 (UTC)
Partial-Support ( Allow manual additions, removal of links added by the unauthorized bot ) was Oppose ( no new additions, manual removal continues with caveats ) was Temporally neutral was Oppose per Hasteur. PaleAqua (talk) 01:58, 1 July 2014 (UTC)
While I'm still strongly leaning oppose ( i.e. prohibit additions ), because of the recent socking revelations, I want to reread the history a bit and so forth in case some of the stuff I have formed my opinion on turns out to be unjustified. The use of a botnet and process issues with inserting the links originally are fairly strong issues of concern and are still pretty convincing to me, so I'll still probably oppose. Also to clarify I still have the caveats pointed out in part 2 and 3. PaleAqua (talk) 07:48, 7 July 2014 (UTC)
Food for thought .. I haven't seen proof that the socks that were removing the links while Werieth already stopped can be linked to the IPs that Werieth used. As a consideration, it could also be the original IPs/Rotlink who are behind socks performing the mass removals (we know they were operating botnets ..) - clearing out the links so that hopefully the discussion becomes stale and unnecessary and this unfortunate situation can be forgotten and they can move on (I know, also this is a leap). --Dirk Beetstra 08:03, 7 July 2014 (UTC)

Okay still oppose after rereading a bunch of stuff. To clarify I am against adding new links to archive.is and think that manual ( and only manual not by bot ) removal of links should continue, except in the case that the link is the only remaining identifying information of a source. In which case if a better source ( or another valid reference to the source which might not be an online source ) can be found that should be used in its place. Otherwise if possible the original deadlink that archive.is pointed to should be restored and marked as a deadlink. If none of those are an option, replace with a non-clickable link or the like that leaves information behind that others might be able to use to find the original source. PaleAqua (talk) 16:17, 7 July 2014 (UTC)
Actually thinking more about the implications of Beetstra's comment. We know that a bot was used to spam the links, and that the IP addresses were likely compromised hosts ( though there are a couple of other possibilities if we ignore Occam's Razor ). We also know that the account behind the bot claimed to be related to the site. However, I think the decision needs to be focused on what we know the archive.is/archive.today does. The biggest concerns on that front seem to be future advertising, what if they host malware in the future, and fair use vs copyright concerns of archiving. I can't see how we can base decisions on the unknown future and we don't require other external sites to be advertisement free. That leaves the fair use vs copyright concern. I don't think that should be our concern but instead the concern of archive.is. Concern on the archiving on individual links can be handled at that level. For example suppose someone added a link to a certain computer bus architecture standard as part of a reference using some search engine's cache. It is clear that that link should be removed and not all links to the search engine. I still think that any of the spam linked should be removed, but I'm less and less convinced on banning the site at this point. Looking through the previous RfC there were several strong points about the usefulness of the archive. PaleAqua (talk) 18:44, 18 July 2014 (UTC)
The one which is malformed and canvassed? Anyway, there are usually alternatives. Meanwhile, even if it is useful (to you), it is not reputable, the site went against WP:NOTPROMOTIONAL, etc. The right way is different from the simple way. We can find alternative archives or even alternative sources (reliable sources are very likely to be archived by multiple sites). "I don't think that (fair use vs copyright) should be our concern but instead the concern of archive.is." - no, it is our concern. If Wiki contains links to unreputable sites (which can get computers compromised), then Wiki is harmed (in reputation, you can say). You are not harmed and you have less work to do (finding a more reputable archive, etc), but Wiki is the one harmed. If you still think convenience overrides everything, you may go. Forbidden User (talk) 10:13, 20 July 2014 (UTC)
And you continue to assume that one of the purposes of archive.is is to infect computers. Yet you have failed to provide proof that archive.is has infected any computer. Any website can be compromised. There is no good reason that established users should not be able to add archive.is links when other links are not available. -- Kheider (talk) 11:16, 20 July 2014 (UTC)
I based that on others' comments and the fact that tons of IPs were used to add those links, which demonstrated their will to go unethical, or even illegal (and trumped its reputation to us). Again, by your logic, give data that prove "there is a significant number of cases in which archive.is is the only choice of archiving". Perhaps I give an analogy here: When we can find a video in both YouTube and an online video watching site (with no copyright policy and bad reputation, like WatchTheseNow), we use the YouTube link; When the video is only available on WatchTheseNow, we should not use the video altogether even if we are not IT-professionals who can actually prove the claims of it injecting malware because we should be cautious on adding content. We have sites that possibly contain malware (with no technical proof) blacklisted, as well as others (like BHPBillion.com, from memory) without the solid proof you are asking for. And we do that because we don't have good faith in it, not because we assume that the site is made to attack computers. I based my words on its reputation.

The frequent request of proof is merely a step to put people who believe that the site is not clean into a dilemma: click into the site and risk the computer (and personal info) or not to give such technical proof? Good job, but it does not help building our faith in it anyway.Forbidden User (talk) 11:57, 20 July 2014 (UTC)
There are a lot of security companies maintaining safebrowsing white and black lists of malware-distributing sites.
https://www.mywot.com/en/scorecard/archive.is

https://www.mywot.com/en/scorecard/archive.today

http://www.google.com/safebrowsing/diagnostic?site=archive.is

http://www.google.com/safebrowsing/diagnostic?site=archive.today

http://www.google.com/safebrowsing/diagnostic?site=archive.org

http://www.google.com/safebrowsing/diagnostic?site=webcitation.org

— Preceding unsigned comment added by 87.69.97.159 (talk) 13:55, 20 July 2014 (UTC)
No one disputed the "never blacklisted" statement. There is copyvio concern raised (with enough basis), for its copyright policy - wait, there is none. Anything possibly constituting copyright violation is removed, and some major companies request archive sites to disallow archiving of their sites or the sites get sued (see Internet Archive). As its content may violate copyright, linking to it is not a good idea. Misplaced Pages is trusted for its clear Terms of Use and such, and in contrast, archive.is is not trusted for being the contrary. This is the major opinion to me.Forbidden User (talk) 15:56, 20 July 2014 (UTC)

No, not that one, this one Misplaced Pages:Archive.is RFC. BTW I don't think I've ever used archive.is ( or related ) links myself but that doesn't mean that I can't see how it is useful. I agree that the bot that added links was against policy and should be removed, but that is a very different issue from prohibiting other editors from using the archive.today as archive links. Yes something that should not be archived for copyright reasons has issues, but that is a per link issue. Consider the number of videos on YouTube that are unauthorized... just because they exist does not mean all of YouTube shouldn't be used. I wouldn't bother repeating, but others have given explicit examples of cases where the finer granularity of archive.is makes it more useful than the Wayback Machine in some cases. The argument about potential future malware ( or the like ) especially given the links 87.69.97.159 provided above ( Side question are archive.org and webcitation.org also the same site? - not that it really matters for malware scans if they are the same server. ) A lot of the arguments against remind me of arguing about grue and bleen. Almost any site could become a malware host in the future ( DNS changes, ownership changes etc. ), but when that happens it can be dealt with. That said I do want to see archival links given much less prominence as I noted in my like ISBN link comments in a section further down, which would mitigate a bunch of the future concerns. PaleAqua (talk) 20:41, 20 July 2014 (UTC)
YouTube has clear policies and tactics against unauthorised materials, like listing system. I agree that any site can become a malware host, the difference only being the chance. The owner of archive.is has demonstrated his/her will to achieve his/her wants by illegal (at least unethical) way, trumping the sites' reputation. (For your reference, I've tagged the IP for it having the only edit here) Take a look at the security result of WatchTheseNow, no malware, while having thousands of comments saying that anti-virus programs block the site (and thousands of automated comments on YouTube promoting the site from thousands of users). Do we trust it? No, for those suspicious promotion and its lack of copyright policies. For "that is a per link issue" - you are saying we should check all those links one by one because the site doesn't do it. I don't think that's how Wiki works. For "when that happens it can be dealt with" - no, not always, and at least Wiki's reputation is damaged. People love to say how useful archive.is is. However, no matter how good it is, a black cat is a black cat - at least its owner painted it black, and so we shouldn't use it. I remember a political leader in my country (China) said that "any cat, white or black, are good cats as long as they are good at catching rats", so do you mean we are applying that? I don't think that's a principle here, otherwise we'd have no rules on reliable sources, etc. The concern is not vague - it is here. Forbidden User (talk) 14:17, 21 July 2014 (UTC)
That parable was used to justify a market economy in China. It is a fine principle. I also recall another thing that happened in that country. During the Beijing Olympics in 2008, a photographer illegally and unethically uploaded images to Misplaced Pages. There was no copyright violation though, so the community voted to keep the images. So the site that has demonstrated a will to host unethically obtained material is this one, and we have no reputation to uphold. But we have another long-standing principle that mass changes, no matter how beneficial, should only proceed with careful discussion and consensus. So yes, we require that every link be checked by hand. Hawkeye7 (talk) 08:44, 26 July 2014 (UTC)
Yes, we should check the links. You've probably forgotten that it is you who use the links, so let me remind you... it is you who will check the links. I've proposed a change to the bot (by Kww) that it should not do too much removal per day and provide alternative archive search results. In that case you guys will have enough time to make your remedies. I did not know that Beijing issue, but Wiki obeys Virginian laws, not Chinese, so your example is not so relevant. Anyway, it is false that we don't need reputation, and it is ridiculous that we can pour black paint on Wiki because it has spots.Forbidden User (talk) 16:49, 26 July 2014 (UTC)

Do we have proof that the owner is RotLink? ( Update: See this link ). Note argumentum ad populum is not proof. Does archive.is state that they will not honor DMCA requests? I still don't see how an editor using archive.is to archive a site that shouldn't be archived is any different than, say, uploading a picture to flickr under the wrong license and then using that to upload it to wikipedia or commons. Yes ultimately flickr or archive.is would need to remove content, and the file or link should be removed from Misplaced Pages. But that doesn't mean we stop accepting all images that were once hosted on flickr just because a few had the wrong license; the same reasoning seems to make sense for archive.is. ( BTW as someone with a black cat, such negative associations of black cats are very harmful to cats. I have friends that work at pet shelters and I've heard horror stories of what people have done to black cats. When I was a kid, another black cat escaped my house around halloween and was badly injured ( leading to her death a few days later ) by neighborhood kids just because it was a black cat. I know it is just an expression, but I would really rather you not use it. ) PaleAqua (talk) 18:32, 26 July 2014 (UTC)
I did not expect only this to be disputed. Now, please clearly state, with sufficient reasoning, why you think the technical proof is insufficient. As this is quite widely agreed, with a few dissents made in ignorance of the given proof (perhaps they cannot even persuade themselves), you need to do so. By the way, the two sagas are different, for the fact that the photos lying in the Commons cannot have wider influence as long as they get removed once used on Wiki, and people need to search to get the photos, while the number is surely not 1% of archive.is links; however, once archive.is gets allowed, it will be used on a massive scale, according to some editors here. People are forced to meet the links when looking at references, with no choice of easily avoiding them. It can have uncontrollably wide influence, as editors can use them without any restrictions (yes, even "archive.is is allowed, though widely agreed as undesirable" cannot stop them). As a last note, the true identity of RotLink being unconnected to archive.is (which is not persuasive and widely disagreed) cannot wipe out the fact that archive.is is nothing clean, for its total absence of copyright policies and disobedience to a rightful script called robots.txt. For the cat, God bless your cat, and may it have a good day!Forbidden User (talk) 17:46, 27 July 2014 (UTC)
Umm what technical proof? I still haven't seen any. Again argumentum ad populum is not proof. As I've shown with changing my original position, I do take arguments into account and am always open to re-evaluate my stances, but I need to see convincing arguments not just allusions to the masses. Where did I say I was okay with a massive scale bot level usage? I am for considered manual usage of archival links. People are not forced to click on archival links; even then the real links have precedence, especially if they are live. ( And if the cite templates don't work that way, then that is a problem with how the templates work. ) Archive.is is an on-demand archiver, not a scraper, as such robots.txt isn't really that relevant. It is the people that add links that cause them to archive the pages on demand. Because of the way archive.is works, it is actually possible that ~~RotLink~~the botnet edits were an attack ( intentional or otherwise ) against archive.is as it forced them to archive a bunch of pages on a bulk level. PaleAqua (talk) 05:33, 28 July 2014 (UTC)
As someone is very angry at me, I would keep it short and talk about the central point only.

"First is Occam's razor: what would prompt anyone to actually go to the expense of negotiating individual proxy hosts in places ranging from Qatar to Brazil to Vietnam? Second is the nature of the IPs: they aren't webhosts and servers. Instead, they are individual IPs on adsl networks, FTTH networks, cable modems, etc. Everything about the setup screams "botnet". If it was a legitimate proxy arrangement, I would expect to see webhosts and servers hosted in a small number of countries with good internet access."

This is one of the explanations by Kww, with which I agree. May I clarify that I don't actually want to talk about RotLink? I did not base myself much on his act, though I do agree that his act is at least disrespectful, unethical, and promotional (for the fact that archive.is says it has lots of cheap bandwidth etc, it can be deduced that a non-stupid attacker won't use only 10,000 links (current no. of archive.is links) for his attack, and he can post them on a random site as he does not really need them to be clicked - the mass archiving will fail archive.is). I don't know much about computers, so I may not satisfy your technical request, just like how no one can give me the number of ultra-exceptional cases which necessitate the use of archive.is. Why does WMF not take a look?Forbidden User (talk) 11:52, 28 July 2014 (UTC)
Per your edit summary "For Chris, I'm not bludgeoning anything, as I have yet to spend the same efforts on it as the RfC constructers." - Contributions by myself including the actual creation of the RFC, contributions by Forbidden User. Jus' sayin'. Darkwarriorblake / SEXY ACTION TALK PAGE! 12:04, 28 July 2014 (UTC)
You guys have been at the issue since the bot request (and for Kww, he takes a major part in RfC 1 in 2013), and so I suppose you guys have spent much more time on it. I admit that I often fail to ignore unreplied messages (even those about others) and get the syntax and grammar and typo right, though. If that is taken as wrong, then I'll attempt not to do so.Forbidden User (talk) 12:26, 28 July 2014 (UTC)
Can we move such side discussions to the talk page? Or at least out of the tree of replies to my !vote. PaleAqua (talk) 15:52, 28 July 2014 (UTC)

Thanks for the reply. The IP addresses that Kww was talking about, as I understand, are the ones that added the links to wikipedia and not the ones hosting archive.is, which appear to be located in France. They are part of the original reason I opposed, but I'm no longer convinced of the connection between the bot ( or of modified AWB or what not ) and the site and/or the owners of the site. To me the biggest differences of opinion are: 1) Was archive.is responsible for adding the links here? 2) Does the archiving page count as fair use or copyright violations? 2a) If they are copyright violations can Misplaced Pages still link to them? 2b) Is robots.txt required even for sites where the fetch is triggered by a user request instead of by a web crawler/robot? This impacts how we look at the data. My answers would be 1) ~~unproven~~ edit: It looks like Rotlink is likely to be Dennis; I'm still unsure on the IP edits connection 2) covered by DMCA. 2a) On a case by case basis. 2b) No. I'm assuming yours would be something like 1) WP:DUCK 2) Yes 2a) no 2b) Yes. PaleAqua (talk) 15:52, 28 July 2014 (UTC)
Robots.txt is not prerequisite. I mentioned it to show that archive.is cares nothing about copyright (or moral or such, not repeating). Mine is WP:DUCK, others' are much better. 2) has a yes/no answer? The "complete" means unnecessary stuff, like ads, as almost no one cites a web for the ads.Forbidden User (talk) 17:42, 29 July 2014 (UTC)
Does any archival service fit in with copyrights (apart from fair use)? WebCite and Wayback Machine both mirror copyrighted content in a similar fashion to archive.is, and we've been using those for years. Archival sites usually allow you to contact the webmaster for DMCA takedowns if they host your content and you don't want it there, and I wouldn't expect this to be any different (haven't looked too deep into this yet). The primary focus on Misplaced Pages's copyright policies is generally for content that exists on Misplaced Pages, it seems to be different for things off-site that we link to, otherwise we wouldn't even use WebCite. The rationale behind using WebCite is based on the argument that it allows for WP:V of deadlinks, and that the ends outweigh the means. --benlisquare_T•C•E 04:08, 30 July 2014 (UTC)
Oppose Whoever is behind archive.is has demonstrated that they are willing and able to do anything, and they cannot be trusted. If thousands of links are established on Misplaced Pages, the archive operator can later do whatever they want when a link in an article is clicked. Johnuniq (talk) 02:04, 1 July 2014 (UTC)
Oppose Nil Einne (talk) 07:00, 1 July 2014 (UTC)
I've supported the removal of archive.is links for a while since I don't think anyone who has operated a botnet in that way can be trusted. I would note I'm somewhat confused about all this talk of advertising and copyvios @Kww:. The previous RFC did not find any evidence of advertising. The FAQ doesn't say they are currently advertising, all it says is they may include advertising after 2014. I don't know if it's changed because otherwise @Hasteur:'s comments seem confusing. But I see someone in the previous RFC suggesting it said the same thing and this archive from 2013, presuming it's accurate, says the same thing as it does now . And I don't see any advertising currently, so unless someone can actually show some evidence of advertising, I'm going to have to conclude this is an incorrect claim.

Now, while I'm not an expert on copyright law, even if running advertising on an archival site is a violation, I'm not convinced that the fact a site may include such ads in the future means it's in violation now and therefore prevents us from linking to it. @Moonriddengirl:. Presuming I'm right, there's no copyvio concern arising from advertising currently so it's not something we have to worry about for another 6 months at a minimum.

The only other copyvio concern I can see is that the site doesn't respect robots.txt. This is a greater concern. I will note however it's also unlikely to be an issue except in those cases where the robots.txt would have prevented archival, so it's not all links that are affected. See also my comment below.

Nil Einne (talk) 07:00, 1 July 2014 (UTC)
Oppose The archive.is links present a possible (even if not confirmed) threat to readership. Immediate removal is more important than having some ~16,000 articles with dangling references, most of which can be fixed in time. --MASEM (t) 15:31, 1 July 2014 (UTC)
Support I see no evidence that the links to archive.is are currently harming Misplaced Pages readers or that the archived content is being altered. I edit mostly comet and asteroid articles, and the JPL Small-Body Database and Minor Planet Center database do not allow the WayBack Machine to make archives due to robots.txt. Orbital solutions for newly discovered objects can significantly change every time the observation arc doubles. -- Kheider (talk) 17:01, 1 July 2014 (UTC)
~~@Kheider:, Are you aware of the botnet abuse performed by the site owner, and mass abuse by them? Do you know why the JPL and MPC forbid archiving? Werieth (talk) 17:10, 1 July 2014 (UTC)~~
From what I have read it seems that there is strong evidence that the author lacks good social skills since he did not bother to respond to any Misplaced Pages talk page messages. But that does not mean the author is malicious. I know a lot of programmers with poor people skills. (I do agree that the unapproved mass bot edits should be reverted.) -- Kheider (talk) 17:22, 1 July 2014 (UTC)
@Kheider: The site owner used an illegal bot net to perform a distributed multinational prong attack to force links to their site into wikipedia. If this was just a failure to respond to talk page messages it wouldn't be an issue. However the site owner ignored Misplaced Pages policy, WMF terms of use, evaded blocks, and used deceptive tactics to propagate links to their site. Werieth (talk) 17:31, 1 July 2014 (UTC)
The botnet mass-links were not approved so those mass edits should be reverted. But that does not make archive.is itself malicious. As I understand it, you can not be certain the owner of archive.is actually made all the botnet edits. -- Kheider (talk) 17:43, 1 July 2014 (UTC)
Illegal bot nets by definition are malicious. If you take the time to examine the evidence, timing, style and other characteristics you can reach a conclusion beyond a reasonable doubt that the bot net was used to run the rotlinkbot code. If the owner of archive.is did not do it himself it was done with their knowledge at least. Werieth (talk) 17:52, 1 July 2014 (UTC)
Or more specifically, when all the observable evidence is combined (the use of bot nets to add links, the nature of archive.is's practices), it smells like something incredibly fishy, and while there might be a legitimate case, this would absolutely need to be communicated by the person(s) behind it, which is not happening. They've had a chance to explain their motives and haven't decided to engage, so we're going to assume there's something malicious here. --MASEM (t) 17:58, 1 July 2014 (UTC)
+1 for Kheider. It seems that Rotlink to Archive.is is something like some editors to Misplaced Pages. A hacker, a fan, a volunteer, an autist and an our son of a bitch. But there is no evidence that he is the owner. Developing a website and running it for many years requires a different attitude. 94.181.76.11 (talk) 18:11, 2 July 2014 (UTC)
Oppose to "don't prohibit additions" part and support to " don't continue removing", per my posts under options number 2 and 3. Mayast (talk) 19:31, 1 July 2014 (UTC)
Support per Kww, Hasteur, Masem, Werieth. archive.is provides a useful service and no harm to articles or users has been shown. -- Michael Bednarek (talk) 05:12, 2 July 2014 (UTC)
Sorry, they all oppose this option. Michael Bednarek, would you like to change your stance to theirs?Forbidden User (talk) 15:34, 18 July 2014 (UTC)
Their, and later others', absurd, hysteric and paranoid "oppose" reinforced my position in favour of the archive's use. -- Michael Bednarek (talk) 07:17, 19 July 2014 (UTC)
Oppose - archive links are not necessary, and this behaviour is in gross violation of what Misplaced Pages is standing for. Continue removing them and prohibit new additions. --Dirk Beetstra 03:18, 3 July 2014 (UTC)
As long as Misplaced Pages intends to be considered a joke of unsourced, speculative, and outright made-up content, you're right, archives are not at all necessary. For anyone with the intention of information being verifiable beyond the immediate week however, archives are very much necessary. However, your opinion on the necessity, or lack thereof, of archives is not really appropriate in deciding whether or not a useful archiving tool should continue to be blocked based on a paranoid hysteria. Darkwarriorblake / SEXY ACTION TALK PAGE! 22:50, 3 July 2014 (UTC)
That is a gross misinterpretation of my point, User:Darkwarriorblake. WP:V is one of our cornerstones, but archives of sources are not necessary for that, even working links to the original are not necessary for that. The abuse by RotLink and the IPs however is ignoring our core policies and guidelines. --Dirk Beetstra 03:38, 6 July 2014 (UTC)
It is an exact interpretation of your point, archives are essential to ensuring information is verifiable. If a user cannot click a link to online material (the largest and most instantly verifiable kind) and immediately see the information they need to verify the material, then the entire article is worthless because if one point is invalid, every point can be. A lack of archiving as a back up is detrimental to wikipedia on any article that is GA or up, and frankly you shouldn't be able to pass FA without citations possessing an archive. And the abuse of Rotlink has nothing to do with what Archive.is is, Archive.is does, or the use of archives in general. Darkwarriorblake / SEXY ACTION TALK PAGE! 09:51, 6 July 2014 (UTC)
And that is a misinterpretation of WP:V - you're basically saying that offline sources are not to be used, because a reviewer may not have access to it and therefore fail the review of an article. If that is the case on FA/GA, then that process is seriously broken (and I know it is not). --Dirk Beetstra 10:20, 6 July 2014 (UTC)

See 7_World_Trade_Center#cite_note-NCSTAR_1-1-p13-12, which must be false because it, apparently, does not have a link to an online version, and hence the article should be demoted. --Dirk Beetstra 10:25, 6 July 2014 (UTC)

Even worse is the Featured Article Acra (fortress) - Only 13 of the 30 references directly link to an online version of the reference, most are in print books (and have at best a scanned version somewhere accessible through the linked ISBN/ISSN), one of them does not link to any online version in any form (suggesting it is only available offline). NONE of the references have an archive-link. According to your reasoning, without archives articles can not be properly referenced and hence never become an FA. I hence stand with the point: although archives are preferable, they are by no means necessary, and leaving out archives is in no form whatsoever detrimental to Misplaced Pages - we can do easily without. --Dirk Beetstra 10:50, 6 July 2014 (UTC)
That's a nice strawman you have there. The book references cannot be immediately verified, you'd literally have to either buy the book or visit a library to know if what it is saying is true. Digital is the way forward, that is why Misplaced Pages is popular and there aren't people coming to your front door to sell you the entire encyclopedia collection. None of this matters however, since you're comparing books to online sources that can disappear simply because the originating site changes its URL format for better SEO results, making whatever information was sourced invalid. And I can tell you from personal experience at ], that some information is just hard or next to impossible to find in online or paper format, so ensuring what is found is not only there for validation but there beyond the lifespan of the originating site and/or URL is of paramount importance to the longevity of this project, lest it wishes to remain the subject of articles like this. And I know that 30 years from now, the Joker article (short of vandalism) will be verifiable, while the articles you link will not unless their fragile paper bodies are risen to a freely available digital afterlife. Darkwarriorblake / SEXY ACTION TALK PAGE! 11:01, 6 July 2014 (UTC)
Exactly, just like maybe for a website you may have to actually find the archive - which shows exactly that the archive link is not absolutely necessary. --Dirk Beetstra 11:16, 6 July 2014 (UTC)
Sites do not archive themselves, nor are all pages archived equally. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:25, 6 July 2014 (UTC)

No, there are sites that archive other sites by themselves. Still, you can not convince me with the millions of offline references (which simply can not be archived), and millions of online references which no archiving site is allowed to archive, that not using a link to an archive on the millions of online references that, in principle, could be archived is going to be detrimental to the verifiability and reliability of this encyclopedia. The former two of that could also become inaccessible (or sometimes are, rare copies of books or behind paywalls) without any effect on the verifiability (it is just a bit harder). Moreover, whether or not we mention the archive on Misplaced Pages, one can find it, and there are alternative sites that can do the same thing. --Dirk Beetstra 03:17, 7 July 2014 (UTC)
If a site dies and no archive exists, for Misplaced Pages purposes it is a useless and invalid ref. Disallowing a link archive.is doesn't turn a valid reference into an invalid reference, but it does make it harder to distinguish between the two. It also makes verification much harder for both editors and readers. Neither of these are Good Things. All the best: Rich Farmbrough, 14:47, 8 July 2014 (UTC).

And that is why I encourage archiving, but to be done in a thoughtful manner. Your scenario exists throughout, that will also happen on links/material that cannot be archived - it is a red herring to suggest that we HAVE to archive anything we can archive because it may break in the future and not be retrievable anymore. It is more of a red herring to say that we NEED archive.is for that - there are, many, alternatives, our policies and guidelines do give archiving as an option, but there are alternatives that do not require (external) archives mentioned in the same text.
Oppose — My opposition is based on the funding mechanism for archive.is / archive.today and the consequences which come with that. The funding is private and not disclosed; there can be interruption of the service (likely permanent?) upon the death of one person; the private funding might not be sufficient and a turn to advertising would need to be taken; not a word is said about copyright in the FAQ. These points are in the archive.today FAQ. As an aside - much has been said here about robots.txt. The robots.txt file is a standard; it appears to imply no contract nor have a legal standing as an enforceable command. This is an interesting page on the topic → http://www.robotstxt.org/faq/legal.html . --User:Ceyockey (talk to me) 14:33, 3 July 2014 (UTC)
Let it be noted that webcite, one of our favoured not for profit archives, has run into funding difficulties. It is hard, then, to claim the source of funds makes the service inherently unreliable. Particularly as pure volunteer effort is better at running a low-cost service than most public/private/third-sector organizations. All the best: Rich Farmbrough, 14:47, 8 July 2014 (UTC).

Rich, you misconstrued my comment. I am not opposed due to financial instability or lack of funding; I am opposed based on the funding mechanism (private) and lack of transparency (where does the money come from). This is explained in the WebCitation FAQ, in the section "Who is going to pay for this?" --User:Ceyockey (talk to me) 15:37, 8 July 2014 (UTC)
Yes, and amusingly that FAQ is 6+ years old. I am not arguing in favour of building reliance on archive.is, it is simply a matter that in some cases archive.is is all we have. And further there is no particular reason to think that archive.is is "going away" (though it may) any more than one of the other archiving services. Unless we are prepared to start our own archiving system (which we probably should), we would be throwing out the baby with the bathwater. All the best: Rich Farmbrough, 15:54, 8 July 2014 (UTC).

Rich, I think we are still on different pages. The single-point-of-failure noted in the archive.is FAQ is of concern, yes (as I noted in my original post), but the point I'm trying to bullseye in this point-counterpoint is that of transparency in funding. Is it fair to say that we agree to disagree on the importance of transparency in funding? --User:Ceyockey (talk to me) 02:06, 9 July 2014 (UTC)
Support. Sometimes it's the only viewable citation of extremely valuable information. Regardless of the fact that yes, all citations do not have to be immediately viewable/accessible to the reader, the problems with non-clickable online citations far outweigh in my mind the past inappropriate actions of those who spammed WP with links using this site. Good-faith users should not be penalized for the past less-than-GF actions of others. Softlavender (talk) 04:45, 7 July 2014 (UTC)
"Sometimes it's the only viewable citation of extremely valuable information" - there are other archive sites out there that can take part of that (wayback, etc.). --Dirk Beetstra 04:52, 7 July 2014 (UTC)
@Beetstra: No offense, but what part of "the only viewable citation" did you not understand? The "only viewable citation" means the citation is not on other archive sites. Softlavender (talk) 05:03, 7 July 2014 (UTC)
Well, you already say 'sometimes' (and I wonder how many of the archive.is pages actually have that property now ... a few?), this will be exceptions which can be handled appropriately, it is no reason for the bulk to stay. And I still believe that if archive.is was able to archive the content, then other websites can or could have done the same (and if other sites did not, then one should wonder why archive.is could ..). --Dirk Beetstra 05:10, 7 July 2014 (UTC)
Yes, sometimes, if not many times. Your two comments make me think you haven't been paying attention. The fact that this is a major problem is why the proposal Misplaced Pages:Archive.is RFC 3#Require that another archive alternative exists before removing link (oppose/support) exists and has a noticeable consensus, and why the ability to use archive.is in the future should not be blocked. Ignorance of the problem is not a valid reason to weigh in against it. I don't want to waste my time furthering this little side discussion, so I'm going to leave it at that and not respond further; my point, and the problem, is clear. Softlavender (talk) 05:50, 7 July 2014 (UTC)
Support: I really don't see the benefit of prohibiting the use of archive.is due to the suspicion that the webmaster operated a botnet. When I say suspicion, I'm referring to how nobody has any firm evidence at all regarding this allegation, and the justification for prohibiting the archive site was based largely on speculation. Give WP:AGF one last chance, and let us see whether or not the problems of the past start up again, "give them enough rope" etc etc. --benlisquare_T•C•E 11:10, 7 July 2014 (UTC)
When it violates policies (and even the Terms of Use), WP:AGF no longer covers it, as policies override guidelines. Chances were given, but not treasured by the site.Forbidden User (talk) 15:34, 18 July 2014 (UTC)
Oppose – The links were added by a botnet, that much is clear. Whether by Rotlink or someone else doesn't really matter. They should be removed. Archive.is shouldn't be profiting with better Google ranks at the expense of our reputation. Mojoworker (talk) 16:08, 7 July 2014 (UTC)
There is no evidence that Misplaced Pages has helped archive.is with their index rankings, and the statistical evidence seems to even argue against you. Most of archive.is/archive.today's Alexa growth occurred independently of Misplaced Pages, otherwise why do the fluctuations make no logical match with the events that occur on Misplaced Pages? --benlisquare_T•C•E 16:52, 7 July 2014 (UTC)

Misplaced Pages automatically adds nofollow to outgoing links to avoid adding to the page rank. PaleAqua (talk) 16:57, 7 July 2014 (UTC)
Benlisquare, I was talking about search engine ranking, not Alexa ranking. PaleAqua, I'm pretty sure I remember from back five or so years ago, when nofollow was implemented, that it is somewhat a misnomer in that, despite what it implies, Google does indeed follow the link (and IIRC Yahoo ignored nofollow completely at the time). I believe there was implication that if sites other than Misplaced Pages link to the target, so that it is already indexed by Google, the link may actually affect rank. Things may have changed in the interim and whether or not it actually does result in a higher page rank, I don't know, but the rest of my argument stands. Mojoworker (talk) 18:53, 7 July 2014 (UTC)
I did a quick search and found this which is what I remember. But, I guess it was more like 7½ years ago, so perhaps much has changed… Mojoworker (talk) 19:01, 7 July 2014 (UTC)

Regarding "Most of archive.is/archive.today's Alexa growth occurred independently of Misplaced Pages" - Is it a result of the quality of the site, or a result of aggressive SEO tactics? --Dirk Beetstra 05:12, 9 July 2014 (UTC)
Is the success of Windows 95 due to the quality of the operating system, or as a result of aggressive marketing and monopoly-forming antitrust? Use a bit of common sense here: Anyone with their own personal creation, or a significant stake in something, would not want to see it fail or fall into disuse, and thus would obviously promote their creation as much as they can to keep it alive. Archive.is is used in various places outside of Misplaced Pages, who knows whether it's due to word-of-mouth or some other reason, but its overall growth has little to do with Misplaced Pages. To say that the creators of archive.is never spent effort promoting their site is naive (because they most likely do), but to say that they shouldn't do so due to moral reasons is even more naive. If the world worked on morals, we should all be using Solaris and OS/2, and not Microsoft Windows. --benlisquare_T•C•E 08:23, 9 July 2014 (UTC)
That is exactly the answer that I expected. I am however not talking about morals, I am talking about our pillars: it is against our policies and guidelines to be a vehicle for that (and in conflict now as well with our Terms of Use). --Dirk Beetstra 11:01, 9 July 2014 (UTC)
How would usage on Misplaced Pages make the situation any more different than it currently is? With the editfilter in place, archive.is is still getting hits, and plenty of them. --benlisquare_T•C•E 15:00, 9 July 2014 (UTC)
But it's not using ads and Misplaced Pages hits are a small part of its traffic. Much of the issue of its captures and the bot run stems from the simple script with the Memento system as noted by the Web Science and Digital Libraries Research Group. I do not see why it matters if it gets hits when Archive.org and Webcite do not have the content - you seem to advocate an eternal 404 over a known backup. ChrisGualtieri (talk) 04:54, 25 July 2014 (UTC)
Support as per Darkwarriorblake, the Archive.is service of itself has shown no negative effects and provides a net beneficial service. The implementation of the bot and any spam linking should be prohibited, but Wikipedians using the service as an archive and linking to the site should not be prohibited. -- Ham105 (talk) 09:39, 8 July 2014 (UTC)
Support 100% agreement with Ham105. All the best: Rich Farmbrough, 15:57, 8 July 2014 (UTC).
Support Link rot is a bad thing, and so is blacklisting a website that helps to protect against link rot of references used to verify content on Misplaced Pages. If spamming is still a concern, then the abuse filter should be modified so that only experienced or flagged editors may add links to archive.is. --Joshua Issac (talk) 15:14, 17 July 2014 (UTC)
Oppose Supporters here seem to stress that the site has "almost no negative effect on users", which is itself disputable. Arguments based on it would be weak. Another issue is that the RfC is not neutral. The "background" should be renamed "how good the RfC creator feels about archive.is". This would inevitably degrade any "consensus" built here. The discussion above has revealed its violation of Wiki policies and guidelines and the Terms of Use, which is a legal issue. This site may not harm your computer, but it has harmed others' (here I assume that the claims by editors that it injects malware, etc, are true), and it is the thing that harms Misplaced Pages much more than LinkRot. If the link is rotten, then find an alternate archive or find another source. It is our burden to keep Wiki verifiable while not using any disreputable website. Wiki should not have links to such websites. "Shown no negative effects and provides a net beneficial service" is another false statement, as it lacks the object - to me. It may help some individuals, but not Misplaced Pages. Using "beneficial" (a.k.a. convenient) as an excuse for using such websites is unacceptable.Forbidden User (talk) 18:08, 17 July 2014 (UTC)
Perhaps I'm not stressing things enough: Permanent and irreversible information loss is real. The library of Alexandria burned down, along with centuries of human knowledge. The first emperor of China burned books and buried scholars. Anything can happen to information. Regardless of the case, information redundancy is always a good thing, there should always be copies of information, and this also holds true for information on the internet. Online content can disappear at any time, and this hurts verifiability on Misplaced Pages. The argument that "if the information disappeared online, then it probably wasn't even worth documenting in the first place" is a red herring, because content can disappear online due to a multitude of reasons, from web host bankruptcy to even government censorship. Your argument is largely making assumptions as to why links rot, rather than addressing the problem of link rot. --benlisquare_T•C•E 02:44, 26 July 2014 (UTC)
I have said that I agree with "permanent and irreversible information loss is real", and I did not say archiving is not needed at all. Yes, online info can disappear wholly, and archive.is could be the only backup, but you will have to prove its significance (by numbers). Besides, you (and another person) seem to love focusing on postnotes. I asked that question to make others rethink a bit only. This can be true to one person, and untrue to you - it doesn't matter. Also, look through all my comments. Don't make me repeat. Forbidden User (talk) 17:25, 26 July 2014 (UTC)
Support Archive.is service of itself has shown no negative effects and provides a beneficial service. Blacklisting is disruptive and damages the integrity of the Misplaced Pages. I understand the desire not to allow commercial use of the pages but this is not the way to go about it. Hawkeye7 (talk) 12:16, 21 July 2014 (UTC)
Support - I find it reprehensible that the "opposers" have resorted to more unsupported rhetoric and misinformation. Forbidden User takes the malware injection assumption as fact and hangs on it despite it being a speculative "What if". Mojoworker's opposition seems to rest on the assumption that Rotlink was the site operator of Archive.is when that simply isn't accurate or proven. Now "botnet" is laughable, the editor who broke the rules and mass-added them is a problem - but the first RFC included people who were independent and punished for one user's script. Bypassing blocks with a proxy is bad, but that is not a botnet. Our long term abuse users could all be "illicit bot net operators" by that token description. Archive.is continues to be important for Video Game related sources and particularly GameSpot and other more complex interfaces. Webcite and Archive.org removed their snapshot availability after the robots.txt went up, but the content was now shifted and broken. Archive.is retains the original appearance and the content itself has not been paywalled or such. I'll not debate endlessly on the subject of its appropriateness, but Misplaced Pages linked directly to the Pirate Bay, the Silk Road marketplace and extremist websites, even using their own pages, because it's a source. I'm against continuing to degrade Misplaced Pages and promoting linkrot - the single user's actions were a problem, but don't break some 20,000 working links and prevent the addition of many more because of an unapproved bot operator. ChrisGualtieri (talk) 04:39, 25 July 2014 (UTC)
Thank you, thank you, for pointing out that editors want general allowance of archive.is because a video game site is very well archived and so is useful for one project. Kww has mentioned it, though I wouldn't use it if you hadn't verified it. This editor has demonstrated his problem's localness, that is, it mainly affects one particular project (the reason why people from that project are so eager to voice their views here). He also fails to demonstrate the significance of cases in which there is only one RS available while archive.is provides the only backup while the archive does not violate source sites' policies, copyright and robots.txt (though causing inconvenience, it ensures that copyright, etc., is duly respected). "Now "botnet" is laughable, the editor who broke the rules and mass-added them is a problem - but the first RFC included people who were independent and punished for one user's script. Bypassing blocks with a proxy is bad- but that is not a botnet." — laughing at others' opinion is laughably a violation of WP:Harass. Ugh, read my reminder, please. You did not look at the RotLink saga thoroughly — only a bot can do his mass-editing, unless he is a superman (agreed by editors experienced in bots). The mass adding from IPs indicates that the program was being run on many computers (technical explanation given by others), so the owner's will to misuse technology for his own interest is well-demonstrated by himself, not any speculation. He claims to be the owner of archive.is and the claim is widely agreed to be genuine. Are you questioning the group intelligence at that time and saying that you are superior? Perhaps you are the one resorting to unsupported misinformation, which, by yourself, is reprehensible (again, no comment on editors, not even vandals and SPAs, savvy?).

As for the links to the extremist sites, etc. — they are sources for WP:NPOV, unlike archive.is, which is not a source (remember WP:SELFSOURCE). We do not use their services (like putting a pipebomb somewhere). See fair use for Free Republic being knocked down. It is not as unsupported as you think (and I'm not going to Wikilawyer anymore). I've clearly stated that's my idea, not putting assumptions as facts.

20,000 - cite where it comes from. Anyway, I'm requesting for another number, not this.

You are not arguing for the appropriateness, of course. I myself find it a tough task to justify the site's action (and its extremely blurred "I can't promise that I won't add ads. Well, I can promise I won't until the end of 2014", which is one reason for saying "archive.is does use advertising"; another is that others mentioned it replacing ads, ask them for evidence). I cannot say it's legal, I cannot say it's fair, I cannot say it complies with God's will (this one not that serious), and so I abandoned my very original supportive stance.

The last thing is, if you are here for failing to find more than one source for trivial info, think of it: is it WP:UNDUE? Are you facing a dead link because the thing itself is too trivial? Forbidden User (talk) 17:27, 25 July 2014 (UTC)
For more detail please feel free to continue to discuss this on my talk page, but this is my reply in short. I don't care if it's a "local" problem or not, you are impacting literally thousands of articles and hundreds of Good and Featured articles' verifiability. It affects many other topics I work on, but it's not as prevalent amongst Misplaced Pages's best content. The definition of botnet is pretty clear that this was not an automated process by machine - it was controlled by a single user. A one-click archiver and javascript does not make for a botnet. If Rotlink had AutoWikiBrowser a thousand times more edits could have been done - automatically, but it wouldn't be a botnet by definition. Going on assumptions of ads and the site closing as reason to remove this is just more contrived excuses. Webcite wasn't sure they would be operational and Archive.org has been even more shaky in its history. Cross the bridge when you get there I say. Now, Forbidden User, you think it's over trivial details because I mentioned the Gamespot matter - what a bold ad hom you make! Trying to discount my entire position because I mention hundreds of Good Articles and Featured Articles being affected and writing it off as some trivial matter instead of important development and production articles that were exclusive to the sites! Might as well forget about Lucille LaVerne and the Evil Queen (Disney) thing I see you working on because Archive.is just so happens to have the non-blog version and the backups to the Helen Gahagan lines that are irretrievable in that article right now! At least Archive.org works for the particular link which resolves it nicely for you - this time! Archive.org was the preferred backup for many, but for those that don't work - Archive.is allows us to verify that which was referenced. ChrisGualtieri (talk) 19:52, 25 July 2014 (UTC)

Forbidden User, you are entitled to your opinions, however could you please refrain from being obnoxious and irritating about it? Video games are a part of this encyclopedia project, just like physics, psychology and politics, and there is no reason for you to behave condescending towards people because of their involvement in a topic. Verifiability is a universal issue on Misplaced Pages, and additional archiving benefits everyone. There was no reason for you to take a shit on him simply because he used video game articles as an analogy. --benlisquare_T•C•E 02:15, 26 July 2014 (UTC)
"Obnoxious", I guess you mean. First of all, I should admit I'm quite hard on the issue, so apologies if anyone feels offended, and with your cooperation, I can tone down a bit. I'm actually not impacting that many articles, because the number of cases where the removal of an archive.is link makes the content impossible to verify at all (that is, no alternate source, no alternate archives, a link is absolutely needed, and content not trivial) ≠ total number of references using archive.is (and is insignificant, unless you find reliable numbers to prove otherwise). Thank you for your opinion on the definition of botnet, though other experts in bots didn't agree (at that time). The owner realises and solidifies the ad concern and site closing himself, not me. Wiki itself is quite shaky too, given its donation-dependent nature, so using your logic, we should quit immediately to avoid wasting more time, which won't be much supported. Lucille LaVerne is not even on my watchlist, fyi. Thank you again for the alternate link, which demonstrates exactly the feasibility of using alternate archive links (that is, it does work). Again, I anticipate the numbers I frequently request. You have better reasonings than other "supporters", honestly. Good day! Forbidden User (talk) 17:17, 26 July 2014 (UTC)

I think someone has raised that "additional archiving benefits everyone" is a red herring to "we should archive everything" and so on, so I'm not repeating here.Forbidden User (talk) 17:17, 26 July 2014 (UTC)
Oppose, prefer the second option.—S Marshall T/C 21:51, 25 July 2014 (UTC)
Support, the original ban was a knee-jerk, and nefarious intentions of alleged spammer and link to the site were never proven. The archiving quality of the site is good, and having the archive links working for WP does infinitely more good than harm. In some cases, it's proven productive where other engines have failed. -- Ohc 08:42, 30 July 2014 (UTC)
Prohibit and Remove - reading about the botnet was pretty much game over for me. There's other issues, but that alone is enough for me to oppose this website from being anywhere on Misplaced Pages. Treat just like a banned user. No edit (even if seemingly of value) is to be kept. - jc37 07:30, 9 August 2014 (UTC)
Prohibit and Remove. I really do not think this is a reliable archive and we should not use it. See comment below: archive.is and archive.today are not reliable archives.--ツDyveldi _{✉ post} 19:57, 29 August 2014 (UTC)
STRONG SUPPORT. A bad decision by wikipedia caused this mess because the bot should have been approved in the first place, and things snowballed from there. We ought to fix the damage caused, by reversing that bad decision. Fact is, when writing a bot, users have been encouraged to try it out before getting formal approval. That is normal. Instead the bot was blocked because it was "not approved", and as noted, this block was procedural, and made based on the lack of approval, not the quality of the RotlinkBot's edits.--Elvey 17:52, 9 September 2014 (UTC)
That's definitely a part of the story… —innotata 06:55, 15 September 2014 (UTC)
Support I pretty much agree with every other opinion in support, especially ChrisGualtieri's. First, I don't see how we should be swayed so much by user conduct/disputes in determining how useful a website is as an external link. Sites that definitely are valuable to have as external links simply shouldn't be blacklisted, permanently, and without exception. Then, as far as other issues with the site, it would be an attempt to decide what U.S. law provides for to say we're banning archive.is (or the Wayback Machine or WebCite) due to copyright law; we're not lawyers, much less U.S. federal judges, and this is not the same as our normal precautionary removal of copyright violations. —innotata 06:55, 15 September 2014 (UTC)
Well, it's unlike the ape selfie issue, where editors really try to interpret laws themselves. We can raise concerns over legal issues, but not put them as solid hard facts (there are not many). The ape selfie case is worth noting for its harm to Wiki's reputation though. No one can say it certainly is copyvio, but Wiki gets harmed anyway, no matter what the truth is. To avoid making a mess, I'd try to keep myself from continuing here. P.S Forbidden User (talk) 16:51, 18 September 2014 (UTC)
Support The offending bot is well in the past, and really has no bearing on the current discussion. I agree that there is no evidence that the links to archive.is are currently harming Misplaced Pages readers or that the archived content is being altered. Link rot is inevitable, and the Wayback Machine's retroactive reading of robots.txt makes that more likely. Links that I thought I had protected are now gone. While I like WebCite, it doesn't always succeed where the Wayback Machine does, and vice-versa. Lastly, there are a number of pages from the now defunct German World Gazetteer that are only extant on archive.is. --Bejnar (talk) 16:58, 23 September 2014 (UTC)
Support, as someone who is still of the firm opinion that the initial blacklisting was out of order, was not fully consensus backed, and who has also seen a number of web pages only archived on this site. Lukeno94 (tell Luke off here) 18:03, 12 October 2014 (UTC)
Better solution: - I strongly support using Archive.is but please stop editing articles just for adding archive links, this is like adding interwiki links - insane! No need to load the servers with gazillions of edits for adding (and then removing!) archive links or interwiki links. For interwiki links, the solution was Wikidata. For the archive links, there is a gadget (JavaScript) that automatically adds the archive links after each external link. We are using it at Romanian Misplaced Pages and it works flawlessly. I announced those gadgets more than one year ago at Misplaced Pages:Gadget/proposals#Archives of external links. In short: ro:MediaWiki:Cache.js adds the Archive.is links and ro:MediaWiki:Cache-Archive.org.js adds the Archive.org links. Try it and then use it and then we'll only have to talk about allowing or not allowing such a gadget to show the Archive.is links. — Ark25 (talk) 21:27, 13 October 2014 (UTC) — P.S. In the future, there might be 20 archiving sites available to use. And then, what do we do? For each online reference, do we edit the article containing that reference 20 times, in order to add the archive links? Say an article contains 100 references. Then 2,000 edits will be just for adding archive links. And each time we add a reference in the article, the edit will be quickly followed by 20 more automatic edits for adding the archive links. What would the history of the article look like? And how will the server "feel" with its database filled with gigatons of unnecessary information? Isn't it just better to make 20 gadgets, and then the user selects the archives he prefers to use? — Ark25 (talk) 21:43, 13 October 2014 (UTC)
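The arithmetic and the display-time approach in the comment above can be illustrated with a minimal Python sketch. This is not the code of the Romanian gadgets (those are JavaScript and are not reproduced here); the function names and the lookup URL patterns are assumptions made for illustration only and should be checked against each archive service before any real use.

```python
# Hypothetical lookup patterns, assumed for illustration only; they are not
# taken from ro:MediaWiki:Cache.js or ro:MediaWiki:Cache-Archive.org.js.
ARCHIVE_PATTERNS = {
    "archive.org": "https://web.archive.org/web/*/{url}",
    "archive.is": "https://archive.is/{url}",
}


def archive_lookups(url, enabled):
    """Build archive lookup links for one external link at display time.

    `enabled` is the list of archive services the reader has switched on,
    mirroring the "one gadget per archive" idea. Unknown names are ignored.
    """
    return {
        name: ARCHIVE_PATTERNS[name].format(url=url)
        for name in enabled
        if name in ARCHIVE_PATTERNS
    }


def edits_needed(references, archives):
    """Edits required if each archive link is added by its own edit."""
    return references * archives


if __name__ == "__main__":
    # 100 references and 20 archive sites, as in the comment above.
    print(edits_needed(100, 20))
    print(archive_lookups("http://example.com/page", ["archive.org"]))
```

Because the links are computed when the page is shown, this approach needs no edits to the article itself, whichever archives a reader enables.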
Support I was tripped up by the editfilter that is designed to prevent any new archive.today links from being added. Since the original url link was broken, I had to look for other non-archive.is/archive.today providers for the same link and those alternatives didn't archive the page. In the end I had to remove the archive.today link just to save my edit. If the editfilter was a bit more flexible, I would have invoked ignore all rules because it was clearly interfering with verifiability by intentionally making the contents less accessible. OhanaUnited 05:31, 15 October 2014 (UTC)
@OhanaUnited: You mean this reference? Or this? (I can't check whether it is the same as the archive.today link since I am not allowed to go to those websites ..). --Dirk Beetstra 06:23, 15 October 2014 (UTC)

Continue with removal of existing links (oppose/support)

Support. While the RFC was phrased in a circumspect manner, there's no reasonable doubt that the owner of archive.is used an illegal botnet to add links to Misplaced Pages, and I mean illegal in the sense of contravening actual law, not Misplaced Pages policies. We should not use our status as the sixth largest website to provide links to someone that has demonstrated that he will use compromised computers to achieve his goals. That places our users in unnecessary peril.
Further, the use of advertising on a site that takes snapshots of other people's contents raises substantial copyright questions: it's hard to justify taking a complete copy of someone's work and using it to attract people to ads under current copyright law.—Kww(talk) 20:07, 27 June 2014 (UTC)
You have no evidence of any wrong doing and it is not reasonable to throw accusations against the site or its usefulness, if the aim was to archive the web, post links on wikipedia, and profit, then that went out the window after the last RFC and the site would be shut down. Walt Disney wasn't a great guy either, but we don't remove the Disney articles because it helps funnel profits into Disney's pockets, Rotlink and Archive.is are not the same thing. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:20, 27 June 2014 (UTC)
This demonstrates solely that you have not taken the time to analyze the IPs presented in the previous RFC or other discussions, not any weakness in my argument.—Kww(talk) 21:47, 29 June 2014 (UTC)
Oppose per above. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:20, 27 June 2014 (UTC)
Support Per Kww's points Werieth (talk) 20:44, 27 June 2014 (UTC)
Strong oppose see my reasoning in response below. - Favre1fan93 (talk) 22:09, 27 June 2014 (UTC)
Support with the caveat that the links not be absolutely lost but replaced with appropriate replacements. Something that I observe Kww has been kicking and screaming against. Hasteur (talk) 22:30, 27 June 2014 (UTC)
Oppose, see comments below. Corvoe (speak to me) 20:04, 28 June 2014 (UTC)
Oppose. See reasoning in my detailed response above. --Matthiaspaul (talk) 23:09, 28 June 2014 (UTC)
~~Oppose~~ The links should be replaced with other archive links first before removal. Del_♉sion23 (talk) 00:30, 29 June 2014 (UTC)
I'm genuinely confused by your comment, Delusion23. Do you support or oppose this proposal (the removal of links)? Supporting this proposal and supporting proposal 3 would be "remove links after replacing them with another archive", which would seem to be what you are saying.—Kww(talk) 00:49, 29 June 2014 (UTC)
Support removal if they are replaced with links from another archive. I don't support the removal if they are not being replaced. (sorry that it was confusing) Del_♉sion23 (talk) 09:19, 29 June 2014 (UTC)
Oppose. Bertaut (talk) 02:37, 29 June 2014 (UTC)
Oppose Removing these links without discretion and without facilitating a shift to other archives is harmful to the life of the articles. I caught this being done on one of the articles I watch, and by chance archive.org had mirrored the source at the right moment in time so I was able to repair the rot. Other articles might not be so lucky. If it's decided that some or all archive.is links should be removed from Misplaced Pages, then it should be done deliberately and with a plan of including other archived sources to ensure that the removal does not harm the article. --Odie5533 (talk) 21:38, 29 June 2014 (UTC)
Yes, I have an example of such a 'less lucky' article on my watchlist. Yesterday, Werieth removed the archive.is link for the Foreverly (2013) album sales, and when I expressed my frustration with that action, he responded by providing a different URL that was archived in January 2010, before the album was even recorded (!), and also leaving a 'friendly' message on my talk page. The absence of a sales numbers reference in that particular article probably isn't a big deal, but it's one of many articles in which the refs are being removed while no other alternative captures exist. As you can see below, I strongly support the third option. — Mayast (talk) 22:34, 29 June 2014 (UTC)
Support I don't care if there's a replacement or not. Misplaced Pages is not a business partner of anyone, for any reason. Chris Troutman (talk) 04:12, 30 June 2014 (UTC)
What on Earth are you talking about? This is not a question of being in business with anyone nor promoting and so I assess, based on your comments, that you haven't read the opening. Darkwarriorblake / SEXY ACTION TALK PAGE! 17:43, 30 June 2014 (UTC)
Support per Kww and Chris troutman. --Stefan2 (talk) 14:34, 30 June 2014 (UTC)
Support - Remove the links. We don't need an alternate archive. Misplaced Pages is its own archive. Robert McClenon (talk) 22:53, 30 June 2014 (UTC)
You don't seem to understand what an archive is. Darkwarriorblake / SEXY ACTION TALK PAGE! 22:57, 30 June 2014 (UTC)
Support - Per Kww and Chris. I don't believe these links belong on Misplaced Pages; whatever perceived good they do does not justify it. - Aoidh (talk) 00:22, 1 July 2014 (UTC)
Support Per my comment above. Jackmcbarn (talk) 00:59, 1 July 2014 (UTC)
Support Per Kww and, frankly, the juvenile display at ANI. Protonk (talk) 01:02, 1 July 2014 (UTC)
Support reversion of damage caused by block evaders is business as usual. Stuartyeates (talk) 01:17, 1 July 2014 (UTC)
Conditional support - Conditional removal in cases that there are no other links, the links that were being archived should be restored even if they are dead or replaced with other sources first. See "Require that another archive alternative exists before removing link". Full support in the case of external links sections. PaleAqua (talk) 02:03, 1 July 2014 (UTC)
I agree with much ( if not all ) of what Nil Einne says below. Especially the bit about online only vs sources with offline counterparts. PaleAqua (talk) 07:05, 6 July 2014 (UTC)

Given my updated comment to first section, I'm specifically referring to the links added by the unauthorized bot. PaleAqua (talk) 15:21, 19 July 2014 (UTC)
Support The archive operator cannot be trusted, and the links have to go, per this and previous discussions. Johnuniq (talk) 02:06, 1 July 2014 (UTC)
Partial support See my comment above for why I support. Nil Einne (talk) 07:08, 1 July 2014 (UTC)
As for why it's partial, as I've expressed at AN my view is similar to PaleAqua. I will only support if citations are not removed in their entirety, as has happened in the past. I will not support removal of citations, even in the case of a completely bare URL and where the archive.is URL doesn't work, since it's possible that the original URL or an alternative archive source can be recovered, e.g. from the history of the article or by searching for the archive.is URL. While the info will always be present in the history, it's unrealistic to expect people are going to be aware that the info was cited, and the citation was removed without some indication this happened in the current page (preferably the talk page too).

I also now definitely feel that the archive.is URL should not be removed if it is the only URL and it's a web-only citation (in other words, this doesn't apply to offline news sources, journals etc. where the URL is just a convenience link) until the original URL is recovered from archive.is, even if there may be enough info in the citation to find the URL. (This includes cases where archive.is itself isn't working for reasons I outlined earlier.)

I'm fine with removal even if the original URL is dead and there's no current alternative archive, even though I recognise having the archive.is URL may assist in finding an alternative archive and it's possible no other archive will be found. One of the factors of course is that anyone looking for an archive could easily still look up and see if archive.is has a copy and then use that to help them find other copies, so it's not really that big a loss. And as I've said above, I don't think we can trust archive.is, so it's not something we want to keep around for ever, so the apparent lack of an alternative archive is something we'll have to live with until a solution is found.

As I've said at AN, I'm not saying we have to keep the archive.is URL as a clickable link. I'm fine with hiding the URL, or even doing something like putting a hidden comment with AISID where the ID is contained if people feel it's necessary. (In cases where it's not the archive.is ID but the full URL, then it's trivial to recover the original URL without visiting archive.is so it shouldn't be an issue.)

But as I've also said, I think there's a fair chance the numbers are small enough that they can be dealt with manually within a few months. And combined with the lack of any real evidence of current harm no matter how dodgy the people behind archive.is may seem, I don't personally feel it's a big deal to keep the likely small number of cases where archive.is is the only URL until they are dealt with.

And just to be clear, I'm only referring to citations. I'm fine with the complete removal of external links to archive.is.

Nil Einne (talk) 06:56, 1 July 2014 (UTC)
Support They have to go, they present a danger (even if not malignant at the present time) to our readership. Damage control in the safer environment can be done later. --MASEM (t) 15:33, 1 July 2014 (UTC)
Strong oppose when no alternative archives exist.
Neutral if the archiveurl would be replaced with another archive with similar time of capture (and not captured for example three years earlier, as with Foreverly). — — Preceding unsigned comment added by Mayast (talk • contribs) 19:25, 2014 July 1
Oppose per Kww, Masem, Werieth. archive.is provides a useful service and no harm to articles or users has been shown. -- Michael Bednarek (talk) 05:12, 2 July 2014 (UTC)
Oppose unless replaced with alternative archive/source; concerns raised don't seem serious enough to justify the damage this can cause.--Staberinde (talk) 13:54, 2 July 2014 (UTC)
Support - archive links are not necessary anyway. Remove them first, find alternatives later. --Dirk Beetstra 03:19, 3 July 2014 (UTC)
Support — if we want to move away from archive.is, let's do it and not just half-way. So, I'm coming down on the side of full implementation of the decision and deprioritising Misplaced Pages content support. --User:Ceyockey (talk to me) 14:36, 3 July 2014 (UTC)
Oppose — If there is no alternative provided. STATic message me! 05:01, 4 July 2014 (UTC)
Oppose. I'm thankful Werieth "retired" before he/she deleted every single valid and legitimately placed archive.is link in the entire encyclopedia without adding (or even bothering to explain how to try to find) a single alternative. There is no need to delete existing links unless we know for certain they were placed into WP in bad faith. Softlavender (talk) 23:23, 5 July 2014 (UTC)
I totally agree.

Regarding Werieth's "retirement", he not only deleted archive.is links without substituting them with alternative urls or reinstating the original (possibly dead) urls (where they were missing), but in numerous cases he deleted slightly malformed, but otherwise completely valid and easily fixable references as a whole, leaving previously referenced material unreferenced (one example: ). Regardless of the outcome of this RFC, someone will have to go through all his edits and try to recover and fix such lost references. It's bitter, that it took our admins so long to stop this obviously "malfunctioning" editor; in my judgement Werieth's actions have caused magnitudes more damage to Misplaced Pages as a project and the spirit in our community than anything done by archive.is so far; in fact, so far archive.is itself did not cause any harm at all (the bot net and to some lesser extent Rotlink(bot) did). --Matthiaspaul (talk) 09:25, 6 July 2014 (UTC)
Oppose: Why is enwiki the only project that has a beef with archive.is? No other Wikimedia project has a problem with it, which suggests to me that we're thinking too much on the issue. If the webmaster is as malicious as everyone claims, surely this archival website would have caused an uproar on the other projects as well? There is no reason to assume that the users of other wikis are simple-minded blind-deaf-mutes, and enwiki users are some kind of exclusive voice of reason. Archive.is links are used everywhere on the Chinese and Russian Wikipedias. --benlisquare_T•C•E 11:13, 7 July 2014 (UTC)
Not all of us are assuming malicious intent. My concern is based primarily on the financials of the operation. --User:Ceyockey (talk to me) 12:50, 7 July 2014 (UTC)
Do note that there are several Wikis where RotLink and RotLinkBot are/have been blocked, and where blocks were issued to the IPs who were running illegal bots. En.wikipedia is, by far, the biggest wiki, and is often on the front-line of noticing possible issues. --Dirk Beetstra 05:18, 9 July 2014 (UTC)
Hm. According to SUL those blocks are irrelevant to archive.is and illegal bots. Discussion in German (the second) wiki does not mention them. Also in many smaller wikis where RotLink and RotLinkBot are not blocked they have bot flag.— Preceding unsigned comment added by 118.38.211.247 (talk • contribs)
"irrelevant to archive.is and illegal bots" - I think that 9 out of 10 blocks on RotLink are because they were running illegal bots on their main account, and at least 6 blocks on RotLinkBot are for undeclared bot (I can't read the last message). I do however agree that on some wikis they have a bot-flag (for RotLinkBot) as well. I do note that RotLink(Bot) is still active on Wikimedia. --Dirk Beetstra 06:02, 9 July 2014 (UTC)

What are you talking about, 118.38.211.247? Every block on that list is about unauthorized bot usage, and the German discussion is about unauthorized bot usage.—Kww(talk) 06:05, 9 July 2014 (UTC)
"Illegal bots" is "botnet of compromised computers" or "wikibot unapproved by wikiadmins"?

If first, it is only enwiki issue. If second, then word "illegal" is too strong and misleading to the first issue which is enwiki only.

What I see in dewiki. They do not associate RotLink to "illegal bots" nor to archive.is.

All examples of bot malfunction given there are not related to archive.is.

You may use google translate to read German or other language. It is cool.— Preceding unsigned comment added by 118.38.211.247 (talk • contribs)

They were operating bots who were adding only archive.is links, that was the nature of why they were blocked. They were blocked because they were mass adding archive.is links using automated accounts. Indeed unauthorised bot usage (maybe my word 'illegal' is there a bit strong).

This here is indeed an en.wikipedia-only issue, but the argument that we are 'the only project that has a beef with archive.is' is not the case, the link-pushing was, obviously, also noted on other wikis.

You are right that other wikis have not made the link (or those discussions have not been found). That however does not mean that it may not be a cross-wiki issue, a WikiMedia-issue or a legal issue. A lot of (cross-wiki) actions go unnoticed for a long time, as this whole situation has been showing in itself already. --Dirk Beetstra 07:16, 9 July 2014 (UTC)
Support – As I said above, the links were added by a botnet, that much is clear. Whether by Rotlink or someone else doesn't really matter. They should be removed. Ideally, commented out in the Wiki markup if possible, since they may be useful in finding replacement links. But again, archive.is shouldn't be profiting with better Google ranks at the expense of our reputation. Mojoworker (talk) 16:15, 7 July 2014 (UTC)
Support, we've seen enough bad faith from archive.is operators so that I personally can't trust them with something as important as many thousands of our references. I don't particularly trust them with regards to security of users visiting their site. Max Semenik (talk) 20:23, 7 July 2014 (UTC)
Oppose. And I'm also thankful Werieth has "retired" before reinstating more dead links. -- Ham105 (talk) 09:39, 8 July 2014 (UTC)
It is a fact that you give no reasoning as well. Forbidden User (talk) 17:45, 17 July 2014 (UTC)
Oppose - I have had cases where archive.is is the only open existing archive of a deleted page. There are no grounds to oppose except vague "suspicion". All the best: Rich Farmbrough, 14:34, 8 July 2014 (UTC).

"I have had cases" - can you put numbers to that? --Dirk Beetstra 05:18, 9 July 2014 (UTC)
The onus to provide the numbers is on you!!! You are the one destroying the pages! And so far the only case you have mounted is that you feel like doing it. Hawkeye7 (talk) 12:28, 21 July 2014 (UTC)
He raised the point, and so he has to provide solid proof. Obviously you (instead of Dirk) are being WP:IDONTLIKEIT. By the way, read my reminder in Discussion, please. Forbidden User (talk) 18:10, 23 July 2014 (UTC)
Oppose It makes no sense to harm Misplaced Pages by breaking links to references (especially where archive.is has the only archived copy of the page) when there is no benefit to Misplaced Pages from doing so. --Joshua Issac (talk) 15:19, 17 July 2014 (UTC)
Support Links are rather optional: We can cite a book with no links at all, and while alternative archives are desirable, having none should not be a reason for opposing the removal. Though someone claimed to have such cases, the editor fails to give numbers to show its significance that "can break Misplaced Pages". Forbidden User (talk) 17:45, 17 July 2014 (UTC)
Oppose Per WP:VERIFIABILITY. We need news sites and archives. Blacklisting is damaging the integrity of the Misplaced Pages. It is up to the censors to justify themselves, not the contributors to the project. Hawkeye7 (talk) 12:21, 21 July 2014 (UTC)
Oppose - This degrades the actual content and our verifiability. It is damaging the integrity of Misplaced Pages and makes it more difficult to fact-check basic claims that are otherwise easily fact-checked. When you already have a source, you shouldn't remove that source just because you don't like its provider. I use Google Books to cite a text for verifiability because I can - especially when I have the real book on hand, because it makes it much easier to confirm the assertion. This is more important because it may serve as the only existing backup for 404ed content, meaning that it could otherwise be an eternal deadlink. ChrisGualtieri (talk) 04:43, 25 July 2014 (UTC)
You have ruled out the fact that alternative sources are solutions. Misplaced Pages is biased to change — we don't have "unnecessary" as a reason of revert. What you throw out is a per-link issue, after my filtering stated above.

We have no problem with book cites that do not provide online viewing. For Google Books, so far I've found no need to archive that site. May serve as the only existing backup - the "what if" speculation you've accused me of having. Show the numbers repeatedly requested. Forbidden User (talk) 16:55, 25 July 2014 (UTC)
Support removing the links. Commerce is not evil, but the behaviour employed here for commercial advantage is something the community cannot tolerate. I think we will, at some point, need a proper archiving solution. A commercial entity of unknown provenance that could disappear tomorrow is not a proper archiving solution.—S Marshall T/C 21:57, 25 July 2014 (UTC)
What's with the crystal ball reasoning? If a commercial site "ends", then we deal with that then. It hasn't happened yet, so using that as an argument is weak and irrelevant. The US Federal Reserve might disappear overnight, does that mean that everyone should ditch the US dollar and buy gold bullion?
"I think we will, at some point, need a proper archiving solution." - When that happens, then by all means provide full support for it, I personally would welcome such a thing. But at the moment, we don't have anything like that. Why not make do with what we have in front of us? There's the saying that "beggars can't be choosers", and we're all beggars here; if you're not one, then I invite you to drop a few thousand dollars on a budget-performance RAID array server to host your own archival service, and provide the non-commercial "perfect" service that you've so keenly described for all of us. If no one's going to do it, why not take the initiative yourself? Don't want to? Then making do with archive.is is a plausible beggar's solution until someone comes along and wants to. --benlisquare_T•C•E 02:09, 26 July 2014 (UTC)
It's hardly "crystal ball reasoning" to contend that an outfit that behaves in this way might disappear tomorrow, whether because of commercial failure or because of law enforcement action. In spite of what you say, it's foolish and irresponsible to assume that a criminal website will continue forever.—S Marshall T/C 09:32, 26 July 2014 (UTC)
"criminal website" - and here we're back to the unproven rhetoric. Someone used an unapproved WP:BOT to make edits to Misplaced Pages - prove to me that this person used a botnet, and didn't just use closed VPNs or proxies. Prove to me that this person is related to the management of the website. If this was a court case instead of Misplaced Pages, you'd probably be running the risk of giving false testimony. --benlisquare_T•C•E 10:40, 26 July 2014 (UTC)
Soon as you prove to me that the website will still be up this time next year.—S Marshall T/C 16:21, 26 July 2014 (UTC)
"If this was a court case instead of Misplaced Pages, you'd probably be running the risk of giving false testimony." Wiki is not a court, so mentioning legal consequences on a user in any way is not needed, and subject to WP:No legal threats. I think the top comments have the proofs you want. They may not satisfy you, but the botnet and RotLink being the owner of archive.is is agreed by editors at that time.Forbidden User (talk) 17:33, 26 July 2014 (UTC)
He did not make a legal threat; he's pointing out that the accusation is unsupported. The statement is repeatedly challenged, and it doesn't become accurate just because it is repeated. There is no proof, just you repeating the unproven to demonstrably false rhetoric. WP:WIAPA states that it is wrong to make "Accusations about personal behavior that lack evidence. Serious accusations require serious evidence. Evidence often takes the form of diffs and links presented on wiki." Also, it would be a gross violation to WP:OUT the identity behind the Rotlink account, but the e-mails from Archive.is show that they are not the operator of the account. Falsely accusing is bad, outing is instant indefinite block and denying the evidence from Archive.is's operator is calling them a liar - which I guess is allowable... No one has proved Rotlink is the operator of Archive.is and the operator of Archive.is has denied being Rotlink. Sorry buddy, but just because Archive.is could benefit from the links on Misplaced Pages doesn't mean it was them. And the actual page views show that Rotlink's activity on Misplaced Pages did not and has not been a big thing. In the e-mails with Archive.is, Misplaced Pages is a small part of the views. ChrisGualtieri (talk) 17:58, 26 July 2014 (UTC)
RotLink outed himself, not anyone else. I think it is mentioned in WP:OUT that you can out yourself, but that is not beneficial. Many are calling them liars for the ultra-gross promotion and stuff about those IPs, which is a justified call. The statement is proven (though you don't accept it) by the first few editors and agreed by others at the time of the saga (and there is an ArbCom thing (?)). It sounds like RotLink quacks like a person in archive.is and the edits quack like a botnet and then the editors decided that they are true. You should consult those experts, not me, who trust them. I'll let Marshall speak his part for himself - I don't want to do a voiceover.Forbidden User (talk) 18:22, 26 July 2014 (UTC)

(outdent)

That's desperation in its naked form. I suppose by the same logic that person who is blatantly socking to mass-remove Archive.is links must be the same certain editor who has been repeatedly trying to do so from the beginning! Please, we have an abuse filter, specifically 620, to counter this problem that is mysteriously avoided during this RFC! It's another botnet user who is violating consensus, our policies and committing criminal acts! /sarcasm off. That kind of alarmist tone and over-reaction to the most mundane of issues doesn't register even a peep when it's on the other foot! I also think this side discussion is long overdue for being moved to its proper location. Sorry to the closers that need to wade through this all, but you just honestly had someone make blind accusations and parrot it as the basis for truth. A great person noted how the pitchfork mob-mentality thrives on ignorance, but this mirrors the Salem witch trials in a scary kind of way. The last one was bad enough and shows that someone's mighty keen on the flagrant and willful disregard of our policies to mass remove Archive.is links during this RFC. And they are committed to doing a lot more damage and block evasion than Rotlink ever did! Oh and it involved "hacking" to do it by violating AWB and Misplaced Pages policies and make many additional usernames in succession after being repeatedly blocked. I'm still waiting for someone to denounce this "illicit botnet" on the other side of the fence. Anyone? ChrisGualtieri (talk) 05:05, 27 July 2014 (UTC)
"Soon as you prove to me that the website will still be up this time next year" - Your demand makes no sense, if you don't have a crystal ball, neither do I. You can't just demand a crystal ball answer out of me like that. If you would like to ask me a question or have me elaborate my points if they are unclear, then by all means do so, however make sure that it is a question that I can actually answer. --benlisquare_T•C•E 18:30, 26 July 2014 (UTC)
So you concede that you can't be certain that archive.is will still be up next year?.—S Marshall T/C 23:55, 26 July 2014 (UTC)
That's a non-argument. I can't be sure that Iraq will exist next month either. What's your point? If you're going to continue to drive down this train of logic, then there is no further point in me continuing this discussion with you.
Winston Churchill: "I think you really should reconsider your position, Mr. Chamberlain."
Neville Chamberlain: "Do you have any proof that Hitler will invade Poland next year? I won't be convinced until you prove to me that this will happen in the future. What's the matter, you can't do that? I guess that settles then. Peace in our time!"
Do you realise how silly you sound? --benlisquare_T•C•E 05:01, 27 July 2014 (UTC)
I'll take that as a yes. Will you concede that this uncertainty means archive.is is unreliable?—S Marshall T/C 10:12, 27 July 2014 (UTC)
You're not a person worth talking to. You form conclusions only to suit yourself, and don't bother listening to the points of anyone else. I'm not going to waste my time, you've proven that you're hardly cooperative, and have no intention of having a meaningful discussion. --benlisquare_T•C•E 10:53, 27 July 2014 (UTC)
If this means you'll stop badgering and insulting me now, then I'm rather relieved.—S Marshall T/C 12:55, 27 July 2014 (UTC)

(outdent)

Never saw Chris in such an irrational tone. Anyway, he is talking about another botnet (whose act cannot discount anything or support RotLink's behaviour or support the use of archive.is) and uses the kind of "proof" he is against. I assume it's an emotional outburst, so I'm not firing at it. Not that I cannot.Forbidden User (talk) 18:28, 27 July 2014 (UTC)

S Marshall, are you suggesting that only sites that will exist a year from now should be linked from Misplaced Pages? And that links to sites for which there is no such guarantee should be removed? Because that is what the logical extension to your argument would suggest. The exact reasoning can be used for removing any and all external links. --Joshua Issac (talk) 20:14, 29 July 2014 (UTC)
No, of course not. My position is that with links that aren't to archive sites, reliability is the only important concern. However, specifically with archive sites, permanence is important, because the purpose of using archive sites is to prevent linkrot, and the whole effort is wasted if the archive site goes down. So we need archive sites that we can be confident will be there tomorrow. If we have no such option then the safest way would be a mix of archive sites, so that the failure of one such site would be less catastrophic, but I would not want to use a site that's so associated with Rotlink's behaviour. I suspect that Rotlink's claim that he's the owner of archive.is may be true, or may have been true at the time he made it.—S Marshall T/C 23:13, 29 July 2014 (UTC)
Just wanted to say that while I have a few different opinions, I completely agree with "However, specifically with archive sites, permanence is important ... If we have no such option then the safest way would be a mix of archive sites, so that the failure of one such site would be less catastrophic". Surprised you haven't commented at the "Look at referencing templates that support links to multiple archiving sites" question. To me that is the much more important question than if we allow some use of archive.is or not. PaleAqua (talk) 23:39, 29 July 2014 (UTC)
It's certainly an important question. I agree with ForbiddenUser that it's a separate question that belongs in a separate RFC, and I think I'd prefer to wait for that separate RFC to discuss it. I agree with the principle of using multiple archiving sites but who wouldn't? It's like asking people if they want world peace and free ice cream, of course they'll say yes. I think there needs to be a detailed discussion about which archiving sites we should use before we'll produce something useful. And before we can do that I think we need to bottom out the question of whether archive.is/archive.today are acceptable to the community.—S Marshall T/C 23:52, 29 July 2014 (UTC)
True enough. Amended my vote below. PaleAqua (talk) 00:19, 30 July 2014 (UTC)
Support; as S Marshall notes, the behaviour here is the problem, not the commercial nature of the site. Ironholds (talk) 19:45, 27 July 2014 (UTC)
Oppose as disruptive. Let's get off our high horses and examine the evidence closely. -- Ohc 08:46, 30 July 2014 (UTC)
Oppose – Indiscriminately removing links to an archive website which is yet to have been proven to actually have problems is damaging to Misplaced Pages. Links archived with this service don't have the finite life span like with other archiving services, and in some ways, it is even superior from what I can tell. I don't get the usage of "commercial"; I haven't seen any advertisements on the website, so I don't see how it could be making money off of this. Perhaps "private" would be a better way to put it? Dustin (talk) 22:22, 30 July 2014 (UTC)
Prohibit and Remove all - reading about the botnet was pretty much game over for me. There's other issues, but that alone is enough for me to oppose this website from being anywhere on Misplaced Pages. Treat just like a banned user. No edit (even if seemingly of value) is to be kept. This isn't disruptive, it's how we've done things for quite some time. WP:DENY also comes to mind. - jc37 07:30, 9 August 2014 (UTC)
Prohibit new additions. For existing links, replace with other sources if possible, otherwise removal of it and corresponding information in article that sources to it is warranted. - Penwhale | 16:31, 12 August 2014 (UTC)
Prohibit and Remove. I really do not think this is a reliable archive and we should not use it. See comment below: arcive.is_and_archive.today_are_not_reliable_archives.--ツDyveldi _{✉ post} 19:58, 29 August 2014 (UTC)
Support It is ridiculous to even consider this. The site is a total unknown, openly overrides instructions left on sites, and was added either in error or maliciously. It reminds me of discussions on non-free content. Everyone's instinct is to lap up as much as they can... but that is not what the site is about... is it? ~ R.T.G 00:51, 21 September 2014 (UTC)
STRONG OPPOSE. A bad decision by wikipedia caused this mess because the bot should have been approved in the first place, and things snowballed from there. We ought to fix the damage caused, by reversing that bad decision. Fact is, when writing a bot, users have been encouraged to try it out before getting formal approval. That is normal. Instead the bot was blocked because it was "not approved", and as noted, this block was procedural, and made based on the lack of approval, not the quality of the RotlinkBot's edits. --Elvey 17:52, 9 September 2014 (UTC)
Opposed to blanket ban/removal. Replacing links is good, and maybe this can be discussed on an individual basis, but no reason to keep on removing. See also my comment in above section. —innotata 06:58, 15 September 2014 (UTC)
Oppose again the offending bot is well in the past and is not the issue here. Some sites are only available at archive.is. See also my comments on the first topic. --Bejnar (talk) 17:02, 23 September 2014 (UTC)
Oppose, as per my earlier comments. Knee-jerk reactions were bad enough, but continuing to perpetuate them is not helpful. Lukeno94 (tell Luke off here) 18:03, 12 October 2014 (UTC)

Require that another archive alternative exists before removing link (oppose/support)

Oppose The benefits of archiving are small compared to the legal questions involved.—Kww(talk) 20:08, 27 June 2014 (UTC)
Support, it would be nice if we lived in a world powered by candy floss and dreams but we don't, the near death of webcite is evidence of that, an archive site using adverts is not a bad thing, most of the sites we source will have adverts. If the site is being spammed that is a different story but this RFC is not about allowing RotlinkBot to spam links, and no one has to use Archive.is links if they are morally opposed any more than anyone else has to use webcite because they don't have faith in its longevity. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:20, 27 June 2014 (UTC)
Oppose Per Kww's points Werieth (talk) 20:44, 27 June 2014 (UTC)
Strong support The original intent of using archive.is was to help with WP:LINKROT and provide an alternative to WebCite or Wayback Machine. I have personally seen User:Werieth working their way through countless articles removing all of these links over the past few days, and not replacing any of them with a valid alternative to the archive.is link. Myself and others have not taken fondly to this editing, which many of these opposers (see an example on Werieth's talk page here) are considering disruptive. Removing all these links without a replacement is doing more harm than leaving them on the site until replacements can be made, in my opinion. If archive.is is considered illegal in actual law, then fine, let's get the links off the site and we won't continue adding new ones. But in some cases, these archive links are all that is remaining of the sourced content. So there has to be a solution to properly remove the links with correct replacements to fulfill the original editor's intent of having the link archived, and if the original url does not exist anymore, a secondary alternative needs to be figured out to preserve that content, without just blatantly removing the one link to the source because an RfC told us so. - Favre1fan93 (talk) 22:09, 27 June 2014 (UTC)
Support Per Favre1fan93 points. I've suggested multiple times, but the editor leading the charge in scorched-earthing any Archive.is link refuses to come to a compromise. Hasteur (talk) 22:33, 27 June 2014 (UTC)
Strong support per Favre1fan93 as well. Corvoe (speak to me) 20:04, 28 June 2014 (UTC)
Support. See my reasoning above. Also per Favre1fan93 above - I too am seriously concerned about Werieth's continued behaviour trying to create facts by removing archive.is links all over the place in the face of ongoing discussions and no established consensus. --Matthiaspaul (talk) 23:12, 28 June 2014 (UTC)
~~There was an RfC already that established that these links are unacceptable. So please do not mis-represent the issue. Werieth (talk) 23:58, 28 June 2014 (UTC)~~
As I understand it, the general consensus was to remove links added by bots. -- Kheider (talk) 16:46, 1 July 2014 (UTC)
Strong support per Favre1fan93 also. Spc 21 (talk) 00:28, 29 June 2014 (UTC)
Strong support When we deprecated Template:Wikify we replaced it with other helpful templates before removing it. When we deprecated the ratings parameter on the Album Infoboxes we moved the ratings to the article body text while removing it from the infobox. This way no information was lost. These archive.is links should be removed, but why the rush? They should be replaced with alternatives before being removed, otherwise valid sources are being removed from Misplaced Pages for no good reason. The RfC decision stands, but the aftermath could be handled a lot better. Del_♉sion23 (talk) 00:29, 29 June 2014 (UTC)
Support. As per Favre1fan93's argument above. Bertaut (talk) 02:37, 29 June 2014 (UTC)
Strong support per Favre1fan93, and my comments at Misplaced Pages talk:Archive.is RFC#Any alternatives to Archive.is? from February.— Mayast (talk) 16:11, 29 June 2014 (UTC)
Oppose per Kww. Chris Troutman (talk) 04:12, 30 June 2014 (UTC)
Oppose per Kww. --Stefan2 (talk) 14:34, 30 June 2014 (UTC)
Oppose - Why do we need an alternate archive? Misplaced Pages maintains its own history and good backup. This option isn't persuasive. Robert McClenon (talk) 22:53, 30 June 2014 (UTC)
Misplaced Pages does not maintain the history of or back-up websites linked from it, which is what archiving services like the Wayback Machine and archive.is are for. --Joshua Issac (talk) 01:57, 18 July 2014 (UTC)
Oppose - Per Kww. I don't find the argument that "The site is bad but the world isn't perfect" to be particularly compelling. - Aoidh (talk) 00:25, 1 July 2014 (UTC)
Oppose We don't require this for any other spam link we remove. Jackmcbarn (talk) 00:59, 1 July 2014 (UTC)
Support per Favre1fan93, but only for links used as sources. Neutral if used in external links sections. Just because the site might be bad, doesn't mean that the information it archives is bad. I see this as very similar to what happens when a link becomes dead. We don't just remove a reference because a link becomes dead, we try to find a good link first or a another source. PaleAqua (talk) 01:52, 1 July 2014 (UTC)
Clarification: I am also fine with a solution that de-activates the link while leaving a source reference for example by hiding the link, or just having the address in a comment along a request for editors that see it to replace with a better link. PaleAqua (talk) 09:03, 1 July 2014 (UTC)

Update: As noted above I am no longer against editors adding links to archive.is; as such I have struck my suggestion to make the links non-live. I still support removal of the links added in bulk by bots. PaleAqua (talk) 03:27, 22 July 2014 (UTC)
Oppose Previous discussions have established that the links need to go—that is standard procedure for spam. Energy might better be directed at persuading the WMF to help a reputable archive. Johnuniq (talk) 02:10, 1 July 2014 (UTC)
Mostly neutral I don't see the need for archive.is when the original URL is working, if anything, it may discourage people unaware of the controversy from finding an archive link. As I said above, archive.is is not a long term solution and so for some dead links, it's possible we'll never have a suitable archive. For dead links where an alternative archive does exist, I'm sorry to the reader for the temporary loss of the source, but we also have to weigh that up against the issues from linking to a site run by someone with such dodgy practices even if there's no evidence of current harm and I think people have had plenty of time to replace archive.is links. Still I wouldn't completely oppose people being given more time, but they have to accept in cases where the original URL is present (whether working or not), they're not going to get unlimited time to replace the archive.is link. Nil Einne (talk) 07:19, 1 July 2014 (UTC)
Comment: When a webpage is still online or an alternative capture exists, I have no problem with archive.is link being removed or replaced with another archive, and I think some of the users supporting this option would agree with me. But in multiple articles archive.is links have been removed while there are no alternative captures – and those reference sources have now become dead links, probably forever. — Mayast (talk) 19:16, 1 July 2014 (UTC)
Note as I said, IMO we cannot rely on archive.is long term. Therefore, if nothing can be found after extensive search and it is a web only citation, it may be necessary to replace the citation with something else (just as if there was no archive at all). I'm not particularly fixed on when this should happen, but I do hope supporters of this proposal are planning for it. Note also as I expressed earlier, I'm also not opposed to the idea of just removing most archive.is links at the moment (provided the original URL is present, or isn't needed) regardless of whether the URL is working. Nil Einne (talk) 15:43, 5 July 2014 (UTC)
Oppose The danger of the links is more a concern than having an archiveurl. We are not wiping the history of these articles, so as long as the diff where the link was removed exists, editors can go back to figure out what the archive.is link was and capture the details in a safer manner. --MASEM (t) 15:32, 1 July 2014 (UTC)
Support, especially for links used as sources. -- Ssilvers (talk) 15:50, 1 July 2014 (UTC)
Support as alternative archives are not always available and removing a verifiable source harms Misplaced Pages. -- Kheider (talk) 17:08, 1 July 2014 (UTC)
Support There are many situations where they are being removed, resulting in the archived content being lost forever and I cannot ever support the quality of articles being brought down. STATic message me! 18:30, 1 July 2014 (UTC)
Support - danger of links not serious enough to justify removal without replacement.--Staberinde (talk) 13:56, 2 July 2014 (UTC)
Oppose - again, archive links are not necessary. --Dirk Beetstra 03:20, 3 July 2014 (UTC)
Oppose — if we want to move away from archive.is, let's do it and not just half-way. So, I'm coming down on the side of full implementation of the decision and deprioritising Misplaced Pages content support. --User:Ceyockey (talk to me) 14:20, 3 July 2014 (UTC)
Support. per Favre1fan93 who makes a great argument. -- œ 18:54, 5 July 2014 (UTC)
Support. Good grief yes; this is a no-brainer, especially since WP:VERIFIABILITY is a cornerstone of Misplaced Pages. Softlavender (talk) 23:25, 5 July 2014 (UTC)
Verifiability, not verify. Having to find the archive by hand as opposed to having it readily available does not make the ability to verify impossible. --Dirk Beetstra 03:34, 6 July 2014 (UTC)
Weak support as a final alternative: I'm opposed to the removal of archive.is links because it makes it difficult to use URL citations when they've already died. However, if Wayback Machine or Webcitation archives are already available, then that already allows a dead citation to be still useful as a reference. If other archives are not available for a URL, then archive.is is the only choice remaining, short of deleting the citation with the dead URL, which really seems counterproductive to me if we have a useable archive right under our noses. To me, maintaining verifiability for old citations is the first priority, and chasing after alleged botnets comes much later. --benlisquare_T•C•E 11:21, 7 July 2014 (UTC)
Oppose – As Dirk Beetstra said, there's no need for a link at all. Although I do think the URL should be retained as a comment in the Wikicode (with some indication to look there), or maybe in a template similar to {{Wayback}}, but without the hyperlinks, as long as that doesn't help their Google rankings, but so someone could go there manually by cutting and pasting the URL. As long as archive.is isn't profiting with better Google ranks at the expense of our reputation. Mojoworker (talk) 16:33, 7 July 2014 (UTC)
Being linked from Misplaced Pages does not improve a website's Google pagerank. See meta:nofollow. --Joshua Issac (talk) 15:25, 17 July 2014 (UTC)
As discussed above, that's not necessarily the case. Mojoworker (talk) 16:22, 21 July 2014 (UTC)

He thinks that references are necessary because he knows nothing about content creation. Hawkeye7 (talk) 12:36, 21 July 2014 (UTC)
Oppose, either these links are harmless and we should keep them or they're not and thus they should be removed. Having an alternative is good, of course. Max Semenik (talk) 20:26, 7 July 2014 (UTC)
Strong support as per Favre1fan93 above. -- Ham105 (talk) 09:39, 8 July 2014 (UTC)
Comment - what legal questions? All the best: Rich Farmbrough, 14:36, 8 July 2014 (UTC).

You don't think compromising people's computers constitutes a legal question? Or you don't think linking to a site operated by someone that compromises other people's computers without warning users that clicking the link may be dangerous constitutes a legal question? Or do you think legal botnets include residential IPs scattered across multiple continents?—Kww(talk) 03:03, 9 July 2014 (UTC)
Is there any evidence to say that Rotlink compromised anyone's computer, or operated a botnet? Or is it just conjecture? --Joshua Issac (talk) 01:29, 18 July 2014 (UTC)
The constant repetition and reminder of the botnet allegations seem pretty close to fear, uncertainty and doubt; in the end, whether the allegations are true or not wouldn't even matter, having enough fear instilled in people would allow the RfC to be closed in favour of banning the use of archive.is. That's what it feels like. --benlisquare_T•C•E 12:24, 18 July 2014 (UTC)

This is all wild conjecture. All the best: Rich Farmbrough, 04:19, 14 August 2014 (UTC).
Weak support I do not support removing the links at all, but if they are removed, then there should be an alternative link to another archive to prevent cited, verifiable information from being removed from Misplaced Pages just because archive.is is needed to verify it. --Joshua Issac (talk) 15:25, 17 July 2014 (UTC)
Oppose There are a lot of archive.is links, requiring an alternative in advance means that we need a ton of time, resources and effort to remove all the links. Sometimes finding another source or leaving no links altogether would be more efficient, and so whether to put in another archive link should be left to editors' discretion.Forbidden User (talk) 18:20, 17 July 2014 (UTC)
Support, if archive.is/today provides the only online backup of a citation, it is providing a useful service. But it should not be our preferred archive site. John Vandenberg 09:31, 20 July 2014 (UTC)
Too few benefits in comparison to its harm. Also, the case you mentioned is too scarce (prove it if I'm wrong). Anyway, archives are not prerequisite. Again, there is no excuse to use such an unreputable website, be it useful or not.Forbidden User (talk) 09:41, 20 July 2014 (UTC)
Sorry bucko, you need to edit more content if you want to be taken seriously. I am here at this RFC because one of my edits was prevented by the edit filter. I wouldnt know about this RFC if it wasnt for the fact I tried to use this archive here, which is the proof you ask for. What was especially annoying is that this edit filter hit me when user:hawkeye7 and I were conducting a Misplaced Pages editing workshop with newbies, explaining how we are careful to include references at the end of every sentence so our readers can see where we obtained the information from, *and* easily verify it for themselves. I had to quickly react to the edit filter and divert everyones attention to something else so they didnt see the very confusing edit filter message. Archives are not a pre-req, but they sure are helpful when trying to verify and carefully edit a BLP. John Vandenberg 06:48, 21 July 2014 (UTC)
You are the first one to slam a "newbie" tag on me. I suppose it is out of honesty, but also off-topic. 1+1=2, not the proof I want. As Wiki has more than 4.5 million articles, I'd suppose there are at least 50 said cases (so example competition is meaningless), and the number is scarce. I need clear data demonstrating the significance of the said case. There are lots of alternatives, therefore stating the benefits of archives sites in general does not support anything here.Forbidden User (talk) 14:44, 21 July 2014 (UTC)
Support Does no harm. Provides a useful service. Supports WP:VERIFIABILITY. Oppose deletion of links per WP:NOTCENSORED. If we want to prevent commercial use of the pages, we should mark them as CC-NC. Hawkeye7 (talk) 12:12, 21 July 2014 (UTC)
Wiki censors copyright-violating materials, does that count? Wiki censors unsourced materials, does that count? What you have stated cannot outweigh its harm, and per my comment above.Forbidden User (talk) 14:44, 21 July 2014 (UTC)
Forbidden User, your account is less than 3 months old. You have failed to demonstrate any harm caused by established editors using links to archive.is to meet Misplaced Pages:Verifiability. I do not understand why a fairly new user like yourself is so against the use of archive.is. -- Kheider (talk) 18:11, 21 July 2014 (UTC)
You are not reading my reminder, I suppose. I strongly oppose that site for its mess in copyright policies, its owner's illegitimate act, the unapproved botnet, Wiki's Terms of Use, and my sense as a human being. The reason why I have never said it harms a particular user is that proof is rather lacking; what I have been saying is that it is not the right thing to do, partially based on moral grounds (that it does not respect some sites' disallowance of archiving), partially legal (the site is violating Wiki Terms of Use by replacing ads (according to another editor), etc, and it can obviously get sued in Israeli/US courts based on illegitimate use of computers (the owner alone can get into trouble), copyright violations - not us, per WP:No legal threats, but it's still inappropriate, I'm not Wikilawyering here), partially on the fact that its owner owns zero credibility, and partially Wiki policies (WP:NOTPROMOTIONAL, WP:COPYVIO). It is not that we can use anything that apparently causes no harm. Anyway, many editors supporting the first proposal have stated that archive.is is not preferable, so you need to give the numbers to prove that there will be a significant problem if we don't use archive.is at all (that is, the significance of cases in which 1) archive.is is the only archive available; 2) the original source organisation does allow archiving; and 3) no alternative sources are available).

For this proposal, I believe that there are a lot of solutions to link rot problems other than an alternate archive, as stated in my !vote.Forbidden User (talk) 15:44, 22 July 2014 (UTC)

For its harm to Wiki (I guess you are on it), its proof is in deductive approach, so you can disagree with my logic (like inadvertent association brought by use of service). Here I should summarise myself:

First of all, the major reason of editors supporting use of archive.is lies in its usefulness to them. This opinion, when having weight, is a weak rationale in supporting its usage, as there are a number of archive sites that provide the same service while having clearer copyright policies, archival regulations (obedience to robots.txt), and demonstration of willingness to comply with our policies, guidelines and most importantly, our Terms of Use. For this, there is a considerable opinion within the supporting side that archive.is is not preferable, and here I add my cent that we should use the other sites whenever possible. Therefore it being useful does not necessarily mean that we should use it.

There is another reason, which is it apparently doing no harm, and that there is no scientific or explicit proof against it. The requirement of such proofs lies in the harm's nature. The harm I proposed can be summarised as damage in reputation, credibility and legitimacy (highly elaborated already), which is undermining and mental. The issue about irreplaceable archives and such is practical difficulties, and that can be objectively measured, hence requiring a proof to its significance, not that it exists.

A relatively stronger reason is that there are cases in which archive.is is the only available archive. There are such cases, however, such cases are the real per link issues, as no one has given the solid proof of its significance. Some editors have explicitly stated that they vote because of one particular link they are dealing with. That is no reason to support a general allowance to archive.is. Have you considered whether the sites allow outside archiving? If it is disallowed, using such illegitimate archives is not a secure fair use. Have you considered substituting with other sources? After these considerations, such per link issues will be trimmed down even more.

I wish to add some food for thought here: As archive.is and its owner are provably walking on thin ice, the chance of it being legally brought down is much higher than other sites. So, if that happens, what would be the damage to Wiki if 1) we use it as we want or 2) we have removed all/most archive.is links?

I hope that editors supporting the use of archive.is can reconsider their stance.Forbidden User (talk) 17:59, 23 July 2014 (UTC)
Of course it is not preferable, but what if it is the only option? Usefulness is not a weak rationale when the alternative is to have no link at all. Yes, it's true that we cite books, paywalled journals, and other things that cannot be easily accessed, but that does not undermine the fact that having a link is much better than not having a link. If archive.is is brought down, we wouldn't have had those links in the first place (since archive.is was the only available copy), so no harm there either. You seem to be supporting a false dichotomy: Since archive.is is not preferable in general (a viewpoint I support), it shouldn't be used at all. All I want to argue for is per link inclusion in the cases I mentioned. But you turned it around and said that since per link issues are a small case, we should not allow archive.is at all. Huh? -- King of ♥ ♦ ♣ ♠ 16:01, 27 July 2014 (UTC)
Sir, your rationale should be "there are cases that archive.is is the only available backup, while no substitution sources are available, while the information is not trivial (I remember seeing some quotes from Jimbo saying that an important information should have many RS coverage), while not violating the source sites' policies, while a link is absolutely needed". I've said it's relatively strong, though you'll need objective figures to justify its significance. That is, insignificant cases cannot be the reason for supporting something like archive.is that affects the whole Misplaced Pages. By your mature logic you should know this. "If archive.is is brought down, we wouldn't have had those links in the first place (since archive.is was the only available copy), so no harm there either." - for this I guess you mean if archive.is is down then we would not have had links, which is not true, as it is not down yet, but with its awful policies and opaque funding mechanism, it is not that reliable. When it goes down, we will have an awful amount of link rot, because you think that it is superior in quality, while ignoring that it is not preferred by many, and you yourself support using it as last resort. This, by itself, breaks the "useful" glory shining archive.is by the fact that there are concerns above usefulness that make editors put it to last resort. For "having a link is much better than not having a link", true and false. It is more convenient to have a link if it is from web.archive, etc. However, I should introduce you to a balancing game - does this little benefit offset the mass use (yes, a few editors have said they plan to add a lot of archive.is links for its "usefulness") of such an illegitimate service?

Oh, someone told me "what if" speculation is bad. You have cases, but perhaps a few...?Forbidden User (talk) 17:29, 27 July 2014 (UTC)
Don't try to associate my argument with those who want to justify all use of archive.is. Meanwhile, I do not need to show how prevalent the issue of archive.is links being the only one available is since this is the only use which I support. If it's 1,000 links, then let's use it for those 1,000 links. If it's 100,000 links, let's use it for those 100,000 links. "When it goes down, we will have an awful amount of link rot" - true, but will we be any better off by not using archive.is and not having working links in the first place? "does this little benefit offset the mass use" - you're treating this as a slippery slope. In my opinion, yes, this little benefit does justify this little use (in the case of a few links), and this mass benefit justifies its mass use (in the case of many links). The amount of intended use is irrelevant as I believe that for each additional link, the amount of marginal benefit outweighs the marginal harm. -- King of ♥ ♦ ♣ ♠ 19:25, 27 July 2014 (UTC)
I said that for those who want universal usage of archive.is. Well, how about 50 links? The harm is to the whole Wiki, while the benefit is to a very small proportion of articles (consider that we have >4,500,000 articles on en.wiki). Last note: How would it be possible to effectively assess every link with the criteria you mention below? I don't think we have the mechanism to realise your ideal.Forbidden User (talk) 12:03, 28 July 2014 (UTC)

I'm putting some additional info here: Advertisements have copyright, therefore displaying the whole of it (without good reason) is a possible copyright infringement. Meanwhile, archiving and displaying whatever users put in (without copyright policies) may not be a fair use, so is usage of the archives. I'm not a lawyer, so I'm not going to say it as something solid hard.Forbidden User (talk) 14:27, 28 July 2014 (UTC)
Your comment about the copyright of advertisements is irrelevant; under your interpretation, linking to a news website would be a possible copyright infringement. When a website hosts ads, presumably the advertiser wanted it there; very few people would voluntarily host ads without the ad owner's authorization. We can't host copyrighted content without good reason, but we certainly can link to it. As for the archiving, the copyright foundation is definitely shaky, but if it's regarding archiving as a whole, then it applies to archive.org as well (which general consensus approves of). -- King of ♥ ♦ ♣ ♠ 06:55, 30 July 2014 (UTC)

(outdent)

Hmmm, you can take your opinion on it. However, I'd say a desk for assessing archive.is links would get a backlog 100 times more severe than GAN. I really wish to know more about your idea on how to make it practical.Forbidden User (talk) 16:29, 30 July 2014 (UTC)

So, will you concede that your idea is impractical, and that any consensus on supporting this option would inevitably hinder the enforcement of consensus in RFC#1 and option 1 and 2?Forbidden User (talk) 16:51, 3 August 2014 (UTC)
Sorry, away for a while... Why would we need a desk? Just let archive.is links remain in their articles for now, and people should go through all of them to see if they can be replaced. If not, then the archive.is links remain. -- King of ♥ ♦ ♣ ♠ 03:02, 5 August 2014 (UTC)
The problem is we cannot force them to do so. Perhaps we can force reviews on the links before any edits can be done, but how can we ensure that there are really no alternatives besides leaving the ref with the archive link? P.S. I think it's better for us to move this to my talk page, as this RfC is going to be closed.Forbidden User (talk) 08:35, 5 August 2014 (UTC)
Support, usefulness to our readers is key. In my opinion, a link to archive.is is permitted if 1) no other copy exists; and 2) that specific page does not violate the site's robots.txt (or at least there is no evidence of such). -- King of ♥ ♦ ♣ ♠ 02:29, 22 July 2014 (UTC)
Your cases should be treated as a truly per link issue. Its disrespect to robots.txt (that is, completely disobeying) and lack of copyright safeguard is not a per link issue. I guess you know the fact that with or without archive.is, link rot is commonplace even in FAs. I have suggested remedies to the removal in the third section as well as the bot request by Kww, such as limiting removal per day so that editors can fix the references more timely.Forbidden User (talk) 17:59, 23 July 2014 (UTC)
See my reply above. -- King of ♥ ♦ ♣ ♠ 16:01, 27 July 2014 (UTC)
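The two conditions King of Hearts gives in this thread (no other copy exists, and the page is not excluded by the site's robots.txt) could be checked mechanically along the following lines. This is only a rough sketch under stated assumptions: the function name `archive_is_link_permitted` and its parameters are hypothetical, the list of other archive copies is assumed to have been gathered some other way, and the robots.txt text is passed in rather than fetched. It uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser


def archive_is_link_permitted(other_archive_urls, robots_txt, page_url, user_agent="*"):
    """Hypothetical check of the two criteria proposed in the thread.

    other_archive_urls: known copies of the page at other archives (assumed input).
    robots_txt: the text of the source site's robots.txt (assumed already fetched).
    page_url: the original page that the archive.is link would stand in for.
    """
    # Criterion 1: allowed only when no other archived copy exists.
    if other_archive_urls:
        return False

    # Criterion 2: the page must not be excluded by the site's robots.txt.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)
```

A check like this only covers what can be tested automatically; whether "no other copy exists" is really true still depends on how thoroughly the other archives were searched.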
Oppose any delay in removing the links to archive.is and its successor sites.—S Marshall T/C 21:54, 25 July 2014 (UTC)
Support. Yes, usefulness to our readers is key. Any removal without replacement with corresponding archives would be reckless and irresponsible, and contrary to qualitative goals and our mission of supplying verifiable information to our readership. -- Ohc 08:50, 30 July 2014 (UTC)
NO! - this is a wiki, there is no deadline. So if you want to go look for a replacement, feel free, but this needs to be removed in the meantime. - jc37 07:30, 9 August 2014 (UTC)
Support With automated help most existing alternative links could be quickly found. The archive.is remains in history and can be recovered if the alternative link dies. All the best: Rich Farmbrough, 12:44, 12 August 2014 (UTC).
Support: if the site is considered to be worse than other archives, then presumably it's best to use those archives in the first instance, and then this site if those aren't available? It Is Me Here 10:44, 13 August 2014 (UTC)
If you could prove the significance of those cases, you are welcome to show me. Otherwise, the benefit would be outweighed by the harm.Forbidden User (talk) 16:53, 13 August 2014 (UTC)
Support linkrot is a bigger problem than Kww et al.'s paper ghosts --Guerillero | My Talk 20:54, 15 August 2014 (UTC)
Prohibit and Remove. I really do not think this is a reliable archive and we should not use it. See comment below: arcive.is_and_archive.today_are_not_reliable_archives.--ツDyveldi _{✉ post} 19:59, 29 August 2014 (UTC)
STRONG SUPPORT - that is, STRONG OPPOSITION to removing archive. links, PERIOD. A bad decision by wikipedia caused this mess because the bot should have been approved in the first place, and things snowballed from there. --{{U|Elvey}} 17:49, 9 September 2014 (UTC)
support as better than nothing, but I would rather not remove any archive.is links. So I would oppose this if the choice were phrased differently. As a compromise this is not very good given the weaknesses of the Wayback Machine, specifically retroactive denial of access, and the potential demise of WebCite. --Bejnar (talk) 17:08, 23 September 2014 (UTC)
Weak oppose - Linkrot needs combating, and I don't see the point in favouring any other archive site over this one. Lukeno94 (tell Luke off here) 18:03, 12 October 2014 (UTC)

Look at referencing templates that support links to multiple archiving sites

Support So long as we only support a single outgoing URL to the content, we're going to be squabbling over which is the preferred archiver. There are no substantive technical barriers to citation templates that support many outgoing URLs, even to systems that are in effective competition with each other. See for example {{Authority control}}, which links to multiple authority control systems. Stuartyeates (talk) 22:42, 30 June 2014 (UTC)
Neutral - Why do we need a non-Misplaced Pages archive? — Preceding unsigned comment added by Robert McClenon (talk • contribs) 10:53 am
Because websites customised for specific content are always going to do a better job of that context than generic ones. For example: deep-linking to a WP:MEDRS in a MEDLINE-type archive is always going to provide better context and more tools than we can do in Misplaced Pages. This kind of context is basically what archival studies is all about. Stuartyeates (talk) 00:29, 1 July 2014 (UTC)
Robert's question is specifically about a Misplaced Pages archive, and indeed we should probably have one. All the best: Rich Farmbrough, 04:23, 14 August 2014 (UTC).
Comment I'm not really sure how this is relevant to the rest of this RfC. I'd support this as long as it didn't include archive.is (by any of its many names). Jackmcbarn (talk) 00:59, 1 July 2014 (UTC)
Strong support( but details need to be worked out ) - We don't link ISBN directly to Amazon, Barnes and Noble etc, but to a system that allows access to many different catalogs. The same should be true for archives. I'd almost rather see some sort of wikidata for link archival and allow various archive links to be associated with them, similar to how links work to articles in other languages. PaleAqua (talk) 01:56, 1 July 2014 (UTC)
BTW There are several design decisions etc. that might be required with something like this. Do we just allow multiple links? Do we make citations themselves into external data, etc. So I can definitely see the point of making this into its own independent RfC. PaleAqua (talk) 00:13, 30 July 2014 (UTC)
Archive.is does use advertising -- Fatal - The fact that the facility utilizes advertising and thus presumably makes a profit is a fatal issue, Archive.is should be blocked, banned, excluded from anything to do with Misplaced Pages. Us volunteers don't get paid for the work we are doing and yet our work has value, we do not want parasites making money off of our work. Damotclese (talk) 17:25, 2 July 2014 (UTC)
Is this comment in the right section? What does that have to do with if we should provide options for multiple archive sites? This question is not about archive.is, but on allowing support for multiple archives for links that might go dead. advertisements are actually a good reason not to favor a single external resource and was one of the reasons why we already don't just link all books to Amazon etc.PaleAqua (talk) 17:54, 2 July 2014 (UTC)
Besides which, so far no one that I've seen has provided evidence that archive.is is currently using advertising. The fact that it's claimed in the RFC intro is irrelevant if no one is able to provide evidence. As it's clear from my responses, I'm no fan of archive.is, but let's concentrate on the bad they are associated with, not that which they aren't currently associated with. (Yes they haven't ruled out advertising in the future, which may be a concern for various reasons, but that's a distinct point.) Nil Einne (talk) 15:51, 5 July 2014 (UTC)

Except archive.is does not use advertising. And it gives back to Misplaced Pages by providing an archiving service for references. But we link to several websites that do make money from advertising without giving anything back to Misplaced Pages, such as Yahoo! News and The Telegraph. See also Misplaced Pages:Copyrights, which explicitly allows both parasites and non-parasites to make money off our work. --Joshua Issac (talk) 14:33, 18 July 2014 (UTC)
Don't see the need - again, there is no necessity to having archive links in the first place. References without external links are completely, utterly acceptable. --Dirk Beetstra 03:23, 3 July 2014 (UTC)
Umm not all references have links. Consider ones to books etc. What about references to news sites that have folded ( like a lot of newspapers have in recent years. ) Just because the information is no longer visible at the original published source, doesn't mean that linking to archive at the way back machine, or other legitimate archiving service is bad. PaleAqua (talk) 03:49, 3 July 2014 (UTC)
PaleAqua, thanks for explaining my point: there are a plethora of very, very legitimate sources which indeed are not available online, not on a 'regular' server, nor at an archiving service (though places like the Project Gutenberg are trying to change part of this), there are also many sources which are available online on regular sources. Still, they do not need a link to the source (Philip Woodland et al. Nature Reviews Gastroenterology and Hepatology 11, pages 397-398 is perfectly available online, it does not need a live link to the content, it is perfectly valid - YOU can find that the article exists and where it is, and anyone with access can do the same, ánd read the article). The source is also perfectly valid if that source is behind a paywall where only a few people in the whole world have access (most people can't access http://dx.doi.org/10.1021/ja5030657 - still that information alone could be a perfectly valid reference for information). Then, as you say, newspapers that do not exist anymore and have no 'online presence' of their information anymore. However, those references exist, and can be verified (whether or not there is a live link to the 'regular', original information is, actually, utterly irrelevant). Now for archive links .. do you think that I will argue that links to archives of regular websites are necessary, where I just showed that even link-less references to regular websites are in themselves irrelevant? It helps, it is a good practice to have them, it makes information more accessible for the editors who want to add more, it has a whole set of advantages that I will not and can not ignore (and I actually encourage people to add them), but being necessary is not one of them. Misplaced Pages can become a reliable source without having live links to references (even without ANY live links), and archives are even less needed. 
If then an editor (or editors) feel the urge to spam, push and act against community standards including these links, where there is, in my opinion, absolutely NO necessity for these archive links. For me, the archive links should be deprecated unless the original information is 'gone', or could go into 'hidden' fields in the references (no need to show - they are only of interest to editors needing the information, as long as it is accessible). --Dirk Beetstra 05:02, 3 July 2014 (UTC)
Sorry misread your comment. So used to seeing the phrase "completely unacceptable" that I must have subconsciously inserted an "un-" when reading it sorry. I still believe having legitimate archives can be useful. For example consider the archival search recently on the origins of the X11 color lists, while the original sources repository is no longer available, one of the xorg developers placed an archival/mirror copy of the code online with commit comments. Being able to see a copy of the original allows for easier verification of the references; needed no, but useful. The real question to me is does the archive have the rights to display the content, if not then we shouldn't use it, if yes and the original is gone then we can use it. I don't think archival links are useful in most cases, but they are useful in some. I'm also against preemptively archiving except in special circumstances ( for example source sites that request links to mirrors to be used etc. ) PaleAqua (talk) 05:41, 3 July 2014 (UTC)

I do see the use, as I say - but I think it is massive overkill even to have a bot run around adding links to a legitimate archive, let alone a bot ignoring community practice, standards, and consensus pushing an archive where concerns about the reasons for linking, as well as copyright and further arise. The 'Oooooh, but this source may sometime in the future go offline and therefore we need an archive NOW'-argument is a total waste of resources - even a disappeared online source is not a real reason to add an archive link - the source is still valid and that is what counts. --Dirk Beetstra 07:12, 3 July 2014 (UTC)
Completely agree there. As I understand it this subsection of the RfC question is merely about updating references templates or the like to allow support for linking to alternate archives. It really has nothing to do with archive.is or any bots collecting information. I'm hoping that if such a change is made it is designed to give less prominence to archives ( or even hide by default ) if a live link is present and only offer one or more archive links if the original link was marked as dead. I'd prefer if {{cite|…}} and friends changed how deadurls are handled. Currently |deadurl=no is required to have the live link first if an archive link is present. I kinda wish the option had been set up differently. Say have a cite error if url and archive url are provided and deadurl is not set to "yes" or "no". If "deadurl" is set to no, I'd love for any archive link to be displayed with a CSS style that is hidden by default which would allow for custom style sheets to be able to show them for users that are interested in them. PaleAqua (talk) 07:53, 3 July 2014 (UTC)
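The deadurl handling PaleAqua wishes for can be sketched in Python roughly as follows. This is an illustration of the rules described in the comment above, not how {{cite}} templates are actually implemented: the function name `render_citation_links`, the CSS class `archive-link-hidden`, the link labels, and the output shape for the dead-link case are all assumptions made for the example.

```python
from html import escape


def render_citation_links(url, archive_url=None, deadurl=None):
    """Hypothetical rendering of a citation's links under the proposed rules."""
    live = f'<a href="{escape(url)}">Original</a>'

    # No archive copy supplied: nothing to decide, show the live link only.
    if archive_url is None:
        return live

    # Proposed rule: with both URLs present, deadurl must be explicit.
    if deadurl not in ("yes", "no"):
        return '<span class="error">cite error: set deadurl to "yes" or "no"</span>'

    if deadurl == "yes":
        # Original is dead: offer the archive link first (assumed layout).
        archive = f'<a href="{escape(archive_url)}">Archived</a>'
        return f"{archive} (original: {live})"

    # deadurl == "no": live link first, archive link hidden by default so that
    # a custom style sheet can reveal it for interested users.
    hidden = f'<a class="archive-link-hidden" href="{escape(archive_url)}">Archived</a>'
    return f"{live} {hidden}"
```

The class name would only have an effect if the site style sheet hid it by default (for instance with `display: none`), which is likewise an assumption here.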

This view is incompatible with reliable but online only sites and WP:V. If there is valid information on an online-only source (which, as the world becomes more digital, will happen) and that source disappears (site closure, site redesign with content loss, etc.) then that citation and the facts tied to it fail WP:V - no one can access the source to verify. The archiveurl is critical in these cases. This is different from the case of a site that mirrors its local print content online, and then later removes/paywalls/hides the online content, as the print version should still be available to someone, and there, the archiveurl isn't a problem though helpful. (There is the begged question of whether a truly reliable online only source will ever go away without leaving an archive, but it has happened to generally reliable ones too). --MASEM (t) 11:46, 3 July 2014 (UTC)
You mean that the view is incompatible with some cases of 'online only' sources - not with what, by far, will be the cases for references. And most of these volatile references still need the original reference, even if it does not work anymore. The word is verifiable, not verify. --Dirk Beetstra 03:31, 6 July 2014 (UTC)
Note that I do mostly agree with your points, just that as more and more material is published in an online-only version, we have to be aware of the volatility of the site and the critical-ness of the information and whether we should be preemptively archiveurl'ing those details. I agree this is not required when the online version is duplicating some print version, as the print version is (theoretically) always verifiable. --MASEM (t) 06:02, 6 July 2014 (UTC)
I agree with that - though I wonder how that would really invalidate a source. Is that situation not somewhat akin to using a book as a reference for which only 1 copy is available, hidden in an obscure library in an outskirt of Tibet without any online copies available. It is near impossible to verify the data, but .. it is verifiable. --Dirk Beetstra 06:58, 6 July 2014 (UTC)
In considering PAYWALL, we do expect that for a source to meet WP:V that a member of the public should have some reasonable (if not monetized) access to the source that otherwise doesn't violate any trespassing or property laws. If anyone can go to that monastery and request a look at that book, even if it costs something reasonable, it's still an RS (mind you, if we are citing material from it, how did that editor get it? there are questions that do get begged, in addition to whether this is a "published" source to start with (eg made available to the public) which is also an RS requirement). "Actor John Q Smith's little black book claims he has over 40 mistresses" may be true but we're not going to be getting our hands on that book, and we'd never include that fact (ignoring the glaring BLP issue too). Same would be with online-only sources. If a site has gone dark, sure the material on it might be stored somewhere, but to access it would require hacking into someone's computer, heck no. But again, to iterate in agreement with your main point, archiveurl is not required in any case but strongly recommended when the material published is online only. --MASEM (t) 13:29, 7 July 2014 (UTC)
Sports articles face a particularly bad problem. Pages often have a half life of six months or less. I've had half the links on an article decay while it awaited a GA review. The London Olympics web pages were taken down last year, and while the Wayback Machine did a pretty good job it did not capture it all. I'm going to be requesting a special archive of the Rio games. In the meantime, sports officials contact me in the belief that Misplaced Pages archives everything. Hawkeye7 (talk) 12:49, 27 July 2014 (UTC)
Support — I agree with PaleAqua that having a link-to-archives implementation like that for ISBN, for instance, would be quite useful. Barring that, I would support multiple archive links per use. In regard to Dirk Beetstra's comment about no need for links: quite true, but there are many of us who do wish to maintain accessibility where possible, thus making verifiability easier. A newspaper you might find on microfiche somewhere, a website you won't find offline in most cases; thus, in the absence of an accessible physical archive, we need an accessible electronic archive. I don't think we want to encourage the recording of the web on microfiche so it is as verifiable as newspapers. --User:Ceyockey (talk to me) 14:15, 3 July 2014 (UTC)
You may have noticed that I do encourage it as well, but Misplaced Pages does not get worse if we wouldn't do it. --Dirk Beetstra 03:31, 6 July 2014 (UTC)
Support: If someone is willing to put effort into it, then why not? Seems constructive to me. If we're providing the reader with choice of variety and not forcing them to use one particular archival site out of many, that's power to them. No harm is done with implementation, and nobody loses anything. --benlisquare_T•C•E 11:16, 7 July 2014 (UTC)
Oppose – Although, as I said above, I do think the URL should be retained as a comment in the Wikicode (with some indication to look there), or maybe in a template similar to {{Wayback}}, but without the hyperlinks, as long as that doesn't help their Google rankings, but enough info so someone could go there manually by cutting and pasting the URL. As long as archive.is isn't profiting with better Google ranks at the expense of our reputation. Mojoworker (talk) 16:40, 7 July 2014 (UTC)
Support - Multiple alternatives increase the ready verifiability and decrease the reliance on single service providers. -- Ham105 (talk) 09:39, 8 July 2014 (UTC)
Support - Redundancy is good. All the best: Rich Farmbrough, 14:37, 8 July 2014 (UTC).

Sounds like some I-am-here-to-vote-against-rules !vote.Forbidden User (talk) 15:47, 18 July 2014 (UTC)
Support - Would aid verifiability. --Joshua Issac (talk) 15:32, 17 July 2014 (UTC)
And I think that PaleAqua's suggestion to handle it like we do ISBNs is a great idea. --Joshua Issac (talk) 15:52, 18 July 2014 (UTC)
Oppose for now, open another RfC It is too redundant; we don't amend a problem before it exists. Some (including the RfC creator) have raised concern about the stability of donation-reliant archive sites, forgetting that Misplaced Pages itself is one of them. So, it isn't that needed for now. Forbidden User (talk) 15:47, 18 July 2014 (UTC)
This. It needs to be done step by step because this RFC already is a complete and utter mess to close. The whole situation is not something that can be decided wholesale and it requires likely another RFC or some derived agreement based on the knowns and unknowns. ChrisGualtieri (talk) 04:30, 28 July 2014 (UTC)
Support I suggest yinz write some content with websites as a source, have them disappear, and then come back to this issue. Linkrot is a real problem for content creators --Guerillero | My Talk 21:02, 15 August 2014 (UTC)
Support. Being able to link to two (or more) sites would be an improvement and a safety measure for important references. Two inline citations would achieve the same, but that is not reader friendly. Misplaced Pages is for the future and the future is often many years ahead. Even well renowned and good archives can fail in some distant future. Misplaced Pages needs long term accountability and reliability. At present the referencing templates do not say explicitly that they link to an archived page and if they did so this would be an improvement. --ツDyveldi _{✉ post} 20:08, 29 August 2014 (UTC)
STRONG SUPPORT. A bad decision by wikipedia cause this mess because the bot should have been approved in the first place, and things snowballed from there. ...--{{U|Elvey}} 17:52, 9 September 2014 (UTC)
Support I agree that options for accessing multiple archive sites should be definitely looked into. I agree that the ISBN model may be useful, but unlike the ISBN model if only one archive exists, it should default there. --Bejnar (talk) 17:14, 23 September 2014 (UTC)
Weak Support - No brainer to me. Not convinced this RfC is the right place for the discussion though. Lukeno94 (tell Luke off here) 18:03, 12 October 2014 (UTC)

Discussion

Darkwarriorblake your opening statement: "Archive.is has never been added to the spam blacklist because the use of the blacklist would require the links to be removed before unrelated edits could be made to the article. Instead, an edit filter has been applied which prevents additions of the link, but does not prevent editing articles which simply contain the link." is a misinterpretation of my remarks on the request for blacklisting and how blacklisting/edit-filtering works. The way the edit-filter is currently set-up is exactly the same functionality that the blacklist would give (when the blacklisting was put on hold, the filter was more restrictive than the blacklisting would be). My reason not to blacklist at that time was that there were so many links that accidental removals could hurt editing experiences - unfortunately, the edit-filter is now having the same effect (though edit-filter-managers are trying to mitigate that). IMHO, blacklisting can (should?) now be implemented and the edit-filter disabled (the latter is a bigger strain on the server) per the standing consensus of the previous RfC. If this RfC then overturns that decision then blacklisting could be removed. --Dirk Beetstra 05:02, 3 July 2014 (UTC)

That's my misunderstanding, not Darkwarriorblake's, Dirk Beetstra. We wrote the opening of this as a joint effort. Certainly my reason for writing the filter was that I believed the blacklist would prevent people from editing the articles in question unless they removed the blacklisted links before saving any section they were editing. If I'm wrong about that, fine: I learn things every day.—Kww(talk) 05:11, 3 July 2014 (UTC)

OK, sorry, that part was confusing then. The edit filter that I saw around the time of the request for blacklisting was something like this version. That also blocks re-additions, reverts of vandalism which ('accidentally') removed the link, re-additions after genuine mass-removal efforts, etc. That is way more restrictive than blacklisting. If your goal is to get the links removed, then, by all means, do not allow reverts of removal attempts, and when trying to revert a vandal, take the extra step to remove the links first. Get rid of them, as I argue above, they should go and are not necessary in any form. The way the filter is now, it is pretty much the same as the blacklist: plain editing of the page where the link is on is allowed, undoing/reverting edits where the links get removed is allowed as well (likely because removal took so long after the consensus of the previous RfC, now the edit-filter started to interfere with regular editing behaviour). Also mass rollbacks of bots or editors doing a massive cleanup of these links (as I requested to be done a long time ago) would be allowed (we see that sometimes - a spammer spamming 50 pages, someone cleaning up the 50 pages, and another editor who does not understand what is going on is rolling back all 50 removals ..). --Dirk Beetstra 05:25, 3 July 2014 (UTC)

Correlation of archive.is alexa graph and Rotlink activity

People, what are you talking about? Have you ever looked at the Alexa graphs during your investigation? Was Rotlink's job really useful to archive.is, as you assume unconditionally? Just check the graph and then tell me who is the enemy of archive.is and who are the paid editors.— Preceding unsigned comment added by 83.245.226.111 (talk • contribs)

Whether it was useful to Rotlink or not is utterly irrelevant - what matters is that Rotlink and the IPs were editing in total violation of our core policies and guidelines, pushing links without or even against community consensus. That is abuse of editing privileges, and since a) blocks of the accounts do not work given the many IPs, b) protection of pages does not work because this would cover all of Misplaced Pages's content pages (and possibly even beyond) - prohibiting use of the link is the only way forward. They are not necessary anyway. --Dirk Beetstra 07:08, 6 July 2014 (UTC)

And the series of random new accounts that are probably Werieth that are mass removing those links in contravention of this ongoing RFC? Does that mean we should keep all the links to spite them? Darkwarriorblake / SEXY ACTION TALK PAGE! 09:54, 6 July 2014 (UTC)

'that are probably Werieth' .. any proof for that? --Dirk Beetstra 10:11, 6 July 2014 (UTC)

One way or the other, I've installed filters that block indiscriminate removal of archive.is links to stop these accounts from doing so.—Kww(talk) 19:14, 9 July 2014 (UTC)

In any case, as someone who has heavily criticised Werieth's known actions even when it was unclear they were a sock of an older banned editor, there's an obvious big difference between the cases. In one example, we have someone who appears to be intimately associated with the site doing very dodgy stuff. (As I've said elsewhere, if they really aren't that's unfortunate. But AGF only goes so far and considering the evidence and the lack of any contact we can no longer AGF.) Werieth's history suggests we shouldn't consider their opinion to count for much of anything on wikipedia. But they clearly aren't part of the "official" voice of those who consider there are problems with archive.today (like whoever behind Rotbot appears to be) since no such thing exists, and the opinion of the plenty of other independent editors still counts just as much as it did. Nil Einne (talk) 22:47, 17 July 2014 (UTC)

Use File: namespace for archiving links?

I continue to feel frustrated by this situation, and the apparent failure of the WMF to be more proactive about web archives. Clearly a significant portion of the community feels a need for proactively archiving linked references in advance, in order to deal with the potential for link rot. But there seem to be no widely acceptable solutions. Whether any third-party site can be relied on never to spam or place advertisements is questionable. So why not this solution: Use the file namespace for link archives. Why can't an editor simply upload an image of the page that they want to archive, and simply store it—either locally or on Commons? Wbm1058 (talk) 16:52, 6 July 2014 (UTC)

Because the images would be non-free content (copyright) so Commons is out. While we can argue that each usage is an acceptable fair use, we'd need a whole new series of policies in place to make sure that the file usage is appropriate. -- Ricky81682 (talk) 20:18, 6 July 2014 (UTC)

OK, I was anticipating that I might get an answer like that, so I have a followup question ready. If Misplaced Pages can't archive these links because they violate copyright laws, then how can archive sites like archive.is and archive.org do that? Are these archive sites nothing but "pirate bays"? Or have they figured out and put a series of policies in place arguing that each usage is an acceptable fair use? Is this really hard to do? If these sites can figure out how to legally do archives, why is it so hard for the Wikimedia Foundation to figure out how to do it in a similar manner to the way these sites do it? And if they are "pirate bays", should we really be accessories to crime by linking to their sites? If that's the case, then maybe we should remove all links to archive sites. – Wbm1058 (talk) 12:51, 7 July 2014 (UTC)

Speaking for US laws, we have what is called a "fair use defense" that allows people to use copyright material without seeking copyright holder permission, as long as the use meets a number of factors, generally broken out as: respect for commercial opportunity, portion of the work used, the use of the material (eg, educational/transformative), and the nature of the work taken (published vs unpublished). Our non-free content guidelines are set to start at these points and go tighter, because our goal is not to be just legal but to drive people towards using free content and minimizing non-free. So non-free images or in this case archive images would not be so much illegal but against the Foundation's purpose (and probably why they are very hesitant in running such a service themselves as long as free content is their mission).

Now to turn to other sites, which don't have the free content mission but do want to keep legal, we have a key US case decision, Field v. Google, that says that wholesale archival of website content is a fair use of copyrighted materials as well as falling within the allowances set forth by the DMCA (*note - not a SCOTUS case, therefore not fully tested but there have been no major challenges to this). This is likely the same logic that archive.org (note: in US) operates under and legally. They do also make it clear they will respect things like robots.txt that will prevent archival, helping to further their case. I would suspect webcite.org is also using the same logic (this due to being selective backup of pages submitted by users). One can spend a bit of time reviewing these types of cases, the EFF likely has many of them documented.

Now, that begs the problem of archive.is. I doubt their servers are in the US, so the issue of archive and copyright law is unknown, which is another point in favor of moving away from it. It could be operating completely legally with whatever country's fair use provisions, but we can't say for sure. --MASEM (t) 13:13, 7 July 2014 (UTC)

Not to mention that archive.is explicitly asserts that they will not honor robots.txt, making them the tallest weed in the garden in terms of copyright scofflaw. It's easy to pick on the most visible/vocal offender when there's little/no redeeming virtue to potentially mitigate the offense they're causing Hasteur (talk) 13:19, 7 July 2014 (UTC)

Honoring or not robots.txt instructions has no impact on the copyright question. Robots.txt is not put in place to enforce copyright, nor is its absence a notice saying "no copyright here". --User:Ceyockey (talk to me) 18:35, 7 July 2014 (UTC)

But it is related to the safe harbor provisions of the DMCA; a site that honors robots.txt is going to be seen in a more favorable light (eg: they give copyright owners the ability to remove material that users or others may add to it). You're right it's not immediately connected, but it is something to consider on the ultimate motives of these various sites. --MASEM (t) 18:43, 7 July 2014 (UTC)

You might be thinking of something like this? → http://spiresecurity.com/?p=293 . -User:Ceyockey (talk to me) 00:00, 8 July 2014 (UTC)

To some extent yes, but more thinking on the "safe harbor" provisions of it. (as all noted, this is not yet tested by case law to any real degree). --MASEM (t) 00:06, 8 July 2014 (UTC)

Often times robots.txt is set-up to prevent CGI scripts from creating a significant load on the server due to users setting up scripts to mine a database using an interface that was never designed for mining the entire database. The JPL Small-Body Database can not be accessed by archive.org because of robots.txt. -- Kheider (talk) 18:55, 7 July 2014 (UTC)

There is also the matter of preserving linkage between content and advertising supporting content. If content is archived, it becomes disconnected from the real-time advertising linkage which might have supported either its creation or dissemination. Thus, there is a financial incentive to invoking robots.txt blockage. --User:Ceyockey (talk to me) 02:11, 9 July 2014 (UTC)
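To make the robots.txt mechanics discussed above concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example.org` URLs and the crawler names (`ExampleArchiveBot`, `SomeOtherBot`) are all made up for illustration; none of them is taken from any site mentioned in this discussion, and nothing here says how any particular archive actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site (an assumption for
# illustration only). It keeps every crawler out of the CGI scripts and
# shuts out one named archiving crawler entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: ExampleArchiveBot
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleArchiveBot", "https://example.org/page.html"),
        ("SomeOtherBot", "https://example.org/page.html"),
        ("SomeOtherBot", "https://example.org/cgi-bin/query"),
    ]:
        print(agent, url, allowed(agent, url))
```

Note that the check is purely advisory: the file only expresses the site operator's wishes, and a crawler that never performs this check is not technically prevented from fetching anything.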

From When will the Reflinks tool be moved to the stable server?: I want to cache all 20 million external links so references can be done in a post processing stage where details (title, author, date) can be cross verified and lots of other goodies. The foundation employees have called 24 TB excessive (it's not) and are not working with me. — Dispenser 20:40, 17 February 2014 (UTC)

24 Terabytes. Cache all 20 million external links. Is that all it would take to do our own reflink archive, under the same US-based legal theory (Field v. Google) that justifies archive.org? Wbm1058 (talk) 00:37, 9 July 2014 (UTC)
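As a rough sanity check on the figures quoted above, the per-link storage budget implied by 24 TB for 20 million links can be worked out directly. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (10^12 bytes) and a single stored snapshot per link, neither of which is stated in the discussion, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# Assumptions (not stated in the discussion): 1 TB = 10**12 bytes,
# and exactly one stored snapshot per external link.

TOTAL_BYTES = 24 * 10**12   # "24 TB"
LINK_COUNT = 20 * 10**6     # "20 million external links"


def average_bytes_per_link(total_bytes: int, link_count: int) -> int:
    """Return the average storage budget per link, in whole bytes."""
    return total_bytes // link_count


if __name__ == "__main__":
    per_link = average_bytes_per_link(TOTAL_BYTES, LINK_COUNT)
    print(f"{per_link / 10**6:.1f} MB per link")
```

Under those assumptions the budget comes to about 1.2 MB per archived link; keeping several snapshots per link, or storing images and other embedded media, would change the estimate accordingly.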

Legally we could probably invoke the same defense if we were challenged (IANAL however!), but I would think that the Foundation would be more worried about how that would interact with the free content mission. It would definitely be their call, and I know the idea of a webcite-like service run by the Foundation has been brought up before. --MASEM (t) 01:41, 9 July 2014 (UTC)

There is an interesting conversation on this topic at Misplaced Pages:Misplaced Pages Signpost/Newsroom/Suggestions#Reflinks is dead. Wbm1058 (talk) 02:38, 9 July 2014 (UTC)

Change of domain name

I'm not sure if the Administrators already know that, or if it's even the right place to let them know, but I've just noticed that archive.is changed its domain name to archive.today. — Mayast (talk) 20:09, 13 July 2014 (UTC)

At least Kww knows. This is rather technical, and mentioning it a bit is enough. Forbidden User (talk) 18:25, 17 July 2014 (UTC)

By the way, changing domain name to this is a little bit suspicious to me... Forbidden User (talk) 16:04, 20 July 2014 (UTC)

Nothing untoward. WMF wants to do the same thing. Hawkeye7 (talk) 02:43, 22 July 2014 (UTC)

A reminder

As the discussion gets heated, I'd recommend everyone to comment on content and remain civil. Some comments, like "he knows nothing about content creation", "absurd, hysterical and paranoid", or "paranoid hysteria" are not good arguments, and not helpful to the discussion. Forbidden User (talk) 16:45, 21 July 2014 (UTC)

Allegations

nothing more to see here, just some incivility. --Mdann52talk to me! 17:17, 12 August 2014 (UTC)
The following discussion has been closed. Please do not modify it.
This is an attempt to review the allegations and analyze them with Archive.is's own words and the evidence already gathered. The most common complaint that Rotlink must be Archive.is is that "other editors said it so it must be true". The claim has repeatedly been rejected for lack of evidence, but also by Archive.is's operator Denis. The following comes from the e-mails from Misplaced Pages:Archive.is RFC discussion between Lexein and Denis of Archive.is. Of particular concern are:

Archive.is using malware or injecting ads.
This was rejected by Denis of Archive.is, who wrote in an e-mail "Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Archive.is is profiting off Misplaced Pages
This is simply false by hit records, lack of ads and the fact that, according to Denis from Archive.is, Celeste (pornographic actress) resulted in the most traffic hits of any article from Misplaced Pages, and it was viewed 6000 times in October 2013. Also, it is hard to profit with no source of income from ads and such. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink was not operated by Archive.is
Denis noted in the email that Rotlink was doing it for SEO link building and had some information that should not be publicly disclosed given the nature of said information. Also, Denis from Archive.is noted that "because it can affect other people and because the RFC participants are too free in rearranging the words of others" thought it would just provoke it further. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink was not using a botnet
Both the Momento script and Archive.is's public script serve as a way to one-click archive a site. Misplaced Pages has its own systems for mass-updating and replacing links, but not even Autowikibrowser was used.
The process was comparatively slow and the key issue arose out of the stalled process from the original WP:BAG request. It is clear that the person behind Rotlink was already familiar with Misplaced Pages's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Archive.is's operator actually provided a solution to the "trusting Archive.is" problem
Denis said that arguing for only persistence of links (as per SEO link building) which can be used to artificially raise the page's search engine rank based on algorithms. This probably isn't entirely accurate given Google's algorithms. Though, why was Archive.is chosen? Archive.is was the only one to provide a nice "package" to effectively automate this task en masse for a single person to quickly perform. Archive.is was just a tool to improve the Misplaced Pages page's own rankings - not result in hits for Archive.is. As evidenced above. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink's writing patterns compared to Denis from Archive.is do not match.
This is more of a technical issue, but a denial from Archive.is's operators and the completely different writing patterns aren't enough to be sure. However, Denis's response and stance show that Rotlink was a problem for us - but Misplaced Pages is not a source of much concern for the service and Denis wonders why Misplaced Pages doesn't operate its own or use a paid service if archiving sites results in such problems. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink is still Archive.is - they must be denying it.
Everything about Rotlink and Archive.is doesn't match up - Archive.is gains nothing and takes quite the hit from Misplaced Pages. It's not a false flag type of attack, it's just that Archive.is was a convenient tool for the artificial bolstering of Google's page rankings. And all without actually providing any change to the content contained within. For Rotlink, most of those pages were purposeless updates, not even 404ed and unlikely to do so, and the vast majority were worthless ones that had few views (fodder) against the real page-rank bolstering campaign. Rotlink's edits should have been purged, but because they were already made - the whole point of whether they exist or not doesn't actually matter! Rotlink stopped because the job was done or that it became a hassle to get around the edit filter - probably making the job useless. Whoever was behind it knows Misplaced Pages and likely accomplished the mission and moved to a different tactic - typical of how paid editing and search rankings go. When the filter was down and didn't work, it didn't start up again and the discussion renewed over an issue that has been partially resolved from a different manner with Archive.org. At the end of it all, the needed response was swift, a bit brash, but worked - now we are just dealing with collateral damage as a result. All the problems and accusations boil down to one bad editor, Rotlink. ChrisGualtieri (talk) 17:14, 27 July 2014 (UTC)

Thank you for your reasoning here, so that I can understand your point better. So, you think editors are being sheep that follow the one before them. No, they think those "leaders" have sufficient reasons and proofs. There are too many unsupported statements that I don't want to waste time citing. For example, RotLink's act at least raised the notability of archive.is, and no matter how small, it helps the rankings, and that's it, it is gaining off our works - proportion does not matter. "It is clear that the person behind Rotlink was already familiar with Misplaced Pages's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor." I'm not joking, this is a very appealing argument, made out of zero basis. You said argumentum ad populum is not proof, so take your own advice. The same for the writing pattern issue, unless you're an expert on it. "Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." He lies in the first sentence? What a good liar. In their FAQ they don't have the courage to say so, shown by the stammery statement with total uncertainty and such. Anyway, do you think many people around still trust Denis? If he wants trust, obey robots.txt, make the funding system clear, and make clear policies and regulations on content. P.S. The things you raise are not as important as what archive.is is - a page with absolutely no copyright safeguard (unsafe for editors, particularly new ones, to use), no content regulation (so he can prepare for jail, and site down), and a proven (not to you, I know, to me it's proven, the WP:QUACKs are already enough) usage of Wiki for promotion. Forbidden User (talk) 18:17, 27 July 2014 (UTC)

Stop WP:BLUDGEONing the RFC. We understand where you stand, but you are just TLDRing everything with unsupported rhetoric. You say Archive.is not being able to say for certain (guarantee) operation is equated to lying. Cease the polemic rants already! You continue to badger and flout unsupported accusations as truth and hide behind others saying it.
Your entire opinion and stance is illogical and is not even your own, it's an unsupported statement you pound over and over even when evidence proves otherwise. Then you flip and call for impossible crystal ball claims to continue spouting claims like "What a good liar." You don't know enough about computers or what the terminology even means. Lastly, you avoid linked evidence and even entire discussions when it suits you, a classic polemic strategy. I think removing your polemic and unsupported or misguided defense of baseless charges is needed. You do not even understand what Robots.txt is even for - much less how it is used! I'm sorry, but you are emotional, uninformed and highly opinionated. Continuing to repeat highly charged claims over and over and using that unsupported claim as "proof" is meaningless. You don't even know what WP:QUACK is for. Continue this disruption and I'll ask for the removal of your polemic rants. ChrisGualtieri (talk) 04:16, 28 July 2014 (UTC)

Addressing it at my talk page. First of all, this whole thing is unrelated to this RfC. I reckon having commented on your logic a few times, and a few comments on you. However, the only negative one I remember is "Never saw Chris in such an irrational tone before.", posted just before your bombing here. As per WP:NOTBATTLE, I'm not going to violate it with you here, so I will not reply to your editor-directed comments here even if you insist on continuing here. For the few statements relevant, here is my response: I should take some responsibility on your WP:BLUDGEON accusation, as mentioned to PaleAqua. However, leaving an edit summary like "Silence when the shoe is on the other foot! Salem witch trial!" makes it difficult for others, not just me, to reply, and is a provocation itself. You keep saying I'm unsupported (while I'm actually using others' explanation), so I'd like to give you your advice: Repeating doesn't make something true.
For the polemic accusation, I don't understand the point of using a user talk page guideline here. Meanwhile, the RfC is framed in such a way that we are answering a de facto yes/no question, which makes common grounds not as possible as normal discussions. Actually we at least agree that link rot as a whole is a problem, see? I saw the linked evidences, but they are not relevant to my message, so I don't really know why someone puts it on. Forbidden User (talk) 14:13, 28 July 2014 (UTC)

To my reading, if we accepted this, it would overturn the consensus at RFC#1. I'm not ready to take such a step at this stage, but what would help me would be an opinion from a neutral person with technical qualifications in this field (e.g. a WMF employee) who can tell us whether the allegation that illegal botnets were used to introduce these links is probable, plausible, or unlikely.—S Marshall T/C 10:54, 28 July 2014 (UTC)

Agree. We need the same extent of consensus on accepting archive.is as that on rejecting it in RfC#1 to overturn it. WMF should be here early on. Forbidden User (talk) 14:13, 28 July 2014 (UTC)

Well, hang on, that's not quite my position. As I understand what ChrisGualtieri is saying, it's that subsequent events have shown the conclusion of RFC#1 to be wrong. I'm open to the possibility that ChrisGualtieri is right, and I don't think it would take a full RFC consensus to prove it. I'd just like to see some measured advice from the technical illuminati on the subject.—S Marshall T/C 15:49, 28 July 2014 (UTC)

Also, RFC #1 was a non-admin closure and was contentious, but rather than having a pass on something like "purge Rotlink's additions", as was supported, the whole of Archive.is was edit filtered. Though Lexein's e-mail exchange reveals a lot of good information, it's clear Archive.is doesn't care what Misplaced Pages decides, as it costs more to keep Misplaced Pages updated.
I am reminded of the Webcitation and other archiving site issues of the past, and how the grok backup was made as an alternative to Archive.is, but it lacks any formal or unified and WMF backed option. A Misplaced Pages-only or WMF backed option is more of a liability than running Misplaced Pages already is so I doubt meaningful change will occur in the current environment. @S Marshall:, please consider that long term abuse users and Rotlink's acts are simple. The proxy lists and the IPs used were already on blacklists and such, they were open proxies that were likely obtained from a proxy list. No need to compromise computers or anything and run an IRC zombie network, but hey.... I'm confident that someone with the technical background will explain better than I. The RFC can go either way, but a small set of agreements should have been made prior to this "yes/no" type. Oh well, too late and I don't feel like arguing endlessly over basic terminology. ChrisGualtieri (talk) 16:43, 28 July 2014 (UTC)

Re:"they were open proxies that were likely obtained from a proxy list". Certainly. What about consulting a list of computers that are already compromised makes it acceptable to use those computers for your own purposes? If you believe that is how the IPs were obtained, then you are agreeing that the use was illegal.—Kww(talk) 17:14, 28 July 2014 (UTC)

I'm not going to be playing wordplay with you, but you should know the difference between a botnet and a proxy. And proxies are not illegal, but your stance is well-noted. Rotlink is not Archive.is and the WMF or ArbCom is the only way checkusers and other data related to the true problem will be proven. Kww - you should also know Rotlink's identity - but since Archive.is doesn't care enough to get involved in Misplaced Pages's chaos, I think I have no need to continue. The WMF spoke on an unrelated matter and I just hope people with technical knowledge and a sense of what's right make good decisions.
I don't know why Rotlink's additions were not just purged and the SEO activity related to promotion wasn't handled - it still hasn't in 9 months. Though if no one else cares, why should I? Have fun - debate amongst yourselves endlessly for all I care. ChrisGualtieri (talk) 17:56, 28 July 2014 (UTC)

That's a strange claim: you already are playing wordplay. "Botnet" vs. "proxy network containing computers that have been illegally compromised" is a distinction without a difference.—Kww(talk) 19:29, 28 July 2014 (UTC)

There are several types of proxies, not all of them are botnets. Intentionally open SOCKS4/5 proxies used to be more common during the innocent days of the internet for example. TOR is also kinda a proxy. That said, given that those IPs were used to automatically (or even semi-automatically, again the distinction might be real but doesn't matter here) insert links out of process is enough of a concern without needing to worry about if they were compromised computers etc. Socking is socking. That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for. PaleAqua (talk) 20:38, 28 July 2014 (UTC)

Wow. I genuinely admire your ability to assume good faith on that one.—S Marshall T/C 21:24, 28 July 2014 (UTC)

Actually it's more like I started unfortunately with the assumption of bad faith on the part of archive.is taking the logical leap as an implied given. It's only through digging through the history combined with doing a few network traces (as an aside I work in the telecom/networking R&D -- albeit mostly compilers, embedded operating systems and the like) and the like on my own after the Werieth's socking thing came to light that my opinion changed. And even then it took me a bit to really get a feel of the situation and realize the disconnect.
I'm still willing to be convinced that I am wrong, but right now I don't see any evidence or even logic for why archive.is would be behind a botnet or hacked AWB instances or what not. All of the most likely scenarios that I can see have wikipedia and archive.is as pawns in some sort of scheme. Malware doesn't seem like the likely goal to me. If the goal was to get malware onto Misplaced Pages given access to such an array of proxies, they could have taken several other approaches that wouldn't have been noticed as quickly and probably would have hit more targets than such a long term plan that requires people to click on archive links when the original source links are still around. (And to avoid completely assuming bad faith, yes there is the possibility that it was just a misguided effort to help Misplaced Pages and archive.is -- though the use of proxies leaves that very questionable at best.) PaleAqua (talk) 04:53, 29 July 2014 (UTC)

The distinction is valid. I'd be hard put to say that socking corresponds to a risk that the perpetrator will also distribute malware. I don't have any hesitation in saying that the use of compromised computers, whether as proxies or in a formal botnet, is an action that leads me to believe that the perpetrator is also more likely to distribute malware.—Kww(talk) 21:46, 28 July 2014 (UTC)

Paleaqua is correct, "That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for." - Rotlink is a problem and those additions should have been nuked as Archive.is noted or all "suspect" links from Archive.is moved as freely given. The whole Archive.is matter breaks down once you realize that Misplaced Pages is costing Archive.is and they don't care what we do, but are willing to help our decision if need be. As I tried to state before, existing policy and procedure guides us, but I'm not going to condemn Archive.is for something they did not do.
Try this whole thing when you separate Rotlink and Archive.is and you get what should have been done from the beginning - purge the Rotlink additions and CU those unusual pages Rotlink altered with unusual view counts. Rotlink's activity was a clear SEO and page ranking run - and the unusual activity spilled over onto Archive.is's hits. I don't think it was out to crash Archive.is - but it could have if gone unchecked by Kww. ChrisGualtieri (talk) 04:05, 29 July 2014 (UTC)

Also, this is my last post on it unless people really care to continue advancing it, the RFC's outcome really doesn't matter to me in a large scope - but I like accuracy for the historical record here. The closers have a monumental task in sorting out the issues, but I see no reason to really worry. Just message me once the RFC to purge Rotlink's additions if you wish. The community can sort most issues out themselves - and Kww, thanks for your work, you shouldn't be so silent on the good you do even when people (I) disagree about the indefinite length of that response. I'm not a beer drinker, but I'd buy you one. ChrisGualtieri (talk) 04:12, 29 July 2014 (UTC)

Sorry, can I just go back over this because I think I'm missing something huge. As I understand the story, Rotlink went through using techniques that (probably? possibly? what's the right adverb here?) involved illegal botnets to add links to archive.is to Misplaced Pages. The effect of this was to massively boost archive.is' page ranking. This would obviously have had the effect of increasing archive.is' hosting costs as well as their advertising revenue. Since then archive.is' representatives have disclaimed responsibility for Rotlink. To my reading, that claim is plausible if they employed SEO consultants as part of their launch. Given the scale of the operation and the technical expertise employed to achieve this specific result, the idea that someone independent did this as an act of random vandalism is (ludicrous? implausible?
a bit hard to swallow? What's the right adverb here?) I observe that there's broad consensus on this subsection of the page to purge archive.is' links. It's much less clear from the RFC voting that this should happen.—S Marshall T/C 08:59, 29 July 2014 (UTC) The easiest way is to make WMF turn from its Media Viewer arbitration to here, look at the RotLink saga, and give a good answer. The argument on RotLink being related to archive.is or not is neither completely proved nor disproved, so starting to base oneself on the assumption that archive.is is not connected to RotLink is not a good idea. You may choose between whether to keep assuming good faith on a site with no copyright policies — it's up to your discretions and opinion.Forbidden User (talk) 17:58, 29 July 2014 (UTC) "Well, it could be a rabbit in disguise..." (but it isn't) The botnet claim is really far-fetched and fantastic because the only "evidence" for it is that the IP addresses of the unapproved bots came from three or four different countries. The simplest explanation is that Rotlink or his minions fed their bot a list of anonymous open proxies. Such lists are widely available on the web from sites like https://proxy.org/, and are used to circumvent censorship, for example, by schoolkids to get around school Internet filters, and by people behind the Great Firewall to access blocked sites. The technical expertise needed to use these proxy lists is very low (most of the kids at school were using them when I was a schoolkid). Compared to this simple explanation, the suggestion that Rotlink & Co. used a botnet requires a lot more imagination to believe. And building an illegal botnet when there is a very simple copy-and-paste alternative solution using free and legal proxy lists is a bit like building a nuclear weapon to demolish a small house, instead of using a freely-available bulldozer. Forbidden User mentioned the duck test above, and I think that it definitely applies here. 
--Joshua Issac (talk) 20:58, 29 July 2014 (UTC) That would make sense if it weren't for the fact that most open proxies are blocked on Mediawiki projects.—S Marshall T/C 21:38, 29 July 2014 (UTC) That plus the fact that these "open proxy" lists typically contain a large number of compromised machines. Using a free open proxy list isn't ethically different from compromising the machines yourself. "Three or four different countries" is quite a mistatement as well: the opening statement lists eighteen, and that is far from an exhaustive list.—Kww(talk) 21:42, 29 July 2014 (UTC) "That would make sense if it weren't for the fact that most open proxies are blocked on Mediawiki projects" - it costs $5 per month to purchase a private VPN subscription. Per Misplaced Pages policy, only open proxies are blocked on sight, and even then, it's impossible and unrealistic to block every single open proxy in existence. I used to rely on a private VPN many years ago, because Steam games were cheaper in the United States than Australia, and it was cheaper to throw away $5 every month than to be constantly ripped off because I don't live in the country of freedom, liberty, and 11 aircraft carriers. I would fool Steam into thinking I lived in the United States, and could buy cheaper computer games - practically everyone my age back then (14-17 years old) did this, because when you're a kid, the only money you had came from slaving away at a fast food joint. The technical know-how to use these things was very minimal, back in the day any kid at school knew how to use one. The service I used allowed me to pretend to be from a wide selection of countries, from Sweden, Denmark and the US, to even places like Vietnam, Mexico, Brazil, Ukraine, Serbia, Saudi Arabia, and Cambodia. You were seen as someone living in Michigan using AT&T, or living in Ho Chi Minh City on VietTel Corporation Pty Ltd. 
I don't know if the pricing and services are still the same since I haven't used it in a long time, but I don't expect things to have changed that much since then. --benlisquare_T•C•E 04:05, 30 July 2014 (UTC)

Again, even if they were added just through normal socking or even meat-puppetry, or a mechanical turk service, that is no different than if they were added via a botnet or some sort of malware. It doesn't change the fact that they were added in a way outside of process and the ones added thusly should be removed. Though in a way that might be exactly what is desired if Chris's theory is correct. Because of nofollow, the links themselves don't add to the page rank of the articles, but the freshness of articles on wikipedia does make the Misplaced Pages article itself more likely to show up in search results. For example, searching on Bing will often put an excerpt from a relevant Misplaced Pages article at the top of the right column. I would actually argue that the notability of the impacted articles should probably be checked as well. PaleAqua (talk) 22:07, 29 July 2014 (UTC)

Yeah, the technical details (which cause headaches) are not as important as the nature of RotLink's act, and if we are to apply WP:DUCK, then apply it to the nature of RotLink's act. It does not work well on technical details. Perhaps the "what if xxx's theory is correct" can be put aside for the moment. By the way, someone has put up a proof that nofollow doesn't necessarily nullify the effect of links on search engine ranking, if I'm correct in interpreting the content... Anyway, if most articles with the links are not notable, then removing them isn't much of a problem... Forbidden User (talk) 16:25, 30 July 2014 (UTC)

I would say archives of a potential comet impact with Mars are notable ( https://archive.today/duZRu ) and JPL/NASA data is PD. Archive.is is the perfect way to keep track of the expanding observation arc and the refinement in the solutions.
The WayBack Machine is a fail when it comes to archiving public domain (PD) material from the JPL small body database. -- Kheider (talk) 18:17, 30 July 2014 (UTC) I have plenty of articles on notable athletes with links to local newspapers available on the web only (eg. http://archive.today/bqdNY) Hawkeye7 (talk) 22:08, 30 July 2014 (UTC) And so some links will actually be clicked. If the nofollow doesn't necessarily nullify the effect of links of search engine ranking (as proved by someone with linked evidence, correct me if I'm wrong), then archive.is is profitting off our free work (here "profit" is not confined to money).Forbidden User (talk) 13:31, 31 July 2014 (UTC) How is this any different from the New York Times or Engadget "profiting" from having their URLs linked in Misplaced Pages citations, and readers clicking on them? Why is this even an issue? And if "readers clicking on the links and making archive.is profit" is really such a big deal as claimed, then why not introduce a new parameter in {{cite web}} so editors can choose to have the "archiveurl" parameter show up as a tiny and inconspicuous blip? Something like "smallarchive=yes", which does something like this: Before: Vasovic, Aleksandar (14 March 2014). "Serbian paramilitaries join pro-Russian forces in Crimea". Yahoo News. Archived from the original on 15 March 2014. After: Vasovic, Aleksandar (14 March 2014). "Serbian paramilitaries join pro-Russian forces in Crimea". Yahoo News. See archive created on 15 March 2014. In the second case, no one is going to bother clicking the second link since it's so small, unless the original link is already dead in the first place. Seriously, there's hardly any justification left for the paranoia surrounding SEO if a simple tweak is done to the citation templates: Instead of having an archive URL replace the original if the parameter is filled, it should just include a small archive link to the side in case it is ever needed. 
There are all sorts of potential solutions to this issue as long as we bother to come up with constructive ideas, but we're only putting the whole thing into limbo due to the huge focus on disagreements. --benlisquare_T•C•E 13:57, 31 July 2014 (UTC) Is it relevant to talk about the cite web template here? It now works in the way that prompts clicking into the archive URLs and so it is. Linking sources is due to usage of the sources, necessitating attribution to them. For archive.is, links are put on by force (and a porn stub link gets 6,000+ hits per month, look through others' comments above). Yes, it is a big deal.Forbidden User (talk) 14:34, 31 July 2014 (UTC) I have to break my silence to counter this continued and completely wrong statement. Archive.is does not use ads - they do not PROFIT off Misplaced Pages in the least. As for the Celeste page, its an oddity. Even prior to Rotlink, that page consistently got 6000+ views a month, including 8000 in August 2010. Archive.is said it just got the most hits from Misplaced Pages on that page, which given its nature was probably for prurient interests (naked pictures) instead of content verification. Still Archive.is doesn't make a dime off Misplaced Pages clicked links and Google doesn't boost Archive.is for its mere citation in Misplaced Pages - it boosts the actual Misplaced Pages article in its rankings. Also, if you want Archive.is to not run on donations or on its owner's dime - might as well shoot down Archive.org or Webcite and the others as well. Lastly, the "Not promotional" policy you linked is not about links or their hosting in any way. I regularly use a lot of paywalled sources, many cost you $100+ a year to view the content or journals which cost you $35 an article. I hope to have Newspapers access soon and that's I think only $80 a year, but you know I am going to be using that to do research for articles and then add hundreds or thousands of sources that link directly to the database. 
ChrisGualtieri (talk) 15:41, 31 July 2014 (UTC)

No, no, no. I did not talk about money or ads. It's totally fine if they make a transparent donation system, just like Wiki here. I saw discussion which says, from the Alexa graph, that those links being clicked (not the links alone, though someone put a link to something about nofollow in this edit, saying that nofollow does not necessarily work perfectly) do contribute to page traffic. One click on RotLink's spam links is enough to constitute promotion, whether or not he is connected to archive.is or intended to do the promotion (as someone put it in this RfC: ends are more important). Profit is not just money. Traffic is an asset just like money. After all, why did archive.is not give some technical logs and proofs, etc., to defend itself? It does not need much time, given their expertise. I'm not putting some FUD here, but that is indeed one of the factors leading to this endless discussion. Forbidden User (talk) 16:48, 31 July 2014 (UTC)

Oy, vain much - I was addressing more than just your post. The first part was in response to S Marshall, who mentioned, "This would obviously have had the effect of increasing archive.is' hosting costs as well as their advertising revenue." Misplaced Pages is a tiny part of Archive.is's traffic, and looking for a porn actress's pictures was (above all others) the highest "hit" from Misplaced Pages. Archive.is is a service, but WP:NOTPROMO doesn't matter, and when revealing the data constitutes a violation of EU law - you are being unreasonable. You are making demands that no respectable company would agree to - it's internet vigilantism, and Misplaced Pages has a recent case of a spectacularly bad use of even private data. But that's a diversion, because no one will advocate Rotlink's activity as being acceptable. Rotlink should have been blocked for copying the Archive.is blog post and trying to assume the identity of the website - if not for impersonation, then for a WP:COI violation.
And this assumption that if Archive.is even shows something, what on earth are you going to do with the data? What can you compare it to, and wouldn't it still be Archive.is's word against your own beliefs? Rotlink should never have been allowed to do what was done - but that is past. ChrisGualtieri (talk) 17:34, 31 July 2014 (UTC) P.S. Barring a criminal investigation, no "hard evidence" will ever likely be given publicly and Misplaced Pages soured its own chances with Archive.is by demanding they "come clean or else". That's a legal threat and Wikipedians' burned the bridge, but also, actual law prevents disclosure of such information. To take from an ANI thread, continuing to "stir the shit pot" is about all people can do on the alleged Rotlink / Archive.is connection. Also, hi Rotlink. ChrisGualtieri (talk) 17:34, 31 July 2014 (UTC) @S Marshall:, @ChrisGualtieri: A few people have said the archive.is operator has denied being Rotlink. Can someone point out where this is? It may be somewhere here but I can't find it and so far, all I've seen is Misplaced Pages:Archive.is RFC/Rotlink email attempt. Although this was pointed out to me back in October, I didn't look at this very well before, I admit I should have since it has some interesting info. I may change my opinion on how to handle archive.is. But anyway, the email discussion actually makes any denials now from archive.is that they were connected to rotlink very weird because there are strong indications from the emails that rotlink i.e. Denis is the operator of archive.is who evidentally is also Denis. (Remember that the emails were to the rotlink account. So it's very very weird if the person replying wasn't the person behind the account. And there definitely doesn't seem to be anything in the email discussion indicating they are replying for some reason even though they are not rotlink.) 
It's true that Denis/rotlink appears to deny that they were behind the proxy/botnet/whatever (the SEO comment), but this is a distinct point. I hope people aren't confusing the two as this whole discussion is already so damn confusing. Nil Einne (talk) 17:10, 1 August 2014 (UTC)

Thanks for that link btw, it is exactly what I was looking for when I asked for connection information. Going to spend some time reading it and reevaluating my positions. PaleAqua (talk) 17:43, 1 August 2014 (UTC)

The situation is complex - I won't go into too much detail, but please do not spam the webmaster. I've sent a final e-mail to my contact - since cooler heads are now dominating this discussion - Wikipedians seem ready to listen. I've also asked for permission to publish the e-mail and details. I only hope that they are sympathetic and open to discussion. ChrisGualtieri (talk) 18:28, 1 August 2014 (UTC)

Oh, so anyone disagreeing with you is not "cooler heads" and all the people expressing concern on archive.is are not "cooler heads"... Could you please focus on the content instead of categorising editors? Also, I did not request anything from archive.is, as it would be COI evidence anyway, though WMF'd have more to work on. So, take care before accusing others of legal threats. Your saying that EU law forbids the disclosure pursued by some can be taken as a legal threat as well, by your logic. By the way, did you ignore that linked "evidence" (I don't know if it is)? I would also suggest that you check the application area of policies and guidelines before citing them. WP:NPA applies to contributors (i.e. accounts), while WP:No legal threats applies to the said group. You actually made a point of mine: we at least need CheckUsering. WMF intervention is better, and if we need the truth, we will need a criminal investigation. Sorry, but WP:NOTPROMO matters. Even if RotLink is not related to archive.is, he is promoting archive.is as its traffic increases.
(One can help marketing something unrelated, right?) Most small sites face low notability and the subsequent lack of new customers. They need more first-clicks so that they can have more regular customers. For for-profit sites this means more advertising revenue, and for NGOs it means more donations. By the way, @Kheider: By PD you mean PDF or Perfect Dark (P2P) or Pure data?

By PD I meant public domain data. NASA data is public domain. CGI scripts place significantly more load on servers than static HTML or even PHP. So Robots.txt at the JPL small body database keeps crawlers from mining the database through CGI scripts, an interface that was never designed for mining an entire database. So this is a case where Robots.txt has nothing to do with copyrights. -- Kheider (talk) 20:59, 3 August 2014 (UTC)

By looking at the page robots.txt, it is not compulsory and does not always have to do with copyright, though it can often help safeguard copyright. Repeat: Does anyone know why we cannot enter archive.is? Forbidden User (talk) 17:00, 4 August 2014 (UTC)

I have no problem accessing archive.is today (pun). It did appear to be down for a little bit yesterday. Could have just been server maintenance. -- Kheider (talk) 18:27, 4 August 2014 (UTC)

Last note: Failed to enter archive.is/today. Does that happen to you guys? Forbidden User (talk) 16:42, 3 August 2014 (UTC)

Additional comment: Should the truth remain a mystery, we will have to look for the best explanation. Mine is that RotLink is promoting archive.is. Since it sounds quite absurd to put forward a double conspiracy theory that RotLink uses Wiki in order to overload archive.is (bad faith to both organisations), and it does not sound reasonable that someone would promote an unrelated site for no reason, I come to the conclusion that archive.is wants traffic to sustain itself and so told RotLink to do his work. Per the archive.is FAQ, he says "with current growth I can run the site with no ads", which shows how vital traffic is to him. P.S.
Archive.is could be run by one person only.Forbidden User (talk) 17:10, 3 August 2014 (UTC) Ping, it's been almost a week. Any word? PaleAqua (talk) 05:56, 8 August 2014 (UTC) Chris gave some information in which Denis simply made contradiction (see Graham's Hierarchy of Disagreement, it's at the third stage from bottom, which is not good enough). Somewhat I don't think archive.is will reply, because then editors can discount any concern as hysteria.Forbidden User (talk) 11:41, 8 August 2014 (UTC) The day Forbidden User made those remarks was the exact time my communications halted and haven't resumed. Archive.is doesn't want anything to do with Misplaced Pages. Frankly, I don't blame them... because people are actively out to bring them down and twist every word ever said. For those of you who have already seen through the veil and put the pieces together, Lexein's conversation is the truth - its been there the whole time. To the three, maybe four, people who understand the whole situation we know where, of all places, it would have to go. Also contained in Lexein's conversation is your other answer about such participation. I'm sure that the admin closer who goes through this will also come to know these things. Once you've assembled the whole picture it makes perfect sense and you know what has to be done. So you'll share in my indifference to the closing decision and find either outcome favorable. ChrisGualtieri (talk) 05:23, 10 August 2014 (UTC) Perhaps we can all stop trying to persuade the admin to the closures we individually want? Anyway, no one knows the exact truth. Stop, please.Forbidden User (talk) 16:36, 10 August 2014 (UTC) Another remark is that you seem to love doing things you allege others to be doing... like how you made false claims of knowing the "truth" here, and how you antagonise editors who are against archive.is based on their own reasonings/concerns/judgement on the event (Quote:"Frankly, I don't blame them... 
because people are actively out to bring them down and twist every word ever said.) and protagonise archive.is as some "innocent victims", again with no evidence that you promptly request from others.

Nothing more to see here, just some incivility. --Mdann52 talk to me! 17:17, 12 August 2014 (UTC)

The following discussion has been closed. Please do not modify it.

This is an attempt to review the allegations and analyze them with Archive.is's own words and the evidence already gathered. The most common argument that Rotlink must be Archive.is is that "other editors said it so it must be true". The claim has repeatedly been rejected for lack of evidence, and also by Archive.is's operator Denis. The following comes from the e-mails in the Misplaced Pages:Archive.is RFC discussion between Lexein and Denis of Archive.is. Of particular concern are:

Archive.is using malware or injecting ads.

This was rejected by Denis of Archive.is, who wrote in an e-mail: "Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Archive.is is profiting off Misplaced Pages

This is simply false, going by the hit records, the lack of ads, and the fact that, according to Denis from Archive.is, Celeste (pornographic actress) produced the most traffic hits of any article from Misplaced Pages; that article was viewed 6000 times in October 2013. Also, it is hard to profit with no source of income from ads and such. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink was not operated by Archive.is

Denis noted in the email that Rotlink was doing it for SEO link building, and that he had some information that should not be publicly disclosed given its nature. Also, Denis from Archive.is thought that disclosing it would just provoke things further, "because it can affect other people and because the RFC participants are too free in rearranging the words of others". ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink was not using a botnet

The Momento script and Archive.is's public script both serve as a way to one-click archive a site. Misplaced Pages has its own systems for mass-updating and replacing links, but not even Autowikibrowser was used. The process was comparatively slow and the key issue arose out of the stalled process from the original WP:BAG request. It is clear that the person behind Rotlink was already familiar with Misplaced Pages's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Archive.is's operator actually provided a solution to the "trusting Archive.is" problem

Denis said that the argument was only for persistence of links (as per SEO link building), which can be used to artificially raise the page's search engine rank based on algorithms. This probably isn't entirely accurate given Google's algorithms. Though, why was Archive.is chosen? Archive.is was the only one to provide a nice "package" to effectively automate this task en masse, so that a single person could perform it quickly. As evidenced above, Archive.is was just a tool to improve the Misplaced Pages page's own rankings - not to produce hits for Archive.is. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink's writing patterns compared to Denis from Archive.is do not match.

This is more of a technical issue, but a denial from Archive.is's operators and the completely different writing patterns aren't enough to be sure. However, Denis's response and stance show that Rotlink was a problem for us - but Misplaced Pages is not a source of much concern for the service, and Denis wonders why Misplaced Pages doesn't operate its own archive or use a paid service if archiving sites result in such problems. ChrisGualtieri (talk) 17:13, 27 July 2014 (UTC)

Rotlink is still Archive.is - they must be denying it.

Everything about Rotlink and Archive.is doesn't match up - Archive.is gains nothing and takes quite the hit from Misplaced Pages. It's not a false flag type of attack; it's just that Archive.is was a convenient tool for the artificial bolstering of Google's page rankings, and all without actually making any change to the content contained within. For Rotlink, most of those pages were purposeless updates, not even 404ed and unlikely to do so, and the vast majority were worthless ones that had few views (fodder) against the real page-rank bolstering campaign. Rotlink's edits should have been purged, but because they were already made - the whole point of whether they exist or not doesn't actually matter! Rotlink stopped because the job was done or because it became a hassle to get around the edit filter - probably making the job useless. Whoever was behind it knows Misplaced Pages and likely accomplished the mission and moved to a different tactic - typical of how paid editing and search rankings go. When the filter was down and didn't work, it didn't start up again, and the discussion renewed over an issue that has been partially resolved in a different manner with Archive.org. At the end of it all, the needed response was swift, a bit brash, but worked - now we are just dealing with collateral damage as a result. All the problems and accusations boil down to one bad editor, Rotlink. ChrisGualtieri (talk) 17:14, 27 July 2014 (UTC)

Thank you for your reasoning here, so that I can understand your point better. So, you think editors are being sheep that follow the one before them. No, they think those "leaders" have sufficient reasons and proofs. There are too many unsupported statements that I don't want to waste time citing. For example, RotLink's act at least raised the notability of archive.is, and no matter how small, it helps the rankings, and that's it, it is gaining off our work - proportion does not matter.

"It is clear that the person behind Rotlink was already familiar with Misplaced Pages's policies and procedures - but got fed up and attempted to force it through. Resulting in the situation. Note that the person mass-removing Archive.is was using a "hacked" version of AWB to bypass the requirements for permission and socking bad enough that a special abuse filter, edit filter 620, had to be created and reactivated because of continued abuse. Neither case was a botnet, just a committed proxy/IP hopping single editor." I'm not joking, this is a very appealing argument, made out of zero basis. You said argumentum ad populum is not proof, so take your own advice. The same for the writing pattern issue, unless you're an expert on it.

"Do not panic! I do not plan to stop the service, to delete or alter the snapshots, to put ads or malware on it, etc." He lies in the first sentence? What a good liar. In their FAQ they don't have the courage to say so, shown by the stammery statement with total uncertainty and such.

Anyway, do you think many people around still trust Denis? If he wants trust, obey robots.txt, make the funding system clear, and make clear policies and regulations on content.

P.S. The things you raise are not as important as what archive.is is - a page with absolutely no copyright safeguard (unsafe for editors, particularly new ones, to use), no content regulation (so he can prepare for jail, and site down), and a proven (not to you, I know, to me it's proven, the WP:QUACKs are already enough) usage of Wiki for promotion.Forbidden User (talk) 18:17, 27 July 2014 (UTC)

Stop WP:BLUDGEONing the RFC. We understand where you stand, but you are just TLDRing everything with unsupported rhetoric. You equate Archive.is not being able to say for certain (guarantee) that it will keep operating with lying. Cease the polemic rants already! You continue to badger and flout unsupported accusations as truth and hide behind others saying it. Your entire opinion and stance is illogical and is not even your own; it's an unsupported statement you pound over and over even when evidence proves otherwise. Then you flip and call for impossible crystal ball claims to continue spouting claims like "What a good liar." You don't know enough about computers or what the terminology even means. Lastly, you avoid linked evidence and even entire discussions when it suits you, a classic polemic strategy. I think removing your polemic and unsupported or misguided defense of baseless charges is needed. You do not even understand what Robots.txt is even for - much less how it is used! I'm sorry, but you are emotional, uninformed and highly opinionated. Continuing to repeat highly charged claims over and over and using that unsupported claim as "proof" is meaningless. You don't even know what WP:QUACK is for. Continue this disruption and I'll ask for the removal of your polemic rants. ChrisGualtieri (talk) 04:16, 28 July 2014 (UTC)

Addressing it at my talk page. First of all, this whole thing is unrelated to this RfC. I reckon having commented on your logic a few times, and a few comments on you. However, the only negative one I remember is "Never saw Chris in such an irrational tone before.", posted just before your bombing here. As per WP:NOTBATTLE, I'm not going to violate it with you here, so I will not reply your editor-directed comments here even if you insist on continuing here.

For the few relevant statements, here is my response: I should take some responsibility for your WP:BLUDGEON accusation, as mentioned to PaleAqua. However, leaving an edit summary like "Silence when the shoe is on the other foot! Salem witch trial!" makes it difficult for others, not just me, to reply, and is a provocation in itself. You keep saying I'm unsupported (while I'm actually using others' explanations), so I'd like to give you your own advice: Repeating doesn't make something true. As for the polemic accusation, I don't understand the point of using a user talk page guideline here. Meanwhile, the RfC is framed in such a way that we are answering a de facto yes/no question, which makes common ground less attainable than in normal discussions. Actually we at least agree that link rot as a whole is a problem, see? I saw the linked evidence, but it is not relevant to my message, so I don't really know why someone put it there. Forbidden User (talk) 14:13, 28 July 2014 (UTC)

To my reading, if we accepted this, it would overturn the consensus at RFC#1. I'm not ready to take such a step at this stage, but what would help me would be an opinion from a neutral person with technical qualifications in this field (e.g. a WMF employee) who can tell us whether the allegation that illegal botnets were used to introduce these links is probable, plausible, or unlikely.—S Marshall T/C 10:54, 28 July 2014 (UTC)
Agree. We need the same extent of consensus on accepting archive.is as that on rejecting it in RfC#1 to overturn it. WMF should be here early on.Forbidden User (talk) 14:13, 28 July 2014 (UTC)
Well, hang on, that's not quite my position. As I understand what ChrisGualtieri is saying, it's that subsequent events have shown the conclusion of RFC#1 to be wrong. I'm open to the possibility that ChrisGualtieri is right, and I don't think it would take a full RFC consensus to prove it. I'd just like to see some measured advice from the technical illuminati on the subject.—S Marshall T/C 15:49, 28 July 2014 (UTC)
Also, RFC #1 was a non-admin closure and was contentious, but rather than passing something like "purge Rotlink's additions", as was supported, the whole of Archive.is was edit filtered. Though Lexein's e-mail exchange reveals a lot of good information, it's clear Archive.is doesn't care what Misplaced Pages decides, as it costs more to keep Misplaced Pages updated. I am reminded of the Webcitation and other archiving site issues of the past, and how the grok backup was made as an alternative to Archive.is, but we lack any formal, unified, WMF-backed option. A Misplaced Pages-only or WMF-backed option is more of a liability than running Misplaced Pages already is, so I doubt meaningful change will occur in the current environment. @S Marshall:, please consider that long term abuse users and Rotlink's acts are simple. The proxy lists and the IPs used were already on blacklists and such; they were open proxies that were likely obtained from a proxy list. No need to compromise computers or anything and run an IRC zombie network, but hey.... I'm confident that someone with the technical background will explain better than I. The RFC can go either way, but a small set of agreements should have been made prior to this "yes/no" type. Oh well, too late and I don't feel like arguing endlessly over basic terminology. ChrisGualtieri (talk) 16:43, 28 July 2014 (UTC)
Re:"they were open proxies that were likely obtained from a proxy list". Certainly. What about consulting a list of computers that are already compromised makes it acceptable to use those computers for your own purposes? If you believe that is how the IPs were obtained, then you are agreeing that the use was illegal.—Kww(talk) 17:14, 28 July 2014 (UTC)
I'm not going to be playing wordplay with you, but you should know the difference between a botnet and a proxy. And proxies are not illegal, but your stance is well-noted. Rotlink is not Archive.is and the WMF or ArbCom is the only way checkuser's and other data related to the true problem will be proven. Kww - you should also know Rotlink's identity - but since Archive.is doesn't care enough to get involved in Misplaced Pages's chaos, I think I have no need to continue. The WMF spoke on an unrelated matter and I just hope people with technical knowledge and a sense of what's right make good decisions. I don't know why Rotlink's additions were not just purged and the SEO activity related to promotion wasn't handled - it still hasn't in 9 months. Though if no one else cares, why should I? Have fun - debate amongst yourselves endlessly for all I care. ChrisGualtieri (talk) 17:56, 28 July 2014 (UTC)
That's a strange claim: you already are playing wordplay. "Botnet" vs. "proxy network containing computers that have been illegally compromised" is a distinction without a difference.—Kww(talk) 19:29, 28 July 2014 (UTC)
There are several types of proxies, and not all of them are botnets. Intentionally open SOCKS4/5 proxies used to be more common during the innocent days of the internet, for example. Tor is also kinda a proxy. That said, given that those IPs were used to automatically (or even semi-automatically; again, the distinction might be real but doesn't matter here) insert links out of process, that is enough of a concern without needing to worry about whether they were compromised computers etc. Socking is socking. That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for. PaleAqua (talk) 20:38, 28 July 2014 (UTC)
Wow. I genuinely admire your ability to assume good faith on that one.—S Marshall T/C 21:24, 28 July 2014 (UTC)
Actually it's more like I started unfortunately with the assumption of bad faith on the part of archive.is, taking the logical leap as an implied given. It's only through digging through the history combined with doing a few network traces (as an aside, I work in telecom/networking R&D, albeit mostly compilers, embedded operating systems and the like) on my own after the Werieth's socking thing came to light that my opinion changed. And even then it took me a bit to really get a feel of the situation and realize the disconnect. I'm still willing to be convinced that I am wrong, but right now I don't see any evidence or even logic for why archive.is would be behind a botnet or hacked AWB instances or what not. All of the most likely scenarios that I can see have wikipedia and archive.is as pawns in some sort of scheme. Malware doesn't seem like the likely goal to me. If the goal was to get malware onto Misplaced Pages given access to such an array of proxies, they could have taken several other approaches that wouldn't have been noticed as quickly and probably would have hit more targets than such a long term plan that requires people to click on archive links when the original source links are still around. (And to avoid completely assuming bad faith, yes there is the possibility that it was just a misguided effort to help Misplaced Pages and archive.is, though the use of proxies leaves that very questionable at best.) PaleAqua (talk) 04:53, 29 July 2014 (UTC)

The distinction is valid. I'd be hard put to say that socking corresponds to a risk that the perpetrator will also distribute malware. I don't have any hesitation in saying that the use of compromised computers, whether as proxies or in a formal botnet, is an action that leads me to believe that the perpetrator is also more likely to distribute malware.—Kww(talk) 21:46, 28 July 2014 (UTC)

Paleaqua is correct, "That still doesn't connect the insertion of the links to archive.is itself. That is what I would like to see proof or even just good evidence for." - Rotlink is a problem and those additions should have been nuked as Archive.is noted or all "suspect" links from Archive.is moved as freely given. The whole Archive.is matter breaks down once you realize that Misplaced Pages is costing Archive.is and they don't care what we do, but are willing to help our decision if need be. As I tried to state before, existing policy and procedure guides us, but I'm not going to condemn Archive.is for something they did not do. Try this whole thing when you separate Rotlink and Archive.is and you get what should have been done from the beginning - purge the Rotlink additions and CU those unusual pages Rotlink altered with unusual view counts. Rotlink's activity was a clear SEO and page ranking run - and the unusual activity spilled over onto Archive.is's hits. I don't think it was out to crash Archive.is - but it could have if gone unchecked by Kww. ChrisGualtieri (talk) 04:05, 29 July 2014 (UTC)

Also, this is my last post on it unless people really care to continue advancing it. The RFC's outcome really doesn't matter to me in a large scope, but I like accuracy for the historical record here. The closers have a monumental task in sorting out the issues, but I see no reason to really worry. Just message me once the RFC to purge Rotlink's additions if you wish. The community can sort most issues out themselves - and Kww, thanks for your work, you shouldn't be so silent on the good you do even when people (I) disagree about the indefinite length of that response. I'm not a beer drinker, but I'd buy you one. ChrisGualtieri (talk) 04:12, 29 July 2014 (UTC)
Sorry, can I just go back over this because I think I'm missing something huge. As I understand the story, Rotlink went through using techniques that (probably? possibly? what's the right adverb here?) involved illegal botnets to add links to archive.is to Misplaced Pages. The effect of this was to massively boost archive.is' page ranking. This would obviously have had the effect of increasing archive.is' hosting costs as well as their advertising revenue. Since then archive.is' representatives have disclaimed responsibility for Rotlink.
To my reading, that claim is plausible if they employed SEO consultants as part of their launch. Given the scale of the operation and the technical expertise employed to achieve this specific result, the idea that someone independent did this as an act of random vandalism is (ludicrous? implausible? a bit hard to swallow? What's the right adverb here?)
I observe that there's broad consensus on this subsection of the page to purge archive.is' links. It's much less clear from the RFC voting that this should happen.—S Marshall T/C 08:59, 29 July 2014 (UTC)

The easiest way is to make WMF turn from its Media Viewer arbitration to here, look at the RotLink saga, and give a good answer. The argument on RotLink being related to archive.is or not is neither completely proved nor disproved, so starting to base oneself on the assumption that archive.is is not connected to RotLink is not a good idea. You may choose between whether to keep assuming good faith on a site with no copyright policies — it's up to your discretions and opinion.Forbidden User (talk) 17:58, 29 July 2014 (UTC)

The botnet claim is really far-fetched and fantastic because the only "evidence" for it is that the IP addresses of the unapproved bots came from three or four different countries. The simplest explanation is that Rotlink or his minions fed their bot a list of anonymous open proxies. Such lists are widely available on the web from sites like https://proxy.org/, and are used to circumvent censorship, for example, by schoolkids to get around school Internet filters, and by people behind the Great Firewall to access blocked sites. The technical expertise needed to use these proxy lists is very low (most of the kids at school were using them when I was a schoolkid).

Compared to this simple explanation, the suggestion that Rotlink & Co. used a botnet requires a lot more imagination to believe. And building an illegal botnet when there is a very simple copy-and-paste alternative solution using free and legal proxy lists is a bit like building a nuclear weapon to demolish a small house, instead of using a freely-available bulldozer.

Forbidden User mentioned the duck test above, and I think that it definitely applies here. --Joshua Issac (talk) 20:58, 29 July 2014 (UTC)

That would make sense if it weren't for the fact that most open proxies are blocked on Mediawiki projects.—S Marshall T/C 21:38, 29 July 2014 (UTC)

That plus the fact that these "open proxy" lists typically contain a large number of compromised machines. Using a free open proxy list isn't ethically different from compromising the machines yourself. "Three or four different countries" is quite a misstatement as well: the opening statement lists eighteen, and that is far from an exhaustive list.—Kww(talk) 21:42, 29 July 2014 (UTC)
"That would make sense if it weren't for the fact that most open proxies are blocked on Mediawiki projects" - it costs $5 per month to purchase a private VPN subscription. Per Misplaced Pages policy, only open proxies are blocked on sight, and even then, it's impossible and unrealistic to block every single open proxy in existence. I used to rely on a private VPN many years ago, because Steam games were cheaper in the United States than Australia, and it was cheaper to throw away $5 every month than to be constantly ripped off because I don't live in the country of freedom, liberty, and 11 aircraft carriers. I would fool Steam into thinking I lived in the United States, and could buy cheaper computer games - practically everyone my age back then (14-17 years old) did this, because when you're a kid, the only money you had came from slaving away at a fast food joint.
The technical know-how to use these things was very minimal, back in the day any kid at school knew how to use one. The service I used allowed me to pretend to be from a wide selection of countries, from Sweden, Denmark and the US, to even places like Vietnam, Mexico, Brazil, Ukraine, Serbia, Saudi Arabia, and Cambodia. You were seen as someone living in Michigan using AT&T, or living in Ho Chi Minh City on VietTel Corporation Pty Ltd. I don't know if the pricing and services are still the same since I haven't used it in a long time, but I don't expect things to have changed that much differently since then. --benlisquare_T•C•E 04:05, 30 July 2014 (UTC)

Again, even if they were added just through normal socking, meat-puppetry, or a mechanical turk service, that is no different than if they were added via a botnet or some sort of malware. It doesn't change the fact that they were added in a way outside of process and the ones added thusly should be removed. Though in a way that might exactly be what is desired if Chris's theory is correct. Because of nofollow, the links themselves don't add to the page rank of the articles, but the freshness of articles on wikipedia does make the Misplaced Pages article itself more likely to show up in search results. For example, searching on bing will often put an excerpt from a relevant Misplaced Pages article on the top of the right column. I would actually argue that the notability of the impacted articles should probably be checked as well. PaleAqua (talk) 22:07, 29 July 2014 (UTC)
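
For readers unfamiliar with the nofollow point raised above, here is a minimal sketch of how one could check whether the links in a piece of rendered HTML carry rel="nofollow". The sample markup and the names `LinkRelCollector` and `collect_links` are assumptions made for illustration only; they are not taken from MediaWiki or from any editor's comment, and the sketch says nothing about how a search engine actually weighs such links.

```python
from html.parser import HTMLParser


class LinkRelCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list; a missing rel means no tokens.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def collect_links(html):
    """Return a list of (href, is_nofollow) tuples found in the given HTML."""
    collector = LinkRelCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical markup: one external link marked nofollow, one internal link.
SAMPLE_HTML = (
    '<a rel="nofollow" class="external text" '
    'href="https://archive.today/duZRu">Archived</a> '
    '<a href="/wiki/Mars">Mars</a>'
)

if __name__ == "__main__":
    for href, is_nofollow in collect_links(SAMPLE_HTML):
        print(href, "nofollow" if is_nofollow else "followed")
```

A check like this only shows whether the attribute is present on a link; whether it nullifies any ranking effect is the separate question being debated in this thread.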

Yeah, the technical details (which cause headache) are not as important as the nature of RotLink's act, and if we are to apply WP:DUCK, then apply it on the nature of RotLink's act. It does not work well on technical details. Perhaps the "what if xxx's theory is correct" can be put out for the instance. By the way, someone has put a proof that nofollow doesn't necessarily nullify the effect of links of search engine ranking, if I'm correct on interpreting the content... Anyway, if most articles with the links are not notable, then removing them isn't much of a problem...Forbidden User (talk) 16:25, 30 July 2014 (UTC)

I would say archives of a potential comet impact with Mars are notable ( https://archive.today/duZRu ) and JPL/NASA data is PD. Archives.is is the perfect way to keep track of the expanding observation arc and the refinement in the solutions. The WayBack Machine is a fail when it comes to archiving public domain (PD) material from the JPL small body database. -- Kheider (talk) 18:17, 30 July 2014 (UTC)

I have plenty of articles on notable athletes with links to local newspapers available on the web only (eg. http://archive.today/bqdNY) Hawkeye7 (talk) 22:08, 30 July 2014 (UTC)

And so some links will actually be clicked. If the nofollow doesn't necessarily nullify the effect of links on search engine ranking (as proved by someone with linked evidence, correct me if I'm wrong), then archive.is is profiting off our free work (here "profit" is not confined to money).Forbidden User (talk) 13:31, 31 July 2014 (UTC)

How is this any different from the New York Times or Engadget "profiting" from having their URLs linked in Misplaced Pages citations, and readers clicking on them? Why is this even an issue? And if "readers clicking on the links and making archive.is profit" is really such a big deal as claimed, then why not introduce a new parameter in {{cite web}} so editors can choose to have the "archiveurl" parameter show up as a tiny and inconspicuous blip? Something like "smallarchive=yes", which does something like this:

Before: Vasovic, Aleksandar (14 March 2014). "Serbian paramilitaries join pro-Russian forces in Crimea". Yahoo News. Archived from the original on 15 March 2014.
After: Vasovic, Aleksandar (14 March 2014). "Serbian paramilitaries join pro-Russian forces in Crimea". Yahoo News. See archive created on 15 March 2014.

In the second case, no one is going to bother clicking the second link since it's so small, unless the original link is already dead in the first place. Seriously, there's hardly any justification left for the paranoia surrounding SEO if a simple tweak is done to the citation templates: Instead of having an archive URL replace the original if the parameter is filled, it should just include a small archive link to the side in case it is ever needed.

There are all sorts of potential solutions to this issue as long as we bother to come up with constructive ideas, but we're only putting the whole thing into limbo due to the huge focus on disagreements. --benlisquare_T•C•E 13:57, 31 July 2014 (UTC)

Is it relevant to talk about the cite web template here? It now works in the way that prompts clicking into the archive URLs and so it is. Linking sources is due to usage of the sources, necessitating attribution to them. For archive.is, links are put on by force (and a porn stub link gets 6,000+ hits per month, look through others' comments above). Yes, it is a big deal.Forbidden User (talk) 14:34, 31 July 2014 (UTC)

I have to break my silence to counter this continued and completely wrong statement. Archive.is does not use ads - they do not PROFIT off Misplaced Pages in the least. As for the Celeste page, it's an oddity. Even prior to Rotlink, that page consistently got 6000+ views a month, including 8000 in August 2010. Archive.is said it just got the most hits from Misplaced Pages on that page, which given its nature was probably for prurient interests (naked pictures) instead of content verification. Still, Archive.is doesn't make a dime off Misplaced Pages clicked links and Google doesn't boost Archive.is for its mere citation in Misplaced Pages - it boosts the actual Misplaced Pages article in its rankings. Also, if you want Archive.is to not run on donations or on its owner's dime - might as well shoot down Archive.org or Webcite and the others as well. Lastly, the "Not promotional" policy you linked is not about links or their hosting in any way. I regularly use a lot of paywalled sources, many of which cost $100+ a year to view the content, or journals which cost $35 an article. I hope to have Newspapers access soon and that's I think only $80 a year, but you know I am going to be using that to do research for articles and then add hundreds or thousands of sources that link directly to the database. ChrisGualtieri (talk) 15:41, 31 July 2014 (UTC)

No, no, no. I did not talk about money or ads. It's totally fine if they make a transparent donation system, just like Wiki here. I saw discussion which says that, from the Alexa graph, those links being clicked (not the links alone, though someone put a link to something about nofollow in this edit, saying that nofollow does not necessarily work perfectly) do contribute to page traffic. One click on RotLink's spam links is enough to constitute promotion, whether or not he is connected to archive.is or intended to do the promotion (as someone put in this RfC: Ends are more important.). Profit is not just money. Traffic is an asset just like money. After all, why did archive.is not give some technical logs and proofs, etc., to defend itself? It does not need much time, given their expertise. I'm not putting some FUD here, but that is indeed one of the factors leading to this endless discussion.Forbidden User (talk) 16:48, 31 July 2014 (UTC)

Oy, vain much - I was addressing more than just your post. S Marshall mentioned, "This would obviously have had the effect of increasing archive.is' hosting costs as well as their advertising revenue." That was what the first part was in response to. Misplaced Pages is a tiny part of Archive.is's traffic, and looking for a porn actress's pictures was (above all others) the highest "hit" from Misplaced Pages. Archive.is is a service, but WP:NOTPROMO doesn't matter and when revealing the data constitutes a violation of EU law - you are being unreasonable. You are making demands that no respectable company would agree to - it's internet vigilantism and Misplaced Pages has a recent case of a spectacular bad use of even private data. But that's a diversion, because no one will advocate Rotlink's activity as being acceptable. Rotlink should have been blocked for copying the Archive.is blog post and trying to assume the identity of the website - if not for impersonation, then for a WP:COI violation. And this assumption that if Archive.is even shows something, what on earth are you going to do with the data? What can you compare it to, and wouldn't it still be Archive.is's word against your own beliefs? Rotlink should never have been allowed to do what was done - but that is past. ChrisGualtieri (talk) 17:34, 31 July 2014 (UTC)

P.S. Barring a criminal investigation, no "hard evidence" will ever likely be given publicly and Misplaced Pages soured its own chances with Archive.is by demanding they "come clean or else". That's a legal threat and Wikipedians burned the bridge, but also, actual law prevents disclosure of such information. To take from an ANI thread, continuing to "stir the shit pot" is about all people can do on the alleged Rotlink / Archive.is connection. Also, hi Rotlink. ChrisGualtieri (talk) 17:34, 31 July 2014 (UTC)

@S Marshall:, @ChrisGualtieri: A few people have said the archive.is operator has denied being Rotlink. Can someone point out where this is? It may be somewhere here but I can't find it and so far, all I've seen is Misplaced Pages:Archive.is RFC/Rotlink email attempt. Although this was pointed out to me back in October, I didn't look at this very well before; I admit I should have since it has some interesting info. I may change my opinion on how to handle archive.is. But anyway, the email discussion actually makes any denials now from archive.is that they were connected to rotlink very weird because there are strong indications from the emails that rotlink, i.e. Denis, is the operator of archive.is, who evidently is also Denis. (Remember that the emails were to the rotlink account. So it's very very weird if the person replying wasn't the person behind the account. And there definitely doesn't seem to be anything in the email discussion indicating they are replying for some reason even though they are not rotlink.) It's true that Denis/rotlink appears to deny they were behind the proxy/botnet/whatever (the SEO comment), but this is a distinct point. I hope people aren't confusing the two as this whole discussion is already so damn confusing. Nil Einne (talk) 17:10, 1 August 2014 (UTC)

Thanks for that link btw, it is exactly what I was looking for when I asked for connection information. Going to spend some time reading it and reevaluating my positions. PaleAqua (talk) 17:43, 1 August 2014 (UTC)

The situation is complex - I won't go into too much detail, but please do not spam the webmaster. I've sent a final e-mail to my contact - since cooler heads are now dominating this discussion - Wikipedians seem ready to listen. I've also asked for permission to publish the e-mail and details. I only hope that they are sympathetic and open to discussion. ChrisGualtieri (talk) 18:28, 1 August 2014 (UTC)

Oh, so anyone disagreeing with you is not "cooler heads" and all the people expressing concern on archive.is are not "cooler heads"... Could you please focus on the content instead of categorising editors? Also, I did not request anything from archive.is, as it would be COI evidence anyway, though WMF'd have more to work on. So, take care before accusing others of legal threats. You saying that EU law forbids the disclosure pursued by some can be taken as legal threat as well by your logic. By the way, did you ignore that linked "evidence" (I don't know if it is)? I would also suggest you to check the application area of policies and guidelines before citing them. WP:NPA applies on contributors (i.e. accounts), while WP:No legal threats applies on the said group.

You actually put a point of mine — we at least need CheckUsering. WMF intervention is better, and if we need the truth, we will need a criminal investigation.

Sorry, but WP:NOTPROMO matters. Even if RotLink is not related to archive.is, he is promoting archive.is as its traffic increases. (One can help marketing something unrelated, right?) Most small sites face low notability and the subsequent lack of new customers. It needs more first-clicks so that it can have more regular customers. For for-profit sites this means more advertising revenue, and for NGOs it means more donations.

By the way, @Kheider:By PD you mean PDF or Perfect Dark (P2P) or Pure data?

By PD I meant public domain data. NASA data is public domain. CGI scripts place significantly more load on servers than static HTML or even PHP. So Robots.txt at the JPL small body database prevents CGI scripts from mining the database using an interface that was never designed for mining an entire database. So this is a case where Robots.txt has nothing to do with copyrights. -- Kheider (talk) 20:59, 3 August 2014 (UTC)

By looking at the page robots.txt, it is not compulsory and does not always have to do with copyright, though it can often help safeguard copyright. Repeat: Does anyone know why we cannot enter archive.is?Forbidden User (talk) 17:00, 4 August 2014 (UTC)

I have no problem accessing archive.is today (pun). It did appear to be down for a little bit yesterday. Could have just been server maintenance. -- Kheider (talk) 18:27, 4 August 2014 (UTC)

Last note: Failed to enter archive.is/today. Does that happen to you guys?Forbidden User (talk) 16:42, 3 August 2014 (UTC)

Additional comment: Should truth be a mystery, we will have to look for the best explanation. Mine is that RotLink is promoting archive.is. Since it sounds quite absurd to put a double conspiracy theory that RotLink uses Wiki in order to overload archive.is (bad faith to both organisations), and it does not sound reasonable that someone would promote an unrelated site for no reason, I come to the conclusion that archive.is wants traffic to sustain itself and so told RotLink to do his work. Per archive.is FAQ, he says "with current growth I can run the site with no ads", which shows how vital traffic is to him. P.S. Archive.is could be run by one person only.Forbidden User (talk) 17:10, 3 August 2014 (UTC)

Ping, it's been almost a week. Any word? PaleAqua (talk) 05:56, 8 August 2014 (UTC)

Chris gave some information in which Denis simply made contradiction (see Graham's Hierarchy of Disagreement, it's at the third stage from bottom, which is not good enough). Somewhat I don't think archive.is will reply, because then editors can discount any concern as hysteria.Forbidden User (talk) 11:41, 8 August 2014 (UTC)

The day Forbidden User made those remarks was the exact time my communications halted and haven't resumed. Archive.is doesn't want anything to do with Misplaced Pages. Frankly, I don't blame them... because people are actively out to bring them down and twist every word ever said. For those of you who have already seen through the veil and put the pieces together, Lexein's conversation is the truth - it's been there the whole time. To the three, maybe four, people who understand the whole situation we know where, of all places, it would have to go. Also contained in Lexein's conversation is your other answer about such participation. I'm sure that the admin closer who goes through this will also come to know these things. Once you've assembled the whole picture it makes perfect sense and you know what has to be done. So you'll share in my indifference to the closing decision and find either outcome favorable. ChrisGualtieri (talk) 05:23, 10 August 2014 (UTC)

Perhaps we can all stop trying to persuade the admin to the closures we individually want? Anyway, no one knows the exact truth. Stop, please.Forbidden User (talk) 16:36, 10 August 2014 (UTC)

Another remark is that you seem to love doing things you allege others to be doing... like how you made false claims of knowing the "truth" here, and how you antagonise editors who are against archive.is based on their own reasonings/concerns/judgement on the event (Quote: "Frankly, I don't blame them... because people are actively out to bring them down and twist every word ever said.") and protagonise archive.is as some "innocent victims", again with no evidence of the kind that you promptly request from others.

@ChrisGualtieri:So I take it from all this that archive.is owner doesn't deny being Rotlink? In fact as per the previous email discussion with Lexein, they pretty much admit they were Rotlink? They simply deny they were behind the editing from proxies/botnet/whatever? Nil Einne (talk) 15:49, 14 August 2014 (UTC)

As mentioned before, Denis did not operate Rotlink, but was aware of the bot's existence. This was confirmed through the blog of Archive.is and Rotlink apparently ran off the Momento script - meaning I, or anyone here, could utilize the code and could use it. This is more conjecture on my part, but it is likely that since the code was released publicly that Archive.is was aware of its existence and likely provided some view into how to make it work with their site. Basically, Archive.is didn't operate Rotlink - but the same way a World of Warcraft mod interfaces with Blizzard's code is relatively the same level of "Archive.is" involvement. The base matter is, Archive.is didn't operate Rotlink, but was aware of the existence. The connection and circumstances (like providing functionality) which allowed the Wikibot to operate is probably the extent of their involvement. This doesn't represent a "wrongdoing", but Rotlink was not a new Wikipedian and tried to act as a liaison for Archive.is - but the account should have immediately been blocked and cleared with OTRS. I have no qualms in stating this, but a Wikipedian of some note operated Rotlink and was present before Archive.is even existed (as Rotlink's edits indicate). For the full matter, it'd need to go to ArbCom since the evidence showing this link would violate WP:OUT. I've decided (as Archive.is did in 2013) that the community must choose how to respond to the site's inclusion and appropriateness. I just believe the Wikibot's distinct edit summary and style be edit filtered if the community decides Archive.is is an okay archival site. But if it decides against it, no worries either - other Archivers exist. ChrisGualtieri (talk) 17:30, 14 August 2014 (UTC)

It looks like you still assume that Denis did not operate Rotlink is a fact... without the proof you've always wanted from others. Actually I think it's wrong for them not to see that they should be blocking those bot requests (i.e. sudden surges of automated requests), as they are very often malicious. It is more so when they are aware of its existence. Nil gave a good point — they seemed to avoid the RotLink subject... P.S. Archive.is refuses to reveal its source code under Open Source License and funding system (see their FAQ and blog). I find this contradictory to our principles.Forbidden User (talk) 17:50, 14 August 2014 (UTC)

Skipping the first part except to say, from Nil's link to the email exchange, that it does look like there is a connection between the account and the owner of the site. For the 2nd bit, while open source is nice and all, we don't require other sites or external tools that we link to to be open source so I'm not sure that matters here. PaleAqua (talk) 14:29, 15 August 2014 (UTC)

I won't tolerate someone who removes talk page content of others three times. Anyway, I should clarify that "very often malicious" alludes to archive requests by Rotlink bot, and "malicious" ≠ "illegal". Denis can operate RotLink legally - he only violates Wiki policy if he does so. Opting to threaten others instead of handing in proof to support yourself is 1) not a good idea and 2) not a good argument. Repeating your "fact" cannot make it true.Forbidden User (talk) 10:16, 15 August 2014 (UTC)

See Graham's Hierarchy of Disagreement.Forbidden User (talk) 10:57, 15 August 2014 (UTC)

Forbidden User: Can you avoid responding here. I'm pretty sure you've said all you're saying here before so I don't think it helps to repeat it again and instead risks detracting from my questions. Nil Einne (talk) 14:30, 15 August 2014 (UTC)

As long as no one forces me to repeat.Forbidden User (talk) 14:49, 15 August 2014 (UTC)

Can you provide a link to the blog post? I asked for this before but didn't get it.

Has the operator of archive.is offered any explanation for the email chain between Lexein and Rotlink/Denis?

As I've said before, the emails between Lexein and Rotlink/Denis are a very interesting piece of evidence here. It definitely sounds a lot like the Denis replying to Lexein was the same Denis behind archive.is. And since Lexein contacted Rotlink, not the Denis behind archive.is, it's fairly weird that anyone other than Rotlink is responding, particularly if the respondent didn't even mention this. There are ways this could have happened but if the person behind archive.is is aware of this email chain, I would be surprised if they offered no explanation if they claim Rotlink isn't them.

One final question, you mentioned above that the operator of archive.is stopped responding to you after Forbidden User's comments. Would it be possible to give the exact time (at least hour, if not minute) and date when you last received contact in UTC or some defined timezone?

Nil Einne (talk) 14:39, 15 August 2014 (UTC)

For the evidence that can't be shared because of outing concerns, is there a way to share it with, say, an admin, arbitrator, or oversighter who might then corroborate the information without exposing it? PaleAqua (talk) 14:35, 15 August 2014 (UTC)

I have a little question: Is the evidence a plain denial from archive.is or technical proof, like logs?Forbidden User (talk) 14:49, 15 August 2014 (UTC)

Re: "Can you provide a link to the blog post?" - It's URL blacklisted here. It's "http://blog." followed by the name of the website we're all talking about. --benlisquare_T•C•E 16:04, 15 August 2014 (UTC)

Sorry for the confusion, but I'm looking for the specific blog post where the archive.is owner denies being behind rotlink, not the blog in general. Nil Einne (talk) 17:38, 15 August 2014 (UTC)

Yes, I will only share certain information with Arbitrators, the WMF or CUs - but I'll be away for a week shortly. I've privately asked about the situation and was basically told that revealing even CU information would be a violation. Please understand why I cannot divulge information related to my inquiries. The Rotlink matter has to go through Arb Com - but the community should decide whether or not to have Archive.is - independent of the past or based on the past. ChrisGualtieri (talk) 17:53, 15 August 2014 (UTC)

I'm very confused here. If it's a public blog post, why is there so much secrecy about it? Is there an outing situation in the blog post itself? If so, I can understand why you can't include the URL or quote it, but I don't understand why you would mention this super secret public blog post at all since you're confusing the whole situation and also creating an outing risk anyway. (Just to be clear in case there remains any confusion I'm only referring to the blog post itself, not any private emails, personal speculation or anything of that sort.) I would like the date & time of your last contact as well, but that's a separate matter probably better discussed below now and if you feel even that info is too private, as I said from the beginning that's your choice. Nil Einne (talk) 18:00, 15 August 2014 (UTC)

Blog Answers

More blog info from before. All come from the Archive.today blog.

- Question:"Can your spider be stopped in a similar way as IA Archiver: "User-agent: ia_archiver Disallow: /" Thank you. Anonymous
- Answer: There is no spider (as a machine which takes decisions what to archive). All the urls are entered manually by users (or taken from https://en.wikipedia.org/Special:RecentChanges, where they also appear as a result of user edits). If the archive would check and obey robots.txt, then, if archiving is disallowed, the error should be seen to the user, right?

Then, on seeing the error, the user will archive the page indirectly, first feeding the url to an url shortener (bit.ly, …) or to an anonimizer (hidemyass.com, ..) or to another on-demand archive (peeep.us, …), and then archive the same content from another url, thus bypassing the robots.txt restrictions. So, this check will not work the same way as it works with IA Archiver (which is actually a machine which takes decisions)."
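For readers unfamiliar with the rule quoted in the question above, here is a minimal sketch of how a crawler that chooses to honour robots.txt would evaluate it, using Python's standard `urllib.robotparser`. The host `example.com`, the agent name `ExampleArchiver` and the helper `may_fetch` are hypothetical names used only for illustration; nothing here describes how archive.is or the Internet Archive actually implement their checks.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted in the question: it addresses only the
# crawler that identifies itself as "ia_archiver".
ROBOTS_TXT = [
    "User-agent: ia_archiver",
    "Disallow: /",
]

def may_fetch(robots_lines, user_agent, url):
    """Return True if robots_lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(user_agent, url)

# The named crawler is refused; an agent the file does not mention is not.
print(may_fetch(ROBOTS_TXT, "ia_archiver", "http://example.com/page"))
print(may_fetch(ROBOTS_TXT, "ExampleArchiver", "http://example.com/page"))
```

The check is advisory: it only has an effect if the fetching software runs it and acts on the answer, which is the point the quoted answer makes about on-demand archiving.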

- Question: Are there any bots that archive the externe weblinks in the Misplaced Pages automatically with archive_is? I ask because I see so many archived sites of these links, I cannot believe this is made by human. - Anonymous

- Answer: There was a wiki bot for this. It had been blocked in English Misplaced Pages few months ago but the snapshosts it has made are still here. And the bot is still working for non-English Wikipedias and other Wiki-projects (e.g. http://lurkmore.to, http://wikiislam.net, etc)

- Question: What do you think about the discussion currently on the Administrators' Noticeboard at the English Misplaced Pages? These people are making baseless claims and allegations against you, without any firm evidence. Some of these people are not allowing anybody to use links to your site on Misplaced Pages. - Anonymous

- Answer: I think archive.is better fits needs of small amateur wikis and forums than big players like Misplaced Pages or Stack Overflow. Misplaced Pages could buy an enterprise archiving service (e.g. pagefreezer) for 50-150k$/year. It is only 0.1-0.3% of their annual expenditures. I cannot provide higher level of uptime and support, even for money.

Of particular note is the "Wiki bot" matter, which was over a year ago and is an outright confirmation back before the Rotlink incident that all archives are by Misplaced Pages users or taken from "Recent Changes". The existence of a bot, which can be run at whim, is no different from others - but this isn't so much of an outright denial of the connection. I cannot really give away too much information because it would OUT someone, it'd need to be through ArbCom (and I am going to be on vacation for a week soon), but I do understand that there is indeed a minor connection between Archive.is and Rotlink through the mere circumstances which led to Rotlink's actions. Truly, a Wikipedian was behind Rotlink and appears to have been working with Archive.is to try and end dead links on Misplaced Pages. Thankfully, shortly after the Rotlink episode a non-bot was set up to auto-archive URL additions with Archive.org and a Misplaced Pages archiver (of unknown coding) was further created (maybe off Momento's code?) to make Archive.is obsolete. Without going into private data, the IA Archiver has been around for over a decade, making Archive.org the best choice. I would rather not cross a certain person on Misplaced Pages, so excuse me if I need to stay silent on the message details - I asked for permission to post, but I won't badger him anymore. Check Users are well aware - as they have the technical logs for Rotlink, even if now stale. As alluded to before, I didn't get the whole story - or half of it, but for the details to really be known - (as hinted at above) - Arb Com would likely need to be involved and then it would have to be conducted almost entirely in the background and Archive.is would likely stay out of it. The end result would be punishment for Rotlink for their actions, but the community, as before, would decide on Archive.is. While I see Archive.is's value, either way the community decides is fine with me - and Archive.is as well.
ChrisGualtieri (talk) 16:00, 15 August 2014 (UTC)

We seem to keep going back and forth. Above you said:

As mentioned before, Denis did not operate Rotlink, but was aware of the bot's existence. This was confirmed through the blog of Archive.is

Can you please provide either the URL or a quote of this blog post where the operator of archive.is says they did not operate Rotlink? So far none of what you've posted has included any denials of being behind Rotlink, nor does it contradict the stuff from the discussion between Lexein and Rotlink/Denis.

If you're not sure how to provide a URL, just copy the address and remove the https or http part along with the colon and two slashes before "blog.archive.today" (or whatever). This will ensure it's not picked up by the blacklist. See e.g. blog.archive.today/post/74234216490/are-there-any-bots-that-archive-the-externe-weblinks-in
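The manual step described above (dropping the scheme, the colon and the two slashes) can be sketched in a few lines of Python. The function name `strip_scheme` is a hypothetical helper, and the sketch assumes that removing the `scheme://` prefix is what keeps the text from being treated as a link, as the comment above describes.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")

def strip_scheme(url):
    """Return url without its leading scheme, colon and two slashes."""
    return _SCHEME_PREFIX.sub("", url.strip())

print(strip_scheme(
    "http://blog.archive.today/post/74234216490/"
    "are-there-any-bots-that-archive-the-externe-weblinks-in"
))
```

An address that already lacks a scheme is returned unchanged, so the helper is safe to apply twice.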

Also I don't care about the email details other than the date and time of the last contact since you suggested you believed this due to Forbidden User's post. I don't personally believe giving the time in terms of date and hour and perhaps minute will reveal anything personal but of course, it's up to you. (Or rather, I do care, but I understood these were considered private.)

Nil Einne (talk) 17:45, 15 August 2014 (UTC)

P.S. I believe I did this a week or two ago but just to check, I did another search of the blog. I can't find any mention of rotlink or rot link. The only mentions of wikipedia are largely what you already posted and none of them include any denials of being behind rotlink. So I remain totally confused about this mythical blog post. Nil Einne (talk) 17:56, 15 August 2014 (UTC)

Hmm, it looks like they don't want our links anyway. So, it sums up as "Rotlink says he's with archive.is. Archive.is does not explicitly deny connection with Rotlink, but denies initiating the Rotlink bot action and later those by the IPs. There are concerns on archive.is being Rotlink and promotion through potentially illegal ways, some saying that it's highly likely, if not technically proven. Some discount these as unproven, though there are no technical details available to disprove the concerns. Through analysis of the little information, some say that there is apparently a connection between Rotlink and archive.is, though the extent is disputed."?

P.S. Closing could occur in a few days' time.Forbidden User (talk) 16:54, 15 August 2014 (UTC)

Just to add, I will respond via talk pages, so that Nil's useful comments won't be drowned.Forbidden User (talk) 17:49, 15 August 2014 (UTC)

Rotlink did not claim he is Archive.is owner

He said "it is my friends' project" in Sep 2012 (1 year before the proxy/bot incident). 87.69.97.159 (talk) 11:11, 29 August 2014 (UTC)

If that's true he's got all the motives to promote it. Anyway, SPA tagged above, thanks.Forbidden User (talk) 15:29, 29 August 2014 (UTC)

Thank you 87* - I thought there was some solid evidence in a diff that states Rotlink is not Denis pre-incident. I've said for a long time now that Rotlink was not a part of Archive.is and this is evidence of that. This backs up all my statements above. ChrisGualtieri (talk) 00:27, 30 August 2014 (UTC)

Perhaps you've gone too far with it. He can be involved in a project headed by a friend, and he made no denial of that. Afterwards he even talks as the "owner of archive.is". Beware of your double standard on the solidity of evidence. If this is called solid, all his claims of owning the site should be adequate to back up claims of self-promotion, and all his claims of relation to archive.is owner (Denis) are the evidence of promotion for archive.is being cooperation of the two. Oh, the evidence you want suddenly appears without any "criminal investigation". P.S. Read your words again - the diff is nothing about the "laughable" botnet you bickered on. Forbidden User (talk) 05:53, 30 August 2014 (UTC)

archive.is and archive.today are not reliable archives

The domains archive.is and archive.today are owned by the same private person in Prague. There does not seem to be any organization or anybody answerable to anyone behind the site. We do not know anything about archiving routines and security and consequently cannot determine if the copies of the pages are genuine exact copies.

Furthermore, the site has a function showing which Wikipedia pages are linked to the "archived" page. This page claims that the following pages are linked to it:

en.wikipedia.org » Brčko

en.wikipedia.org » Talk:Federation of Bosnia and Herzegovina

en.wikipedia.org » Talk:Republika Srpska/Archive 4

jv.wikipedia.org » Brčko (kutha)

mk.wikipedia.org » Брчко (град)

nl.wikipedia.org » Brčko (stad)

nl.wikipedia.org » Goražde

no.wikipedia.org » Brčko

no.wikipedia.org » Trebinje

I have tried to verify that the English pages link to the "archive" page, but failed to verify this (I could of course have missed something). What worries me more is that this could possibly, for some undetermined reason, be the purpose of the domains: to claim a relationship with Misplaced Pages. Furthermore I tried to archive the "archive" page, but WebCite was unable to do so because it received a Page Not Found error from the website concerned. If this was a trustworthy attempt at actually filing a reliable and legitimate copy I can't really see why it is not possible to archive.

My conclusion is that this is not a reliable archive. That using archives that are not reliable could possibly damage Misplaced Pages. In my opinion links to the netsites should be prohibited and removed. --ツDyveldi _{✉ post} 19:51, 29 August 2014 (UTC)

Nested archiving does work. More. But why would anyone need it? 79.179.139.91 (talk) 20:56, 29 August 2014 (UTC)

Re: "But why would anyone need it?" - Why would anyone need a house alarm system? Why would anyone need contraception? I don't understand why people ask questions like this. Just because it doesn't apply to some people doesn't mean that it wouldn't be useful to everyone. I for one can see a lot of beneficial uses for this. --benlisquare_T•C•E 04:23, 30 August 2014 (UTC)

Not reliable like archive.org which abandons snapshots based on the whim of a robots.txt file? And not being able to archive an archive is not something unique to archive.is archives, you're asking to archive a frame that is calling on a snapshot. This will not work on most archive sites and isn't some malicious piece of coding on archive.is's end. Meanwhile, archive.is CAN archive another archive, as seen here, archived from archive.org which has abandoned the snapshot, and the original link is dead, so archive.is is now the only place on the internet that this information exists. How is it not reliable again? Darkwarriorblake / SEXY ACTION TALK PAGE! 21:50, 29 August 2014 (UTC)

I know you two like it. However, repeating WP:IMRIGHT is not meaningful. You showed that archive.org archives can be archived, then can you show that archive.is is the only site that can archive archives from other sites? Can you show that the majority of archive sites deemed more reliable have the same problem (i.e. their archives cannot be archived), so that such a problem is inevitable? You should read Dyveldi's comment again. He mentioned quite a lot other than archiving archives, like a function showing Wiki pages linked to it, which is akin to corporation sites' "business relation" section. Forbidden User (talk) 06:17, 30 August 2014 (UTC)

"which is akin to corporation sites' "business relation" section" - Jesus Christ, you're looking too deeply into these meaningless things. How on earth is a list of backlinks a "business relation" statement? When you visit a Wordpress blog and it lists backlinks, are they claiming relations too? This is 100% pure paranoia.

The backlinks that archive.today provides has nothing to do with Misplaced Pages. It lists all incoming URL links to that particular archive, which include blogs, forums, private wikis, and yes, Misplaced Pages. As an example, see https://archive.today/lM1uP. --benlisquare_T•C•E 07:56, 30 August 2014 (UTC)

No, it's not "business" relation. It'd have been normal if the Rotlink saga hadn't occurred. However, as a lot of links are added by illegitimate means, the function can be seen as creating an impression that it is a popular archive site and/or Misplaced Pages somehow "cooperates" with it. Your analogy to WordPress sounds odd to me, as bloggers cannot gain through showing the backlist, though I could be wrong for that. As blogs are not functional, no one will click into the site more to read the post again because of a long backlist. For archive.is, the backlist can actually create impression of reputation and prompt people to use its service. As said in its FAQ, with enough traffic he can keep the site free of ads, i.e. more traffic = more fund. This is not solid hard, but it's not paranoia.

By the way, are you going to answer my questions above, and do you remember to give a figure I repeatedly asked for? Citing a certain person who deliberately ignores that as well, it's "silence when the shoe is on the other foot". You can either answer or resort to ad hom like what you did to S Marshall, and I'd like to see the former done.Forbidden User (talk) 10:09, 30 August 2014 (UTC)

As said in its FAQ, with enough traffic he can keep the site free of ads, i.e. more traffic = more fund. This is not solid hard, but it's not paranoia.

- this sustainability issue was discussed in RFC1 and Rotlink email attempt. I understood it differently (because "more traffic = more fund" together with "free of ads" looks ridiculous; how else can they convert traffic to fund if not by using ads?) -

The less number of pages are submitted daily the longer they can run the free site on the spare hardware without need to raise funds.

Bots submit a lot without bringing profit. Fundraising of WebCite among wiki editors has failed ($10k raised with the goal of $50k). So they are pessimistic about the collaboration with Misplaced Pages and not happy with bots (unless someone will pay them or buy an enterprise license). 79.179.139.91 (talk) 11:40, 30 August 2014 (UTC)

Thanks for the answers. I wondered about the resistance to being archived. My position though is based on the fact that there is no organization. This is a private person with an address in Prague. There is nothing controlling how, where and what he archives or if this is a proper copy. Anything could be done to or happen to these links and nobody would be responsible for checking. Why serious sponsors should pay to maintain this I do not understand and as far as I can see nobody knows or has any control of any money here. We do not even know which country the person pays tax to. An organization on the other hand does regularly have rules and regulations and a governing body of people. To call something an archive I do expect a minimum of control by and answerability to someone it is possible to identify. --ツDyveldi _{✉ post} 13:23, 30 August 2014 (UTC)

So wait. You do not trust the alternative Wiki archiver then? If it provides a checksum, it would be infeasible to alter it and have it go unnoticed - but again your fear of an organization seems a bit unfounded, but you do not need an international organization to provide a service and you expect control and answerability to someone you can identify as a precursor? Then you must be talking of some other archive service because not even archive.org meets your "standards" and Archive.org is the most well-known. ChrisGualtieri (talk) 14:05, 30 August 2014 (UTC)
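To illustrate the checksum point made above, here is a minimal sketch of how a published cryptographic checksum lets anyone detect a later change to an archived snapshot. It assumes SHA-256 purely for illustration; the discussion does not say which algorithm the archiver in question uses, and the function names and sample bytes are hypothetical.

```python
import hashlib

def checksum(snapshot):
    """Return the SHA-256 digest of the snapshot bytes as a hex string."""
    return hashlib.sha256(snapshot).hexdigest()

def is_unaltered(snapshot, published_checksum):
    """Check a snapshot against the checksum published when it was taken."""
    return checksum(snapshot) == published_checksum.strip().lower()

original = b"<html><body>Archived page as captured</body></html>"
published = checksum(original)  # recorded once, at archiving time

# Any edit to the stored bytes, however small, changes the digest.
tampered = original.replace(b"captured", b"modified")
print(is_unaltered(original, published))
print(is_unaltered(tampered, published))
```

The check only helps if the published checksum is stored somewhere the archive operator cannot quietly change, for example alongside the citation itself.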

All the domains (archive.{is,today}, webcitation.org, archive.wikiwix.com - French wiki default archiver) are registered to private persons. Archive.org is the only anomaly. Registering/deregistering an organisation is not a big deal nowadays, and actually it limits the liability of a person. 79.179.139.91 (talk) 14:10, 30 August 2014 (UTC)

He means all the archives are under a few people's (or even one person's) control, and so it's hard to say there is someone who's checking the archives and monitoring the use of donations. If IA can get away with issues on their service, I'd say it is smarter than most corporations in creeping away from responsibility. Will you save your money in a bank controlled solely by me or Standard Charter?

I'm not saying they try to make money out of us. Their profit in traffic and number of links raises the chance of getting donation.

Small sites need to expand to survive. Keeping a small number of archives means that their money will eventually run out. They are done faster that way.

Anyone responding to my queries in my last comments?Forbidden User (talk) 15:43, 30 August 2014 (UTC)

For eternal preservation no bank can be trusted and no bank must be used. Only gold bullion. Printed and laminated webpages. 79.179.139.91 (talk) 22:02, 30 August 2014 (UTC)

The bank analogy would probably be a good argument against centralization and duopolization as well. A bank may go bankrupt at any time, which makes putting your eggs all in one basket a bad idea. By the same token, relying on only webcitation and archive.org for web archival would be a non-optimal strategy as well, and diversifying would provide the better final outcome in the long-term. If the "trustworthiness" of archive.today is to be questioned, the proper strategy would be to warn and discourage against using it, but not to outright ban its usage, since even though "better" archival sites are still available, that does not rule out that people can use archive.today as a last resort, as a final fallback.

People should be encouraged to use archive.org and webcitation first. In the case where that is not possible, then they should be able to consider the option of using archive.today. It wouldn't be the first archive to go to, which means that Misplaced Pages projects wouldn't be swamped with archive.today URLs, slightly alleviating the concerns that there is a conspiracy to pair archive.today and Misplaced Pages as business partners. If an alternative archive of a page is available, it seems reasonable to replace existing archive.today URLs, but deletion of the URLs wholesale only intensifies the problem of having all the eggs in the same two baskets. We need to consider the problem of comparing lesser evils here. --benlisquare_T•C•E 03:24, 31 August 2014 (UTC)

It appears that the community does not really buy your opinion.
Nothing is permanent, not even Misplaced Pages or the human race. If we go by your principle, archive sites are completely useless, as they're as fragile as the source sites. I doubt if Misplaced Pages would be more stable than IA.
No one says you should solely rely on archive.org, but I'm saying "go find others, not this". The fr.wiki default can be an example of alternatives.
Remember that you're not solving link rot if some eggs get into a weary basket. If an Internet version of the Great Depression occurs, these small sites die first, while some are actually too big to fall.
We have over 10,000 archive.is links now, and your !vote says don't remove. So you're not alleviating anything.
Do we remember WP:DENY?.We are prompting others to do the same thing if archive.is gets away with the saga.
You are still avoiding my questions.
However, your tone and attitude is good overall.Forbidden User (talk) 08:43, 31 August 2014 (UTC)

We are not prompting others to do anything. We have no concrete evidence that archive.is the site had anything to do with the mass addition of links. If anything of that nature were to happen again the site would be banned and there would be no hope of it ever being accepted again, but as it stands, we have no evidence of illicit actions on the part of the site. Darkwarriorblake / SEXY ACTION TALK PAGE! 09:58, 31 August 2014 (UTC)

First of all, strictly speaking, there are two mass additions. One by Rotlink and one by the proxy/botnet.
This subsection is about the reliability of archive.is, not whether there are wrongdoings on the site's side, so it's not so relevant. However I'll attempt to reply as concisely as possible. If you want definite evidence you'll have to find WMF, Checkuser or even the cop, which is out of our jurisdiction. We can look at the little information at hand though. It is extensively and excessively argued upon above. Anything to add would be that archive.is has knowledge that mass automated service requests related to Misplaced Pages are against Misplaced Pages policy, and yet they don't do anything to prevent it from happening again with those IPs. They also allow the bot to continue its work on other wikis knowing that it violates our policies. Some say they just don't care enough and some say it shows cooperation and/or connection between the two. Nil has raised something useful, too. Denis has knowledge on what we're doing. I hope I don't sound too repetitive. Forbidden User (talk) 10:36, 1 September 2014 (UTC)

I've said elsewhere, and I say it again: RotLink and those IPs were abusing Misplaced Pages by doing mass additions of the links without following the procedures, even after being told to do so. They were alerted on several wikis that these additions were not wanted (they were blocked on several Wikis as operating bots without approval). Whether it is the owner of Archive.is that is behind those additions, another user trying to give archive.is a bad name, a random fan of the website, or the Queen of England is not the question. The point is that there was abuse using that site which needs to stop. Blacklisting the links is then an option, and further additions of that site need to be reasoned (through a whitelist request for the specific link).
Just as a note: if the pushing really stopped, it is likely that the IPs, and the RotLink accounts were operated by the owner (or a fan), as they are now very aware of what happens if that continues and hence it stopped. If they were not operated by someone with links to the site, but who wants to give the site a bad name, then the pushing would continue to make sure that the site would get banned (as that is their ultimate goal, isn't it?). In either case, the pushing after the warnings was not in good faith (and in the latter case, the pushing before the warnings were also not in good faith). If it was the Queen of England .. --Dirk Beetstra 11:06, 1 September 2014 (UTC)

About prompting other archive sites to follow: Now that all archive site owners see that there are editors who will defend them banging for some undeniable proof, and that the Wiki community lacks the ability to really dig out such proof, they will know it is safe to do it themselves. Sadly, this whole drama has served as an "ad" for archive.is. They probably see it too.Forbidden User (talk) 17:53, 1 September 2014 (UTC)

Comment on ownership and reliability of archiving domains.

This Swedish site: http://archive.grok.se/ did not seem to work (except the main page) due to an internal server error. And stating that "archive.grok.se provides a cryptographic checksum" does not matter if the site does not work.
archive.org does work and is an organization which we know.
webcitation.org is registered to Gunther Eysenbach at Centre for Global eHealth Innovation. The organization can be found here and Mr. Eysenbach's page on their site is here. The Misplaced Pages article on Mr. Eysenbach is here. This is quite above board and I fail to see many chances of anything being questionable about this site given who Mr. E is and his organization.
archive.is and archive.today are owned by Mr. Petrov of Prague and this "company" information shows clearly that this has nothing to do with an organization. This page shows Mr. Petrov's email address on the domain denis.biz. The domain denis.biz belongs to someone else, but Administrative Contact Name is Denis Petrov. The difference is that this time Mr Petrov is of St. Petersburg. Googling "Denis Petrov" AND Prague did not really help. I also found this and this, which is not in any way conclusive, but the absence of serious info, biography material was overwhelming.

Googling in this case only made me more convinced that this netsite is under no control we can or should rely on.

What is possible, though, if all these links are deleted is to leave the original url. If this domain is banned, it is possible to make a small manual to explain how to find an archived url at archive.is. Getting a look at the page can help editors to find the original page which quite often is not gone but is moved to a different url. The worst problem with link rot is after all that quite often the name of the article is not given or even when or where it was found, which makes it quite impossible to find out what the reference was. --ツDyveldi _{✉ post} 19:11, 1 September 2014 (UTC)

Misplaced Pages article on Mr. Denis Petrov is here. Leningrad is former name of St. Petersburg. It could be a pseudonym though. 79.179.139.91 (talk) 21:06, 1 September 2014 (UTC)

Don't see the point of providing the Wiki page of Petrov. It doesn't show that he is any more traceable.Forbidden User (talk) 17:52, 2 September 2014 (UTC)

By the way did you say that no one is sure if the owner of the site is really that particular Denis Petrov? There is a real person called Harry Potter, but you cannot say Rowling wrote about him just because of the name. The Rotlink email is probably the sole/one of the few contact methods we can have with the owner. Your descent into ad hom-ish remarks only shows that you're out of argument and/or desperate for face-saving. The IP removed those comments by him/herself. Forbidden User (talk) 09:18, 5 September 2014 (UTC)

Please quit the BS talk. I have never seen any evidence that the archives created by archive.is are unreliable. -- Kheider (talk) 12:49, 5 September 2014 (UTC)

What is that BS? Here "reliable" is about whether there is control/monitoring, not only whether the archives are removed. An analogy would be the requirements in WP:IRS. We don't use blogs as sources since its content lacks third-party fact-checking, etc. We don't need to find enough mistakes in a blog to say it's not RS.

You've ignored the funding and management nature. I'm not repeating this repeated argument. You don't have to repeat your disagreement as well. I see neither anything new nor answers to my questions.

By the way, I think IA is open to request for improvement. Good luck.Forbidden User (talk) 16:35, 7 September 2014 (UTC)

Forbidden User, you have taken far more effort repeatedly attacking rotlink and archive.is than anyone else on this page. Your personal distrust of rotlink has nothing to do with whether archive.today archives are accurate. I find archive.today to be better and more accurate at archiving more complex webpages than archive.org is. There is no significant reason established editors should not be able to add links to archive.today when other archivers do not have the link saved. -- Kheider (talk) 17:16, 7 September 2014 (UTC)

@Kheider, and as an example of this: Archive.today link, archive.org link, webcitation link. Archive.today is the only one which actually archived what I wanted, the url plus the code requesting the specific slide. The other two for some reason cut off the requesting information and default to the first slide, making them useless. But of course, because of this pedantry and wild accusations, I can't just add the working archive.is link. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:41, 7 September 2014 (UTC)

@Darkwarriorblake: Yes, and the fact it is usable has been rejected by the "archives are not needed". Though this really goes to the blacklist and spam cabal that have blown off even admins over this matter. The core issue is that someone doesn't like it, made an arbitrary decision and now a year later - good editors are suffering. That's the whole thing in a nutshell. ChrisGualtieri (talk) 14:19, 9 September 2014 (UTC)

I think we're still at different pages.

It is difficult to trust Rotlink for what he did. He's responsible for that.
You've paid lots of efforts to defend archive.is for your convenience as well. What's the point of mentioning it? It's a non-argument.
No one said it is not reliable because it's not accurate. Some editors say that it's not reliable because of its funding system (which diminishes the archives' permanence), its lack of internal monitoring, and so on, not because of quality issues. Do we reject an unreliable source by saying it has some mistakes in it? A particular blog post could be 100% accurate, and yet it is rejected based on something other than inaccuracy.
Can't you specify which slide contains the information? If your issue can be solved without opting to archive.is don't think that it can stand as an argument.
Again, what is "BS" and how do you define "established user"? Pending changes reviewer or admin?

Forbidden User (talk) 08:34, 8 September 2014 (UTC)

I see fear here, but no evidence of actual unreliability. --Bejnar (talk) 17:23, 23 September 2014 (UTC)

@Bejnar: The onus here is not to find the site unreliable, but to find them reliable. We do not swim rivers with scorpions on our backs as favours. Policy is clear: "reputation for fact checking and accuracy." That has to be found before the site is used, not used until the mishap is found. All software is for free or for sale. Hundreds of millions of western currency can be had for a dollar in a cigarette shop. There is nothing about the existence of a site or its ability to technically function which qualifies it as a resource for Misplaced Pages. It is only for sites that have passed these rigours that the question of "evidence of actual unreliability" becomes valid. ~ R.T.G 22:13, 3 October 2014 (UTC)

Not a democratic situation, content against robots.txt is not-free

Not here to promote other archive sites. Policies of other sites are not our concern.
The following discussion has been closed. Please do not modify it.
Should the nature of response, to, a privacy request be determined on the strength of our will to support the request or, the strength of the request to prevent us, ignoring it? If robots.txt requests that archivers do not archive the site, then it is on us to assume that they know what they are doing. They, the websites with the robots.txt files, have elected not to make themselves available for, this process. That equates a request not to proceed use of content, not to use the content in this way, or even access it in this way. The decision has been made by the owners and providers of the content. This site does: follow owners instructions not to use material, without evaluation of any sort, except, where that material may be direct historical reference in and of itself, and for that kind of reference, archive sites are not required. Because to be verifiably historical to reference, by our Wikipedian standards, it must be published widely, and if that's not what you thought it was just read to the end and click the link and forget about these companies who seen archive.org and bought up all of the archive web names. It is a shame if any content is lost adhering to freedom policies, but if something requires a worldly change to make it freely available, that change does not start here, and that doesn't sound very cool or anything, but it is part of the rock upon which the site is founded. You'll see. We didn't need it. Also for your interest, Archive.org is much more than snapshots of web pages. It is not a race to become the archive of everything that ever appeared on the internet. It's an officially licensed book and film library, based on open freely licensed material, and much in line with the goals of the Misplaced Pages project, so you should all go there and chip them one and read it. No consensus to change. Archive.is is an erroneous addition. Move to close. ~ R.T.G 13:53, 31 August 2014 (UTC)

Your entire argument hinges on a completely warped and invalid notion of what Robots.txt is. Please refrain from trying to lecture people on things you do not understand. You seem oblivious to the fact that even Misplaced Pages's robots.txt is for indexing and not about preventing archiving (which we do ourselves and in every back up). Also, kindly keep your opinions about what is and isn't valuable to yourself when Archive.org's failings have been brought up repeatedly. Lastly, Archive.org is not an officially licensed book and film library - so it seems you do not understand Archive.org's text and media archives either. You also seem unaware that it was Archive.org's retroactive voiding of all archives (done without Robots.txt) when sites added it a decade later is what caused a substantial interest in Archive.is. Your uninformed opinion and poor attitude do nothing because your arguments were thoroughly disproved even before you stated them. I await your rebuttal. ChrisGualtieri (talk) 14:33, 31 August 2014 (UTC)

The Misplaced Pages article, Internet Archive, says that it is an official California library. The Citation is to the site itself. I am taking it on good faith until someone disproves it. When Misplaced Pages archives itself, that is not the same as when someone from the internet archives it. And if robots.txt has a function to refuse bots, the wider general purpose of robots.txt is irrelevant to this debate. I don't know what you mean about "archives retroactive voiding" and I doubt it will reflect on the single defining issue here. And my attitude certainly is not poor on this. It is defined by the honourable position of Wikimedia in my perception, and if you do not hold to that attitude, I do not see the purpose of debating content here at all. ~ R.T.G 20:00, 31 August 2014 (UTC)

I see RTG is not limiting his disruption on things he does not understand to SPI. What I believe is meant by "retroactive voiding of all archives" is that Archive.org has recently decided to dump all old archived versions of a website when the robots.txt file of the modern version has denied their web crawler access. Clearly, if you cannot understand why this change is even being discussed, you shouldn't expound a faulty opinion of it. Ryūlóng (琉竜) 21:18, 31 August 2014 (UTC)

It has no relevance to the fact that their content on archive.is is obtained against the owners direct instruction, it's as good as a letter to that effect. The fact it is within a robots.txt gives no special privilege to anything unless it says so. It said no. They said so. Simple as that. And don't follow me around to attack my good faith, thanks. ~ R.T.G 00:29, 1 September 2014 (UTC)

You do not understand Robots.txt so you repeat your ignorance and attack other editors? If you lack the competence of the subject, cease your involvement in it and educate yourself before you go spouting blatantly false information as fact. ChrisGualtieri (talk) 03:56, 1 September 2014 (UTC)

I guess he means archive.org is in line with our values more, and he thinks that robot.txt is a privacy request which should be respected. Privacy/copyright has been put forward by several people, however some others think that archiving does not interfere with privacy/anything can be done as long as it's legal, so continuing would be like an ideological battle. Sometimes an editor doesn't mean to put fact forward. Only arrogance is demonstrated in calling others inferior. Forbidden User (talk) 10:17, 1 September 2014 (UTC)

That adds you to the list of people who think he doesn't know what he's talking about. Robot.txt has nothing to do with privacy or copyright. Hawkeye7 (talk) 11:46, 1 September 2014 (UTC)

You have expressed my feeling quite nicely in both those senses Forbidden User, and I am of the opinion that such ideological battles cannot take the form of content for various reasons.
I believe I've put all I can think of on the matter. I have stated repeatedly the fact that the form of refusal to access the content is irrelevant to the fact of refusal. Thanks. (list of people, bah!) ~ R.T.G 12:01, 1 September 2014 (UTC) @Hawkeye7: I think it is time to hat RTG's comments as an unintelligible and ill-informed distraction. This RFC has been marred by people who have absolutely no idea what Robots.txt even does and I think ArbCom would be necessary to settle it fairly. Do you agree? ChrisGualtieri (talk) 16:37, 1 September 2014 (UTC) Is hatting necessary here? People have their brains which can judge whether a certain argument is good. With the participation rate, I don't think any additional process (like ArbCom) is useful (and WMF is obviously not intervening). You yourself called on WP:BLUDGEON long ago, right? It's best to let this end the whole drama.Forbidden User (talk) 18:06, 1 September 2014 (UTC) robots.txt "...is a convention to advising cooperating web crawlers and other web robots about accessing all or part of a website which is otherwise publicly viewable..." Certainly, it matches exactly what I suggest it does. It is literally called the robot exclusion protocol. That, in simpler terms, is the protocol for preventing robots. And the .is site purposely ignores that content-owner issued exclusion document. And for ChrisGualtieri, Lorem ipsum ~ R.T.G 19:05, 1 September 2014 (UTC) Yes robots.txt is about preventing web crawlers and other robots from accessing certain pages. But archive.is is not a web crawler/robot but an on demand archiver. Granted when talking about the link adding bot that was adding archive.is links that was in effect a robot ( using archive.is to fetch the pages instead of doing so directly ). In addition to the other issues of the bot, it probably should have considered robots.txt. ( Though since the links were manually added enumerated robots.txt might not have be rehired in that case either. 
) Bot the link adding bot and archive.is are potentially two different things. (see email discussions above for why I use the word potentially ). Archive.is by itself and even manually using archive.is on wikipedia would not bring robot.txt into play. PaleAqua (talk) 05:06, 2 September 2014 (UTC) I don't trust the site. It has no mission statement. It has no revelations about how it is funded. It is not notable. It is "privately funded". There is nothing to assure the worth of the site. That is primary to its inclusion on this site for a start. It was added maliciously. Keeping it here as policy would be an error through and through. Keeping it here at all clearly violates WP inclusion. That is still not a democratic situation. I move to close. ~ R.T.G 13:13, 2 September 2014 (UTC) What does notability have to do with finding an alternate means of getting cached web pages?—Ryūlóng (琉竜) 13:34, 2 September 2014 (UTC) "That is still not a democratic situation. I move to close." - What is that supposed to even mean? That your opinion is more important than my opinion? Or that you're somehow the one who calls the shots? Make yourself a bit more clearer, so I don't have to misunderstand you. --benlisquare_T•C•E 13:40, 2 September 2014 (UTC) Also of note is this edit RTG made.—Ryūlóng (琉竜) 13:45, 2 September 2014 (UTC) If you don't know what that means you shouldn't be editing this site. This is the one for English speakers. ~ R.T.G 14:19, 2 September 2014 (UTC) You can take your snarky passive-aggressiveness and shove it where it belongs. --benlisquare_T•C•E 14:25, 2 September 2014 (UTC) I'm sorry, it was said before something about my being unintelligible, and User:Ryulong is folowing me around from an unrelated discussion. I responded without reading the whole of your statement. Of course it doesn't mean that. I move to you, as in all here, to close this case. The sites notability is as dodgy as its insertion on this site. Why are we even discussing it? 
The only thing I saw in the comments above about content that would be lost are mathematical calculations that have been oversighted from astronomical observatories. Why is there any kind of an issue about this? I haven't seen a statement saying the importance to content. It's a non starter. It's just a circular argument. Move to close, wether disagreeing parties consider that move expected or not, it is relevant. I come half way down the page without even finding one example of content. And more of the same, and so on... ~ R.T.G 15:32, 2 September 2014 (UTC) Cool down a bit. Written content is easily misunderstood. I think RTG's latest comment is somehow improved. It might be more meaningful to look at.Forbidden User (talk) 17:57, 2 September 2014 (UTC)

Not here to promote other archive sites. Policies of other sites are not our concern.

The following discussion has been closed. Please do not modify it.

Should the nature of our response to a privacy request be determined by the strength of our will to support the request, or by the strength of the request to prevent us ignoring it? If robots.txt requests that archivers do not archive the site, then it is on us to assume that they know what they are doing. They, the websites with the robots.txt files, have elected not to make themselves available for this process. That equates to a request not to use the content in this way, or even access it in this way. The decision has been made by the owners and providers of the content. This site does follow owners' instructions not to use material, without evaluation of any sort, except where that material may be a direct historical reference in and of itself, and for that kind of reference, archive sites are not required. To be verifiably historical to reference, by our Wikipedian standards, it must be published widely, and if that's not what you thought it was, just read to the end and click the link and forget about these companies who saw archive.org and bought up all of the archive web names. It is a shame if any content is lost adhering to freedom policies, but if something requires a worldly change to make it freely available, that change does not start here. That doesn't sound very cool or anything, but it is part of the rock upon which the site is founded. You'll see. We didn't need it. Also, for your interest, Archive.org is much more than snapshots of web pages. It is not a race to become the archive of everything that ever appeared on the internet. It's an officially licensed book and film library, based on open, freely licensed material, and much in line with the goals of the Misplaced Pages project, so you should all go there and chip them one and read it.

No consensus to change. Archive.is is an erroneous addition. Move to close. ~ R.T.G 13:53, 31 August 2014 (UTC)

Your entire argument hinges on a completely warped and invalid notion of what Robots.txt is. Please refrain from trying to lecture people on things you do not understand. You seem oblivious to the fact that even Misplaced Pages's robots.txt is for indexing and not about preventing archiving (which we do ourselves and in every backup). Also, kindly keep your opinions about what is and isn't valuable to yourself when Archive.org's failings have been brought up repeatedly. Lastly, Archive.org is not an officially licensed book and film library, so it seems you do not understand Archive.org's text and media archives either. You also seem unaware that it was Archive.org's retroactive voiding of all archives (done without Robots.txt) when sites added it a decade later that caused substantial interest in Archive.is. Your uninformed opinion and poor attitude do nothing because your arguments were thoroughly disproved even before you stated them. I await your rebuttal. ChrisGualtieri (talk) 14:33, 31 August 2014 (UTC)

The Misplaced Pages article, Internet Archive, says that it is an official California library. The citation is to the site itself. I am taking it on good faith until someone disproves it. When Misplaced Pages archives itself, that is not the same as when someone from the internet archives it. And if robots.txt has a function to refuse bots, the wider general purpose of robots.txt is irrelevant to this debate. I don't know what you mean about "archives retroactive voiding" and I doubt it will reflect on the single defining issue here. And my attitude certainly is not poor on this. It is defined by the honourable position of Wikimedia in my perception, and if you do not hold to that attitude, I do not see the purpose of debating content here at all. ~ R.T.G 20:00, 31 August 2014 (UTC)

I see RTG is not limiting his disruption on things he does not understand to SPI. What I believe is meant by "retroactive voiding of all archives" is that Archive.org has recently decided to dump all old archived versions of a website when the robots.txt file of the modern version denies their web crawler access. Clearly, if you cannot understand why this change is even being discussed, you shouldn't expound a faulty opinion of it.—Ryūlóng (琉竜) 21:18, 31 August 2014 (UTC)

It has no relevance to the fact that the content on archive.is is obtained against the owners' direct instruction. It's as good as a letter to that effect. The fact it is within a robots.txt gives no special privilege to anything unless it says so. It said no. They said so. Simple as that. And don't follow me around to attack my good faith, thanks. ~ R.T.G 00:29, 1 September 2014 (UTC)

You do not understand Robots.txt, so you repeat your ignorance and attack other editors? If you lack competence in the subject, cease your involvement in it and educate yourself before you go spouting blatantly false information as fact. ChrisGualtieri (talk) 03:56, 1 September 2014 (UTC)

I guess he means archive.org is more in line with our values, and he thinks that robots.txt is a privacy request which should be respected. Privacy/copyright has been put forward by several people; however, some others think that archiving does not interfere with privacy, or that anything can be done as long as it's legal, so continuing would be like an ideological battle. Sometimes an editor doesn't mean to put fact forward. Only arrogance is demonstrated in calling others inferior. Forbidden User (talk) 10:17, 1 September 2014 (UTC)

That adds you to the list of people who think he doesn't know what he's talking about. Robots.txt has nothing to do with privacy or copyright. Hawkeye7 (talk) 11:46, 1 September 2014 (UTC)

You have expressed my feeling quite nicely in both those senses Forbidden User, and I am of the opinion that such ideological battles cannot take the form of content for various reasons. I believe I've put all I can think of on the matter. I have stated repeatedly the fact that the form of refusal to access the content is irrelevant to the fact of refusal. Thanks. (list of people, bah!) ~ R.T.G 12:01, 1 September 2014 (UTC)

@Hawkeye7: I think it is time to hat RTG's comments as an unintelligible and ill-informed distraction. This RFC has been marred by people who have absolutely no idea what Robots.txt even does and I think ArbCom would be necessary to settle it fairly. Do you agree? ChrisGualtieri (talk) 16:37, 1 September 2014 (UTC)

Is hatting necessary here? People have their brains which can judge whether a certain argument is good.

With the participation rate, I don't think any additional process (like ArbCom) is useful (and WMF is obviously not intervening). You yourself called on WP:BLUDGEON long ago, right? It's best to let this end the whole drama. Forbidden User (talk) 18:06, 1 September 2014 (UTC)

robots.txt "...is a convention to advising cooperating web crawlers and other web robots about accessing all or part of a website which is otherwise publicly viewable..." Certainly, it matches exactly what I suggest it does. It is literally called the robot exclusion protocol. That, in simpler terms, is the protocol for preventing robots. And the .is site purposely ignores that content-owner issued exclusion document. And for ChrisGualtieri, Lorem ipsum ~ R.T.G 19:05, 1 September 2014 (UTC)

Yes, robots.txt is about preventing web crawlers and other robots from accessing certain pages. But archive.is is not a web crawler/robot but an on-demand archiver. Granted, the link-adding bot that was adding archive.is links was in effect a robot (using archive.is to fetch the pages instead of doing so directly). In addition to the other issues of the bot, it probably should have considered robots.txt. (Though since the links added were manually enumerated, robots.txt might not have been required in that case either.) But the link-adding bot and archive.is are potentially two different things (see email discussions above for why I use the word potentially). Archive.is by itself, and even manually using archive.is on wikipedia, would not bring robots.txt into play. PaleAqua (talk) 05:06, 2 September 2014 (UTC)
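For readers unfamiliar with the mechanism being argued over, the following is a minimal sketch of how a cooperating crawler might consult a robots.txt file before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleCrawler` user-agent name, and the helper names `build_parser` and `may_fetch` are hypothetical and chosen only for illustration; nothing here describes how archive.is, archive.org, or any bot mentioned in this discussion actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
# One named crawler is excluded from the whole site; every other
# robot is only asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The named crawler is asked to stay away from everything.
print(may_fetch(parser, "ia_archiver", "http://example.org/page.html"))       # False
# Any other robot falls back to the "*" rules.
print(may_fetch(parser, "ExampleCrawler", "http://example.org/page.html"))    # True
print(may_fetch(parser, "ExampleCrawler", "http://example.org/private/x"))    # False
```

The check is purely advisory: the file only takes effect if the software fetching the page chooses to read it and honour the answer, which is why the distinction between an automated crawler and a tool acting on an individual user's request matters to the argument above.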

I don't trust the site. It has no mission statement. It has no revelations about how it is funded. It is not notable. It is "privately funded". There is nothing to assure the worth of the site. That is primary to its inclusion on this site for a start. It was added maliciously. Keeping it here as policy would be an error through and through. Keeping it here at all clearly violates WP inclusion. That is still not a democratic situation. I move to close. ~ R.T.G 13:13, 2 September 2014 (UTC)

What does notability have to do with finding an alternate means of getting cached web pages?—Ryūlóng (琉竜) 13:34, 2 September 2014 (UTC)

"That is still not a democratic situation. I move to close." - What is that supposed to even mean? That your opinion is more important than my opinion? Or that you're somehow the one who calls the shots? Make yourself a bit more clearer, so I don't have to misunderstand you. --benlisquare_T•C•E 13:40, 2 September 2014 (UTC)

Also of note is this edit RTG made.—Ryūlóng (琉竜) 13:45, 2 September 2014 (UTC)

If you don't know what that means you shouldn't be editing this site. This is the one for English speakers. ~ R.T.G 14:19, 2 September 2014 (UTC)

You can take your snarky passive-aggressiveness and shove it where it belongs. --benlisquare_T•C•E 14:25, 2 September 2014 (UTC)

I'm sorry, something was said before about my being unintelligible, and User:Ryulong is following me around from an unrelated discussion. I responded without reading the whole of your statement. Of course it doesn't mean that. I move to you, as in all here, to close this case. The site's notability is as dodgy as its insertion on this site. Why are we even discussing it? The only thing I saw in the comments above about content that would be lost are mathematical calculations that have been oversighted from astronomical observatories. Why is there any kind of an issue about this? I haven't seen a statement saying the importance to content. It's a non-starter. It's just a circular argument. Move to close; whether disagreeing parties consider that move expected or not, it is relevant. I come halfway down the page without even finding one example of content. And more of the same, and so on... ~ R.T.G 15:32, 2 September 2014 (UTC)

Cool down a bit. Written content is easily misunderstood. I think RTG's latest comment is somewhat improved. It might be more meaningful to look at. Forbidden User (talk) 17:57, 2 September 2014 (UTC)

By the way, I'm taking a Wikibreak, good day. Forbidden User (talk) 17:57, 2 September 2014 (UTC)

@PaleAqua and Kww: Could you two, or others who know the process, help me oversight those edits where I replaced the 122. IP's signature with mine? Forbidden User (talk) 17:57, 2 September 2014 (UTC)

Talk:Legends_(TV_series)#Call_for_a_vote_on_hatnote_for_this_page ~ R.T.G 10:38, 16 September 2014 (UTC)

Three months

It is now a full three months. Consensus is not clear. Many editors want to see another alternative archive site, but there are serious challenges to this site's suitability for WP, and those challenges have been rebutted only with a willingness to overlook, and that unfortunately is not good enough. The site was added maliciously. The background of the site is a total mystery. These two items alone are enough to close discussion without satisfactory restitution. WP content must be reliably sourced. This applies to all content. That item must be satisfied or there can be no outcome from a discussion. Sorry.

Notes:

There are dozens of approved alternative archive sites. Lists of them are provided by governments and educational resources.
Reasons to depend on sites are essential to verify their usability. Big sites might not answer you on the phone, but it is usually very clear who is profiting from them and what their motivations might be. This is often overlooked for the sake of local informational sites, which are the only sources available for local topics and have a very narrow (local) range of effect. Such sites have at least a moderate amount of dependability, based on the fact that if they promote themselves locally with bad information, word will get around. Those are specialist informational resources. Archive.is is not an informational resource, local or otherwise, and we have no clue who or what the site is about. We cannot change that. The site is a total mystery.
Mystery sites disappear. Sites like this with limited functionality and zero profit margin all go to the same place. In Scandinavia they call it Valhalla. In Indonesia they call it Surga. On WP we call it 404.
Licensing. It is not good enough on matters of dubious permissions to say we don't care or will overlook. 3rd Pillar. We cannot break those for trivialities.
There are no excuses for overlooking such a list of major malfunctions. You can't do it. Oh but, you discussed it and were brave enough to try? Bravery is not the issue here. It is about licensing and reliability. Bravery is the issue to get you from reading to editing, but it is nothing to do with determining the reliability of a mysterious resource. ~ R.T.G 12:27, 28 September 2014 (UTC)

I do not understand: you tried to phone archive.is and they did not answer, didn't you? 79.182.24.214 (talk) 17:47, 28 September 2014 (UTC)

Quote: "There are dozens of approved alternative archive sites" - and not all of them are functional for all cases. As an example, both Webcitation and archive.org fail to archive http://www.jp.playstation.com/psvita/update/ due to the roundabout way the website is coded, however archive.today manages it perfectly fine. Having variety of choice allows us to avoid these functionality problems. "But there are many alternatives!" is never a valid excuse for this very reason. --benlisquare_T•C•E 10:59, 2 October 2014 (UTC)

Functionality is not a topic of reliability. We cannot have all of everything that is on the internet. If you don't know that, you are at the wrong site. There are rules about reliability. We want trust, not trinkets. Facebook will host all of the trinkets you can find and then some. ~ R.T.G 10:26, 3 October 2014 (UTC)

If functionality is not something you seek, then I recommend purchasing a typewriter for all your article-writing needs. Don't tell me what "we" seek, as if you're a know-it-all over here. If you have an opinion, call it an opinion, because it sure as hell ain't fact. The internet is dynamic and ever-changing, there is no reason why we should knowingly limit what we are capable of in terms of archival. If you want "trust", then let me remind you that Archive.org is hardly as "trustworthy" as Archive.today is. Many old articles about the Gamergate controversy are archived on archive.today but not archive.org. Why? Because Zoe Quinn personally contacted the owner of Archive.org to have them deleted. If an archival service is volatile enough to be affected by third-party conflict of interest, then how reliable is it? --benlisquare_T•C•E 10:37, 3 October 2014 (UTC) --benlisquare_T•C•E 10:31, 3 October 2014 (UTC)

And here is a newsflash... None of that establishes a "reputation for accuracy and fact checking". Functionality is not a topic of reliability, and your response to that is nothing but an unreasoned attack. It's a response, but it is not an answer. The facts of trust as relating to archive.org are not relevant to the facts of trust as relating to archive.is. See, you know they don't have the reputation we need, and you know that archive.org is about archive.org, but you want those aspects to fit, so you keep chanting them... And all the while, with your need so great, have you contacted any of the archiving sites that are considered reliable? Well, don't, because they will tell you that the sites archive.is can archive and they cannot are sites that have instructions (robots.txt) not to use scripts. It is because the information that archive.is gives us is taken without the owner's permission. But who cares, right? We can get away with all that, right? We know we can trust them because they gave us cookies... yes? ~ R.T.G 14:12, 3 October 2014 (UTC)

WP:Drop the stick and back slowly away from the horse carcass. What we are using as reliable sources is the original, archived sites. As it happens though, and has already been pointed out, archive.is does a better job of preserving sites than many of its competitors, and is therefore often more accurate than archive.org, rendering sites correctly where other archive sites do not. This has nothing to do with robots.txt. It is all about the way that sites are archived. As has been pointed out, archive.is traverses the site more thoroughly. I was frustrated at the inadequate job that archive.org did with archiving the 2012 Olympics site, a free site that does not use robots.txt. Hawkeye7 (talk) 20:10, 3 October 2014 (UTC)

No, what you are talking about using is the archives. This discussion has nothing whatever to do with the original sources. And no way. The horse I am guarding is the whole stable. And WP:GAVEUS, it is all that you have to say; you are just keeping on saying it in the hope that others will stop pointing out that you are wrong. ~ R.T.G 21:41, 3 October 2014 (UTC)

Three months

It is now a full three months. Consensus has turned toward supporting Archive.today. No serious objections to use of archive.today for WP have been made. Many editors want to have archive.today functionality. The only rebuttal requires a willingness to overlook the damage banning the site is doing to wikipedia and that unfortunately is not good enough. If I don't know the meaning of a big word like restitution, I shouldn't use it. WP content must be verifiable. This applies to all content. There can be no compromise on that based on overblown concerns about who to blame for old behavior or ghosts of the past. (My point being that the section above this one is incredibly biased, and two can play that game.)--{{U|Elvey}} 01:29, 2 October 2014 (UTC)

Example use of Archive.today

I just created an article on near-Earth asteroid 2014 SC324. Since the observation arc is only 2 days long today and will be longer tomorrow, I used archive.today to make a capture of the public domain JPL page that archive.org will not capture. Too bad reference #5 has to be coded as if I am referencing some obscure offline source. -- Kheider (talk) 20:30, 2 October 2014 (UTC)

Your "facts" are irrelevant! I know in my heart that archive.today is the devil incarnate! </sarcasm> --{{U|Elvey}} 21:12, 2 October 2014 (UTC)

Seriously, I tried to bring this RFC to the attention of folks like you trying to get work done by using archive.today, but I was told, effectively, to FUCK OFF and stop even REQUESTING the change when I asked that "However, this policy is under review at Misplaced Pages:Archive.is_RFC_3." be added to the warning folks get when they try to add archive.today URLs. --{{U|Elvey}} 21:12, 2 October 2014 (UTC)

The fact that it is used is not related to the fact that it does not establish reliability. People know it is used, or they wouldn't ask for it not to be used. We could just copy and paste everything on the internet, but that's 4chan, that's Facebook. This site is Misplaced Pages, and it is about reliable encyclopaedic information. ~ R.T.G 10:35, 3 October 2014 (UTC)

Verifiability is also important and it has never been established that archive.today alters content. Webcite appears to be struggling to find funding. Any archive site could disappear in the future and it is foolish to limit our options. Most of the ArchiveToday bashing comes from people that simply despise the user Rotlink. -- Kheider (talk) 12:27, 3 October 2014 (UTC)

Those things, reliability, notability, verifiability, are on a guilty-until-proven-innocent basis on Misplaced Pages. You should know that before you edit the site... I could open a site myself that saves webpages. I save them myself on my desktop. WP:NAIVE ~ R.T.G 13:36, 3 October 2014 (UTC)

Although there is nothing innocent since the fall, you are more than welcome to try to open an alternative website. During these three months of discussion there were two attempts to do it (one by User:Dispenser - see message from 20:40, 17 February 2014 on this page and http://archive.grok.se/ by User:Henrik). Both have failed. 85.64.65.134 (talk) 14:10, 3 October 2014 (UTC)

If we cannot find resources suitable to the mission, then the mission is over and something else begins. There is a difference between not finding resource, and not yet being satisfied to have found the maximum resource. One is about necessary resources, and the other is about greed. ~ R.T.G 22:05, 3 October 2014 (UTC)

Webcite not accessible 7 Oct 2014 but we still want to limit our options by blocking Archive.today? -- Kheider (talk) 17:57, 7 October 2014 (UTC)

No, because having more options to choose from is evil. /sarcasm --benlisquare_T•C•E 08:22, 8 October 2014 (UTC)

Another example of the necessity of archives. Box Office Mojo, probably one of the most cited links across film articles for box office analysis and information decided to just shut down with every single link now just redirecting to IMDB. Were it not for archiving, this, would be this. Except that can still happen since WebCite has funding issues. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:46, 10 October 2014 (UTC)

"Those things, reliability, notability, verifiability, are on a guilty-until-proven-innocent basis on Misplaced Pages." - this is absurd. Any serious encyclopedia must have reliable information. This is not and should not be a crappy website full of hodge-podge. Many times users add wrong information and numbers, and nobody checks whether they are correct for a long time (years). Many times editors change those data and numbers. When the source is dead, how can you tell which number is correct and which one is not? How can we keep Misplaced Pages accurate without verifiability? How are we supposed to be taken seriously without supplying accurate data? — Ark25 (talk) 20:55, 13 October 2014 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.