talk:Link rot: Difference between revisions - Misplaced Pages

Revision as of 02:10, 17 September 2012 by Lexein (Archive.is: new section)
Latest revision as of 10:42, 26 December 2024 by Cewbot (Maintain {{WPBS}}: 2 WikiProject templates. Fix Category:WikiProject banners with redundant class parameter)
(575 intermediate revisions by more than 100 users not shown)
{{Talk header|noarchives=yes|search=no}}
{{Talkpage}}
{{WikiProject banner shell |collapsed=no |1=
{{essaysort|importance=mid}}
{{Help Project}} {{Misplaced Pages Help Project|importance=Mid}}
{{WikiProject Misplaced Pages essays|importance=Mid}}
{{archive box|search=yes|]<br>]}}
}}

{{Archive box|search=yes|]<br>]<br>]<br>]<br>]}}
== Internet Archive ==

The ] doesn't seem to have archived anything since about August 2008. What does this mean for dead links that should have been archived since then? <font face="Tahoma">] <small>(])</small></font> 14:36, 23 January 2010 (UTC)
:See ]: "Snapshots become available 6 to 18 months after they are archived." -- ] (]) 20:55, 23 January 2010 (UTC)

==removing a dead link?==
If I find a dead link and I don't feel like fixing it, is it cool to remove it, esp. if I think the claim it supported was kinda retarded anyway? --] ] 18:52, 26 January 2010 (UTC)
:uh... not with reasoning like that, no. 'kinda retarded' does not qualify as an objective assessment of the merits of the link, since other editors can easily say 'it aint so retarded' - an equally valid statement without any further evidence. if you're just fixing linkrot, fix the link or flag it for others; if you want to get involved with content editing (to remove 'retarded' content) go ahead and do it explicitly as an edit; don't call it a linkrot fix. --] 20:25, 26 January 2010 (UTC)
::Tag it with a {{tl|deadlink}}.--] (]) 22:17, 26 January 2010 (UTC)

== Dead link vs. linkrot ==

What's the difference, exactly? ] (]) 20:56, 30 January 2010 (UTC)

:There isn't one. "Linkrot" is a term used to describe the phenomenon of good links going dead over time. --] (]) 04:40, 4 February 2010 (UTC)

== Linkrot and sustained notability ==

If an article is at first supported by a series of links to establish notability, and 100% of those links go bad, does that mean that in some cases the subject of the article can be considered not notable and the article be deleted? ] (]) 16:21, 18 February 2010 (UTC)
*It shouldn't happen. Notability is forever, even if all of the links go bad. That's why it's probably a good idea to use a ], so that there is plenty of documentation about the former link. Also, check out ], which would apply to dead links. --] (]) 19:16, 18 February 2010 (UTC)

:*I have seen articles get put up for deletion on the basis that all the links have gone bad, and the noms use the ] argument to support their cause, while those who support keeping cannot prove it. Those favoring deletion do not buy the ] argument in these cases. ] (]) 15:23, 23 February 2010 (UTC)

:*Editors delete content all the time based on dead links. The Orwellian memory hole lives, and it lives here in the Misplaced Pages. I think it is a huge problem. --] (]) 23:18, 4 March 2010 (UTC)

== Archiving British web pages ==

The following ] article explores some of the problems regarding archiving British web pages: These problems affect the strategy used here. ] (]) 19:23, 6 March 2010 (UTC)

:Does it though? A web archiving service acts on the laws of its resident country, not on those of the site it is archiving (as I understand the law). So archive.org and WebCite are fine, and that is their concern anyway; only if "we" (the WMF) were to set up our own archiving server in the UK would "we" be affected (as I understand it). - ]&nbsp;<sup>]? ].'']</sup> 11:22, 7 March 2010 (UTC)

::That is not how I understand it. In fact, the article itself mentions that this is a problem for organizations like the Internet Archive, which hosts the Wayback Machine. It affects us because, as part of our strategy, we specifically recommend using tools, such as the Wayback Machine, which are affected by this law. I'm not asking for a change in the article--I just wanted to make people aware that the Wayback Machine isn't a magic bullet in the effort to help stave off linkrot. ] (]) 21:28, 8 March 2010 (UTC)

== External links are not references ==

Just to explain my recent changes:

You should (almost) never remove this:

==References==
*

You should cheerfully remove this:

==External links==
*

It is not possible to justify a dead "External link" under ]. ] (]) 18:32, 18 March 2010 (UTC)

:You are correct. However, caution should certainly be used, since inexperienced users often put sources they've used as a reference under "External links". --] (]) 20:30, 29 March 2010 (UTC)

: Besides this, it seems to me that an archived copy of an external link may well be a good replacement for the original (as it is for a reference), so linking to such a copy (if available) is preferable to simply removing the link. ] (]) 16:30, 3 January 2012 (UTC)

== linkrot vs. stability, e.g. News Corp vs. Fairfax in Australia ==

I've noticed that links to many articles published by News Corp in Australia are especially susceptible to linkrot, whereas links to articles in the Fairfax papers, The Age and the SMH, are quite solid. If there were enough evidence to support my statement, would WP ever have a guideline such as "Use paper X, Y, Z, if possible, instead of P, Q, as these are less susceptible to linkrot?" ] (]) 20:52, 1 April 2010 (UTC)

:We do advise against using Yahoo news stories (which typically decay within weeks), so it is certainly possible. --] (]) 02:01, 11 April 2010 (UTC)

::Of course, nobody reads the directions, so I wouldn't get my hopes up, but you're certainly welcome to include the advice. ] (]) 03:44, 11 April 2010 (UTC)

== Archiving every reference? ==

Is it suggested that we should archive every reference used in our articles? I see there's a WebCiteBOT, but I've never seen it in action, and certainly not on any article I've worked on. I just recently lost a very important reference and I'm still trying to work on finding a fix (contacting the editors, etc.). This was a great lesson to me about link rot, but now I'm wondering if I'm supposed to archive ''every'' reference I use? – <i><font color="#104FFF">Ker</font><font color="#187FFF">αun</font><font color="#18AFFF">oςc</font><font color="#18DFFF">op</font><font color="#18FFFF">ia<sup>◁</sup></font><sub><font color="#5E1FFF">]</font></sub></i> 20:48, 9 June 2010 (UTC)

:Quite simply put WebCite cannot handle the volume that Misplaced Pages provides, even the small run of 10-50 PDFs a night by Checklinks seems to be contributing to the problem. — ] 22:15, 9 June 2010 (UTC)

::That I suppose would explain the bot, but what about manual submissions to the archive? Should I just archive references as I see fit? WayBack's six-month lag seems to be a bit of a long wait considering some website pages disappear in only a few weeks. – <i><font color="#104FFF">Ker</font><font color="#187FFF">αun</font><font color="#18AFFF">oςc</font><font color="#18DFFF">op</font><font color="#18FFFF">ia<sup>◁</sup></font><sub><font color="#5E1FFF">]</font></sub></i> 22:18, 9 June 2010 (UTC)

::Is this still the case? It could affect . &nbsp; — '''<span style="background:Yellow;font-family:Helvetica Bold;color:Blue;">] ]</span>''' 04:24, 29 February 2012 (UTC)

== Impossible archiving ==

Some cited sources use various forms of presentation, including streaming audio (sometimes integrated within a ''written'' interview), streaming video, and, especially in the case of ''Billboard'''s website, flash or some similar method of loading articles. These sites can't be archived at all. Without transcripts published elsewhere, these sites seem to me to be absolutely vulnerable to link rot. – <i><font color="#104FFF">Ker</font><font color="#187FFF">αun</font><font color="#18AFFF">oςc</font><font color="#18DFFF">op</font><font color="#18FFFF">ia<sup>◁</sup></font><sub><font color="#5E1FFF">]</font></sub></i> 19:04, 12 June 2010 (UTC)

dafuq? Impossible to archive a .mov file? Or a .mp4 file? Or a .swf file? This is usually no problem... (although most search engines CHOOSE not to do it, but it's entirely optional.) ] (]) 21:13, 2 May 2012 (UTC)

==Link Rescue Bots==
Two new bots have just been approved to find archives for dead links. ], the first one, is written and operated by ]. It has gone through all the featured articles, and has made a large dent in the good articles. However, due to some small technical difficulties, it is down for the moment. ] is written and operated by ]. It does pretty much the same thing. As the two bots finish up the ] and the ], I think we will do articles by request. Any ideas of which articles we could let the bots run on next? (Categories are good) ]<font color="Red" face="Optima" >]</font> <sup><font face="Times new roman" size = 2 >]</font></sup> 17:12, 15 June 2010 (UTC)
:I'd say ] and then all ] that are B-class and below, that is if the bot is able to make that distinction. -- ]] 02:25, 21 July 2010 (UTC)

==blogs.nzherald.co.nz==

URLs http://blogs.nzherald.co.nz will cease 301 redirecting to URLs on http://www.nzherald.co.nz shortly. Checking my logs I note that a few articles have references/links to articles on blogs.nzherald.co.nz ( such as ] ). These should be updated as soon as possible. The equivalent articles should still exist but will be harder to find after the redirect is gone. Could somebody please inform a bot operator. I have no idea how many links are in place. - NZH Admin <span style="font-size: smaller;" class="autosigned">—Preceding ] comment added by ] (]) 03:46, 10 August 2010 (UTC)</span><!-- Template:UnsignedIP --> <!--Autosigned by SineBot-->
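A bot operator would first need to find the affected links. The following is only a minimal sketch of that step, assuming the article wikitext is already available as a string; the function name `links_on_host`, the regular expression, and the example paths in the usage note are illustrative assumptions, not part of any existing bot.

```python
import re
from urllib.parse import urlsplit

# Matches bare http(s) URLs in wikitext; stops at whitespace and at
# characters that usually delimit a URL in wikitext ([ ] { } | < > ").
URL_RE = re.compile(r'https?://[^\s\[\]{}|<>"]+')


def links_on_host(wikitext, host="blogs.nzherald.co.nz"):
    """Return the URLs in wikitext whose hostname equals host, in order."""
    found = []
    for match in URL_RE.finditer(wikitext):
        # Drop trailing punctuation that is unlikely to belong to the URL.
        url = match.group(0).rstrip(".,;:)")
        if urlsplit(url).hostname == host:
            found.append(url)
    return found
```

For example, calling `links_on_host` on text that contains both a hypothetical `http://blogs.nzherald.co.nz/blog/foo/` link and a `www.nzherald.co.nz` link returns only the former. Mapping each old URL to its new location is a separate step that this sketch does not attempt.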

== Web Link Checking Bot ==

Hi, I'm currently running a bot on my server against Misplaced Pages to check the external links, using pywikipediabot and the included weblinkchecker.py script. What this bot does is scan the contents of articles for external links, and then proceeds to check the links for 404s or timeouts, and creates a datafile of the non-working links. After about one week, the bot will then recheck the links, and report on the talk pages of the articles which links are dead, according to the data that the bot collected. In the report submitted, the bot will automagically suggest a link to archive.org, which if it was caught, should be a valid archived version of the link. The reason for my post here is to request input from the community, per the suggestion of ] in ]. I am watching both this page, and the BRFA thread, so commenting at either location is ok, and your input is greatly appreciated. Thanks, ] (]) 14:34, 17 August 2010 (UTC)
:On dewiki we decided that a delay of at least 4 weeks and 3 tests are required, because many links come back online 2-3 weeks after a change of hosting service. However, the script in the repository has some bugs you should be aware of. You could test the script on this page:
:*
:* http://www.stiftung-denkmal.de/dasdenkmal/stelenfeld
:* http://www.musculus.de.tf/Czernowitz/cz_strassenreferenz.html
:* http://torrihigginson.info/
:which report errors on all four links above. ]] 16:13, 17 August 2010 (UTC)
::Thanks for the input. Do you know if there is an updated version of the script that has the bugs fixed? ] (]) 16:45, 17 August 2010 (UTC)
:::No, I never used this script. I only know the response from dewiki, where we have a template which users can use to mark failed dead link bot reports. ]] 17:25, 17 August 2010 (UTC)
::What bugs are meant by "the script on repository has some bugs you should care about."? &nbsp; — '''<span style="background:Yellow;font-family:Helvetica Bold;color:Blue;">] ]</span>''' 04:57, 17 January 2012 (UTC)
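The dewiki rule mentioned above (at least 3 tests spread over at least 4 weeks before a link is reported) can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `should_report_dead`, `MIN_FAILED_CHECKS` and `MIN_FAILURE_SPAN` are hypothetical, and this is not how weblinkchecker.py itself stores or evaluates its data.

```python
from datetime import datetime, timedelta

# Thresholds taken from the dewiki rule quoted in this thread.
MIN_FAILED_CHECKS = 3
MIN_FAILURE_SPAN = timedelta(weeks=4)


def should_report_dead(checks):
    """Decide whether a link should be reported as dead.

    checks is a list of (datetime, ok) tuples, oldest first, where ok is
    True if the link responded normally. A link is reported only if its
    most recent run of consecutive failures contains at least
    MIN_FAILED_CHECKS checks and spans at least MIN_FAILURE_SPAN.
    """
    failures = []
    for when, ok in reversed(checks):
        if ok:
            break  # a success resets the failure streak
        failures.append(when)
    if len(failures) < MIN_FAILED_CHECKS:
        return False
    # failures is ordered newest first.
    return failures[0] - failures[-1] >= MIN_FAILURE_SPAN
```

A single successful check in between resets the streak, which covers the case of a site that comes back after a change of hosting service.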
:How can I help? I'm interested in helping with any automated deadlink detection/mitigation. Since '']'' stopped archiving as of late 2008, checking it is necessary, but not sufficient. Automated checking of, and pre-emptive archiving with, ] is needed, IMHO (or ''other service'', especially for pages poorly captured by Webcitation - conditionals, Javascript, AJAX, etc have problems). I'm in favor of an on-demand full-rendered-web-page screengrab service, or an as-rendered-html+CSS-only service if one exists - these seem to be the only way to simultaneously guarantee pixel accuracy and actual content presence. Of course, respecting robots.txt. --] (]) 01:35, 12 September 2010 (UTC)
:: We mostly need people filling out references. Currently ] is probably the best in filling out references, but I haven't updated it with the feedback/learning mechanisms and the WebCite interface is a bit hard to use. You can also use ] to semi-automatically fix links. — ] 22:37, 12 September 2010 (UTC)
:::I know and use those tools frequently, but I would certainly participate in revising and betatesting semi-auto tools which help as well. --] (]) 23:18, 13 September 2010 (UTC)

I have ] in for such a bot, and could use some responses at ]. &nbsp; — '''<span style="background:Yellow;color:Blue;">] ]</span>''' 03:02, 23 March 2011 (UTC)
:My request for responses linked above has moved . &nbsp; — '''<span style="background:Yellow;font-family:Helvetica Bold;color:Blue;">] ]</span>''' 20:57, 21 January 2012 (UTC)
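The step of suggesting an archive.org copy for a dead link could look roughly like the sketch below. It assumes the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) and its JSON response shape, a service that may postdate the bot discussed in this section; the helper names `availability_query` and `closest_snapshot` are hypothetical. No network request is made here, so the response text has to be fetched separately.

```python
import json
from urllib.parse import urlencode


def availability_query(url, timestamp=None):
    """Build the query URL for the Wayback Machine availability endpoint.

    timestamp is an optional string such as "20100817" asking for the
    snapshot closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from a JSON response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

Returning `None` when no snapshot exists lets the caller fall back to tagging the link as dead instead of suggesting a replacement.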

== Solution against the broken external_links: backup the Internet ==

Please find ]. ] (]) 09:53, 3 September 2010 (UTC)

== Marking a dead link within a citation template ==

How is one to mark a dead link within a citation template, e.g.:
* {{cite web|url=http://www.gujratpolice.gov.pk/user_files/File/SOP_For_Employment_of_Elite_Force.pdf|title=Gujrat Police official website, Standard Operating Procedures|accessdate=2009-03-08}}
I did a hack by adding <tt>|publisher=<nowiki>{{Dead link}}</nowiki></tt> into the template, but that may not be the preferred way to do this. __] (]) 16:33, 5 September 2010 (UTC)
:It's better not to do so, but rather follow the }} with <code><nowiki>{{dead link|date=September 2010}}</nowiki></code>.
:* {{cite web|url=http://www.gujratpolice.gov.pk/user_files/File/SOP_For_Employment_of_Elite_Force.pdf|title=Gujrat Police official website, Standard Operating Procedures|accessdate=2009-03-08}}{{dead link|date=September 2010}}

:Yes, it seems to look odd, but I believe it's best practice for "deadlink" to always appear as the last text on a citation or link line. Of course, make an attempt to repair with Checklinks, too... --] (]) 18:51, 5 September 2010 (UTC)
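The placement recommended above (the tag goes directly after the closing }} of the citation) can be sketched as a small text transformation. This is a simplified illustration, not an existing tool: `tag_dead_link` is a hypothetical name, and the function assumes that no nested template appears between the URL and the end of the citation.

```python
def tag_dead_link(wikitext, url, date):
    """Append {{dead link|date=...}} after the citation containing url.

    The text is returned unchanged if the URL is not found, if no closing
    }} follows it, or if the citation is already followed by a dead link
    tag. Assumes no nested templates between the URL and the closing }}.
    """
    pos = wikitext.find(url)
    if pos == -1:
        return wikitext
    end = wikitext.find("}}", pos)
    if end == -1:
        return wikitext
    end += 2  # move past the closing braces
    if wikitext[end:].lstrip().lower().startswith("{{dead link"):
        return wikitext  # already tagged
    return wikitext[:end] + "{{dead link|date=" + date + "}}" + wikitext[end:]
```

Because an already tagged citation is left alone, running the function twice on the same text does not add a second tag.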

== All links eventually go bad ==

I think that in the fullness of time, on geologic time scales, all links will go bad. This is simply because those who sponsor such web pages will ultimately die off. Web servers will be lost in fires and floods. Misplaced Pages administration needs to recognize this reality. The future expansion of Virtual Servers with NO PERSISTENT STATES will only make this worse. Please see ]. There are many Misplaced Pages editors who delete content that has a dead link, and use ] to make a point. Most editors are too lazy to go to the library to verify older information, and just delete things. It is hard to maintain "presumption of good faith" when undereducated editors are denying a lot of history. Look at this example: ]. We can see that Beeblebrox, by all accounts a good wikipedian, justified a delete because the Library was too far away. Misplaced Pages should not exist at the convenience of the editors, but should exist in the service of truth. Perhaps there can be some kind of "grandfathering" clause on links. Perhaps, I would suggest, that if a link exists for a long enough period of time, that the standard of proof should shift from the creators/maintainers to those who would delete. In other words, if the link was there for a number of years, and then it rotted, then the link would be "presumed valid" instead of the present case, where it seems to be presumed a fabrication of someone's imagination. This way, the content in Misplaced Pages could age gracefully, becoming more authoritative as it got older. This feels more proper to me. This would be a good alternative to the present case where good content is deleted willy nilly by those who would deny history, simply because it is hard to verify. <small><span class="autosigned">— Preceding ] comment added by ] (] • ]) 03:30, 20 December 2010 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
:You seem to have declared everyone's opinion on a single incident. The closing administrators should be experienced enough to separate valid reasons from invalid reasons. The content was not lost, it was merged. Verifiability is a principle of Misplaced Pages, and ''the reader'' cannot verify the material if the website rotted years ago. That's why we have this page. Given you posted here, is there an actual change/removal/addition you propose to this guideline? The "more authoritative as it gets older" will in my opinion not pass. —&nbsp;<small>&nbsp;]&nbsp;&nbsp;▎]</small> 10:55, 20 December 2010 (UTC)
::I don't mean to impugn everyone. What I am proposing is not a reduction in verifiability. Misplaced Pages must remain verifiable, of course. But the system we have now is that overzealous and undereducated editors will deny history, simply because the links have rotted. They are too lazy to verify content, so they delete it. They do it because "the library is 250 miles away", and they cannot just pop over there. I am making the suggestion that this is wrong and bad. Misplaced Pages ought to do something about the very long term problem of rotted links, because all links eventually will rot. ] seems to show this as an accelerating problem. As links rot through distant time scales, under the present system, the whole of wikipedia will have to be slowly rewritten. I think this is revisionist history, and it is objectionable to me. It can lead to history being manipulated by those who control search engines. Of course, you all might think I'm wrong. Whatever. I intend it only as food for thought. I am not declaring everyone's opinion on a single incident. I see a pattern here of editors denying history and deleting content, simply because they see the verification as too much work. I see it all the time. It is as if the Orwellian memory hole lives. Editors will chuck all content without a valid link, even if the link was good in the past. They do this despite the wikipedia policies expressly forbidding it. --] (]) 02:51, 21 December 2010 (UTC)<small><span class="autosigned">— Preceding ] comment added by ] (] • ]) 02:40, 21 December 2010 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
:::If their actions are against policy, then their edits should be reverted. If their good-faith edits are against policy or guidelines, then they should be educated. If they remove previously undisputed content because a link is bad, they should be informed not to do this. I don't see what solution you propose for the hyperbolic problems you are describing. Misplaced Pages has a strong bias towards electronic sourcing, because frankly websites are easy to access without driving 150 miles to the library. As far as actual record of history is concerned, there is much much written material elsewhere that doesn't "linkrot". —&nbsp;<small>&nbsp;]&nbsp;&nbsp;▎]</small> 10:25, 21 December 2010 (UTC)
::::So, my thoughts are meaningless drivel? To be chucked into the ether? No, the problem is much worse than you are even able to comprehend. Your unshakable defense of the status quo blinds you to the fact that there is a problem, much less lets you forge a solution. You admit there is a bias, and yet fail to point to any solution at all. And when one is put forward as food for thought, not a serious proposal, you dismiss it as hyperbole. And then you make the astonishing claim that Misplaced Pages doesn't matter, because the "actual record of history" lies elsewhere. I guess that Misplaced Pages will overcome all of these problems someday. I was just trying to help.--] (]) 00:52, 22 July 2011 (UTC)
:::::It seems you have misinterpreted every sentence I said to the level of ]. Personally, having run a bot that tags and replaces thousands of dead links, I do not see a need to explain my stance or motivation if my replies are misinterpreted anyway. —&nbsp;<small>&nbsp;]&nbsp;&nbsp;▎]</small> 07:32, 22 July 2011 (UTC)

== Solving link rot problem ==

We are working to solve the link rot problem ]. We would like everybody to ]. Thanks - ''''']''''' (]) 14:25, 6 February 2011 (UTC)

== Conflict between guidelines ==

This guideline and ] give conflicting advice about dealing with dead links used to support article content. Please join the conversation at ]. ] (]) 22:12, 17 February 2011 (UTC)

:The lengthy conversation has closed, and I have updated the advice at ]. If anyone wants to check over this page and improve its contents, please feel free. ] (]) 19:43, 28 March 2011 (UTC)

== Proposal for new WikiProject to repair dead links ==
Just a notice for anyone who's interested. ]. -- ]] 06:39, 20 April 2011 (UTC)

== A new WebCiteBOT ==

Hi all. I'm working on a new WebCiteBOT. I have opened ]. It is ] and written in ]. I hope we can work together on this. Archiving regards. ] (]) 17:15, 21 April 2011 (UTC)

== RfC to add dead url parameter for citations ==

A relevant RfC is in progress at ]. Your comments are welcome, thanks! —&nbsp;<small>&nbsp;]&nbsp;&nbsp;▎]</small> 10:49, 21 May 2011 (UTC)

== Simple answer ==

Use more print references...

Obvious really. Misplaced Pages is a joke if it leans too heavily on the web alone.--] (]) 16:32, 10 August 2011 (UTC)

:If only more people were aware of the fact that references don't have to be online.. we should promote ] more.. -- ]] 15:58, 16 August 2011 (UTC)

::But they're so eeeeeeasy! But seriously, in practice, there's a balance to be struck. Some editors such as Cirt have created articles which are fantastically sourced, but completely offline, leaving out ''all'' convenience links. I don't know why; it may be due to the research tools he uses, which, though deep, are not at all accessible to non-subscribers. ''Very'' annoying.
::Over at ] I finally twigged to '''Bare ] harms ].''' Seems I don't care so much if a link rots if it has been properly, verifiably expanded. --] (]) 17:23, 16 August 2011 (UTC)

== Extension:ArchiveLinks ==

http://www.mediawiki.org/Extension:ArchiveLinks

Is it possible to ask WMF to enable (maybe also finish) this wonderful extension? ] (]) 10:20, 10 January 2012 (UTC)

== Incompatibility with ] (even if that is linked here) ==

This page (]) ''states in its lead section that "These strategies should be implemented in accordance with Misplaced Pages:Citing sources#Preventing and repairing dead links, which describes the steps to take when a link cannot be repaired."''

But how can we act in accordance with Misplaced Pages:Citing sources#Preventing and repairing dead links if some sentences in this page's lead section (for example "Do not delete factual information solely because the URL to the source does not work any longer. WP:Verifiability does not require that all information be supported by a working link, nor does it require the source to be published online.

Except for URLs in the External links section that have not been used to support any article content, do not delete a URL solely because the URL does not work any longer. Recovery and repair options and tools are available.") and the whole "Keeping dead links" section ''are incompatible with that page?''

Does the explicit instruction to "implement in accordance with Misplaced Pages:Citing sources#Preventing and repairing dead links" mean that that page is predominant? --] (]) 22:25, 8 February 2012 (UTC)

== Archive.is ==

I think we should go slow on advocating http://archive.is. The field is littered with defunct archive sites - just look at this article history. Archive.is looks good, very good in fact, and its performance and coverage of essentially all used sources is very encouraging. But IMHO Misplaced Pages can't afford to depend on a brand new site which so far, discloses no public information about its funding, affiliation, or future. I have communicated with the owner, and I am confident the owner is acting in good faith, but it's a solo effort. I'd like to see if the site is here in a year. In the meantime, I would like to advocate using WebCite ''in parallel'' with Archive.is, meaning at least archiving at WebCitation, if not citing in ref. I hope this is received as a sensible precaution, in the best interest of Misplaced Pages's future source verifiability. --] (]) 02:10, 17 September 2012 (UTC)

This is the talk page for discussing improvements to the Link rot page.
This project page does not require a rating on Misplaced Pages's content assessment scale.
It is of interest to the following WikiProjects:
Misplaced Pages Help (Mid-importance): This page is within the scope of the Misplaced Pages Help Project, a collaborative effort to improve Misplaced Pages's help documentation for readers and contributors. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. To browse help-related resources, see the Help Menu or Help Directory, or ask for help on your talk page and a volunteer will visit you there.
Misplaced Pages essays (Top-impact): This page is within the scope of WikiProject Misplaced Pages essays, a collaborative effort to organize and monitor the impact of Misplaced Pages essays. If you would like to participate, please visit the project page, where you can join the discussion. For a listing of essays, see the essay directory. This rating was automatically assessed using data on pageviews, watchers, and incoming links.

Archives

Archive 1 (2005-2007)
Archive 2 (2008-2009)
Archive 3 (2010-2014)
Archive 4 (2015-2021)
Archive 5 (2022-)
