Research data archiving: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 13:54, 24 March 2007, by K. Edit summary: Removing poorly cited and highly misleading statement that when x happens (inadequate or nonexistent data), it is "pseudoscience".
Revision as of 13:54, 24 March 2007, by William M. Connolley (minor edit). Edit summary: rv to SW
Line 1: Line 1:
{{Original research}}


Previous revision: '''Scientific data archiving''' refers to the long-term storage of scientific data and methods. Scientific journals and funding agencies generally have policies requiring scientists to store in a public archive any of their data and methods necessary to reproduce their studies. This is considered the best practice because it ensures other scientists can audit the data, replicate the research and build on their findings. The journals often rely on the good intentions of scientists to supply any supplementary data that may be required. The need for data archiving and due diligence is greatly increased when the research deals with health issues or public policy formation. <ref>"The Case for Due Diligence When Empirical Research is Used in Policy Formation" by Bruce McCullough and Ross McKitrick. </ref> <ref> "Data Sharing and Replication" a website by Gary King </ref>

Current revision: '''Scientific data archiving''' refers to the long-term storage of scientific data and methods. Some scientific journals and funding agencies have policies that require scientists to archive their data and methods so other scientists can audit the data, replicate the research and build on their findings. These policies are generally not enforced. The need for data archiving and due diligence is greatly increased when the research deals with health issues or public policy formation. <ref>"The Case for Due Diligence When Empirical Research is Used in Policy Formation" by Bruce McCullough and Ross McKitrick. </ref> <ref> "Data Sharing and Replication" a website by Gary King </ref>

To prevent data loss or corruption, some journals, such as Nature, require that data not be held by the researcher alone. In these cases, data must be archived at a government data center, an accredited independent data library, or the publisher of the journal. <ref>"Availability of Data and Materials: The Policy of Nature Magazine"</ref>


==Policy of NSF in Grant General Conditions==
Line 28: Line 30:
Withholding of data has become so commonplace in academic genetics that researchers at Massachusetts General Hospital published a journal article on the subject. The study found that “Because they were denied access to data, 28% of geneticists reported that they had been unable to confirm published research.” <ref>"Data withholding in academic genetics: evidence from a national survey" by EG Campbell et al. </ref>


===Climate change research===
In 2003, Stephen McIntyre and Ross McKitrick decided to audit the published findings of Michael Mann et al. from an article published in 1998. Dr. Mann refused access to his data and source code.<ref>"Mann on Source Code" by Stephen McIntyre</ref> After a long process, during which the National Science Foundation supported Mann's effort to withhold the code, the code was finally turned over.<ref> "Title to MBH98 Source Code" by Stephen McIntyre </ref> Dr. Mann published a Corrigendum in which he admitted some errors but denied others. <ref>"corrigendum-Global-scale temperature patterns and climate forcing over the past six centuries" by Michael Mann, et al </ref> The criticisms of McIntyre and McKitrick were reviewed by the Wegman Panel <ref>"The Wegman Report" </ref> and the National Academy of Sciences. <ref>"Surface Temperature Reconstructions for the last 2,000 years" by National Academy of Science </ref> McIntyre and McKitrick claim their findings have been largely confirmed by these reviews. <ref>"A Scorecard on MM03" by McIntyre and McKitrick </ref> Mann claims that the errors found made no difference to his conclusions.<ref>See Corrigendum of "Global-scale temperature patterns and climate forcing over the past six centuries" by Mann et al </ref> Without access to the author’s data, methods and source code, a full audit could not have been made.


In 2006, Martin Juckes et al. submitted an article to Climate of the Past, which was then made available for comment on the Internet. The article claimed that the source code used by McIntyre and McKitrick was not archived. McIntyre responded that the accusation was false and might constitute academic misconduct, with an implicit threat of legal action against Juckes and his coauthors. <ref>"Potential Academic Misconduct by the Euro Team" by Stephen McIntyre </ref> False claims regarding data archiving are usually easy to establish. Juckes blamed the inaccurate statement on a misunderstanding. <ref>"Martin's Big Day" by Stephen McIntyre </ref>


==Data archives==

Revision as of 13:54, 24 March 2007

This article possibly contains original research. Please improve it by verifying the claims made and adding inline citations. Statements consisting only of original research should be removed.

Scientific data archiving refers to the long-term storage of scientific data and methods. Some scientific journals and funding agencies have policies that require scientists to archive their data and methods so other scientists can audit the data, replicate the research and build on their findings. These policies are generally not enforced. The need for data archiving and due diligence is greatly increased when the research deals with health issues or public policy formation.

To prevent data loss or corruption, some journals, such as Nature, require that data not be held by the researcher alone. In these cases, data must be archived at a government data center, an accredited independent data library, or the publisher of the journal.

Policy of NSF in Grant General Conditions

36. Sharing of Findings, Data, and Other Research Products

a. NSF expects significant findings from research and education activities it supports to be promptly submitted for publication, with authorship that accurately reflects the contributions of those involved. It expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages awardees to share software and inventions or otherwise act to make the innovations they embody widely useful and usable.

b. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate legitimate interests of investigators.

Policies by journals

Nature: An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols available to readers promptly on request. Any restrictions on the availability of materials or information must be disclosed at the time of submission of the manuscript, and the methods section of the manuscript itself should include details of how materials and information may be obtained, including any restrictions that may apply. One preferred form of disclosure is a link from the methods section to a copy of the relevant Material Transfer Agreement (MTA) form, which is hosted as Supplementary Information on the journal's web site. Authors may charge a reasonable fee to cover the costs of producing and distributing materials. If materials are to be distributed by a for-profit company, this should be stated in the paper.
Any supporting data sets for which there is no public repository must be made available to referees at submission and any interested reader on and after the publication date from the authors directly, the author providing a URL to be used in the paper on publication.
Such material must be hosted on an accredited independent site (URL and accession numbers to be provided by the author), or sent to the Nature journal at submission, either uploaded via the journal's online submission service, or if the files are too large or in an unsuitable format for this purpose, on CD/DVD (five copies). Such material cannot solely be hosted on an author's personal or institutional web site.
After publication, readers who encounter a persistent refusal by the authors to comply with these guidelines should contact the chief editor of the Nature journal concerned, with "materials complaint" and publication reference of the article as part of the subject line. In cases where editors are unable to resolve a complaint, the journal reserves the right to refer the correspondence to the author's funding institution and/or to publish a statement of formal correction, linked to the publication, that readers have been unable to obtain necessary materials or reagents to replicate the findings.
Science: Materials sharing. After publication, all reasonable requests for materials must be fulfilled. A charge for time and materials involved in the transfer may be made. Science must be informed of any restrictions on sharing of materials applying to materials used in the reported research. Any such restrictions should be indicated in the cover letter at the time of submission, and each individual author will be asked to reaffirm this on the Conditions of Acceptance forms that he or she executes at the time the final version of the manuscript is submitted. The nature of the restrictions should be noted in the paper. Unreasonable restrictions may preclude publication.

Controversies involving data archiving

Heart research

Dr. Singh published research regarding heart attack victims, and the research was later questioned. The medical journal investigated for 12 years before deciding the research was probably fraudulent. If Dr. Singh had archived his data and methods prior to publication, the issue might have been resolved more quickly.

Academic genetics

Withholding of data has become so commonplace in academic genetics that researchers at Massachusetts General Hospital published a journal article on the subject. The study found that “Because they were denied access to data, 28% of geneticists reported that they had been unable to confirm published research.”


Data archives

References

  1. "The Case for Due Diligence When Empirical Research is Used in Policy Formation" by Bruce McCullough and Ross McKitrick.
  2. "Data Sharing and Replication", a website by Gary King.
  3. "Availability of Data and Materials: The Policy of Nature Magazine".
  4. "National Science Foundation: Grant General Conditions (GC-1)", published April 1, 2001 (page 17).
  5. "Availability of Data and Materials: The Policy of Nature Magazine".
  6. "General Policies of Science Magazine".
  7. "Medical Journal Editor Finds Truth Hard to Track Down", published by Alliance for Human Research Protection.
  8. "Data withholding in academic genetics: evidence from a national survey" by EG Campbell et al.

Literature

  • Gauch Jr Hugh G (2002) Scientific Method in Practice, Cambridge University Press
  • Popper, KR (1959) "The Logic of Scientific Discovery" (English translation, 1959)
  • Wilson F (2000) The Logic and Methodology of Science and Pseudoscience, Canadian Scholars Press

External Links

  • Statistical checklist required by Nature
  • The US National Committee for CODATA
  • Studies examine withholding of scientific data among researchers, trainees
  • The Role of Data and Program Code Archives in the Future of Economic Research
  • Data sharing and replication – Gary King website
  • Some thoughts on disclosure and due diligence in climate science
  • The Case for Due Diligence When Empirical Research is Used in Policy Formation by McCullough and McKitrick
  • Thoughts on Refereed Journal Publication by Chuck Doswell
  • “How to encourage the right behaviour” An opinion piece published March, 2002.
  • “The Selfish Gene: Data Sharing and Withholding in Academic Genetics” by Eric Campbell and David Blumenthal published May 31, 2002.