Importance of Data Provenance in Scientific Research

What is the Purpose of Data Provenance?

Question images
[Scientific Questions]. (n.d.). Retrieved March, 2019, from https://pixabay.com/

Science requires transparency and verifiability, and scientists must always ask:

  • Is this data trustworthy?
  • Is this data authentic?
  • From where did the data originate?
  • Is this data accurate?
  • How was this data generated and processed?
  • How has the data been changed over time?
  • What other data was used to calibrate, validate, and process this data?[2]
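
One way to keep the answers to these questions alongside a data set is to store them in a small provenance record. The sketch below is only an illustration under stated assumptions: the class names, field names, and sample values are all hypothetical and do not come from any particular provenance standard or tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProcessingStep:
    """One change made to the data (hypothetical structure)."""
    description: str  # what was done to the data
    agent: str        # who or what made the change
    timestamp: str    # when it happened, ISO 8601 in UTC


@dataclass
class ProvenanceRecord:
    """Answers 'where did it come from' and 'how was it changed' (hypothetical)."""
    dataset_id: str
    source: str                                  # where the data originated
    inputs: list = field(default_factory=list)   # other data used to calibrate, validate, or process it
    history: list = field(default_factory=list)  # ordered list of ProcessingStep entries

    def add_step(self, description, agent):
        """Append a timestamped processing step and return it."""
        step = ProcessingStep(description, agent,
                              datetime.now(timezone.utc).isoformat())
        self.history.append(step)
        return step


# Example usage with made-up values.
record = ProvenanceRecord(dataset_id="example-dataset-001",
                          source="example field instrument",
                          inputs=["example-calibration-table"])
record.add_step("removed rows with missing readings", agent="example analyst")
```

Because every step is appended rather than overwritten, the record keeps the full history of changes over time, which is what several of the questions above ask for.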

Duplicate Results
[Duplicated Results]. (n.d.). Retrieved March, 2019, from https://pixabay.com/
Tracking provenance for research data is vital to science, providing answers to questions researchers pose when sharing and exchanging data:

  • Where did the data come from?
  • Who modified it?
  • Can this data be trusted?
  • Can this data be reproduced?
  • Is this copy of data the same as the copy I deposited?
  • In what way is it the same?
  • How do I resolve discrepancies or anomalies?[2]
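
The question of whether a copy of the data is the same as the deposited copy is commonly answered by comparing cryptographic checksums. The following is a minimal sketch using Python's standard `hashlib` module; the function names are hypothetical, and it assumes that "the same" means byte-for-byte identical files, which is only one possible interpretation.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files contain exactly the same bytes (compared by SHA-256)."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

A depositor could record the digest at deposit time and recompute it on any later copy. A mismatch shows that the bytes differ, but not how or why, so resolving the discrepancy still depends on a recorded history of changes.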

Why is Data Provenance Important?

Data provenance plays an important role in scientific research. The validity, authenticity, and integrity of most experiments hinge on the ability to reproduce their results consistently.

Good science requires more than results. Data provenance supports and validates the following:

  • Reproducibility is necessary to ensure that the results are not an accidental outcome, but the result of genuine, carefully performed experimentation and analysis
  • Verifiability is necessary to ensure that the results were actually derived from the data
  • Credibility, as part of the published data, is necessary to determine whether information can be trusted and how to give credit to originators when reusing information
  • Authentication is necessary to establish that the raw data used in the scientific work is itself valid [1]

References

[1] Data provenance. (n.d.). Retrieved February, 2019, from https://www.ands.org.au/working-with-data/publishing-and-reusing-data/data-provenance
[2] Hills, D. J., Downs, R. R., Duerr, R., Goldstein, J. C., Parsons, M. A., & Ramapriyan, H. K. (2015, December). The Importance of Data Set Provenance for Science. Retrieved February, 2019, from https://eos.org/opinions/the-importance-of-data-set-provenance-for-science