Date of Award


Degree Name

Master of Science


Computer Science


Natalia Villanueva-Rosales


In any scientific experiment, researchers access, compute, and analyze data to produce information that is useful to the scientific community. To instill trust in such research products, their users need to understand the procedure applied and the assumptions incorporated. Reusing and replicating reliable scientific data requires methods that help users understand the data's origin and derivation process, i.e., its provenance. Although several standards for representing provenance have been recommended, such as the PROV model (a W3C recommendation), scientific communities have not widely adopted them because of the difficulty of aligning these standards with their needs. Provenance usage studies in the literature suggest that this adoption has not improved.

In this research, we propose controlled vocabularies for describing provenance data using three provenance design patterns. These patterns were applied in three domains: Smart Cities, Water Modeling, and Biodiversity Modeling. We evaluate the proposed vocabulary with users of the interdisciplinary, international, USDA-funded water modeling project. The results show that, in general, provenance is important for understanding and trusting a final product. This work provides a building block for creating and evaluating complex provenance design patterns that can be embedded in systems that manipulate data and execute scientific workflows.




Received from ProQuest

File Size

72 pages

File Format


Rights Holder

Smriti Rajkarnikar Tamrakar