Date of Award


Degree Name

Doctor of Philosophy


Computer Science


Natalia Villanueva Rosales


Many ground-breaking scientific experiments require the execution of multiple complex scientific computations. Consequently, scientific workflows (i.e., sequences of scientific computations), and in particular their automated composition, have received significant attention. Scientific workflows that repurpose data may carry unique scientific assumptions that need to be considered when composing a workflow. Workflow composition tools have enabled a wider range of stakeholders (e.g., policymakers, the general public, and researchers) to create and execute workflows; however, domain expertise is still required for these tasks. The overarching goal of this work is to further improve the automatic composition of scientific workflows by validating whether the scientific assumptions made during the creation of a dataset are aligned with the scientific assumptions required to use that dataset for a specific scientific computation. This work aims to answer the following research questions: (1) How can metadata and provenance be used to describe the scientific assumptions of data consumed by scientific computations, in order to improve automated scientific workflow composition and the repurposing of data? (2) To what extent can current Artificial Intelligence (AI) planning techniques with a heuristic function be used to formulate a scientific workflow that considers scientific assumptions in a hydrology domain? Our initial work focused on exploring automatic workflow composition with components that require and produce multiple scientific variables for an abstract (i.e., domain-free) case study using graph traversal. In addition, a second case study was conducted for a real-world hydrology scenario, which provided insights into how scientific assumptions could be described to enable model-to-model integration.
In both the abstract and the real-world scenarios, we use a domain-independent vocabulary to represent workflows, enabling interoperability between different workflow management systems. We extended existing, widely used ontologies and vocabularies to describe the scientific assumptions that are used in the automated composition of workflows. In addition, we propose a heuristic function for optimizing the algorithm. Our work aims to support scientific decision-making by enabling a wider range of stakeholders (e.g., policymakers, the general public) to automatically generate scientific workflows that leverage additional domain knowledge captured in metadata and that can be executed in frameworks compatible with the standard workflow language used in this work.




Received from ProQuest

File Size

88 p.

File Format


Rights Holder

Raul Alejandro Vargas Acosta