Date of Award

2023-05-01

Degree Name

Doctor of Philosophy

Department

Computer Science

Advisor(s)

Monika Akbar

Abstract

Early detection is crucial to mitigate the impact of emerging threats. This work proposes four innovative frameworks that build machine learning and deterministic epidemiological models using multiple domain-specific datasets to detect the onset of emerging threats in two domains: infectious diseases and cybersecurity. Our models are designed to detect infectious disease outbreaks, model their spread, detect malware activity, and analyze the relationship between software/hardware weaknesses and attack techniques.

First, we present a novel framework to detect multiple infectious disease outbreaks by integrating standardized disease-specific domain knowledge and public search trend data. Using people's search data, our framework showed high performance in identifying outbreaks of infectious diseases that are among the leading causes of illness and death in the United States. In addition to detecting outbreaks, studying their spread within a region is equally important. Therefore, we present the SEIRD+m model, which integrates human mobility data into the classical deterministic SEIR epidemiological model to provide a more accurate approach to modeling epidemics. We demonstrated its efficacy using COVID-19 as a case study, showing that restricting mobility only in COVID-19 hotspots can effectively reduce predicted infections and deaths among at-risk populations, including groups defined by race, income, and age.
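The abstract does not give the SEIRD+m equations, so the following is only a minimal sketch of the general idea: a textbook SEIRD compartment model, stepped with forward Euler, in which a hypothetical `mobility` factor between 0 and 1 scales the contact rate. The parameter values, the `State` and `simulate` names, and the way mobility enters the model are all illustrative assumptions, not the dissertation's formulation.

```python
from dataclasses import dataclass


@dataclass
class State:
    s: float  # susceptible
    e: float  # exposed (infected, not yet infectious)
    i: float  # infectious
    r: float  # recovered
    d: float  # deceased


def step(state, beta, sigma, gamma, mu, mobility, dt):
    """Advance a generic SEIRD model by one forward-Euler step.

    Assumption: `mobility` in [0, 1] simply multiplies the contact rate beta.
    """
    living = state.s + state.e + state.i + state.r
    new_exposed = mobility * beta * state.s * state.i / living * dt
    new_infectious = sigma * state.e * dt
    new_recovered = gamma * state.i * dt
    new_deaths = mu * state.i * dt
    return State(
        s=state.s - new_exposed,
        e=state.e + new_exposed - new_infectious,
        i=state.i + new_infectious - new_recovered - new_deaths,
        r=state.r + new_recovered,
        d=state.d + new_deaths,
    )


def simulate(initial, days, mobility,
             beta=0.5, sigma=0.2, gamma=0.1, mu=0.01, dt=0.1):
    """Run the model for `days` days; all rates are illustrative placeholders."""
    state = initial
    for _ in range(int(round(days / dt))):
        state = step(state, beta, sigma, gamma, mu, mobility, dt)
    return state


# Hypothetical population of 10,000 with 10 initial infectious cases.
start = State(s=9990.0, e=0.0, i=10.0, r=0.0, d=0.0)
full = simulate(start, days=120, mobility=1.0)      # unrestricted mobility
reduced = simulate(start, days=120, mobility=0.5)   # mobility halved
print(f"deaths at full mobility:    {full.d:.0f}")
print(f"deaths at reduced mobility: {reduced.d:.0f}")
```

In this toy setting, lowering `mobility` lowers the effective transmission rate, which is one simple way a mobility signal could influence predicted infections and deaths. A region-level model such as the one described would presumably apply different mobility values per location.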

Both infectious diseases and computer malware require timely and accurate detection to minimize their impact. Therefore, we extended our disease outbreak detection framework to detect malware activity over a geographic region. We use natural language processing (NLP) approaches to connect disparate cybersecurity datasets, enabling the development of a machine learning model for detecting malware activity based on people's search trends in a specific location. Our model proved effective in identifying malware activity in four real-world attack case studies. Beyond detecting malware activity, preventing cyberattacks and mitigating their impact requires investigating the properties of software vulnerabilities and how these properties are used to compromise systems. Thus, we propose a framework that leverages NLP techniques to find connections between attack techniques and software vulnerabilities. The effectiveness of our framework is demonstrated through three case studies, highlighting its potential for identifying the possible exploitation of security/software vulnerabilities across multiple software weaknesses.

The approaches presented in this work provide evidence that the integration of domain-specific datasets and user-generated dynamic data can enable the development of highly effective computational models for detecting emerging threats. By leveraging these models, decision-makers can rapidly identify and respond to potential threats, leading to a more efficient allocation of resources. Our work opens up exciting opportunities for further research in this area.

Language

Provenance

Received from ProQuest

Copyright Date

2023-05-01

File Size

File Format

application/pdf

Rights Holder

Ismael Villanueva Miranda

Recommended Citation

Villanueva Miranda, Ismael, "Modeling And Predicting Emerging Threats Using Disparate Data" (2023). Open Access Theses & Dissertations. 3868.
https://scholarworks.utep.edu/open_etd/3868

Included in

Computer Sciences Commons

Open Access Theses & Dissertations

Modeling And Predicting Emerging Threats Using Disparate Data
