Date of Award
2025-08-01
Degree Name
Doctor of Philosophy
Department
Mathematical Sciences
Advisor(s)
Abhijit Mandal
Second Advisor
Amy Wagler
Abstract
Social network analysis (SNA) research is often rife with data collection pitfalls, frequently leading to incomplete and missing data. With the growing use of SNA-based research, researchers must address the challenges of missing data and synthetic data generation in these settings. Missing data arise from longitudinal non-response or from non-response to sensitive or difficult-to-answer questions. Synthetic data generation in SNA settings addresses the lack of representation that is often present in large-scale SNA studies. This dissertation investigates synthetic data generation methods to address these challenges and develops a novel algorithm that leverages information from multi-modal data, e.g., databases combining graphical data with participant-level survey data. The synthetic data generation methods incorporate latent variable and stochastic modeling approaches, as well as large language models (LLMs), all of which are well suited to SNA settings. The proposed algorithm is assessed using a variety of synthetic data generation approaches to determine the quality and diversity of the synthetic data. This assessment employs a rigorous set of metrics that are fine-tuned to SNA multi-modal data settings. The results demonstrate that the LLM and stochastic modeling approach outperformed the two latent feature models examined. This outcome potentially stems from the variable mapping in the latent feature models.
Language
en
Provenance
Received from ProQuest
Copyright Date
2025-08
File Size
117 p.
File Format
application/pdf
Rights Holder
Hortencia Josefina Hernandez
Recommended Citation
Hernandez, Hortencia Josefina, "A Multi-Modal Method For Synthetic Data Generation In Social Network Analysis" (2025). Open Access Theses & Dissertations. 4386.
https://scholarworks.utep.edu/open_etd/4386