Date of Award

2023-12-01

Degree Name

Doctor of Philosophy

Department

Computer Science

Advisor(s)

M. Shahriar Hossain

Abstract

Recent years have seen an exponential increase in unstructured data, primarily in the form of text, images, and videos. Extracting useful features and trends from large-scale unstructured datasets -- such as text from news outlets and scientific papers, and video such as security-camera or body-camera recordings -- poses substantial challenges of volume, scalability, complexity, and semantic understanding. In analyzing trends, comprehending the temporal context is vital for uncovering patterns and narratives that are not apparent from a single video frame or text document. Despite the importance of temporal context, many existing data mining and machine learning approaches do not extract features that capture how context evolves over time. This oversight leads to missed opportunities to harness insights for improved decision-making and predictive analysis. In this dissertation, I seek to address the gap between regular and temporal (dynamic) representations of video and text data. Regular representations do not capture the temporal contexts of features; dynamic representations, by contrast, capture contextual changes over time, improving the quality of the features for downstream prediction applications.

My dissertation focuses on neural network embeddings as distributed representations of unit features of the data (e.g., entities or words in text, and objects in videos). I investigate how these embeddings can be generated to capture contextual changes. I propose temporal embedding models that (1) are capable of detecting both short- and long-term shifts in the semantics of entities within text data -- a capability often missing in existing temporal word embedding models, and (2) capitalize on the spatial distance between objects and their appearance across frames in video data to model contextual trends. My experiments demonstrate that the proposed models provide high-quality temporal embeddings for both text and video data, enriching predictive capabilities for downstream applications. The findings demonstrate the underutilized potential of temporal embedding models within natural language understanding and computer vision.

Language

Provenance

Received from ProQuest

Copyright Date

2023-12

File Size

154 p.

File Format

application/pdf

Rights Holder

Ahnaf Farhan

Recommended Citation

Farhan, Ahnaf, "Context-Aware Temporal Embeddings For Text And Video Data" (2023). Open Access Theses & Dissertations. 3970.
https://scholarworks.utep.edu/open_etd/3970

Included in

Computer Sciences Commons

Open Access Theses & Dissertations

Context-Aware Temporal Embeddings For Text And Video Data
