Date of Award

2011-01-01

Degree Name

Master of Science

Department

Computational Science

Advisor(s)

Miguel Argaez

Abstract

In pattern recognition and machine learning, a classification problem consists of finding an algorithm that assigns a given input to one of several categories. Many natural signals are sparse or compressible in the sense that they have short representations when expressed in a suitable basis. Motivated by the recent successful development of algorithms for sparse signal recovery, we apply the selective nature of sparse representation to perform classification. To find such a sparse linear representation, we implement an l1-minimization algorithm. This methodology overcomes the lack of robustness with respect to outliers. In contrast to other classification algorithms such as Support Vector Machines (SVM), it involves no model selection. The minimization algorithm is a convex relaxation-like algorithm that has been proven to recover sparse signals efficiently. To study its performance, we apply the proposed method to six tumor gene expression datasets with a large number of features but few samples. Our numerical results compare favorably with various SVM methods. We also test the effectiveness of our classification algorithm on Fisher's Iris dataset, where a large number of samples but a small number of features are available. Because the techniques for acquiring and analyzing data advance rapidly, large amounts of data must be managed and analyzed for many different scientific problems. Future work aims to study the performance of our classification method when dimensionality reduction techniques are applied, including feature selection and feature extraction strategies.
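
The idea of classifying by sparse representation can be illustrated with a minimal sketch. It is not the thesis's algorithm: as a stand-in for the l1-minimization solver, it uses iterative soft-thresholding (ISTA) on an l1-regularized least-squares problem, and the function names (`soft_threshold`, `ista`, `classify`), the penalty `lam`, the iteration count, and the toy data are all assumptions made for illustration. A test sample is expressed as a sparse combination of unit-length training samples and assigned to the class whose coefficients reconstruct it with the smallest residual.

```python
import math

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t*||.||_1)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def ista(A, y, lam=0.01, iters=2000):
    """Approximately minimize 0.5*||A x - y||_2^2 + lam*||x||_1 with ISTA."""
    m, n = len(A), len(A[0])
    # 1/||A||_F^2 is a safe step size: ||A||_F^2 bounds the Lipschitz constant.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y and gradient g = A^T r of the smooth term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = soft_threshold([x[j] - step * g[j] for j in range(n)], step * lam)
    return x

def classify(train, labels, y, lam=0.01):
    """Assign y to the class whose training samples best reconstruct it."""
    # Dictionary columns are the training samples scaled to unit length.
    cols = []
    for sample in train:
        norm = math.sqrt(sum(v * v for v in sample))
        cols.append([v / norm for v in sample])
    m = len(y)
    A = [[col[i] for col in cols] for i in range(m)]
    x = ista(A, y, lam)

    def residual(label):
        # Reconstruct y using only the coefficients that belong to `label`.
        approx = [
            sum(A[i][j] * x[j] for j in range(len(cols)) if labels[j] == label)
            for i in range(m)
        ]
        return math.sqrt(sum((y[i] - approx[i]) ** 2 for i in range(m)))

    return min(sorted(set(labels)), key=residual)

# Hypothetical toy data: two classes, two training samples each, three features.
train = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.1, 1.0], [0.1, 0.2, 0.9]]
labels = ["a", "a", "b", "b"]
print(classify(train, labels, [1.0, 0.15, 0.05]))
print(classify(train, labels, [0.05, 0.1, 1.0]))
```

In this sketch the only tunable quantity is the penalty `lam`; no kernel or model is selected, which loosely mirrors the abstract's contrast with SVM. A constrained l1 solver, for example one based on linear programming, could replace `ista` without changing `classify`.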

Language

en

Provenance

Received from ProQuest

File Size

50 pages

File Format

application/pdf

Rights Holder

Reinaldo Sanchez Arias
