Term of Award
Master of Science, Computer Science (M.S.C.S.)
Document Type and Release Option
Thesis (open access)
Copyright Statement / License for Reuse
This work is licensed under a Creative Commons Attribution 4.0 License.
Department of Computer Science
Committee Member 1
Committee Member 2
Committee Member 3
Conventional classification divides data into a training set and a test set. Classifiers such as Naïve Bayes, Artificial Neural Networks, and Support Vector Machines each use the whole training set to train. This thesis explores the possibility of using a condensed form of the training set to achieve comparable classification accuracy. The technique uses a clustering algorithm to determine which data records can be labeled as exemplars, that is, records that capture the qualities of multiple records. For example, is it possible to compress, say, 50 records into a single record? Can that single record represent all 50 records and train a classifier similarly? This thesis aims to explore what qualifies a data record as an exemplar, which concepts extract the qualities of a dataset, and how to compare the information gain of one set of compressed data against another. It does so by applying Affinity Propagation to categorical data, examining entropy within cluster sets, and testing the compressed data using Cosine Similarity as a classifier.
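As a rough illustration of the last two ideas in the abstract, the sketch below classifies a categorical record by cosine similarity against a small set of exemplars and measures the label entropy inside a cluster. It is not the thesis's implementation: the exemplars are hard-coded toy data standing in for the output of a clustering step such as Affinity Propagation, the one-hot encoding is an assumed way of handling categorical values, and all function names are hypothetical.

```python
import math
from collections import Counter


def one_hot(record, vocab):
    """Encode a categorical record as a 0/1 vector over (column, value) pairs."""
    return [1.0 if record[col] == val else 0.0 for col, val in vocab]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_entropy(labels):
    """Shannon entropy, in bits, of the class labels inside one cluster."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def classify(record, exemplars, vocab):
    """Return the label of the exemplar most similar to the record."""
    vec = one_hot(record, vocab)
    best = max(exemplars, key=lambda ex: cosine(vec, one_hot(ex[0], vocab)))
    return best[1]


# Hypothetical exemplars: (categorical record, class label). In the setting the
# abstract describes, these would come from clustering the training set.
exemplars = [
    (("sunny", "hot", "high"), "no"),
    (("rainy", "mild", "normal"), "yes"),
]

# Vocabulary of every (column index, value) pair seen in the exemplars.
vocab = sorted({(i, v) for rec, _ in exemplars for i, v in enumerate(rec)})

predicted = classify(("sunny", "mild", "high"), exemplars, vocab)
mixed_cluster_entropy = label_entropy(["yes", "yes", "no", "no"])
pure_cluster_entropy = label_entropy(["yes", "yes", "yes", "yes"])
```

In this sketch a cluster whose members all share one label has zero entropy, while an evenly mixed two-class cluster has one bit, so lower entropy suggests that a single exemplar is a more faithful stand-in for its cluster.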
Klecker, Christopher R., "Building A Classification Model Using Affinity Propagation" (2019). Electronic Theses and Dissertations. 1917.
Research Data and Supplementary Material
Available for download on Tuesday, April 07, 2020