Term of Award

Spring 2018

Degree Name

Master of Science in Applied Engineering (M.S.A.E.)

Document Type and Release Option

Thesis (restricted to Georgia Southern)

Copyright Statement / License for Reuse

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


Department of Computer Sciences

Committee Chair

Hayden Wimmer

Committee Member 1

Lei Chen

Committee Member 2

JingJing Yin


The field of Data Science is becoming increasingly popular as companies and institutions see the need to gain additional insights from data in order to make better decisions and improve the quality of service delivered to customers. This thesis presents three Data Science projects aimed at improving the tools and techniques used to analyze and evaluate data. The first research project uses Twitter, a social media platform, to predict movie ratings through sentiment analysis. Twitter is selected because it generates a large amount of data that can be easily extracted in text format. The R statistical programming language is used to collect the tweet data and apply sentiment analysis. In the second research project, we apply machine learning techniques to evaluate the performance of different algorithms on a health data set. We construct a model that predicts the target category of the health data set with high accuracy. In the third research project, we use text classification and digital forensics as tools for extracting and categorizing text messages from smartphone devices. We differentiate text messages that are threatening from those that are non-threatening. This research is important in reducing the time digital forensics analysts need to inspect, evaluate, and interpret large amounts of text messages extracted from mobile devices.
