Data Science Based Research on Sentiment Analysis, Machine Learning and Text Classification of Social Media and Digital Forensic Data
Term of Award
Master of Science in Applied Engineering (M.S.A.E.)
Document Type and Release Option
Thesis (restricted to Georgia Southern)
Copyright Statement / License for Reuse
This work is licensed under a Creative Commons Attribution 4.0 License.
Department of Computer Sciences
The field of Data Science is becoming increasingly popular as companies and institutions see the need to gain additional insights from data in order to make better decisions and improve the quality of service delivered to customers. This thesis presents three Data Science research projects aimed at improving the tools and techniques used to analyze and evaluate data. The first research project uses Twitter, a social media platform, to predict movie ratings through Sentiment Analysis. Twitter is selected because it generates a large amount of data that can be easily extracted in text format. The R statistical programming language is used to collect the tweet data and apply sentiment analysis. In the second research project, we apply machine learning techniques to evaluate the performance of different algorithms on a health data set. We construct a model that predicts the target category of the health data set with high accuracy. In the third research project, we use text classification and digital forensics as tools for extracting and categorizing text messages from smartphone devices. We differentiate between text messages that are threatening and those that are non-threatening. This research is important for reducing the time digital forensics analysts need to inspect, evaluate, and interpret large numbers of text messages extracted from mobile devices.
Nwankwo, Christian Sunday, "Data Science Based Research on Sentiment Analysis, Machine Learning and Text Classification of Social Media and Digital Forensic Data" (2018). Electronic Theses and Dissertations. 1737.