Term of Award

Summer 2019

Degree Name

Master of Science in Computer Science (M.S.)

Document Type and Release Option

Thesis (open access)

Copyright Statement / License for Reuse

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Department of Computer Sciences

Committee Chair

Mehdi Allahyari

Committee Member 1

Andrew Allen

Committee Member 2

Pradipta De

Abstract

This paper tackles question answering on a specific domain by developing a multi-tier system that stores answers in three different types of data storage. To test the system on the university domain, we used data extracted from the Georgia Southern University website. For faster retrieval, we divided our answer data sources into three distinct types and used Dialogflow's Natural Language Understanding engine for route selection. We compared different word and sentence embedding techniques for building a semantic question search engine, and BERT sentence embeddings gave the best results. For extracting answers from a large collection of documents, we also achieved the highest accuracy with the BERT-base model. In addition to the BERT-base model, we achieved competitive accuracy by applying BERT embeddings to documents split into paragraphs. We were also able to substantially reduce answer retrieval time by using pre-stored embeddings.
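
The idea of a semantic question search over pre-stored embeddings can be illustrated with a minimal sketch. It is not the thesis implementation: the `embed` function below is a toy bag-of-words stand-in for a BERT sentence encoder, and the `FAQ` entries, answers, and function names are hypothetical placeholders. The point is only that stored questions are embedded once, so each query costs one embedding plus a similarity scan.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder (assumption: a real system
    would call a BERT model here). Returns a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical question/answer store; the answers are placeholders.
FAQ = {
    "When is the application deadline?": "placeholder answer 1",
    "Where is the library located?": "placeholder answer 2",
    "How do I contact the admissions office?": "placeholder answer 3",
}

# Embed every stored question once, ahead of time, so that queries
# do not pay the encoding cost for the whole collection.
STORED = [(question, embed(question)) for question in FAQ]


def best_match(query):
    """Return (stored question, answer, score) for the closest question."""
    q_vec = embed(query)
    question, vec = max(STORED, key=lambda item: cosine(q_vec, item[1]))
    return question, FAQ[question], cosine(q_vec, vec)


if __name__ == "__main__":
    print(best_match("what is the deadline for application"))
```

With a real sentence encoder, only `embed` would change; the pre-stored vectors and the nearest-neighbor lookup would keep the same shape.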

Research Data and Supplementary Material

No
