Term of Award

Spring 2017

Degree Name

Master of Science in Computer Science (M.S.)

Document Type and Release Option

Thesis (restricted to Georgia Southern)

Copyright Statement / License for Reuse

Digital Commons@Georgia Southern License

Department

Department of Computer Sciences

Committee Chair

Vladan Jovanovic

Committee Member 1

Lixin Li

Committee Member 2

Wen-Ran Zhang

Abstract

A data warehouse (DW) integrates data from various heterogeneous data sources and creates a consolidated view of the data that is optimized for reporting and analysis. Today, business and technology are constantly evolving, which directly affects the data sources: new data sources can emerge while others become unavailable. The DW or data mart that is based on these data sources needs to reflect these changes. Various solutions for adapting a data warehouse after changes in the data sources and the business requirements have been proposed in the literature (Subotic, Poscic, Poscic, & Jovanovic, 2014). However, research on the problem of DW evolution has focused mainly on managing changes in the dimensional model; other aspects, such as the extract, transform, load (ETL) process and maintaining the history of changes, have not been addressed. As a solution to the problem, we propose a Meta Data vault model that includes a data vault-based data warehouse and master data management (MDM). A major focus of this research is to keep both the history of changes and a “single version of the truth” through an MDM integrated with the DW. We also present the load patterns used to load data into the data warehouse and the materialized views used to deliver data to end users. To test our solution, we used big data sets from the biomedical field. For each modification of the data source schema, we outline the changes that need to be made to the enterprise data warehouse (EDW), the data marts, and the ETL.

Research Data and Supplementary Material

No
