Term of Award
Fall 2015
Degree Name
Master of Science in Computer Science (M.S.)
Document Type and Release Option
Thesis (open access)
Copyright Statement / License for Reuse
This work is licensed under a Creative Commons Attribution 4.0 License.
Department
Department of Computer Sciences
Committee Chair
Vladan Jovanovic
Committee Member 1
Wen-Ran Zhang
Committee Member 2
James Harris
Abstract
Data warehouse (DW) projects are undertakings that require integration of disparate sources of data, a well-defined mapping of the source data to the reconciled data, and effective Extract, Transform, and Load (ETL) processes. Owing to the complexity of data warehouse projects, great emphasis must be placed on an agile-based approach with properly developed and executed test plans throughout the various stages of designing, developing, and implementing the data warehouse to mitigate budget overruns, missed deadlines, low customer satisfaction, and outright project failures. Yet, there are often attempts to test the data warehouse exactly like traditional back-end databases and legacy applications, or to downplay the role of quality assurance (QA) and testing, which only serve to fuel frustration with and mistrust of data warehouse and business intelligence (BI) systems. In spite of this, there are a number of steps that can be taken to ensure DW/BI solutions are successful, highly trusted, and stable. In particular, adopting a Data Vault (DV)-based Enterprise Data Warehouse (EDW) can simplify and enhance various aspects of testing, and curtail delays common in non-DV-based DW projects. A major focus of this research is raw DV loads from source systems, with transformations kept to a minimum in the ETL process that loads the DV from the source. Certain load errors, classified as permissible errors and enforced by business rules, are kept in the Data Vault until correct values are supplied. Major transformation activities are pushed further downstream to the next ETL process, which loads and refreshes the Data Mart (DM) from the Data Vault.
Recommended Citation
Williams, Connard N., "Testing Data Vault-Based Data Warehouse" (2015). Electronic Theses and Dissertations. 1340.
https://digitalcommons.georgiasouthern.edu/etd/1340
Included in
Computer and Systems Architecture Commons, Databases and Information Systems Commons, Data Storage Systems Commons, Software Engineering Commons