Data Warehousing in the Cloud : Analysis of an Implementation Project
Eklund, Marcus (2018)
Åbo Akademi
This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2020062645848
Abstract
Data are generated at an ever-increasing rate. Ideally, all these data should support decision making. One way of facilitating this is to utilize a data warehouse. Data warehousing is an integral part of business intelligence. The main purpose of a data warehouse is to provide a consistent and non-volatile copy of transaction data from operational sources, in a format suitable for querying and analysis. Implementing a data warehouse through manual labor is time-consuming and difficult. Changes to the data warehouse are slow to implement, and a data warehouse requires constant maintenance. Data warehouse automation (DWA) is an emerging approach designed to accelerate and automate development and to ensure quality and consistency through enforced standardization. Through data warehouse automation, changes become non-issues, and expanding and further developing the data warehouse becomes significantly easier.
The author of this thesis works at a company that builds reporting and analytics solutions based on data warehousing. The data warehouses are implemented using a DWA platform developed in-house. The primary objective of this thesis is to research how the company's established way of working would change if the on-premises environment were replaced by a cloud-based environment.
The thesis introduces the concept of data warehouse automation and presents some commercial DWA tools. The theoretical part focuses on the definition and purpose of data warehousing and on the components of a data warehouse based on data vault modeling. In addition, cloud computing is discussed, with a focus on data warehousing in the cloud.
To answer the research questions, an artefact was constructed based on scientific literature and previous project experience. The artefact included tools and services that would replace those normally utilized by the development team when working on-premises. The aim was to see whether an artefact could be created that would suit the business case and the established way of working. The artefact was found to be a solid foundation for the design of a real-life implementation. The artefact was implemented in a pilot project for a customer. The project and its results were analysed through case study research.
After the pilot project was finished, it was concluded that the company’s way of working is well suited for a cloud-based environment. The offering of the chosen platform as a service (PaaS) provider included replacements for the tools and services utilized in a standard on-premises business intelligence project. Since the pilot project, several other reporting solutions have been built utilizing only the cloud components of a PaaS environment. Cloud-based solutions are now a standard option when designing a solution for customers.