Data Warehouse characteristics
Data Warehouse is a subject oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process.
Let’s look into these characteristics little deeper in order to get a better clarity.
Subject oriented :
Data warehouses designed to help people analyze the data. For example, if you wanted to learn about the company’s sales data, we can build a data warehouse that concentrates on sales.
In an insurance data warehouse model, there can be multiple data warehouses in which each one of them concentrates on various subject areas like policy, claims, party etc.
- The policy data warehouse contains records relating to the submission, coverage, risk, liabilities/exposures, deductibles, pricing etc.
- The claims data warehouse contains records relating to payments and recoveries of claims
- The Party data warehouse contains records relating to internal and external organizations and individuals that interact with policies and claims
Generally speaking, data warehouse system combines data from many different source systems. There is a possibility that data residing on these source systems have some redundancies and inconsistencies which need to be resolved before moving data to the data warehouse. This is called data integration.
For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
In the below example, a party is being referred as insured, broker and claimant in different source systems.
Online transactional processing(OLTP) system (source systems) data usually covers a short period of time, i.e. most transactions involve the latest data. Historical data will move to archive due to performance requirements.
However, in data warehouses data is available for the longest period of time for trend analysis and forecasting. The time period can be five or more years.
Every key structure in the data warehouse contains an element of time(like creation date), explicitly or implicitly. Or in other words, data is stored as a series of snapshots.
Non-volatile means data in the data warehouse is not overwritten or updated. This means that data warehouses can be viewed as only read only databases. It only allows operations like initial loading of data and access of data.
This also satisfies users need for short query response time.