Data Vault – is it a star replacement?
Data Vault, is it a replacement for traditional data warehouses and star schemas? This is a common question that not only often gets asked to me, but also a question that get gets asked by those interested or learning about Data Vault.
Data Vault is a storage mechanism and is optimised as such rather than a querying for analysis mechanism.
What does this mean? Well, in its simplistic unified decomposition approach, it consists of a large number of tables, and, with a large number of tables potentially comes many joins which in the Data Vault world is true.
So, if Data Vault is for storing data, what methodology is used for querying or analysis? Well this is where the dimensional/star model, OLAP cubes or tabular models fits in, aka Data Marts or Information Marts for which they are optimised for querying.
So, then where does that fit? If we look at it from a layered architecture point of view, the Data Vault (being the storage layer) will sit in front of the analysis layer (dimensional model, etc.). The Data Vault becomes the source for the analysis layer which in turn then feeds the presentation layer which could be a reporting platform (Power BI, SSRS, etc.), API, flat file, etc. .
The Data Vault layer now becomes the 1 of the potentially 2 parts that make up an enterprise data warehouse. The other part is the staging area which could be a persisted history staging area fed from the source systems, flat files or other data sources.