It’s clear that storing the history of reference data changes meets many requirements for data analysis and reporting, and is practically essential for auditing. The question is how to store this history of reference data? One solution is to use the concept of slowly changing dimensions (SCD) used in data warehouses. This is used to track changes in attributes that describe analysis dimensions. This is particularly important for data analysis and reporting. For example, it allows us to check past prices of products, company employment, or customer contact information. SCD can also be successfully used to record changes in reference data.
There are various ways to implement SCD, each with its own advantages and disadvantages.
SCD1 is the simplest situation, where a new attribute value replaces the old one and the history of changes is lost. The only way to access historical data in this case is to have a copy of the reference data from the period of interest.
SCD2 involves creating a new record with a new value and a start date when an attribute changes. The old record is kept with an end date. A separate record is created for each change, storing the current data for that period. Queries to the database must take into account the selection of current data for a specific time.
SCD3 involves storing the history of changes in the same record but in a separate attribute. The advantage of this approach is less redundancy of records, but it complicates the way of reading the history of changes.
SCD4 involves storing the history of changes in a separate table. Like SCD3, this allows for better resource management, but the way of reading the history of changes is more complicated.
The choice of the appropriate type of SCD depends on specific needs and requirements for reference data. Final design decisions are based on the estimated number of changes made to reference data and the intensity of their use in reporting and analysis.