cancel
Showing results for 
Search instead for 
Did you mean: 

SCD Type 2 Target based CDC using Data Services

former_member283957
Participant
0 Kudos

Good Morning ,

Kimball writes:

Type 2: Add New Row

Slowly changing dimension type 2 changes add a new row in the dimension with the updated attribute values. This requires generalizing the primary key of the dimension beyond the natural or durable key because there will potentially be multiple rows describing each member. When a new row is created for a dimension member, a new primary surrogate key is assigned and used as a foreign key in all fact tables from the moment of the update until a subsequent change creates a new dimension key and updated dimension row. A minimum of three additional columns should be added to the dimension row with type 2 changes: 1) row effective date or date/time stamp; 2) row expiration date or date/time stamp; and 3) current row indicator.

After Initial load is completed, we identify the changed or new record in the source system and update the target system with the changed record. Identifying and loading only changed data is called Change Data Capturing. This type of load is used to maintain the history data in the target system (SCD Type 2).

There are two CDC methods - Source based CDC and Target based CDC.

Source-based CDC evaluates the source table to determine, what record has changed and extracts only changed records to load into target tables.

Target-based CDC extracts all the data from the source, compares the source and target rows using table comparison and then loads only the changed rows into the target system.

My question is:

A Dimension "Usuarios"  SCD Type 2 is going to be loaded after Initial load using Target-based  CDC .

Must  I added this three additional columns 1) row effective date or date/time stamp; , 2) row expiration date or date/time stamp; 3) current row indicator. to the Dimension "Usuarios" ?

Or this three additional columns must be added to the Dimension "Usuarios" only when the CDC method is  Source based CDC  is it right ?

Thanks in advance.

Best Regards.

Antonio

Accepted Solutions (1)

Accepted Solutions (1)

akhileshkiran
Contributor
0 Kudos

Hi ,

If you are using target based CDC. You need to produce a new Surrogate Key column to preserve the historical data in the target table.

For the Source Based CDC we can maintain a Control Table with a time stamp. by using that control table we can load the data based on the time stamp from source to target.

Regards,

Akhilesh Kiran.

former_member283957
Participant
0 Kudos

Hi Akhilesh ,

Thank you very much for your answer but let me ask you the question again.

My question is:

A Dimension "Usuarios"  SCD Type 2 is going to be loaded after Initial load using Target-based  CDC .

1) Must  I added this three additional columns 1) row effective date or date/time stamp; , 2) row expiration date or date/time stamp; 3) current row indicator. to the Dimension "Usuarios" ?

Akhilesh , so the asnswer to this question 1)  is : No I Do not need to add this three aditional columns because I use table comparison (Target-based  CDC)

Or this three additional columns must be added to the Dimension "Usuarios" only when the CDC method is  Source based CDC  is it right ?

Thanks in advance.

Best Regards.

Antonio

akhileshkiran
Contributor
0 Kudos

Hi ,

Yes for target based CDC no need to add the additional three columns. If you have a Surrogate column present in the Target.

Regards,

Akhilesh Kiran.

former_member283957
Participant
0 Kudos

Hi Akhilesh Kiran,

Thank you very much for your answer.

I have attached the star model.

Dimension "Categoria" DM_Categoria  is  SCD Type 2 and it is going to be loaded after Initial load using Target-based  CDC . DM_Categoria has a SK , ID_Categoria is a Surrogate Key ,

Dimension "Usuario"  DM_Usuario is  SCD Type 2 and It is going to be loaded after Initial load using Source-based  CDC , therefore it has three additional columns 1) row effective date or date/time stamp; , 2) row expiration date or date/time stamp; 3) current row indicator.

Please could you check the Model , Is everything OK ?

Thanks in advance.

Best Regards

Antonio

akhileshkiran
Contributor
0 Kudos

Hi ,

Everything in the star-schema looks fine. Can you please brief me about ID_Tiempo.


From Fact HE_Incidencia you are pointing three relationships with Dimension DM_Tiempo.

In fact table you are using all three columns mapping to ID_Tiempo in the Dimension DM_Tiempo.

Regards,

Akhilesh Kiran.

Answers (0)