cancel
Showing results for 
Search instead for 
Did you mean: 

What is the difference between SCD type 2 & 3? How to implement SCD type 3 in BODS?

Former Member
0 Kudos

Hi All,

I'm new to BODS, can any one please help in implementing SCD type 3 in BODS

Accepted Solutions (1)

Accepted Solutions (1)

former_member200473
Contributor
0 Kudos

Hi Sabarish,

Difference : SCD2 is unlimited history and SCD3 is limited history.

Explanation:

1- Using SCD2 you can save unlimited history with the help of the Surrogate Key . In this structure of the table will never be effected(constant) only the no of the rows will be effected(will be increased) and to prevent the duplication of data ,primary key will be used .

2-While in SCD3 you can save the limited history. There is no use of Surrogate Key.Since here it depends upon you how much old history/data you want to save (for example: till second last , third last and so on....). As much old data want to save  as much no. of columns will be required to add to the table.

Implementation: Can you please define how much old data you want to save ,so that i may try to help you

Regards,

Shiva Sahu

Former Member
0 Kudos

Hi Shiva,

Thanks for the reply..

My requirement is to load with second last data, to data staging area(SQL server) ,Here i'm trying to implement this using Business objects Data Services.

Thanks & Regards,

Sabarish.

former_member200473
Contributor
0 Kudos

Hi Sabarish,

Is it mandatory to do it by SCD3 only ?

You can try by SCD2 as well also.

Here i am trying to explain it to you.

Step 1 - First implement SCD2 and get your all data into staging area.

Step 2 - Once you get whole data, you can use gen_row_num_by_group() function to generate the row number for each different group.

Step 3 - Now apply filter for two highest no. for each group.

Step 4 - Check for your expected result .

Reason: The reason why am i saying it to implement with SCD2 is because if later your requirement (let say third last or 100th last... etc) gets change you can easily change the filter condition and get the expected data while in SCD3 you will be required to change the structure(adding or removing of columns) of the table as many times your requirement gets change.

But if you have really a wish to implement through SCD3 , yes.. it can be developed.

Please let me know ,if you need any clarification.

Regards,

Shiva Sahu

Answers (1)

Answers (1)

Former Member
0 Kudos

Hi Sabarish,

SCD- Slowly Changing Dimensions

SCDs are dimensions that have data that changes over time. The following methods of handling SCDs are available:

  • Type 1 : No history preservation

  v Natural consequence of normalization 

  • Type 2: Unlimited history preservation and new rows
  • v New rows generated for significant changes
  • v Requires use of a unique key
  • v New fields are generated to store history data
  • v Requires an Effective_Date field               .
  • Type 3: Limited history preservation

  v Two states of data are preserved: current and old 

Slowly Changing Dimension Type 1(SCD Type1)

For SCD Type 1 change, you find and update the appropriate attributes on a specific dimensional record. The new information simply overwrites the original information. In other words, no history is kept.

Example

  • Customer Key
  • Name
  • State
  • 1001
  • Christina
  • Illinois

             After Christina moved from Illinois to California, the new information replaces the new record, and we have the following table: 

Customer Key

Name

State

1001

Christina

California

Advantages:

This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information

Disadvantages:

   All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before.

Slowly Changing Dimension Type 2(SCD Type2)

With a Type 2 change, we don’t make structural changes in the table. Instead we add a record. A new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.

In our example, recall we originally have the following table:

Customer Key

Name

State

1001

Christina

Illinois

After Christina moved from Illinois to California, we add the new information as a new row into the table.

Customer Key

Name

State

1001

Christina

Illinois

1005

Christina

California

Advantages:

This allows us to accurately keep all historical information.

Disadvantages:

This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.

This necessarily complicates the ETL process.

Slowly Changing Dimension Type 3(SCD Type3)

With a Type 3 change, we change the dimension structure so that it renames the existing attribute and add two attributes, one to record the new value and one to record the date of change.

       In our example, recall we originally have the following table:

Customer Key

Name

State

1001

Christina

Illinois

  • After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003):
  • Customer Key
  • Name
  • Original State
  • Current State
  • Effective Date
  • 1001
  • Christina
  • Illinois
  • California
  • 15-JAN-2003

Advantages:

This does not increase the size of the table, since new information is updated.

This allows us to keep some part of history.

Disadvantages:

Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information will be lost.

Hope this is helpful to you.

Thanks,

Richa