cancel
Showing results for 
Search instead for 
Did you mean: 

Dimension modelling for dummies?

Former Member
0 Kudos

Hi all,

as we need to build up a demo up from scratch showing a complete dataflow in dataservices from source up to cubes/marts, I'm now in need to do a dimension-modelling in dataservices.

So is there any way to build up a consistent dimension and fact-tables for dummies? (there is not so much knowledge yet on my side concerning this topic)

Your help is really appreciated.

Best regards,

bivision

Accepted Solutions (0)

Answers (3)

Answers (3)

werner_daehn
Active Contributor
0 Kudos

If the question happened to be more related to the data modeling than how to load, some more reading

https://wiki.sdn.sap.com/wiki/display/BOBJ/StarSchemaDesign

Former Member
0 Kudos

Thanks so far for the quick answers. I will try out this stuff ;-).

But what I'm still unsure about is:

--> how many dimensions are recommended to build in dataservices? how should the dimension in dataservices look like? (e.g. is isrecommended to create many line-item-dimensions as in SAP BW ? how many characteristics should be in one dimension for max?)

I'd be happy if you could provide me some best-practice-tipps in dimension modelling with DataServices!

Best regards,

bivision

werner_daehn
Active Contributor
0 Kudos

Relational databases prefer a fact table with as little width as possible. So that means the least number of foreign keys as possible. Reason is, the fact table is large, e.g. 10 million rows, so adding even a single additional 1 byte column means 10 megabyte more to read.

Relational databases prefer the dimension tables used to contain as little rows as possible as then a full table scan and a join is quick. So if one dimension is e.g. a customer dimension with million of rows and you need just the regional information most of the time, by splitting the two into a customer dimension with all customer specific attributes and 1 million rows which are rarely used plus a geography dimension with a few rows only but used often is better.

As you see there is a contradiction, a narrow fact with small dimensions? You can't have both with the same information being stored. Hence you have to balance that.

The link I provided talks about the common approach, tricks, formulas how to estimate the benefit,....

Former Member
0 Kudos

Thanks so far...maybe it seems so hard for me as I'm used to SAP BW where all this stuff goes automatically

I tried out to work with "key generation" but I do not succeed in creating one unique key for each tuple of characteristics..how can i manage this? DataServices always does create a new key-column for the same set of characteristics...

How do I have to specifiy the columns / keys in my dimension tables?

Apart from that: is it recommended to provide a separate key-table with the latest key-value or to derive this from the target (dimension) table?

werner_daehn
Active Contributor
0 Kudos

That's the major difference between BW and hand-designed Data Warehouses. BW forces you to follow one modeling standard. That is good in case you want it to be that way or you don't care. It is not so good in all other cases, e.g. performance.

Given your question it seems you want to have a classification dimension, so each combination of attributes gets one unique key. The dataflow for that is

Source -> Query -> Table_Comaprison -> Key_Generation -> Target

In the Query you select the columns you need.

Table Comparison has all these columns as input primary key listed and the input_contains_duplicates is checked

Key_Generation is set to the unique key column

Having an intermediate SID table like BW does has pros and cons. The pros are the greater flexibility, the con is performance. Each join is expensive, with SID tables you always have twice as many joins as without. And there are other options to accomplish the same often, e.g. ["Slow Changing Dimensions Type 2"|https://wiki.sdn.sap.com/wiki/display/BOBJ/SlowChangingDimensionType2].

I am not too thrilled about the situation you are facing. The data model of a DWH is the most fundamental part, so you should know what you are doing. So I would try to get some help, somebody who knows the very details of a star schema design from experience.

Former Member
0 Kudos

Hi,

You can find an example of building a star schema in the Data Services Tutorial, this can be found below:

https://websmp106.sap-ag.de/~form/sapnet?_SHORTKEY=01100035870000713358&_SCENARIO=011000358700000002...

Richard

Former Member
0 Kudos

Hi

You're in luck, the data integrator transforms map nicely onto Kimball's dimensional theory.The very short answer is

For dimensions, you'll use table comparison, history preservation and key generation.

For facts, you'll use lookup_ext (to find surrogate keys)

Maybe more specific questions will follow later?

Good luck.

Michael