Hello Community,
What is the best way to identify which "combination of characteristics" is guilty for causing a large record count in the dimeension table ?
We are working to redesign an infoCube that had many unbalanced dimensions. So far we've done quite well: all newly designed dimensions are well balanced *except for one*
Unfortunately, we don't have the luxury of just splitting it up further, since we're already using the maximum of 13 customer dimesions.
So now I must decide how to best shuffle the characteristics, without disrupting the balance of the other dimensions.
The cube was designed long before I joined the team, and I don't have a relationship diagram (nor, indeed, an intimate knowledge of the data). Is there a way to reverse engineer to find out which are the problem characteristics ?
For example, I've already selected the record count from each SID table. But I'm not certain what to do next.
So my question is : what is the best way to identify which "combination of characteristics" is guilty for causing a large record count in the dimeension table ?
Thanks so much for your help !
Keith
Keith,
here's the documentation for the "select distinct" statement: http://help.sap.com/saphelp_47x200/helpdata/de/40/1311fd2fa511d3a98100a0c9449261/frameset.htmhttp://help.sap.com/saphelp_47x200/helpdata/de/40/1311fd2fa511d3a98100a0c9449261/frameset.htm">http://help.sap.com/saphelp_47x200/helpdata/de/40/1311fd2fa511d3a98100a0c9449261/frameset.htm>
I did this in MS Access, so I don't have a sample...
Christian
Hi!
Here's what we did in order to find "good" dimensions:
1) Copy some sample data (we took it from an ODS object that feeds the cube) into a database table (MS Access should work okay)
2) Create a 'control table' that defines possible dimensions
3) Create a macro that reads the control table and generates SQL statements like "SELECT COUNT DISTINCT MATERIAL PLANT COMP_CODE <fields you'd like to have in the same dimension> ... FROM Sample_Table"
That way you can find out how many entries the corresponding dimension table would have in BW. The bigger the sample the better the guess of course.
This method allowed us to keep all dimensions of an SD cube at less than 1% of fact table (except for line items...)
Hope this helps,
Christian
Keith,
the "select count distinct" simply gives you the number of distinct combinations for certain characteristics.
Assume the dimension contains char1, char2, ..., charN.
Now you can run the "select count distinct" for subsets of these characteristics. Let's say there are a huge number of combinations for char1, char3, char5. Then these characteristics should not be in the same dimension...
I think it doesn't matter whether you use the source ODS or the SIDs in the dimension table.
Hope this helps,
Christian
Add a comment