on 12-02-2005 8:18 AM
Hi, all!
When I run RSRT, i met with selecting diffent aggregates for different split parts. As i understand, number of split parts depends on number of RKF in query. But I have 3 RKF and 2 CKF in query and only 2 split parts.
Can you specify, when and what for are split parts used?
I've seen queries with a dozen RKFs but only one or two splits. If a single query can satisfy needs of multiple RFKs then it does not need to split. I've never seen a CFK drive a split.
This split can actually cause a query to perform more poorly. In my case I had one aggregate. The query split and one ran on the aggregate and the other on the infocube. I had more database activity and time then if it would have ran only on the infocube. So I used RSRT to study the needs of the split that was still running on the infocube. Then I changed my aggregate to meet those needs. Pizzaman help me understand this in this post:
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Generally a split part correponds to RKF, but it really depends on the nature of the restrictions. If the BW can generate SQL code such that two of the RKFs could be populated from the same set of restrictions, tehn you would have only one split part.
Perhaps the easiet way to understand this, is to examine the actual SQL that has been generated.
In my experience, it can become a real challenge act trying to figure if a BW query benefits from this capability of having different portions of the query generated against multiple aggregates (base cube and aggregates).
Sometimes having too many aggregates can hurt performance when you have split parts resulting in 4 differnt aggregates being used for the BW query. What happens then is 4 SQL queries must run (possibly 8 if the aggregates are not fully compressed) and the results of the 4 (or 😎 queries merged. Also each of the 4 queries must perfrom the required reads of dimension tables and master data tables. If some of these dimension or master data tables are large and require a full table scan, you are now doing that 4 times instead of just once if you only had the base cube or a single aggregate that could satisfy all the split parts.
You can test different scenarios by turing off different aggregates to see what the total impact is.
e.g. (won't worry about compressed cube in this example)
Base InfoCube - 10 million rows
Now you have 4 aggregates (we'll assume no parent/child aggregates)
Aggr 100001 - 1.2 million rows
Aggr 100002 - 800,000 rows
Aggr 100003 - 1 million rows
Aggr 100004 - 500,000 rows
A user query is split and 4 SQL queries are generated, one against each aggregate. If all the split parts could be satisfied from Aggr 100001, you could be better off getting rid of the other aggregates completely.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
88 | |
23 | |
11 | |
9 | |
8 | |
5 | |
5 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.