cancel
Showing results for 
Search instead for 
Did you mean: 

Aggregates: what is split part?

Former Member
0 Kudos

Hi, all!

When I run RSRT, i met with selecting diffent aggregates for different split parts. As i understand, number of split parts depends on number of RKF in query. But I have 3 RKF and 2 CKF in query and only 2 split parts.

Can you specify, when and what for are split parts used?

Accepted Solutions (0)

Answers (2)

Answers (2)

chris_robertson
Participant
0 Kudos

I've seen queries with a dozen RKFs but only one or two splits. If a single query can satisfy needs of multiple RFKs then it does not need to split. I've never seen a CFK drive a split.

This split can actually cause a query to perform more poorly. In my case I had one aggregate. The query split and one ran on the aggregate and the other on the infocube. I had more database activity and time then if it would have ran only on the infocube. So I used RSRT to study the needs of the split that was still running on the infocube. Then I changed my aggregate to meet those needs. Pizzaman help me understand this in this post:

Former Member
0 Kudos

Generally a split part correponds to RKF, but it really depends on the nature of the restrictions. If the BW can generate SQL code such that two of the RKFs could be populated from the same set of restrictions, tehn you would have only one split part.

Perhaps the easiet way to understand this, is to examine the actual SQL that has been generated.

In my experience, it can become a real challenge act trying to figure if a BW query benefits from this capability of having different portions of the query generated against multiple aggregates (base cube and aggregates).

Sometimes having too many aggregates can hurt performance when you have split parts resulting in 4 differnt aggregates being used for the BW query. What happens then is 4 SQL queries must run (possibly 8 if the aggregates are not fully compressed) and the results of the 4 (or 😎 queries merged. Also each of the 4 queries must perfrom the required reads of dimension tables and master data tables. If some of these dimension or master data tables are large and require a full table scan, you are now doing that 4 times instead of just once if you only had the base cube or a single aggregate that could satisfy all the split parts.

You can test different scenarios by turing off different aggregates to see what the total impact is.

e.g. (won't worry about compressed cube in this example)

Base InfoCube - 10 million rows

Now you have 4 aggregates (we'll assume no parent/child aggregates)

Aggr 100001 - 1.2 million rows

Aggr 100002 - 800,000 rows

Aggr 100003 - 1 million rows

Aggr 100004 - 500,000 rows

A user query is split and 4 SQL queries are generated, one against each aggregate. If all the split parts could be satisfied from Aggr 100001, you could be better off getting rid of the other aggregates completely.