Skip to Content

Suggestion required on repartitioning cube

Hi

Suggestion required on repartitioning on cube on PRD. Because I never use repartitioning option before.

Currently we have 2250000000 records in a cube and monthly 2 millions records added.

1) Please suggest how many partitions to be set for future perspective?

2) What are the prerequisites for using repartitioning on PDR?

3) Any chances of data lost?

4) How much time it takes to repartitioning?

Any else which you think I should know before repartitioning.

Thanks

Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

3 Answers

  • Aug 31, 2015 at 10:04 PM

    Hi,

    I am a bit confused by looking at the no. of records. You mentioned 2 million a month, i.e. 24 million a year. If I divide the total no. records (2,250 million = 2.25 billion) by 24 million yearly, then you should have a history of almost 94 years. Can you please confirm these figures?

    Anyway, taking the total no. of records as the baseline, your InfoCube becomes unmanageable from a data volume point of view. I strongly recommend to also use Logical Partitioning (i.e. splitting up one InfoProvider into multiple InfoProviders). Please consider using the SPO (Semantically Partitioned Object functionality). Please refer to my blog series Pattern-based Partitioning using the SPO BAdI - Part 1: Introduction for more information.

    Data volume recommendations are dependent on the underlying database. I heart in the past recommendations like 100 million for MS SQL Server and 300 million for Oracle and DB2. I expect that higher data volumes will not lead to problems but you should avoid a single InfoProvider with a billion records. It is also not scalable, it can and will only increase.

    Logical Partitioning based on year (calendar year or fiscal year) is often a practical way to make the data storage of your InfoProviders scalable.

    Logical Partitioning can be combined with database partitioning (a.k.a. physical partitioning). This is dependent on the underlying database. It will split up the fact table of the InfoCube into multiple partitions. Often calendar month or fiscal month are used for this purpose.

    Re. repartitioning, I suggest to do it the manual way, considering the data volume and my recommendation to introduce Logical Partitioning. I.e. moving to an SPO-based InfoCube (with database partitioning) and reloading the various PartProviders (e.g. yearly time slices).

    Best regards,

    Sander

    Add comment
    10|10000 characters needed characters exceeded

  • avatar image
    Former Member
    Aug 31, 2015 at 09:02 AM

    Hello Imran,

    With an InfoCube containing 2250000000 records, I recommend to do a Logical Partitioning first.

    Define InfoCubes e.g. for one year, fiscal period or company code and join them via a MultiProvider (advantages: parallel access to underlying basis InfoCubes, load balancing, resource utilization, query pruning).

    Next to the Logical Partitioning you can set the Physical Partitioning.

    In order to keep the query performance good, try to avoid InfoCubes with more than 100 mln. records. Furthermore I recommend not to have more than 35 partitions per InfoCube.

    Repartitioning can be useful if you have already loaded data to your InfoCube, and:

    • You did not partition the InfoCube when you created it.
    • You loaded more data into your InfoCube than you had planned when you partitioned it.
    • You did not choose a long enough period of time for partitioning.
    • Some partitions contain no data or little data due to data archiving over a period of time.

    However SAP recommends a complete back up of the database before you execute this function. This ensures that if an error occurs (for example, during a DB catalogue operation), you can restore the system to its previous status.

    For this reason I have never used the Repartitioning feature, I always dropped the InfoCube and (re)defined the InfoCube (Physical) Partitioning.

    Details on how to set the Physical Partitioning can be found here.

    Hope it helps.

    Rgds,

    René

    Add comment
    10|10000 characters needed characters exceeded

  • Sep 01, 2015 at 06:05 AM

    Hi

    Thanks, Sander & René for reply.

    All records stored in existing cube are line item wise and added approx. 2 million/month.

    Query response time is more then 1 hour and Oracle using as DB on BW 7.0.

    1) If cube contain 8 years data then which partition option is suitable:

         i) 000.2008 TO 012.2015 AND 8 years x 12 months + 2 = 98 partitions.

         ii)  000.2008 TO 012.2027 AND 20 years x 12 months + 2 = 242 partitions.

    2) How to avoid to store more then 100/300 million records per cube? If we use multiple cubes for storing data then we should create new cube manually every time after cube fill 100/300 million records?

    3) How to create and update aggregates for each partitions?

    4) How can we find which partitions have no data or few data?

    Add comment
    10|10000 characters needed characters exceeded

    • Hi René

      Thanks again for your valuable reply. As per my understanding I need to do following steps on my InfoCube:

      let suppose If 3 years data contains 300mln records and cube contained 1200 mln records.

      1) Create 4 cubes with 35 partitions and Multiprovider on it on DEV and transport to PRD.

      2) Transfer 300 mln records per cube from original cube and add a cube every year manually.

      3) Delete all data except current year data from original cube.

      4) Make query on Multiprovider.

      5) Move previous year data every year to newly created cube manually.

      Correct me if I am wrong.

      Waiting for your reply.