cancel
Showing results for 
Search instead for 
Did you mean: 

Suggestion required on repartitioning cube

former_member198905
Participant
0 Kudos

Hi

Suggestion required on repartitioning on cube on PRD. Because I never use repartitioning option before.

Currently we have 2250000000 records in a cube and monthly 2 millions records added.

1) Please suggest how many partitions to be set for future perspective?

2) What are the prerequisites for using repartitioning on PDR?

3) Any chances of data lost?

4) How much time it takes to repartitioning?

Any else which you think I should know before repartitioning.

Thanks

Accepted Solutions (0)

Answers (3)

Answers (3)

former_member198905
Participant
0 Kudos

Hi

Thanks, Sander & René for reply.

All records stored in existing cube are line item wise and added approx. 2 million/month.

Query response time is more then 1 hour and Oracle using as DB on BW 7.0.

1) If cube contain 8 years data then which partition option is suitable:

     i) 000.2008 TO 012.2015 AND 8 years x 12 months + 2 = 98 partitions.

     ii)  000.2008 TO 012.2027 AND 20 years x 12 months + 2 = 242 partitions.

2) How to avoid to store more then 100/300 million records per cube? If we use multiple cubes for storing data then we should create new cube manually every time after cube fill 100/300 million records?

3) How to create and update aggregates for each partitions?

4) How can we find which partitions have no data or few data?

0 Kudos

Hi Imran,

My recommendation is NOT to put 8 years of data into one InfoCube. You have to separate your hot data, warm and cold data.

General rule of thumb:

  • hot data = current and previous year
  • warm data = between 1 and 3 years old
  • cold data = older than 3 years

As mentioned earlier by Sander as well, it's better to first split up your data by calendar/fiscal year or fiscal period into separate InfoCubes (Logical Partitioning). After that you can apply Physical Partitioning by defining InfoCube partitions per InfoCube.

Before you do this you have to know your data. In your current InfoCube (with 8 years of data) check the number of records for each calendar/fiscal year.

Let's say that you want no more than 100 mln. records per InfoCube. With a growth rate of approx. 2 mln./month, this would mean that you can combine several calendar/fiscal years into one InfoCube.

To answer your questions:

  1. With your current data volume, don't keep 8 years of data in one InfoCube. The maximum number of records in and InfoCube on Oracle or DB2 is 300 mln. I recommend not to use more than 35 partitions per InfoCube.
  2. Yes, each time the InfoCube reaches its maximum size, you need to create a new InfoCube. This is usually part of the year end activities.
  3. Aggregates are separate database objects defined within an InfoCube. There's no difference between filling aggregates of InfoCubes with or without partitions. You don't need to worry about it.
  4. Your system runs on an Oracle database so there's no transaction where you can check the partitions. In this case I always use the SQL command editor in SAP transaction DB02 (go to Performance -> Additional Functions-> SQL Command Editor) and execute the following SQL command:

          SELECT * FROM dba_tab_partitions WHERE table_name LIKE '/BI%/E%';

          It will provide per InfoCube a nice overview of the partitions and their number of records.

Perhaps needless to say, but the requests in an InfoCube need to be compressed first, before they are written to the partitions.

Only the E-fact table of an InfoCube is partitioned (technically the F-fact table is partitioned on request ID). The F-fact table is optimized for writing (this is the table that the system is writing to when loading the InfoCube), the E-fact table is optimized for reading (needed for optimal query performance). So it's best to compress the InfoCube after it has been loaded and don't leave to many requests uncompressed.

Hope it's clear.

Rgds,

René

former_member198905
Participant
0 Kudos

Hi René

Thanks again for your valuable reply. As per my understanding I need to do following steps on my InfoCube:

let suppose If 3 years data contains 300mln records and cube contained 1200 mln records.

1) Create 4 cubes with 35 partitions and Multiprovider on it on DEV and transport to PRD.

2) Transfer 300 mln records per cube from original cube and add a cube every year manually.

3) Delete all data except current year data from original cube.

4) Make query on Multiprovider.

5) Move previous year data every year to newly created cube manually.

Correct me if I am wrong.

Waiting for your reply.

sander_vanwilligen
Active Contributor
0 Kudos

Hi,

I am a bit confused by looking at the no. of records. You mentioned 2 million a month, i.e. 24 million a year. If I divide the total no. records (2,250 million = 2.25 billion) by 24 million yearly, then you should have a history of almost 94 years. Can you please confirm these figures?

Anyway, taking the total no. of records as the baseline, your InfoCube becomes unmanageable from a data volume point of view. I strongly recommend to also use Logical Partitioning (i.e. splitting up one InfoProvider into multiple InfoProviders). Please consider using the SPO (Semantically Partitioned Object functionality). Please refer to my blog series for more information.

Data volume recommendations are dependent on the underlying database. I heart in the past recommendations like 100 million for MS SQL Server and 300 million for Oracle and DB2. I expect that higher data volumes will not lead to problems but you should avoid a single InfoProvider with a billion records. It is also not scalable, it can and will only increase.

Logical Partitioning based on year (calendar year or fiscal year) is often a practical way to make the data storage of your InfoProviders scalable.

Logical Partitioning can be combined with database partitioning (a.k.a. physical partitioning). This is dependent on the underlying database. It will split up the fact table of the InfoCube into multiple partitions. Often calendar month or fiscal month are used for this purpose.

Re. repartitioning, I suggest to do it the manual way, considering the data volume and my recommendation to introduce Logical Partitioning. I.e. moving to an SPO-based InfoCube (with database partitioning) and reloading the various PartProviders (e.g. yearly time slices).

Best regards,

Sander

0 Kudos

Hello Imran,

With an InfoCube containing 2250000000 records, I recommend to do a Logical Partitioning first.

Define InfoCubes e.g. for one year, fiscal period or company code and join them via a MultiProvider (advantages: parallel access to underlying basis InfoCubes, load balancing, resource utilization, query pruning).

Next to the Logical Partitioning you can set the Physical Partitioning.

In order to keep the query performance good, try to avoid InfoCubes with more than 100 mln. records. Furthermore I recommend not to have more than 35 partitions per InfoCube.

Repartitioning can be useful if you have already loaded data to your InfoCube, and:

  • You did not partition the InfoCube when you created it.
  • You loaded more data into your InfoCube than you had planned when you partitioned it.
  • You did not choose a long enough period of time for partitioning.
  • Some partitions contain no data or little data due to data archiving over a period of time.

However SAP recommends a complete back up of the database before you execute this function. This ensures that if an error occurs (for example, during a DB catalogue operation), you can restore the system to its previous status.

For this reason I have never used the Repartitioning feature, I always dropped the InfoCube and (re)defined the InfoCube (Physical) Partitioning.

Details on how to set the Physical Partitioning can be found here.

Hope it helps.

Rgds,

René