cancel
Showing results for 
Search instead for 
Did you mean: 

partitioning on ERP for table with 2billion entries

Former Member
0 Kudos

Hello experts,

We have an issue where a process runs.  The process is simple and broken into two parts.  First part of the process, there is a selection that happens from table JEST.  The 2nd part, some logic is applied and updates happen in parallel to different tables.   The longest time of this whole process is the select statement.  Its as following:

SELECT

  *

FROM

  "JEST"

WHERE

  "MANDT"=:A0 AND "OBJNR"=:A1

This select seems efficient but ofcourse, we want it to happen even faster.  We sort the table from time to time,  we also have some archiving in place to address the size.  Still, considering this table has 2billion rows, we are looking at possibility of partitioning(possibly range partitioning on OBJNR). The table has 5 fields of which OBJNR is a key field.  We are thinking, partitioning would help even further.

does this seem like the right approach?

Also, we also looked into possibility of turning on parallel processing on this table but it was not recommended by SAP as that affects DML operations.  We are wondering if partitioning the table would let oracle process queries in parallel on this table by default?  Can anyone confirm?

Accepted Solutions (1)

Accepted Solutions (1)

stefan_koehler
Active Contributor
0 Kudos

Hi Ne,

i don't know the table definition of JEST out of my mind, but as far as i can remember the columns MANDT and OBJNR are part of the primary key (or even the full primary key), right?

> We are wondering if partitioning the table would let oracle process queries in parallel on this table by default?

No. If you are considering partitioning because of performance issues with that single SELECT, then partitioning won't help. I assume that it uses an unique index (with an unique or range scan) with rowid lookup and does not run a full table or index fast full scan.

As far as i understand you correct, the issue is not the performance of that single SELECT, right? It is more likely the amount of time that is spent by the several executions of that one? Does one SELECT return just one row? If yes, then it is likely caused by "SQL*Net message from client" waits which are considered as an idle event from database perspective, but not from application / SQL response time perspective. I have written a blog post about this topic here () some time ago. In such cases you need to rewrite the application code to "bulk" collects / fetches.

Regards

Stefan

Former Member
0 Kudos

Hello Stefan,

Thank you for your reply.

The explain plan looks as following:

SELECT STATEMENT ( Estimated Costs = 2 , Estimated #Rows = 5 )

       2 TABLE ACCESS BY INDEX ROWID JEST

         ( Estim. Costs = 1 , Estim. #Rows = 5 )

         Estim. CPU-Costs = 8,976 Estim. IO-Costs = 1

           1 INDEX RANGE SCAN JEST~0

             ( Estim. Costs = 1 , Estim. #Rows = 5 )

             Search Columns: 2

             Estim. CPU-Costs = 5,897 Estim. IO-Costs = 1

             Access Predicates

It is going through an index range scan. 

You are right, we are focused on the time spent by this select's several executions(more like few hundred million).  This is also standard SAP code so not sure how we can influence it to do "bulk" collects/fetches. 

We are trying to see if there anything at DB level we can do to select faster.  We were discouraged by SAP to use parallel processing in OLTP environment(we run ECC on HPUX and Oracle 11g).  Its not the recommended approach.

We are trying to attack all variables we have control over though we are reaching limits.  I will check out your blog for sure. 

stefan_koehler
Active Contributor
0 Kudos

Hi Ne,

> The explain plan looks as following

Ok perfect - as i assumed (and no filter predicates, just access predicates). The only open point is how many rows are returned by that single SELECT in average (and how the SQL response time is distributed). The estimated rows are not necessarily the reality.

> You are right, we are focused on the time spent by this select's several executions(more like few hundred million)

.. and this is exactly one of the cases where you can not do anything on database level, if most of the time is spent by "SQL*Net message from client" wait events. This is a very common case, if several very fast executions (or in your case few hundred million) of the same SQL are done.

> We were discouraged by SAP to use parallel processing in OLTP environment(we run ECC on HPUX and Oracle 11g).

As previously mentioned there is nothing for parallel query (or partitioning) in that SQL context. I have implemented parallel query in SAP OLTP environments a few times, but only if it makes sense.

> This is also standard SAP code so not sure how we can influence it to do "bulk" collects/fetches.

Sometimes there are already code enhancements (e.g. FOR ALL ENTRIES) or you can already adjust specific SAP DBSL parameters to execute that particular SQL less times, but this depends on the exact OPEN SQL code and can not be generalized. At first you should identify where most of the time is spent (in context of the whole OPI / OCI stack) and then implement the proper solutions.

Sometimes you also need to convince SAP to provide a code enhancement, but it is pretty hard (or better said impossible) without valid and well prepared data.

You can easily get the relevant and isolated data for that particular SQL with help of the new low-level kernel diagnostics & tracing infrastructure (since Oracle 11g). More details and examples about that framework can be found here Oracle Diagnostic Events in 11g by Miladin Modrakovic or here ORADEBUG DOC by Tanel Poder.

Regards

Stefan

Former Member
0 Kudos

Hello Stefan,

thank you again for replying.

While I go through the threads/blog you have suggested(and i must say they are pretty technical getting down to the internals of oracle[deep considering from my basis perspective]) I did want run the following analogy by you to see if in theory it makes sense in approaching:

Think of the table as a one giant parking lot for cars.  There are 2 billion designated spots(index) for cars(rows) in this lot.  I will ask the parking/security officer(oracle) to go get me the license plate numbers from a given list(few hundred millions) of designated spots.  Now, theoretically, one way to speed up the whole operation is to split the lot in to smaller lots(partitions) and have several officers(parallel) go out and get the info needed.  Theoretically sound but technically I am not sure.

stefan_koehler
Active Contributor
0 Kudos

Hi Ne,

i get your analogy, but technically Oracle parallel execution (e.g. parallel query) is based on granules. You can get the concept from the official Oracle documentation (Granules of Parallelism).

In your case you would need to parallelize the INDEX RANGE SCAN (and TABLE ACCESS BY ROWID) and for that you need partition granules. Your query that should benefit from it uses a WHERE clause like "MANDT=:A0 AND OBJNR=:A1", which would end up in the same partition (= same granule) - so nothing to parallelize here. I quote from the official Oracle documentation in that scenario.


Partition granules are the basic unit of parallel index range scans, joins between two equipartitioned tables where the query optimizer has chosen to use partition-wise joins, and parallel operations that modify multiple partitions of a partitioned object.

Because partition granules are statically determined by the structure of the table or index when a table or index is created, partition granules do not give you the flexibility in executing an operation in parallel that block granules do. The maximum allowable DOP is the number of partitions.

Block granules are easier to parallelize, but i really doubt that an INDEX FAST FULL SCAN or a FULL TABLE SCAN (even on partition level) is faster than an index range scan lookup on that unique (primary key) index by just visiting a few blocks. You need to consider the amount of work that needs to be done for co-ordination and building up such granules as well.

However you already think about implementing/testing a solution without knowing the real time consuming part at all. Based on your description of the application behavior (few hundred million single executions) i am pretty sure that you should verify and consider the "SQL*Net message from client" problem.

> While I go through the threads/blog you have suggested(and i must say they are pretty technical getting down to the internals of oracle[deep considering from my basis perspective])

I know - i am sorry, but (Oracle) performance tuning is a pretty technical topic and you need to understand the concepts behind for being able to troubleshoot, identify and fix the real problem in a systematic and economic way. Oracle is a beast and gives you a lot of possibilities to trace and identify the issue (or make it even worse based on wrong assumptions), which is pretty technical in consequence.


Regards

Stefan

Answers (0)