Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
cancel
Showing results for 
Search instead for 
Did you mean: 
lbreddemann
Active Contributor

When faced with error messages like ORA-1578 Oracle database users just got the task to handle a database block corruption. I've already written about this topic before and today I'd like to share some tips and tricks to get this task done quicker.

Hint #1 - Be prepared

I cannot restate this more often - but the in and out of DB administration is to actually know what you're doing before you do it.

Have a plan, think it through and test it on a non-production environment.

Once you are sure it works - try to break it (ask any untrained co-worker, usually this helps much...).
Once your procedures are "fool-proof": employ it on the QA-system and try them out there.
Finally implement your procedures on the production environment and check if they work there too.
As soon as you get a new system with some new functions - try out if your procedures still hold.

Hint #2 - RMAN verification

The very first step in getting corruptions out of the system is the complete database verification.
As all available standard tools to check a Oracle database (analyze, dbverify/rman, full export) check for slightly different errors it's highly advisable to perform all of them. Anyhow, except usually dbverify/rman the checks take quite some time and can even lead to problems in the production system.

Therefore I suggest to start with the least disturbing check: verification via RMAN.

This feature is available now for some years and had been integrated into the brtools last year.
The big advantage of using RMAN over using DBV is that RMAN uses the same I/O access (DirectIO, AsyncIO etc.) as the Oracle Server processes do and that the output comes in a much more convinient way - via a Oracle view.

The verification via RMAN can either be started via the menus of BRTOOLS (I highly recommend that!) or via commandline. The command check the tablespace EXAMPLE with RMAN via brbackup it could look like this:

brbackup -u / -t online -w only_rmv -m example

Let's compare the output of DBV and RMAN for a datafile that has corruptions in it:

DBV via BRTOOLS:

DBVERIFY: Release 10.2.0.2.0 - Production on Mon Apr 14 15:39:24 2008
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = C:\\ORACLE\\TDB\\EXAMPLE.DATA1
DBV-00201: Block, DBA 16777741, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16777742, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16777743, marked corrupt for invalid redo application
 
[... omitting 50 lines ...]
 
DBV-00201: Block, DBA 16777996, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16777997, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16777998, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16777999, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16778000, marked corrupt for invalid redo application
DBV-00201: Block, DBA 16778001, marked corrupt for invalid redo application
 
DBVERIFY - Verification complete
Total Pages Examined         : 6400
Total Pages Processed (Data) : 415
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 49
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 264
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 5672
Total Pages Marked Corrupt   : 123
Total Pages Influx           : 0
Highest block SCN            : 20622347 (0.20622347)

RMAN via BRTOOLS:

BR0568I Verifying C:\\ORACLE\\TDB\\EXAMPLE.DATA1 using RMAN...
BR0398E RMAN detected corrupted blocks in C:\\ORACLE\\TDB\\EXAMPLE.DATA1
BR0548I Please check Oracle alert log C:\\ORACLE\\TDB\\saptrace\\background\\alert_TDB.log for further information about this error

Let's ignore the Alert.log for now but rather check the database view V$COPY_CORRUPTION:

SQL> set linesize 130
SQL> select * from v$backup_corruption;
     RECID      STAMP  SET_STAMP  SET_COUNT     PIECE#      FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# MAR CORRUPTIO
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ------------------ --- ---------
         1  652022089  652022088         42          1          4        525          7           20596264 NO  LOGICAL
         2  652027964  652027963         53          1          4        525          7           20619321 NO  LOGICAL
         3  652030881  652030880         54          1          4        525          4           20619321 NO  LOGICAL
         4  652030881  652030880         54          1          4        549          4           20620431 NO  LOGICAL
         5  652030881  652030880         54          1          4        589          3           20622009 NO  LOGICAL
         6  652030881  652030880         54          1          4        597          3           20622013 NO  LOGICAL
         7  652030881  652030880         54          1          4        605          3           20622017 NO  LOGICAL
         8  652030881  652030880         54          1          4        613          3           20622021 NO  LOGICAL
         9  652030881  652030880         54          1          4        621          3           20622025 NO  LOGICAL
        10  652030881  652030880         54          1          4        637         12           20622285 NO  LOGICAL
        11  652030881  652030880         54          1          4        650          8           20622289 NO  LOGICAL
        12  652030881  652030880         54          1          4        669         12           20622293 NO  LOGICAL
        13  652030881  652030880         54          1          4        682          8           20622297 NO  LOGICAL
        14  652030881  652030880         54          1          4        701         12           20622309 NO  LOGICAL
        15  652030881  652030880         54          1          4        714          8           20622311 NO  LOGICAL
        16  652030881  652030880         54          1          4        733         12           20622317 NO  LOGICAL
        17  652030881  652030880         54          1          4        746          8           20622321 NO  LOGICAL
        18  652030881  652030880         54          1          4        765         12           20622331 NO  LOGICAL
        19  652030881  652030880         54          1          4        778          8           20622333 NO  LOGICAL

So, instead of having one error message line per corrupt block with a reference to a dba (database block address, used oracle internal) or rdba (relative database block address) we now have one single line for all corrupt blocks that are ‘neighbours' to each other.

We also see what kind of corruption was found - in this case ‘LOGICAL' which means, that the data has been marked corrupt for the sake of data consistency. In this example it was due to a recovery of NOLOGGING blocks.
Furthermore, we see at which SCN the block was marked corrupt, which may help to find out up to which point-in-time changes could have happened to that block:

SQL> select scn_to_timestamp (corruption_change#) from v$copy_corruption where recid=1;

SCN_TO_TIMESTAMP(CORRUPTION_CHANGE#)
---------------------------------------------------------------------------
14-APR-08 12.52.05.000000000 PM

Hint #3 - Materialize DBA_EXTENTS

Once we know all corrupted blocks it is necessary to figure out, which objects are stored in them.

Usually this is done with a query against the DBA_EXTENTS view. This is fine as long as you've only a small number of corrupt blocks.
But as soon as you're faced with several hundred blocks running the queries one by one, this can be very time wasting.

The time is usually spent a) in setting up the statements from the DBV output and b) waiting for the query to finish as the DBA_EXTENTS view is known to be not the fastest one on earth.

We will try to speed up both a) and b).

Point b) is a bit easier so let's start with that. The DBA_EXTENTS view is notoriously slow for different reasons.
Since we don't need the most current version of the view for each select we're goin to run, we can materialize the table and index it:

SQL> create table TMP_EXTENTS tablespace SAPDATA as
2     (select owner, segment_name, partition_name, segment_type, file_id, block_id, blocks
3      from dba_extents where rownum<1);
Table created.
Elapsed: 00:00:00.39
SQL> insert /*+append*/ into tmp_extents
2      (select owner, segment_name, partition_name, segment_type, file_id, block_id, blocks
3       from dba_extents);
4695 rows created.
SQL> create index i_tmp_ext on tmp_extents ( file_id, block_id, blocks) tablespace sapdata;
Index created.
Elapsed: 00:00:00.34
SQL> analyze table tmp_extents compute statistics;
Table analyzed.
Elapsed: 00:00:00.53

As you see this takes some time, but it's still quicker than running hundreds of queries against DBA_EXTENTS. Let's check that...

Original Query (from note #365481):
SQL> select owner, segment_name, partition_name, segment_type, block_id, blocks
2  from dba_extents
3  where (525 between
4         block_id and (block_id + blocks - 1))
5  and file_id = 4
6  and rownum < 2;
no rows selected
Elapsed: 00:00:09.99
Query against the TMP_EXTENTS:
SQL> select owner, segment_name, partition_name, segment_type, block_id, blocks
2  from tmp_extents
3  where (525 between
4         block_id and (block_id + blocks - 1))
5  and file_id = 4
6  and rownum < 2;
no rows selected
Elapsed: 00:00:00.02

In this case no segment was found and the block is not a tablespace header block, therefore the block in question is freespace. Freespace corruptions is usually nothing to worry about, since the blocks are reformatted the next time they're used.

ATTENTION: Be aware that the TMP_EXTENTS won't get updated ... so make sure you rebuild it before using it. And afterwards: don't forget to drop it again!

To make point a) quicker see the next hint ...

Hint #4 - Use SQL to get a list of corrupt objects

After we have materialized the DBA_EXTENTS and used RMAN to fill the V$BACKUP_CORRUPTION it would not be too clever to start running the statement (s.a.) for each block one for one.
If you use DBV to verify your database the following won't work!

Instead we can have all the blocks looked up at once.
Just join both V$BACKUP_CORRUPTION and TMP_EXTENTS like this:

SQL> set linesize 130
SQL> set pagesize 100
SQL> col owner for a8
SQL> col segment_name for a20
SQL> col partition_name for a20
SQL> select cc.recid, te.owner, te.segment_name, te.partition_name, te.segment_type, te.block_id, te.blocks
2  from tmp_extents te,
3       v$backup_corruption cc
4  where (cc.block# between te.block_id and (te.block_id + te.blocks - 1) )
5* and cc.file# = te.file_id
     RECID OWNER    SEGMENT_NAME         PARTITION_NAME       SEGMENT_TYPE         BLOCK_ID     BLOCKS
---------- -------- -------------------- -------------------- ------------------ ---------- ----------
         3 SAPR3    F_DATA_BM_TIME       P6                   INDEX PARTITION           521          8
         2 SAPR3    F_DATA_BM_TIME       P6                   INDEX PARTITION           521          8
         1 SAPR3    F_DATA_BM_TIME       P6                   INDEX PARTITION           521          8
         4 SAPR3    F_DATA_BM_CW         P3                   INDEX PARTITION           545          8
         5 SAPR3    F_DATA_BM_PACKET     P2                   INDEX PARTITION           585          8
         6 SAPR3    F_DATA_BM_PACKET     P3                   INDEX PARTITION           593          8
         7 SAPR3    F_DATA_BM_PACKET     P4                   INDEX PARTITION           601          8
         8 SAPR3    F_DATA_BM_PACKET     P5                   INDEX PARTITION           609          8
         9 SAPR3    F_DATA_BM_PACKET     P6                   INDEX PARTITION           617          8
        10 SAPR3    F_DATA_INDEX1        P2                   INDEX PARTITION           633          8
        11 SAPR3    F_DATA_INDEX1        P2                   INDEX PARTITION           649          8
        12 SAPR3    F_DATA_INDEX1        P3                   INDEX PARTITION           665          8
        13 SAPR3    F_DATA_INDEX1        P3                   INDEX PARTITION           681          8
        14 SAPR3    F_DATA_INDEX1        P4                   INDEX PARTITION           697          8
        15 SAPR3    F_DATA_INDEX1        P4                   INDEX PARTITION           713          8
        16 SAPR3    F_DATA_INDEX1        P5                   INDEX PARTITION           729          8
        17 SAPR3    F_DATA_INDEX1        P5                   INDEX PARTITION           745          8
        18 SAPR3    F_DATA_INDEX1        P6                   INDEX PARTITION           761          8
        19 SAPR3    F_DATA_INDEX1        P6                   INDEX PARTITION           777          8
19 rows selected.

If we just want to see one line per segment - no problem:

SQL> select distinct te.owner, te.segment_name, te.partition_name, te.segment_type
2  from tmp_extents te,
3       v$backup_corruption cc
4  where (cc.block# between te.block_id and (te.block_id + te.blocks - 1) )
5  and cc.file# = te.file_id
6  order by te.owner, te.segment_name, te.partition_name, te.segment_type;
OWNER    SEGMENT_NAME         PARTITION_NAME       SEGMENT_TYPE
-------- -------------------- -------------------- ------------------
SAPR3    F_DATA_BM_CW         P3                   INDEX PARTITION
SAPR3    F_DATA_BM_PACKET     P2                   INDEX PARTITION
SAPR3    F_DATA_BM_PACKET     P3                   INDEX PARTITION
SAPR3    F_DATA_BM_PACKET     P4                   INDEX PARTITION
SAPR3    F_DATA_BM_PACKET     P5                   INDEX PARTITION
SAPR3    F_DATA_BM_PACKET     P6                   INDEX PARTITION
SAPR3    F_DATA_BM_TIME       P6                   INDEX PARTITION
SAPR3    F_DATA_INDEX1        P2                   INDEX PARTITION
SAPR3    F_DATA_INDEX1        P3                   INDEX PARTITION
SAPR3    F_DATA_INDEX1        P4                   INDEX PARTITION
SAPR3    F_DATA_INDEX1        P5                   INDEX PARTITION
SAPR3    F_DATA_INDEX1        P6                   INDEX PARTITION
12 rows selected.

That's a lot more convenient - isn't it?

Hint #5 - Rebuild NOLOGGING Indexes half-automatically

Many BW users encounter corruption errors like DBV-200 or DBV-201 after they recover index-tablespaces. The reason for that is the fact that the NOLOGGING feature is usually used by BI to speed up things a bit. The downside is of course that the indexes need to be recreated if they had been build after the last data backup was taken.

There are several ways to accomplish that, but the most simple one is: use the BRTools to restore and recover your database.´
In the current versions the BRTools are aware of the NOLOGGING feature and what it implicates. So BRRECOVER tries to figure out which BW-indexes need to be rebuild and starts just that.

Another way is to create the rebuild statements for those Indexes that have been created after the last data backup and that are either NOLOGGING and/or have local partitions that are flagged NOLOGGING.
That could be done with a statement similar to this:

!! ATTENTION: STATEMENT CORRECTED !! (17.06.2008)

select distinct 'alter index "'|| o.owner || '"."' ||
                o.object_name || '" rebuild ' ||
                decode(i.partition_name, NULL, '', ' PARTITION "'||
                i.partition_name ||'"' )
                ||' online nologging;' as rebuild_list
from (select owner, object_name, subobject_name
      from dba_objects
      where
        (( object_type = 'INDEX PARTITION'
            and  subobject_name is not null)
         or (object_type ='INDEX'
            and  subobject_name is null))
      and  created > to_date('2008-04-14-11:58:00',
                                'YYYY-MM-DD-HH24:MI:SS')
      ) o,
     (select id.owner, id.index_name, ip.partition_name
      from     dba_indexes  id,
               dba_ind_partitions ip
      where   'NO' in (ip.logging, id.logging)
      and id.owner = ip.index_owner (+)
      and id.index_name  = ip.index_name (+)) i
     where  o.owner          = i.owner
       and  o.object_name    = i.index_name
       and  (  o.subobject_name = i.partition_name
            or o.subobject_name is null);

REBUILD_LIST
----------------------------------------------------------------------------------
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P5" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P5" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P6" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P4" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P2" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P3" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P1" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P6" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P1" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P2" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P1" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P6" online nologging;
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P6" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P2" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P3" online nologging;
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P2" online nologging;
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P4" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P4" online nologging;
alter index "SAPR3"."F_DATA_BM_CW" rebuild PARTITION "P3" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P4" online nologging;
alter index "SAPR3"."F_DATA_INDEX1" rebuild PARTITION "P5" online nologging;
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P3" online nologging;
alter index "SAPR3"."F_DATA_BM_TIME" rebuild PARTITION "P1" online nologging;
alter index "SAPR3"."F_DATA_BM_PACKET" rebuild PARTITION "P5" online nologging;
24 rows selected.

Note: '2008-04-14-11:58:00' is the timestamp of the last data backup taken.
You can just copy and paste the commands or have them written to a spool file and start this as a script.

Hint #6 - get prepared (see hint #1)

I've shown you a few things that worked out for me when solving customer messages.
So go ahead and try out, if they work for you as well.
Try to create some corruptions on your test database e.g. restore the backup of the BW instance from two weeks ago. Try to recover it, take timings (set timing on works fine in sqlplus!), try to make it work.

DISCLAIMER:

The things I showed you here are for educational purposes only. Should you encounter corruptions in your production databases please do open a support message.´
The supported procedures to handle corruptions are described in these notes:

#540463 - FAQ: Consistency Checks + Block Corruptions
#365481 - Block corruptions
#023345 - Consistency check of ORACLE database

Once you want to to something different (like the techniques described above), please do check with SAP support before implementing it.

Best regards and happy corruption handling,

Lars

2 Comments