Pages

Sunday, August 12, 2018

ORA 1578 - Block Corruption causing multiple Error Messages

============================
General
============================
Huge trace files re generated from archiver process.
When checking alert.log, there are these messages:
ORA-01578; ORACLE data block corrupted


Other then that, no other error messages are coming.


How to fix this issue?

============================
Errors
============================
Errors in alert.log

Thu Aug 09 10:29:26 2018
DDE: Problem Key 'ORA 1578' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes
Thu Aug 09 10:31:07 2018
Thread 1 advanced to log sequence 200953 (LGWR switch)
  Current log# 1 seq# 200953 mem# 0: /oracle_db/db1/db_igt/ora_redo_01_a.rdo
Thu Aug 09 10:33:48 2018
Thread 1 advanced to log sequence 200954 (LGWR switch)
  Current log# 2 seq# 200954 mem# 0: /oracle_db/db1/db_igt/ora_redo_02_a.rdo
Thu Aug 09 10:36:29 2018
Thread 1 advanced to log sequence 200955 (LGWR switch)
  Current log# 3 seq# 200955 mem# 0: /oracle_db/db1/db_igt/ora_redo_03_a.rdo
Thu Aug 09 10:39:10 2018
Thread 1 advanced to log sequence 200956 (LGWR switch)
  Current log# 1 seq# 200956 mem# 0: /oracle_db/db1/db_igt/ora_redo_01_a.rdo
Thu Aug 09 10:39:26 2018
DDE: Problem Key 'ORA 1578' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes
Thu Aug 09 10:41:53 2018
Thread 1 advanced to log sequence 200957 (LGWR switch)
  Current log# 2 seq# 200957 mem# 0: /oracle_db/db1/db_igt/ora_redo_02_a.rdo
Thu Aug 09 10:44:33 2018
Thread 1 advanced to log sequence 200958 (LGWR switch)
  Current log# 3 seq# 200958 mem# 0: /oracle_db/db1/db_igt/ora_redo_03_a.rdo
Thu Aug 09 10:47:10 2018
Thread 1 advanced to log sequence 200959 (LGWR switch)
  Current log# 1 seq# 200959 mem# 0: /oracle_db/db1/db_igt/ora_redo_01_a.rdo
Thu Aug 09 10:49:27 2018
DDE: Problem Key 'ORA 1578' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes


============================
General
============================

Definitions.
PHYSICAL corruption -  a Block could not be written
Logical corruption -  a Block could be written, but cannot be updated
Soft Corrupt - A Logical Corrupt block may be marked as Logical corruption

A. Soft Corrupt Blocks - Definition and information (Doc ID 1496934.1)

PURPOSE
Oracle may mark a block as Soft Corrupt.  
This document explains the definition of Soft Corrupt Block and the situations and behaviour when a block is marked as Soft Corrupt. 

DETAILS

What is a Soft Corrupt Block?
A Software Corrupt Block is:

1) Former LOGICAL corrupt block marked as formally corrupt or


2) A block that was attempted to be recovered in the buffer cache and could not be recovered. 

   The next message is indication of marking a block as soft corrupt in memory because the automatic block recover could not recover the block:

When a block is marked as Soft Corrupt?

When db_block_checking is enabled, Oracle soft corrupts an already corrupt block when modifying it (the block before image is already corrupt meaning that the block may be already corrupt on DISK).  
This only applies for LOGICAL corruptions as with PHYSICAL corruptions the block is detected as corrupt during read and does not arrive to the checking code where a modification takes place.
When automatic block recover could not recover the block after a process failed during block modification.  "Block recovery logically corrupted file .. block .." message appears in the alert.log
dbms_repair.FIX_CORRUPT_BLOCKS can also be used to mark a LOGICAL corrupt block as Soft Corrupt.
Media recovery can also soft corrupt a block if the block is already PHYSICAL corrupt
Subsequent block reads then produce ORA-1578 instead of internal errors.

Why is a block marked as Soft Corrupt?

A block is marked as soft corrupt to prevent other internal errors and to prevent further corruptions in the database.  
When a block is logically corrupt it may cause ORA-600 / ORA-7445 errors and may produce further corruptions for other blocks in the Oracle buffer cache memory when the corrupt block is modified.  Marking the block as soft corrupt prevents DML SQL statements to modify the block.  When a block is marked as soft corrupt, SQL statements fail with error ORA-1578 when the block is read.
A block is also marked as Soft Corrupt so it can be skipped when using event 10231 or when using procedure dbms_repair.SKIP_CORRUPT_BLOCKS.
When using TDE (Transparent Data Encryption - Tablespace encryption), a block can be marked as soft corrupt if an incorrect or invalid wallet is used.  Note that for this case the block is only soft corrupt in Memory as long as the invalid wallet is used (dbverify does not produce an error as the block is not soft corrupt on disk).


What is the Oracle behavior for a Soft Corrupt Block?

When a block is marked as soft corrupt, SQL statements fail with error ORA-1578 when reading the block unless dbms_repair.SKIP_CORRUPT_BLOCKS or event 10231/10233 are used.
RMAN backups do not fail with Soft Corrupt blocks.  RMAN ignores 'soft corrupt' blocks during backup without setting the MAXCORRUPT clause.  Soft Corrupt blocks do not count in the MAXCORRUPT clause.
Media Recovery (rollforward) ignores Soft Corrupt blocks.  In general media RECOVERY ignores Physical corrupt blocks and Soft Corrupt corrupt blocks.  When the block is logically corrupt Media RECOVERY may fail with unexpected errors. When the block is PHYSICAL corrupt, media recovery marks the block as soft corrupt and recovery continues with no errors (Corrupt block is skipped).
RMAN validate does not report a soft corrupt block in the trace file but in v$database_block_corruption. Example:
SQL> select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO

---------- ---------- ---------- ------------------ ---------
         1      59744          1                  0 CORRUPT
DBVerify reports the soft corrupt block with DBV-200 error:

DBV-00200: Block, dba <rdba>, already marked corrupted


============================
Options to identify the Corruption
============================
Option A - using IPS
Using IPS it is possible to use a CLI commands, that would identify the corrupted blocks, general a zip file with all related information, and then use oracle tool to read the generated zip.

>adrci

ADRCI: Release 11.1.0.7.0 - Production on Sun Aug 12 06:57:59 2018

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

ADR base = "/software/oracle"

adrci>  SHOW PROBLEM

ADR Home = /software/oracle/diag/rdbms/igt/igt:
*************************************************************************
PROBLEM_ID           PROBLEM_KEY                                                 LAST_INCIDENT        LASTINC_TIME
-------------------- ----------------------------------------------------------- -------------------- ----------------------------------------
3                    ORA 1578                                                    915474               2018-08-01 12:27:46.962289 +00:00
2                    ORA 600 [ORA-00600: internal error code, arguments: [3020], 715347               2018-06-17 14:55:00.738747 +00:00
1                    ORA 600 [3020]                                              715379               2018-06-17 14:54:59.632741 +00:00
3 rows fetched



adrci>IPS CREATE PACKAGE PROBLEM 3
Created package 1 based on problem id 3, correlation level typical

adrci> IPS GENERATE PACKAGE 3 IN /software/oracle/oracle/scripts

At this point, the zip file is created.
In addition, it is possible to see older incidents.

adrci> show incident

ADR Home = /software/oracle/diag/rdbms/igt/igt:
*************************************************************************
INCIDENT_ID          PROBLEM_KEY                                                 CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
913422               ORA 1578                                                    2018-08-01 11:07:53.855202 +00:00
913414               ORA 1578                                                    2018-08-01 11:08:15.318826 +00:00
913382               ORA 1578                                                    2018-08-01 11:08:33.824501 +00:00
913366               ORA 1578                                                    2018-08-01 11:08:31.296714 +00:00
913334               ORA 1578                                                    2018-08-01 11:08:23.704473 +00:00
910674               ORA 1578                                                    2018-06-26 05:51:44.996466 +00:00
910634               ORA 1578                                                    2018-06-26 05:47:56.707714 +00:00
910626               ORA 1578                                                    2018-06-26 05:50:22.140367 +00:00
910603               ORA 1578                                                    2018-06-26 05:51:41.576405 +00:00
910602               ORA 1578                                                    2018-06-26 05:50:37.070947 +00:00
908392               ORA 1578                                                    2018-06-26 03:44:54.058518 +00:00
908377               ORA 1578                                                    2018-06-26 03:45:20.038519 +00:00
908376               ORA 1578                                                    2018-06-26 03:43:30.007570 +00:00
905705               ORA 1578                                                    2018-06-25 16:41:19.797776 +00:00
905698               ORA 1578                                                    2018-06-25 16:40:51.885913 +00:00
905697               ORA 1578                                                    2018-06-25 16:40:10.247187 +00:00
905673               ORA 1578                                                    2018-06-25 16:41:51.501605 +00:00
905657               ORA 1578                                                    2018-06-25 16:40:23.967756 +00:00
903068               ORA 1578                                                    2018-06-25 07:29:38.796384 +00:00
903044               ORA 1578                                                    2018-06-25 07:29:15.884086 +00:00
903036               ORA 1578                                                    2018-06-25 07:29:57.106602 +00:00
903028               ORA 1578                                                    2018-06-25 07:28:12.459483 +00:00
903020               ORA 1578                                                    2018-06-25 07:28:31.065843 +00:00
900614               ORA 1578                                                    2018-06-25 05:01:40.162611 +00:00
900606               ORA 1578                                                    2018-06-25 05:03:02.017225 +00:00
900598               ORA 1578                                                    2018-06-25 05:00:27.467744 +00:00
900590               ORA 1578                                                    2018-06-25 04:57:42.799013 +00:00
900542               ORA 1578                                                    2018-06-25 05:03:11.090605 +00:00
898223               ORA 1578                                                    2018-06-25 03:49:49.148526 +00:00
898215               ORA 1578                                                    2018-06-25 03:51:04.810199 +00:00
898184               ORA 1578                                                    2018-06-25 03:51:17.896126 +00:00
898183               ORA 1578                                                    2018-06-25 03:49:47.582633 +00:00
898167               ORA 1578                                                    2018-06-25 03:53:04.846327 +00:00
895739               ORA 1578                                                    2018-06-24 14:34:45.511852 +00:00
895731               ORA 1578                                                    2018-06-24 14:34:45.511682 +00:00
895683               ORA 1578                                                    2018-06-24 14:34:48.223735 +00:00
895659               ORA 1578                                                    2018-06-24 14:34:45.604276 +00:00
895643               ORA 1578                                                    2018-06-24 14:34:47.860703 +00:00
893209               ORA 1578                                                    2018-06-24 08:22:31.437230 +00:00
893201               ORA 1578                                                    2018-06-24 08:23:09.218288 +00:00
893193               ORA 1578                                                    2018-06-24 08:22:12.367569 +00:00
893137               ORA 1578                                                    2018-06-24 08:23:05.790164 +00:00
893121               ORA 1578                                                    2018-06-24 08:23:04.021084 +00:00
890783               ORA 1578                                                    2018-06-24 07:49:24.643469 +00:00
890743               ORA 1578                                                    2018-06-24 07:49:29.911364 +00:00
890720               ORA 1578                                                    2018-06-24 07:49:00.292962 +00:00
890719               ORA 1578                                                    2018-06-24 07:48:58.370655 +00:00
890711               ORA 1578                                                    2018-06-24 07:48:50.150749 +00:00
888363               ORA 1578                                                    2018-06-24 07:23:33.911818 +00:00
888362               ORA 1578                                                    2018-06-24 07:23:26.754356 +00:00
50 rows fetched

I did not use the IPS tool eventually.
Because Option B was simpler.



============================
Options to identify the Corruption
============================
Using SQL

SELECT FILE#, 
       BLOCK#, 
       BLOCKS, 
       TO_CHAR(CORRUPTION_CHANGE#) CORRUPTION_CHANGE#, 
       CORRUPTION_TYPE 
  FROM V$DATABASE_BLOCK_CORRUPTION;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE#        CORRUPTION_TYPE
----- ------ ------ ------------------------- --------------------
    5 110573      1 15056799588275            CORRUPT


SET PAGESIZE 2000
SET LINESIZE 280 

SELECT e.owner, 
       e.segment_type, 
       e.segment_name, 
       e.partition_name, 
       c.file#,
       greatest(e.block_id, c.block#) corr_start_block#,
       least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#,
       least(e.block_id+e.blocks-1, c.block#+c.blocks-1),
- greatest(e.block_id, c.block#) + 1 blocks_corrupted
       corruption_type description
  FROM DBA_EXTENTS e, 
       V$DATABASE_BLOCK_CORRUPTION c
 WHERE e.file_id = c.file#
   AND e.block_id <= c.block# + c.blocks - 1
   AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, 
       s.segment_type, 
       s.segment_name, 
       s.partition_name, 
       c.file#,
       header_block corr_start_block#,
       header_block corr_end_block#,
       1 blocks_corrupted,
       corruption_type||' Segment Header' description
  FROM DBA_SEGMENTS S, 
       V$DATABASE_BLOCK_CORRUPTION C
 WHERE s.header_file = c.file#
   AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, 
       null segment_type, 
       null segment_name, 
       null partition_name, 
       c.file#,
       greatest(f.block_id, c.block#) corr_start_block#,
       least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#,
       least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
 - greatest(f.block_id, c.block#) + 1 blocks_corrupted,
       'Free Block' description
  FROM DBA_FREE_SPACE F, 
       V$DATABASE_BLOCK_CORRUPTION C
 WHERE f.file_id = c.file#
   AND f.block_id <= c.block# + c.blocks - 1
   AND f.block_id + f.blocks - 1 >= c.block#
ORDER BY file#, corr_start_block#;

This query pointed out, that the corruption was in an index of an application table

============================
Options to fix Corruption
============================
Option A. Rebuild Index Online
In Case the index is corrupted, best option would be to use ALTER INDEX my_index REBUILD ONLINE.
When using REBULD INDEX - Oracle would use data in the Index to the Rebuild.
But when using ALTER INDEX REBUILD ONLINE Oracle would use data in the Table to the Rebuild.



Option B. IPS - Incident Packaging Service
Oracle has introduced several new Diagnosability features in 11g. 
One of them is Incident Packaging Service (IPS). 
IPS uses rules to correlate all relevant dumps and traces from ADR for a given problem and allows you to package them to ship to Oracle Support.

No comments:

Post a Comment