Pages

Monday, December 24, 2018

Large Incident files with error: kksfbc-new-child-thresh-exceeded

===========================
General
===========================
Large incident files with error kksfbc-new-child-thresh-exceeded are generated under incident/ folder

=========================
Oracle Documentation

=========================

This is a is known bug:
ORA-600 [kksfbc-new-child-thresh-exceeded] Error In 11.1 (Doc ID 1200693.1)

CAUSE
This is most likely caused by unpublished Bug 7626014 fixed in 11.2 and PSU 11.1.0.7.5. 
The error occurs against a heavily used cursor which has multiple children with high concurrency.  
This problem can also cause unnecessary additional child cursors to be created.

SOLUTION
To fix this issue apply one of the following solutions:

Either Upgrade to 11.2.0.1, or

Install Patch 7626014 on top of 11.1.0.7.

Install the 11.1.0.7 PSU #5 or higher like 11.1.0.7 PSU #6 which contain the fix for this defect.

On Windows, install our 11.1.0.7.BP 22 or above - preferably the most recent one.

If this fix does not exist on top of the latest available patchset for your platform, please log a new Service Request and provide evidences that you are encountering this issue and request for an interim patch for your platform.

Note:

This fix alters the method used if a session cannot find an existing child cursor that can be shared. 

With this fix, builds of a new child cursor are serialized.


===========================
Example
===========================
*** 2018-12-23 17:44:55.692
*** SESSION ID:(440.41588) 2018-12-23 17:44:55.692
*** CLIENT ID:() 2018-12-23 17:44:55.692
*** SERVICE NAME:(SYS$USERS) 2018-12-23 17:44:55.692
*** MODULE NAME:(JDBC Thin Client) 2018-12-23 17:44:55.692
*** ACTION NAME:() 2018-12-23 17:44:55.692
Dump continued from file: /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_17188.trc
ORA-00600: internal error code, arguments: [kksfbc-new-child-thresh-exceeded], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 212913 (ORA 600 [kksfbc-new-child-thresh-exceeded]) ========

*** 2018-12-23 17:44:55.692
----- Current SQL Statement for this session (sql_id=6ug9rkhbfzgsa) -----
MERGE into SGA_W_PSMS_SUBSCRIBER o using (select :1  MSISDN from dual) o1 on (o.MSISDN = o1.MSISDN) when matched then update set o.IMSI= :2 ,o.CURRENT_COUNTRY= :3 ,o.CURRENT_NETWORK_ID = :4  , o.CURRENT_VLR = :5  ,o.CURRENT_MSC = :6  ,o.TS_FIRST_REGISTRATION = :7  ,o.TS_E_LAST_LU = :8  ,o.IS_ROAMING = :9 , o.CURRENT_SGSN = :10   , o.PREVIOUS_VLR=:11 , o.PREVIOUS_SGSN=:12 , o.TS_E_LAST_CANCEL=:13 ,o.TS_VISIT_START=:14 , o.TS_VISIT_END=:15 , o.TS_LAST_MODIFIED=:16 ,o.MAP_CAMEL_SERVICE_KEY=:17 , o.SCENARIO_HISTORY =:18 , o.RE_ACTIVATION_HISTORY =:19 , o.CAMPAIGN_HISTORY =:20  , o.PREVIOUS_VISIT_END_TIME=:21 ,o.CURRENT_SCENARIO_START_TIME=:22 ,o.SCENARIO_ID=:23 ,o.SPARX_BW_LIST_CHECK_RESULT=:24 ,o.VISIT_ID=:25 , o.LOCATION_GROUP_HISTORY=:26 , o.MVNO_HISTORY=:27 , o.IS_SUBSCRIBER=:28 , o.IS_LTE_USER =:29 ,  o.CURRENT_LTE_MCC_MNC=:30 , o.LTE_CANCELLATION_TYPE=:31 , o.LTE_SUBSCRIBER_STATUS=:32 , o.TS_E_LAST_LU_LTE=:33 , o.TS_E_LAST_CANCEL_LTE=:34 , o.CURRENT_MME=:35 , o.CURRENT_TECHNOLOigt_ora_17188_i212913.trc 

incdir_212889

Dump continued from file: /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_17164.trc
ORA-00600: internal error code, arguments: [kksfbc-new-child-thresh-exceeded], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 212889 (ORA 600 [kksfbc-new-child-thresh-exceeded]) ========

*** 2018-12-23 15:54:12.188
----- Current SQL Statement for this session (sql_id=6ug9rkhbfzgsa) -----
MERGE into SGA_W_PSMS_SUBSCRIBER o using (select :1  MSISDN from dual) o1 on (o.MSISDN = o1.MSISDN) when matched then update set o.IMSI= :2 ,o.CURRENT_COUNTRY= :3 ,o.CURRENT_NETWORK_ID = :4  , o.CURRENT_VLR = :5  ,o.CURRENT_MSC = :6  ,o.TS_FIRST_REGISTRATION = :7  ,o.TS_E_LAST_LU = :8  ,o.IS_ROAMING = :9 , o.CURRENT_SGSN = :10   , o.PREVIOUS_VLR=:11 , o.PREVIOUS_SGSN=:12 , o.TS_E_LAST_CANCEL=:13 ,o.TS_VISIT_START=:14 , o.TS_VISIT_END=:15 , o.TS_LAST_MODIFIED=:16 ,o.MAP_CAMEL_SERVICE_KEY=:17 , o.SCENARIO_HISTORY =:18 , o.RE_ACTIVATION_HISTORY =:19 , o.CAMPAIGN_HISTORY =:20  , o.PREVIOUS_VISIT_END_TIME=:21 ,o.CURRENT_SCENARIO_START_TIME=:22 ,o.SCENARIO_ID=:23 ,o.SPARX_BW_LIST_CHECK_RESULT=:24 ,o.VISIT_ID=:25 , o.LOCATION_GROUP_HISTORY=:26 , o.MVNO_HISTORY=:27 , o.IS_SUBSCRIBER=:28 , o.IS_LTE_USER =:29 ,  o.CURRENT_LTE_MCC_MNC=:30 , o.LTE_CANCELLATION_TYPE=:31 , o.LTE_SUBSCRIBER_STATUS=:32 , o.TS_E_LAST_LU_LTE=:33 , o.TS_E_LAST_CANCEL_LTE=:34 , o.CURRENT_MME=:35 , o.CURRENT_TECHNOLOigt_ora_17164_i212889.trc 

===========================
Solution
===========================
As a workaround, I have tried to Flush the Shared pool for the sql_id entries in question

Per some tech-notes, it is possible to flush specific sql_id.

But when running this code, I could not see that the child cursors was flushed.


As they still are present in table V$SQL_BIND_CAPTURE.

As a workaround, I have Flushed Shared Pool, and this has resolved the issue.

SELECT sql_id, count(*) AS child_records
  FROM V$SQL_BIND_CAPTURE
 GROUP BY sql_id
HAVING  count(*) > 400
ORDER BY count(*) DESC;

SQL_ID        CHILD_RECORDS
------------- -------------
6ug9rkhbfzgsa         52584
akh1m5g2waych         13254
7dh0ddg65qpbv          3528
b8ya3z8kq3c6r          2086
adbf1h2dx6m81          1800
7ng34ruy5awxq           880
7y6yta0cajvr5           735
5yv0nj5hwfz5d           725
fwb25vcr3c2wk           688
dwypdxsjg4juq           624
2q93zsrvbdw48           478
6aq34nj2zb2n7           477

SELECT ADDRESS, HASH_VALUE from V$SQLAREA 
WHERE SQL_Id IN ('6ug9rkhbfzgsa','akh1m5g2waych','7dh0ddg65qpbv','b8ya3z8kq3c6r','adbf1h2dx6m81');

ADDRESS          HASH_VALUE
---------------- ----------
0000000257DEEAE0 2614316289
0000000253EC17B0 3428537723
0000000253AE12A8 3318053264
0000000253D3EB30  627159255
0000000257DAF2C0  384810762

BEGIN
  DBMS_SHARED_POOL.PURGE ('0000000257DEEAE0,2614316289','C');
  DBMS_SHARED_POOL.PURGE ('0000000253EC17B0,3428537723','C');
  DBMS_SHARED_POOL.PURGE ('0000000253AE12A8,3318053264','C');
  DBMS_SHARED_POOL.PURGE ('0000000253D3EB30,627159255','C');
  DBMS_SHARED_POOL.PURGE ('0000000257DAF2C0,384810762','C');
END;
/  

SELECT sql_id, count(*) AS child_records
  FROM V$SQL_BIND_CAPTURE
 GROUP BY sql_id
HAVING  count(*) > 400
ORDER BY count(*) DESC;

SQL_ID        CHILD_RECORDS
------------- -------------
6ug9rkhbfzgsa         52584
akh1m5g2waych         13254
7dh0ddg65qpbv          3528
b8ya3z8kq3c6r          2086
adbf1h2dx6m81          1800
7ng34ruy5awxq           880
7y6yta0cajvr5           735
5yv0nj5hwfz5d           725
fwb25vcr3c2wk           688
dwypdxsjg4juq           624
2q93zsrvbdw48           478
6aq34nj2zb2n7           477

Since this did not wirk, I have Flushed the Shared Pool.
Flush Shared Pool has resolved the issue.

ALTER SYSTEM FLUSH SHARED_POOL;




Sunday, December 9, 2018

Oracle SCN - System Change Number

==============================
Oracle SCN - System Change Number
==============================
SCN is an internal time stamp used by Oracle Database. 

Oracle Database uses SCNs to mark the SCN before which all changes are known to be on disk so that recovery avoids applying unnecessary redo. 

The database also uses SCNs to mark the point at which no redo exists for a set of data so that recovery can stop.

SCNs occur in a monotonically increasing sequence. 

Oracle Database can use an SCN like a clock because an observed SCN indicates a logical point in time and repeated observations return equal or greater values. 
If one event has a lower SCN than another event, then it occurred at an earlier time with respect to the database.
Several events may share the same SCN, which means that they occurred at the same time with respect to the database.

Every transaction has an SCN. 

For example, if a transaction updates a row, then the database records the SCN at which this update occurred. 

Other modifications in this transaction have the same SCN. 

When a transaction commits, the database records an SCN for this commit.

Oracle Database increments SCNs in the system global area (SGA). 

When a transaction modifies data, the database writes a new SCN to the undo data segment assigned to the transaction. 

The log writer process then writes the commit record of the transaction immediately to the online redo log. 

The commit record has the unique SCN of the transaction. 

Oracle Database also uses SCNs as part of its instance recovery and media recovery mechanisms.

==============================
At commit time
==============================
A system change number (SCN) is generated for the COMMIT.

The internal transaction table for the associated undo tablespace records that the transaction has committed. 
The corresponding unique SCN of the transaction is assigned and recorded in the transaction table. 

The log writer (LGWR) process writes remaining redo log entries in the redo log buffers to the online redo log and writes the transaction SCN to the online redo log. 

This atomic event constitutes the commit of the transaction.

==============================
How to find current SCN
==============================
Option 1.
SELECT CURRENT_SCN FROM V$DATABASE;
294565574

Option 2.

SELECT TIMESTAMP_TO_SCN(SYSDATE) FROM DUAL;
294565581

Option 3.
SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER AS SCN FROM DUAL;
ORA-00904: : invalid identifier


oracle@my_server:~>% sqlplus / as sysdba
SQL> GRANT EXECUTE ON DBMS_FLASHBACK to RI_CENTRAL;
Grant succeeded.

sqlplus RI_CENTRAL/RI_CENTRAL@ora_inst
SELECT DBMS_FLASHBACK.get_system_change_number AS SCN FROM DUAL;

       SCN
----------
 294565682

==============================
SCN to Time
==============================
SQL> SELECT SCN_TO_TIMESTAMP(294565682) AS SCN_TIME FROM DUAL;

SCN_TIME
---------------------------------
09-DEC-18 04.34.28.000000000 PM

SQL> SELECT TO_CHAR(SCN_TO_TIMESTAMP(294565682),'YYYYMMDD hh24:mi:ss') AS SCN_TIME FROM DUAL;

SCN_TIME
------------------------------

20181209 16:34:28

==============================
Script to get current SCN

==============================


CURRENT_SCN=`echo -e "set head off term off echo off feed off ver off space 1 linesize 1000\ncol  DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER() format 999999999999999999999999999999999999999\nselect DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER() from dual;" | sqlplus -S $MY_USER/$MY_PASS@$MY_CONNECTSTR`


echo CURRENT_SCN is: $CURRENT_SCN  

Sunday, November 11, 2018

Move Datafile from Location to a new Location

==================================
General
==================================
Example of moving Datafile from Location A to Location B

==================================
Example
==================================
Steps
Check Database Archive Mode
SQL> SELECT log_mode FROM V$DATABASE;

LOG_MODE
---------------------------
ARCHIVELOG

==================================
Database is in ARCHIVELOG mode
==================================
- Take the datafile offline.
ALTER DATABASE DATAFILE '/old/datafile/location/datafile_01.dbf' OFFLINE;

SQL>Database Altered

- Copy or move the datafile to its new location.
dd if=/old/location/datafile_01.dbf of=new/location/datafile_01.dbf bs='4096
or
cp -p /old/location/datafile_01.dbf /new/location/datafile_01.dbf

- Point Database to the new datafile location.
ALTER DATABASE RENAME FILE '/old/location/datafile_01.dbf' TO 'new/location/datafile_01.dbf';

SQL>Database altered.

- Recover new datafile.

RECOVER DATAFILE '/new/location/datafile_01.dbf';

SQL>Media recovery complete.

- Bring Datafile Online.

ALTER DATABASE DATAFILE 'new/location/datafile_01.dbf' ONLINE;

SQL>Database altered.

Delete datafile from old location
rm -f /old/location/datafile_01.dbf

==================================
Database is in NOARCHIVELOG mode
==================================
- Shutdown the database
SHUTDOWN IMMEDIATE;

- Copy or move the datafile to its new location.
dd if=/old/location/datafile_01.dbf of=new/location/datafile_01.dbf bs=4096
or
cp -p /old/location/datafile_01.dbf /new/location/datafile_01.dbf


- STARTUP MOUNT
STARTUP MOUNT;

- Rename the file:
ALTER DATABASE RENAME FILE '/old/location/datafile_01.dbf' TO '/new/location/datafile_01.dbf';

-Open The Database
ALTER DATABASE OPEN;

Thursday, October 25, 2018

Large Trace Files with Error: ORA-00600: internal error code, arguments: [ktspfupdst-1]. Rebuild Indexes.

===============================
General
===============================
Large trace file are generated from time to time.

===============================
Evidences
===============================
less /software/oracle/diag/rdbms/orainst/orainst/trace/alert_orainst.log 

Wed Oct 24 02:31:15 2018
Errors in file /software/oracle/diag/rdbms/orainst/orainst/trace/orainstorainst_ora_28448.trc  (incident=79868):
ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /software/oracle/diag/rdbms/orainst/orainst/incident/incdir_79868/orainst_ora_28448_i79868.trc
Wed Oct 24 02:31:17 2018
Trace dumping is performing id=[cdmp_20181024023117]

less /software/oracle/diag/rdbms/orainst/orainst/incident/incdir_79868/orainst_ora_28448_i79868.trc

Dump file /software/oracle/diag/rdbms/orainst/orainst/incident/incdir_79868/orainst_ora_28448_i79868.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      col-com-1-aps-4
Release:        2.6.18-308.24.1.el5
Version:        #1 SMP Wed Nov 21 11:42:14 EST 2012
Machine:        x86_64
Instance name: orainst
Redo thread mounted by this instance: 1
Oracle process number: 83
Unix process pid: 28448, image: oracle@col-com-1-aps-4


*** 2018-10-24 02:31:15.971
*** SESSION ID:(488.25914) 2018-10-24 02:31:15.971
*** CLIENT ID:() 2018-10-24 02:31:15.971
*** SERVICE NAME:(SYS$USERS) 2018-10-24 02:31:15.971
*** MODULE NAME:(JDBC Thin Client) 2018-10-24 02:31:15.971
*** ACTION NAME:() 2018-10-24 02:31:15.971

Dump continued from file: /software/oracle/diag/rdbms/orainst/orainst/trace/orainst_ora_28448.trc
ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 79868 (ORA 600 [ktspfupdst-1]) ========

*** 2018-10-24 02:31:15.971
----- Current SQL Statement for this session (sql_id=akh1m5g2waych) -----
MERGE into SFI_CUSTOMER_PROFILE o using (select :1  KEY1 from dual) o1 on (o.KEY1 = o1.KEY1) when matched then update  SET IMSI=:2 , TS_LAST_MODIFIED=sysdate, KEY2=:3 , KEY3=:4 , MSISDN=:5 , ATTR41=:6 , ATTR42=:7 , ATTR43=:8 , ATTR44=:9 , ATTR45=:10 ,  ATTR46=:11 , ATTR47=:12 , ATTR48=:13 , ATTR49=:14 , ATTR50=:15 ,  ATTR51=:16 , ATTR52=:17 , ATTR53=:18 , ATTR54=:19 , ATTR55=:20 ,  ATTR56=:21 , ATTR57=:22 , ATTR58=:23 , ATTR59=:24 , ATTR60=:25 ,  ATTR61=:26 , ATTR62=:27 , ATTR63=:28 , ATTR64=:29 , ATTR65=:30 ,  ATTR66=:31 , ATTR67=:32 , ATTR68=:33 , ATTR69=:34 , ATTR70=:35 ,  ATTR71=:36 , ATTR72=:37 , ATTR73=:38 , ATTR74=:39 , ATTR75=:40 ,  ATTR76=:41 , ATTR77=:42 , ATTR78=:43 , ATTR79=:44 , ATTR80=:45 ,  TS_LAST_TRY_FETCH=:46 , EI_STATUS_CODE_ID=:47   when not matched then  Insert (o.KEY1, o.IMSI, o.TS_LAST_MODIFIED,o.KEY2,o.KEY3, o.MSISDN, o.ATTR41,o.ATTR42,o.ATTR43,o.ATTR44,o.ATTR45,  o.ATTR46,o.ATTR47,o.ATTR48,o.ATTR49,o.ATTR50, o.ATTR51,o.ATTR52,o.ATTR53,o.ATTR54,o.ATTR55, o.ATTR56,o.ATTR57,o.ATTR58,o.ATTR59,o.ATTR60, o.ATTR61,o.ATTR62,o.ATTR63,o.ATTR64,o.ATTR65, o.ATTR66,o.ATTR67,o.ATTR68,o.ATTR69,o.ATTR70, o.ATTR71,o.ATTR72,o.ATTR73,o.ATTR74,o.ATTR75,  o.ATTR76,o.ATTR77,o.ATTR78,o.ATTR79,o.ATTR80, o.TS_LAST_TRY_FETCH,o.EI_STATUS_CODE_ID) Values (:48 ,:49 ,sysdate,:50 ,:51 ,:52 ,:53 ,:54 ,:55 ,:56 ,:57 ,:58 ,:59 ,:60 ,:61 ,:62 ,:63 ,:64 ,:65 ,:66 ,:67 ,:68 ,:69 ,:70 ,:71 ,:72 ,:73 ,:74 ,:75 ,:76 ,:77 ,:78 ,:79 ,:80 ,:81 ,:82 ,:83 ,:84 ,:85 ,:86 ,:87 ,:88 ,:89 ,:90 ,:91 ,:92 ,:93 ,:94 )


===============================
Solution
===============================
Searching for ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], [], [], [], [], [] 

Reading the Matalink description, is seems that the  issue is related to DML operations on a table with corrupt Index.

Indeed, in this case, trace files were generated during MERGE statement.

Per Matalink, two possible solutions

1st Solution
A. Rebuilding all indexes on the updated table, and it partitions.

B. Running DBMS_SPACE_ADMIN.assm_segment_verify();

DBMS_SPACE_ADMIN.ASSM_SEGMENT_VERIFY example
BEGIN
  DBMS_SPACE_ADMIN.ASSM_SEGMENT_VERIFY ('PERFSTAT','STATS$EVENT_HISTOGRAM', 'TABLE', NULL, 10);
  END;
/
PL/SQL procedure successfully completed.

PERFSTAT - owner
STATS$EVENT_HISTOGRAM - object name
TABLE - object type
NULL - partition name
10 - stands for SEGMENT_VERIFY_DEEP - Deep verification involves consistency checks between datablocks and space metadata blocks
9 - stands for SEGMENT_VERIFY_BASIC - Basic verification involves consistency checks of space metadata.



2nd Solution
A. ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;

B. In case of error: rebuilding all indexes on the updated table, and it partitions. then validate structure with same command.

C. ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;

===============================
1st Solution
===============================

For Non Partitioned Tables:

SELECT 'ALTER INDEX '||table_owner||'.'||index_name||' REBUILD ONLINE ;' 
  FROM DBA_INDEXES 
 WHERE index_name IN 
      (SELECT index_name 
         FROM USER_INDEXES 
        WHERE table_name = 'SFI_CUSTOMER_PROFILE')
   AND index_owner = 'MY_USER';


For Partitions:

SELECT 'ALTER INDEX '||index_owner||'.'||index_name||' REBUILD PARTITION '||partition_name||';' 
  FROM DBA_IND_PARTITIONS 
 WHERE index_name IN 
       (SELECT index_name 
          FROM USER_PART_INDEXES 
         WHERE  table_name = 'SFI_CUSTOMER_PROFILE')
   AND index_owner = 'MY_USER';

===============================
2nd Solution by Example
===============================

sqlplus system/XXX@orainst

Several users own table SFI_CUSTOMER_PROFILE
For all except one, the command is successful.

SELECT 'ANALYZE TABLE '||owner||'.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE ONLINE;'
FROM  DBA_SEGMENTS WHERE segment_name = 'SFI_CUSTOMER_PROFILE' AND blocks > 10;


SQL> ANALYZE TABLE USER_A.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE ONLINE;
Table analyzed.

SQL> ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE ONLINE;
ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE ONLINE
*
ERROR at line 1:
ORA-14508: specified VALIDATE INTO table not found

Need to use syntax:
ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;

To create the table INVALID_ROWS:

@/software/oracle/111/rdbms/admin/utlvalid.sql


oracle@user:~/scripts>% less /software/oracle/111/rdbms/admin/utlvalid.sql

rem 
Rem  Copyright (c) 1990, 1995, 1996, 1998 by Oracle Corporation
Rem  NAME
REM  UTLVALID.SQL
Rem  FUNCTION
Rem    Creates the default table for storing the output of the
Rem    analyze validate command on a partitioned table
Rem  NOTES
Rem  MODIFIED
Rem     syeung     06/17/98 - add subpartition_name                            
Rem     mmonajje   05/21/96 - Replace timestamp col name with analyze_timestamp
Rem     sbasu      05/07/96 - Remove echo setting
Rem     ssamu      01/09/96 - new file utlvalid.sql
Rem

create table INVALID_ROWS (
  owner_name         varchar2(30),
  table_name         varchar2(30),
  partition_name     varchar2(30),
  subpartition_name  varchar2(30),
  head_rowid         rowid,
  analyze_timestamp  date
);


SQL> ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;
ANALYZE TABLE COL_CLARO_SPXQQ.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file


To locate the trace file
SQL> ALTER SESSION SET TRACEFILE_IDENTIFIER = 'ALEC.DBA';

SQL> ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;

 Trace file:
 /software/oracle/diag/rdbms/orainst/orainst/trace/orainst_ora_21706_ALEC.DBA
 *** 2019-05-12 08:10:20.401
*** SESSION ID:(528.12704) 2019-05-12 08:10:20.401
*** CLIENT ID:() 2019-05-12 08:10:20.401
*** SERVICE NAME:(orainst) 2019-05-12 08:10:20.401
*** MODULE NAME:(SQL*Plus) 2019-05-12 08:10:20.401
*** ACTION NAME:() 2019-05-12 08:10:20.401

Table/Index row count mismatch1?RAaDG~ODU

table 449980 : index 449979, 0
Partitioned Index root = tsn: 5 rdba: 0x0140843a


The root cause of failure is mismatch in rows between table and primary key index.
This is not a valid situation.


SQL> SELECT index_name FROM DBA_PART_INDEXES WHERE table_name = 'SFI_CUSTOMER_PROFILE' AND owner = 'USER_B';

INDEX_NAME
------------------------------
SFI_CUSTOMER_PROFILE_PK



SQL> SELECT 'ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION '||partition_name||' ONLINE; '
FROM DBA_IND_PARTITIONS 
WHERE index_name = 'SFI_CUSTOMER_PROFILE_PK' 
AND index_owner = 'USER_B';


ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P69 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P70 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P71 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P72 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P73 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P74 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P75 ONLINE;
ALTER INDEX USER_B.SFI_CUSTOMER_PROFILE_PK REBUILD PARTITION SYS_P76 ONLINE;


SQL> ALTER SESSION SET TRACEFILE_IDENTIFIER = 'ALEC.DBA_2';

SQL> ANALYZE TABLE COL_CLARO_SPXQQ.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;
ANALYZE TABLE COL_CLARO_SPXQQ.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

SQL> SELECT COUNT(*) FROM INVALID_ROWS;

  COUNT(*)
----------
         0
 
 
Trace file: /software/oracle/diag/rdbms/orainst/orainst/trace/orainst_ora_21706_ALEC.DBA_2
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      col-com-1-aps-3
Release:        2.6.18-308.24.1.el5
Version:        #1 SMP Wed Nov 21 11:42:14 EST 2012
Machine:        x86_64
Instance name: orainst
Redo thread mounted by this instance: 1
Oracle process number: 38
Unix process pid: 21706, image: oracle@col-com-1-aps-3


*** 2019-05-12 08:20:50.739
*** SESSION ID:(528.12704) 2019-05-12 08:20:50.739
*** CLIENT ID:() 2019-05-12 08:20:50.739
*** SERVICE NAME:(orainst) 2019-05-12 08:20:50.739
*** MODULE NAME:(SQL*Plus) 2019-05-12 08:20:50.739
*** ACTION NAME:() 2019-05-12 08:20:50.739


*** TRACE CONTINUED FROM FILE /software/oracle/diag/rdbms/orainst/orainst/trace/orainst_ora_21706_ALEC.KAPLAN ***

Table/Index row count mismatch1?1rEN5~2FU

table 449990 : index 449989, 0
Partitioned Index root = tsn: 5 rdba: 0x0140c5b2
orainst_ora_21706_ALEC.KAPLAN_2 (END)  


Same error as before...
The rebuild of index did not solve error.
This is the index in question:

ALTER TABLE SFI_CUSTOMER_PROFILE
ADD CONSTRAINT SFI_CUSTOMER_PROFILE_PK PRIMARY KEY (key1) 
USING INDEX LOCAL;

Lets TRUNCATE the table, and re-populate is from a backup table.
  
CREATE TABLE SFI_CUSTOMER_PROFILE_BAK AS SELECT * FROM SFI_CUSTOMER_PROFILE WHERE 1=2;
ALTER TABLE SFI_CUSTOMER_PROFILE_BAK NOLOGGING;
INSERT /*+ APPEND */ INTO SFI_CUSTOMER_PROFILE_BAK SELECT * FROM SFI_CUSTOMER_PROFILE;
3593549 rows created.
TRUNCATE TABLE SFI_CUSTOMER_PROFILE;
INSERT /*+ APPEND */ INTO SFI_CUSTOMER_PROFILE SELECT * FROM SFI_CUSTOMER_PROFILE_BAK;

COMMIT;

   
SQL> ANALYZE TABLE USER_B.SFI_CUSTOMER_PROFILE VALIDATE STRUCTURE CASCADE INTO INVALID_ROWS;

Table analyzed.

Now table data and table PK are in sync.

Need to drop the INVALID_ROWS table:
   
SQL> DROP TABLE INVALID_ROWS;
Table dropped.   

Wednesday, October 24, 2018

Large Trace Files with Error: ORA-08102: index key not found, obj# 9999, file 99 block 99999

========================================
General
========================================
Large trc files are being generated.

========================================
Evidences
========================================


alert.log
========================================
Wed Oct 24 05:48:16 2018
Trace dumping is performing id=[cdmp_20181024054816]
Wed Oct 24 05:49:24 2018
Errors in file /software/oracle/diag/rdbms/igt/igt/trace/igt_j000_12354.trc:
ORA-12012: error on auto execute of job 141
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)
Errors in file /software/oracle/diag/rdbms/igt/igt/trace/igt_j000_12354.trc:
ORA-12012: error on auto execute of job 141
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)
Wed Oct 24 05:49:25 2018
Trace dumping is performing id=[cdmp_20181024054925]
Wed Oct 24 05:50:33 2018
Errors in file /software/oracle/diag/rdbms/igt/igt/trace/igt_j000_12354.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)
ORA-12012: error on auto execute of job 141
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)
Wed Oct 24 05:51:32 2018


/software/oracle/diag/rdbms/igt/igt/trace/igt_j000_12354.trc
========================================
Trace file /software/oracle/diag/rdbms/igt/igt/trace/igt_j000_12354.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      col-com-1-aps-4
Release:        2.6.18-308.24.1.el5
Version:        #1 SMP Wed Nov 21 11:42:14 EST 2012
Machine:        x86_64
Instance name: igt
Redo thread mounted by this instance: 1
Oracle process number: 69
Unix process pid: 12354, image: oracle@col-com-1-aps-4 (J000)


*** 2018-10-24 05:50:30.185
*** SESSION ID:(462.12791) 2018-10-24 05:50:30.185
*** CLIENT ID:() 2018-10-24 05:50:30.185
*** SERVICE NAME:(SYS$USERS) 2018-10-24 05:50:30.185
*** MODULE NAME:() 2018-10-24 05:50:30.185
*** ACTION NAME:() 2018-10-24 05:50:30.185


*** TRACE FILE RECREATED AFTER BEING REMOVED ***


REDO RECORD - Thread:1 RBA: 0x0193ac.0001b6c7.0084 LEN: 0x15638 VLD: 0x09
SCN: 0x0db7.14fe2926 SUBSCN:  1 10/24/2018 05:42:57
CHANGE #1 TYP:2 CLS: 1 AFN:4 DBA:0x01017022 OBJ:153442 SCN:0x0db7.14fe27f4 SEQ:  1 OP:11.4 ENC:0
KTB Redo 
op: 0x01  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: F  xid:  0x000d.00f.0002be51    uba: 0x00c02523.1ef7.25
KDO Op code: LKR row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x01017022  hdba: 0x010073e2
itli: 2  ispac: 0  maxfr: 4858
tabn: 0 slot: 7 to: 2
CHANGE #2 TYP:0 CLS:41 AFN:3 DBA:0x00c00140 OBJ:4294967295 SCN:0x0db7.14fe277e SEQ:  2 OP:5.2 ENC:0
ktudh redo: slt: 0x000f sqn: 0x0002be51 flg: 0x0012 siz: 136 fbi: 0
            uba: 0x00c02523.1ef7.25    pxid:  0x0000.000.00000000
CHANGE #3 TYP:2 CLS: 1 AFN:4 DBA:0x0104f5c5 OBJ:153442 SCN:0x0db7.14a89d05 SEQ:  1 OP:11.5 ENC:0
KTB Redo 
op: 0x11  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: F  xid:  0x000d.00f.0002be51    uba: 0x00c02524.1ef7.01
Block cleanout record, scn:  0x0db7.14fe291b ver: 0x01 opt: 0x02, entries follow...
  itli: 6  flg: 2  scn: 0x0db7.14a89d05
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x0104f5c5  hdba: 0x010073e2
itli: 1  ispac: 0  maxfr: 4858
tabn: 0 slot: 0(0x0) flag: 0x0c lock: 1 ckix: 0
ncol: 42 nnew: 41 size: 0
col  0: [15]  37 33 32 31 30 31 31 37 30 39 34 32 35 30 34
col  2: [ 2]  c1 02
col  3: [ 2]  c1 02
col  4: [ 1]  80
col 41: [ 1]  80
CHANGE #83 TYP:0 CLS:26 AFN:3 DBA:0x00c1461c OBJ:4294967295 SCN:0x0db7.1511b877 SEQ:  7 OP:5.1 ENC:0
ktudb redo: siz: 84 spc: 4636 flg: 0x0022 seq: 0xfee7 rec: 0x08
            xid:  0x0005.004.0004e55f  
ktubu redo: slt: 4 rci: 7 opc: 11.1 objn: 43448 objd: 153442 tsn: 4
Undo type:  Regular undo       Undo type:  Last buffer split:  No 
Tablespace Undo:  No 
             0x00000000
KDO undo record:
KTB Redo 
op: 0x04  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x000e.01b.0002b70d uba: 0x00c0df2a.1f14.08
                      flg: C---    lkc:  0     scn: 0x0db7.140c0dc8
KDO Op code: LKR row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x03014003  hdba: 0x010073e2
itli: 2  ispac: 0  maxfr: 4858
tabn: 0 slot: 8 to: 0
END OF DUMP REDO

*** 2018-10-24 05:50:33.287
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)
ORA-12012: error on auto execute of job 141
ORA-08102: index key not found, obj# 282, file 1, block 1969 (2)

========================================
Resolution
========================================

SQL> SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_TYPE 
       FROM DBA_OBJECTS 
      WHERE OBJECT_ID = '282';

OWNER OBJECT_NAME SUBOBJECT_NAME OBJECT_TYPE
------- ----------- -------------- --------------
SYS     I_JOB_NEXT                 INDEX


sqlplus / as sysdba
ALTER INDEX SYS.I_JOB_NEXT REBUILD ONLINE;