Pages

Sunday, August 20, 2017

Many trace files with ORA-00600 and error message Action: 'kpuActionDefault'

===========================
General
===========================
There are many *.trc files and incident/incident_<pid> files generated.
When checking alert.log, there are many error messages with error ORA-00600:

===========================
Evidences
===========================
alert.log
Sun Aug 20 07:14:18 2017
ORA-00600: internal error code, arguments: [kkpamRGet], [PTI 8.1], [0], [], [], [], [], [], [], [], [], []
Incident details in: /software/oracle/diag/rdbms/igt/igt/incident/incdir_246165/igt_ora_2263_i246165.trc
Sun Aug 20 07:14:18 2017
Errors in file /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_5575.trc  (incident=246525):
ORA-00600: internal error code, arguments: [kkpamDGSPam2], [0], [0], [], [], [], [], [], [], [], [], []
Incident details in: /software/oracle/diag/rdbms/igt/igt/incident/incdir_246525/igt_ora_5575_i246525.trc

trace files
/software/oracle/diag/rdbms/igt/igt/trace/igt_ora_31903.trc

ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      deu-vod-2-dbu-1
Release:        2.6.18-371.3.1.el5
Version:        #1 SMP Mon Nov 11 03:23:58 EST 2013
Machine:        x86_64
Instance name: igt
Redo thread mounted by this instance: 1
Oracle process number: 455
Unix process pid: 31903, image: oracle@deu-vod-2-dbu-1


*** 2017-08-20 12:59:18.420
*** SESSION ID:(2544.29033) 2017-08-20 12:59:18.420
*** CLIENT ID:() 2017-08-20 12:59:18.420
*** SERVICE NAME:(SYS$USERS) 2017-08-20 12:59:18.420
*** MODULE NAME:(JDBC Thin Client) 2017-08-20 12:59:18.420
*** ACTION NAME:() 2017-08-20 12:59:18.420
 
DDE: Problem Key 'ORA 600 [kkpamRGet]' was flood controlled (0x4) (incident: 246112)
ORA-00600: internal error code, arguments: [kkpamRGet], [PTI 8.1], [0], [], [], [], [], [], [], [], [], []


incident file /software/oracle/diag/rdbms/igt/igt/incident/incdir_243949/igt_ora_13764_i243949.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      deu-vod-2-dbu-1
Release:        2.6.18-371.3.1.el5
Version:        #1 SMP Mon Nov 11 03:23:58 EST 2013
Machine:        x86_64
Instance name: igt
Redo thread mounted by this instance: 1
Oracle process number: 185
Unix process pid: 13764, image: oracle@deu-vod-2-dbu-1


*** 2017-08-04 08:59:25.201
*** SESSION ID:(3259.3287) 2017-08-04 08:59:25.201
*** CLIENT ID:() 2017-08-04 08:59:25.201
*** SERVICE NAME:(SYS$USERS) 2017-08-04 08:59:25.201
*** MODULE NAME:(JDBC Thin Client) 2017-08-04 08:59:25.201
*** ACTION NAME:() 2017-08-04 08:59:25.201
 
Dump continued from file: /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_13764.trc
ORA-00600: internal error code, arguments: [kdtgsp-inc], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 243949 (ORA 600 [kdtgsp-inc]) ========

*** 2017-08-04 08:59:25.201
----- Current SQL Statement for this session (sql_id=ancznzs51sycm) -----
INSERT INTO GA_W_COUNTERS_HISTORY (STATIC_ID , DYN1 , DYN2 , DYN3 , DYN4 , DYN5, DYN_S1, COUNTER_SUM , COUNTER_DELTA, TS_LAST_MODIFIED, NODE_ID, DAY) values (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 ,:11 ,:12 )

Begin OCI Call Context Dump
End OCI Call Context Dump


===========================
Oracle Metalink
===========================
Per Oracle document:
Bug 9399991 - Assorted Internal Errors and Dumps (mostly under kkpa*/kcb*) from SQL against partitioned tables (Doc ID 9399991.8):

Versions confirmed as being affected:
11.1.0.7
11.2.0.1
  
Description

Various dumps and / or ORA-600 errors can occur if a partitioned table has lots of dependent references in the shared pool.
eg: lots of different literal SQLs reference the same partitioned table.

The problem is that an internal field can overflow when the number of dependencies exceeds 65536 which can then allow the MVOB of the partitioned table tohave data freed leaving stale references in the SGA.

Rediscovery notes:
 If the problem is present the a library cache dump will show one or more MVOB entries which have a KGHP flag but no KEP flag. 

 A SQL like the following may help show the MVOB entries in the library cache: 
  select KGLHDADR, KGLNAHSH, KGLNAOWN, KGLFNOBJ, KGLHDNSP, KGLOBTYP, 
         Decode(bitand(KGLHDFLG,4),4,'YES','NO') "KPR",
         case when kglhdamk < 0 then (65536 + kglhdamk)
              else kglhdamk end "KDP"
   from x$kglob 
   where KGLOBTYP=70;

 In this select:
  "KPR" is the flag we are interested in. YES=flag set, NO=not set
  "KDP" is the dependency count. 
  You should see this increase if new SQL statements are parsed that reference the partitioned object. 
  If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing.

 If you see KDP increasing above about 50000 for a specific entry then it may also be worth flushing the shared pool as the problem  could occur if the dependency count increases much further.

 Solution

 Per above: "If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing."


===========================
In the specific case
===========================
Failing SQL was:
INSERT INTO GA_W_COUNTERS_HISTORY (STATIC_ID , DYN1 , DYN2 , DYN3 , DYN4 , DYN5, DYN_S1, COUNTER_SUM , COUNTER_DELTA, TS_LAST_MODIFIED, NODE_ID, DAY) 
values (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 ,:11 ,:12 )

GA_W_COUNTERS_HISTORY table is indeed partitioned.

SELECT TABLE_OWNER, TABLE_NAME, PARTITION_NAME 
  FROM DBA_TAB_PARTITIONS 
 WHERE table_name = 'GA_W_COUNTERS_HISTORY' 
   AND TABLE_OWNER = 'MY_USER'

TABLE_OWNER        TABLE_NAME                     PARTITION_NAME
------------------ ------------------------------ --------------
MY_USER            GA_W_COUNTERS_HISTORY          P_1
MY_USER            GA_W_COUNTERS_HISTORY          P_2
MY_USER            GA_W_COUNTERS_HISTORY          P_3
MY_USER            GA_W_COUNTERS_HISTORY          P_4
MY_USER            GA_W_COUNTERS_HISTORY          P_5
MY_USER            GA_W_COUNTERS_HISTORY          P_6
MY_USER            GA_W_COUNTERS_HISTORY          P_7
MY_USER            GA_W_COUNTERS_HISTORY          P_8
MY_USER            GA_W_COUNTERS_HISTORY          P_9
MY_USER            GA_W_COUNTERS_HISTORY          P_10
MY_USER            GA_W_COUNTERS_HISTORY          P_11
MY_USER            GA_W_COUNTERS_HISTORY          P_12
MY_USER            GA_W_COUNTERS_HISTORY          P_13
MY_USER            GA_W_COUNTERS_HISTORY          P_14
MY_USER            GA_W_COUNTERS_HISTORY          P_15
MY_USER            GA_W_COUNTERS_HISTORY          P_16
MY_USER            GA_W_COUNTERS_HISTORY          P_17
MY_USER            GA_W_COUNTERS_HISTORY          P_18
MY_USER            GA_W_COUNTERS_HISTORY          P_19
MY_USER            GA_W_COUNTERS_HISTORY          P_20
MY_USER            GA_W_COUNTERS_HISTORY          P_21
MY_USER            GA_W_COUNTERS_HISTORY          P_22
MY_USER            GA_W_COUNTERS_HISTORY          P_23
MY_USER            GA_W_COUNTERS_HISTORY          P_24
MY_USER            GA_W_COUNTERS_HISTORY          P_25
MY_USER            GA_W_COUNTERS_HISTORY          P_26
MY_USER            GA_W_COUNTERS_HISTORY          P_27
MY_USER            GA_W_COUNTERS_HISTORY          P_28
MY_USER            GA_W_COUNTERS_HISTORY          P_29
MY_USER            GA_W_COUNTERS_HISTORY          P_30
MY_USER            GA_W_COUNTERS_HISTORY          P_31

When running the SQL, from the above technote:
 select KGLHDADR, KGLNAHSH, KGLNAOWN, KGLFNOBJ, KGLHDNSP, KGLOBTYP, 
         Decode(bitand(KGLHDFLG,4),4,'YES','NO') "KPR",
         case when kglhdamk < 0 then (65536 + kglhdamk)
              else kglhdamk end "KDP"
   from x$kglob 
   where KGLOBTYP=70;

KGLHDADR         KGLNAHSH   KGLNAOWN  KGLFNOBJ          KGLHDNSP KGLOBTYP KPR   KDP
---------------- ---------- --------- ----------------- -------- -------- ----- ---
000000011D531030 1057352192 MY_USER_C GA_W_COUNTERS_ORY 45       70       NO    2
0000000156411590 1057352192 MY_USER_C GA_W_COUNTERS_ORY 45       70       YES   0
000000011980EBC0 4229035152 MY_USER_F GA_W_COUNTERS_ORY 45       70       YES   2
000000015699DCD0 2014443530 MY_USER_J GA_W_COUNTERS_ORY 45       70       YES   1
...
...
260 rows selected.

"If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing."

So indeed, there is a corruption in memory in shared pool.
After flushing shared pool, the issue was resolved

ALTER SYSTEM FLUSH SHARED_POOL;

No comments:

Post a Comment