Oracle DBA by Example. Code samples, Scripts, Reference.: Many trace files with ORA-00600 and error message Action: 'kpuActionDefault'

===========================
General
===========================
There are many *.trc files and incident/incident_<pid> files generated.
When checking alert.log, there are many error messages with error ORA-00600:

===========================
Evidences
===========================

alert.log

Sun Aug 20 07:14:18 2017
ORA-00600: internal error code, arguments: [kkpamRGet], [PTI 8.1], [0], [], [], [], [], [], [], [], [], []
Incident details in: /software/oracle/diag/rdbms/igt/igt/incident/incdir_246165/igt_ora_2263_i246165.trc
Sun Aug 20 07:14:18 2017
Errors in file /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_5575.trc (incident=246525):
ORA-00600: internal error code, arguments: [kkpamDGSPam2], [0], [0], [], [], [], [], [], [], [], [], []
Incident details in: /software/oracle/diag/rdbms/igt/igt/incident/incdir_246525/igt_ora_5575_i246525.trc

trace files

/software/oracle/diag/rdbms/igt/igt/trace/igt_ora_31903.trc

ORACLE_HOME = /software/oracle/111

System name: Linux

Node name: deu-vod-2-dbu-1

Release: 2.6.18-371.3.1.el5

Version: #1 SMP Mon Nov 11 03:23:58 EST 2013

Machine: x86_64

Instance name: igt

Redo thread mounted by this instance: 1

Oracle process number: 455

Unix process pid: 31903, image: oracle@deu-vod-2-dbu-1

*** 2017-08-20 12:59:18.420

*** SESSION ID:(2544.29033) 2017-08-20 12:59:18.420

*** CLIENT ID:() 2017-08-20 12:59:18.420

*** SERVICE NAME:(SYS$USERS) 2017-08-20 12:59:18.420

*** MODULE NAME:(JDBC Thin Client) 2017-08-20 12:59:18.420

*** ACTION NAME:() 2017-08-20 12:59:18.420

DDE: Problem Key 'ORA 600 [kkpamRGet]' was flood controlled (0x4) (incident: 246112)

ORA-00600: internal error code, arguments: [kkpamRGet], [PTI 8.1], [0], [], [], [], [], [], [], [], [], []

incident file /software/oracle/diag/rdbms/igt/igt/incident/incdir_243949/igt_ora_13764_i243949.trc

Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production

With the Partitioning, Oracle Label Security, OLAP, Data Mining,

Oracle Database Vault and Real Application Testing options

ORACLE_HOME = /software/oracle/111

System name: Linux

Node name: deu-vod-2-dbu-1

Release: 2.6.18-371.3.1.el5

Version: #1 SMP Mon Nov 11 03:23:58 EST 2013

Machine: x86_64

Instance name: igt

Redo thread mounted by this instance: 1

Oracle process number: 185

Unix process pid: 13764, image: oracle@deu-vod-2-dbu-1

*** 2017-08-04 08:59:25.201

*** SESSION ID:(3259.3287) 2017-08-04 08:59:25.201

*** CLIENT ID:() 2017-08-04 08:59:25.201

*** SERVICE NAME:(SYS$USERS) 2017-08-04 08:59:25.201

*** MODULE NAME:(JDBC Thin Client) 2017-08-04 08:59:25.201

*** ACTION NAME:() 2017-08-04 08:59:25.201

Dump continued from file: /software/oracle/diag/rdbms/igt/igt/trace/igt_ora_13764.trc

ORA-00600: internal error code, arguments: [kdtgsp-inc], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 243949 (ORA 600 [kdtgsp-inc]) ========

*** 2017-08-04 08:59:25.201

----- Current SQL Statement for this session (sql_id=ancznzs51sycm) -----

INSERT INTO GA_W_COUNTERS_HISTORY (STATIC_ID , DYN1 , DYN2 , DYN3 , DYN4 , DYN5, DYN_S1, COUNTER_SUM , COUNTER_DELTA, TS_LAST_MODIFIED, NODE_ID, DAY) values (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 ,:11 ,:12 )

Begin OCI Call Context Dump

End OCI Call Context Dump

===========================

Oracle Metalink

===========================

Per Oracle document:

Bug 9399991 - Assorted Internal Errors and Dumps (mostly under kkpa*/kcb*) from SQL against partitioned tables (Doc ID 9399991.8):

Versions confirmed as being affected:
11.1.0.7
11.2.0.1

Description

Various dumps and / or ORA-600 errors can occur if a partitioned table has lots of dependent references in the shared pool.
eg: lots of different literal SQLs reference the same partitioned table.

The problem is that an internal field can overflow when the number of dependencies exceeds 65536 which can then allow the MVOB of the partitioned table tohave data freed leaving stale references in the SGA.

Rediscovery notes:
If the problem is present the a library cache dump will show one or more MVOB entries which have a KGHP flag but no KEP flag.

A SQL like the following may help show the MVOB entries in the library cache:
select KGLHDADR, KGLNAHSH, KGLNAOWN, KGLFNOBJ, KGLHDNSP, KGLOBTYP,
Decode(bitand(KGLHDFLG,4),4,'YES','NO') "KPR",
case when kglhdamk < 0 then (65536 + kglhdamk)
else kglhdamk end "KDP"
from x$kglob
where KGLOBTYP=70;

In this select:
"KPR" is the flag we are interested in. YES=flag set, NO=not set
"KDP" is the dependency count.
You should see this increase if new SQL statements are parsed that reference the partitioned object.
If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing.

If you see KDP increasing above about 50000 for a specific entry then it may also be worth flushing the shared pool as the problem could occur if the dependency count increases much further.

Solution

Per above: "If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing."

===========================

In the specific case

===========================

Failing SQL was:

INSERT INTO GA_W_COUNTERS_HISTORY (STATIC_ID , DYN1 , DYN2 , DYN3 , DYN4 , DYN5, DYN_S1, COUNTER_SUM , COUNTER_DELTA, TS_LAST_MODIFIED, NODE_ID, DAY)

values (:1 ,:2 ,:3 ,:4 ,:5 ,:6 ,:7 ,:8 ,:9 ,:10 ,:11 ,:12 )

GA_W_COUNTERS_HISTORY table is indeed partitioned.

SELECT TABLE_OWNER, TABLE_NAME, PARTITION_NAME

FROM DBA_TAB_PARTITIONS

WHERE table_name = 'GA_W_COUNTERS_HISTORY'

AND TABLE_OWNER = 'MY_USER'

TABLE_OWNER TABLE_NAME PARTITION_NAME

------------------ ------------------------------ --------------

MY_USER GA_W_COUNTERS_HISTORY P_1

MY_USER GA_W_COUNTERS_HISTORY P_2

MY_USER GA_W_COUNTERS_HISTORY P_3

MY_USER GA_W_COUNTERS_HISTORY P_4

MY_USER GA_W_COUNTERS_HISTORY P_5

MY_USER GA_W_COUNTERS_HISTORY P_6

MY_USER GA_W_COUNTERS_HISTORY P_7

MY_USER GA_W_COUNTERS_HISTORY P_8

MY_USER GA_W_COUNTERS_HISTORY P_9

MY_USER GA_W_COUNTERS_HISTORY P_10

MY_USER GA_W_COUNTERS_HISTORY P_11

MY_USER GA_W_COUNTERS_HISTORY P_12

MY_USER GA_W_COUNTERS_HISTORY P_13

MY_USER GA_W_COUNTERS_HISTORY P_14

MY_USER GA_W_COUNTERS_HISTORY P_15

MY_USER GA_W_COUNTERS_HISTORY P_16

MY_USER GA_W_COUNTERS_HISTORY P_17

MY_USER GA_W_COUNTERS_HISTORY P_18

MY_USER GA_W_COUNTERS_HISTORY P_19

MY_USER GA_W_COUNTERS_HISTORY P_20

MY_USER GA_W_COUNTERS_HISTORY P_21

MY_USER GA_W_COUNTERS_HISTORY P_22

MY_USER GA_W_COUNTERS_HISTORY P_23

MY_USER GA_W_COUNTERS_HISTORY P_24

MY_USER GA_W_COUNTERS_HISTORY P_25

MY_USER GA_W_COUNTERS_HISTORY P_26

MY_USER GA_W_COUNTERS_HISTORY P_27

MY_USER GA_W_COUNTERS_HISTORY P_28

MY_USER GA_W_COUNTERS_HISTORY P_29

MY_USER GA_W_COUNTERS_HISTORY P_30

MY_USER GA_W_COUNTERS_HISTORY P_31

When running the SQL, from the above technote:

select KGLHDADR, KGLNAHSH, KGLNAOWN, KGLFNOBJ, KGLHDNSP, KGLOBTYP,

Decode(bitand(KGLHDFLG,4),4,'YES','NO') "KPR",

case when kglhdamk < 0 then (65536 + kglhdamk)

else kglhdamk end "KDP"

from x$kglob

where KGLOBTYP=70;

KGLHDADR KGLNAHSH KGLNAOWN KGLFNOBJ KGLHDNSP KGLOBTYP KPR KDP

---------------- ---------- --------- ----------------- -------- -------- ----- ---

000000011D531030 1057352192 MY_USER_C GA_W_COUNTERS_ORY 45 70 NO 2

0000000156411590 1057352192 MY_USER_C GA_W_COUNTERS_ORY 45 70 YES 0

000000011980EBC0 4229035152 MY_USER_F GA_W_COUNTERS_ORY 45 70 YES 2

000000015699DCD0 2014443530 MY_USER_J GA_W_COUNTERS_ORY 45 70 YES 1

...

260 rows selected.

"If you see KDP>0 and KPR=NO then there is a problem in the SGA and it may be worth flushing."

So indeed, there is a corruption in memory in shared pool.

After flushing shared pool, the issue was resolved

ALTER SYSTEM FLUSH SHARED_POOL;

Oracle DBA by Example. Code samples, Scripts, Reference.

Pages

Sunday, August 20, 2017

Many trace files with ORA-00600 and error message Action: 'kpuActionDefault'

No comments:

Post a Comment