Pages

Monday, December 8, 2014

Oracle Auto Tasks: Auto Space Advisor, Sql Tuning Advisor, Auto Optimizer Stats Collection. Huge trace file is generated for process ORA$AT_OS_OPT_SY

============================================
Issue
============================================
Huge trace file are generated wth error message kcbzib: dump suspect buffer
The process that generate these filesmay vary:
ORA$AT_OS_OPT_SY.
expdp

============================================
Oracle workaround
============================================
Per Oracle technote Bug 10110863 - Trace file with "kcbzib: dump suspect buffer" without any error (Doc ID 10110863.8)

Description
Trace file with "kcbzib: dump suspect buffer" dump is produced
along with large buffer dump, without any user visible errors.

Rediscovery Notes:
1. Trace files with "kcbzib: dump suspect buffer" dump,
   along with large buffer dump, without any user visible errors.
2. ORA-8103, ORA-10632 & ORA-1410 raised by check functions internally
   and trapped before returning to user.
   This can be verified to catch error with events like following
 event="8103 errorstack(1)"
 event="10632 errorstack(1)"

Workaround
None but just deleting trace file with this dump is sufficient.

============================================
script to delete trace file with this dump
============================================

#!/bin/bash

#0 6 * * * /software/oracle/oracle/scripts/delete_specific_trace_files.sh

TRACE_DIR=/software/oracle/diag/rdbms/igt/igt/trace

find ${TRACE_DIR} -type f -name "*tr*" | xargs grep -l "kcbzib: dump suspect buffer" | xargs rm -f




============================================
General ORA$AT_OS_OPT_SY info
============================================
Starting from Oracle 11g Diagnostic Repositories are turned on by default. 


Doc ID 756734.1 

Auto Optimizer Stats Collection - Automatic Optimizer Statistics Collection which gathers stale or missing statistics for all schema objects.  

Auto Space Advisor- Automatic Segment Advisor Identifies segments that could be reorganized to save space  


Sql Tuning Advisor - Automatic SQL Tuning Advisor  Identifies and attempts to tune high load SQL 


- ORA$AT_OS_OPT_SY_n Stands for Auto Optimizer Stats Collection.
- ORA$AT_SA_SPC_SY_n Stands for Auto Space Advisor.
- ORA$AT_SQ_SQL_SW_n Stands for Sql Tuning Advisor.

These tasks executions are logged into DBA_AUTOTASK_JOB_HISTORY

============================================
Controlling the the maintenance tasks
============================================
The execution of these processes is controlled by these three parameters in sqlnet.ora file under $ORACLE_HOME/network/admin directory.

DIAG_ADR_ENABLED
DIAG_SIGHANDLER_ENABLED
DIAG_DDE_ENABLED

By default these are enabled.
To turn them process off, add to sqlnet.ora: 


DIAG_ADR_ENABLED=OFF 
DIAG_SIGHANDLER_ENABLED=FALSE 
DIAG_DDE_ENABLED=FALSE

============================================
The trace file
============================================
These trace file are generated by an Oracle job, as seem from the trace file name (j001) running by default at 01:00 AM.

less igt_j001_4570.trc

Trace file /software/oracle/diag/rdbms/orainst/orainst/trace/orainst_j001_4570.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      server-2
Release:        2.6.18-194.el5
Version:        #1 SMP Tue Mar 16 21:52:39 EDT 2010
Machine:        x86_64
Instance name: orainst
Redo thread mounted by this instance: 1
Oracle process number: 42
Unix process pid: 4570, image: oracle@server-2 (J001)


*** 2014-12-07 01:00:09.866
*** SESSION ID:(266.12084) 2014-12-07 01:00:09.866
*** CLIENT ID:() 2014-12-07 01:00:09.866
*** SERVICE NAME:(SYS$USERS) 2014-12-07 01:00:09.866
*** MODULE NAME:(DBMS_SCHEDULER) 2014-12-07 01:00:09.866
*** ACTION NAME:(ORA$AT_OS_OPT_SY_20299) 2014-12-07 01:00:09.866


----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------

*** 2014-12-07 01:00:10.721
skdstdst()+28        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFF9CCCAD30 ? 7FFF9CCCADF8 ?
                                                   7FFF9CCEAD50 ? 000000000 ?
ksedst1()+95         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFF9CCCAD30 ? 7FFF9CCCADF8 ?
                                                   7FFF9CCEAD50 ? 000000000 ?
ksedst()+35          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFF9CCCAD30 ? 7FFF9CCCADF8 ?
                                                   7FFF9CCEAD50 ? 000000000 ?

...
...   
sou2o()+90           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFF9CD02168 ? 000000002 ?
                                                   0DD9BCA88 ? 000000000 ?
opimai_real()+275    call     sou2o()              7FFF9CD02140 ? 000000032 ?
                                                   000000004 ? 7FFF9CD02168 ?
                                                   0DD9BCA88 ? 000000000 ?
ssthrdmain()+177     call     opimai_real()        000000000 ? 7FFF9CD022E0 ?
                                                   000000004 ? 7FFF9CD02168 ?
                                                   0DD9BCA88 ? 000000000 ?
main()+215           call     ssthrdmain()         000000003 ? 7FFF9CD022E0 ?
                                                   000000004 ? 000000000 ?
                                                   0DD9BCA88 ? 000000000 ?
__libc_start_main()  call     main()               000000003 ? 7FFF9CD02448 ?
+244                                               000000004 ? 000000000 ?
                                                   0DD9BCA88 ? 000000000 ?
Corrupt stack.  Infinite loop of frame pointers found.


--------------------- Binary Stack Dump ---------------------


========== FRAME [1] (skdstdst()+28 -> kgdsdst()) ==========
defined by frame pointers 0x7fff9cceb2a0  and 0x7fff9cceb290
CALL TYPE: call   ERROR SIGNALED: no   COMPONENT: (null)
RDI 0000000000000000 RSI 0000000000000000 RDX 00007FFF9CCCAD30
RCX 00007FFF9CCCADF8 R8 00007FFF9CCEAD50 R9 0000000000000000
RAX 0000000000000001 RBX 00007FFF9CCEB8C8 RBP 00007FFF9CCEB290
R10 0000000000000000 R11 0000000000000000 R12 0000000092D36014
R13 00007FFF9CCEB978 R14 0000000000000000 R15 0000000000000001
RSP 00007FFF9CCEB290 RIP 00000000017F9590
Dump of memory from 0x7fff9cceb290 to 0x7fff9cceb2a0
7FFF9CCEB290 9CCEB340 00007FFF 0098C609 00000000  [@...............]

...
...

Checking the history runs of auto optimizer stats collection, they all show success.

SET linesize 120
SET pagesize 999
COL JOB_NAME FOR A30
COL JOB_STATUS FOR A15
COL JOB_START_TIME FOR A40

SELECT JOB_NAME, JOB_STATUS, JOB_START_TIME 
  FROM DBA_AUTOTASK_JOB_HISTORY 
 WHERE CLIENT_NAME='auto optimizer stats collection'
ORDER BY JOB_START_TIME DESC;

JOB_NAME                       JOB_STATUS JOB_START_TIME
------------------------------ ---------- -------------------------------------
ORA$AT_OS_OPT_SY_20302         SUCCEEDED  07-DEC-14 10.00.02.160840 PM ETC/GMT
ORA$AT_OS_OPT_SY_20299         SUCCEEDED  06-DEC-14 10.00.02.295884 PM ETC/GMT
ORA$AT_OS_OPT_SY_20296         SUCCEEDED  05-DEC-14 10.00.02.199972 PM ETC/GMT
ORA$AT_OS_OPT_SY_20278         SUCCEEDED  04-DEC-14 10.00.02.183327 PM ETC/GMT
ORA$AT_OS_OPT_SY_20275         SUCCEEDED  03-DEC-14 10.00.02.204706 PM ETC/GMT
ORA$AT_OS_OPT_SY_20272         SUCCEEDED  02-DEC-14 10.00.02.176247 PM ETC/GMT
ORA$AT_OS_OPT_SY_20269         SUCCEEDED  01-DEC-14 10.00.02.148828 PM ETC/GMT
ORA$AT_OS_OPT_SY_20266         SUCCEEDED  30-NOV-14 10.00.02.193282 PM ETC/GMT
ORA$AT_OS_OPT_SY_20263         SUCCEEDED  29-NOV-14 10.00.02.152566 PM ETC/GMT
ORA$AT_OS_OPT_SY_20260         SUCCEEDED  28-NOV-14 10.00.02.153708 PM ETC/GMT
ORA$AT_OS_OPT_SY_20257         SUCCEEDED  27-NOV-14 10.00.02.174340 PM ETC/GMT
ORA$AT_OS_OPT_SY_20254         SUCCEEDED  26-NOV-14 10.00.02.189363 PM ETC/GMT
ORA$AT_OS_OPT_SY_20251         SUCCEEDED  25-NOV-14 10.00.02.161676 PM ETC/GMT
ORA$AT_OS_OPT_SY_20248         SUCCEEDED  24-NOV-14 10.00.01.634654 PM ETC/GMT
ORA$AT_OS_OPT_SY_20245         SUCCEEDED  23-NOV-14 10.00.01.669284 PM ETC/GMT
ORA$AT_OS_OPT_SY_20242         SUCCEEDED  22-NOV-14 10.00.01.391327 PM ETC/GMT
ORA$AT_OS_OPT_SY_20239         SUCCEEDED  21-NOV-14 10.00.00.480373 PM ETC/GMT
ORA$AT_OS_OPT_SY_20236         SUCCEEDED  20-NOV-14 10.00.00.887736 PM ETC/GMT


Searching the Metalink, there are various errors related to the DBMS_SCHEDULER Automated Stats Gathering in Oracle 11.1.0.x.
Looks like Oracle bug in Oracle 11.1.0.7. Fixed in 12.1.x


Bug 10110863 - Customer see trace file with "kcbzib: dump suspect buffer" dump, along with large buffer dump, without any user visible errors.


SQL> SELECT DISTINCT client_name from DBA_AUTOTASK_JOB_HISTORY;

CLIENT_NAME
----------------------------------------------------------------
auto optimizer stats collection
auto space advisor
sql tuning advisor

SQL> desc DBA_AUTOTASK_JOB_HISTORY
 Name                      Null?    Type
 ------------------------- -------- --------------------------------------------
 CLIENT_NAME                        VARCHAR2(64)
 WINDOW_NAME                        VARCHAR2(65)
 WINDOW_START_TIME                  TIMESTAMP(6) WITH TIME ZONE
 WINDOW_DURATION                    INTERVAL DAY(9) TO SECOND(6)
 JOB_NAME                           VARCHAR2(260)
 JOB_STATUS                         VARCHAR2(30)
 JOB_START_TIME                     TIMESTAMP(6) WITH TIME ZONE
 JOB_DURATION                       INTERVAL DAY(3) TO SECOND(0)
 JOB_ERROR                          NUMBER
 JOB_INFO                           VARCHAR2(4000)


in addition there are big files under /software/oracle/diag/rdbms/orainst/orainst/incident
for example:

Dump file /software/oracle/diag/rdbms/igt/igt/incident/incdir_186882/igt_j002_7479_i186882.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      esp-vod-1-dbu-2
Release:        2.6.18-308.24.1.el5
Version:        #1 SMP Wed Nov 21 11:42:14 EST 2012
Machine:        x86_64
Instance name: igt
Redo thread mounted by this instance: 1
Oracle process number: 1151
Unix process pid: 7479, image: oracle@esp-vod-1-dbu-2 (J002)


*** 2015-04-11 22:00:39.866
*** SESSION ID:(428.24596) 2015-04-11 22:00:39.866
*** CLIENT ID:() 2015-04-11 22:00:39.866
*** SERVICE NAME:(SYS$USERS) 2015-04-11 22:00:39.866
*** MODULE NAME:(DBMS_SCHEDULER) 2015-04-11 22:00:39.866
*** ACTION NAME:(ORA$AT_SQ_SQL_SW_3736) 2015-04-11 22:00:39.866

Dump continued from file: /software/oracle/diag/rdbms/igt/igt/trace/igt_j002_7479.trc
ORA-00700: soft internal error, arguments: [kesqsMakeSql-invstat:cpuTime], [], [], [], [], [], [], []

========= Dump for incident 186882 (ORA 700 [kesqsMakeSql-invstat:cpuTime]) ========

*** 2015-04-11 22:00:39.867
----- Current SQL Statement for this session (sql_id=6jbrg916bjmqc) -----
DECLARE job BINARY_INTEGER := :job;  next_date TIMESTAMP WITH TIME ZONE := :mydate;  broken BOOLEAN := FALSE;  job_name VARCHAR2(30) := :job_name;  job_subname VARCHAR2(30) := :job_subname;  job_owner VARCHAR2(30) := :job_owner;  job_start TIMESTAMP WITH TIME ZONE := :job_start;  job_scheduled_start TIMESTAMP WITH TIME ZONE := :job_scheduled_start;  window_start TIMESTAMP WITH TIME ZONE := :window_start;  window_end TIMESTAMP WITH TIME ZONE := :window_end;  BEGIN  DECLARE
         ename VARCHAR2(30);
       BEGIN
         ename := dbms_sqltune.execute_tuning_task(
                    'SYS_AUTO_SQL_TUNING_TASK');
       END;  :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END; 

----- PL/SQL Stack -----


Example from perfstat.txt report.
Top SQL are coming from DBMS_SPACE.auto_space_advisor_job_proc and from DBMS_STATS.gather_database_stats_job_proc.

SQL ordered by Elapsed time for DB: IGT  Instance: igt  Snaps: 27499 -27500
-> Total DB Time (s):           3,360
-> Captured SQL accounts for   56.7% of Total DB Time
-> SQL reported below exceeded  1.0% of Total DB Time

  Elapsed                Elap per            CPU                        Old
  Time (s)   Executions  Exec (s)  %Total   Time (s)  Physical Reads Hash Value
---------- ------------ ---------- ------ ---------- --------------- ----------
    458.90            1     458.90   13.7      15.46         264,448 4023405807
Module: DBMS_SCHEDULER
call dbms_space.auto_space_advisor_job_proc (  )

    388.58            1     388.58   11.6      39.50          91,840  125601442
Module: DBMS_SCHEDULER
call dbms_stats.gather_database_stats_job_proc (  )


DBMS_SPACE.auto_space_advisor_job_proc is executed from auto_task 'auto space advisor'.
BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'auto space advisor',
    operation   => NULL,
    window_name => NULL);
END;

/

DBMS_STATS.gather_database_stats_job_proc is executed from auto_task 'auto optimizer stats collection'.
BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'auto optimizer stats collection', 
    operation => NULL, 
    window_name => NULL);
END;

/


As a workaround, it is possible to disable the Auto Task Admin jobs
To disable tasks:

SET lines 180 pages 1000
COL client_name FOR A40
COL attributes FOR A60

SELECT client_name, status,attributes,service_name 
FROM DBA_AUTOTASK_CLIENT;


BEGIN

  DBMS_AUTO_TASK_ADMIN.disable(
    client_name => 'auto space advisor',
    operation   => NULL,
    window_name => NULL);
END;
/

BEGIN

  DBMS_AUTO_TASK_ADMIN.disable(
    client_name => 'sql tuning advisor',
    operation   => NULL,
    window_name => NULL);
END;
/

BEGIN
  DBMS_AUTO_TASK_ADMIN.disable(
    client_name => 'auto optimizer stats collection',
    operation   => NULL,
    window_name => NULL);
END;
/


COL ATTRIBUTES FOR A60
SELECT client_name, status,attributes,service_name 

FROM DBA_AUTOTASK_CLIENT;

CLIENT_NAME                     STATUS   ATTRIBUTES                                  
------------------------------- -------- --------------------------------------------
auto optimizer stats collection DISABLED ON BY DEFAULT, VOLATILE, SAFE TO KILL
auto space advisor              DISABLED ON BY DEFAULT, VOLATILE, SAFE TO KILL
sql tuning advisor              DISABLED ONCE PER WINDOW, ON BY DEFAULT, VOLATILE, 
                                         SAFE TO KILL                                 



To enable Auto Task Admin jobs:
BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'auto space advisor',
    operation   => NULL,
    window_name => NULL);
END;
/

BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'sql tuning advisor',
    operation   => NULL,
    window_name => NULL);
END;
/

BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'auto optimizer stats collection',
    operation   => NULL,
    window_name => NULL);
END;
/






To see the recommendations:

SELECT DBMS_SPACE.ASA_RECOMMENDATIONS FROM DUAL;


ASA_RECOMMENDATIONS(TABLESPACE_NAME, SEGMENT_OWNER, SEGMENT_NAME, SEGMENT_TYPE,
--------------------------------------------------------------------------------
224744, 17825792, 0, 0, 0, 0, 
'Compress object V500_OPER_PURGE.SGA_W_PSMS_SUBSCRIBER,
 estimated savings is 17825792 bytes.', 
'alter table "V500_OPER_PURGE"."SGA_W_PSMS_SUBSCRIBER" compress for oltp',
'alter table "V500_OPER_PURGE"."SGA_W_PSMS_SUBSCRIBER"  move', 
NULL, 25062, 46))

The full syntax:

SELECT DBMS_SPACE.ASA_RECOMMENDATIONS('TRUE','FALSE','FALSE') 
FROM DUAL;

1st parameter - see all_runs recommendations
2nd parameter - see only manual runs recommendations
3rd parameter - see findings without recomendation

No comments:

Post a Comment