Pages

Sunday, March 25, 2018

Golden Gate and Oracle Archive files. What to check if Archive process is stuck.

=======================
General
=======================
For some reason, archive files stopped to be deleted by Oracle RMAN process.

This server is also running Oracle Golden Gate (GG)

Since Oracle GG depend upon archive files to replicate data between sites, it seems that one of the Golden Gate processes is stuck, and "locking" the archive files, preventing RMAN process from deleting them

How to check relation between Golden gate capture processes and Oracle capture processes?


=======================
Oracle main Tables
=======================

DBA_APPLY
Apply Process dequques logical changes to records, and applies them.
Apply processes are listed in DBA_APPLY.

DBA_CAPTURE
Integrated Extract would be registered in database, i.e. in DBA_CAPTURE

DBA_QUEUES
Describes queues in database.

DBA_REGISTERED_ARCHIVED_LOG
Displays information about all registered archived logfiles in the database.

=======================
In General
=======================

When removing an Integrated Extract, it has to first be un-registered from the database.
The it can be dropped.


If the process above was not un-registered from the database then it has to be manually cleared using the DBMS_CAPTURE_ADM.DROP_CAPTURE procedure.


=======================
Oracle side
=======================
DBMS_APPLY_ADM Package.
Use this package to manage apply processes.

SELECT APPLY_NAME, QUEUE_NAME, QUEUE_OWNER, STATUS FROM DBA_APPLY;


APPLY_NAME       QUEUE_NAME      QUEUE_OWNER          STATUS
---------------- --------------- -------------------- --------------------
OGG$EXT_P_01     OGG$Q_EXT_P_01  OGG
OGG$EXT_S_01     OGG$Q_EXT_S_01  OGG
OGG$EXT_I_01     OGG$Q_EXT_I_01  OGG




EXEC DBMS_APPLY_ADM.STOP_APPLY('OGG$EXT_P_01');
EXEC DBMS_APPLY_ADM.STOP_APPLY('OGG$EXT_S_01');
EXEC DBMS_APPLY_ADM.STOP_APPLY('OGG$EXT_I_01');

EXEC DBMS_APPLY_ADM.DROP_APPLY('OGG$EXT_P_01',TRUE);
EXEC DBMS_APPLY_ADM.DROP_APPLY('OGG$EXT_S_01',TRUE);
EXEC DBMS_APPLY_ADM.DROP_APPLY('OGG$EXT_I_01',TRUE);

SQL> SELECT apply_name, queue_name, queue_owner, status FROM DBA_APPLY;


no rows selected

DBMS_CAPTURE_ADM Package.
Use this package to manage capture processes.


SELECT capture_name, capture_user, capture_type, status from DBA_CAPTURE;
CAPTURE_NAME      CAPTURE_USER     CAPTURE_TYPE         STATUS
----------------- ---------------- -------------------- --------------------
OGG$CAP_EXT_I_02  OGG              LOCAL                ABORTED
OGG$CAP_EXT_S_02  OGG              LOCAL                ABORTED
OGG$CAP_EXT_P_02  OGG              LOCAL                ABORTED

EXEC DBMS_CAPTURE_ADM.STOP_CAPTURE('OGG$CAP_EXT_I_02');
EXEC DBMS_CAPTURE_ADM.STOP_CAPTURE('OGG$CAP_EXT_S_02');
EXEC DBMS_CAPTURE_ADM.STOP_CAPTURE('OGG$CAP_EXT_P_02');

SELECT capture_name, capture_user, capture_type, status from DBA_CAPTURE;

CAPTURE_NAME      CAPTURE_USER     CAPTURE_TYPE         STATUS
----------------- ---------------- -------------------- --------------------
OGG$CAP_EXT_I_02  OGG              LOCAL                DISABLED
OGG$CAP_EXT_S_02  OGG              LOCAL                DISABLED
OGG$CAP_EXT_P_02  OGG              LOCAL                DISABLED


EXEC DBMS_CAPTURE_ADM.DROP_CAPTURE('OGG$CAP_EXT_I_02');
EXEC DBMS_CAPTURE_ADM.DROP_CAPTURE('OGG$CAP_EXT_S_02');

EXEC DBMS_CAPTURE_ADM.DROP_CAPTURE('OGG$CAP_EXT_P_02');


SELECT * FROM DBA_CAPTURE;



no rows selected


DBA_REGISTERED_ARCHIVED_LOG
SELECT DISTINCT CONSUMER_NAME FROM DBA_REGISTERED_ARCHIVED_LOG;

=======================

Golden Gate side

=======================

Even when processes are listed in DBA_CAPTURE table, it might be that there are not available via ggsci

ggsci
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt


no data is available

GGSCI (esp-tel-1-dbu-2) 1>  dblogin userid ogg  password password
Successfully logged into database.

GGSCI (esp-tel-1-dbu-2) 2> DELETE extract EXT_P_01

ERROR: EXTRACT EXT_P_03 does not exist.


GGSCI (esp-tel-1-dbu-2) 3> DELETE extract EXT_S_01

ERROR: EXTRACT EXT_S_03 does not exist.


GGSCI (esp-tel-1-dbu-2) 4> DELETE extract EXT_I_01


ERROR: EXTRACT EXT_I_03 does not exist.

=======================
Solution
=======================
Once the capture process was terminated using DBMS_CAPTURE_ADM.DROP_CAPTURE() procedure, the RMAN process was able to delete the "stuck" archive files.

=======================
Additional information - Tables
=======================
DBA_CAPTURE
 - Hold info about GG capture processes

DBA_REGISTERED_ARCHIVED_LOG 
- Hold information about archive logs registered for capture process


COL CAPTURE_NAME FOR A20
COL CAPTURE_TYPE FOR A20
COL STATUS FOR A10
COL REQ_SCN FOR A16
COL OLDEST_SCN FOR A16
SET linesize 120

SELECT CAPTURE_NAME, CAPTURE_TYPE, STATUS,
to_char(REQUIRED_CHECKPOINT_SCN,'999999999999999')as REQ_SCN ,to_char(OLDEST_SCN,'999999999999999')as OLDEST_SCN 

FROM DBA_CAPTURE
ORDER BY CAPTURE_NAME;


SELECT SUBSTR(consumer_name,LENGTH(consumer_name)-1,2) as consumer_group, 
       consumer_name, 
       TO_CHAR(MIN(modified_time),'YYYYMMDD hh24:mi:ss') AS min_modified_time,
       MIN(next_scn) AS MIN_next_scn, 
       count(*) as arch_logs
 FROM DBA_REGISTERED_ARCHIVED_LOG 
GROUP BY consumer_name  
ORDER BY SUBSTR(consumer_name,LENGTH(consumer_name)-1,2) , consumer_name;

                     Capture
                     Process
CONSUMER_GROUP       Name             MIN_MODIFIED_TIME  MIN_NEXT_SCN   ARCH_LOGS
-------------------- ---------------- ------------------ ------------ -----------
02                   OGG$CAP_EXT_I_02 20170728 12:46:25    9331491518        5963
02                   OGG$CAP_EXT_P_02 20170728 12:46:25    9331491518        5963
02                   OGG$CAP_EXT_S_02 20170728 12:46:25    9331491518        5963
03                   OGG$CAP_EXT_I_03 20170728 12:46:25    9331491518        1019
03                   OGG$CAP_EXT_P_03 20170728 12:46:25    9331491518        1019
03                   OGG$CAP_EXT_S_03 20170728 12:46:25    9331491518        1019
05                   OGG$CAP_EXT_I_05 20170728 12:46:25    9331491518        5963
05                   OGG$CAP_EXT_P_05 20170728 12:46:25    9331491518        5963
05                   OGG$CAP_EXT_S_05 20170728 12:46:25    9331491518        5963
06                   OGG$CAP_EXT_I_06 20170728 12:46:25    9331491518        5963
06                   OGG$CAP_EXT_P_06 20170728 12:46:25    9331491518        5963
06                   OGG$CAP_EXT_S_06 20170728 12:46:25    9331491518        5963
07                   OGG$CAP_EXT_I_07 20170728 12:46:25    9331491518        5963
07                   OGG$CAP_EXT_P_07 20170728 12:46:25    9331491518        5963
07                   OGG$CAP_EXT_S_07 20170728 12:46:25    9331491518        5963
09                   OGG$CAP_EXT_I_09 20170728 12:46:25    9331491518        5963
09                   OGG$CAP_EXT_P_09 20170728 12:46:25    9331491518        5963
09                   OGG$CAP_EXT_S_09 20170728 12:46:25    9331491518        5963
10                   OGG$CAP_EXT_I_10 20170728 12:46:25    9331491518        5963
10                   OGG$CAP_EXT_P_10 20170728 12:46:25    9331491518        5963
10                   OGG$CAP_EXT_S_10 20170728 12:46:25    9331491518        5963
11                   OGG$CAP_EXT_I_11 20170728 12:46:25    9331491518        5963
11                   OGG$CAP_EXT_P_11 20170728 12:46:25    9331491518        5963
11                   OGG$CAP_EXT_S_11 20170728 12:46:25    9331491518        5963
12                   OGG$CAP_EXT_I_12 20170728 12:46:25    9331491518        5963
12                   OGG$CAP_EXT_P_12 20170728 12:46:25    9331491518        5963
12                   OGG$CAP_EXT_S_12 20170728 12:46:25    9331491518        5963
13                   OGG$CAP_EXT_I_13 20170728 12:46:25    9331491518        5963
13                   OGG$CAP_EXT_P_13 20170728 12:46:25    9331491518        5963
13                   OGG$CAP_EXT_S_13 20170728 12:46:25    9331491518        5963
14                   OGG$CAP_EXT_I_14 20170802 10:33:18    9406839331        5148
14                   OGG$CAP_EXT_P_14 20170802 10:38:32    9407231427        5141
14                   OGG$CAP_EXT_S_14 20170802 10:37:23    9407006792        5145
15                   OGG$CAP_EXT_I_15 20170728 12:46:25    9331491518        5963
15                   OGG$CAP_EXT_P_15 20170728 12:46:25    9331491518        5963
15                   OGG$CAP_EXT_S_15 20170728 12:46:25    9331491518        5963

36 rows selected.



SELECT  SUBSTR(CAPTURE_NAME,LENGTH(CAPTURE_NAME)-1,2) as "CAPTURE GROUP" ,
        CAPTURE_NAME , 
        REQUIRED_CHECKPOINT_SCN
   FROM DBA_CAPTURE 
ORDER BY SUBSTR(CAPTURE_NAME,LENGTH(CAPTURE_NAME)-1,2), CAPTURE_NAME;

CAPTURE
GROUP    CAPTURE_NAME                     REQUIRED_CHECKPOINT_SCN
-------- ------------------------------ -------------------------
02       OGG$CAP_EXT_I_02                             10123069599
02       OGG$CAP_EXT_P_02                             10123069599
02       OGG$CAP_EXT_S_02                             10123069599
03       OGG$CAP_EXT_I_03                              9741450444
03       OGG$CAP_EXT_P_03                              9741450577
03       OGG$CAP_EXT_S_03                              9741450577
05       OGG$CAP_EXT_I_05                             10123069599
05       OGG$CAP_EXT_P_05                             10123069599
05       OGG$CAP_EXT_S_05                             10123069599
06       OGG$CAP_EXT_I_06                             10123069599
06       OGG$CAP_EXT_P_06                             10123069599
06       OGG$CAP_EXT_S_06                             10123069599
07       OGG$CAP_EXT_I_07                             10123069599
07       OGG$CAP_EXT_P_07                             10123069599
07       OGG$CAP_EXT_S_07                             10123069599
09       OGG$CAP_EXT_I_09                             10123069599
09       OGG$CAP_EXT_P_09                             10123069599
09       OGG$CAP_EXT_S_09                             10123069599
10       OGG$CAP_EXT_I_10                             10123069599
10       OGG$CAP_EXT_P_10                             10123069599
10       OGG$CAP_EXT_S_10                             10123069599
11       OGG$CAP_EXT_I_11                             10123069599
11       OGG$CAP_EXT_P_11                             10123069599
11       OGG$CAP_EXT_S_11                             10123069599
12       OGG$CAP_EXT_I_12                             10123069599
12       OGG$CAP_EXT_P_12                             10123069599
12       OGG$CAP_EXT_S_12                             10123069599
13       OGG$CAP_EXT_I_13                             10123069599
13       OGG$CAP_EXT_P_13                             10123069599
13       OGG$CAP_EXT_S_13                             10123069599
14       OGG$CAP_EXT_I_14                             10123069599
14       OGG$CAP_EXT_P_14                             10123069599
14       OGG$CAP_EXT_S_14                             10123069599
15       OGG$CAP_EXT_I_15                             10123069599
15       OGG$CAP_EXT_P_15                             10123069599
15       OGG$CAP_EXT_S_15                             10123069599

36 rows selected.

Sunday, March 4, 2018

ORA-08103: object no longer exists. Investigation and resolution.

=======================================
General
=======================================
Large Oracle trace file (~600 Mb) are generated.
What could be the cause?

=======================================
Evidences
=======================================
In alert.log there is only this non informative error message:
Stopping background process CJQ0

Checking the trace files timestamp, they all were generated at apprximately same time each day, at ~23:51.

In all trace files, there is same error message: "dump suspect buffer"
But the trace file has also session detalis.

-rw-r----- 1 oracle dba 650,767,781 Feb  3 23:52
/software/oracle/diag/rdbms/igt/igt/trace/igt_j000_16517.trc

Trace file /software/oracle/diag/rdbms/igt/igt/trace/igt_j000_16517.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /software/oracle/111
System name:    Linux
Node name:      arg-cti-2-aps-1
Release:        2.6.18-274.el5
Version:        #1 SMP Fri Jul 8 17:36:59 EDT 2011
Machine:        x86_64
Instance name: igt
Redo thread mounted by this instance: 1
Oracle process number: 175
Unix process pid: 16517, image: oracle@arg-cti-2-aps-1 (J000)


*** 2018-02-03 23:50:24.551
*** SESSION ID:(164.47150) 2018-02-03 23:50:24.551
*** CLIENT ID:() 2018-02-03 23:50:24.551
*** SERVICE NAME:(SYS$USERS) 2018-02-03 23:50:24.551
*** MODULE NAME:() 2018-02-03 23:50:24.551
*** ACTION NAME:() 2018-02-03 23:50:24.551

kcbzib: dump suspect buffer
buffer tsn: 4 rdba: 0x01007af2 (4/31474)
scn: 0x0dac.ef6f7cc4 seq: 0x01 flg: 0x04 tail: 0x7cc42301
frmt: 0x02 chkval: 0xe4b3 type: 0x23=PAGETABLE SEGMENT HEADER
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x0000000082AA8000 to 0x0000000082AAA000

=======================================
Investigation
=======================================
Checking DBA_HIST tables by session_id.

SELECT  DBA_HIST_ACTIVE_SESS_HISTORY.* 
FROM DBA_HIST_ACTIVE_SESS_HISTORY 
WHERE session_id = '164' AND session_serial# = 47150;

user_id = 49
sql_id  cnsp07td49muk

SELECT * FROM SBA_HIST_SQLTEXT 
 WHERE sql_id = 'cnsp07td49muk';

SELECT * FROM DBA_USERS WHERE user_id = 49;
--MY_USER

SELECT * FROM DBA_JOBS WHERE schema_user = 'MY_USER'

Indeed there is a job, which is scheduled to run at 23:50 Daily.

Checking that job code, there is the SQL which sql_id is referenced by 'cnsp07td49muk';

Checking the application log table for that job, there this error message:

Error during procedure: ORA-08103: object no longer exists

=======================================
Resolution
=======================================
Following the error message ORA-08103: object no longer exists, additional jobs which where running at same time were investigated.

Apparently, there was another job, performing a purge of old data from same table, by ALTER TABLE XXX TRUNCATE PARTITION YYY;

This was the cause for error ORA-08103: object no longer exists error.

Issue was resolved, by rescheduling the failing job, so it would not run at he same time with the purge data job.