Pages

Sunday, April 22, 2018

Golden Gate Lag at Chkpt is high. What does it mean?

==========================
General
==========================
The issue: One of the Oracle GG (Golden Gate) EXTRACT  processes was in ABENDED status. 
After fixing the underlying problem , the value for Lag at Chkpt is high.
What does it mean?


>%cd /software/ogg/1212
/software/ogg/1212/>% ./ggsci
info all

GGSCI (esp-tel-1-dbu-2) 26> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING  
EXTRACT     RUNNING     EXT_I_01    00:00:06      00:00:06    
EXTRACT     RUNNING     EXT_I_02    00:00:05      00:00:02    
EXTRACT     RUNNING     EXT_I_04    00:00:06      00:00:04    
EXTRACT     RUNNING     EXT_I_05    00:00:06      00:00:02    
EXTRACT     RUNNING     EXT_I_06    00:00:07      00:00:10
EXTRACT     RUNNING     EXT_I_07    83:06:52      00:00:09
EXTRACT     RUNNING     EXT_S_02    00:00:05      00:00:07    
EXTRACT     RUNNING     EXT_S_04    00:00:05      00:00:02    
EXTRACT     RUNNING     EXT_S_05    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_S_06    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_S_07    96:46:27      00:00:01    


==========================
Explanation
==========================
Time Since Checkpoint and Lag at Checkpoint

Time Since Checkpoint
Each process has it's own checkpoint file. 
Whenever a process see's a commit in the transaction, a checkpoint is made in the checkpoint file. 

You can see the increase in the Time Since Checkpoint, when extract was stopped or when extract or replicat is processing a long running transaction

If you stop a Extract process at 10:00AM and restart it at 10:30 AM, upon restarting, the Extract process reads it's checkpoint file to know the last successful processed transaction. And it starts processing all the transactions after that checkpoint.

You can see the increase in the Time Since Checkpoint, when extract or replicat is processing a long running transaction. 
By using the SEND EXTRACT STATUS or SEND REPLICAT STATUS command you can know the exact status of the processes. 
The INFO command always fetches the information from the checkpoint file.
The SEND command communicates directly with the process and gets the current/latest status of it.


Lag at Checkpoint
For Extract, lag is the difference, in seconds, between the time that a record was processed by Extract (based on the system clock) and the timestamp of that record in the data source.

For Replicat, lag is the difference, in seconds, between the time that the last record was processed by Replicat (based on the system clock) and the timestamp of the record in the trail.

 ==========================
Resolution
==========================
Nothing to do, just wait.
The Extract was in Abended state for several hours, and it got a backlog of transactions to reprocess.
After ~10 minutes, the Lag at Checkpoint was reduced from ~90 minutes to ~70 minutes


GGSCI (my_server) 757> SEND REPLICAT REP_I_01 STATUS

Sending STATUS request to REPLICAT REP_I_01 ...
  Current status: Processing data
  Sequence #: 4,614
  RBA: 428,201,835
  275,562 records in current transaction.


GGSCI (my_server) 758> SEND REPLICAT REP_I_01 STATUS

Sending STATUS request to REPLICAT REP_I_01 ...
  Current status: Processing data
  Sequence #: 4,614
  RBA: 429,066,155
  277,022 records in current transaction.


The Transaction is not yet completed.

GGSCI (esp-tel-1-dbu-2) 26> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                    
EXTRACT     RUNNING     EXT_I_01    00:00:06      00:00:08    
EXTRACT     RUNNING     EXT_I_02    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_I_04    00:00:06      00:00:08    
EXTRACT     RUNNING     EXT_I_05    00:00:06      00:00:08    
EXTRACT     RUNNING     EXT_I_06    00:00:07      00:00:07    

EXTRACT     RUNNING     EXT_I_07    64:58:26      00:00:01 
EXTRACT     RUNNING     EXT_S_01    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_S_02    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_S_04    00:00:06      00:00:08    
EXTRACT     RUNNING     EXT_S_05    00:00:05      00:00:02    
EXTRACT     RUNNING     EXT_S_06    00:00:06      00:00:08    
EXTRACT     RUNNING     EXT_S_07    78:51:46      00:00:01    


After additional ~10 minutes, the Lag at Checkpoint was reduced further.
EXTRACT     RUNNING     EXT_I_01    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_I_02    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_I_04    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_I_05    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_I_06    00:00:05      00:00:03    
EXTRACT     RUNNING     EXT_I_07    54:01:19      00:00:08   
EXTRACT     RUNNING     EXT_S_01    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_S_02    00:00:06      00:00:06    
EXTRACT     RUNNING     EXT_S_04    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_S_05    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_S_06    00:00:06      00:00:07    
EXTRACT     RUNNING     EXT_S_07    66:03:33      00:00:07  

No comments:

Post a Comment