Thursday, April 11, 2024

pcs cluster switch from cluster mode to local mount mode

Stop cluster mode by enabling maintenance mode (resources keep running but become unmanaged):

root@rock8-19c-2:~>% pcs property set maintenance-mode=true
root@rock8-19c-2:~>% pcs status
Cluster name: igt055_cluster
Status of pacemakerd: 'Pacemaker is running' (last updated 2024-04-11 11:03:14Z)
Cluster Summary:
  * Stack: corosync
  * Current DC: rock8-19c-1 (version 2.1.5-8.1.el8_8-a3f44794f94) - partition with quorum
  * Last updated: Thu Apr 11 11:03:14 2024
  * Last change:  Thu Apr 11 11:03:09 2024 by root via cibadmin on rock8-19c-2
  * 2 nodes configured
  * 6 resource instances configured

              *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

Node List:
  * Online: [ rock8-19c-1 rock8-19c-2 ]

Full List of Resources:
  * Resource Group: ora_igt_rg (unmanaged):
    * db1_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2 (unmanaged)
    * online_igt_fs     (ocf::heartbeat:Filesystem):     Started rock8-19c-2 (unmanaged)
    * exp_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2 (unmanaged)
    * db2_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2 (unmanaged)
    * ora_igt_vip       (ocf::heartbeat:IPaddr2):        Started rock8-19c-2 (unmanaged)
    * ora_igt_ap        (lsb:dbora_ctl):         Started rock8-19c-2 (unmanaged)

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
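That maintenance mode really took effect can be confirmed by grepping the status output for the banner shown above; `is_maintenance` below is a hypothetical helper, demonstrated on a canned line:

```shell
# Check for the banner pcs prints when resource management is disabled.
is_maintenance() { grep -q 'Resource management is DISABLED'; }

# Demonstration on a canned line (on the server: pcs status | is_maintenance):
if printf '*** Resource management is DISABLED ***\n' | is_maintenance; then
  echo "cluster is in maintenance mode"
fi
```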



The Oracle archive log mode can be changed now.
For example:
oracle@rock8-19c-2:~>% sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Apr 11 11:15:45 2024
Version 19.22.0.0.0

Copyright (c) 1982, 2023, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.22.0.0.0

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 8589932704 bytes
Fixed Size                  8960160 bytes
Variable Size            4328521728 bytes
Database Buffers         4244635648 bytes
Redo Buffers                7815168 bytes
Database mounted.
SQL> alter database archivelog;

Database altered.

SQL> alter database open;

Database altered.

SQL> ARCHIVE LOG LIST;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /oracle_db/db1/db_igt/arch
Oldest online log sequence     496
Next log sequence to archive   498
Current log sequence           498
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
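The interactive session above can also be scripted. A minimal sketch (the /tmp path and file name are illustrative; the sqlplus call must be run as the oracle user on the DB host, so it is shown commented out):

```shell
# Write the same SQL steps to a script file (path is illustrative).
cat > /tmp/enable_archivelog.sql <<'EOF'
shutdown immediate
startup mount
alter database archivelog;
alter database open;
archive log list
exit
EOF

# On the DB host, as the oracle user (not run here):
#   sqlplus -s / as sysdba @/tmp/enable_archivelog.sql
```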


Now start cluster mode

root@rock8-19c-2:~>% pcs property set maintenance-mode=false
root@rock8-19c-2:~>% pcs resource show
Warning: This command is deprecated and will be removed. Please use 'pcs resource status' instead.
  * Resource Group: ora_igt_rg:
    * db1_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * online_igt_fs     (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * exp_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * db2_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * ora_igt_vip       (ocf::heartbeat:IPaddr2):        Started rock8-19c-2
    * ora_igt_ap        (lsb:dbora_ctl):         Started rock8-19c-2
root@rock8-19c-2:~>% pcs resource disable ora_igt_rg
root@rock8-19c-2:~>% pcs resource enable ora_igt_rg
root@rock8-19c-2:~>% pcs status
Cluster name: igt055_cluster
Status of pacemakerd: 'Pacemaker is running' (last updated 2024-04-11 12:34:54Z)
Cluster Summary:
  * Stack: corosync
  * Current DC: rock8-19c-1 (version 2.1.5-8.1.el8_8-a3f44794f94) - partition with quorum
  * Last updated: Thu Apr 11 12:34:54 2024
  * Last change:  Thu Apr 11 11:35:59 2024 by root via cibadmin on rock8-19c-2
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ rock8-19c-1 rock8-19c-2 ]

Full List of Resources:
  * Resource Group: ora_igt_rg:
    * db1_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * online_igt_fs     (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * exp_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * db2_igt_fs        (ocf::heartbeat:Filesystem):     Started rock8-19c-2
    * ora_igt_vip       (ocf::heartbeat:IPaddr2):        Started rock8-19c-2
    * ora_igt_ap        (lsb:dbora_ctl):         Started rock8-19c-2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled





Troubleshooting: sqlplus fails with

ERROR:
ORA-12547: TNS:lost contact

=============================================
Correct permissions should be:
ls -l ${ORACLE_HOME}/bin/oracle
-rwsr-s--x 1 oracle dba 409028888 Aug 27  2019 bin/oracle

But now:
ls -l ${ORACLE_HOME}/bin/oracle
-rwxr-x--x 1 oracle dba 457057360 Mar 28 08:07 /software/oracle/19c/bin/oracle
=============================================

How to fix:
Per Oracle technote "Troubleshooting ORA-12547 TNS: Lost Contact (Doc ID 555565.1)":
make sure the file system holding the database home is mounted with setuid/suid allowed, and that the database binary ($ORACLE_HOME/bin/oracle) has the correct ownership and permissions:
su - oracle
cd $ORACLE_HOME/bin/
chmod 6751 oracle
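A quick way to spot the missing setuid bit is to look at the octal mode: chmod 6751 corresponds to -rwsr-s--x, so the mode should start with 6. `check_setuid` below is a hypothetical helper, demonstrated on a scratch file (the real target would be $ORACLE_HOME/bin/oracle):

```shell
# Hypothetical helper: report whether a binary carries the setuid/setgid
# bits the oracle executable needs (chmod 6751 => -rwsr-s--x).
check_setuid() {
  # stat -c %a prints the octal mode; a leading 6 means setuid+setgid are set.
  mode=$(stat -c '%a' "$1")
  case "$mode" in
    6*) echo "$1: OK (mode $mode)";;
    *)  echo "$1: MISSING setuid/setgid (mode $mode)";;
  esac
}

# Demonstration on a scratch file:
tmp=$(mktemp)
chmod 6751 "$tmp"
check_setuid "$tmp"     # prints "<file>: OK (mode 6751)"
rm -f "$tmp"
```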

Wednesday, April 10, 2024

drbd sync SOW with drbdadm and reference

A. Activity Overview 
1. drbd synchronization

B. Prerequisite and Pre-checks:

Steps:
1. On servers BD902G and BD901G
pcs status

The server running the oracle service ora_igt_rg is Active (Primary).
The other server is Passive (Secondary).

2. drbd general notes:
Data is written on the Active server by the oracle service and synchronized by drbd to the Standby server.
drbd has its own CLI, drbdadm.

Commands are run per volume group (db2, db1, Ora_Exp, Ora_Online). Each volume sync can take a few minutes to complete.
During synchronization, drbdadm status shows the progress in percent.
drbdadm commands should be run as root.
drbdadm commands are run either on the Active (Primary) or the Passive (Secondary) node, depending on context; see below.

3. Check status before starting synchronization.
Run drbdadm status on the Active server.

Expected result:
connection:StandAlone – meaning there is no sync with the Secondary node

root>% drbdadm status
Ora_Exp role:Primary
  disk:UpToDate
  DBD902G connection:StandAlone
Ora_Online role:Primary
  disk:UpToDate
  DBD902G connection:StandAlone
db1 role:Primary
  disk:UpToDate
  DBD902G connection:StandAlone
db2 role:Primary
  disk:UpToDate
  DBD902G connection:StandAlone
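The StandAlone resources can be picked out of that output mechanically. `list_standalone` below is a hypothetical helper, demonstrated here on a canned sample of the status format:

```shell
# Hypothetical helper: print resources whose connection is StandAlone.
list_standalone() {
  awk '/^[A-Za-z]/ { res=$1 }            # resource header, e.g. "db1 role:Primary"
       /connection:StandAlone/ { print res }'
}

# Demonstration on a canned sample (on the server: drbdadm status | list_standalone):
printf 'Ora_Exp role:Primary\n  disk:UpToDate\n  DBD902G connection:StandAlone\ndb1 role:Primary\n  disk:UpToDate\n  DBD902G connection:StandAlone\n' | list_standalone
```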

4. Synchronization commands.
Labels (per the standard drbd split-brain recovery procedure):
[S] = run on the Passive (Secondary) node
[P] = run on the Active (Primary) node

[S] drbdadm status
[P] drbdadm status

[S] drbdadm secondary Ora_Exp
[S] drbdadm disconnect Ora_Exp
[S] drbdadm -- --discard-my-data connect Ora_Exp
    (equivalent syntax in newer drbd-utils: drbdadm connect --discard-my-data Ora_Exp)
[P] drbdadm connect Ora_Exp
[S] drbdadm status

[S] drbdadm secondary Ora_Online
[S] drbdadm disconnect Ora_Online
[S] drbdadm -- --discard-my-data connect Ora_Online
[P] drbdadm connect Ora_Online
[S] drbdadm status

[S] drbdadm secondary db1
[S] drbdadm disconnect db1
[S] drbdadm -- --discard-my-data connect db1
[P] drbdadm connect db1
[S] drbdadm status

[S] drbdadm secondary db2
[S] drbdadm disconnect db2
[S] drbdadm -- --discard-my-data connect db2
[P] drbdadm connect db2
[S] drbdadm status

[S] drbdadm status
[P] drbdadm status
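The per-resource sequence can be wrapped in a loop. A sketch with a dry-run guard (the `run` helper and DRY_RUN flag are illustrative; this covers only the Secondary-side steps, assuming the standard split-brain recovery layout, and the matching `drbdadm connect <res>` still has to be issued on the Primary):

```shell
# Dry-run wrapper: with DRY_RUN=1 the commands are only printed, not executed.
DRY_RUN=1
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi; }

# Secondary-side sequence for every replicated volume.
for res in Ora_Exp Ora_Online db1 db2; do
  run drbdadm secondary "$res"
  run drbdadm disconnect "$res"
  run drbdadm -- --discard-my-data connect "$res"
done
run drbdadm status
```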

5. After synchronization, check status.
Now the Primary node is aware of the Secondary node, and the Secondary node is aware of the Primary node.
In this example:
DBD901G - is Primary
DBD902G - is Secondary


root@DBD901G:~>% drbdadm  status
Ora_Exp role:Primary
  disk:UpToDate
  SRV902G role:Secondary
    peer-disk:UpToDate

Ora_Online role:Primary
  disk:UpToDate
  SRV902G role:Secondary
    peer-disk:UpToDate

db1 role:Primary
  disk:UpToDate
  SRV902G role:Secondary
    peer-disk:UpToDate

db2 role:Primary
  disk:UpToDate
  SRV902G role:Secondary
    peer-disk:UpToDate


root@DBD902G:~>% drbdadm  status
Ora_Exp role:Secondary
  disk:UpToDate
  DBD901G role:Primary
    peer-disk:UpToDate

Ora_Online role:Secondary
  disk:UpToDate
  DBD901G role:Primary
    peer-disk:UpToDate

db1 role:Secondary
  disk:UpToDate
  DBD901G role:Primary
    peer-disk:UpToDate

db2 role:Secondary
  disk:UpToDate
  DBD901G role:Primary
    peer-disk:UpToDate

completed!!
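Whether every resource has a fully synced peer can also be checked mechanically. `list_stale_peers` below is a hypothetical helper, demonstrated on a canned sample; empty output means the sync is complete:

```shell
# Hypothetical helper: print resources whose peer disk is NOT UpToDate.
list_stale_peers() {
  awk '/^[A-Za-z]/ { res=$1 }
       /peer-disk:/ && !/peer-disk:UpToDate/ { print res }'
}

# Demonstration on a canned sample (on the server: drbdadm status | list_stale_peers):
printf 'db1 role:Primary\n  disk:UpToDate\n  SRV902G role:Secondary\n    peer-disk:Inconsistent\n' | list_stale_peers
```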




Additional commands
drbdadm verify <resource>
drbdadm verify all
- run an online verification, comparing the local and peer copies of the data block by block

drbdadm primary <resource>
drbdadm secondary <resource>
- set resource as Primary or Secondary

drbdsetup status <resource>
- get status of the given resource on the current node

drbdsetup status <resource> --verbose --statistics
or
drbdsetup status --verbose --statistics
- get detailed status for the current node

drbdsetup events2 --now <resource>
- print the current state and exit

drbdsetup events2 <resource>
- continuously stream state changes as they happen

drbdsetup events2 --statistics --now <resource>
- print the current state with detailed statistics, then exit

drbdadm cstate <resource>
drbdadm cstate <resource>:<peer>
- get the connection state
- <peer> defaults to the peer's hostname as given in the configuration file

A resource may have one of the following connection states:

StandAlone
No network configuration available. 
The resource has not yet been connected, or has been administratively disconnected (using drbdadm disconnect), or has dropped its connection due to failed authentication or split brain.

Disconnecting
Temporary state during disconnection. The next state is StandAlone.

Unconnected
Temporary state, prior to a connection attempt. Possible next states: Connecting.

Timeout
Temporary state following a timeout in the communication with the peer. Next state: Unconnected.

BrokenPipe
Temporary state after the connection to the peer was lost. Next state: Unconnected.

NetworkFailure
Temporary state after the connection to the partner was lost. Next state: Unconnected.

ProtocolError
Temporary state after the connection to the partner was lost. Next state: Unconnected.

TearDown
Temporary state. The peer is closing the connection. Next state: Unconnected.

Connecting
This node is waiting until the peer node becomes visible on the network.

Connected
A DRBD connection has been established, data mirroring is now active. This is the normal state.

drbdadm role <resource>
Get the Resource Role
You may see one of the following resource roles:

Primary
The resource is currently in the primary role, and may be read from and written to. 
This role only occurs on one of the two nodes, unless dual-primary mode is enabled.

Secondary
The resource is currently in the secondary role. 
It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. 
This role may occur on one or both nodes.

Unknown
The resource’s role is currently unknown. 
The local resource role never has this status. 
It is only displayed for the peer’s resource role, and only in disconnected mode


drbdadm dstate <resource>
Get the Disk State


The disk state may be one of the following:

Diskless
No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.

Attaching
Transient state while reading metadata.

Detaching
Transient state while detaching and waiting for ongoing I/O operations to complete.

Failed
Transient state following an I/O failure report by the local block device. Next state: Diskless.

Negotiating
Transient state when an Attach is carried out on an already-Connected DRBD device.

Inconsistent
The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). Also, this status is found in one node (the synchronization target) during synchronization.

Outdated
Resource data is consistent, but outdated.

DUnknown
This state is used for the peer disk if no network connection is available.

Consistent
Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.

UpToDate
Consistent, up-to-date state of the data.
This is the normal state.