===========
General
===========
In short:
Stop the cluster
Mount the Oracle shared storage locally
Do the DBA work
Unmount the Oracle shared storage
Start the cluster
How to know whether this is a Veritas or a Pacemaker cluster?
getaclu
VCS -> Veritas
PMK -> Pacemaker
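If the getaclu helper is not available, a rough fallback (an assumption, not the Starhome-supported check) is to look for the cluster CLIs on the host:
command -v hastatus >/dev/null 2>&1 && echo "Veritas (VCS)"
command -v pcs >/dev/null 2>&1 && echo "Pacemaker"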
Starhome Technote
http://10.135.10.64/portal/projects/howto/mount-cl-fs-locally.html
=====================
Pacemaker
=====================
==============
Stop the cluster
==============
#> pcs cluster stop --all
Verify it is stopped:
#> pcs status cluster
Error: cluster is not currently running on this node
Tag the Oracle volume groups and activate them locally:
#> for i in $(vgscan | grep Ora | cut -d '"' -f 2); do vgchange --addtag sometag $i; vgchange -ay --config 'activation{volume_list=["@sometag"]}' $i; done
Volume group "OraVg3" successfully changed
1 logical volume(s) in volume group "OraVg3" now active
Volume group "OraVg2" successfully changed
2 logical volume(s) in volume group "OraVg2" now active
Volume group "OraVg1" successfully changed
1 logical volume(s) in volume group "OraVg1" now active
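As a quick sanity check before mounting (a sketch; assumes the volume group names contain "Ora" and the tag "sometag" as above), list the tagged groups and their logical volumes; active volumes show an "a" in the lv_attr field:
#> vgs -o vg_name,vg_tags @sometag
#> lvs -o vg_name,lv_name,lv_attr | grep Ora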
#> mount -t xfs /dev/OraVg1/db1 /oracle_db/db1
#> mount -t xfs /dev/OraVg2/Ora_Exp /backup/ora_exp
#> mount -t xfs /dev/OraVg2/Ora_Online /backup/ora_online
#> mount -t xfs /dev/OraVg3/db2 /oracle_db/db2
or, without specifying the filesystem type:
mount /dev/OraVg1/db1 /oracle_db/db1
mount /dev/OraVg2/Ora_Exp /backup/ora_exp
mount /dev/OraVg2/Ora_Online /backup/ora_online
mount /dev/OraVg3/db2 /oracle_db/db2
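Either way, verify the mounts the same way the VCS section does below:
#> df -hP | grep ora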
==============
Perform Database Restore
==============
Do the database restore / DBA work now.
==============
Start the cluster
==============
Unmount it all locally:
#> umount -f /mnt/oratmp /oracle_db/db1 /backup/ora_exp /backup/ora_online /oracle_db/db2
or one at a time:
umount /oracle_db/db1
umount /oracle_db/db2
umount /backup/ora_exp
umount /backup/ora_online
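If a umount fails with "target is busy", something (often an Oracle process or a shell sitting in the directory) still holds the mount point; assuming fuser is installed, it can be identified with, for example:
#> fuser -vm /oracle_db/db1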
Remove the tag and deactivate the volume groups:
#> for i in $(vgscan | grep Ora | cut -d '"' -f 2); do vgchange -an $i; vgchange --deltag sometag $i; done
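Before restarting the cluster, confirm the tag is gone and the volumes are inactive (same "Ora" naming assumption as above): the vg_tags column should now be empty and lv_attr should no longer show "a".
#> vgs -o vg_name,vg_tags | grep Ora
#> lvs -o vg_name,lv_name,lv_attr | grep Ora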
Restart the cluster
#> pcs cluster start --all
=====================
pcs commands
=====================
pcs cluster stop --all
pcs status cluster
=====================
VCS
=====================
Take a screenshot of the current status:
df -hP | grep ora
/dev/vx/dsk/OraDg1/db1 79G 26G 54G 33% /oracle_db/db1
/dev/vx/dsk/OraDg2/Ora_Online 159G 16G 143G 10% /backup/ora_online
/dev/vx/dsk/OraDg2/Ora_Exp 100G 7.0G 93G 8% /backup/ora_exp
/dev/vx/dsk/OraDg3/db2 199G 662M 197G 1% /oracle_db/db2
Stop Cluster
hastop -all
or
hastop -all -force
Periodically check the cluster status until it reports as unavailable, with messages like the following:
hastatus -summary
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available
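Rather than re-running the command by hand, something like the following (assuming watch is available) refreshes the summary every few seconds:
watch -n 5 hastatus -summary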
If the Oracle service group is stuck in a PARTIAL state that cannot be resolved, flush it forcefully:
sudo hagrp -flush -force <service_group> -sys <system>
sudo hagrp -flush -force ora_igt_sg -sys apu-rm
or freeze just the Oracle group:
hagrp -freeze ora_igt_sg
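If the group was frozen, remember to unfreeze it after the work is done (same group name as in the example above):
hagrp -unfreeze ora_igt_sg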
Mount the Oracle mount points locally:
vxdg import OraDg1
vxdg import OraDg2
vxdg import OraDg3
vxvol -g OraDg1 startall
vxvol -g OraDg2 startall
vxvol -g OraDg3 startall
mount -t vxfs /dev/vx/dsk/OraDg1/db1 /oracle_db/db1
mount -t vxfs /dev/vx/dsk/OraDg2/Ora_Online /backup/ora_online
mount -t vxfs /dev/vx/dsk/OraDg2/Ora_Exp /backup/ora_exp
mount -t vxfs /dev/vx/dsk/OraDg3/db2 /oracle_db/db2
df -hP | grep ora
Do the Oracle stuff
Unmount the Oracle mount points locally:
umount /oracle_db/db1
umount /backup/ora_online
umount /backup/ora_exp
umount /oracle_db/db2
vxvol -g OraDg1 stopall
vxvol -g OraDg2 stopall
vxvol -g OraDg3 stopall
vxdg deport OraDg1
vxdg deport OraDg2
vxdg deport OraDg3
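To confirm the deport worked, the OraDg* groups should no longer appear in the list of imported disk groups:
vxdg list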
Start the cluster
Run the following command on ALL cluster nodes:
node a
hastart
node b
hastart
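Then watch the cluster come back and the Oracle group go ONLINE (group name assumed to be ora_igt_sg as above):
hastatus -summary
hagrp -state ora_igt_sg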
===========
Additional Commands
===========
Move the service to another node
As root:
pcs status
pcs resource cleanup ora_igt_rg
pcs resource disable ora_igt_rg
pcs resource enable ora_igt_rg
pcs resource show ora_igt_rg
To create a service
As root, check the Oracle service definition:
/etc/sysconfig/env.oracledb
For the Oracle systemd service, check this file:
/etc/systemd/system/dbora.service
It defines the Oracle service and its ORACLE_HOME and ORACLE_SID (via /etc/sysconfig/env.oracledb).
Then run:
systemctl daemon-reload
systemctl status dbora.service
systemctl start dbora.service
Finally, create the Pacemaker resource:
pcs resource create dbora_igt_ap systemd:dbora op stop interval=0 timeout=120s on-fail="block" monitor interval=30s timeout=600s start interval=0 timeout=120s --group ora_igt_rg
pacemaker cluster commands info
pcs status - shows the overall cluster status
pcs resource show - lists the resources and their state
pcs resource show dbora_igt_ap - shows the Oracle service resource configuration
Resource: dbora_igt_ap (class=systemd type=dbora)
Operations: monitor interval=30s timeout=600s (dbora_igt_ap-monitor-interval-30s)
            start interval=0 timeout=120s (dbora_igt_ap-start-interval-0)
            stop interval=0 on-fail=block timeout=120s (dbora_igt_ap-stop-interval-0)
pacemaker cluster commands oracle
pcs resource move ora_igt_rg PIPNVHED901G
pcs resource cleanup ora_igt_rg
pcs resource disable ora_igt_rg
pcs resource enable ora_igt_rg
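Note: pcs resource move pins the group to the target node with a location constraint. On recent pcs versions the constraint can be removed afterwards, so the cluster is free to place the group again:
pcs resource clear ora_igt_rg
pcs constraint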
pacemaker cluster commands restart oracle service
pcs status
pcs resource cleanup ora_igt_rg
pcs resource disable ora_igt_rg
pcs resource enable ora_igt_rg
pcs resource restart ora_igt_rg PIPNVHED901G
pcs status
pacemaker cluster commands other
--pcs resource disable oracle
--pcs resource enable oracle
(Oracle startup log: oracle/19/dbhome_1/rdbms/log/startup)
Check corosync in short
root>% vi /etc/corosync/corosync.conf
root>% pcs cluster sync
server901G: Succeeded
server902G: Succeeded
root>% pcs cluster reload corosync
Corosync reloaded
root>% corosync-cmapctl | grep totem.token
runtime.config.totem.token (u32) = 5000
runtime.config.totem.token_retransmit (u32) = 1190
runtime.config.totem.token_retransmits_before_loss_const (u32) = 4
totem.token (u32) = 5000
Change corosync timeout
1. Edit /etc/corosync/corosync.conf on one of the cluster nodes.
Add the token line if it does not exist, or update its value if it does.
The value is in milliseconds, e.g. 5000 for 5 seconds (the example below uses 15000, i.e. 15 seconds).
totem {
    version: 2
    secauth: off
    cluster_name: rhel7-cluster
    transport: udpu
    rrp_mode: passive
    token: 15000
}
2. Propagate the updated corosync.conf to the rest of the nodes:
pcs cluster sync
3. Reload corosync.
This command can be run from one node to reload corosync on all nodes and does not require downtime:
pcs cluster reload corosync
4. Confirm the changes:
corosync-cmapctl | grep totem.token
For example:
corosync-cmapctl | grep totem.token
runtime.config.totem.token (u32) = 5000
runtime.config.totem.token_retransmit (u32) = 1190
runtime.config.totem.token_retransmits_before_loss_const (u32) = 4