How to: Manually recover from DRBD split brain
After split brain has been detected, one node will always have the resource in a StandAlone connection state. The other might either also be in the StandAlone state (if both nodes detected the split brain simultaneously), or in WFConnection (if the peer tore down the connection before the other node had a chance to detect split brain).
At this point, unless you configured DRBD to automatically recover from split brain, you must manually intervene by selecting one node whose modifications will be discarded (this node is referred to as the split brain victim).
This intervention is made with the following commands:
Below is how the above tech note was applied in practice.
Regular flow
On secondary site (the split brain victim)
drbdadm secondary Ora_Exp
drbdadm disconnect Ora_Exp
drbdadm -- --discard-my-data connect Ora_Exp
on primary site
drbdadm connect Ora_Exp
drbdadm status
If the above does not work
on secondary site
root@server-1b:~>% drbdadm secondary Ora_Exp
root@server-1b:~>% drbdadm disconnect Ora_Exp
root@server-1b:~>% drbdadm -- --discard-my-data connect Ora_Exp
root@server-1b:~>% drbdadm invalidate Ora_Exp
On primary site
root@server-1a:~>% drbdadm status
root@server-1a:~>% drbdadm connect Ora_Exp
See progress and log:
/sys/kernel/debug/drbd/resources/<resource_name>/connections/<server_name>/0/proc_drbd
See progress:
root@server-1a:~>% drbdadm status
Ora_Exp role:Primary
disk:UpToDate
server-1b role:Secondary
peer-disk:UpToDate
Ora_Online role:Primary
disk:UpToDate
server-1b role:Secondary
peer-disk:UpToDate
db1 role:Primary
disk:UpToDate
server-1b role:Secondary congested:yes ap-in-flight:96 rs-in-flight:14336
replication:SyncSource peer-disk:Inconsistent done:81.03
db2 role:Primary
disk:UpToDate
server-1b role:Secondary congested:yes ap-in-flight:32 rs-in-flight:14336
replication:SyncSource peer-disk:Inconsistent done:86.60
In this example, Ora_Exp and Ora_Online were already in sync.
db1 and db2 are still synchronizing.
The numbers 81.03 and 86.60 are the percentage of each disk that has been synced so far.
Once the percentage reaches 100%, the two sites are in sync.
on site A
Ora_Exp role:Primary
disk:UpToDate
server-1b role:Secondary
peer-disk:UpToDate
on site B
Ora_Exp role:Secondary
disk:UpToDate
server-1a role:Primary
peer-disk:UpToDate
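To follow the resync without reading the whole status each time, the done: field can be pulled out with a small filter. This is only a sketch: sync_progress is a hypothetical helper name, and it assumes the usual drbdadm status layout in which each resource line starts in column 0 and the peer lines below it are indented.

```shell
#!/usr/bin/env bash
# Sketch: print "<resource> <percent>" for every resource that is still resyncing.
# Reads `drbdadm status` text on stdin.
# Assumption: resource lines start in column 0, peer lines are indented.
sync_progress() {
  awk '
    # Remember the current resource name (unindented "NAME role:..." line).
    /^[^ \t]/ && $2 ~ /^role:/ { res = $1 }
    {
      # Any field of the form done:NN.NN is a resync percentage.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^done:/) { sub(/^done:/, "", $i); print res, $i }
    }
  '
}
```

On the primary site it could be used as: drbdadm status | sync_progress. Resources that are already UpToDate on both sides produce no output.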
Example (all four resources):
Ora_Exp
on secondary site
drbdadm secondary Ora_Exp
drbdadm disconnect Ora_Exp
drbdadm -- --discard-my-data connect Ora_Exp
on primary site
drbdadm connect Ora_Exp
drbdadm status
Ora_Online
on secondary site
drbdadm secondary Ora_Online
drbdadm disconnect Ora_Online
drbdadm -- --discard-my-data connect Ora_Online
on primary site
drbdadm connect Ora_Online
drbdadm status
db1
on secondary site
drbdadm secondary db1
drbdadm disconnect db1
drbdadm -- --discard-my-data connect db1
on primary site
drbdadm connect db1
drbdadm status
db2
on secondary site
drbdadm secondary db2
drbdadm disconnect db2
drbdadm -- --discard-my-data connect db2
on primary site
drbdadm connect db2
drbdadm status
Then check the result:
drbdadm status
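The commands in this example follow the same pattern for every resource, so they can be generated in a loop. The sketch below only prints the commands so they can be reviewed first; it does not run drbdadm. The function names victim_commands and survivor_commands are hypothetical, and the command forms are copied from the example above.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) the recovery commands for each resource name
# passed as an argument.

# Commands for the secondary site, whose changes will be discarded.
victim_commands() {
  local res
  for res in "$@"; do
    printf 'drbdadm secondary %s\n' "$res"
    printf 'drbdadm disconnect %s\n' "$res"
    printf 'drbdadm -- --discard-my-data connect %s\n' "$res"
  done
}

# Commands for the primary site, whose data is kept.
survivor_commands() {
  local res
  for res in "$@"; do
    printf 'drbdadm connect %s\n' "$res"
  done
  printf 'drbdadm status\n'
}
```

For example, victim_commands Ora_Exp Ora_Online db1 db2 prints twelve lines for the secondary site. Printing rather than executing is a deliberate choice here, because --discard-my-data throws away that node's changes and the list should be checked before it is run on the right node.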
How to: Make a node Primary
On the site that is to become Primary:
root@server-1a:~>% drbdadm status
ogg role:Secondary
disk:UpToDate
server-1b connection:Connecting
root@server-1a:~>% drbdadm primary ogg
root@server-1a:~>% drbdadm disconnect ogg
root@server-1a:~>% drbdadm connect ogg
root@server-1a:~>% drbdadm status
ogg role:Primary
disk:UpToDate
server-1b connection:Connecting
drbdadm primary
Promote the resource's device into primary role.
You need to do this before any access to the device, such as creating or mounting a file system.
drbdadm secondary
Brings the device back into secondary role.
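To confirm that a promotion or demotion took effect, the local role of one resource can be read from the status text. A minimal sketch, assuming the usual drbdadm status layout where the resource line starts in column 0 and peer lines are indented; resource_role is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the local role (Primary or Secondary) of one resource.
# $1 = resource name; `drbdadm status` text is read from stdin.
# Assumption: the resource line is unindented, peer lines are indented.
resource_role() {
  awk -v want="$1" '
    /^[^ \t]/ && $1 == want && $2 ~ /^role:/ {
      sub(/^role:/, "", $2)   # strip the "role:" prefix
      print $2
      exit
    }
  '
}
```

It could be used as: drbdadm status | resource_role ogg. Because only unindented lines are matched, the role reported for the peer is never returned by mistake.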
Reference
https://manpages.ubuntu.com/manpages/xenial/en/man8/drbdadm.8.html
=================
Correct status
=================
root@SRV901G:~>% drbdadm status
Ora_Exp role:Primary
disk:UpToDate
SRV902G role:Secondary
peer-disk:UpToDate
Ora_Online role:Primary
disk:UpToDate
SRV902G role:Secondary
peer-disk:UpToDate
db1 role:Primary
disk:UpToDate
SRV902G role:Secondary
peer-disk:UpToDate
db2 role:Primary
disk:UpToDate
SRV902G role:Secondary
peer-disk:UpToDate
root@SRV902G:~>% drbdadm status
Ora_Exp role:Secondary
disk:UpToDate
SRV901G role:Primary
peer-disk:UpToDate
Ora_Online role:Secondary
disk:UpToDate
SRV901G role:Primary
peer-disk:UpToDate
db1 role:Secondary
disk:UpToDate
SRV901G role:Primary
peer-disk:UpToDate
db2 role:Secondary
disk:UpToDate
SRV901G role:Primary
peer-disk:UpToDate
=================
Incorrect status
=================
root@SRVDBD901G:~>% drbdadm status
Ora_Exp role:Primary
disk:UpToDate
SRVDBD902G connection:StandAlone
Ora_Online role:Primary
disk:UpToDate
SRVDBD902G connection:StandAlone
db1 role:Primary
disk:UpToDate
SRVDBD902G connection:StandAlone
db2 role:Primary
disk:UpToDate
SRVDBD902G connection:StandAlone
root@SRVDBD902G:~>% drbdadm status
Ora_Exp role:Secondary
disk:UpToDate
SRVDBD901G connection:StandAlone
Ora_Online role:Secondary
disk:UpToDate
SRVDBD901G connection:StandAlone
db1 role:Secondary
disk:UpToDate
SRVDBD901G connection:StandAlone
db2 role:Secondary
disk:UpToDate
SRVDBD901G connection:StandAlone
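The incorrect state above can be spotted by looking for connection:StandAlone under each resource. The filter below is a sketch under the same layout assumption as the real drbdadm status output (resource lines in column 0, peer lines indented); standalone_resources is a hypothetical helper name. Note that StandAlone also appears after a deliberate drbdadm disconnect, so a hit is a reason to investigate, not proof of split brain.

```shell
#!/usr/bin/env bash
# Sketch: list resources that have a peer connection in the StandAlone state.
# Reads `drbdadm status` text on stdin.
# Assumption: resource lines start in column 0, peer lines are indented.
standalone_resources() {
  awk '
    # Remember the current resource name.
    /^[^ \t]/ && $2 ~ /^role:/ { res = $1 }
    # Report each affected resource once.
    /connection:StandAlone/ && !seen[res]++ { print res }
  '
}
```

It could be used as: drbdadm status | standalone_resources. With the incorrect status shown above it would list all four resources, and with the correct status it would print nothing.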