In this post, I will demonstrate how to recover the voting disk in case we lose its only copy. The voting disk will be automatically recovered using the latest available backup of the OCR.
Current scenario:
The only copy of the voting disk is present in the test diskgroup on disk ASMDISK010.
We will corrupt ASMDISK010 so that we lose the only copy of the voting disk.
We will then restore the voting disk to another diskgroup using the OCR.
Let’s start …
– Currently, we have one voting disk. Let us corrupt it and check whether the clusterware continues to run.
– FIND OUT LOCATION OF VOTEDISK
[grid@host01 cssd]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name          Disk group
--  -----    -----------------                ---------          ----------
 1. ONLINE   00ce3c95c6534f44bfffa645a3430bc3 (ORCL:ASMDISK010)  [TEST]
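As a side note, the disk and diskgroup of each online voting disk can be pulled out of this output with a short script. A minimal sketch, with the sample output above embedded so it is self-contained (the field positions are taken from that sample; in real use you would pipe the live command output in instead):

```shell
#!/bin/sh
# Sketch: parse `crsctl query css votedisk` output and print the disk and
# diskgroup of every voting disk whose state is ONLINE. The sample output
# is embedded here; replace the printf with the real command on a live node.
sample='##  STATE    File Universal Id                File Name          Disk group
--  -----    -----------------                ---------          ----------
 1. ONLINE   00ce3c95c6534f44bfffa645a3430bc3 (ORCL:ASMDISK010)  [TEST]'

printf '%s\n' "$sample" |
awk '$2 == "ONLINE" { print "disk=" $4, "diskgroup=" $5 }'
# prints: disk=(ORCL:ASMDISK010) diskgroup=[TEST]
```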
– FIND OUT THE NO. OF DISKS IN test DG (CONTAINING VOTEDISK)
ASMCMD> lsdsk -G test
Path
ORCL:ASMDISK010
– Let us corrupt ASMDISK010
— bs = block size = 4096 bytes
— count = number of blocks overwritten = 1000000 (~1M)
— total bytes corrupted = 4096 * 1000000 (~4096 MB = size of one partition)
#dd if=/dev/zero of=/dev/oracleasm/disks/ASMDISK010 bs=4096 count=1000000
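The dd sizing above can be sanity-checked with shell arithmetic before running the destructive command. A minimal sketch (the ~4096 MB partition size is the figure assumed in the comments above):

```shell
#!/bin/sh
# Sketch: confirm that bs * count covers the whole partition before dd runs.
bs=4096          # block size passed to dd
count=1000000    # number of blocks to overwrite

total=$((bs * count))
echo "total bytes: $total"               # 4096000000
echo "approx MB:   $((total / 1000000))" # 4096, matching the partition size
```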
Here, I was expecting the clusterware to stop, as the only voting disk was no longer available, but surprisingly the clusterware kept running. I even waited for quite some time, but to no avail. I would be glad if someone could give more input on this.
Finally, I stopped clusterware and tried to restart it. It was not able to restart.
– Reboot all the nodes and note that the clusterware does not start, as the voting disk is not accessible.
#crsctl stat res -t
– Now, since the voting disk cannot be restored to the test diskgroup (its only disk has been corrupted), we will create another diskgroup, votedg, where we will restore the voting disk.
RECOVER VOTING DISK
– To move the voting disk to the votedg diskgroup, the ASM instance should be up, and for the ASM instance to be up, CRS should be up. Hence we will:
– stop crs on all the nodes
– start crs in exclusive mode on one of the nodes (host01)
– start asm instance on host01 using pfile (since spfile of ASM instance is on ASM)
– create a new diskgroup votedg
– move voting disk to votedg diskgroup
– stop crs on host01 (it was running in exclusive mode)
– restart crs on host01
– start crs on rest of the nodes
– start cluster on all the nodes
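The steps above can be collected into one driver sketch. By default it only prints each command (dry run), so nothing is executed against a live cluster; set DRYRUN=0 to actually run it as root. The host names and the votedg diskgroup are the ones assumed in this post, and the SQL*Plus steps are left as comments since they are interactive:

```shell
#!/bin/sh
# Sketch: dry-run driver for the voting disk recovery steps listed above.
# DRYRUN=1 (default) prints each command; DRYRUN=0 executes it.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run crsctl stop crs -f            # 1. stop crs on all the nodes (as root, each node)
run crsctl start crs -excl        # 2. start crs in exclusive mode on host01
# 3. start the ASM instance on host01 using a pfile (interactive, in SQL*Plus)
# 4. create a new diskgroup votedg (interactive, in SQL*Plus or asmca)
run crsctl replace votedisk +votedg  # 5. restore voting disk from OCR backup
run crsctl stop crs               # 6. stop the exclusive-mode crs on host01
run crsctl start crs              #    and restart it normally
run crsctl start cluster -all     # 7. start the cluster on all the nodes
```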
– IMPLEMENTATION –
– stop crs on all the nodes (if it does not stop, kill the ohasd process and retry)
root@hostn# crsctl stop crs -f
– start crs in exclusive mode on one of the nodes (host01)
root@host01# crsctl start crs -excl
– start asm instance on host01 using pfile
grid@host01$ echo INSTANCE_TYPE=ASM >> /u01/app/oracle/init+ASM1.ora
grid@host01$ chown grid:oinstall /u01/app/oracle/init+ASM1.ora
SQL> startup pfile='/u01/app/oracle/init+ASM1.ora';
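For reference, the pfile built above contains just the one parameter. A slightly fuller, hypothetical version might also set the disk discovery path; both parameter names are standard ASM initialization parameters, but the discovery string value is an assumption for this ASMLib-based environment:

```
# /u01/app/oracle/init+ASM1.ora — minimal pfile to bring the ASM instance up
INSTANCE_TYPE=ASM
# optional: where ASM should look for candidate disks (value is an assumption)
ASM_DISKSTRING='ORCL:*'
```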
– create a new diskgroup votedg
– move the voting disk to the votedg diskgroup; the voting disk is automatically recovered using the latest available backup of the OCR.
root@host01#crsctl replace votedisk +votedg
– stop crs on host01 (it was running in exclusive mode)
root@host01#crsctl stop crs
– restart crs on host01
root@host01#crsctl start crs
– start crs on rest of the nodes (if it does not start, kill ohasd process and retry)
root@host02#crsctl start crs
root@host03#crsctl start crs
– start cluster on all the nodes and check that it is running
root@host01#crsctl start cluster -all
root@host01#crsctl stat res -t
I hope this post was useful.