Oracle ASM in Azure


(Warning - there are a couple of extra steps if you are using premium SSD devices, which you probably want for I/O performance - please make sure to follow them if you are using those disks)

Now we're doing more stuff in the cloud, I've had to dust off my old sysadmin hat and start doing more of the stuff I'd forgotten about from when I was a sysadmin (10+ years ago). This week I've been setting up ASM instances in Azure, so I had to do all the work around making the disk devices available to ASM.

On premises I'd just pick up the phone and say "create devices for me" - now I actually have to work out how to do it.

So here is my quick guide to setting up ASM devices in Azure (most of this will be the same for other clouds or on premises). In this basic example I'm just going to add 2 disks of 1TB each, one to use for the DATA disk group and one to use for the FRA disk group.

So assuming you already have a basic server up and running, you now need to add 2 disks to it. To do this go to the Azure portal, click on the disks icon and then add two disks, making sure to click the save button or nothing will actually happen (I made this mistake, so that's why I specifically mention it).

The Azure screen will then show 3 additional disks: 1 was for the /oracle filesystem, which I'd already dealt with the previous day, and the other two will be used for the ASM example below.



Now the disks are saved they are provisioned to the server and you should be able to see them without having to do anything else (at least it worked this way on Red Hat 7.2 for me). However, on other or older OSes you may need to scan the SCSI bus for the operating system to discover them. The command to do this is rescan-scsi-bus.sh, which is in the sg3_utils package (at least for Red Hat) if that package is not already installed.

This command basically just probes for any new or changed devices that are attached to the system.
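If rescan-scsi-bus.sh isn't available, the same probe can usually be triggered by hand through sysfs. This is only a sketch and not something I needed on my server: the function name rescan_scsi_hosts is made up, and it assumes each adapter exposes a scan file under /sys/class/scsi_host/hostN. The optional argument is only there so the function can be pointed at a scratch directory to try it out safely.

```shell
#!/bin/sh
# Hypothetical helper: ask every SCSI host adapter to rescan its bus.
# Writing "- - -" (wildcard channel, target and LUN) to a host's scan file
# requests a full rescan. Needs root when run against the real sysfs tree.
rescan_scsi_hosts() {
    sysfs_root="${1:-/sys/class/scsi_host}"
    for host in "$sysfs_root"/host*; do
        # Skip if the glob matched nothing or this host has no scan file
        [ -e "$host/scan" ] || continue
        echo "- - -" > "$host/scan"
        echo "rescanned $(basename "$host")"
    done
}
```

Run as root with no arguments (rescan_scsi_hosts) to probe the real adapters, then check dmesg for newly attached disks.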

To check what has been discovered you can look in a couple of places - you can either run

[root]# dmesg |grep SCSI
[    0.211099] SCSI subsystem initialized
[    1.020215] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[    2.892767] sd 5:0:1:0: [sdb] Attached SCSI disk
[    2.893235] sd 2:0:0:0: [sda] Attached SCSI disk
[  363.573682] sd 4:0:0:0: [sdc] Attached SCSI disk
[87200.690290] sd 4:0:0:1: [sdd] Attached SCSI disk
[87201.360334] sd 4:0:0:2: [sde] Attached SCSI disk

or

[root]# grep SCSI /var/log/messages
Feb 21 09:17:43 localhost kernel: SCSI subsystem initialized
Feb 21 09:17:43 localhost kernel: Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
Feb 21 09:17:43 localhost kernel: sd 5:0:1:0: [sdb] Attached SCSI disk
Feb 21 09:17:43 localhost kernel: sd 2:0:0:0: [sda] Attached SCSI disk
Feb 21 09:17:49 localhost smartd[521]: Monitoring 0 ATA and 0 SCSI devices
Feb 21 09:23:41 localhost kernel: sd 4:0:0:0: [sdc] Attached SCSI disk
Feb 22 09:30:58 localhost kernel: sd 4:0:0:1: [sdd] Attached SCSI disk
Feb 22 09:30:59 localhost kernel: sd 4:0:0:2: [sde] Attached SCSI disk

The messages file is perhaps clearer as it shows the timestamp at which the 2 disks were discovered, which gives you more confidence that you have the right ones.

So in the case above sdd and sde were the discovered disks and you'll see /dev/sdd and /dev/sde have been created.
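If the log is long, pulling out just the device names saves some squinting. A minimal sketch, assuming the kernel messages look like the ones above (the function name attached_disks is my own invention):

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print the name of
# every disk reported as "Attached SCSI disk", one per line, in log order.
attached_disks() {
    sed -n 's/.*\[\(sd[a-z][a-z]*\)\] Attached SCSI disk.*/\1/p'
}
```

It works on either source, for example dmesg | attached_disks, or grep 'Feb 22' /var/log/messages | attached_disks to see only the disks that turned up on a particular day.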

Another useful command at this point is lsblk - this gives some more detail about the disks and their current use. If I run that on my server I see

[root]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
fd0      2:0    1    4K  0 disk
sda      8:0    0 29.3G  0 disk
├─sda1   8:1    0  500M  0 part /boot
└─sda2   8:2    0 28.8G  0 part /
sdb      8:16   0   14G  0 disk
└─sdb1   8:17   0   14G  0 part /mnt/resource
sdc      8:32   0  100G  0 disk
└─sdc1   8:33   0  100G  0 part /oracle
sdd      8:48   0 1023G  0 disk
sde      8:64   0 1023G  0 disk
sr0     11:0    1  1.1M  0 rom

We can see from this that sda contains / and /boot, sdb has /mnt/resource, sdc has /oracle (which was another disk I added previously) and sdd/sde are currently not partitioned and have nothing on them (at least from a filesystem point of view).
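With more than a handful of disks it can help to list the unpartitioned ones automatically. The following is a rough sketch under a couple of assumptions: the function name unpartitioned_disks is hypothetical, it expects the two-column output of lsblk -rn -o NAME,TYPE on stdin, and it matches partitions to disks purely by name prefix (so it won't cope with multipath or LVM layouts).

```shell
#!/bin/sh
# Hypothetical helper: read "NAME TYPE" pairs (as printed by
# `lsblk -rn -o NAME,TYPE`) on stdin and print every disk that has no
# partition beneath it. A partition is matched to a disk by name prefix.
unpartitioned_disks() {
    awk '
        $2 == "disk" { disks[$1] = 1 }
        $2 == "part" { parts[$1] = 1 }
        END {
            for (d in disks) {
                used = 0
                for (p in parts) if (index(p, d) == 1) used = 1
                if (!used) print d
            }
        }
    ' | sort
}
```

Usage would be lsblk -rn -o NAME,TYPE | unpartitioned_disks. On the server above that would list fd0 as well as sdd and sde, since the floppy device is also of type disk, and a disk with a filesystem written straight onto it would show up too, so treat the output as a list of candidates to check rather than a list of free disks.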

Another point that's worth mentioning here is that Red Hat has only discovered 1 path to each disk - there is no multipathing on this server. For comparison, this is what the output looks like on a server where we do have multipathing

[root@multipath ~]# lsblk
NAME                        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                           8:0    0  100G  0 disk
└─mpatha (dm-0)             253:0    0  100G  0 mpath
  ├─mpathap1 (dm-2)         253:2    0  512M  0 part  /boot
  ├─mpathap2 (dm-4)         253:4    0 34.1G  0 part
  │ └─vg00-swapvol (dm-9)   253:9    0    4G  0 lvm   [SWAP]
  └─mpathap3 (dm-6)         253:6    0 65.4G  0 part
    ├─vg00-rootvol (dm-8)   253:8    0    5G  0 lvm   /
    ├─vg00-swapvol (dm-9)   253:9    0    4G  0 lvm   [SWAP]
    ├─vg00-varvol (dm-10)   253:10   0    4G  0 lvm   /var
    ├─vg00-crashvol (dm-11) 253:11   0 29.3G  0 lvm   /var/crash
    ├─vg00-auditvol (dm-12) 253:12   0  256M  0 lvm   /var/log/audit
    ├─vg00-tmpvol (dm-13)   253:13   0    2G  0 lvm   /tmp
    ├─vg00-homevol (dm-14)  253:14   0    1G  0 lvm   /home
    └─vg00-lvoracle (dm-15) 253:15   0   20G  0 lvm   /oracle
sdd                           8:48   0  100G  0 disk
└─mpatha (dm-0)             253:0    0  100G  0 mpath
  ├─mpathap1 (dm-2)         253:2    0  512M  0 part  /boot
  ├─mpathap2 (dm-4)         253:4    0 34.1G  0 part
  │ └─vg00-swapvol (dm-9)   253:9    0    4G  0 lvm   [SWAP]
  └─mpathap3 (dm-6)         253:6    0 65.4G  0 part
    ├─vg00-rootvol (dm-8)   253:8    0    5G  0 lvm   /
    ├─vg00-swapvol (dm-9)   253:9    0    4G  0 lvm   [SWAP]
    ├─vg00-varvol (dm-10)   253:10   0    4G  0 lvm   /var
    ├─vg00-crashvol (dm-11) 253:11   0 29.3G  0 lvm   /var/crash
    ├─vg00-auditvol (dm-12) 253:12   0  256M  0 lvm   /var/log/audit
    ├─vg00-tmpvol (dm-13)   253:13   0    2G  0 lvm   /tmp
    ├─vg00-homevol (dm-14)  253:14   0    1G  0 lvm   /home
    └─vg00-lvoracle (dm-15) 253:15   0   20G  0 lvm   /oracle

You can see that the exact same thing is accessible via sda and sdd - this is where multipath comes in - it gives you a single device name that can access both paths, giving failover and improved performance (in the output above that name is mpatha). For this Azure example I'm ignoring multipath as it's not there on my Azure server (at least the small one I chose), but be aware that if you do have multipath, ASM should be doing the access via the multipath device and not the individual disk names that I will show below.

OK, now we know the disk names we have the option to fdisk them to partition up the disks - this isn't strictly necessary as we want to use the whole disk anyway, but it seems to be good practice as it at least writes a partition table to the disk and makes Linux (and actually people) aware that this disk is being used for something.

To do the fdisk I just run

fdisk /dev/sdd

Then go through the menu options choosing n for new and accepting all the defaults - when it stops prompting you for more answers it's ready to write the results back to the disk which you do by typing w.
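If you'd rather not walk through the menus for every disk, the same single whole-disk partition can be created non-interactively. Note this sketch swaps fdisk for parted, which is not what I actually ran; the function name partition_whole_disk and the DRY_RUN switch are made up for illustration, and it assumes an msdos (MBR) label is acceptable, which holds for disks under 2TiB like these. It wipes any existing partition table, so try it in dry-run mode first.

```shell
#!/bin/sh
# Hypothetical helper: create one partition spanning the whole disk.
# This uses parted instead of fdisk and destroys any existing partition
# table. Set DRY_RUN=1 to print the commands instead of running them.
partition_whole_disk() {
    disk="$1"
    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi
    # An msdos (MBR) label is enough for disks under 2 TiB
    $run parted -s "$disk" mklabel msdos
    # Start at 1MiB for alignment and use the rest of the disk
    $run parted -s "$disk" mkpart primary 1MiB 100%
}
```

With DRY_RUN=1 set, partition_whole_disk /dev/sdd just prints the two parted commands; unset it and run as root to apply them.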

After you've done that for both disks you'll see the output in lsblk is slightly different

[root@]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
fd0      2:0    1    4K  0 disk
sda      8:0    0 29.3G  0 disk
├─sda1   8:1    0  500M  0 part /boot
└─sda2   8:2    0 28.8G  0 part /
sdb      8:16   0   14G  0 disk
└─sdb1   8:17   0   14G  0 part /mnt/resource
sdc      8:32   0  100G  0 disk
└─sdc1   8:33   0  100G  0 part /oracle
sdd      8:48   0 1023G  0 disk
└─sdd1   8:49   0 1023G  0 part
sde      8:64   0 1023G  0 disk
└─sde1   8:65   0 1023G  0 part
sr0     11:0    1  1.1M  0 rom

We have 2 new lines reflecting the fact that the disks are partitioned. If you look in /dev you'll also see new devices called /dev/sdd1 and /dev/sde1 (or whatever your device names were). If you create more than one partition these will show up as /dev/sdd2, /dev/sdd3 etc - you get the idea.

OK great we're now good to go right?

Well you could do but you have 2 main issues:

1) Permissions won't allow oracle to use the devices
2) The device names can change - for example if there is an issue talking to /dev/sdb this device name might be used for something else or the device order could get randomly shuffled - this is no good - we need persistent names here.

There are 2 main solutions to this

1) ASMLib
2) udev

Now ASMLib was meant to be Oracle's solution to this but I never really saw the point of this extra layer of complexity (and something else to go wrong); in fact at one point in the past it was pulled but then came back. It's still there as an option if you want to use it, but I struggle to see the benefit when you can just do it with the built-in tool udev.

So ignoring ASMLib here is how you do it with udev

Now the key bit of information required by udev to make all this work and to guarantee things end up with the same name all the time is the WWN name of the disk - this is essentially like the MAC address of a network card - it is a globally unique value that can only be assigned to one thing. This means we can build a set of rules based on this fact to give us a consistent device name.

So how do we find this WWN?

Well, there are a couple of ways - either this, which shows the WWN for each disk at the end of the line

[root@]# lsscsi -i
[1:0:0:0]    cd/dvd  Msft     Virtual CD/ROM   1.0   /dev/sr0   -
[2:0:0:0]    disk    Msft     Virtual Disk     1.0   /dev/sda   3600224805f644b4df8f61359aa823ccd
[4:0:0:0]    disk    Msft     Virtual Disk     1.0   /dev/sdc   36002248061b57d183e2b7e3923d5bd5c
[4:0:0:1]    disk    Msft     Virtual Disk     1.0   /dev/sdd   360022480a89ae89fa85a9e69fd66934f
[4:0:0:2]    disk    Msft     Virtual Disk     1.0   /dev/sde   3600224809d6a88fc75be22809c742190
[5:0:1:0]    disk    Msft     Virtual Disk     1.0   /dev/sdb   36002248064f6b661ed183884d69ac5db

Or for a specific disk this command (and this is actually what udev itself makes use of) - this command varies between versions so check you have the right one

[root@]#  /usr/lib/udev/scsi_id -g -u -d /dev/sde1
3600224809d6a88fc75be22809c742190
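To get a tidy device-to-WWN list ready for building rules, the lsscsi output can be trimmed down. A small sketch, assuming the column layout shown above where the device and its ID are the last two fields (the function name disk_wwns is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: read `lsscsi -i` output on stdin and print
# "<device> <id>" for every entry whose type column is "disk".
disk_wwns() {
    awk '$2 == "disk" { print $(NF - 1), $NF }'
}
```

Running lsscsi -i | disk_wwns skips the CD/DVD entry and leaves one line per disk, such as /dev/sdd followed by its ID.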

OK - so now we have the WWNs, how do we make use of udev to give us some disk names?

We need to add an additional 'rules' file, which is processed on startup, that will discover the disks and create some stuff for us based on the configuration we create. So let's create that file - I name it starting with 99 so it's the last rule file that gets processed.

vi /etc/udev/rules.d/99-oracle-asmdevices.rules

Inside this file I add the following 2 lines of configuration

KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="360022480a89ae89fa85a9e69fd66934f", SYMLINK+="oracleasm-data1", OWNER="oracle", GROUP="dba", MODE="0660"

KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="3600224809d6a88fc75be22809c742190", SYMLINK+="oracleasm-fra1", OWNER="oracle", GROUP="dba", MODE="0660"

So let's examine that a little more - what's it actually doing? If we look at the first line it's doing the following:

1) Look at the devices the kernel has discovered under /dev/sd?1 (note the rule is only looking for partition 1 and nothing else)
2) Using the command scsi_id ...... probe the parent disk and get the WWN
3) If the WWN matches the value I have then do the following:
  • create a symbolic link called oracleasm-data1 (under /dev)
  • set the owner to oracle
  • set the group to dba
  • set the permissions to 0660

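With more than a couple of disks, typing these rules by hand gets error-prone, so here is a sketch of generating them instead. The function name asm_udev_rule is made up; the scsi_id path, owner, group and mode are simply the values used in this post, so adjust them for your own system.

```shell
#!/bin/sh
# Hypothetical helper: print one udev rule for a given WWN and symlink name.
# The scsi_id path, owner, group and mode are the values used in this post.
# $parent is left literal on purpose: udev expands it, not the shell.
asm_udev_rule() {
    wwn="$1"
    link="$2"
    printf 'KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="%s", SYMLINK+="%s", OWNER="oracle", GROUP="dba", MODE="0660"\n' "$wwn" "$link"
}
```

For example, asm_udev_rule 360022480a89ae89fa85a9e69fd66934f oracleasm-data1 prints the rule for the DATA disk, and the output can be appended to the rules file. Generating the lines this way also guarantees plain straight quotes, which udev needs.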

To check that you've configured the rules file correctly you can run


udevadm test /block/sde/sde1

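Note the use there of the /block prefix rather than /dev - this will give lots of nice output telling you what it's doing. If you just want to force it to reread all the rules and set everything up then run the following (the first command only rereads the rules files; the second replays the device events so the rules are actually applied to disks that are already attached)

udevadm control --reload-rules
udevadm trigger --type=devices --action=change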

After that is complete you'll see these symlinks created under /dev

[root@dev]# ls -l oracle*
lrwxrwxrwx. 1 root root 4 Feb 22 13:20 oracleasm-data1 -> sdd1
lrwxrwxrwx. 1 root root 4 Feb 22 13:18 oracleasm-fra1 -> sde1

which point at these devices - which we can see now have the right permissions.

[root@dev]# ls -l sdd1 sde1
brw-rw----. 1 oracle dba 8, 49 Feb 22 13:20 sdd1
brw-rw----. 1 oracle dba 8, 65 Feb 22 13:20 sde1
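The same check can be scripted so it can be rerun after a reboot. This is only a sketch: check_asm_device is a hypothetical name, and it relies on GNU stat (the -c option) and readlink -f, which are what Red Hat ships.

```shell
#!/bin/sh
# Hypothetical helper: follow a symlink and check the owner, group and
# permission bits of whatever it points at. Relies on GNU stat (-c).
check_asm_device() {
    link="$1"; want_owner="$2"; want_group="$3"; want_mode="$4"
    target=$(readlink -f "$link") || return 1
    actual=$(stat -c '%U %G %a' "$target") || return 1
    if [ "$actual" = "$want_owner $want_group $want_mode" ]; then
        echo "OK   $link -> $target ($actual)"
    else
        echo "FAIL $link -> $target ($actual)"
        return 1
    fi
}
```

For the devices above that would be check_asm_device /dev/oracleasm-data1 oracle dba 660, and the same again for /dev/oracleasm-fra1. Note that stat reports the mode without the leading zero.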

So looking good - now we just have to set up ASM. I'll do this using asmca in silent mode with the following command string pointing at the symlink names (I've got the GI installed via a clone command as detailed here)

asmca -silent -configureASM -sysAsmPassword password here -asmsnmpPassword password here -diskString '/dev/oracleasm*' -diskGroupName DATA -disk '/dev/oracleasm-data1' -redundancy EXTERNAL -diskGroupName FRA -disk '/dev/oracleasm-fra1' -redundancy EXTERNAL

So in this case I'm explicitly pointing DATA at the first symlink and FRA at the second symlink - the disk discovery path for new devices is anything matching /dev/oracleasm*

And there we have it - how to add ASM devices with consistent names to a Red Hat machine in Azure and then get ASM to use them.




2 comments:

  1. Hi Rich,

    is iSCSI the only way you can present devices to ASM? I am asking with regard to the performance.

    Cheers,
    Balazs

  2. Hi Balazs,
    It supports most things I think - I've just no experience of them. One of the main ASM guys at Oracle has a blog and he specifically mentions other methods.

    http://asmsupportguy.blogspot.co.uk/2010/04/about-asm-disk-groups-disks-and-files.html

    Cheers,
    Rich
