Monday, 5 December 2011

Active-Active MS SQL Server Failover Cluster using ESX 4.0 - part 1

Microsoft SQL Server does not support scale-out deployments, which means that scaling out has to be built into the design of the application. The simplest way of doing this is by splitting data across two or more databases and running an active-active cluster. This is the first in a series of posts that will show you how to deploy such a system. I will then use the active-active cluster to deploy two separate organizations of Microsoft Dynamics CRM 2011.

VMware has a pretty good document detailing how to create the necessary Guests (VMs) to run a successful Microsoft failover cluster here. In my opinion, there is no point in running a failover cluster from a single host; you could argue that there is no point in looking for database performance in ESX at all, but that's a different discussion. At any rate, I am using a setup that allows the guests to be on different hosts, and these are the prerequisites for the hosts:
  • Two physical network adapters dedicated to the MSCS cluster and to the public and private networks.
  • One physical network adapter dedicated to the service console (ESX hosts) or the VMkernel (ESXi hosts).
  • Fibre Channel (FC) SAN. Shared storage must be on an FC SAN.
  • RDM in physical compatibility (pass-through) or virtual compatibility (non-pass-through) mode. VMware recommends physical compatibility mode. The cluster cannot use virtual disks for shared storage. Failover clustering with Windows Server 2008 is not supported with virtual compatibility mode (non-pass-through) RDMs.
If you have all that, then let's move on; otherwise, you could have a look at VMware's document and follow the single-host section.

I created three volumes (LUNs) in our SAN array: two for data (one for each resource group, more on this later) and another one for the quorum disk. The quorum disk can be small; in production (Win2k3) we run with 0.5 GB, but here I gave it 1 GB, and 20 GB each for the data disks.

I tried to save time and simply cloned an already existing Guest with Windows 2008 R2 installed, but it turns out that this is not a very good idea; have a look at this post. In essence, the network cards get the same GUID, and this creates problems. I got around this by adding a couple of extra NICs to the second Guest, but there must be a more elegant way of doing this. I'll have to investigate.
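If you want to check whether your guests have ended up with duplicate adapter GUIDs, a quick way is to compare the SettingID values that WMI reports for the adapters. This is just a sketch, run from a command prompt on each node:

    rem List each network adapter and its GUID (SettingID).
    rem Run on both nodes and compare - every adapter should have a different GUID.
    wmic nicconfig get Description,SettingID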

Anyway, at this point you should have: three LUNs in your SAN array that are accessible to your Hosts, and two Guests, each with at least two NICs and a hard drive for the OS.

Before you start, you need to ensure that the LUNs you created in the SAN array are available to the Hosts. You can do this in vSphere or through the console. (I haven't fully worked out the console route, so I'll have to investigate that as well.)

In vSphere, from Home -> Inventory -> Datastores, Right Click on the Datacentre containing your Hosts -> Select Rescan for Datastores… -> Click OK.
1.1 vSphere Datastores View.
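If you prefer to script the rescan, a rough PowerCLI equivalent (assuming you have PowerCLI installed; the vCenter name below is just a placeholder) looks like this:

    # Connect to vCenter - replace the name with your own server
    Connect-VIServer vcenter.example.local
    # Rescan all HBAs and refresh VMFS volumes on every host in the inventory
    Get-VMHost | Get-VMHostStorage -RescanAllHba -RescanVmfs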
You can check that the scan has successfully found the new LUNs by trying to add a new datastore. Right Click on your datacentre -> Add Datastore -> Select your Host -> Select Disk/LUN -> You should now see your LUNs, see figure 1.2.
1.2 Example LUN.
Do not add the LUN here, just click Cancel; this is a simple test to check that the LUNs are visible to the Hosts.
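A non-destructive alternative to the Add Datastore test, again in PowerCLI and again with a placeholder host name, is to list the disk LUNs a host can see and pick out the new devices by size:

    # List the disk LUNs visible to one host (the host name is a placeholder)
    Get-ScsiLun -VmHost (Get-VMHost esx01.example.local) -LunType disk |
        Select-Object CanonicalName, CapacityMB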

Note that from the console, you can just check the output of: fdisk -l
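If you also want to trigger the rescan itself from the console rather than from vSphere, something along these lines should work on ESX classic (the adapter name is a placeholder, and I haven't verified this on every 4.0 build):

    # Rescan one HBA for new LUNs; repeat for each vmhba as needed
    esxcfg-rescan vmhba1
    # List the SCSI devices (LUNs) the host can now see
    esxcfg-scsidevs -l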

You are finally ready to start. In vSphere from Home -> Inventory -> VMs and Templates, Right Click on Node 1 (win2k8a in my example) and select Edit Settings:
1.3 Node1 (win2k8a) original settings.
  1.  Click Add.
  2. Select Hard Disk.
  3. Select Raw Device Mappings.
  4. Select the first LUN you want to use.
  5. Select Store with Virtual Machine.
  6. Select Physical Compatibility.
  7. Set Virtual Device Node to SCSI (1:0).
  8. Click Finish.
In step 6 I've selected Physical Compatibility because I'm not fussed about taking snapshots, as I'm running this as a test. This might not be your case, so you might want to set it to Virtual Compatibility, your call (but bear in mind the note above: failover clustering with Windows Server 2008 is not supported with virtual compatibility RDMs).

In Step 7, it is important that you select a different controller from the one your OS disk is on. Normally, SCSI (1:0) should be fine. If done correctly, ESX will add a new SCSI controller.

Repeat steps 1 to 8 for the other two LUNs. Note that you'll need to use different virtual device nodes in step 7, e.g. SCSI (1:1) and SCSI (1:2).
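Incidentally, if you'd rather create the RDM pointer files from the console instead of the Add Hardware wizard, vmkfstools can do it. The device identifier and paths below are placeholders; -z gives you physical (pass-through) compatibility, while -r gives virtual:

    # Create a physical-compatibility RDM pointer file for one LUN
    # (replace the naa identifier and datastore/VM paths with your own)
    vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx \
        /vmfs/volumes/datastore1/win2k8a/quorum-rdm.vmdk

You would still need to attach the resulting vmdk to the VM on the shared SCSI controller afterwards.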

Ensure that you set the SCSI controller 1 SCSI Bus Sharing to Physical, as per figure 1.4.
1.4 Second Controller SCSI Bus Sharing settings.
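For reference, once the RDM and the second controller are in place, the relevant lines of the node's .vmx file should look roughly like this (the controller type and file name below are only illustrative and will differ depending on what you picked):

    scsi1.present = "TRUE"
    scsi1.virtualDev = "lsisas1068"
    scsi1.sharedBus = "physical"
    scsi1:0.present = "TRUE"
    scsi1:0.fileName = "quorum-rdm.vmdk"

The line worth double-checking is scsi1.sharedBus, which corresponds to the setting in figure 1.4.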
Let’s now move to the second node. Right Click Node 2 (win2k8b in my example) and select Edit Settings:
  1. Click Add.
  2. Select Hard Disk.
  3. Select Use an existing virtual disk.
  4. Browse to the datastore holding node 1 and in the directory for node 1, select the first LUN disk.
  5. Set Virtual Device Node to SCSI (1:0).
  6. Click Finish.
In step 4, you should add the disks in the same order, so that they map onto each other. I'm not sure if this is strictly needed, but it kind of makes sense.

In Step 5, it is important that you select a different controller from the one your OS disk is on. Normally, SCSI (1:0) should be fine. If done correctly, ESX will add a new SCSI controller.

Ensure that you set the SCSI controller 1 SCSI Bus Sharing to Physical, see figure 1.4.
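Before calling the ESX side done, it is worth checking from inside each guest that the shared disks are actually visible. Disk Management will show them, or you can check from an elevated command prompt:

    diskpart
    DISKPART> list disk

You should see the OS disk plus the three new disks (the 1 GB quorum and the two 20 GB data disks), probably still offline and uninitialised at this point.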

This pretty much concludes the ESX setup and part one of this series. In the next post, I'll cover how to install failover clustering.
