Sunday, 6 November 2011

HA Windows service with Windows Cluster Service

We support an MS Dynamics CRM application that uses a few SSIS jobs for tidying up data. They were awfully slow, so we decided to forgo them. In fairness, only the SSIS jobs that call a web service were too slow, and I'm sure that given enough time we could have made them run faster. Instead, we went a different route: a Windows service running on the backend. Our backend is clustered, though, which raises the problem of how to make sure that this service runs in High Availability mode.
It turns out that this is surprisingly easy to do with the Windows Cluster service. There is a resource type called Generic Service, which can be used to start the service on various nodes as needed.

Assuming that your Windows service is installed on all the nodes in your cluster, this is what you need to do to turn it into a clustered Windows service:
  1. Start the cluster console (Start | Run | cluadmin).
  2. Connect to the cluster if needed (it normally connects to the cluster running on the box by default).
  3. Right-click on Groups and select New | Group. Follow the wizard, make sure that all nodes are available, as shown in the screenshot below, and click Finish.
  4. Right-click on the group you have just created and select New | Resource.
  5. Make sure that you select Generic Service as the Resource Type and follow the wizard.
  6. When prompted for a service name, make sure that you enter the service name and not the service's display name. If your service has start parameters, this is where you can add them.
  7. You can now bring the service online, either by bringing the whole group online or just the service.
That is all; you should now have an HA Windows service. Note that you don't actually have to create a new group, but I think it's good practice to do so.
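
If you would rather script the steps above than click through the wizard, the following is a minimal Python sketch that builds the roughly equivalent cluster.exe command lines. It assumes that cluster.exe is available on the node and that the Generic Service resource exposes the ServiceName and StartupParameters private properties. The group, resource and service names are made up for illustration. The commands are only printed, not executed, so you can check them against `cluster /?` on your own cluster first.

```python
import subprocess


def build_cluster_commands(group, resource, service_name, start_params=""):
    """Return cluster.exe argument lists that mirror the wizard steps.

    All names passed in are placeholders; nothing is executed here.
    """
    commands = [
        # Step 3: create a new group.
        ["cluster", "group", group, "/create"],
        # Steps 4-5: create a Generic Service resource inside that group.
        ["cluster", "resource", resource, "/create",
         "/group:" + group, "/type:Generic Service"],
        # Step 6: use the service name, not the display name.
        ["cluster", "resource", resource, "/priv",
         "ServiceName=" + service_name],
    ]
    if start_params:
        # Step 6 (optional): start parameters for the service.
        commands.append(["cluster", "resource", resource, "/priv",
                         "StartupParameters=" + start_params])
    # Step 7: bring the whole group online.
    commands.append(["cluster", "group", group, "/online"])
    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in build_cluster_commands("CrmTidyUpGroup", "CrmTidyUpService",
                                      "CrmTidyUpSvc"):
        # list2cmdline quotes arguments that contain spaces.
        print(subprocess.list2cmdline(cmd))
```

Once you are happy with the printed commands, you could run them by hand or pass each list to subprocess.run on one of the cluster nodes.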
Also note that all the Windows Cluster service will do is bring the service up after a failover. If your Windows service runs a long-running process and it stops halfway because the active node goes down, the service will be brought up on the other node, which may or may not cause issues.
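
One way to reduce those issues is to have the service record its progress so that it can resume after being restarted on the other node. The sketch below is not what our service does; it is only a minimal Python illustration of the idea. It assumes the checkpoint file lives on storage that both nodes can see, and every name in it (process_all, the checkpoint path, the handle callback) is hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint)."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint via a temp file so it is never half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"next_index": next_index}, f)
    # Replace the old checkpoint with the new one in a single step.
    os.replace(tmp_path, path)


def process_all(items, handle, checkpoint_path):
    """Process items in order, resuming from the last saved checkpoint."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(items)):
        handle(items[index])
        # Record progress only after the item has been handled.
        save_checkpoint(checkpoint_path, index + 1)
```

If the node goes down between handling an item and saving the checkpoint, that item will be handled again after the failover, so the handler still needs to be safe to run twice on the same item.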
