Witness for Nutanix Metro Availability

With Nutanix you have a powerful data protection solution: Metro Availability. You can enable it directly in Prism and administer it easily. Whenever I present this to prospects, one question always comes up: what happens if no one is available to fail over a site manually?

In most cases people ask this in connection with a so-called "witness" function: someone or something that decides whether a site failover has to be initiated. Imagine it's 2:00 am, the Metro-A site goes down because of a fire, and no one is awake to promote Metro-B to the active site. Bad thing, right?

With this in mind I created a solution based on the powerful REST API and an SNMP trap. I think you need an indicator that is responsible for the failover decision. So if the UPS sends one last trap before the site goes dark, that trap can be the trigger to promote Metro-B to the active site.

In vCenter Orchestrator/vRealize Orchestrator (vCO/vRO) you first have to register the REST API as a REST host (in my case for both sites, because Metro-B is active too and could also fail):

REST API

With the new REST hosts in place you can now add the Metro Availability functions as REST operations for each site:

API calls

 

As you can see, I also added a "disableMetro" function, because for the tests it's easier to disable Metro-A than to shut it down.
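To make the two REST operations more concrete outside of vCO/vRO, here is a minimal Python sketch that only builds the HTTP requests and does not send them. The URL paths (`promote` and `metro_avail_disable` under the v1 `protection_domains` resource) are my assumption of how the Prism REST API exposes these functions, so please check them in the REST API Explorer of your cluster. The function name, host name and protection domain name are purely illustrative.

```python
import base64
import urllib.parse
import urllib.request

# ASSUMPTION: path layout of the Prism v1 REST API; verify it in the
# REST API Explorer of your own cluster before relying on it.
BASE_PATH = "/PrismGateway/services/rest/v1/protection_domains"

# ASSUMPTION: endpoint names behind the "promote" and "disableMetro"
# operations that were registered in vCO/vRO.
ACTIONS = {"promote": "promote", "disableMetro": "metro_avail_disable"}


def build_metro_request(cluster, protection_domain, action, user, password):
    """Build (but do not send) the POST request for one Metro operation."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    name = urllib.parse.quote(protection_domain, safe="")
    url = f"https://{cluster}:9440{BASE_PATH}/{name}/{ACTIONS[action]}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen()` would send the call. Building and sending are kept separate so that the request can be inspected first.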

Next I registered my MacBook as an SNMP device and started the trap host on the vCO/vRO. Afterwards I cloned the "Wait for a trap on an SNMP device" workflow into my Nutanix folder and created two new workflows from the REST operations:

Workflows
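If you want to try the "wait for a trap" idea without vCO/vRO, the following is a deliberately simplified Python sketch. It does not decode the SNMP message at all: it just waits for any UDP datagram from the expected sender (for example the UPS) and treats it as the trigger. That simplification, the function name and the parameters are my own assumptions. The vCO/vRO workflow does proper SNMP handling, and a real witness should at least check the trap OID.

```python
import socket
import time


def wait_for_trap(sock, expected_source, timeout):
    """Return True once a datagram from expected_source arrives on sock,
    or False if nothing arrives from that address within timeout seconds.

    sock must be an already bound UDP socket (SNMP traps normally use
    port 162). The payload is NOT parsed as SNMP in this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            _payload, (source_ip, _port) = sock.recvfrom(65535)
        except socket.timeout:
            return False
        if source_ip == expected_source:
            return True
        # Datagram from an unexpected sender: ignore it and keep waiting.
```

Checking only the sender address is weak, because UDP source addresses can be spoofed. For anything beyond a lab test, a trap listener that validates the community string or SNMPv3 credentials is the better choice.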

 

The "NTNX_SNMP_Witness" workflow is the main workflow and calls the other ones:

Schema

 

In my case the disable function has to be called first because Metro-A is still alive during the test. ATTENTION: You can use this for automated failover scenarios as well!
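The logic of the main workflow can be sketched in a few lines of Python. This is not an export of the vCO/vRO workflow, only an illustration of the order of the steps. The three callables are placeholders that you would have to supply yourself. Treating a failed disable call as "Metro-A is unreachable" and promoting Metro-B anyway is my assumption for a real outage, where Metro-A can no longer answer.

```python
def run_witness(wait_for_trap, disable_metro_a, promote_metro_b):
    """Wait for the trap, then disable Metro-A and promote Metro-B.

    All three arguments are caller-supplied callables (hypothetical):
      wait_for_trap()   -> True if the trap arrived, False otherwise
      disable_metro_a() -> disables Metro Availability on Metro-A
      promote_metro_b() -> promotes the protection domain on Metro-B
    Returns the list of steps that were carried out.
    """
    steps = []
    if not wait_for_trap():
        return steps  # no trap, so no failover
    steps.append("trap received")

    try:
        disable_metro_a()
        steps.append("Metro-A disabled")
    except OSError:
        # ASSUMPTION: in a real outage Metro-A cannot be reached, so a
        # network error here is expected and must not stop the failover.
        steps.append("Metro-A unreachable")

    promote_metro_b()
    steps.append("Metro-B promoted")
    return steps
```

In a test like mine, where Metro-A is still running, the disable step succeeds. In a real site failure it would be skipped and only the promote call on Metro-B would matter.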

When I start the workflow, it waits until the trap is received, then disables Metro-A and promotes Metro-B:

Wait

 

I send the SNMP trap:

SNMP trap

 

and the workflow starts to disable the Metro-A protection domain and promotes Metro-B to active. Note that in my case I used a wrong name on purpose, because the test environment is a shared one and it's not allowed to change anything without registration and reservation 🙂

Workflow run

This is just a quick example of how to add a witness function. I'm sure you can do the same with other orchestration tools.

 
