Nutanix Cluster – Enabling Maintenance mode on ESXi Host

Let's start with an overview of the Nutanix Virtual Computing Platform before jumping into the steps for enabling maintenance mode.

Hyper-Converged Infrastructure (HCI)

HCI is a software-based architecture that tightly integrates compute, storage, networking and virtualization. The vital part is that the local storage of the physical servers in the cluster is pooled and presented as a shared storage resource, which the virtualization features then rely on.

Nutanix Virtual Computing Platform

The Nutanix Virtual Computing Platform is a hyper-converged infrastructure. Nutanix uses its own NDFS (Nutanix Distributed File System) to converge the storage resources.

In general, the physical hosts in a Nutanix cluster run a standard hypervisor (ESXi is assumed here), and each host has its own hardware resources: processor, memory, storage and network.

Nutanix places a Controller VM (CVM) on each host/node of the cluster. The CVM is responsible for forming the unified shared storage resource and serving I/O requests from the hypervisor. So the CVM is the key component that enables storage-level convergence.

Logically, we can say:

  • Compute-level clustering is handled by vSphere HA and DRS on the hypervisor.
  • Storage-level convergence/clustering is handled by the Nutanix CVM.

So, when taking a host/node down for an activity, maintenance has to be enabled at two levels:

  • Hypervisor-level maintenance
  • CVM-level maintenance

Consider a Nutanix cluster with 5 ESXi hosts whose resiliency factor is set to withstand a single node failure. In that case, we can safely take one node at a time for a maintenance activity.

Steps to enable maintenance mode:

It is good practice to collect the NCC log and have it verified with Nutanix to ensure that there are no existing critical issues in the cluster.

Verify the “Data Resiliency status” of the Nutanix cluster in the Prism portal; it should be Normal before starting the activity.

As a first step, migrate all the user VMs residing on the target host (everything except the CVM) to other hosts in the cluster.

Connect to the CVM of the target ESXi host via SSH and execute the command below to find its UUID.

ncli host ls | grep -C7 [IP-Address of CVM]
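Instead of reading the UUID off the grep context by eye, the lookup can be sketched as a small filter. This is only a sketch under assumptions: the function name host_uuid_for_cvm is made up for illustration, and it assumes the listing is made of "Key : Value" lines in which each host's "Uuid" line appears before its "Controller VM Address" line. Check that against the real output on your cluster before relying on it.

```shell
# Hypothetical helper: reads `ncli host ls`-style text on stdin and prints the
# Uuid of the host whose "Controller VM Address" equals the IP given as $1.
# Assumes "Key : Value" lines, with Uuid listed before Controller VM Address.
host_uuid_for_cvm() {
  awk -v ip="$1" -F' *: *' '
    {
      key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the key
      val = $2; sub(/[ \t]+$/, "", val)            # trim trailing blanks
    }
    key == "Uuid" { uuid = val }                   # remember the latest Uuid
    key == "Controller VM Address" && val == ip { print uuid; exit }
  '
}
```

On a CVM it would be used as: ncli host ls | host_uuid_for_cvm 169.254.20.1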

Place the CVM in maintenance mode using the UUID fetched in the previous step.

ncli host edit id=[UUID] enable-maintenance-mode="true"

Verify that the CVM has been placed in maintenance mode using the following command. At this stage, CVM-level maintenance mode is enabled and confirmed.

cluster status | grep CVM
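To check one particular CVM rather than scanning every line, the filtered output can be passed through a small helper. This is a sketch under assumptions: the function name cvm_state is hypothetical, and it assumes each line has the layout "CVM: <ip> <state>" (for example a state word such as Up or Maintenance, possibly followed by a comma and more text). Verify the actual layout on your AOS version.

```shell
# Hypothetical helper: reads `cluster status | grep CVM`-style lines on stdin
# and prints the state word that follows the CVM IP given as $1.
# Assumed line layout: "CVM: <ip> <state>[, extra text]".
cvm_state() {
  awk -v ip="$1" '
    $1 == "CVM:" && $2 == ip {
      sub(/,$/, "", $3)   # drop a trailing comma such as in "Up,"
      print $3
      exit
    }
  '
}
```

On a CVM it would be used as: cluster status | grep CVM | cvm_state 169.254.20.1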

Now shut down the CVM using the command below.

cvm_shutdown -h now

Now it is safe to enable maintenance mode at the hypervisor level: all the user VMs have been migrated to other nodes, and the CVM has been brought down gracefully in the previous steps.

Place the target ESXi host in maintenance mode and proceed with your maintenance activity.
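As a recap, the CVM-side commands from the steps above can be sketched as a dry-run helper that only prints the sequence for review and runs nothing. The function name print_enter_maintenance_steps and its two arguments are made up for illustration; the commands it prints are the ones used in this post. The VM migration and the ESXi maintenance mode step are done in vSphere and are not covered by it.

```shell
# Hypothetical dry-run helper: prints (does not execute) the CVM-side commands
# for entering maintenance mode, in the order used in this post.
#   $1 = IP address of the target CVM
#   $2 = host UUID reported by `ncli host ls`
print_enter_maintenance_steps() {
  cvm_ip=$1
  host_id=$2
  printf '%s\n' \
    "ncli host ls | grep -C7 ${cvm_ip}" \
    "ncli host edit id=${host_id} enable-maintenance-mode=\"true\"" \
    "cluster status | grep CVM" \
    "cvm_shutdown -h now"
}
```

Running print_enter_maintenance_steps 169.254.20.1 [UUID] gives a checklist with the placeholders already filled in, which can be reviewed before anything is typed on the cluster.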

Steps to exit from maintenance mode:

Once the maintenance activity is complete, the node has to be added back to the cluster.

Exit the ESXi host from maintenance mode and power on the CVM.

Connect to a neighboring CVM in the cluster via SSH.

Check the status of the CVM that was just powered on. At this stage it should still be reported as being in maintenance mode.

ncli host ls | grep -C7 [IP-Address of CVM]

Take the CVM out of maintenance mode using its UUID, which is shown in the output of the previous step.

ncli host edit id=[UUID] enable-maintenance-mode="false"

Verify that the CVM has been removed from maintenance mode using the following command.

cluster status | grep CVM

The CVM is now out of maintenance mode.

Ensure that the “Data Resiliency” and metadata sync status have returned to normal in the Prism portal. It may take a few minutes for this to be reflected.
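Because the cluster can take a few minutes to settle, a command-line check may need to be repeated rather than run once. A generic retry helper can be sketched as below. The name wait_until and its arguments are hypothetical, and the check command you pass to it is up to you; nothing here is a Nutanix-provided tool.

```shell
# Hypothetical retry helper: runs a check command until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the check command and its arguments
# Returns 0 as soon as the check succeeds, 1 if every attempt fails.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"   # pause only when another attempt will follow
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, wait_until 30 10 sh -c 'cluster status | grep CVM | grep -q "169.254.20.1 Up"' would retry for up to about five minutes. The "Up" text in that check is an assumption about the output format, so adjust it to what your cluster actually prints.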

Note: In the given commands, the parameters in brackets [ ] should be replaced with the appropriate values.

For example –

ncli host ls | grep -C7 [IP-Address of CVM]   ->   ncli host ls | grep -C7 169.254.20.1

Intended Audience – Administrators of Nutanix Virtual Computing Platform with vSphere ESXi.

Thanks for reading the post and do share your views 🙂


Never Stop Learning !
