ESXi 5.x host becomes unresponsive during vMotion

How to solve errors when a VMware vCenter vMotion migration or a configuration change fails

Seemingly out of nowhere, a virtual machine can no longer be powered on, an ESXi 5.x host becomes unresponsive after you attempt to migrate a virtual machine from VMware vCenter Server, or a configuration change fails.

Symptoms

  • VMware ESXi 5.x host becomes unresponsive after attempting to migrate a virtual machine from VMware vCenter Server;
  • Making a configuration change to the ESXi host renders the host unresponsive;
  • Migration fails at 13%;
  • Some of the virtual machines in the inventory become invalid;
  • vpxa fails to start;
  • You are unable to power on a virtual machine.

Resolution

  • Connect to the ESXi host using SSH.
  • Check if SNMP is creating too many .trp files in the /var/spool/snmp directory on the ESXi host by running the command:
    ls /var/spool/snmp | wc -l

Note: If the output is 2000 or more, the .trp files may have used up the available inodes on the host.
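The check above can be wrapped in a small script. This is only a minimal sketch: the function names count_trp and snmp_spool_overfull are hypothetical, the directory and the 2000 limit are simply the values quoted in this article, and it assumes the host shell provides ls and grep.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ESXi): count SNMP trap files and
# compare the count with the 2000-file limit mentioned in the article.

# Print the number of entries ending in .trp in the given directory.
# The directory defaults to /var/spool/snmp.
count_trp() {
    ct_dir="${1:-/var/spool/snmp}"
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    ls "$ct_dir" | grep -c '\.trp$' || true
}

# Succeed (exit status 0) when the count reaches the limit.
# Arguments: directory (default /var/spool/snmp), limit (default 2000).
snmp_spool_overfull() {
    so_dir="${1:-/var/spool/snmp}"
    so_limit="${2:-2000}"
    [ "$(count_trp "$so_dir")" -ge "$so_limit" ]
}
```

With the functions loaded, running snmp_spool_overfull && echo "too many .trp files" reports whether the spool directory has reached the limit.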


To be sure, check the root disk usage by running this command:

vdf -h

If the available space is less than 3-4 MB (or the usage, ‘USE’, is over 90%), this could be the problem.
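The 90% rule of thumb can also be checked from a script. The sketch below is an assumption-heavy illustration: root_use_percent is a hypothetical helper, it assumes awk is available on the host, and it assumes the vdf -h output contains a row whose first field is "root" and that includes a percentage field such as 94%. Verify this against the output on your own host before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read vdf -h style output on stdin and print the
# usage percentage (digits only) of the row whose first field is "root".
# Assumption: that row contains a field of the form NN%.
root_use_percent() {
    awk '$1 == "root" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+%$/) {
                sub(/%/, "", $i)   # strip the percent sign
                print $i
                exit
            }
        }
    }'
}
```

On a host it would be used as vdf -h | root_use_percent, and the printed number can then be compared with 90.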

  • Delete the .trp files in the /var/spool/snmp/ directory by running the commands:

# cd /var/spool/snmp
# for i in $(ls | grep trp); do rm -f $i;done
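As a variation on the loop above, here is a sketch that takes the directory as an argument instead of relying on the current working directory, and that only matches names ending in .trp. The function name purge_trp is hypothetical. Like the original loop, it removes the files one at a time, which avoids passing thousands of file names to a single rm command.

```shell
#!/bin/sh
# Hypothetical helper: delete every file ending in .trp in a directory.
# The directory defaults to /var/spool/snmp (the path from the article).
purge_trp() {
    pt_dir="${1:-/var/spool/snmp}"
    # List the directory, keep only names ending in .trp,
    # and remove each one using its full path.
    ls "$pt_dir" | grep '\.trp$' | while IFS= read -r pt_file; do
        rm -f "$pt_dir/$pt_file"
    done
}
```

Calling purge_trp with no arguments cleans /var/spool/snmp; other files in the directory are left alone.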


Related Articles: VMware KB | Wh33ly’s Blog

How to Find a Lost, Missing, Hidden or Removed Network Card

When you physically remove hardware from a machine, you can no longer see it in Device Manager. This does not mean that it is gone. For example, suppose you had a network card with a static IP address, removed the card, and added a new one. If you then try to give the new NIC the same IP address as the old one, you will get an error message. The error might look something like “The IP address 192.168.30.100 you have entered for this network adapter is already assigned to another adapter (Microsoft Virtual machine Bus Network Adapter #3) which is no longer present in the computer.  If the same address is assigned to both adapters and they both become active, only one of them will use this address.  This may result in incorrect system configuration”.

In Windows Server 2008 R2 and Windows 7, the dialog actually gives you an opportunity to “remove the static IP configuration for the absent adapter”. If you say Yes, this eliminates the IP conflict but does not solve the problem of the adapter still being registered in the machine. In older versions of the OS it was even worse, because you got an error message every time you went into the network properties.

Another way this comes up is when you move a virtual machine from one host to another, for example from Virtual Server 2005 R2 to Hyper-V, or from one Hyper-V machine to another where you did not do an export but just moved the VHDs and created a new machine.


Getting rid of these old devices is actually very simple. Well, it is simple if you know how.

Before you proceed, I recommend that you confirm that you have a good backup. I have never had a problem with this but hey, it is your server not mine.

Description

  • Step 1: You need to open a command prompt so you can set an environment variable before opening Device Manager. Click Start, type the following command, and then press ENTER. This will bring up a command window.

    cmd
  • Step 2: We have the command window open. We now need to set the variable (that is the “set” line) and then, with the variable set, run Device Manager.
    The file name for the Device Manager snap-in is devmgmt.msc. The first line will not appear to do anything, but it sets the environment for the next step. The second command actually opens Device Manager, but in a “special” mode that lets you show devices that no longer exist. Type the following commands, pressing ENTER after each line:

    set devmgr_show_nonpresent_devices=1

    devmgmt.msc

  • Step 3: Now all we have to do is show hidden devices, and you will be able to access the devices that are not present in the machine. In this special Device Manager window, click View on the menu, then Show Hidden Devices. This will also turn on a checkbox in front of the Show Hidden Devices menu option.
  • Step 4: Now you can just go find the adapter or device that is missing and delete it! Expand the network adapter (or whatever category of device) and look for the device that needs to be removed. The error message that you got should tell you the “name” of the device, so you just have to go find that named device. You may also notice that the icon for a “non-present” or missing device is slightly subdued, which makes it easier to find if you have many devices in a category.
    Right-Click the Device and select Uninstall

Related Article: VMware KB  | Microsoft Blog Technet