Tips, tricks, and missing steps for the VMWare upgrade
Having done several small / medium business upgrades from vCenter and ESXi 5.5 to 6.0 and 6.5, I wanted to share some best practices and lessons learned with the IT community.
How long does it take to upgrade vCenter?
About 2-4 hours during the actual upgrade process. Figure one hour to deploy the appliance (no risk, original vCenter is unchanged), then another hour and a half to replace your existing appliance (your original vCenter is shut down). VMWare is very good about this process and will perform many system checks before shutting down your original vCenter. If something goes wrong at this stage, remember you can shut down (power off) the new vCenter and then start your old vCenter and it should recover.
Don’t forget the preliminary research, system checks, and preparation is another 4 hours for a small or medium business (this includes research for vSphere upgrades).
1-2 hours of the preparation is getting the images downloaded, either to your burned DVD or directly from the VMWare website.
What is the risk of downtime while upgrading vCenter?
Good news: For small and medium businesses, there is almost no chance of customer impact while upgrading vCenter. Just don’t plan to do other management tasks at the same time. Remember that the hosts will keep running the VMs, the storage, and the virtual networking unchanged even if they can’t reach vCenter. DRS, HA and vMotion will not be available, so you won’t have normal redundancy.
As long as you complete your upgrade in a reasonable amount of time (one evening), you should be fine. What are the odds a host crashes in that 2-4 hour period? Generally extremely low (less than 0.1%) unless you have more than 5 hosts.
If you have 100+ hosts, then your odds of a server randomly crashing during the upgrade gets closer to 5%. But in that case, you should have a team of experts on the job, so you shouldn’t need much help from this lowly blogger <grin>.
Can I upgrade vCenter and vSphere remotely?
You can perform this entire process (vCenter and ESX host upgrades) 100% remotely. Make sure you have a plan for what to do if a host doesn’t come back after rebooting.
Research before migrating or upgrading vCenter
Make sure you have deep-dived into the existing vCenter appliance and each host before starting.
Make sure to get the following configurations:
- Existing vCenter IP , verify root username and password (see next topic)
- Network basics (internal domain name, subnet mask, gateway, two DNS servers)
- Time provider (NTP). Make sure this matches the NTP source your hosts are using! This is a good time to standardize it across your environment.
- A ‘free’ IP address on the same subnet as your existing vCenter server. Will be used for a few minutes during the upgrade.
- Configure your DNS servers to have a fully-qualified name for vCenter (the old vCenter IP and your new vCenter’s name). For example: vcenter01.company.com . Make sure you can resolve it before you start the upgrade. The vCenter install (particularly new installs) will bomb out if you don’t have good DNS.
Problem: Can’t log onto the old 5.5 vCenter appliance
Symptom: You are pretty sure you know the root password, but it is not working on the web GUI. ( https://vCenterIP:5480 ) It says wrong password.
Symptom: The 5.5 vCenter appliance was built more than a year ago.
Fix: Connect to vCenter using SSH
(I recommend using the Putty SSH application which is available on the Internet). When you enter ‘root’ and the correct password (default is ‘vmware’ but you’ve probably changed it), you will be told that the password is expired. Change the password using the SSH dialogue and now you can log on to the web GUI.
Cause: When you build a vCenter appliance (5.5, 6.0, or 6.5 or 6.7), by default, the system expires the root account after one year. I recommend un-checking this immediately whenever you build vCenter. If you want to change the root password periodically, you can still do it, but this way your department won’t get locked out.
How to upgrade vCenter from 5.5 to 6.5 or 6.7
The step by step guides available online are good.
This one at tech-coffee is straightforward and easy to use. I personally recommend deploying 6.5 vCenter appliances (rather than deploying vCenter onto a Windows server).
vSphere ESXi upgrades from 5.5 to 6.5 or 6.7
Before you upgrade your ESX hosts, take a quick read through the rest of the topics below. There are lots of good tips and things to avoid.
What is the risk of downtime for ESXi upgrades?
The first question is: Are all your hosts running the same model of hardware?
- If YES… then your risk is almost nil.
- If NO… your hosts are running different hardware models (such as a Dell R710 and a HP DL360 gen8 and a HP DL360 gen9), then your risk of downtime or customer impact is pretty high. Read through the topic below, “CPU Generation Compatibility Levels” before you continue.
Next question: Do you have enough RAM and disk space to run all your VMs on (HOSTS – 1) ?
Figure this out ahead of time. Many companies have a few high-resource VMs such as database servers, which take up all or most of the resources on a host. Plan how you will migrate the VMs around so that all the VMs will fit onto your other servers.
If you properly migrate your VMs off each host, putting it into maintenance mode, before upgrade, you should be good. I still recommend doing this portion after-hours so that you minimize impact from vMotion and have the maximum amount of redundancy during the workday in case one of your hosts randomly fails.
Starting the ESXi host upgrades
Here is a good guide for upgrading ESX 5.5 to 6.x using command line. I’ve used it successfully. You WILL need Internet access from the host. If you have a decent business ISP, it should take less than 20 minutes to download the update.
Problem: CPU Generation Compatibility Levels
This is a big concern because if you don’t avoid it in advance, you will have to shut down some VMs in order to complete the ESXi hosts upgrade.
If you have more than one model of server, deep dive on this BEFORE you start upgrading ANY hosts.
- You have older hosts mixed in with newer servers.
- After upgrading a host to vSphere 6.0 or 6.5, you cannot vMotion your VMs to it. “The virtual machine requires hardware features that are unsupported or disabled on the target host. General incompatibilities. If possible, use a cluster with Enhanced vMotion Compatibility (EVC) enabed, see KB article 1003212.“
vCenter 5.5 seems to handle CPU compatibility without any configuration steps. Later versions need to be configured for this before you turn up hosts.
In vCenter 6.0 and 6.5 and 6.7 , you need to set up a Cluster object and put your hosts into it in order for Enhanced vMotion Compatibility to work across CPU generations. But you can’t do this while your host has any VMs running. If you upgrade then move your VMs onto the host without setting up the cluster object, you will have issues vMotioning the VMs to other hosts later.
Fix: Set up cluster in vCenter, configure CPU compatibility levels to the lowest common denominator, and add your newest host to it early.
- For each of your hosts, look up their CPU generation against this chart on VMWare KB: https://kb.vmware.com/s/article/1003212
- After upgrading vCenter, but before upgrading hosts, create a cluster object (right-click the datacenter, New Cluster)
- After naming the cluster, right-click it and select Settings.
- Select the “VMware EVC” tab in settings. Enable it, and select the lowest (oldest) CPU generation for all of your hosts.
- Starting with your newest hosts (the highest generation), move them into this cluster after upgrade but before migrating VMs back onto them.
For example, I will upgrade ESX05 first because it has the highest CPU generation.
- (I have already upgraded vCenter to 6.5)
- Create a cluster object and configure CPU compatibility at the LOWEST common level
- vMotion all VMs off ESX05 to older hosts.
- Put ESX05 into maintenance mode in vCenter
- Install ESXi 6.5 on this host
- Re-add ESX05 to vCenter if needed.
- Move ESX05 into the compatibility cluster while it is in maintenance mode with no VMs.
- Configure and thoroughly test ESX05 (see topic below).
- Take ESX05 out of maintenance mode
- vMotion some VMs onto ESX05, in preparation for upgrading ESX04 (the next lower generation server) next.
If you do this right, you should be able to avoid customer impacting downtime.
Problem: Did you miss the CPU compatibility fix and already vMotioned some VMs into an upgraded, newer host?
Sorry. You are stuck with a situation where you can’t get your last host(s) into the compatibility cluster because you have live VMs on it and you can’t vMotion the VMs to any other host. This is not an acceptable stopping point because you lose cluster benefits like DRS and also can’t vMotion for regular maintenance in the future. So bite the bullet and fix it now, rather than wait until you are forced to do it in the future.
- If you get stuck in this compatibility hell, then the easiest solution is to shut down all the VMs on that host then move the host into the cluster. This normally results in a downtime of about 10-15 minutes, depending on how fast your VMs boot up.
- Here is some more information from a fellow blogger: https://tinkertry.com/easier-way-to-vmotion-to-incompatible-cpu-host
Problem: Did you upgrade the host with vCenter on it, and now you can’t migrate without shutting down vCenter?
How to fix vCenter vMotion CPU compatibility issue:
- Here are VMware’s instructions for what to do if your vCenter appliance is one of the VMs on that incompatible host. The instructions work. Gotta love chicken and egg puzzles.
Problem: Conflicting VIB Error (HP Servers especially)
Wrote a post about how to solve this issue last year. You can find it at this link here.
Testing your hosts
You can avoid 99% of the risk by testing your upgraded host thoroughly before you put production VMs onto it.
I recommend the following tests (at minimum)
- Make sure you’ve got your upgraded host configured, into vCenter, and into it’s destination cluster.
- Using a test VM (build one up if you don’t have one ready), test connecting to each virtual switch and pinging to-and-from other servers across the network.
- Storage vMotion the test VM to each major storage system you have.
- vMotion (processing) the test VM between your upgraded host and other hosts. Test both directions.
- Snapshot and delete snapshot on the test VM.