VMWare 3.5 and VirtualCenter 2.5 – A Disaster Release?
I have been a fan of VMware since I was working with ESX 2.5.x back in the day. The move from ESX 2.5.x to 3.0.0 (which I skipped) was pretty smooth, and ESX 3.0.1 and 3.0.2 have been working well, with some minor glitches. Overall, the organization I am working for has been very pleased with the outcome, and that has helped me push virtualization deep into production. But then ESX 3.5 and VirtualCenter 2.5 came along, and since then I am no longer sure how confident I should feel about these products.
I have been running ESX 3.5 in a test environment and VCC 2.5 in production for a while now. The list of new features looks impressive, but when you look at the pieces a little more closely, the paint is not as shiny as it looked from 10,000 ft. Since ESX 3.5 requires VCC 2.5, we upgraded to VirtualCenter 2.5 first, with some surprises. The new agent did not behave on our existing production host servers, and some of them ran out of disk space (which was not apparent right away). A VMware support engineer had to get involved, and the findings were "interesting". They had seen this issue before: the installer for the agent fills up the disk on the ESX host and renders it partially unusable. However, VMware did not know what was causing the issue.
The next thing I noticed was the different behavior of the HA feature. A cluster that had enough resources available to cover the loss of one host suddenly was no longer able to accommodate the loss of one host. The cluster, which had been peaceful before, now showed a lot of red and yellow warning flags. A lot of swearing, tuning, checking, and extra work was needed to make adjustments and get back to where we were before.
The annoyances continued with the discovery that 64-bit support for the VCC client had been dropped. A lot of smaller issues popped up here and there that really make me wonder what the thought process behind those changes was. Here is another example. The VCC client now includes a patching solution for ESX. The Update Manager needs to go out to the Internet. Not a big deal, you might think, unless you are using a proxy that requires authentication. Try to change the password or user account for the proxy access and you discover that this product is barely beta release material.
Performance of the VCC just sucks. Is VMware following Microsoft’s path by bloating newer product releases? Why am I getting so many error messages, and then upon retry things just magically work? Why is the check box for "Evacuate powered off and suspended virtual machines" checked by default? It should be the exception to move those machines, not the rule.
ESX 3.5 as a standalone host is probably great, but in a clustered environment it is a risk to your job security, in my opinion. I am not considering a rollout at this point at all. Storage VMotion is a great feature, but why in he!! can it only be used via the Remote CLI? What were the VMware folks thinking when they released it like this? What about the reduced hardware and driver support? Hello!!
This rant is certainly not a complete write-up of all the issues. But I think it is a clear indicator that these two products are far from production ready. Let me throw one last thing out there: VCB. VMware’s backup solution is a bunch of crap. It looks great on the feature list for management, but the admin who has to implement and use it is just screwed by this alpha release. I was looked at as if I were crazy when ordering a third-party backup solution, because "we have the built-in VCB, why do you need another backup product?". Thanks, VMware. Stop releasing experimental features and products and go back to a more solid "production ready" approach. Nobody wants to risk their job when a problem can leave hundreds of VMs suddenly down, inaccessible, or at risk because of unfinished code under the hood.