In this post, we will walk through a step-by-step installation of VMware Cloud Foundation (VCF) 4.0. It has been a couple of weeks since this version was released. I have been working with VCF and VVD for a couple of years and have deployed it multiple times, so I wanted to write a blog post on it.
Before we start with VCF 4.0, please check the network configuration in my VyOS Virtual Router blog here.
VMware Cloud Foundation is a private as well as public cloud solution. It is a unified platform that gives you the entire SDDC stack. VCF 4.0 includes vSphere 7.0, vSAN 7.0, NSX-T 3.0, vRA 8.1, and SDDC Manager to manage your virtual infrastructure domains. One more big change in VCF 4.0 is Kubernetes cluster deployment through SDDC Manager after the management domain has been deployed successfully.
Bill of materials (image copied from the VMware site)
Check out VMware’s official site for all new features & release notes here…
With that, let’s get started…
A VMware Cloud Foundation deployment requires multiple networks to be in place before we start. Let’s go through the network requirements for a successful deployment.
Network Requirements: The following management domain networks must be in place on the physical switch (TOR). Jumbo frames (MTU 9000) are recommended on all VLANs, with a minimum MTU of 1600. Check the port requirements on the VMware site: https://ports.vmware.com/home/VMware-Cloud-Foundation
Follow my previous blog for network configuration here.
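Once the hosts are up, one way to confirm that an MTU really holds end to end is a don't-fragment ping whose payload is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). The sketch below only prints the vmkping command to run on an ESXi host. The interface name vmk0 and the target address 192.0.2.10 are placeholders, not values from this lab.

```shell
#!/bin/sh
# Print a don't-fragment vmkping command for a given MTU.
# ICMP payload = MTU - 20 (IPv4 header) - 8 (ICMP header).
# vmk0 and 192.0.2.10 are placeholders; substitute your own values.
mtu_ping_cmd() {
  mtu="$1"; vmk="$2"; target="$3"
  payload=$((mtu - 28))
  echo "vmkping -I $vmk -d -s $payload $target"
}

mtu_ping_cmd 9000 vmk0 192.0.2.10   # jumbo frames
mtu_ping_cmd 1600 vmk0 192.0.2.10   # minimum MTU
```

If the ping with the larger payload fails while a small one succeeds, some hop on that VLAN is probably still running a lower MTU.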
Physical Hardware: A minimum of 4 physical servers with the VMware ESXi 7.0 hypervisor preinstalled, for the vSAN cluster.
AD & DNS Requirements: Active Directory (a domain controller) must be in place. In our case, the DC is connected to VLAN 1631 on VyOS. The following DNS records must be in place before we start the installation.
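A quick way to catch a missing or wrong record before validation is to resolve each name from a machine that uses the same DNS server. This is a minimal sketch, assuming a Linux jump host where getent is available. The host names and addresses in the comments are hypothetical, and it only covers forward lookups, so reverse records still need a separate check.

```shell
#!/bin/sh
# Forward-lookup check for the records the deployment will rely on.
# resolve() uses getent (Linux); swap in nslookup or dig if you prefer.
resolve() {
  getent hosts "$1" | awk '{print $1; exit}'
}

# Compare the resolved address with the expected one.
check_record() {
  fqdn="$1"; expected="$2"
  actual="$(resolve "$fqdn")"
  if [ "$actual" = "$expected" ]; then
    echo "OK   $fqdn -> $actual"
  else
    echo "FAIL $fqdn: expected $expected, got ${actual:-no answer}"
    return 1
  fi
}

# Hypothetical names and addresses; replace with your own records.
# check_record esxi01.lab.local 172.16.31.101
# check_record sddc-manager.lab.local 172.16.31.60
```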
Pre-installed ESXi Configuration:
All ESXi hosts must have ‘VM network’ and ‘Management network’ configured with VLAN ID 1631.
An NTP server address must be configured on all ESXi hosts.
The SSH and NTP services must be enabled, with the policy set to ‘Start and stop with host’.
All additional disks for the vSAN configuration must be present on each ESXi host.
Let’s begin with the nested ESXi configuration for our lab.
Create 4 new VMs on the physical ESXi host. These will be the nested ESXi hosts where our VCF environment gets installed. All of them should have an identical configuration. I have the following configuration in my lab.
CPU hot plug: Enabled
Hardware Virtualization: Enabled
Memory: 50 GB
HDD1: ESXi OS installation
HDD2: vSAN Cache Tier
HDD3: vSAN Capacity Tier
HDD4: vSAN Capacity Tier
2 network adapters: connected to the ‘Trunk’ port group.
ESXi ISO attached to the CD drive.
After completing the ESXi installation, configure each host with the correct IP address and make sure that ‘Test management network’ shows OK on all of them.
We now need to mark the additional HDDs on each ESXi host as SSD. You can either connect to the DC and PuTTY into each ESXi host, or open the ESXi console, and run these commands.
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d mpx.vmhba1:C0:T1:L0 -o enable_ssd
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d mpx.vmhba1:C0:T2:L0 -o enable_ssd
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d mpx.vmhba1:C0:T3:L0 -o enable_ssd
esxcli storage core claiming reclaim -d mpx.vmhba1:C0:T1:L0
esxcli storage core claiming reclaim -d mpx.vmhba1:C0:T2:L0
esxcli storage core claiming reclaim -d mpx.vmhba1:C0:T3:L0
Once done, run the ‘esxcli storage core device list’ command and verify that these devices are now reported as SSD instead of HDD (‘Is SSD: true’).
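With four hosts and three disks each, typing these commands by hand gets repetitive. As a small sketch, the function below prints the same two commands for every device ID passed to it. The device IDs shown are the ones from this lab, and yours may differ. It emits the rule and the reclaim per device rather than all rules first, which gives the same end result.

```shell
#!/bin/sh
# Print the SATP rule and reclaim commands for each device that should
# be flagged as SSD. Device IDs below are from this lab; list yours with
# 'esxcli storage core device list'.
ssd_mark_cmds() {
  for dev in "$@"; do
    echo "esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d $dev -o enable_ssd"
    echo "esxcli storage core claiming reclaim -d $dev"
  done
}

ssd_mark_cmds mpx.vmhba1:C0:T1:L0 mpx.vmhba1:C0:T2:L0 mpx.vmhba1:C0:T3:L0
```

Review the printed commands first, then paste them into the ESXi shell on each host.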
This completes our ESXi configuration.
Cloud Builder is an appliance provided by VMware to build the VCF environment on the target ESXi hosts. It is a one-time-use VM and can be powered off after the VCF management domain has been deployed successfully. After deployment, we will use SDDC Manager to manage additional VI domains. I will deploy this appliance in VLAN 1631 so that it can reach the DC and all our ESXi servers. Download the Cloud Builder (CB) appliance from VMware downloads.
Deployment is straightforward, like any other OVA deployment. Make sure you choose a compliant password while deploying the OVA. The admin and root passwords must be at least 8 characters long and include at least one uppercase letter, one lowercase letter, one digit, and one special character. If the password does not meet these requirements, the deployment will fail and you will have to redeploy the OVA.
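Since a rejected password means redeploying the OVA, it can be worth checking a candidate password before you start. This is a rough sketch of the rules quoted above and nothing more. It treats any non-alphanumeric character as "special", which is my assumption, and the appliance may accept a narrower set.

```shell
#!/bin/sh
# Check a candidate password against the rules quoted in this post:
# at least 8 characters, with one uppercase letter, one lowercase
# letter, one digit and one special character.
# Assumption: "special" means any non-alphanumeric character.
check_password() {
  pw="$1"
  [ "${#pw}" -ge 8 ] || { echo "too short"; return 1; }
  case "$pw" in *[[:upper:]]*) ;; *) echo "no uppercase"; return 1 ;; esac
  case "$pw" in *[[:lower:]]*) ;; *) echo "no lowercase"; return 1 ;; esac
  case "$pw" in *[[:digit:]]*) ;; *) echo "no digit"; return 1 ;; esac
  case "$pw" in *[![:alnum:]]*) ;; *) echo "no special character"; return 1 ;; esac
  echo "ok"
}

# Example with a made-up password:
# check_password 'Example#Pass1'
```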
So far, we have completed the configuration of the domain controller, the VyOS router, and the nested ESXi hosts, as well as the Cloud Builder OVA deployment. The following VMs have been created on my physical ESXi host.
Log into Cloud Builder using the configured FQDN and click Next on this screen.
Check if all prereqs are in place and click Next.
Download the ‘Deployment Parameter Workbook’ on this page.
Deployment Parameter Workbook:
It is an Excel workbook that needs to be filled in accurately without breaking its format. Be careful while filling it in, as it provides all the input parameters for our VCF deployment. Let’s have a look at the sheets.
Prerequisite Checklist: Cross check your environment as per prereqs.
Management Workloads: All license information needs to go in here.
Users and Groups: You need to specify all passwords here. Pay attention to the NSX-T passwords, as validation fails if they do not match the password policy.
Hosts and Networks: Edit network information as per the environment and update ESXi information accordingly.
Deploy Parameters: Fill out all the information as per your environment. If you miss something, the cell turns red, which causes validation to fail.
After you complete this workbook, it needs to be uploaded to Cloud Builder on this page.
Next comes validation of the workbook and the preinstalled ESXi hosts.
Resolve any errors or warnings that show up here.
Status should show ‘Success’ for all validation items. Click Next and click on Deploy SDDC.
All SDDC components get installed on the nested ESXi hosts, and you see this message.
SDDC Deployment Complete.
Check the SDDC Manager and vCenter.
It was definitely not that easy for me the first time. This was my 3rd deployment, and it succeeded on the first run. The last successful run took around 4 hours to complete. I wrote this blog after resolving the errors I ran into, so that you don’t waste time troubleshooting. If you miss any of the steps in this post, you will most likely run into errors.
Here are some suggestions.
Keep checking vcf-bringup.log on Cloud Builder for any errors during deployment. The file is located in ‘/opt/vmware/bringup/logs/’ on Cloud Builder. It gives you a live view of the deployment and of any errors that caused it to fail. Use ‘tail -f vcf-bringup.log’ to follow the latest updates.
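The log is quite chatty, so it can help to filter it down while you watch. The sketch below assumes that failing tasks write a line containing the word ERROR. I have not verified the exact log format, so adjust the pattern if your log looks different.

```shell
#!/bin/sh
# Show only error lines from the bring-up log, read from stdin.
# Assumption: failing tasks log a line containing the word ERROR.
bringup_errors() {
  grep 'ERROR'
}

# On Cloud Builder (log path as given in this post):
# tail -f /opt/vmware/bringup/logs/vcf-bringup.log | bringup_errors
```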
Another error, ‘The manifest is present but user flag causing to skip it.’, caused my deployment to fail.
To resolve this, I changed the NSX-T deployment size from ‘Medium’ to ‘Small’. It looked like a compute resource issue.
Also, keep checking NTP sync on Cloud Builder. Mine did not sync with NTP for some reason, and I had to sync it manually.
Steps to manually sync NTP…
systemctl stop ntpd.service
Wait for a minute, then run these:
systemctl start ntpd.service
systemctl restart ntpd.service
Verify the offset again. It should be close to 0.
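One way to read the offset is from the output of ‘ntpq -p’, where the currently selected peer is marked with an asterisk and the offset column is in milliseconds. The sketch below assumes ntpq is available on Cloud Builder alongside ntpd. The 10 ms limit in the usage line is an arbitrary number I picked, not a documented VCF threshold.

```shell
#!/bin/sh
# Read 'ntpq -p' output on stdin and check the offset (column 9, in ms)
# of the currently selected peer, i.e. the line starting with '*'.
# Usage: ntpq -p | ntp_offset_ok <limit_ms>
ntp_offset_ok() {
  awk -v limit="$1" '
    /^\*/ {
      found = 1
      off = $9
      if (off < 0) off = -off
      if (off <= limit) { print "offset " $9 " ms: ok"; exit 0 }
      print "offset " $9 " ms: too large"; exit 1
    }
    END { if (!found) { print "no synchronised peer"; exit 1 } }
  '
}

# Example (the limit is an arbitrary choice):
# ntpq -p | ntp_offset_ok 10
```

If it reports no synchronised peer, give ntpd a few more minutes after the restart before checking again.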
NSX-T Deployment error.
The NSX-T OVF wasn’t getting installed. I could only see a generic error in vCenter. Rebooting the entire environment fixed the issue for me.
Also, use the command ‘systemctl restart vcf-bringup’ to pause the deployment when required.
For example, my NSX-T Manager was taking a long time to deploy, and because of a timeout interval on Cloud Builder, it used to cancel the deployment, assuming a failure. So, I paused the deployment after the NSX-T OVA job was triggered from CB and hit ‘Retry’ once NSX-T had been deployed successfully in vCenter. It picked up from that point and moved on.
That’s it for this post. I will come up with some more posts on VCF 4.0. Next up is deploying an additional workload domain and the application networks for it.
Feel free to share my blog on social media. 😊