Deploy VCF 4.0.1 with BGP loopback addresses
For a customer project I needed to deploy VCF 4.0.1 with BGP loopback addresses on a Cisco Fabric (spine-leaf setup). As you may already know, VCF requires BGP settings so that it can configure the BGP peerings in NSX-T. The VCF design we have is a little different from the blueprint (VVD 6.0.1). Instead of having the BGP peerings configured on the spines, we are configuring them on the leafs, and instead of having 2 Edges, we are configuring 4 Edges as shown below:
Because of the limitations we have with the Cisco Fabric setup, we are using loopback addresses as the BGP neighbors for the Edges. Reachability to the loopback from the firewall will be configured using a static route on the firewall, pointing to the Anycast Gateway IP on the VTEPs. But how do we deploy that with standard workflows from Cloudbuilder? In this article I will show you how to do so.
Getting started
These are the networks that will be used for the two uplink networks:
In the vcf-ems-deployment-parameter.xlsx workbook, I have entered a dummy value in the BGP neighbor IP address column. Make sure that each dummy value is in the same subnet as the corresponding uplink network. If it is not, the workbook will not pass the first JSON validation tests.
You can now start the deployment of the SDDC environment with Cloudbuilder. Cloudbuilder should eventually fail at the step “Failed to validate BGP Neighbor Peering Status for Edge node xx” because of the dummy BGP configuration from the workbook.
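The subnet rule above can be checked before the workbook is uploaded. This is a minimal sketch, not part of any VMware tooling: the helper names ip_to_int and in_subnet are hypothetical, and the uplink network 10.27.11.0 with a /24 prefix is an assumption based on the addresses that show up later in the bring-up log.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if address $1 lies in network $2 with prefix length $3.
in_subnet() {
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# Assumed uplink network: 10.27.11.0/24 (adjust to your own uplink VLANs).
in_subnet 10.27.11.1 10.27.11.0 24 && echo "dummy neighbor is inside the uplink subnet"
in_subnet 10.16.0.1 10.27.11.0 24 || echo "loopback is outside the uplink subnet"
```

The second check illustrates why the loopback address cannot be entered in the workbook directly: it is outside the uplink subnet.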
Manually fix BGP Neighbor configuration in NSX-T
At this stage, we need to log in to the NSX-T manager and change the BGP Neighbor settings so that the neighbors are valid.
Log in to the NSX-T manager, click on Networking –> Tier-0 Gateways and edit the Tier-0 gateway. Edit the two configured BGP Neighbors in the BGP section as shown below:
Change the BGP Neighbor IP address to the loopback address, raise the Max Hop Limit (a loopback peer is not directly connected, so a hop limit of 1 is not enough) and refresh the status of the BGP Neighbor. If everything has been configured correctly on the physical switches (leafs) and in the BGP Neighbor (AS number, IP address, password and Max Hop Limit), you should see the status reporting back as Success.
You might think that the retry job in the cloudbuilder will succeed, but unfortunately it won’t.
Change the BGP configuration value in Cloudbuilder
We need to trick Cloudbuilder into passing the “verifying the NSX-T Data Center BGP Peering” task. Cloudbuilder uses the BGP configuration from the vcf-ems-deployment-parameter.xlsx workbook to perform the validation, so every retry will keep failing unless we feed Cloudbuilder the loopback addresses.
Log in to the Cloudbuilder appliance with SSH. The Excel workbook is converted to a JSON file during the deployment. In my case, there was a file called vcf-public-ems.json that contained all the configuration from my workbook.
The vcf-public-ems.json can be found in the following location:
cd /opt/vmware/sddc-support/cloud_admin_tools/Resources/vcf-public-ems/
Update the vcf-public-ems.json file with the correct BGP Neighbor addresses (the loopback addresses). Search for the bgpNeighbours section and edit it as shown below. The ## remarks only indicate the previous dummy values and must not be added to the file:
"bgpNeighbours": [
  {
    "neighbourIp": "10.16.0.1", ## this was configured with 10.27.11.1
    "autonomousSystem": 65000,
    "password": "VMware1!"
  },
  {
    "neighbourIp": "10.16.0.2", ## this was configured with 10.27.12.1
    "autonomousSystem": 65000,
    "password": "VMware1!"
  }
]
}
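If you prefer not to edit the file by hand, the same change can be scripted. The following is only a sketch: replace_neighbour_ip is a hypothetical helper, and it assumes each neighbourIp value is written as a quoted string exactly as in the excerpt above. It keeps a .bak copy of the file and only touches exact matches, so an address such as 10.27.11.10 is left alone.

```shell
# Hypothetical helper: replace one neighbourIp value in a bring-up JSON file.
# Usage: replace_neighbour_ip FILE OLD_IP NEW_IP
replace_neighbour_ip() {
  file=$1
  old=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots so they match literally
  new=$3
  cp "$file" "$file.bak"                        # keep a backup of the original
  sed "s/\(\"neighbourIp\"[[:space:]]*:[[:space:]]*\"\)$old\"/\1$new\"/g" \
    "$file.bak" > "$file"
}

# Example calls (run them from the directory that holds the JSON file):
# replace_neighbour_ip vcf-public-ems.json 10.27.11.1 10.16.0.1
# replace_neighbour_ip vcf-public-ems.json 10.27.12.1 10.16.0.2
```

Note that the second call overwrites the .bak file created by the first one, so make a separate copy of the file first if you want to keep the untouched original.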
Retrieve the Cloudbuilder execution task ID
We now need to retry the job with the updated vcf-public-ems.json file. To do so, we need to find the Cloudbuilder execution task ID. Open /opt/vmware/bringup/logs/vcf-bringup-debug.log and locate the execution ID.
The execution task ID should be located in one of the latest log entries:
tail -f /opt/vmware/bringup/logs/vcf-bringup-debug.log
2020-09-22T18:14:38.945+0000 [bringup,98f70efb99536227,3168] DEBUG [c.v.e.s.o.c.c.ContractParamBuilder,pool-2-thread-9] Contract task Verify BGP Peering for the Edge Cluster input: {"nsxtManager":{"address":"nsx-t-1a.vkernelblog.com","username":"admin","password":"*****"},"tier0RouterName":"vkb-m01-ec01-t0-gw01","nsxtEdges":[{"address":"10.16.11.69","username":"admin","password":"*****"},{"address":"10.16.11.70","username":"admin","password":"*****"}],"tier0LocaleServicesList":["default"],"edgeAddressToBgpNeighborsMap":{"10.16.11.70":[{"neighborAddress":"10.27.11.1","remoteASN":65000,"password":"*****","sourceAddresses":["10.27.11.20","10.27.11.10"]},{"neighborAddress":"10.27.12.1","remoteASN":65000,"password":"*****","sourceAddresses":["10.27.12.20","10.27.12.10"]}],"10.16.11.69":[{"neighborAddress":"10.27.11.1","remoteASN":65000,"password":"*****","sourceAddresses":["10.27.11.20","10.27.11.10"]},{"neighborAddress":"10.27.12.1","remoteASN":65000,"password":"*****","sourceAddresses":["10.27.12.20","10.27.12.10"]}]}}
2020-09-22T18:14:38.946+0000 [bringup,98f70efb99536227,3168] DEBUG [c.v.e.s.o.c.ProcessingTaskSubscriber,pool-2-thread-9] Collected the following errors for task with name VerifyBgpPeeringNsxApiAction and ID 7f000001-74b5-18d5-8174-b6799bd6015b: [ExecutionError [errorCode=null, errorResponse=LocalizableErrorResponse(messageBundle=com.vmware.vcf.common.fsm.plugins.nsxt.messages)]]
2020-09-22T18:14:38.994+0000 [bringup,98f70efb99536227,2b53] WARN [c.v.e.s.o.c.ProcessingOrchestratorImpl,pool-2-thread-9] Processing State completed with failure
2020-09-22T18:14:39.326+0000 [bringup,98f70efb99536227,7a76] INFO [c.v.e.s.o.core.OrchestratorImpl,pool-2-thread-9] End of Orchestration with FAILURE for Execution ID 83fbbe85-4e44-45dc-85fe-bcec92d1388b
2020-09-22T18:14:39.326+0000 [bringup,98f70efb99536227,7a76] DEBUG [c.v.e.s.o.d.i.OrchestratorDataImpl,pool-2-thread-9] Saving executionContext 83fbbe85-4e44-45dc-85fe-bcec92d1388b to DB
2020-09-22T18:14:42.050+0000 [bringup,98f70efb99536227,ae06] INFO [c.v.e.s.o.c.s.OrchestratorSubscriber,pool-2-thread-13] Ignoring unknown OrchestratorMessage {"executionId":"83fbbe85-4e44-45dc-85fe-bcec92d1388b"}
2020-09-22T18:14:42.059+0000 [bringup,98f70efb99536227,53af] INFO [c.v.e.s.t.s.e.util.TaskUtilImpl,pool-2-thread-9] Skipping updating task corresponding to execution with ID 83fbbe85-4e44-45dc-85fe-bcec92d1388b as it does not exist.
We have now found the execution task ID “83fbbe85-4e44-45dc-85fe-bcec92d1388b”.
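Instead of reading through the log by eye, the ID can also be filtered out with grep. This is a rough sketch: last_failed_execution_id is a hypothetical helper, and it assumes the failure is logged with the exact “End of Orchestration with FAILURE for Execution ID” wording seen in my log, which may differ in other Cloudbuilder versions.

```shell
# Hypothetical helper: print the execution ID of the most recent failed orchestration.
# Usage: last_failed_execution_id LOGFILE
last_failed_execution_id() {
  grep -oE 'End of Orchestration with FAILURE for Execution ID [0-9a-f-]{36}' "$1" |
    tail -n 1 |          # keep only the latest failure
    awk '{ print $NF }'  # the ID is the last word of the match
}

# Example call on the Cloudbuilder appliance:
# last_failed_execution_id /opt/vmware/bringup/logs/vcf-bringup-debug.log
```

If the deployment has failed more than once, only the ID of the latest failure is printed.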
Apply workaround
Let’s create the command that needs to be executed from the CLI in an SSH session on the Cloudbuilder appliance. The generic form looks like this, with EXECUTIONTASKID as a placeholder:
curl -k -u admin:'VMware1!' -X PATCH https://localhost/v1/sddcs/EXECUTIONTASKID -H "Content-Type: application/json" -d "@/opt/vmware/sddc-support/cloud_admin_tools/Resources/vcf-public-ems/vcf-public-ems.json"
With the execution task ID that we found in the log, the command becomes:
curl -k -u admin:'VMware1!' -X PATCH https://localhost/v1/sddcs/83fbbe85-4e44-45dc-85fe-bcec92d1388b -H "Content-Type: application/json" -d "@/opt/vmware/sddc-support/cloud_admin_tools/Resources/vcf-public-ems/vcf-public-ems.json"
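To avoid pasting a wrong ID into the URL, the command can be assembled by a small wrapper. The sketch below is only an illustration: print_retry_command is a hypothetical helper that checks that the ID has the usual 8-4-4-4-12 hexadecimal form and then prints the command instead of running it. Unlike the commands above, it passes only the user name to -u, in which case curl prompts for the password, so the password does not end up in the shell history.

```shell
# Hypothetical helper: validate the execution ID, then print (not run) the retry command.
# Usage: print_retry_command EXECUTION_ID JSON_FILE
print_retry_command() {
  execution_id=$1
  json_file=$2
  if ! printf '%s\n' "$execution_id" |
      grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    echo "not a valid execution ID: $execution_id" >&2
    return 1
  fi
  printf 'curl -k -u admin -X PATCH https://localhost/v1/sddcs/%s -H "Content-Type: application/json" -d "@%s"\n' \
    "$execution_id" "$json_file"
}

# Dry run with the execution ID from the log:
print_retry_command 83fbbe85-4e44-45dc-85fe-bcec92d1388b \
  /opt/vmware/sddc-support/cloud_admin_tools/Resources/vcf-public-ems/vcf-public-ems.json
```

Review the printed command and run it yourself once it looks right.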
Let’s execute the command and let’s see what happens!
Final Notes
This is not an official workaround released by VMware, but one that I found myself. If you have any questions related to this article, please do not hesitate to contact me via Twitter, LinkedIn or the contact form.