NSX-T 3.0 – Welcome to VirtualRove.COM

NSX-T: Replace faulty NSX Edge Transport Node VM

July 27, 2023 | Roshan Chavan

I recently came across a situation where an NSX-T Edge VM in an existing cluster had trouble loading its parameters. Routing was working fine and there was no outage as such. However, when the customer tried to select the edge VM and edit it in the NSX UI, it showed an error. VMware support said that the edge in question was faulty and needed to be replaced. Again, routing was working perfectly fine.

Let’s get started with replacing the faulty edge in the production environment.

Note: If the NSX Edge node to be replaced is not running, the new NSX Edge node can have the same management IP address and TEP IP address. If the NSX Edge node to be replaced is running, the new NSX Edge node must have a different management IP address and TEP IP address.

In my lab env, we will replace a running edge. Here is my existing NSX-T env…

Single NSX-T appliance,

All host transport nodes (TN) have been configured,

Single edge vm (edge 131) attached to edge cluster,

One test workload overlay network. Segment Web-001 (192.168.10.0/24)

A Tier-0 gateway,

Note that the interfaces are attached to existing edge vm.

BGP config,

Lastly, my VyOS router showing all NSX BGP routes,

Start continuous ping to NSX test overlay network,

Alright, that is my existing env for this demo.

We need one more thing before we start the new edge deployment. The new edge VM’s parameters must match the existing edge’s parameters for the replacement to work. However, the existing edge shows an error when we try to open its parameters in the NSX UI. The workaround is to make an API call for the existing edge VM and read its configuration from the response.

Please follow the below link to know more about API call.

NSX-T: Edge Transport Node API call
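As a rough illustration of that workaround, here is a minimal Python sketch that reads one transport node over the NSX-T REST API and reduces the response to the host switch settings the new edge has to match. It assumes the `GET /api/v1/transport-nodes/<node-id>` endpoint with basic authentication. The manager address, credentials, node ID and helper names are placeholders of mine, and the JSON field names should be checked against the output you actually get from your NSX-T version.

```python
import base64
import json
import ssl
import urllib.request


def build_request(manager, node_id, user, password):
    """Build (but do not send) a GET request for one transport node."""
    url = f"https://{manager}/api/v1/transport-nodes/{node_id}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def summarize_host_switches(node):
    """Keep only the settings worth copying to the replacement edge.

    The field names below are assumptions about the response layout.
    """
    switches = node.get("host_switch_spec", {}).get("host_switches", [])
    summary = []
    for sw in switches:
        summary.append({
            "name": sw.get("host_switch_name"),
            "pnics": [(p.get("device_name"), p.get("uplink_name"))
                      for p in sw.get("pnics", [])],
            "transport_zones": [tz.get("transport_zone_id")
                                for tz in sw.get("transport_zone_endpoints", [])],
            "profiles": [p.get("value")
                         for p in sw.get("host_switch_profile_ids", [])],
        })
    return summary


def fetch_node(manager, node_id, user, password, verify_tls=True):
    """Send the request and return the parsed JSON document."""
    context = ssl.create_default_context()
    if not verify_tls:  # only for labs with self-signed certificates
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_request(manager, node_id, user, password)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Calling `summarize_host_switches(fetch_node(...))` for the faulty edge gives a short list you can keep next to you while filling in the new edge's configuration.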

I have copied the output to following txt file,

EdgeApi.txt

Edge_API Download

Let’s configure the new edge that will replace the existing one. Here is the link to the blog post on deploying a standalone edge transport node.

NSX-T: Standalone Edge VM Transport Node deployment

New edge vm (edge132) is deployed and visible in NSX-T UI,

Note that the newly deployed edge (edge132) does not have a TEP IP or an edge cluster associated with it. As I mentioned earlier, the new edge VM’s parameters must match the existing edge’s parameters for the replacement to work.

Use the information collected from the API call for the faulty edge VM and configure the new edge VM to match it. Here is what my new edge VM configuration looks like,

Make sure that the networks match the networks of the existing faulty edge.

You should see the TEP IPs once you configure the new edge.

Click on each edge node and verify the information. All parameters should match.

Edge131

Edge132
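If you prefer not to compare the two edges by eye, the check can be sketched in a few lines of Python. This is only an illustration: the dictionaries hold hypothetical values, and the ignored keys reflect the note above that a running edge must be replaced by one with a different management IP and TEP IP.

```python
def diff_settings(old, new, ignore=("display_name", "management_ip", "tep_ips")):
    """Return {setting: (old_value, new_value)} for every setting that differs.

    Settings listed in `ignore` are expected to differ and are skipped.
    """
    keys = (set(old) | set(new)) - set(ignore)
    return {k: (old.get(k), new.get(k))
            for k in sorted(keys)
            if old.get(k) != new.get(k)}


# Hypothetical values for illustration only.
edge131 = {
    "display_name": "edge131",
    "management_ip": "192.0.2.131",
    "host_switch_name": "nvds-1",
    "transport_zones": ["tz-overlay", "tz-vlan"],
    "uplink_profile": "edge-uplink-profile",
}
edge132 = {
    "display_name": "edge132",
    "management_ip": "192.0.2.132",
    "host_switch_name": "nvds-1",
    "transport_zones": ["tz-overlay", "tz-vlan"],
    "uplink_profile": "edge-uplink-profile",
}

# An empty result means every setting that matters is identical.
print(diff_settings(edge131, edge132))
```

Any key that shows up in the result is something to fix on the new edge before starting the replacement.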

We are all set to replace the faulty edge now.

Select the faulty edge (edge131) and click on actions,

Select “Enter NSX Maintenance Mode”

You should see Configuration State as “NSX Maintenance Mode” in the UI.

And, because this lab has only a single edge in the cluster, you will lose connectivity to your NSX workload.

No BGP route on the TOR

Next, click on “Edge Clusters”, select the edge cluster, and click “Action”.

Choose “Replace Edge Cluster Member”

Select the appropriate edge VMs in the wizard and Save,

As soon as the faulty edge has been replaced, connectivity to the workload should come back.

BGP route is back on the TOR.
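To put a number on the interruption, the continuous ping started earlier can be turned into outage windows. The helper below is a small sketch of my own: it expects a time-ordered list of `(timestamp, reply_received)` pairs, however you collect them, and the sample data is made up.

```python
def outage_windows(samples):
    """Return (first_lost, first_restored) pairs from ping samples.

    `samples` is an iterable of (timestamp_seconds, reply_received) tuples
    in time order. If the capture ends while replies are still missing,
    the last pair has None as its end.
    """
    windows = []
    down_since = None
    for timestamp, ok in samples:
        if not ok and down_since is None:
            down_since = timestamp            # first missed reply
        elif ok and down_since is not None:
            windows.append((down_since, timestamp))
            down_since = None                 # replies are back
    if down_since is not None:
        windows.append((down_since, None))
    return windows


# Made-up samples: replies stop at t=2 and return at t=4.
samples = [(0, True), (1, True), (2, False), (3, False), (4, True)]
for start, end in outage_windows(samples):
    print(f"no replies from t={start} until t={end}")
```

In a single-edge setup like this lab, the window should roughly cover the time between entering maintenance mode and saving the replacement.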

Interface configuration on the Tier-0 shows new edge node.

Node status for faulty edge shows down,

Let’s log in to the newly added edge VM and run the “get logical-router” command,

All service routers and distributed routers have been moved to the new edge.

Enter the SR and check its routes to make sure that it shows all connected routes too,

We are good to delete the old edge vm.

Let’s go back to the edge transport nodes, select the faulty edge, and click “DELETE”

“Delete in progress”

And it’s gone.

It should disappear from vCenter too,

Well, that was fun.

That’s all I had to share from my recent experience. There might be several other reasons to replace or delete existing edge VMs. This process should apply to all those use cases. Thank you for visiting. See you in the next post soon.

Are you looking out for a lab to practice VMware products…? If yes, then click here to know more about our Lab-as-a-Service (LaaS).

NSX-T 3.1 – Backup & Restore_Production DR Experience – Part1

January 25, 2022 | Roshan Chavan

Hello Techies, this post focuses on an NSX-T disaster recovery of a production env that I recently performed for one of my customers. It describes my own experience, and the procedure may differ depending on your NSX-T design.

Here is the official VMware documentation that I referred to during the activity.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/administration/GUID-A0B3667C-FB7D-413F-816D-019BFAD81AC5.html

Additionally, the following document is a MUST read before you plan your DR.

https://communities.vmware.com/t5/VMware-NSX-Documents/NSX-T-Multisite/ta-p/2771370

To get screenshots for this post, I recreated the env in my lab. All captures in this post are from the lab that I created for testing purposes.

To set the right expectations, this DR was performed to back up and restore the Management Plane of NSX-T and not the Data Plane. Let me explain the existing env so that the reason for doing a Management Plane recovery only is clear.

NSX-T Multisite Env
Both sites are active and configured with respective BGP routing to local Top of the Rack (TOR) switches.
Primary Site hosts the NSX-T Manager cluster
Backup of the NSX-T Manager is configured on an SFTP server that sits at the DR site.
Both sites have a vCenter, Edge VMs and ESXi nodes.
Inter-Site link has jumbo frames enabled.
Both sites host active workloads. Load Balancer, VPN and micro-segmentation are also in place.
A 3rd-party solution is already configured to migrate / restart the VMs on the DR site in case of disaster.

Since both sites are independent and have separate Edge VMs and routing in place, only the Management Plane needs to be restored. The 3rd-party backup solution will restore the VMs on the DR site in case of disaster.

Important Note: The Data Plane (i.e. host transport nodes, edge transport nodes…) is not affected even if you lose the NSX-T Manager cluster for any reason. Routing and connectivity to all workload VMs keep working perfectly fine. In short, during the loss of the Management Plane, the Data Plane keeps running as long as you do not add any new workload. Also, keep in mind that a vMotion of any VM will end up in that VM losing connectivity if it is connected to an NSX-T Overlay network. So it would be a good idea to disable DRS until you bring the NSX-T Manager cluster back up on the DR site.

The other disadvantage is you cannot make any configuration changes in NSX-T since the UI itself is not available.

Here are some additional bullet points…

You must restore to new appliances running the same version of NSX-T Data Center as the appliances that were backed up.
If you are using an NSX Manager or Global Manager IP address to restore, you must use the same IP address as in the backup.
If you are using an NSX Manager or Global Manager FQDN to restore, you must use the same FQDN as in the backup. Note that only lowercase FQDN is supported for backup and restore.
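These three conditions are easy to turn into a pre-restore checklist. The Python sketch below is my own illustration, not a VMware tool: the `ManagerIdentity` class, the function names and the sample values are hypothetical, and it only compares values you type in yourself.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagerIdentity:
    version: str   # NSX-T version of the appliance
    address: str   # IP address or FQDN the manager is published under


def _is_ip(value):
    """True if `value` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def restore_blockers(backup, target):
    """List the reasons a restore onto `target` would violate the rules above."""
    problems = []
    if backup.version != target.version:
        problems.append(f"version mismatch: {backup.version} vs {target.version}")
    if backup.address != target.address:
        problems.append(f"address mismatch: {backup.address} vs {target.address}")
    if not _is_ip(backup.address) and backup.address != backup.address.lower():
        problems.append("FQDN in the backup is not lowercase")
    return problems


# Hypothetical values for illustration only.
backup = ManagerIdentity(version="3.1.0", address="nsx01.lab.local")
target = ManagerIdentity(version="3.1.0", address="nsx01.lab.local")
print(restore_blockers(backup, target) or "no blockers found")
```

An empty list only means these three prerequisites hold. It says nothing about the backup file itself or about the additional FQDN steps.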

In most cases, an FQDN is configured in the env, which involves additional steps while restoring the backup. We will discuss that in detail later. For now, let’s focus on configuring the backup.

Check my following post for configuring the backup for an NSX-T env.

NSX-T Backup Configuration on VMware Photon OS

To begin this post, let’s have a look at the existing env architecture…

List of servers in the env with IP’s.

Here is the screen capture from the env…

Site A vCenter – Dubai

Site B vCenter – Singapore

As I said earlier, we are going to perform a Management Plane recovery and not a Data Plane recovery, hence I did not configure an edge, Tier-0, etc. on Site B. However, the customer env had another edge cluster and Tier-0 for Site B (as shown in the above diagram).

Stable NSX-T manager cluster, VIP assigned to 172.16.31.78

Dubai vCenter host transport nodes

Singapore vCenter host transport nodes

Just a single Edge Transport node deployed at primary site.

BGP Neighbors Configuration…

Note the source addresses. We should see them on TOR as neighbors.

Let’s have a look at the TOR…

Established 172.27.11.2 & 172.27.12.2 neighbors.

BGP routes on the TOR.

Let’s create a new segment to see if the new route appears on the TOR.

We should see 10.2.98.X BGP route on the TOR.
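When the TOR carries many routes, the same check can be scripted. The sketch below is a small Python helper of my own that filters a list of prefixes by their leading octets. The prefix list is made-up sample data, and how you export the routes from your TOR is up to you.

```python
import ipaddress


def routes_matching(advertised, leading_octets):
    """Return the IPv4 prefixes whose network address starts with the given octets."""
    lead = bytes(leading_octets)
    hits = []
    for prefix in advertised:
        network = ipaddress.ip_network(prefix, strict=False)
        if network.version == 4 and network.network_address.packed[:len(lead)] == lead:
            hits.append(str(network))
    return hits


# Made-up prefixes standing in for the BGP table on the TOR.
tor_routes = ["172.16.80.0/24", "10.2.98.0/24", "192.168.10.0/24"]
print(routes_matching(tor_routes, (10, 2, 98)))
```

An empty list after creating the segment would point at route advertisement or redistribution on the Tier-0 or Tier-1 rather than at the segment itself.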

Perfect. We have everything in place to perform the DR test and check the connectivity once we bring the NSX-T manager cluster UP in the DR site.

That’s it for this post. We will discuss further process in the next part of this blog series.


NSX-T 3.0 – Reverse Migration of VMkernel to Port Group

May 4, 2021 | October 6, 2021 | Roshan Chavan

In this post, we will talk about the reverse migration of a VMkernel adapter from NSX-T back to a vCenter port group. If you did not get a chance to look at my previous article on migrating a vmk from vCenter to NSX-T, here is the link.

NSX-T 3.0 – VMkernel Migration to an N-VDS Switch

There can be multiple reasons for removing VMkernel ports from NSX-T. Here are some…
The third-party application which had its vmk in a vCenter PG does not behave as expected,
The application itself (which uses the vmk) is no longer needed,
You want to uninstall NSX-T from one of the hosts for any reason. In that case, you first have to move the vmks off it, or the appropriate “Network Mappings for Uninstall” have to be in place before you move on.

Note: Uninstalling NSX-T Data Center from an ESXi host is disruptive if the physical interfaces or VMkernel interfaces are connected to N-VDS.

Here is one more important scenario mentioned in the VMware docs (copied from the VMware site),

Transport node configuration on a node cannot be overridden if underlying segments or VMs are connected to that transport node. For example, consider a two ESXi host cluster, where host-1 is configured as transport-node-1, but host-2 is unprepared. Segments and VMs are connected to transport-node-1. After preparing host-1 as a transport node (associated to transport-zone-1), if you apply a transport node profile to that cluster (associated to transport-zone-2), then NSX-T does not override the transport node configuration with the transport node profile configuration. To successfully override configuration on host-1, power off the VMs and disconnect the segment before applying the transport node profile to associate host-1 to transport-zone-2 and disassociate it from transport-zone-1.

With that, let’s get started.

In the previous post, I explained the process of migrating a vmk from vCenter to NSX-T. Now let’s revert it.
Verify the vmk location. It is on the NSX-T logical switch “VLAN-1650” and the switch name is ‘data-nvds’.

Back in NSX-T, go to System> Nodes> Select the appropriate node> Action> ‘Migrate ESX VMkernel and Physical Adapters’

Select ‘Migrate to port group’ in this wizard.

Direction: Migrate to Port Groups
N-VDS: Select the target switch from where you want to remove vmk port.
Select the VMkernel Adapter (vmk3) and manually type the port group name ‘vDS-Test-1650’

Map the appropriate physical nics and uplinks in this wizard.
Note: Mapping physical NICs here does not remove the mentioned NICs from the N-VDS.

Save.

Verify that the vmk3 is back to the VDS.

Test the connectivity to the vmk from the ESXi host. And we are done with the reverse migration of the VMkernel port to the port group. That’s it for this post. I will come back soon with new content for my next blog.

Cheers..!!!


NSX-T 3.0 – VMkernel Migration to an N-VDS Switch

March 18, 2021 | October 6, 2021 | Roshan Chavan

Most customers are moving to NSX-T environments. One common question is what happens to the existing VMkernel adapters, or how the migration of a VMkernel adapter works in NSX-T. One of my recent customers had a similar use case: a backup application running in the VMware vSphere environment had 2 vmks, and the plan was to migrate all networks into NSX-T (Overlay or VLAN). There are ‘n’ number of things to consider before planning such migrations. First, we got email confirmation from the application vendor on the application’s compatibility with NSX-T. It was also important to get confirmation from the vendor that the backup application would still behave as expected and would be able to back up VMs connected to Overlay segments.

Note: Some 3rd-party applications do not support or understand Opaque networks. (For vCenter, all networks that have been created in NSX are Opaque networks.)

In my case, the customer had to upgrade the backup application to the vendor-suggested version to make it compatible with NSX-T and to be able to back up VMs connected to NSX-T (Overlay & VLAN) networks.

Some additional points…
Please keep in mind that we are talking about 3rd-party application VMkernel adapters and NOT vMotion, management or vSAN vmks. The migration process will not always be the way it is described in this article. It depends completely on the customer’s env, on the point at which you are planning this migration, and on which vmks are involved. A shared Compute, Edge & Management cluster with only 2 pNICs that is not on vSphere 7.0 will need proper planning and a migration methodology. A greenfield env gives you the flexibility to migrate VMkernel adapters using network mappings while configuring host transport nodes, whereas a brownfield env will give you a headache. So plan and prepare wisely before you propose your plan to the customer.

Following is my lab setup for this post.
NSX-T 3.0 installed and configured.
Four hosts cluster prepared and configured for NSX-T. It is a shared cluster for all components.
Physical Adaptors – vmnic0, vmnic1 connected to vDS on vCenter. And vmnic2, vmnic3 connected to nvds in nsx-t.
BGP routing is in place.
Edge VM’s uplinks have been configured and connected to logical segments.
Port group name ‘vDS-Test-1650’ with vlan id 1650 is in place. This port group has VMkernel Adaptor 3 (vmk3) and it has been configured on all hosts in the cluster.
‘Test-10’ VM connected to ‘vDS-Test-1650’ for testing connectivity.

Here is the plan.
Create vlan based logical segment in nsx-t for 1650 network.(VLAN-1650 LS)
Move ‘Test-10’ VM from ‘vDS-Test-1650’ port group to ‘VLAN-1650 LS’ logical segment.
Migrate vmkernel adaptor 3 (vmk3) from port group to logical segment.
Test connectivity from test vm to vmk ip after migration.
Revert the configuration.

With that lets get started…

‘vDS-Test-1650’ port group on distributed switch.

‘Test-10’ VM connected to ‘vDS-Test-1650’

Verify the connectivity to ‘172.16.31.110’ (DC in my env) from Test-10 VM.

ESXi01 has vmk3 created with network label as vDS-Test-1650 port group.

Similar configuration on other hosts.

Time to create vlan based logical segment in nsx-t.
Log into NSX-T VIP> Networking> Segments> Add Segment
Name: VLAN-1650
TZ: Shared VLAN TZ
VLAN: 1650
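The same segment can also be described as an API payload instead of being clicked together in the UI. The sketch below only builds the JSON body. It assumes the NSX-T Policy API, where, to my knowledge, a segment is created or updated with `PATCH /policy/api/v1/infra/segments/<segment-id>`. The function name is mine and the transport zone path is a placeholder, so verify both the endpoint and the fields against the API guide for your version.

```python
import json


def vlan_segment_payload(name, vlan_id, transport_zone_path):
    """Build the request body for a VLAN-backed segment (nothing is sent)."""
    if not 0 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    return {
        "display_name": name,
        "vlan_ids": [str(vlan_id)],            # VLAN IDs are passed as strings
        "transport_zone_path": transport_zone_path,
    }


# The transport zone path is a placeholder; use the path of 'Shared VLAN TZ'.
payload = vlan_segment_payload(
    name="VLAN-1650",
    vlan_id=1650,
    transport_zone_path="/infra/sites/default/enforcement-points/default/transport-zones/<tz-id>",
)
print(json.dumps(payload, indent=2))
```

Keeping the payload in a file next to your runbook makes it easy to recreate the segment on another manager later.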

VLAN based logical segment is ready to move the VM’s into it.
Test-10 VM> Edit Settings> Change the network to the newly created logical segment.

Test-10 VM now sits on VLAN based logical segment in NSX-T. Test the connectivity to DC again.

Let’s move vmkernel from vCenter PG to NSX-T LS.
System> Fabric> Nodes> Host TN> Select the first ESXi host and click on Action> ‘Migrate ESX VMkernel and Physical Adapters’

Select appropriate N-VDS to migrate to
Select the VMkernel Adaptor that you plan to migrate in to NSX-T.

And then the destination Logical switch that we created earlier.

Next > Select physical adaptors in N-VDS
Note: These are the vmnics that have already been assigned to the N-VDS, not new ones.

Select physical nics and appropriate uplinks and SAVE.

You get a warning at this stage. Continue.

Once it is successful, verify it on the vCenter.
Notice that the vmk3 is sitting on the “data-nvds” instead of “DATA-VDS”

Testing connectivity to vmkernel (172.16.50.101) adaptor from the VM.

All good. We have successfully migrated the VMkernel adapter (vmk3) to NSX-T. There may be situations where you want to revert the configuration because the results after the vmk migration are not what you expected. I will cover the reverse migration in my next blog.

I hope that the blog has valuable information. See you all in next post.


NSX-T 3.0 – Load Balancer Concept & Configuration

November 3, 2020 | October 6, 2021 | Roshan Chavan

It’s been a while since I wrote my last blog on NSX-T. Recently, I had several discussions with one of my customers about setting up an NSX-T logical load balancer, so I wanted to write a short blog with a generic example. It will give you a basic understanding of the NSX-T load balancer and how it is set up.

Let’s cover some theory first.

The NSX-T Data Center logical load balancer offers high-availability service for applications and distributes the network traffic load among multiple servers. The load balancer distributes incoming service requests evenly among multiple servers. You can map a virtual IP address to a set of pool servers for load balancing. The load balancer accepts TCP, UDP, HTTP, or HTTPS requests on the virtual IP address and decides which pool server to use.

Some key points to keep in mind before we proceed.

Logical load balancer is supported only on the tier-1 gateway.
One load balancer can be attached only to a tier-1 gateway.
A load balancer includes virtual servers, server pools, and health check monitors. It can host a single virtual server or multiple virtual servers.
NSX-T LB supports Layer 4 (TCP,UDP) as well as Layer 7 (HTTP,HTTPS).
Using a small NSX Edge node to run a small load balancer is not recommended in a production environment.
The VIP (Virtual IP) for the server pool can be placed in any subnet.

Load balancers can be deployed in either inline or one-arm mode.

Inline Topology

In the inline mode, the load balancer is in the traffic path between the client and the server. Clients and servers must not be connected to the same tier-1 logical router. LB-SNAT is not required in this case.

One-Arm Topology

In one-arm mode, the load balancer is not in the traffic path between the client and the server. In this mode, the client and the server can be anywhere. LB-SNAT is always required in this case.

Health check monitors are another area worth discussing. They test whether each server is correctly running the application: you can add health check monitors that check the health status of a server.
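To make the idea of an active monitor concrete, here is a conceptual Python sketch of a TCP health check: a member counts as up if a TCP connection to its port can be opened within a timeout. This is not how NSX-T implements its monitors; the function names are mine, and the member addresses are the two lab web servers used later in this post.

```python
import socket


def tcp_health_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pool_status(members, port, timeout=2.0):
    """Map each pool member to 'UP' or 'DOWN' based on the TCP check."""
    return {member: "UP" if tcp_health_check(member, port, timeout) else "DOWN"
            for member in members}


if __name__ == "__main__":
    # Lab web servers from this post; adjust to your own pool members.
    print(pool_status(["172.16.80.10", "172.16.80.11"], 80))
```

A real monitor would repeat the check at an interval and only mark a member down after several consecutive failures, which is the kind of tuning to plan for in production.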

Let’s get started with setting up the simple example of NSX-T Logical Load Balancer.

Here is the background of the lab. I have an NSX-T environment already running in the lab. For demo purposes, I have already done the following configuration.

New NSX-T logical segment called ‘LB_1680’ (Subnet: 172.16.80.253/24)
Installed and configured 2 test web servers. (OS: CentOS 7 with the web server role and a sample HTML file added)
Connected the 2 new web servers to the LB_1680 segment.

Verify that you can access the web servers and that the web page is displayed.

1st Web Server. (172.16.80.10)

2nd Web Server. (172.16.80.11)

That was all background work. Let’s start configuring the NSX-T logical load balancer.

We have to configure the Server Pool first and then move on to next configuration.

Name: WebServerPool
Algorithm: Round Robin (To distribute the load in pool members)
SNAT Translation Mode: Automap (leave it to default)

Next, click on Select Members> Add Members and enter the information for the 1st web server.

Follow the same procedure again for the 2nd web server.

Click on Apply and Save.

Make sure that the status is Success.
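The Round Robin algorithm selected for the pool simply hands out the members in turn. The short Python sketch below shows the idea with the two lab web servers. It is a conceptual illustration only and does not model NSX-T internals such as health state or connection persistence.

```python
from collections import Counter
from itertools import cycle


def round_robin(members, request_count):
    """Return the member chosen for each of `request_count` requests, in order."""
    picker = cycle(members)
    return [next(picker) for _ in range(request_count)]


pool = ["172.16.80.10", "172.16.80.11"]
choices = round_robin(pool, 4)
print(choices)           # members alternate, starting with the first one
print(Counter(choices))  # each member receives the same share of requests
```

With an even number of requests each member gets exactly half of them, which is why repeated refreshes of the VIP should alternate between the two web pages.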

Next, Click on Virtual Server and ADD L7 HTTP

Name: WebVirtualServer

IP: 192.168.10.15 (This IP can be in any subnet. We will use this IP address to access the web servers.)
Port: 80
Server Pool: WebServerPool (Select the pool that you created in earlier step)

Save & Make sure that the status is Success.

Let’s move to Load Balancer tab and click on Add Load Balancer.

Name: Web-LB
Size: Small (note the sizing information at this point)
Attachment: Select your existing Tier-1 gateway.

Click on Save and then click on NO to complete the configuration.

Now, we have to attach this Load Balancer to Virtual Server that we created in earlier step.

Go back to ‘Virtual Servers’ and click on Edit.

Under the LB, select the LB that we just created and Save.

Make sure that the status is Success for LB, Virtual Server & Server Pools.

That’s it. We are done with the configuration of the NSX-T load balancer. It’s time to test it.

Try to access the VIP (192.168.10.15). This IP should load the web page from either the Web-1 or the Web-2 server.

The VIP is hitting my 2nd web server. Try refreshing the page.

A couple of refreshes will route the traffic to the other web server. You might have to try a different browser or press Ctrl+F5 to refresh the page.

Hurray…!! We have just configured NSX-T LB.

This is how my network topology looks. Web-LB is configured at tier-1 gateway.

Remember, there is much more to this when it comes to a customer production environment. We must take several other things into consideration (health monitors, SNAT, LB rules etc…), and it is not as easy as it sounds. This blog was written to give you a basic understanding of the NSX-T LB.

I hope that the blog has valuable information. See you all in next post.
