NSX-T 3.1 – Backup & Restore_Production DR Experience – Part2

In the previous post, we validated the base NSX-T env that is set up to perform Disaster Recovery of NSX-T. Here is the link to the previous post.

NSX-T 3.1 – Backup & Restore_Production DR Experience – Part1

There are a couple of validation steps from a DR point of view. Let’s start with the DR validation phase. I have the steps as bullet points in an Excel sheet and will discuss them one by one with screen captures.

We need to make sure that ‘publish_fqdns’ is set to ‘true’ on all NSX-T managers using an API call.

Here is the GET API call, followed by a PUT, for all 3 NSX-T managers…

https://172.16.31.78/api/v1/configs/management
https://172.16.31.79/api/v1/configs/management
https://172.16.31.85/api/v1/configs/management

Before we go there, let’s have a look at the backup details.

Note that the ‘Appliance FQDN or IP’ field lists the IP address of the appliance.

Let’s go back to the API call and run it. I am using the Postman utility for the API calls.

GET https://172.16.31.78/api/v1/configs/management

Paste the above request, change the Authorization type to ‘Basic Auth’, enter the credentials and send. You might get an SSL error if SSL verification is not disabled already.

Click on Settings and disable SSL verification here…

and send the call again.

Note that the ‘publish_fqdns’ value is false. We need to change this to ‘true’.

Change GET to PUT, copy the 4 lines of the response, click on ‘Body’, change the radio button to ‘raw’ and select JSON from the drop-down box at the end.

Paste the copied 4 lines, change the value to ‘true’ and send it again.

Make sure the status is 200 OK.

Use the GET command again to verify…

Make sure that you see this value set to ‘true’ on all 3 NSX-T managers.
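If you prefer a script over Postman, here is a minimal Python sketch of the same GET, edit, PUT, verify loop. The manager IPs and the API path are from this post. The function names are my own, and I am assuming the GET response contains only ‘publish_fqdns’ and ‘_revision’, which is why the whole response is sent back with one value changed. Treat it as a lab sketch, not a tested production tool.

```python
import base64
import json
import ssl
import urllib.request
from typing import Optional

# Manager IPs are the ones used in this post; everything else is illustrative.
MANAGERS = ["172.16.31.78", "172.16.31.79", "172.16.31.85"]
PATH = "/api/v1/configs/management"


def build_put_body(current: dict) -> dict:
    """Copy the GET response and switch publish_fqdns to True.

    All other keys (assumed to include _revision) are sent back unchanged.
    """
    body = dict(current)
    body["publish_fqdns"] = True
    return body


def call_manager(host: str, user: str, password: str,
                 body: Optional[dict] = None) -> dict:
    """Send GET when body is None, otherwise PUT, using Basic Auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{host}{PATH}",
        data=None if body is None else json.dumps(body).encode(),
        method="GET" if body is None else "PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
    # Lab only: skip certificate checks, like disabling SSL verification in Postman.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def enable_publish_fqdns(user: str, password: str) -> None:
    """Run GET, PUT if needed, then GET again on every manager."""
    for host in MANAGERS:
        current = call_manager(host, user, password)
        if current.get("publish_fqdns") is not True:
            call_manager(host, user, password, build_put_body(current))
        verified = call_manager(host, user, password)
        print(host, "publish_fqdns =", verified.get("publish_fqdns"))
```

Nothing is sent until you call enable_publish_fqdns() with your own credentials. urllib raises an HTTPError for any non-2xx response, so a failed PUT will not go unnoticed.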

Let’s go back to NSX-T and run the backup again. Note that the ‘Appliance FQDN’ now lists the FQDN instead of an IP address.

All good till here. Let’s move to the next step in the Excel sheet.

The next step is to verify that all transport nodes in the env are using the FQDN instead of an IP address.

Run the ‘get controller’ command on each edge node. Note that the “Controller FQDN” lists the actual FQDN instead of an IP.
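When there are many transport nodes, checking each value by eye gets tedious. Here is a small Python sketch that flags any node whose reported controller value is still an IP address. It assumes you have already collected the “Controller FQDN” values yourself (the node names and FQDNs in the usage note are made up); it does not log in to the nodes or parse the CLI output.

```python
import ipaddress


def is_ip_literal(value: str) -> bool:
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


def nodes_still_on_ip(controller_values: dict) -> list:
    """Return the sorted node names whose controller value is an IP, not an FQDN."""
    return sorted(
        node for node, value in controller_values.items()
        if is_ip_literal(value)
    )
```

For example, nodes_still_on_ip({"edge01": "nsx01.example.local", "esx01": "172.16.31.79"}) returns only "esx01", which is the node you would need to look at again.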

Next,

All good here. Move to next action item…

In the DNS record properties, change the TTL value of all NSX-T manager records from 1 hr to 5 mins.

We have completed all validation steps for the DR test. We will move to the actual DR test in the next part of this blog series. Thank you.


NSX-T 3.1 – Backup & Restore_Production DR Experience – Part1

Hello Techies, this post focuses on the NSX-T Disaster Recovery of a production env that I recently did for one of my customers. The post describes my own experience, and the procedure may differ depending on your NSX-T design.

Here is the official VMware documentation that I referred to during the activity.

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/administration/GUID-A0B3667C-FB7D-413F-816D-019BFAD81AC5.html

Additionally, the following document is a MUST read before you plan your DR.

https://communities.vmware.com/t5/VMware-NSX-Documents/NSX-T-Multisite/ta-p/2771370

To put the screenshots in this post, I have recreated the env in my lab. All captures in this post are from the lab that I created for testing purposes.

To set the right expectations, this DR was performed to back up and restore the Management Plane of NSX-T and not the Data Plane. Let me explain the existing env so that the reason for doing a Management Plane recovery only is clear.

  • NSX-T Multisite Env
  • Both sites are active and configured with respective BGP routing to local Top of the Rack (TOR) switches.
  • Primary site hosts the NSX-T Manager cluster.
  • Backup of the NSX-T manager is configured on an SFTP server that sits at the DR site.
  • Both sites have a vCenter, Edge VMs and ESXi nodes.
  • Inter-site link has jumbo frames enabled.
  • Both sites host active workloads. Load Balancer, VPN and micro-segmentation are also in place.
  • A 3rd party solution is already configured to migrate / restart the VMs on the DR site in case of disaster.

Since both sites are independent and have separate Edge VMs and routing in place, only the Management Plane needs to be restored. The 3rd party backup solution will restore the VMs on the DR site in case of disaster.

Important Note: The Data Plane (i.e. host transport nodes, edge transport nodes…) is not affected even if you lose the NSX-T manager cluster for any reason. Routing and connectivity to all workload VMs keep working perfectly fine. In short, during the loss of the Management Plane, the Data Plane keeps running as long as you do not add any new workload. Also, keep in mind that a vMotion of any VM will end up in that VM losing connectivity if it is connected to an NSX-T Overlay Network. So, it would be a good idea to disable DRS until you bring the NSX-T manager cluster back up on the DR site.

The other disadvantage is that you cannot make any configuration changes in NSX-T, since the UI itself is not available.

Here are some additional bullet points…

  • You must restore to new appliances running the same version of NSX-T Data Center as the appliances that were backed up.
  • If you are using an NSX Manager or Global Manager IP address to restore, you must use the same IP address as in the backup.
  • If you are using an NSX Manager or Global Manager FQDN to restore, you must use the same FQDN as in the backup. Note that only lowercase FQDN is supported for backup and restore.
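To make these three rules concrete, here is a small Python sketch of a pre-restore check. The Appliance class, the function name and the example values are my own inventions for illustration. It only compares values you type in yourself and assumes the address is either an IPv4 address or an FQDN; it does not talk to NSX-T.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    version: str  # NSX-T version of the appliance, e.g. as shown in the UI
    address: str  # IPv4 address or FQDN used for backup / restore


def restore_problems(backup: Appliance, target: Appliance) -> list:
    """Return a list of rule violations; an empty list means the checks pass."""
    problems = []
    if backup.version != target.version:
        problems.append("version mismatch")
    if backup.address != target.address:
        problems.append("address mismatch")
    # Letters in the address mean it is an FQDN, which must be all lowercase.
    has_letters = any(ch.isalpha() for ch in target.address)
    if has_letters and target.address != target.address.lower():
        problems.append("FQDN must be lowercase")
    return problems
```

A new appliance on the same version with the same lowercase FQDN returns an empty list. A different version or an FQDN typed with capital letters shows up as a named problem before you start the restore.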

In most cases, FQDN is configured in the env, which involves additional steps while restoring the backup. We will discuss this in more detail later. For now, let’s focus on configuring the backup.

Check the following post for configuring the backup for the NSX-T env.

NSX-T Backup Configuration on VMware Photon OS

To begin this post, let’s have a look at the existing env architecture…

List of servers in the env with their IPs.

Here is the screen capture from the env…

Site A vCenter – Dubai

Site B vCenter – Singapore

As I said earlier, we are going to perform a Management Plane recovery and not a Data Plane recovery, hence I did not configure an edge, Tier-0 etc. on Site-B. However, the customer env had another edge cluster for Site B, and a Tier-0 as well (as shown in the above diagram).

Stable NSX-T manager cluster, VIP assigned to 172.16.31.78

Dubai vCenter host transport nodes

Singapore vCenter host transport nodes

Just a single Edge Transport node deployed at primary site.

BGP Neighbors Configuration…

Note the source addresses. We should see them on TOR as neighbors.

Let’s have a look at the TOR…

Established 172.27.11.2 & 172.27.12.2 neighbors.

BGP routes on the TOR.

Let’s create a new segment to see if the new route appears on the TOR.

We should see 10.2.98.X BGP route on the TOR.

Perfect. We have everything in place to perform the DR test and check the connectivity once we bring the NSX-T manager cluster UP in the DR site.

That’s it for this post. We will discuss the rest of the process in the next part of this blog series.


NSX-T Backup Configuration on VMware Photon OS

I want to keep this one short, since it is part of parent topic here…

“NSX-T 3.1 – Backup & Restore_Production DR Experience – Part1”

For NSX-T 3.1, the following are the supported operating systems as per the VMware documentation; however, it also says that other software versions might work. SFTP is the only supported protocol for now.

I remember having a discussion with someone who was using VMware Photon OS for NSX-T backup. It is Linux based, lightweight and does not consume many resources. It is available for download at the following location…

https://github.com/vmware/photon/wiki/Downloading-Photon-OS

Get the Minimal ISO…

Installation is straightforward. Just mount the ISO on a VM and follow the instructions to install it. Then we run a couple of commands to set up VMware Photon OS.

Here is the screen capture of the commands that were run to set up the SFTP server.

Add the sftp user…
root@VirtualRove [ ~ ]# useradd siteA

Create backup directory…
root@VirtualRove [ ~ ]# mkdir /home/nsxbkp-siteA/

Add a group…
root@VirtualRove [ ~ ]# groupadd bkpadmin

Add the user to the group…
root@VirtualRove [ ~ ]# groupmems -g bkpadmin -a siteA

Set the password for the user…
root@VirtualRove [ ~ ]# passwd siteA

New password:
Retype new password:

passwd: password updated successfully

The chown command changes the user and group ownership of a file, directory, or link in Linux.
chown USER:GROUP DIRECTORY_PATH
root@VirtualRove [ ~ ]# chown siteA:bkpadmin /home/nsxbkp-siteA/

And that completes the configuration on Photon OS. We are good to configure this directory as the backup directory in NSX-T.
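If you need the same setup for more than one site, a small Python sketch like the one below can print the command sequence for a given user, group and directory. The function name is my own. It only builds the command strings shown above, it does not run them, and it leaves out ‘passwd’ because that step is interactive.

```python
import shlex


def sftp_setup_commands(user: str, group: str, directory: str) -> list:
    """Build the Photon OS commands from this post for one backup user.

    The commands are returned as strings, in the order they were run above.
    """
    q = shlex.quote  # protects against spaces or odd characters in the values
    return [
        f"useradd {q(user)}",
        f"mkdir {q(directory)}",
        f"groupadd {q(group)}",
        f"groupmems -g {q(group)} -a {q(user)}",
        f"chown {q(user)}:{q(group)} {q(directory)}",
    ]


if __name__ == "__main__":
    for command in sftp_setup_commands("siteA", "bkpadmin", "/home/nsxbkp-siteA/"):
        print(command)
```

Review the printed commands, run them as root on the Photon OS VM, and then set the password with ‘passwd’ as shown earlier. Keep in mind that ‘groupadd’ returns an error if the group already exists, so skip that line for a second site.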

A couple of things…
Photon OS does not respond to ICMP ping by default. You must run the following commands on the console to enable ping.
iptables -A OUTPUT -p icmp -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT

Also, the root account is not permitted to log in over SSH by default. You need to edit the ‘sshd_config’ file located at ‘/etc/ssh/sshd_config’.
You can use any editor to edit this file…

vim /etc/ssh/sshd_config

Scroll to the end of the file and change the following value to enable SSH for the root account…

Change the value from ‘no’ to ‘yes’ and save the file. You should now be able to SSH to Photon OS.
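The setting itself is only visible in the screen capture, so the sketch below is based on my assumption that the line in question is the standard OpenSSH ‘PermitRootLogin’ directive. It is a minimal Python helper (the function name is mine) that takes the text of sshd_config and returns it with that directive set to ‘yes’. It works on a string only and does not touch the file.

```python
import re

# Matches an active (not commented out) PermitRootLogin line.
_ACTIVE_LINE = re.compile(r"^[ \t]*PermitRootLogin[ \t]+\S+[ \t]*$", re.MULTILINE)


def permit_root_login(config_text: str) -> str:
    """Return the config text with PermitRootLogin set to yes.

    Active PermitRootLogin lines are rewritten. If there is none
    (for example it is only present as a comment), a new line is appended.
    """
    updated, count = _ACTIVE_LINE.subn("PermitRootLogin yes", config_text)
    if count == 0:
        if updated and not updated.endswith("\n"):
            updated += "\n"
        updated += "PermitRootLogin yes\n"
    return updated
```

After saving the real file, sshd has to be restarted to pick up the change, for example with ‘systemctl restart sshd’. Allowing root login over SSH is fine for a lab, but think twice before doing it on a production backup server.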

Let’s move to the NSX-T side configuration.
Log in to the NSX-T VIP and navigate to System > Backup & Restore…

Click on Edit for the SFTP server and fill in all the required information.
FQDN/IP : your SFTP server
Port : 22
Path : the directory we created in the steps above
Username, Password & Passphrase.

Save

It will prompt you to add the fingerprint.

Click on ‘Start Backup’ once you save it.

You should see a successful backup listed in the UI.

Additionally, you can use WinSCP to log in to Photon OS and check the backup directory. You should see the recent backup folders.

You will also want to set an interval so that the NSX-T configuration is backed up on a schedule.
Click on ‘Edit’ on the NSX-T UI backup page and set an interval.

I preferred a daily backup, so I set the interval to 24 hrs.

Check your manager cluster to make sure it’s stable.

And take a backup again manually.

That’s it for this post.

We have successfully configured an SFTP server as the backup target for our NSX-T environment. We will use this backup to restore NSX-T at the DR site in case of a site failure, or in case of an NSX-T manager failure for any reason.
