Sunday, August 17, 2014

VXLAN in the contemporary data center

What is a Contemporary Data Center?
A contemporary data center is a virtualized data center.  At first, only the servers were virtualized.  That alone changed the data center from a static environment into a dynamic, elastic one, in which servers running as virtual machines can be provisioned, deleted, and moved from one physical machine to another.

Later on, storage virtualization made the data center even more dynamic and elastic: besides the virtual machines, the data itself can be moved around.

Being able to move virtual machines from one physical server to another is very useful.  However, the limitation was that these physical servers had to be connected to the same flat (layer 2) network.

Running multiple virtual servers on one physical server allows for multi-tenancy.  VLANs are a good way to isolate traffic among the various tenants.  However, the number of VLANs in a network is limited by the 12-bit VLAN ID field, which gives 4096 values.  VLAN IDs 0 and 4095 are reserved, so at most 4094 VLANs can exist in a given layer-2 network.

To cope with the increased demand that the virtualized data center places on the network, the industry has come up with 3 different approaches.  The most discussed technologies are:
  • Network Virtualization
  • Network Function Virtualization and
  • Software Defined Networking

I will describe and compare these 3 technologies in another post.  This post will focus on VXLAN, which is one form of Network Virtualization.

What is Network Virtualization?
Virtualization is the abstraction, or decoupling, of something from its physical entity.  For network virtualization, it is the ability to abstract networking from the physical network.

How does network virtualization abstract from the physical network?  One way is to use the technique of network overlay where tunnels between end points are created on existing physical networks.  The most common tunneling protocols are:

  • VXLAN (Virtual Extensible LAN)
  • Network Virtualization using Generic Routing Encapsulation (NVGRE)
  • Stateless Transport Tunneling (STT)
  • Network Virtualization Overlays (NVO3), which is an IETF working group rather than a single protocol

Benefits of Network Overlay
The very first problem that network overlay can help solve is extending the layer 2 domain across layer 3 subnets.  As a result, physical servers no longer have to sit in a single flat layer 2 network for virtual machines to move between them.  Because traffic is tunneled between end points, the overlay also helps isolate traffic among tenants.

Each overlay network has its own network ID, which lifts the limit imposed by the 12-bit VLAN ID.  Furthermore, multiple tenants in the same data center can use the same private IP addresses.

What is VXLAN?
VXLAN (Virtual Extensible LAN) is a network tunneling technology that encapsulates a native Ethernet frame inside a UDP packet and transports it over an IP network.

It was jointly developed by VMware, Arista Networks and Cisco.  The latest specification, which has moved to RFC status, also has contributions from Storvisor, Broadcom, Citrix and Red Hat.  The title of this IETF document is “VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks”.  This technology was first announced at VMworld 2011.  Since then, many articles about this topic have appeared in technical magazines and blogs.

VXLAN Terminology
The best way to understand a technology, in my opinion, is to start from the terminology it uses.  The terminology provides a framework of what the important elements are for a given technology.  At the very least, we can type these key words into our favorite search engine and start researching.

The following are, in my opinion, the essential terms used in the VXLAN world:
  • Encapsulation
  • VTEP
  • VNI
  • IP Multicast

The term encapsulation is used in Object Oriented Programming as well as in data communication.  The idea is the same in both cases: put one object inside another object and send it to a destination.

In the case of VXLAN, a layer-2 frame is placed as the payload of a UDP packet, and IP is used to reach the destination.  On reaching the destination, the packet is decapsulated.
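To make the idea concrete, here is a minimal Python sketch of the encapsulation step, assuming the 8-byte VXLAN header layout from the specification (8 flag bits, 24 reserved bits, a 24-bit VNI, 8 reserved bits) and the IANA-assigned UDP destination port 4789.  The function names and the sample frame are my own illustrations, not part of any product.  The outer UDP, IP and Ethernet headers are left out; in practice the sending stack adds them.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789         # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field carries a valid value


def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header to the original layer-2 frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags (8 bits) + reserved (24 bits).
    # Second 32-bit word: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(udp_payload: bytes) -> Tuple[int, bytes]:
    """Strip the VXLAN header and return (VNI, original layer-2 frame)."""
    if len(udp_payload) < 8:
        raise ValueError("payload is shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", udp_payload[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return word2 >> 8, udp_payload[8:]


# Hypothetical inner frame: broadcast destination MAC, made-up source MAC,
# EtherType 0x0800 (IPv4), and a placeholder payload.
inner = bytes.fromhex("ffffffffffff" "00005e005301" "0800") + b"payload"

udp_payload = vxlan_encapsulate(5001, inner)
vni, recovered = vxlan_decapsulate(udp_payload)
print(udp_payload[:8].hex(), vni, recovered == inner)
```

The decapsulation side returns exactly the bytes that went in, which is the point of the tunnel: the original L2 frame is untouched while in transit.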

I have a picture taken from the Cisco website that not only details the individual fields of a VXLAN packet but also explains the concept of encapsulation with color.  The yellow portion is the “original L2 frame”, which is carried as the payload of a UDP packet, highlighted in blue.

Image source:

As described in the paragraphs above, packets are encapsulated at the source and decapsulated at the destination.  The VTEP (VXLAN Tunnel End Point) is the entity that performs the encapsulation and decapsulation.

The VTEP plays a vital role in the VXLAN operation.  The tunnel is created between these end points so that the “original L2 frame” can be transported back and forth, thus achieving the goal of layer-2 communication over a layer-3 (IP) infrastructure even when the entities are in different IP subnets.  One constraint mentioned earlier in this post was that vMotion of a virtual machine is limited to physical machines in the same layer-2 network; the tunnel between VTEPs removes that constraint.

A VTEP can be implemented in the virtual switches of hypervisors or on physical networking devices such as switches and routers.

VXLAN is about layer-2 segments, as implied by the “extensible” in its name.  A traditional VLAN segment is limited by the 12-bit VLAN ID to 4096 values per network, of which 4094 are usable.  VXLAN expands this to about 16 million logical segments.  This is done by using a 24-bit VNI or VNID (VXLAN Network ID) to uniquely identify each logical segment within a VXLAN network.
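The arithmetic behind those numbers is short enough to check in a few lines of Python.  The variable names are mine; the two reserved VLAN IDs (0 and 4095) come from IEEE 802.1Q.

```python
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_id_values = 2 ** VLAN_ID_BITS   # every value a 12-bit field can hold
usable_vlans = vlan_id_values - 2    # VLAN IDs 0 and 4095 are reserved
vxlan_segments = 2 ** VNI_BITS       # every value a 24-bit VNI can hold

print(vlan_id_values, usable_vlans, vxlan_segments)
# prints: 4096 4094 16777216
```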

Each device in the VXLAN network is uniquely identified by the combination of its VNI and its MAC address.
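One way to picture this is a lookup table keyed by the (VNI, MAC) pair.  The sketch below is only an illustration of that idea: the table, the helper names, and the addresses (taken from documentation ranges) are assumptions of mine, not how any particular VTEP stores its state.

```python
from typing import Dict, Optional, Tuple

# (VNI, MAC address) -> IP address of the VTEP that the device sits behind
forwarding_table: Dict[Tuple[int, str], str] = {}


def learn(vni: int, mac: str, vtep_ip: str) -> None:
    """Record which VTEP a device in a given segment was seen behind."""
    forwarding_table[(vni, mac.lower())] = vtep_ip


def lookup(vni: int, mac: str) -> Optional[str]:
    """Return the VTEP IP for a device, or None if it is not known yet."""
    return forwarding_table.get((vni, mac.lower()))


# The same MAC address can exist in two tenants' segments without a clash,
# because the VNI is part of the key.
learn(5001, "00:00:5E:00:53:01", "192.0.2.1")
learn(5002, "00:00:5E:00:53:01", "192.0.2.2")

print(lookup(5001, "00:00:5e:00:53:01"))
print(lookup(5002, "00:00:5e:00:53:01"))
print(lookup(5003, "00:00:5e:00:53:01"))
```

The third lookup returns None, which corresponds to the "destination unknown" case described later in this post.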

IP Multicast
VXLAN operates on a layer-3/IP network.  Tunnels are created between VTEPs.  The VXLAN specification uses IP multicast to simulate a layer-2 broadcast in order to find the location of the destination device.  Two protocols are typically involved: VTEPs join a multicast group using IGMP, while the routers in between use a multicast routing protocol such as PIM to deliver the group traffic.  I will have to find out how these are most commonly deployed in practice.

Cisco has a proprietary implementation that uses unicast to perform this function.  It is called unicast mode.

Putting the terminology together to see how VXLAN works
Knowing the terminology is like knowing the alphabet.  Now we are going to make a sentence from the letters.

Image source:

The above diagram, which I found on the VMware blog, explains the VXLAN operation very well:
  • The VTEP is implemented in VMware’s vSphere Distributed Switch.
  • Each VTEP has an IP address, and in this case they are on the same subnet.
  • There is a Layer-3 infrastructure network.
  • The 2 VTEPs are members of a multicast group, and in this case IGMP (Internet Group Management Protocol) is used.
  • The VNID is 5001.

When the VM on the left wants to communicate with the VM on the right:
  • The VM sends out a frame whose destination is unknown, broadcast or multicast.
  • The VTEP on the left (IP = ) encapsulates this layer-2 frame in a UDP packet and sends it out to the multicast group.
  • The other VTEPs in the multicast group (in this case there is only one) receive the packet, decapsulate it, and flood the frame on their local layer-2 domain.
  • In this process, the VNI and the MAC address of the VM on the left are learned by the receiving VTEPs.
  • The VM on the right receives the frame from the VM on the left and replies.
  • The reply frame is sent from the VTEP on the right to the VTEP on the left as IP unicast, since the MAC address and VNI have been learned.
  • At this point the MAC addresses and the VNI of both VMs are known to both VTEPs, and from here onward traffic between the two VMs is IP unicast between the 2 VTEPs.
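
The steps above can be imitated with a small Python simulation.  This is a toy model under my own assumptions: the Vtep class, the shared list standing in for the IP multicast group, and the MAC and IP addresses are all made up for illustration.  No real packets are sent.

```python
from typing import Dict, List, Tuple

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Vtep:
    """Toy VTEP: floods unknown destinations, learns from what it receives."""

    def __init__(self, ip: str, group: List["Vtep"]) -> None:
        self.ip = ip
        self.group = group  # stands in for the IP multicast group
        # (VNI, inner source MAC) -> IP of the VTEP it was learned from
        self.learned: Dict[Tuple[int, str], str] = {}
        self.sent: List[str] = []  # "multicast" or "unicast", one per frame
        group.append(self)

    def send(self, vni: int, src_mac: str, dst_mac: str) -> None:
        """Handle a frame arriving from a local VM."""
        remote_ip = self.learned.get((vni, dst_mac))
        if remote_ip is None:
            # Unknown, broadcast or multicast destination: use the group.
            self.sent.append("multicast")
            targets = [peer for peer in self.group if peer is not self]
        else:
            # Known destination: send straight to the VTEP it sits behind.
            self.sent.append("unicast")
            targets = [peer for peer in self.group if peer.ip == remote_ip]
        for peer in targets:
            peer.receive(vni, src_mac, self.ip)

    def receive(self, vni: int, src_mac: str, outer_src_ip: str) -> None:
        """Decapsulation side: learn where the inner source MAC lives."""
        self.learned[(vni, src_mac)] = outer_src_ip


group: List[Vtep] = []
left = Vtep("192.0.2.1", group)
right = Vtep("192.0.2.2", group)
vm_left, vm_right = "00:00:5e:00:53:01", "00:00:5e:00:53:02"

left.send(5001, vm_left, BROADCAST_MAC)  # first frame goes to the group
right.send(5001, vm_right, vm_left)      # reply, destination already learned
left.send(5001, vm_left, vm_right)       # both sides learned

print(left.sent, right.sent)
# prints: ['multicast', 'unicast'] ['unicast']
```

Only the very first frame uses the multicast group; everything after that is unicast between the two VTEPs, which matches the last bullet above.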

This is a brief overview of VXLAN in the contemporary data center.