VXLAN

What is VXLAN?

VXLAN - RFC 7348

Virtual Extensible LAN (VXLAN) is a network virtualization technology that attempts to address the scalability problems associated with large cloud computing deployments. It uses a VLAN-like encapsulation technique to encapsulate OSI layer 2 Ethernet frames within layer 4 UDP datagrams, using 4789 as the default IANA-assigned destination UDP port number. VXLAN endpoints, which terminate VXLAN tunnels and may be either virtual or physical switch ports, are known as VXLAN tunnel endpoints (VTEPs).


VXLAN is an evolution of efforts to standardize on an overlay encapsulation protocol. Compared to VLAN, which provides a limited number of layer-2 segments (a 12-bit VLAN ID allows 4094 usable VLANs), VXLAN scales to about 16 million logical networks (with a 24-bit VNI, also written VNID) and allows layer-2 adjacency across IP networks. Multicast, or unicast with head-end replication (HER), is used to flood broadcast, unknown-unicast and multicast (BUM) traffic.
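The scaling difference comes down to the width of the identifier field. A minimal sketch of the arithmetic follows; the constant and function names are illustrative only and are not taken from any library.

```python
# Width of the segment identifier in each technology.
VLAN_ID_BITS = 12  # 802.1Q VLAN ID field
VNI_BITS = 24      # VXLAN network identifier field


def id_space(bits: int) -> int:
    """Return how many distinct values fit in a field of the given width."""
    return 2 ** bits


vlan_ids = id_space(VLAN_ID_BITS)  # 4096 values; 0 and 4095 are reserved, leaving 4094
vni_ids = id_space(VNI_BITS)       # 16777216 values, the "16 million" figure

print(vlan_ids, vni_ids, vni_ids // vlan_ids)
```

Each extra bit doubles the identifier space, so the 12 additional bits give 4096 times as many segments.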

Underlay Network

all L3 - Dynamic routing

no spanning tree

uses ECMP


Overlay Network

VNI (VXLAN network identifier)

The VNI is 24 bits long.

*VXLAN creates virtual layer-2 segments identified by VNIs, and these segments run on top of a layer-3 network. VXLAN uses a special interface called a VTEP, which bridges VNIs to the layer-3 network. When traffic comes in, the source VTEP encapsulates it and sends it to a destination VTEP, where it is decapsulated.

*Each VNI is a separate virtual network that runs over the underlay.

*Each VNI is called a bridge domain.

*Traffic is encapsulated with UDP and IP before it is sent out, and decapsulated when it is received.

*VXLAN can run on hardware or software

*VXLAN (software / host based): the vSwitch on the host has a VTEP, which encapsulates the traffic from the VM before it touches any physical switches. The physical switches just see IP traffic.

*VXLAN (physical / Nexus based): this is called a VXLAN gateway. The encapsulation is done on the switch, which can improve performance.

VXLAN hybrid is also supported (hardware and software)

Switches and routers that use VXLAN have interfaces called VTEPs, which provide the connection between the overlay and the underlay.

Each VTEP has an IP address in the underlay network and is associated with one or more VNIs.

To deliver traffic from one host to another, the source and destination VTEPs create a stateless tunnel. These tunnels only exist long enough to deliver the VXLAN frame.


Header Format and Encapsulation

We start with an ordinary layer-2 frame that a host might send. This is the Inner MAC Frame. The hosts are unaware of VXLAN, so there is nothing special about this. It's just a normal Ethernet frame.

The switch will add several headers, starting with the VXLAN header. Before sending the data across the IP network, it also needs to add:

*An outer UDP header
*An outer IP header
*An Ethernet header

Aside from adding headers, this process also removes the FCS from the inner MAC frame.

There are four parts to the VXLAN header, in this order:

Flags (8 bits) – Currently only bit 3 (mask 0x08) is used. This is the I flag, which is set to 1 to indicate a valid VNI. The other bits are reserved
Reserved (24 bits) – Currently unused. This is set to zero on transmission and ignored when received
VNI (24 bits) – The VNI ID number. 24 bits allows for about 16 million possible VNIs
Reserved (8 bits) – As before, this is currently unused
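The 8-byte header can be sketched with Python's standard struct module. This follows the RFC 7348 field order (flags, reserved, VNI, reserved); the function names are made up for illustration and this is not a full encapsulation implementation.

```python
import struct

VXLAN_FLAG_I = 0x08  # I flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI from a VXLAN header, checking the I flag first."""
    word1, word2 = struct.unpack("!II", header[:8])
    flags = word1 >> 24
    if not flags & VXLAN_FLAG_I:
        raise ValueError("I flag not set, VNI is not valid")
    return word2 >> 8  # drop the trailing 8 reserved bits


hdr = pack_vxlan_header(5000)
print(hdr.hex())                 # 0800000000138800
print(unpack_vxlan_header(hdr))  # 5000
```

The reserved fields are simply left as zero when packing and shifted away when unpacking, which matches "set to zero on transmission and ignored when received".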

*** The extra headers make the frame much larger: the outer Ethernet, IP, UDP and VXLAN headers add around 50 bytes. Fragmenting these packets would decrease performance, so it's recommended that jumbo frames are enabled everywhere in the underlay.
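The 50-byte figure can be checked with simple arithmetic. The sketch below assumes an untagged outer Ethernet header and an outer IPv4 header without options; an 802.1Q tag or an IPv6 underlay would add more. The names are illustrative.

```python
# Header sizes in bytes (assumptions: no 802.1Q tag, IPv4 without options).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # the inner MAC header travels inside the tunnel (its FCS is removed)

VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def required_underlay_mtu(host_mtu: int) -> int:
    """IP MTU the underlay needs so a full-size host packet is not fragmented."""
    # The underlay IP MTU counts everything after the outer Ethernet header.
    return host_mtu + INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


print(VXLAN_OVERHEAD)               # 50
print(required_underlay_mtu(1500))  # 1550
```

With hosts at the usual 1500-byte MTU, the underlay needs at least 1550 bytes of IP MTU, which is why jumbo frames on every underlay link are the simplest answer.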


Learning Addresses


Data Plane Learning (flood and learn)

Data plane learning is easier to set up, but it can cause traffic to hairpin and is less efficient than control plane learning.
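Flood and learn can be sketched as a MAC table keyed by VNI and MAC address. This is a simplified model, not any vendor's implementation; the class name, method names and addresses are hypothetical.

```python
from typing import Dict, List, Tuple


class FloodAndLearnVtep:
    """Toy model of a VTEP that learns remote MACs from the data plane."""

    def __init__(self, remote_vteps: List[str]) -> None:
        self.remote_vteps = list(remote_vteps)
        # (VNI, MAC address) -> IP address of the VTEP behind which the MAC lives
        self.mac_table: Dict[Tuple[int, str], str] = {}

    def learn(self, vni: int, src_mac: str, from_vtep: str) -> None:
        """Record the source MAC of a decapsulated frame against the sending VTEP."""
        self.mac_table[(vni, src_mac)] = from_vtep

    def forward(self, vni: int, dst_mac: str) -> List[str]:
        """Return the VTEPs a frame is sent to: one if known, all if unknown."""
        known = self.mac_table.get((vni, dst_mac))
        if known is not None:
            return [known]
        return list(self.remote_vteps)  # unknown unicast: flood


vtep = FloodAndLearnVtep(["10.0.0.2", "10.0.0.3"])
before = vtep.forward(5000, "aa:aa:aa:aa:aa:02")  # flooded to both VTEPs
vtep.learn(5000, "aa:aa:aa:aa:aa:02", "10.0.0.3")
after = vtep.forward(5000, "aa:aa:aa:aa:aa:02")   # sent only to 10.0.0.3
print(before, after)
```

The first frame to an unknown destination is flooded, which is the inefficiency that control plane learning avoids by distributing MAC information in advance.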



Control Plane Learning (Uses BGP to share MAC address information)

BUM traffic (Broadcast / Unknown unicast / Multicast)

*ARP is an example of BUM traffic

In a traditional network, BUM traffic is flooded to many destinations.

VXLAN handles BUM traffic differently, by using either multicast or head-end replication. Multicast is the most common.

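Head-end replication can also be used, with the list of remote VTEPs either configured statically or learned through BGP EVPN. It is simpler because the underlay needs no multicast, but it doesn't scale well, since the source VTEP must send a separate copy to every remote VTEP.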
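The scaling difference between the two BUM methods shows up in how many copies the source VTEP has to send. A small sketch follows, using a made-up function name, made-up addresses and a made-up multicast group.

```python
from typing import List


def bum_outer_destinations(mode: str, remote_vteps: List[str], group: str) -> List[str]:
    """Return the outer destination IPs the source VTEP sends one BUM frame to."""
    if mode == "multicast":
        # One copy to the group; the underlay replicates it toward the receivers.
        return [group]
    if mode == "head-end":
        # One unicast copy per remote VTEP, all generated by the source VTEP.
        return list(remote_vteps)
    raise ValueError(f"unknown mode: {mode}")


peers = [f"10.0.0.{i}" for i in range(2, 102)]  # 100 hypothetical remote VTEPs
mcast = bum_outer_destinations("multicast", peers, "239.1.1.1")
her = bum_outer_destinations("head-end", peers, "239.1.1.1")
print(len(mcast), len(her))  # 1 100
```

With head-end replication the work on the source VTEP grows linearly with the number of peers in the VNI, while with multicast it stays constant.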

Control plane learning uses the BGP EVPN address family. You need either a full mesh of BGP sessions or route reflectors. It also makes ARP suppression possible.
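The reason route reflectors are usually preferred over a full mesh is the session count. A rough sketch of the arithmetic, assuming iBGP, every client peering with every reflector, and the reflectors peering with each other; the function names are illustrative.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of BGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int) -> int:
    """Sessions when each client peers only with the route reflectors."""
    # Assumption: the reflectors also peer with each other in a full mesh.
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(40))           # 780
print(route_reflector_sessions(40, 2))  # 81
```

For 40 leaf VTEPs, a full mesh needs 780 sessions, while two route reflectors (for example on the spines) bring that down to 81.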

Multicast generally scales best for BUM traffic, because the underlay replicates each packet instead of the source VTEP.

To support multi-tenancy, layer 3 VNIs are attached to a VRF.



VXLAN with EVPN

config blah!!!

vPC to VXLAN

vPC (Virtual Port-Channel), also known as multichassis EtherChannel (MEC), is a feature on the Cisco Nexus switches that provides the ability to configure a Port-Channel across multiple switches (i.e. vPC peers). vPC is similar to Virtual Switch System (VSS) on the Catalyst 6500s.

*Use different loopbacks for BGP and NVE

*Configure delay restore interface-vlan

*Enable peer-switch, ARP sync and ND sync

*Increase the STP hello timer to 4 seconds

*Use a backup routing SVI on the peer link to protect against failures (without this, traffic can be black-holed)


Spine and Leaf topology


Useful Commands

show ip arp suppression-cache detail
*shows the MAC-to-IP bindings that the switches are returning to the hosts (the L flag marks locally learned entries and R remotely learned ones)

show vxlan
*shows the VLAN-to-VNI bindings


Notes

NVE (Network Virtual Interface)


Reference links

add FPLABS configuration video!!!


VXLAN explained - Excellent!