Scaling EVPN Multi-Site Overlays using Route-Servers

Cisco’s EVPN Multi-Site is a great technology that allows us to scale an EVPN network massively. With the latest release, the official scalability numbers put us in the realm of over 12,000 VTEPs (512 VTEPs per site x 25 sites).
I’m in no way suggesting that you would need such a big topology, and you should definitely segment well before you reach the limit, but still…

The main configuration requirement for the Multi-Site overlay is to have a full mesh of eBGP peerings between all border gateways.

As usual, a full mesh has scalability drawbacks. Not only will each border gateway have an ever-growing number of peers, which soon gets out of control, but, perhaps worse, every existing site must be touched each time a new site is added.

To avoid a full mesh in iBGP topologies we would use a Route Reflector, but with eBGP that’s obviously not an option. So, instead of an RR, the way to scale eBGP peerings is to leverage a Route-Server.
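To put rough numbers on the difference, here is a minimal sketch that counts eBGP sessions in both designs. The function names, the two border gateways per site and the two route servers are assumptions made up for the illustration; only the 25-site figure comes from the scalability numbers above.

```python
def full_mesh_sessions(sites: int, bgws_per_site: int) -> int:
    """eBGP sessions when every BGW peers with every BGW in every other site."""
    site_pairs = sites * (sites - 1) // 2
    return site_pairs * bgws_per_site * bgws_per_site


def route_server_sessions(sites: int, bgws_per_site: int, route_servers: int) -> int:
    """eBGP sessions when every BGW peers only with the route servers."""
    return sites * bgws_per_site * route_servers


if __name__ == "__main__":
    # Hypothetical design: 2 BGWs per site and 2 route servers.
    for sites in (5, 10, 25):
        mesh = full_mesh_sessions(sites, bgws_per_site=2)
        rs = route_server_sessions(sites, bgws_per_site=2, route_servers=2)
        print(f"{sites:>2} sites: full mesh = {mesh:>4} sessions, route servers = {rs:>3} sessions")
```

Under these assumptions the full mesh grows quadratically with the number of sites, while the route-server design grows linearly, and adding a site only touches the new border gateways and the route servers.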

A Route-Server provides route-reflection capabilities for eBGP peers, and as such it must ensure that NLRI attributes like the Next Hop and route-targets aren’t changed on the way through.
In Cisco’s EVPN implementation, the auto-defined route-targets are based on ASN:VNI. Since every site runs its own ASN, the auto-derived route-targets differ from site to site, so in order to keep using this simplified config the RS should also support the “rewrite-evpn-rt-asn” feature; if that’s not the case, then hard-coded and consistent route-targets must be defined across the VTEPs in the network. Finally, the route-server doesn’t have to be in the data plane since it’s only a control plane node.
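The ASN:VNI point is easier to see with a toy model. The sketch below only models the string transformation: the helper names and VNI 10001 are hypothetical, the ASNs 100 and 200 are the ones used in the configs that follow, and it assumes 2-byte ASNs. Which ASN gets substituted, and at which point of the update path, is platform behaviour that this sketch does not try to reproduce.

```python
def auto_route_target(asn: int, vni: int) -> str:
    """Auto-derived route-target in the ASN:VNI form."""
    return f"{asn}:{vni}"


def rewrite_rt_asn(route_target: str, new_asn: int) -> str:
    """Replace only the ASN portion of an ASN:VNI route-target."""
    _old_asn, vni = route_target.split(":", 1)
    return f"{new_asn}:{vni}"


# Hypothetical L2 VNI 10001 stretched between site AS 100 and site AS 200.
site_a_export = auto_route_target(100, 10001)
site_b_import = auto_route_target(200, 10001)

print(site_a_export == site_b_import)  # False: 100:10001 vs 200:10001

rewritten = rewrite_rt_asn(site_a_export, 200)
print(rewritten == site_b_import)      # True: both are 200:10001
```

Without the rewrite, the route exported by one site never matches the import route-target auto-derived in the other, which is why the alternative is hard-coding the same value everywhere.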

Unfortunately, for EVPN, there isn’t a “route-server-client” configuration knob yet 😦 , nor can we find a configuration example on Cisco’s pages. Fortunately, knowing the requirements, we can figure out what the config should look like.

NX-OS EVPN Route-Server Configuration

feature nv overlay
nv overlay evpn
feature bgp
!
route-map RETAIN-ORIGINAL-VTEP-NEXTHOP permit 10
  set ip next-hop unchanged
!
router bgp 12345
  log-neighbor-changes
  address-family l2vpn evpn
    retain route-target all
  neighbor 1.1.1.0
    remote-as 100
    address-family l2vpn evpn
      send-community
      send-community extended
      route-map RETAIN-ORIGINAL-VTEP-NEXTHOP out
      rewrite-evpn-rt-asn
  neighbor 1.1.1.3
    remote-as 200
    address-family l2vpn evpn
      send-community
      send-community extended
      route-map RETAIN-ORIGINAL-VTEP-NEXTHOP out
      rewrite-evpn-rt-asn

IOS-XE EVPN Route-Server Configuration

!
route-map RETAIN-ORIGINAL-VTEP-NEXTHOP permit 10
 set ip next-hop unchanged
!
router bgp 12345
 bgp log-neighbor-changes
 no bgp default route-target filter
 neighbor 1.1.1.0 remote-as 100
 neighbor 1.1.1.0 disable-connected-check
 neighbor 1.1.1.3 remote-as 200
 neighbor 1.1.1.3 disable-connected-check
 !
 address-family l2vpn evpn
  rewrite-evpn-rt-asn
  neighbor 1.1.1.0 activate
  neighbor 1.1.1.0 send-community both
  neighbor 1.1.1.0 soft-reconfiguration inbound
  neighbor 1.1.1.0 route-map RETAIN-ORIGINAL-VTEP-NEXTHOP out
  neighbor 1.1.1.3 activate
  neighbor 1.1.1.3 send-community both
  neighbor 1.1.1.3 soft-reconfiguration inbound
  neighbor 1.1.1.3 route-map RETAIN-ORIGINAL-VTEP-NEXTHOP out
 exit-address-family
!

Just a couple of notes on the above config:

  1. The command “disable-connected-check” is required; otherwise the router will reject received prefixes with “DENIED due to: non-connected MP_REACH NEXTHOP”.
  2. The command “next-hop-unchanged” has no effect in the address-family L2VPN EVPN (probably a bug). A route-map is necessary in order to achieve the same result.
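To see why the next hop matters so much here, consider this small model of what a BGP speaker does when it re-advertises a route to an eBGP peer. It is only a sketch: the class, the function and all addresses are made up for the illustration and do not correspond to any device API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EvpnRoute:
    prefix: str    # free-text description of the EVPN route
    next_hop: str  # VTEP address that remote sites will tunnel to


def advertise(route: EvpnRoute, speaker_ip: str, next_hop_unchanged: bool) -> EvpnRoute:
    """Model an eBGP re-advertisement by the speaker at speaker_ip."""
    if next_hop_unchanged:
        return route                           # original VTEP stays the tunnel endpoint
    return replace(route, next_hop=speaker_ip)  # usual eBGP behaviour: next-hop-self


# Hypothetical values: a border gateway VTEP at 10.0.0.1, route server at 1.1.1.1.
learned = EvpnRoute(prefix="mac 0000.1111.2222 vni 10001", next_hop="10.0.0.1")

print(advertise(learned, "1.1.1.1", next_hop_unchanged=False).next_hop)  # 1.1.1.1
print(advertise(learned, "1.1.1.1", next_hop_unchanged=True).next_hop)   # 10.0.0.1
```

In the first case remote sites would try to build their VXLAN tunnels towards the route server, which is not a VTEP and is not in the data plane, so the overlay breaks. The second case is what the route-map in the configs above is there to guarantee.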

IOS-XR EVPN Route-Server Configuration

!
route-policy ACCEPT-ALL
  pass
end-policy
!
router bgp 12345
 nsr
 bgp router-id 1.1.1.1
 bgp graceful-restart
 !
 address-family l2vpn evpn
  retain route-target all
 !
 neighbor 1.1.1.0
  remote-as 100
  ignore-connected-check
  !
  address-family l2vpn evpn
   send-community-ebgp
   route-policy ACCEPT-ALL in
   route-policy ACCEPT-ALL out
   send-extended-community-ebgp
   soft-reconfiguration inbound always
   next-hop-unchanged
  !
 !
 neighbor 1.1.1.3
  remote-as 200
  ignore-connected-check
  !
  address-family l2vpn evpn
   send-community-ebgp
   route-policy ACCEPT-ALL in
   route-policy ACCEPT-ALL out
   send-extended-community-ebgp
   soft-reconfiguration inbound always
   next-hop-unchanged
  !
 !
!

As with “disable-connected-check” on IOS-XE, the command “ignore-connected-check” is required.
Additionally, IOS-XR unfortunately doesn’t support “rewrite-evpn-rt-asn”.
This means that each VTEP will need the appropriate route-targets configured manually, which greatly increases configuration complexity.

Unless you have some automation backing up your EVPN deployment, it probably isn’t a good idea to use IOS-XR as an EVPN Route-Server.
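If you do go down the manual route, this is roughly the kind of thing the automation would have to render for every VTEP. It is a minimal sketch, assuming NX-OS VTEPs and covering only the L2 VNI stanza; the function name, the shared value 65000 and the VNI list are hypothetical, and L3 VNI route-targets under the VRF would need the same treatment.

```python
def render_l2vni_route_targets(vnis, rt_admin):
    """Render hard-coded, site-independent route-targets for a list of L2 VNIs."""
    lines = ["evpn"]
    for vni in vnis:
        rt = f"{rt_admin}:{vni}"  # same value on every VTEP in every site
        lines += [
            f"  vni {vni} l2",
            "    rd auto",
            f"    route-target import {rt}",
            f"    route-target export {rt}",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical VNIs and shared route-target administrator value.
    print(render_l2vni_route_targets([10001, 10002], rt_admin=65000))
```

The important part is that the value in front of the colon is the same everywhere, regardless of the local site ASN, so the import and export route-targets line up without any rewriting on the route server.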

Do you have anything else to add? Then contact me, or leave a message below.

2 thoughts on “Scaling EVPN Multi-Site Overlays using Route-Servers”

  1. Thanks for the post; it was very helpful. Do you have any thoughts on what criteria should be used to determine how many route-servers are needed and where they should be deployed?

    • It really depends on your design. Since the route servers do not need to be in the data plane you could run them outside of the DCI themselves, even as VMs.

      In my experience, though, it made a lot of sense to have them inside the DCI backbone, since that backbone was a routed Nexus 9k environment.
