Anycast DNS - Using BGP



In this fifth article on Anycast DNS, we provide some examples of deploying Anycast using Border Gateway Protocol or BGP, the core routing protocol of the Internet.

While BGP is mostly used by Internet Service Providers (ISPs), it is also used in some of the larger enterprise environments that must interconnect networks that span geographical and/or administrative regions and boundaries. Since BGP is a very complex routing protocol, we will provide only a basic recipe using Cisco and Quagga host-based routing software. A detailed discussion of the BGP protocol is beyond the scope of this article.

BGP is an Exterior Gateway Protocol (EGP), which means that it exchanges routing information between Autonomous Systems (AS). This makes BGP quite different from Interior Gateway Protocols (IGPs) such as RIP and OSPF. BGP uses a path vector algorithm: each route carries the list of every AS that the path passes through.
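To make the path vector idea concrete, here is a minimal sketch of the loop check that the AS list makes possible: a BGP speaker discards a route whose AS path already contains its own AS number. The helper name as_path_contains is hypothetical and the sample paths simply reuse the AS numbers from this article.

```shell
#!/bin/bash
# Illustrative sketch only: as_path_contains is a hypothetical helper,
# not part of Quagga or any other BGP implementation.
# Usage: as_path_contains LOCAL_AS "AS_PATH"
# Returns 0 (true) if LOCAL_AS appears in the space-separated AS path.
as_path_contains() {
    local_as=$1
    for asn in $2; do
        [ "$asn" = "$local_as" ] && return 0
    done
    return 1
}

# A route originated in AS 64555, as seen by a router in AS 65500:
# the local AS is not in the path, so the route is acceptable.
if as_path_contains 65500 "64555"; then echo "reject"; else echo "accept"; fi

# The same route after it has already passed through AS 65500:
# the local AS is in the path, so accepting it would create a loop.
if as_path_contains 65500 "65500 64555"; then echo "reject"; else echo "accept"; fi
```

The first call prints "accept" and the second prints "reject".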

Our recipe will demonstrate how to configure Quagga to peer with a Cisco router using BGP. Suppose our Anycast design consists of two Autonomous Systems, AS 65500 and AS 64555. AS 64555 will contain our Anycast DNS servers, and we'll establish peering between the two as shown below:

[Figure: Anycast DNS BGP]

The recipe calls for Anycast DNS servers that each have two physical network connections on different subnets or VLANs. Two upstream routers are configured with BGP routing and will peer with our Anycast DNS server. Each Anycast DNS server runs BGP to originate our two Anycast VIPs, 192.168.0.1/32 and 192.168.1.1/32. The configuration is shown in the graphic below:

[Figure: Anycast DNS BGP]

We could advertise both Anycast VIPs from within the same netblock, 192.168.0.0/24, such as 192.168.0.1/32 and 192.168.0.2/32. This would save address space, but for the purposes of this example we use VIPs from different netblocks.

Recipe - Multihomed Anycast DNS using BGP

Step 1 - Configure Anycast VIPs on "Server A"

Add two (2) Anycast VIPs to the host's loopback interface, each as a virtual loopback device (sub-interface). This is done with the following commands:

ifconfig lo:0 192.168.0.1 netmask 255.255.255.255
ifconfig lo:1 192.168.1.1 netmask 255.255.255.255

NOTE: The commands above show the Linux syntax. The loopback devices are named slightly differently on Sun Solaris, where they are called lo0:0 and lo0:1 respectively.
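On Linux systems where ifconfig is not installed, the same addresses can be added with the iproute2 ip command. The sketch below only prints the commands so they can be reviewed before being run as root; the helper name vip_commands is hypothetical, and the labels mirror the lo:0 and lo:1 aliases used above.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the iproute2 commands that add each
# VIP as a /32 on the loopback interface, labelled lo:0, lo:1, and so on.
vip_commands() {
    idx=0
    for vip in "$@"; do
        echo "ip addr add ${vip}/32 dev lo label lo:${idx}"
        idx=$((idx + 1))
    done
}

vip_commands 192.168.0.1 192.168.1.1
# To apply the commands, review the output and then feed it to a root shell:
#   vip_commands 192.168.0.1 192.168.1.1 | sudo sh
```

Printing first and applying second keeps the helper safe to run on a production host while the addresses are still being checked.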

Step 2 - Configure Zebra (component of Quagga) on "Server A"

The typical location of the zebra configuration file is /etc/quagga/zebra.conf, unless you have built Quagga with non-default file locations. Create the /etc/quagga/zebra.conf file as follows:

!
! Zebra configuration saved from vty
! 2009/06/07 09:49:00
!
hostname server_a
!
password zebra
enable password zebra
!
interface eth0
ip address 10.0.1.10/24
!
interface eth1
ip address 10.0.2.10/24
!
interface lo
!
line vty
!

Once the zebra.conf file is built, start the zebra process and configure it to start automatically at boot time. With zebra running, we can access the running configuration interactively through the vty interface or vtysh. Please consult the Quagga on-line help for usage at http://www.quagga.net

Step 3 - Configure BGP on "server_a"

In order to configure BGP routing on server_a, we need to configure the server to run the bgpd routing daemon. The Quagga BGP routing daemon is configured through the /etc/quagga/bgpd.conf file as follows:

!
! bgpd configuration saved from vty
! 2009/06/13 11:21:42
!
hostname server_a
password zebra
log stdout
!
router bgp 64555
bgp router-id 10.0.3.10
network 192.168.0.1/32
network 192.168.1.1/32
timers bgp 4 16
neighbor 10.0.1.1 remote-as 65500
neighbor 10.0.1.1 next-hop-self
neighbor 10.0.1.1 prefix-list DEFAULT in
neighbor 10.0.1.1 prefix-list ANYCAST out
neighbor 10.0.2.1 remote-as 65500
neighbor 10.0.2.1 next-hop-self
neighbor 10.0.2.1 prefix-list DEFAULT in
neighbor 10.0.2.1 prefix-list ANYCAST out
!
ip prefix-list ANYCAST seq 5 permit 192.168.0.1/32
ip prefix-list ANYCAST seq 10 permit 192.168.1.1/32
ip prefix-list DEFAULT seq 5 permit 0.0.0.0/0
line vty
!

Start the bgpd routing daemon and enable the service to start automatically at boot time. Similar to zebra, the BGP process can be maintained and configured through the vty interface or vtysh. The only interfaces in our configuration that actively participate in BGP are eth0 and eth1. Each will "peer" with its respective upstream BGP neighbor: eth0 peers with router R1-A, and eth1 peers with router R1-B.

In our configuration above, we used some of the more advanced BGP configuration directives. Here is a summary of what some of them do:

  • "timers bgp 4 16" - this command adjusts the BGP keepalive and hold timers. A keepalive is sent every 4 seconds, and the router waits 16 seconds without receiving a keepalive before it declares the peer dead. On Cisco routers, these timers default to 60 and 180 seconds respectively
  • "neighbor 10.0.1.1 next-hop-self" - this makes the server advertise its own address as the next hop for the routes it sends to this upstream neighbor
  • "neighbor 10.0.1.1 prefix-list DEFAULT in" - this applies the ip prefix-list called "DEFAULT" to inbound updates, so that only the default route (0.0.0.0/0) is accepted from this neighbor
  • "neighbor 10.0.1.1 prefix-list ANYCAST out" - this applies the ip prefix-list called "ANYCAST" to outbound updates, so that only our two Anycast VIPs are advertised to our upstream peer
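As a quick worked example of the timer settings, dividing the hold time by the keepalive interval gives roughly how many keepalives fit into one hold period. This is a sketch with a hypothetical helper name; it is plain integer arithmetic, not a model of the BGP state machine.

```shell
#!/bin/bash
# Hypothetical helper: number of keepalive intervals that fit into the hold time.
# Usage: keepalives_per_hold KEEPALIVE_SECONDS HOLD_SECONDS
keepalives_per_hold() {
    echo $(( $2 / $1 ))
}

echo "recipe timers (4/16):    $(keepalives_per_hold 4 16) keepalives per hold time"
echo "Cisco defaults (60/180): $(keepalives_per_hold 60 180) keepalives per hold time"
```

The ratios are similar (4 and 3), but with the recipe's timers a dead peer is detected after about 16 seconds instead of 180.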

Step 4 - Configure "Server A" upstream router R1-A and R1-B with BGP

The following Cisco configuration was applied to the upstream router R1-A:

interface FastEthernet0/0
 description link to BGP AS 65500
 ip address 192.168.2.31 255.255.255.0
!
interface FastEthernet0/1
 description link to BGP AS 64555
 ip address 10.0.1.1 255.255.255.0
!
router bgp 65500
 bgp log-neighbor-changes
 network 10.0.1.0 mask 255.255.255.0
 network 192.168.2.0
 network 0.0.0.0
 timers bgp 4 16
 neighbor 10.0.1.10 remote-as 64555
 neighbor 10.0.1.10 next-hop-self
 maximum-paths 4

Apply a similar configuration to router R1-B:

interface FastEthernet0/0
 description link to BGP AS 65500
 ip address 192.168.2.32 255.255.255.0
!
interface FastEthernet0/1
 description link to BGP AS 64555
 ip address 10.0.2.1 255.255.255.0
!
router bgp 65500
 bgp log-neighbor-changes
 network 10.0.2.0 mask 255.255.255.0
 network 192.168.2.0
 network 0.0.0.0
 timers bgp 4 16
 neighbor 10.0.2.10 remote-as 64555
 neighbor 10.0.2.10 next-hop-self
 maximum-paths 4

At this point, BGP routing should be operational, and our Anycast VIPs should be advertised.

Step 5 - Create Failover Mechanism

In the event that our DNS server process on "Server A" or "Server B" fails, it is desirable to remove the Anycast VIPs from the global routing table. To do that, we must stop the routes from being advertised at their point of origination. A small script can accomplish this by performing cursory checks on the health of the DNS server and its ability to respond to queries. The script issues a query and, as soon as it fails, shuts down our routing daemon(s) or removes the routes from being advertised. The following is an example of what such a script might look like:

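#!/bin/bash
# Health check: query the DNS server through the first Anycast VIP and
# stop the routing daemons (withdrawing the VIPs) if the answer is wrong.

# Path to dig as given in this recipe; adjust if dig is installed elsewhere.
DIG=/usr/sbin/dig

# "localhost." should resolve to 127.0.0.1 when named is answering correctly.
DNSUP=$("$DIG" @192.168.0.1 localhost. A +short)

if [ "$DNSUP" != "127.0.0.1" ]; then
    echo "Stopping Anycast...."
    /etc/init.d/bgpd stop
    /etc/init.d/zebra stop
    /etc/init.d/named stop
else
    echo "Everything's good... Do nothing..."
fi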

The script should be scheduled to run frequently, using cron or at, to minimize downtime and provide quick failover. Note that cron cannot run a job more often than once per minute.
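Because a single lost query would otherwise take the server out of the Anycast group, it can be useful to retry the check a few times before acting. The following is a minimal sketch of such a wrapper; the function name check_with_retries and its arguments are assumptions made for illustration and are not part of the original recipe.

```shell
#!/bin/bash
# Hypothetical helper: run a check command up to TRIES times, DELAY seconds apart.
# Usage: check_with_retries TRIES DELAY COMMAND [ARGS...]
# Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
check_with_retries() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$attempt" -lt "$tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example with a stand-in check ("true" always succeeds). In a failover script
# the command would be a function that runs dig against the VIP and compares
# the answer with the expected address.
if check_with_retries 3 0 true; then
    echo "DNS healthy"
else
    echo "DNS failed 3 checks in a row"
fi
```

The trade-off is that retries delay failover by up to TRIES times DELAY seconds, so both values should stay small.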

Step 6 - Repeat Steps 1-5 for all other Anycast servers that are part of this Anycast group.

Key BGP Troubleshooting Commands

BGP is a complex routing protocol to deploy and maintain, especially in larger enterprise network environments. A great amount of planning time is needed to achieve an efficient routing architecture that provides high availability and fast convergence. As you work with BGP, you will need to rely on a bevy of tools for troubleshooting and validating your BGP routed network. Here are some Cisco IOS commands used in configuring and/or troubleshooting BGP:

show ip bgp summary - shows BGP neighbors in summary mode

R1-A# show ip bgp summary
BGP router identifier 192.168.2.31, local AS number 65500
BGP table version is 1, main routing table version 1
6 network entries using 582 bytes of memory
6 path entries using 216 bytes of memory
2 BGP path attribute entries using 120 bytes of memory
1 BGP AS-PATH entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 942 total bytes of memory
BGP activity 6/0 prefixes, 6/0 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.1.10       4 64555       4       3        0    0    0 00:00:02        2
10.0.2.10       4 64555       3       3        0    0    0 00:00:00        0

The output shown above displays a lot of useful information, including the local router identifier for router R1-A (192.168.2.31), the local AS (65500), and the BGP table version (1). (An increasing version number indicates that network changes are occurring; if no changes occur, this number remains the same.) It also shows six network entries on R1-A using 582 bytes of memory, and six path entries using 216 bytes. Memory is important in BGP because in a large network, such as the Internet, memory can be a limiting factor. As more BGP entries populate the IP routing table, more memory is required. The output also displays two configured remote peers; both are EBGP peers, because their AS (64555) is different from the local AS (65500).
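When this check is repeated often, the neighbor rows can be reduced to a one-line status per peer. The sketch below assumes the column layout shown above and reads the command output from standard input; the helper name summarize_peers is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "show ip bgp summary" output on stdin and print one
# line per neighbor. A numeric last column (State/PfxRcd) means the session is
# established and gives the number of prefixes received; a word such as Active
# or Idle means the session is not up.
summarize_peers() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
        if ($NF ~ /^[0-9]+$/)
            print $1, "established", $NF
        else
            print $1, "down", $NF
    }'
}

# Sample input: the header and neighbor rows from the output above.
summarize_peers <<'EOF'
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.1.10       4 64555       4       3        0    0    0 00:00:02        2
10.0.2.10       4 64555       3       3        0    0    0 00:00:00        0
EOF
```

For the sample rows this prints "10.0.1.10 established 2" and "10.0.2.10 established 0"; the second session had only just come up and had not yet received any prefixes.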

show ip bgp - displays the BGP topology table

R1-A# show ip bgp
BGP table version is 7, local router ID is 192.168.2.31
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          192.168.2.1              0         32768 i
*> 10.0.1.0/24      0.0.0.0                  0         32768 i
*> 10.0.2.0/24      0.0.0.0                  0         32768 i
*> 192.168.0.1/32   10.0.2.10                0             0 64555 i
*                   10.0.1.10                0             0 64555 i
*> 192.168.1.1/32   10.0.2.10                0             0 64555 i
*                   10.0.1.10                0             0 64555 i
*> 192.168.2.0      0.0.0.0                  0         32768 i

The BGP table version is displayed as 7 and the local router ID is 192.168.2.31. The various networks are listed along with the next hop address, metric (MED), local preference (LocPrf), weight, and the path. In the status column on the left, * marks a valid route and > marks the best path; an i there would indicate a route learned via internal BGP, which does not occur in our example. The i on the right side indicates the origin (i is for IGP, part of the origin codes).
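To confirm which server a router currently prefers for an Anycast VIP, the best path row can be picked out of this table. The following is a sketch under the assumption that the output has the layout shown above; best_next_hop is a hypothetical helper name, and it only matches rows whose status is exactly "*>" (external routes, as in this recipe).

```shell
#!/bin/bash
# Hypothetical helper: read "show ip bgp" output on stdin and print the next hop
# of the best path (the row marked "*>") for the given prefix.
# Usage: best_next_hop PREFIX < show_ip_bgp_output
best_next_hop() {
    awk -v prefix="$1" '$1 == "*>" && $2 == prefix { print $3 }'
}

# Sample input: the two rows for the first Anycast VIP from the table above.
best_next_hop 192.168.0.1/32 <<'EOF'
*> 192.168.0.1/32   10.0.2.10                0             0 64555 i
*                   10.0.1.10                0             0 64555 i
EOF
```

For the sample rows this prints 10.0.2.10, the next hop of the path marked as best.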

show ip bgp neighbors - displays BGP neighbors in detail

R1-A# show ip bgp neighbors
BGP neighbor is 10.0.1.10,  remote AS 64555, external link
  BGP version 4, remote router ID 10.0.1.10
  BGP state = Established, up for 00:05:07
  Last read 00:00:02, hold time is 16, keepalive interval is 4 seconds
  Configured hold time is 16, keepalive interval is 4 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(old & new)
    Address family IPv4 Unicast: advertised and received
  Message statistics:
    InQ depth is 0
    OutQ depth is 0
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                2          1
    Keepalives:            79         63
    Route Refresh:          0          0
    Total:                 82         65
  Default minimum time between advertisement runs is 30 seconds

 For address family: IPv4 Unicast
  BGP table version 7, neighbor version 7
  Index 3, Offset 0, Mask 0x8
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               6          2 (Consumes 72 bytes)
    Prefixes Total:                 6          2
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          0
    Used as multipath:            n/a          2

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Total:                                0          0
  Number of NLRIs in the update sent: max 4, min 0

  Connections established 1; dropped 0
  Last reset never
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Local host: 10.0.1.1, Local port: 179
Foreign host: 10.0.1.10, Foreign port: 48101

Enqueued packets for retransmit: 0, input: 0  mis-ordered: 0 (0 bytes)

Event Timers (current time is 0x5E1F8):
Timer          Starts    Wakeups            Next
Retrans            84          0             0x0
TimeWait            0          0             0x0
AckHold            67         64             0x0
SendWnd             0          0             0x0
KeepAlive           0          0             0x0
GiveUp              0          0             0x0
PmtuAger            0          0             0x0
DeadWait            0          0             0x0

iss:  915421219  snduna:  915422937  sndnxt:  915422937     sndwnd:   5840
irs: 4113695520  rcvnxt: 4113696868  rcvwnd:      15037  delrcvwnd:   1347

SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
minRTT: 0 ms, maxRTT: 300 ms, ACK hold: 200 ms
Flags: passive open, nagle, gen tcbs

Datagrams (max data segment is 1460 bytes):
Rcvd: 152 (out of order: 0), with data: 67, total data bytes: 1347
Sent: 148 (retransmit: 0, fastretransmit: 0), with data: 83, total data bytes: 1717

BGP neighbor is 10.0.2.10,  remote AS 64555, external link
  BGP version 4, remote router ID 10.0.1.10
  BGP state = Established, up for 00:05:19
  Last read 00:00:04, hold time is 16, keepalive interval is 4 seconds
  Configured hold time is 16, keepalive interval is 4 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(old & new)
    Address family IPv4 Unicast: advertised and received
  Message statistics:
    InQ depth is 0
    OutQ depth is 0
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                1          1
    Keepalives:            82         65
    Route Refresh:          0          0
    Total:                 84         67
  Default minimum time between advertisement runs is 30 seconds

 For address family: IPv4 Unicast
  BGP table version 7, neighbor version 7
  Index 4, Offset 0, Mask 0x10
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               4          2 (Consumes 72 bytes)
    Prefixes Total:                 4          2
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          2
    Used as multipath:            n/a          2

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Bestpath from this peer:              2        n/a
    Total:                                2          0
  Number of NLRIs in the update sent: max 4, min 0

  Connections established 1; dropped 0
  Last reset never
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Local host: 10.0.2.1, Local port: 179
Foreign host: 10.0.2.10, Foreign port: 39231

Enqueued packets for retransmit: 0, input: 0  mis-ordered: 0 (0 bytes)

Event Timers (current time is 0x60E88):
Timer          Starts    Wakeups            Next
Retrans            88          0             0x0
TimeWait            0          0             0x0
AckHold            69         51             0x0
SendWnd             0          0             0x0
KeepAlive           0          0             0x0
GiveUp              0          0             0x0
PmtuAger            0          0             0x0
DeadWait            0          0             0x0

iss: 2991828195  snduna: 2991829917  sndnxt: 2991829917     sndwnd:   5840
irs: 4144867550  rcvnxt: 4144868936  rcvwnd:      14999  delrcvwnd:   1385

SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
minRTT: 0 ms, maxRTT: 300 ms, ACK hold: 200 ms
Flags: passive open, nagle, gen tcbs

Datagrams (max data segment is 1460 bytes):
Rcvd: 157 (out of order: 0), with data: 69, total data bytes: 1385
Sent: 139 (retransmit: 0, fastretransmit: 0), with data: 87, total data bytes: 1721

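The output above shows the BGP neighbors in greater detail. Both sessions are in the Established state, and the hold time (16 seconds) and keepalive interval (4 seconds) match the timers configured earlier.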

This concludes our high-level recipe on using BGP to configure Anycast DNS services. It also marks the final article in the Anycast DNS Recipe Series.

Series

DNS Anycast using static routes
DNS Anycast using RIP
DNS Anycast using RIP (cont)
DNS Anycast using OSPF (basic)
DNS Anycast using OSPF (advanced)
DNS Anycast using BGP
