Build log · MikroTik RB5009 · BGP + BFD failover

Sub-second IPv6 failover on RouterOS

Replace a static IPv6 default with a BGP route on a BFD-monitored WireGuard session — ~30 s dead-tunnel detection down to ~600 ms. An optional enhancement to the CGNAT build log.

Overview

This is an optional enhancement to the CGNAT build log: replace that build's static IPv6 default route with a BGP-advertised one on a BFD-monitored session, so a dead WireGuard tunnel is detected in about 600 ms instead of ~30 seconds.

It is not self-contained. It assumes the IPv6-over-WireGuard layer from the CGNAT build is already up: specifically the wg-host interface, the <LAN_PREFIX>:0::1 / ::2 tunnel addresses, and the static ::/0 route with the comment "vps primary", which this post removes. Nothing here is needed for IPv6 to work; it only changes how fast the LAN gives up on a dead tunnel and fails over to native IPv4.

This applies to the VPS-routed /48 path only. If you took the Route64 /56 path instead, skip this post: that path has no VPS to run bird2 on and ships its own netwatch-driven fail-to-IPv4 (Route64 post §7). BGP+BFD here replaces the static default that the VPS path installs.

The problem it solves: a WireGuard interface stays administratively UP even when the path is dead — NAT mapping expired, peer rebooted, VPS null-routed — so neither interface state nor BGP keepalives alone are a reliable failure signal. The CGNAT build's static ::/0 with check-gateway=ping detects a dead tunnel in roughly 30 seconds; during that window dual-stack apps stall on AAAA before Happy Eyeballs falls back to IPv4. A BGP route on a BFD-monitored session withdraws the instant BFD declares the path dead — pings fail once and clients are already on IPv4 by the next attempt.

| Event | Measured |
| --- | --- |
| WG silent → BFD down → route withdrawn | ~600 ms |
| WG restored → BFD up → route reinstalled | ~3 s |
| Full VPS reboot → bird up, route installed | ~28 s |
| BFD bandwidth (200 ms × 3, bidirectional) | ~3.4 GB / mo |
| BFD cost at $2.50/TB | ~$0.0085 / mo |
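The BFD figures above can be sanity-checked with a little arithmetic. The sketch below is a back-of-envelope estimate, not the author's measurement method. It assumes 132 bytes per BFD packet on the wire (24 BFD + 8 UDP + 40 IPv6 inside the tunnel, plus 32 WireGuard + 8 UDP + 20 IPv4 outside), a 30-day month, and decimal units. The function names are made up for illustration.

```bash
#!/usr/bin/env bash
# Back-of-envelope check of the BFD detection time, traffic, and cost.
# ASSUMPTION: 132 bytes per BFD packet on the wire (see lead-in).
# ASSUMPTION: 30-day month, 1 GB = 10^9 bytes, 1 TB = 1000 GB.

bfd_detect_ms() {            # args: interval_ms multiplier
  echo $(( $1 * $2 ))
}

bfd_monthly_gb() {           # args: interval_ms bytes_per_packet
  LC_ALL=C awk -v i="$1" -v b="$2" 'BEGIN {
    pps = 2 * 1000 / i                      # packets per second, both directions
    printf "%.1f\n", pps * b * 30 * 86400 / 1e9
  }'
}

bfd_monthly_cost() {         # args: gb_per_month usd_per_tb
  LC_ALL=C awk -v g="$1" -v p="$2" 'BEGIN { printf "%.4f\n", g / 1000 * p }'
}

gb=$(bfd_monthly_gb 200 132)
echo "detect:  $(bfd_detect_ms 200 3) ms"
echo "traffic: ${gb} GB/month"
echo "cost:    \$$(bfd_monthly_cost "$gb" 2.50)/month"
```

Under those assumptions the script prints 600 ms, 3.4 GB/month, and $0.0085/month, which agrees with the table. A different outer transport (for example IPv6 instead of IPv4) would shift the traffic figure slightly.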

1. Conventions and placeholders

The snippets continue the CGNAT build's placeholders and add the routing ones. Substitute before pasting.

| Placeholder | Meaning |
| --- | --- |
| <LAN_PREFIX> | The routed IPv6 /48 (or /56) from the CGNAT build; :0::1/:0::2 are the tunnel ends. |
| <VPS_AS> / <MT_AS> | Private 2-byte ASNs (RFC 6996, 64512–65534), one per side. |
| <VPS_ROUTER_ID> / <MT_ROUTER_ID> | Any unique 32-bit router IDs (IPv4-formatted). |
| wg-host | The WireGuard interface created in the CGNAT build (§4.3). |
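Before substituting the two ASNs, it may be worth confirming that both fall inside the private 2-byte range and that they differ, since the session is eBGP. The following is a minimal sketch; the function names and the example values 64512 and 64513 are hypothetical, not taken from the build.

```bash
#!/usr/bin/env bash
# Sanity-check the <VPS_AS> / <MT_AS> values before pasting them anywhere.
# The example ASNs at the bottom are hypothetical.

is_private_asn16() {         # arg: ASN; succeeds only for 64512..65534
  case $1 in
    ''|*[!0-9]*) return 1 ;; # reject empty or non-numeric input
  esac
  [ "$1" -ge 64512 ] && [ "$1" -le 65534 ]
}

check_asn_pair() {           # args: vps_as mt_as
  is_private_asn16 "$1" || { echo "VPS AS $1 is outside 64512-65534"; return 1; }
  is_private_asn16 "$2" || { echo "MikroTik AS $2 is outside 64512-65534"; return 1; }
  [ "$1" != "$2" ]      || { echo "both sides use AS $1; eBGP needs two different ASNs"; return 1; }
  echo "ok"
}

check_asn_pair 64512 64513
```

With two distinct in-range values the check prints ok; otherwise it names the offending side and returns non-zero.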

2. VPS — bird2 with BFD

bird2: BGP + BFD on the VPS

```bash
# Add a link-local on wg0 first (bird's "next hop self" needs one).
# Append to /etc/wireguard/wg0.conf and reload:
#   Address = fe80::1/64

apt-get install -y bird2
mkdir -p /etc/bird
# Quoted delimiter: the config is written verbatim, with no shell expansion.
cat >/etc/bird/bird.conf <<'EOF'
log syslog all;
router id <VPS_ROUTER_ID>;

protocol device { }
protocol kernel kernel6 { ipv6 { export none; import all; }; learn yes; }

protocol bfd {
  interface "wg0" {
    min rx interval 200 ms;
    min tx interval 200 ms;
    idle tx interval 1 s;
    multiplier 3;
  };
  # Explicit neighbor so bird actively probes; passive-only stalls
  # after a tunnel flap because both sides wait for the other.
  neighbor <LAN_PREFIX>:0::2 dev "wg0";
}

protocol bgp mikrotik {
  local <LAN_PREFIX>:0::1 as <VPS_AS>;
  neighbor <LAN_PREFIX>:0::2 as <MT_AS>;
  ipv6 { import none; export where net = ::/0; next hop self; };
}
EOF
chown -R bird:bird /etc/bird

# Restart on non-zero exit codes as well as crashes
# (packaged unit uses on-abnormal, which ignores unclean exit codes).
mkdir -p /etc/systemd/system/bird.service.d
printf '[Service]\nRestart=on-failure\nRestartSec=2s\n' \
  > /etc/systemd/system/bird.service.d/restart.conf
systemctl daemon-reload && systemctl enable --now bird
```
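A placeholder that was never substituted ends up verbatim in bird.conf and only surfaces as a parse error when bird starts. One way to catch that earlier is a small check like the sketch below. The helper name unfilled_placeholders is hypothetical, and it assumes placeholders keep the <UPPER_CASE> form used in this post.

```bash
#!/usr/bin/env bash
# List any <UPPER_CASE> placeholder tokens still present in a file.
# ASSUMPTION: placeholders look like <LAN_PREFIX>, <VPS_AS>, and so on.

unfilled_placeholders() {    # arg: path to a config file
  grep -oE '<[A-Z][A-Z0-9_]*>' "$1" | LC_ALL=C sort -u
}

# Usage on the VPS (prints nothing when every placeholder was replaced):
#   unfilled_placeholders /etc/bird/bird.conf
```

Empty output suggests the file is ready; any token printed still needs a real value.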

The explicit neighbor in protocol bfd matters. Without it, bird is passive and only responds to probes; after a flap the MikroTik waits for BFD before re-establishing BGP, bird waits for BGP before initiating BFD, and recovery needs a manual birdc restart.

3. MikroTik — BGP, BFD, and remove the static route

RouterOS BGP + BFD

```routeros
/routing/bgp/instance/add name=default-bgp as=<MT_AS> router-id=<MT_ROUTER_ID>
/routing/bgp/template/add name=tpl-host as=<MT_AS> use-bfd=yes
/routing/bgp/connection/add name=host-vps instance=default-bgp \
    remote.address=<LAN_PREFIX>:0::1 remote.as=<VPS_AS> \
    local.address=<LAN_PREFIX>:0::2 local.role=ebgp \
    templates=tpl-host afi=ipv6

/routing/bfd/configuration/add interfaces=wg-host \
    min-rx=200ms min-tx=200ms multiplier=3

/ipv6/firewall/filter add chain=input action=accept protocol=udp dst-port=3784 \
    in-interface=wg-host comment="BFD from VPS" \
    place-before=[find where chain=input and comment="defconf: drop everything else not coming from LAN"]

# Remove the static ::/0; BGP-learned route at distance 20 takes over.
/ipv6/route/remove [find comment="vps primary"]
```

BGP is here for one reason: it pairs cleanly with BFD. With a single peer it is not about scaling; it provides dynamic route withdrawal, which a static route cannot do, so the ::/0 is dropped as soon as BFD declares the path dead. RouterOS 7 splits BGP into instance, template, and connection; the address-family field is afi=ipv6 (singular), and as= on both the instance and the template is the local AS, not the remote one.

4. Verification

Confirm BGP/BFD and the failover

```bash
# On the VPS:
birdc show protocols    # bgp + bfd both Established/Up
birdc show route        # ::/0 exported to mikrotik

# On the MikroTik:
/routing/bgp/session/print                  # established
/routing/bfd/session/print                  # state=up
/ipv6/route/print where dst-address=::/0    # bgp, distance 20, no static

# Failover: stop WireGuard on the VPS and time it.
#   wg-quick down wg0
# A client's IPv6 should drop and Happy-Eyeballs to IPv4 within ~1 s;
# the route reappears within a few seconds of bringing wg0 back.
```

A client that loses a single request and is on IPv4 by the next attempt, instead of stalling for ~30 seconds on AAAA, is the whole point of this change.
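To put a rough number on the failover from a LAN client, one option is to timestamp a fast IPv6 ping and measure the time from the last successful reply to the first "unreachable" error. The sketch below assumes iputils ping, whose -D flag prefixes each line with a [seconds.microseconds] timestamp, and assumes the router answers with an ICMPv6 unreachable once the ::/0 is withdrawn. The function name detect_ms and the target address are hypothetical. Resolution is limited by the 200 ms ping interval.

```bash
#!/usr/bin/env bash
# Print the time in ms between the last successful ping reply and the
# first "unreachable" error, reading `ping -D` output from stdin.
# ASSUMPTION: both reply and error lines carry a [sec.usec] prefix.

detect_ms() {
  LC_ALL=C awk '
    match($0, /^\[[0-9]+\.[0-9]+\]/) {
      t = substr($0, 2, RLENGTH - 2) + 0       # timestamp without brackets
      if ($0 ~ /bytes from/) {                 # successful reply
        last_ok = t; have_ok = 1
      } else if ($0 ~ /unreachable/ && have_ok) {
        printf "%d\n", (t - last_ok) * 1000 + 0.5
        exit
      }
    }'
}

# Usage on a LAN client, started before running wg-quick down wg0 on the VPS
# (2001:db8::1 is a placeholder target):
#   ping -6 -D -i 0.2 2001:db8::1 | detect_ms
```

If the router drops packets silently rather than returning an error, nothing is printed, and the withdrawal time is better read from the MikroTik's own route table or log.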
