Aug 29 2023
Border Gateway Protocol is the de facto protocol that directs routing decisions between different ISP networks, and is generally known as the “glue” that holds the internet together. It’s safe to say that the internet we currently know would not function without working BGP implementations.
However, the software on those networks’ routers (I will refer to these as edge devices from now on) that implements BGP has not had a flawless track record. Flaws and problems do exist in commercial and open source implementations of the world’s most critical routing protocol.
Most of these flaws are of course benign in the grand scheme of things; they will be issues around things like route filtering, or insertion, or handling withdraws. However, a far scarier issue is a BGP bug that can propagate after causing bad behaviour, akin to a computer worm.
While debugging support for a future feature for my business (bgp.tools) I took a brief diversion to investigate something, and what I came out with might be one of the most concerning things I’ve discovered for the reliability of the internet. To understand the problems, though, we will need a bit more context.
On 2 June 2023, a small Brazilian network (re)announced one of their internet routes with a small bit of information called an attribute that was corrupted. The information on this route was for a feature that had not finished standardisation, but was set up in such a way that if an intermediate router did not understand it, then the intermediate router would pass it on unchanged.
As many routers did not understand this attribute, this was no problem for them. They just took the information and propagated it along. However, it turned out that Juniper routers running even slightly modern software did understand this attribute, and since the attribute was corrupted, the software in its default configuration would respond by raising an error that would shut down the whole BGP session. Since a BGP session is often a critical part of being “connected” to the wider internet, this resulted in the small Brazilian network disrupting other networks’ ability to communicate with the rest of the internet, despite being thousands of miles away.
The packet that caused the session shutdowns was really quite benign at first glance:
When a BGP session shuts down due to errors, customer network traffic generally stops flowing down that cable until the BGP connection is automatically restarted (typically within seconds to minutes).
This appears to be what happened to a number of different carriers; for example, COLT was heavily impacted by this. Their outage is what originally drew some of my attention to this subject area.
To understand why this sort of thing can happen, we’ll need to take a deeper look at what BGP route attributes are, and what they’re used for.
At its core, a BGP UPDATE’s purpose is to tell another router about some traffic that it can (or can no longer) send to it. However, just knowing directly what you can send to another router is not very useful without context.
For this reason a BGP packet is split up into two sections: the Network Layer Reachability Information (NLRI) data (aka, the IP address ranges), and the attributes that help describe extra context about that reachability data.
Arguably the most used attribute is the AS_PATH (or actually, the AS4_PATH), an attribute that tells you which networks a route has travelled through to get to you. Routers use this list of networks to pick paths for their traffic that are the fastest, the most economically viable, or the least congested, playing a critical role in ensuring that things run smoothly.
At the time of writing there are over 32 different route attribute types, 14 deprecated ones, and 209 officially unassigned ones. The Internet Assigned Numbers Authority (IANA) is in charge of assigning a code to each BGP attribute type, normally off the back of IETF Internet-Drafts. The IANA list doesn’t always give the full story, though, as not all Internet-Drafts make their way into more official documents (like RFCs), so code numbers are assigned (or sometimes even “squatted”) to attribute types that did not get wide deployment.
At the start of every route attribute is a set of flags, conveying information about the attribute. One important flag is called the “transitive bit”:
If a BGP implementation does not understand an attribute, and the transitive bit is set, it will copy the attribute along when it passes the route to another router. If the router does understand the attribute then it may apply its own policy.
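As a rough sketch of how those flags drive forwarding, the snippet below walks a block of attributes and decides what a router that does not recognise an attribute would pass on. The flag values follow the RFC 4271 attribute header (flags, type code, then a one- or two-byte length). The `PathAttribute` class, the function names, and the type code 255 used in the example are illustrative assumptions of mine.

```python
from dataclasses import dataclass
from typing import Optional

# Attribute flag bits, highest bit first.
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10


@dataclass
class PathAttribute:
    flags: int
    type_code: int
    value: bytes


def parse_attributes(data: bytes) -> list:
    """Split raw attribute bytes into (flags, type code, value) records."""
    attrs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated attribute header")
        flags, type_code = data[offset], data[offset + 1]
        offset += 2
        # The extended-length flag selects a 2-byte length field.
        len_size = 2 if flags & EXTENDED_LENGTH else 1
        if offset + len_size > len(data):
            raise ValueError("truncated attribute length")
        length = int.from_bytes(data[offset : offset + len_size], "big")
        offset += len_size
        value = data[offset : offset + length]
        if len(value) != length:
            raise ValueError("attribute value runs past the end of the data")
        attrs.append(PathAttribute(flags, type_code, value))
        offset += length
    return attrs


def forward_unknown(attr: PathAttribute) -> Optional[PathAttribute]:
    """What a router that does NOT recognise `attr` passes on.

    Only optional transitive attributes are copied along (with the partial
    bit set to mark that a router on the path did not understand them).
    This sketch simply drops everything else.
    """
    if attr.flags & OPTIONAL and attr.flags & TRANSITIVE:
        return PathAttribute(attr.flags | PARTIAL, attr.type_code, attr.value)
    return None


# One well-known attribute followed by an optional transitive one.
attrs = parse_attributes(bytes.fromhex("40010100" "c0ff020102"))
forwarded = forward_unknown(attrs[1])
print(hex(forwarded.flags), forwarded.value.hex())  # 0xe0 0102
```

The important detail is that `forward_unknown` never looks inside `value`. The bytes are carried along untouched, whether or not they are well formed.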
At a glance this “feature” seems like an incredibly bad idea, as it allows possibly unknown information to propagate blindly through systems that do not understand the impact of what they are forwarding. However this feature has also allowed widespread deployment of things like Large Communities to happen faster, and has arguably made deployment of new BGP features possible at all.
What happens when an attribute fails to decode? The answer depends strongly on whether the BGP implementation has been updated to use RFC 7606 logic. If the implementation is not RFC 7606 compliant, then typically an error is raised and the session is shut down. If it is, the session can usually continue as normal (except that the routes impacted by the decoding error are treated as unreachable).
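The difference between the two behaviours can be sketched like this. It is a deliberately simplified model: RFC 7606 actually prescribes different handling depending on which attribute is malformed, and a real router does far more bookkeeping. The `Outcome` enum, the toy decoder, and the dictionary standing in for a routing table are all hypothetical names for illustration.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPTED = "routes installed"
    SESSION_RESET = "session shut down"
    TREAT_AS_WITHDRAW = "affected routes treated as unreachable"


def decode_single_attribute(raw: bytes) -> tuple:
    """Toy decoder for one attribute with a 1-byte length field.

    Raises ValueError when the declared length does not match the data.
    """
    if len(raw) < 3 or raw[2] != len(raw) - 3:
        raise ValueError("attribute length does not match its contents")
    return raw[0], raw[1], raw[3:]


def process_update(prefixes: list, raw_attr: bytes, rfc7606: bool, rib: dict) -> Outcome:
    """Apply one UPDATE to `rib`, a dict mapping prefix -> decoded attribute."""
    try:
        attr = decode_single_attribute(raw_attr)
    except ValueError:
        if not rfc7606:
            # Older behaviour: any malformed attribute kills the whole session.
            return Outcome.SESSION_RESET
        # Simplified RFC 7606 behaviour: keep the session up, but act as if
        # the prefixes in this UPDATE had been withdrawn.
        for prefix in prefixes:
            rib.pop(prefix, None)
        return Outcome.TREAT_AS_WITHDRAW
    for prefix in prefixes:
        rib[prefix] = attr
    return Outcome.ACCEPTED


good = bytes.fromhex("c0ff020102")  # declared length 2, two bytes of value
corrupt = bytes.fromhex("c0ff0501")  # declared length 5, one byte of value

rib = {}
print(process_update(["192.0.2.0/24"], good, rfc7606=True, rib=rib))
print(process_update(["192.0.2.0/24"], corrupt, rfc7606=True, rib=rib))
print(process_update(["192.0.2.0/24"], corrupt, rfc7606=False, rib=rib))
```

In the first two cases the session survives and only the one prefix is affected. In the last case every route learned over that session is lost until it re-establishes.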
BGP session shutdowns are particularly undesirable, as they will impact traffic flow along a path. However in the case of a “Transitive” error they can become worm-like. Since not all BGP implementations support the same attributes, an attribute that i