The Router Configuration Checker at 20: The Birth of Static Network Configuration Analysis
This year marks the 20th anniversary of rcc, our pioneering research and software on static verification of router configurations.
On October 20, 2025, Amazon Web Services experienced a massive outage that took down Snapchat, McDonald’s app, Ring doorbells, Roblox, Fortnite, and thousands of other services worldwide. The culprit? A DNS resolution failure triggered by a faulty configuration update. Just over a year earlier, CrowdStrike’s configuration error caused what has been called the largest IT outage in history, crashing 8.5 million Windows systems and causing an estimated $10 billion in damage.
Twenty years after these kinds of configuration errors first motivated my Ph.D. thesis research, they remain one of the primary causes of network and system outages.
In May 2005, my advisor Hari Balakrishnan and I published “Detecting BGP Configuration Faults with Static Analysis”. The tool was called rcc (Router Configuration Checker), and it introduced static analysis of router configurations before deployment. Studies at the time showed that 50-80% of network outages were caused by configuration errors. Two decades later, that fundamental problem persists—though the systems have gotten vastly more complex.
The Problem in 2005
In the mid-2000s, network operators were managing increasingly complex routing infrastructures with no systematic way to verify correctness. The state of the art was to push configurations to production routers and see what broke. When something went wrong, operators would scramble to diagnose the problem and roll back changes.
Configuring routers in an autonomous system is essentially writing a distributed program, but we were doing it with none of the verification tools that software engineers take for granted. BGP’s flexible configuration language made it powerful but error-prone. Operators had to manually decompose network-wide policies into device-level configurations across potentially thousands of BGP sessions, with complex interactions between iBGP, eBGP, route reflection, and filtering policies.
The Approach
rcc introduced static analysis of router configurations before deployment to detect faults automatically.
The approach was to parse vendor-specific configuration files, normalize them into a vendor-independent representation using relational database tables, and then check constraints based on a correctness specification. No need to run configurations in a live network or build expensive emulation environments.
rcc focused on detecting two broad classes of BGP configuration faults:
Route validity faults: Where routers might learn routes that don’t correspond to usable paths. This included issues like:
Invalid AS paths being propagated
Next-hop addresses unreachable via IGP
Undefined references to filters and route maps
Improper or missing route filtering
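The undefined-reference fault class above is easy to illustrate. A toy sketch (the config fragment and regular expressions are invented for the example, not rcc's parser) that flags a route map applied to a BGP session but never defined:

```python
import re

# A made-up IOS-style fragment: r1 applies two route maps, but only one
# is defined anywhere in the configuration.
config = """
router bgp 65001
 neighbor 192.0.2.1 route-map CUSTOMER-IN in
 neighbor 192.0.2.9 route-map TRANSIT-IN in
route-map CUSTOMER-IN permit 10
 match ip address prefix-list CUSTOMERS
"""

# Collect every route map a neighbor statement refers to...
referenced = set(re.findall(r"neighbor \S+ route-map (\S+)", config))
# ...and every route map the configuration actually defines.
defined = set(re.findall(r"^route-map (\S+)", config, re.MULTILINE))

# Route-validity fault: a reference with no matching definition.
print(sorted(referenced - defined))   # ['TRANSIT-IN']
```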
Path visibility faults: Where routers fail to learn routes for paths that exist in the network. The classic example was iBGP configuration errors—particularly with route reflection—that could cause network partitions where some routers never learn about certain destinations.
What We Found
We analyzed BGP configurations from 17 real-world networks. rcc found errors in every network. Most operators were unaware of these errors, which ranged from simple single-router problems like undefined variables to complex multi-router interactions.
The errors we uncovered included:
Potential network partitions caused by route propagation problems
Invalid routes being propagated due to improper filtering
Routers forwarding packets inconsistently with high-level policy
420 incomplete iBGP sessions across the networks we studied
Many of these errors were latent—not actively causing problems yet, but violations of correctness that would manifest under certain conditions. For example, misconfigured backup paths that would only fail after a primary link went down.
Why Configuration Errors Happen
Our analysis revealed three main causes:
Complex propagation mechanisms: The techniques used to scalably propagate routes within a network—particularly route reflection with clusters—are easily misconfigured.
Levels of indirection: Even simple policy specifications get implemented using multiple layers of indirection in configuration files (distribute lists, prefix lists, route maps, community values).
No systematic process: Most operators had no disciplined approach to network configuration. There were no configuration management tools, no testing frameworks, no verification.
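To make the second cause concrete, here is an invented Cisco IOS-style fragment in which a single policy ("accept customer prefixes and tag them") is spread across three layers of references:

```
router bgp 65001
 neighbor 192.0.2.1 remote-as 65002
 ! layer 1: the session points at a route map
 neighbor 192.0.2.1 route-map CUSTOMER-IN in
!
! layer 2: the route map points at a prefix list and sets a
! community value that other policies elsewhere match on
route-map CUSTOMER-IN permit 10
 match ip address prefix-list CUSTOMER-PFX
 set community 65001:100
!
! layer 3: the prefix list finally names the actual prefixes
ip prefix-list CUSTOMER-PFX seq 5 permit 203.0.113.0/24
```

A typo in any one of these names quietly disconnects the policy from the session that is supposed to enforce it, which is exactly the kind of dangling reference rcc flagged.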
The Impact
rcc won Best Paper at NSDI 2005. The tool was open-sourced and is still on GitHub at github.com/noise-lab/rcc. Operators used it to find and fix bugs before deployment.
The bigger impact was establishing static configuration verification as a research area. rcc showed that you could reason formally about network behavior from configuration files alone, without needing access to live networks or running emulation.
The Research Lineage
Over the past 20 years, researchers have built on rcc’s foundation with increasingly sophisticated techniques:
Header Space Analysis (2012): Kazemian, Varghese, and McKeown extended static analysis from the control plane to the data plane, introducing geometric representations of packet header spaces.
Batfish (2015): Fogel, Mahajan, Millstein, and colleagues built a general-purpose configuration analysis tool using Datalog and logic programming, extending beyond BGP to multi-protocol networks.
Minesweeper (2017): Beckett et al. developed SMT-based verification for a wide range of properties including reachability, waypointing, and fault tolerance.
These tools advanced the techniques for network verification—new representations, new algorithms, broader protocol coverage. But they all built on the paradigm rcc established: that you can and should verify network configurations statically before deployment.
Key Design Decisions
Several design decisions proved important:
Vendor-independent representation: By parsing vendor configs into normalized relational tables, we separated the parsing problem from the analysis problem. This pattern has been followed by every subsequent tool.
Focus on correctness, not optimality: rcc didn’t try to find the “best” configuration—it found wrong configurations.
Constraint-based approach: Rather than requiring operators to write formal specifications, rcc checked constraints that must hold for any correct configuration.
Static analysis: No need for live network access, no emulation overhead, no waiting for convergence. You could verify configurations as fast as you could parse and analyze them.
What’s Changed in 20 Years
Networks have gotten more complex. We’ve added software-defined networking, cloud and multi-cloud environments, container networking, network functions virtualization, and programmable data planes. Configuration errors are more consequential than ever. A misconfiguration in a cloud environment can expose thousands of customers’ data. An error in SDN controller logic can take down an entire data center. A faulty configuration update can crash millions of systems worldwide, as we saw with CrowdStrike.
Looking Forward: LLMs and the Next Generation
Outages persist, and increasingly complex software and configurations are often to blame. But there is hope that tools with more sophisticated reasoning capabilities will ultimately help us catch these errors. The rise of large language models has opened new approaches to the problems rcc was designed to solve.
Our recent work on CAIP (Context-Aware Iterative Prompting) demonstrates how LLMs can detect router misconfigurations with over 30% better accuracy than traditional model checkers and consistency checkers, finding over 20 previously undetected misconfigurations in real-world configurations. Unlike traditional tools that require significant manual effort to develop and maintain for each new protocol or vendor, LLMs can learn patterns from vast datasets and apply them across layers and vendors.
This makes the verification that rcc pioneered feasible for much larger and more complex systems—from multi-protocol networks to cross-vendor translations to intent-based configuration generation. The fundamental insight remains the same: configuration errors are preventable through analysis before deployment. But LLMs are making that analysis practical at scales and across domains we couldn’t have imagined in 2005.
The AWS and CrowdStrike outages remind us that configuration errors remain a critical challenge. But they also show us that the problem rcc addressed two decades ago is more important than ever—and that new techniques building on that foundation offer genuine hope for finally getting ahead of these errors.
The original rcc paper: “Detecting BGP Configuration Faults with Static Analysis,” Nick Feamster and Hari Balakrishnan, NSDI 2005. PDF
Recent work: “CAIP: Detecting Router Misconfigurations with Context-Aware Iterative Prompting of LLMs,” Xi Jiang, Aaron Gember-Jacobson, and Nick Feamster, 2024. arXiv
@inproceedings{feamster2005rcc,
title = {Detecting {BGP} Configuration Faults with Static Analysis},
author = {Feamster, Nick and Balakrishnan, Hari},
booktitle = {2nd USENIX Symposium on Networked Systems Design and
Implementation (NSDI '05)},
year = {2005},
month = {May},
address = {Boston, MA},
publisher = {USENIX Association},
url = {https://www.usenix.org/legacy/event/nsdi05/tech/feamster/feamster.pdf},
note = {Best Paper Award}
}


