RSTP and Fast Convergence

Fast Convergence for Designated Ports

RSTP's fast convergence depends on point-to-point links between switches. To quickly transition a designated port into a non-discarding state, the upstream switch needs to make sure that the downstream neighbor agrees with that idea. This is the process known as the handshake (or proposal/agreement):

Upstream bridge sends a proposal out of a designated port. As a matter of fact, it just sets the proposal bit in outgoing configuration BPDUs.
Downstream bridge receives the proposal, and if it agrees with the upstream port role, it starts the process known as synchronization.
Synchronization means that the downstream bridge blocks all of its non-edge designated ports before sending an agreement to the upstream bridge.
Synchronization is needed to make sure there are no loops in the topology, after the upstream bridge unblocks its designated port.
If the downstream bridge does not agree with the proposal, it will continue sending its own configuration BPDUs with the proposal bit set. Eventually one of the bridges will accept the superior information and send an agreement.
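The handshake steps above can be sketched in a few lines of Python. This is a toy model, not an implementation of the standard: the `Bridge` and `Port` classes, the single `bridge_id` integer (lower wins, standing in for the full priority vector) and the port names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    name: str
    role: str = "designated"     # "designated", "root" or "alternate"
    edge: bool = False
    state: str = "discarding"    # "discarding" or "forwarding"


@dataclass
class Bridge:
    name: str
    bridge_id: int               # assumption: lower wins, replaces the priority vector
    ports: dict = field(default_factory=dict)

    def synchronize(self, keep):
        """Block every non-edge designated port except the one named `keep`."""
        blocked = []
        for port in self.ports.values():
            if port.name != keep and port.role == "designated" and not port.edge:
                port.state = "discarding"
                blocked.append(port.name)
        return blocked

    def receive_proposal(self, port_name, sender_id):
        """Return True (an agreement) if the sender's information is superior."""
        if sender_id >= self.bridge_id:
            return False         # disagree: we keep sending our own proposals
        self.synchronize(keep=port_name)
        port = self.ports[port_name]
        port.role, port.state = "root", "forwarding"
        return True


def handshake(upstream, up_port, downstream, down_port):
    """Upstream proposes; its designated port unblocks only after an agreement."""
    agreed = downstream.receive_proposal(down_port, upstream.bridge_id)
    if agreed:
        upstream.ports[up_port].state = "forwarding"
    return agreed


# Hypothetical two-switch example: A has the better (lower) bridge ID.
a = Bridge("A", 4096, {"p1": Port("p1")})
b = Bridge("B", 32768, {
    "p1": Port("p1"),
    "p2": Port("p2", state="forwarding"),             # non-edge designated port
    "p3": Port("p3", edge=True, state="forwarding"),  # edge port, never blocked
})
agreed = handshake(a, "p1", b, "p1")
```

After the call, B's non-edge designated port p2 is discarding, its edge port p3 is untouched, and both ends of the A-B link are forwarding. B would then run the same handshake on p2 with whatever sits downstream, which is how the sync "wave" moves away from the root.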

Fast Convergence for Other Port Types

See a more detailed overview at: http://blog.ine.com/wp-content/uploads/2010/04/understanding-stp-rstp-convergence.pdf

The above procedure outlines the fast transition for designated ports. As for root ports, they can always transition to the forwarding state upon receiving a superior BPDU and synchronizing the local designated ports. Alternate ports may quickly transition to the forwarding state if the current root port is lost, much like the feature known as UplinkFast in classic STP implementations. Inferior BPDU handling is similar to the BackboneFast feature: a designated port receiving an inferior BPDU quickly sends a new proposal to synchronize the downstream peer.
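The alternate-port case can be illustrated with a small sketch. It assumes each port is a plain dict and that the best alternate is simply the one with the lowest root path cost (ties broken by port name); a real bridge compares full priority vectors. The function name, port names and costs are hypothetical.

```python
def on_root_port_lost(ports, lost):
    """Retire the failed root port and promote the best alternate, if any.

    `ports` maps a port name to {"role", "state", "root_path_cost"}.
    Returns the name of the new root port, or None when no alternate exists.
    """
    ports[lost]["role"] = "disabled"
    ports[lost]["state"] = "discarding"

    alternates = [name for name, p in ports.items() if p["role"] == "alternate"]
    if not alternates:
        return None

    # Assumption: lowest root path cost wins, port name breaks ties.
    best = min(alternates, key=lambda name: (ports[name]["root_path_cost"], name))
    ports[best]["role"] = "root"
    ports[best]["state"] = "forwarding"   # immediate, no timers involved
    return best


# Hypothetical switch with one root port and two alternates.
ports = {
    "Gi0/1": {"role": "root", "state": "forwarding", "root_path_cost": 4},
    "Gi0/2": {"role": "alternate", "state": "discarding", "root_path_cost": 8},
    "Gi0/3": {"role": "alternate", "state": "discarding", "root_path_cost": 19},
}
new_root = on_root_port_lost(ports, "Gi0/1")
```

Here Gi0/2 takes over as the root port right away, while Gi0/3 stays an alternate in the discarding state.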

Why Are P2P Links So Important?

And now, an interesting question: why does RSTP need point-to-point links for fast convergence? The answer lies in the handshake protocol. If we had multiple devices on the segment, performing synchronization would become really cumbersome. The upstream bridge would have to detect all downstream bridges and wait for every one of them to synchronize with its proposal. Implementing such a complicated protocol is not worth the benefit, as most of the time switches are connected using full-duplex links.

Treating Shared Links as P2P

Yes, this is possible. You may think this is a purely theoretical concern, but you can run into it in real-life scenarios. Recall that RSTP detects P2P links by looking at the link duplex. What if we have switches A, B and C (customer switches) plugged into switch D (a provider switch), with switch D performing Layer 2 Protocol Tunneling? In this case, all customer switches would consider their connections to be P2P. However, switch D will tunnel all BPDUs, effectively connecting the switches via a shared cloud. In this situation, RSTP would behave according to the P2P link rules, sending a proposal and unblocking the designated port upon receiving the first agreement. This may introduce temporary Layer 2 loops, as one of the switches may not yet be synchronized. You can easily lab up this scenario and observe "fast convergence" over a shared link.
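A short sketch makes the misclassification concrete. The function names are hypothetical, and the `admin` override is an assumption modeled loosely on the administrative link-type setting that implementations typically expose; the automatic rule is just the duplex check described above.

```python
def operational_link_type(duplex, admin="auto"):
    """Infer the RSTP link type of a port from its duplex setting.

    Assumption: `admin` may force "point-to-point" or "shared";
    "auto" falls back to the duplex-based guess.
    """
    if admin in ("point-to-point", "shared"):
        return admin
    return "point-to-point" if duplex == "full" else "shared"


def misclassified(duplex, bridges_on_segment, admin="auto"):
    """True when a port is treated as P2P but more than two bridges share the segment."""
    link_type = operational_link_type(duplex, admin)
    return link_type == "point-to-point" and bridges_on_segment > 2


# Switches A, B and C each have a full-duplex link to D, but D tunnels BPDUs,
# so from the spanning-tree point of view three bridges share one segment.
customer_duplex = {"A": "full", "B": "full", "C": "full"}
at_risk = {name: misclassified(duplex, bridges_on_segment=3)
           for name, duplex in customer_duplex.items()}
```

Every customer switch ends up flagged in `at_risk`, because duplex alone says nothing about how many bridges actually hear the BPDUs. Forcing the link type to shared in this model (`admin="shared"`) makes the port fall back to timer-based transitions instead of the handshake.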

Another situation is possible with Cisco's R-PVST, which is a hybrid of RSTP and Cisco's proprietary PVST+. The encapsulation rules for R-PVST are the same as those used for PVST+, in order to allow tunneling of Cisco PVST instances over an IEEE CST cloud (you may read about the PVST+ encapsulation rules in the blog post named PVST+ Explained). The problem with R-PVST is the same: there could be multiple Cisco switches connected to the IEEE CST cloud, every switch treating this cloud as a P2P link because its own link is full-duplex. The net effect is the same as described above.

So how long does it take for RSTP to converge?

Contrary to the common belief that RSTP always converges quickly (on the order of milliseconds), it may exhibit convergence times on the order of seconds even in small topologies. The main problem is the distance-vector (or gradient) nature of STP, which converges based on the best information received from peers. In some cases (e.g. a root bridge crash), this may result in old information circulating inside the topology until the hop count exceeds the limit. This is known as count to infinity and is very similar to the problem found in RIP or any other distance-vector protocol. When critical nodes fail, RSTP may take seconds to recover from the information loss.
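To get a feel for how long stale information can linger, here is a rough sketch of the hop limit. It assumes the information's message age grows by one at every bridge and that it is discarded once the age reaches Max Age (20 by default); the function names and the three-switch ring are made up for illustration, and real behavior also depends on timers and BPDU pacing that this model ignores.

```python
def hops_until_aged_out(max_age=20, start_age=0, increment=1):
    """Count how many bridges stale information can cross before it is discarded."""
    hops = 0
    age = start_age
    while age < max_age:      # assumption: dropped once age reaches max_age
        age += increment
        hops += 1
    return hops


def circulate(ring, max_age=20):
    """List the bridges visited, in order, while stale information chases around a ring."""
    hops = hops_until_aged_out(max_age)
    return [ring[i % len(ring)] for i in range(hops)]


# Hypothetical ring of three surviving switches after the root bridge has crashed.
visited = circulate(["B", "C", "D"])
full_laps = len(visited) // 3
```

With the defaults, the stale root information crosses 20 hops, which is more than six full laps around a three-switch ring before it finally ages out. Each hop costs some real time, and that is where the seconds come from.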

Further Reading on Ethernet and RSTP

If you are curious about the details, you may want to read the relatively short article Scaling Ethernet to a Million Nodes, which discusses the RSTP convergence problem and proposes a solution for "broadcast-less" Ethernet that does not require STP. Pay special attention to the figures, especially the one showing the convergence times in ring topologies. To anyone further interested in "scalable Ethernet" I would highly recommend reading another, more recent article: Floodless in SEATTLE. Special thanks to Daniel Ginsburg for referring me to this little gem!

Have fun with RSTP! :)