Troubleshooting Multicast RPF Failure
Hi Brian,
I enjoy the new blog feature. Lots of valuable information condensed in a small space. Could you explain in a nutshell how to troubleshoot multicast RPF failures? I understand the concept, just figuring out what shows and/or debugs to use always seems to take me 30 minutes rather than 5 minutes.
First let’s talk briefly about what the RPF check, or Reverse Path Forwarding check, is for multicast. PIM is known as Protocol Independent Multicast routing because it does not exchange its own information about the network topology. Instead it relies on the accuracy of an underlying unicast routing protocol like OSPF or EIGRP to maintain a loop-free topology. When a multicast packet is received by a router running PIM, the device first looks at the source IP address of the packet. Next the router does a unicast lookup on the source, as in “show ip route w.x.y.z”, where “w.x.y.z” is the source. The outgoing interface in the unicast routing table is then compared with the interface on which the multicast packet was received. If the incoming interface of the packet and the outgoing interface for the route are the same, it is assumed that the packet is not looping, and the packet is a candidate for forwarding. If the incoming interface and the outgoing interface are *not* the same, a loop-free path cannot be guaranteed, and the RPF check fails. All packets for which the RPF check fails are dropped.
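To make the comparison concrete, here is a minimal Python sketch of the check as described above. It is only a model of the logic, not how IOS implements it: the function names `unicast_lookup` and `rpf_check` are made up for illustration, the routing table is a plain list of (prefix, outgoing interface) pairs, and the prefixes and interface names are sample values.

```python
import ipaddress

def unicast_lookup(routes, source):
    """Longest-prefix match: return the outgoing interface toward source, or None."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return None if best is None else best[1]

def rpf_check(routes, source, incoming_interface):
    """Pass only if the packet arrived on the interface used to reach its source."""
    return unicast_lookup(routes, source) == incoming_interface

# Hypothetical table: the source's subnet is reachable out Serial1/3.
routes = [("150.1.124.0/24", "Serial1/3"), ("0.0.0.0/0", "Serial1/2")]
print(rpf_check(routes, "150.1.124.4", "Serial1/3"))  # True, candidate for forwarding
print(rpf_check(routes, "150.1.124.4", "Serial1/2"))  # False, dropped
```

Note that a more specific route toward the source changes the result of the lookup, and with it the interface on which multicast from that source is accepted.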
As far as troubleshooting the RPF check goes, there are a couple of useful show and debug commands available to you in IOS. Suppose the following topology:
We will be running EIGRP on all interfaces, and PIM dense mode on R2, R3, R4, and SW1. Note that PIM will not be enabled on R1. R4 is a multicast source that is attempting to send a feed to the client, SW1. On SW1 we will be generating an IGMP join message to R3 by issuing the “ip igmp join-group” command (abbreviated below as “ip igmp join”) on SW1’s interface Fa0/3. On R4 we will be generating multicast traffic with an extended ping. First let’s look at the topology with a successful transmission:
SW1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
SW1(config)#int fa0/3
SW1(config-if)#ip igmp join 224.1.1.1
SW1(config-if)#end
SW1#
R4#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 5
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: Ethernet0/0
Time to live [255]:
Source address: 150.1.124.4
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.124.4
Reply to request 0 from 150.1.37.7, 32 ms
Reply to request 1 from 150.1.37.7, 28 ms
Reply to request 2 from 150.1.37.7, 28 ms
Reply to request 3 from 150.1.37.7, 28 ms
Reply to request 4 from 150.1.37.7, 28 ms
Now let’s trace the traffic flow starting at the destination and working our way back up the reverse path.
SW1#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:03:05/stopped, RP 0.0.0.0, flags: DCL
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet0/3, Forward/Dense, 00:03:05/00:00:00
(150.1.124.4, 224.1.1.1), 00:02:26/00:02:02, flags: PLTX
Incoming interface: FastEthernet0/3, RPF nbr 150.1.37.3
Outgoing interface list: Null
SW1’s RPF neighbor for (150.1.124.4,224.1.1.1) is 150.1.37.3, which means that SW1 received the packet from R3.
R3#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:03:12/stopped, RP 0.0.0.0, flags: DC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Serial1/3, Forward/Dense, 00:03:12/00:00:00
Serial1/2, Forward/Dense, 00:03:12/00:00:00
Ethernet0/0, Forward/Dense, 00:03:12/00:00:00
(150.1.124.4, 224.1.1.1), 00:02:33/00:01:46, flags: T
Incoming interface: Serial1/3, RPF nbr 150.1.23.2
Outgoing interface list:
Ethernet0/0, Forward/Dense, 00:02:34/00:00:00
Serial1/2, Prune/Dense, 00:02:34/00:00:28
R3’s RPF neighbor is 150.1.23.2, which means the packet came from R2.
R2#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:02:44/stopped, RP 0.0.0.0, flags: D
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet0/0, Forward/Dense, 00:02:44/00:00:00
Serial0/1, Forward/Dense, 00:02:44/00:00:00
(150.1.124.4, 224.1.1.1), 00:02:44/00:01:35, flags: T
Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
Outgoing interface list:
Serial0/1, Forward/Dense, 00:02:45/00:00:00
R2 has no RPF neighbor, meaning the source is directly connected. Now let’s compare the unicast routing table from the client back to the source.
SW1#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
Known via "eigrp 1", distance 90, metric 20540160, type internal
Redistributing via eigrp 1
Last update from 150.1.37.3 on FastEthernet0/3, 00:11:23 ago
Routing Descriptor Blocks:
* 150.1.37.3, from 150.1.37.3, 00:11:23 ago, via FastEthernet0/3
Route metric is 20540160, traffic share count is 1
Total delay is 21100 microseconds, minimum bandwidth is 128 Kbit
Reliability 255/255, minimum MTU 1500 bytes
Loading 1/255, Hops 2
R3#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
Known via "eigrp 1", distance 90, metric 20514560, type internal
Redistributing via eigrp 1
Last update from 150.1.13.1 on Serial1/2, 00:11:47 ago
Routing Descriptor Blocks:
* 150.1.23.2, from 150.1.23.2, 00:11:47 ago, via Serial1/3
Route metric is 20514560, traffic share count is 1
Total delay is 20100 microseconds, minimum bandwidth is 128 Kbit
Reliability 255/255, minimum MTU 1500 bytes
Loading 1/255, Hops 1
150.1.13.1, from 150.1.13.1, 00:11:47 ago, via Serial1/2
Route metric is 20514560, traffic share count is 1
Total delay is 20100 microseconds, minimum bandwidth is 128 Kbit
Reliability 255/255, minimum MTU 1500 bytes
Loading 1/255, Hops 1
R2#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
Known via "connected", distance 0, metric 0 (connected, via interface)
Redistributing via eigrp 1
Routing Descriptor Blocks:
* directly connected, via FastEthernet0/0
Route metric is 0, traffic share count is 1
Based on this output we can see that SW1 sees the source reachable via R3, which was the neighbor the multicast packet came from. R3 sees the source reachable via R1 and R2 due to equal cost load-balancing, with R2 as the neighbor that the multicast packet came from. Finally R2 sees the source as directly connected, which is where the multicast packet came from. This means that the RPF check is successful as traffic is transiting the network, hence we had a successful transmission.
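The hop-by-hop comparison we just did by eye can be written down as a small Python sketch. This is only an illustration: the helper name `rpf_mismatches` is made up, and the data is typed in by hand from the “show ip mroute” and “show ip route” outputs above rather than collected from the devices.

```python
def rpf_mismatches(hops):
    """Return routers whose (S,G) incoming interface is not a unicast exit toward the source."""
    return [
        router
        for router, (incoming, unicast_exits) in hops.items()
        if incoming not in unicast_exits
    ]

# router: (incoming interface from "show ip mroute",
#          outgoing interfaces from "show ip route 150.1.124.4")
working = {
    "SW1": ("FastEthernet0/3", {"FastEthernet0/3"}),
    "R3": ("Serial1/3", {"Serial1/3", "Serial1/2"}),
    "R2": ("FastEthernet0/0", {"FastEthernet0/0"}),
}
print(rpf_mismatches(working))  # []
```

An empty list means every router along the path received the feed on an interface that its unicast table also uses to reach the source.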
Now let’s modify the routing table on R3 so that the route to R4 points to R1. Since the multicast packet on R3 comes from R2 and the unicast route will point back towards R1, there will be an RPF failure, and the packet transmission will not be successful. Again note that R1 is not routing multicast in this topology.
R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#ip route 150.1.124.4 255.255.255.255 serial1/2
R3(config)#end
R3#
R4#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 5
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: Ethernet0/0
Time to live [255]:
Source address: 150.1.124.4
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.124.4
.....
R4#
We can now see that on R4 we do not receive a response back from the final destination… so where do we start troubleshooting? First we want to look at the first hop away from the source, which in this case is R2. On R2 we want to look in the multicast routing table to see if the incoming interface and the outgoing interface list are correctly populated. Ideally we will see the incoming interface as FastEthernet0/0, which is directly connected to the source, and the outgoing interface as Serial0/1, which is the downstream interface facing R3.
R2#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:07:27/stopped, RP 0.0.0.0, flags: D
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Serial0/1, Forward/Dense, 00:07:27/00:00:00
FastEthernet0/0, Forward/Dense, 00:07:27/00:00:00
(150.1.124.4, 224.1.1.1), 00:07:27/00:01:51, flags: T
Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
Outgoing interface list:
Serial0/1, Forward/Dense, 00:04:46/00:00:00
This is the correct output we should see on R2. Two more verifications we can do are with the “show ip mroute count” command and the “debug ip mpacket” command. “show ip mroute count” will show all currently active multicast feeds, and whether packets are getting dropped:
R2#show ip mroute count
IP Multicast Statistics
3 routes using 1864 bytes of memory
2 groups, 0.50 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
Group: 224.1.1.1, Source count: 1, Packets forwarded: 4, Packets received: 4
Source: 150.1.124.4/32, Forwarding: 4/1/100/0, Other: 4/0/0
Group: 224.0.1.40, Source count: 0, Packets forwarded: 0, Packets received: 0
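As the header in this output states, the “Other” field on each source line is Total/RPF failed/Other drops. If you are checking many routers, a small script can pull those three numbers out for you. The following Python sketch assumes the line format shown above; the function name `parse_other_counts` is made up for illustration.

```python
import re

# Matches the "Other: total/rpf_failed/other_drops" field of a source line.
OTHER_RE = re.compile(r"Other:\s*(\d+)/(\d+)/(\d+)")

def parse_other_counts(line):
    """Return the three 'Other' counters as a dict, or None if the line has none."""
    match = OTHER_RE.search(line)
    if match is None:
        return None
    total, rpf_failed, other_drops = (int(value) for value in match.groups())
    return {"total": total, "rpf_failed": rpf_failed, "other_drops": other_drops}

line = "Source: 150.1.124.4/32, Forwarding: 4/1/100/0, Other: 4/0/0"
print(parse_other_counts(line))
# {'total': 4, 'rpf_failed': 0, 'other_drops': 0}
```

A non-zero `rpf_failed` value on a router is the cue to compare its unicast route to the source against the interface the feed is arriving on.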
“debug ip mpacket” will show the packet trace in real time, similar to the “debug ip packet” command for unicast packets. One caveat of using this verification is that only process switched traffic can be debugged. This means that we need to disable fast or CEF switching of multicast traffic by issuing the “no ip mroute-cache” command on the interfaces running PIM. Once this debug is enabled we’ll generate traffic from R4 again and we should see the packets correctly routed through R2.
R4#ping 224.1.1.1 repeat 100
Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
…
R2(config)#int f0/0
R2(config-if)#no ip mroute-cache
R2(config-if)#int s0/1
R2(config-if)#no ip mroute-cache
R2(config-if)#end
R2#debug ip mpacket
IP multicast packets debugging is on
R2#
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=231, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=232, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=233, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=234, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=235, prot=1, len=100(100), mforward
R2#undebug all
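On a busy router this debug scrolls by quickly, so it can help to capture it and tally the decision at the end of each line. The Python sketch below assumes the line format shown above, where the last comma-separated field is the decision (“mforward” here, or “not RPF interface” when the check fails); the function name `tally_mpacket` is made up for illustration.

```python
from collections import Counter

def tally_mpacket(debug_lines):
    """Count 'debug ip mpacket' lines by their final field (the forwarding decision)."""
    counts = Counter()
    for line in debug_lines:
        if not line.startswith("IP("):
            continue  # skip prompts and other console noise
        decision = line.rsplit(",", 1)[-1].strip()
        counts[decision] += 1
    return counts

captured = [
    "IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=231, prot=1, len=100(100), mforward",
    "IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=232, prot=1, len=100(100), mforward",
    "R2#undebug all",
]
print(tally_mpacket(captured)["mforward"])  # 2
```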
Now that we see that R2 is correctly routing the packets let’s look at all three of these verifications on R3.
R3#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:00:01/stopped, RP 0.0.0.0, flags: DC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Serial1/3, Forward/Dense, 00:00:01/00:00:00
Ethernet0/0, Forward/Dense, 00:00:01/00:00:00
(150.1.124.4, 224.1.1.1), 00:00:01/00:02:58, flags:
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Ethernet0/0, Forward/Dense, 00:00:02/00:00:00
Serial1/3, Forward/Dense, 00:00:02/00:00:00
From R3’s show ip mroute output we can see that the incoming interface is listed as Null. This is an indication that for some reason R3 is not correctly routing the packets, and is instead dropping them as they are received. For more information let’s look at the “show ip mroute count” output.
R3#show ip mroute count
IP Multicast Statistics
3 routes using 2174 bytes of memory
2 groups, 0.50 average sources per group
Forwarding Counts: Pkt Count/Pkts(neg(-) = Drops) per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
Group: 224.1.1.1, Source count: 1, Packets forwarded: 0, Packets received: 15
Source: 150.1.124.4/32, Forwarding: 0/0/0/0, Other: 15/15/0
Group: 224.0.1.40, Source count: 0, Packets forwarded: 0, Packets received: 0
From this output we can see that packets for (150.1.124.4,224.1.1.1) are getting dropped, and specifically the reason they are getting dropped is because of RPF failure. This is seen from the “Other: 15/15/0” output, where the second field is RPF failed drops. For more detail let’s look at the packet trace.
R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#int e0/0
R3(config-if)#no ip mroute-cache
R3(config-if)#int s1/3
R3(config-if)#no ip mroute-cache
R3(config-if)#end
R3#debug ip mpacket
IP multicast packets debugging is on
IP(0): s=150.1.124.4 (Serial1/3) d=224.1.1.1 id=309, ttl=253, prot=1, len=104(100), not RPF interface
IP(0): s=150.1.124.4 (Serial1/3) d=224.1.1.1 id=310, ttl=253, prot=1, len=104(100), not RPF interface
IP(0): s=150.1.124.4 (Serial1/3) d=224.1.1.1 id=311, ttl=253, prot=1, len=104(100), not RPF interface
IP(0): s=150.1.124.4 (Serial1/3) d=224.1.1.1 id=312, ttl=253, prot=1, len=104(100), not RPF interface
R3#undebug all
All possible debugging has been turned off
From this output we can clearly see that an RPF failure is occurring on R3. The reason why is that multicast packets are being received in Serial1/3, while the unicast route is pointing out Serial1/2. Now that we see the problem occurring there are a few different ways we can solve it.
First we can modify the unicast routing domain so that it conforms to the multicast routing domain. In our particular case this would be accomplished by removing the static route we configured on R3, or configuring a new more-preferred static route on R3 that points to R2 for 150.1.124.4.
Secondly we could modify the multicast domain in order to override the RPF check. The simplest way to do this is with a static multicast route using the “ip mroute” command. Another way of doing this dynamically would be to configure Multicast BGP, which for the purposes of this example we will exclude due to its greater complexity.
The “ip mroute” statement is unlike the regular “ip route” statement in that it does not affect the actual traffic flow through the network. Instead it affects which interfaces the router will accept multicast packets on. By configuring the statement “ip mroute 150.1.124.4 255.255.255.255 150.1.23.2” on R3 we tell the router that if a multicast packet is received from the source 150.1.124.4, it is okay for it to be received on the interface where the neighbor 150.1.23.2 exists. In our particular case this means that even though the unicast route points out Serial1/2, it’s okay that the multicast packet be received in Serial1/3. Let’s look at the effect of this:
R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#ip mroute 150.1.124.4 255.255.255.255 150.1.23.2
R3(config)#end
R3#
R4#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 5
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: Ethernet0/0
Time to live [255]:
Source address: 150.1.124.4
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.124.4
Reply to request 0 from 150.1.37.7, 32 ms
Reply to request 1 from 150.1.37.7, 32 ms
Reply to request 2 from 150.1.37.7, 32 ms
Reply to request 3 from 150.1.37.7, 32 ms
Reply to request 4 from 150.1.37.7, 32 ms
R4#
Note that the above solution of using the multicast static route does not affect the return path of unicast traffic from SW1 to R4. Therefore this solution allows for different traffic patterns for the unicast traffic and the multicast traffic.
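To see why the two traffic patterns can differ, here is a rough Python model of the lookups on R3 after the fix. It is a deliberate simplification and not how IOS is implemented: it assumes a matching static mroute is always preferred for the RPF lookup and ignores administrative distance, it stores the RPF interface directly instead of resolving the neighbor 150.1.23.2, and the names `longest_match` and `rpf_interface` are made up for illustration.

```python
import ipaddress

def longest_match(table, source):
    """Return the interface of the most specific entry covering source, or None."""
    addr = ipaddress.ip_address(source)
    matches = [
        (ipaddress.ip_network(prefix), interface)
        for prefix, interface in table
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]

def rpf_interface(static_mroutes, unicast_routes, source):
    """Simplified RPF lookup: a matching static mroute overrides the unicast table."""
    override = longest_match(static_mroutes, source)
    return override if override is not None else longest_match(unicast_routes, source)

# R3 after both static entries: the EIGRP route via R2 plus the /32 static via Serial1/2.
unicast = [("150.1.124.0/24", "Serial1/3"), ("150.1.124.4/32", "Serial1/2")]
mroutes = [("150.1.124.4/32", "Serial1/3")]  # toward neighbor 150.1.23.2

print(longest_match(unicast, "150.1.124.4"))           # Serial1/2 (unicast forwarding)
print(rpf_interface(mroutes, unicast, "150.1.124.4"))  # Serial1/3 (multicast RPF)
```

The unicast lookup is untouched by the mroute entry, which is the point of the note above: only the interface on which multicast from 150.1.124.4 is accepted changes.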