Modifying Traceroute Replies
In this post we get back to basics, and the roots of the CCIE lab exam – stupid router tricks. :) Recently I was posed the following question:
Can I force my router to reply to a traceroute with the IP of a certain interface, rather than the usual behavior of replying with the IP address of the ingress interface?
The short answer: yes. The slightly longer answer: yes, but not the way you probably think you can. To understand how and why you can do this, first let’s review how a traceroute works.
Traceroute is not a protocol on its own, but rather a method of discovering the route a packet takes from point A to point B. Various implementations of traceroute work differently behind the scenes, with some using ICMP, some UDP, and some even TCP. The ultimate goal of every implementation is the same though: get the nodes along the path from point A to point B to reveal themselves, which in turn tells us the route that packets take.
For the sake of this example we’re going to focus on the Cisco IOS implementation of traceroute, which like the Linux implementation uses outbound UDP messages in order to solicit inbound ICMP replies. Specifically, when a router generates a traceroute, the outbound probes are unicast UDP packets sent to the specified destination address. They use a range of high destination ports, and the Time to Live (TTL) is incremented by one with each successive round of probes.
The basic idea is that you start sending packets with a TTL of 1, which means that the first hop receiving device decrements the TTL to 0. Per the basic IPv4 specification if a packet’s TTL reaches 0 it must be discarded, and the source should be notified of this drop with an error message. Specifically the error message is an ICMP Time Exceeded (i.e. TTL Expired), which is ICMP Type 11 Code 0. When this first hop replies with TTL Expired, the traceroute source now knows the first node in the path to the destination. The TTL is then incremented by one and the next probe is sent. The result is that the node 2 hops away must decrement the TTL to 0 and reply back with TTL Expired. The process repeats until the final destination is reached. Assuming the destination does not have a UDP service listening on the requested port – which is unlikely since a range of random high ports is used – the final destination will reply back with Port Unreachable, which is ICMP Type 3 Code 3, and the final path from the source to the destination is now known.
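The TTL walk described above can be modeled without sending a single packet. The following is a minimal Python sketch, not how IOS implements it: the `Reply` class, the `send_probe` and `traceroute` functions, and the `PATH` table are hypothetical names made up for illustration, and the hop addresses are taken from the example topology used in this post.

```python
from dataclasses import dataclass

ICMP_TIME_EXCEEDED = (11, 0)     # ICMP Type 11, Code 0
ICMP_PORT_UNREACHABLE = (3, 3)   # ICMP Type 3, Code 3


@dataclass
class Reply:
    source: str    # address the ICMP error is sourced from
    icmp: tuple    # (type, code)


def send_probe(path, destination, ttl):
    """Walk one UDP probe along `path` and return the ICMP error it triggers.

    `path` is an ordered list of (reply_address, owned_addresses), one per hop.
    """
    for reply_address, owned in path:
        if destination in owned:
            # Final destination: nothing listens on the high UDP port.
            return Reply(reply_address, ICMP_PORT_UNREACHABLE)
        ttl -= 1  # a transit hop decrements the TTL before forwarding
        if ttl == 0:
            return Reply(reply_address, ICMP_TIME_EXCEEDED)
    return None  # probe fell off the end of the modeled path


def traceroute(path, destination, max_ttl=30):
    """Send one probe per TTL value until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = send_probe(path, destination, ttl)
        if reply is None:
            break
        hops.append((ttl, reply.source, reply.icmp))
        if reply.icmp == ICMP_PORT_UNREACHABLE:
            break
    return hops


# Assumed model of the path from R1: R2 first, then R3 (owner of 3.3.3.3).
PATH = [
    ("10.0.12.2", {"10.0.12.2", "10.0.23.2"}),
    ("10.0.23.3", {"10.0.23.3", "3.3.3.3"}),
]

for hop in traceroute(PATH, "3.3.3.3"):
    print(hop)
```

The sketch sends one probe per TTL for brevity, whereas IOS sends three. The point to notice is that the loop ends on a Port Unreachable, not on a reply from the destination address itself.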
To illustrate how this works in IOS, take the following topology: R1, R2, and R3 are connected in series, with R1 as the traceroute source and R3’s Loopback 3.3.3.3 as the destination.
R1#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
  1 10.0.12.2 0 msec 4 msec 0 msec
  2 10.0.23.3 4 msec * 4 msec
R1 first sends three probes with a TTL of 1. R2 receives these, decrements the TTL to 0, and replies with ICMP Time Exceeded. R1 then sends the next three probes with a TTL of 2. R2 forwards them, and R3 receives them, sees that it is the final destination but has no service listening on the requested port, and replies with ICMP Port Unreachable. The * in place of the second reply from R3 is expected, since IOS rate-limits the ICMP unreachables it generates. The details of this flow can be seen in the debug output below.
R1#debug ip icmp
ICMP packet debugging is on
R1#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
  1 10.0.12.2 0 msec 4 msec 0 msec
  2 10.0.23.3 4 msec * 4 msec
ICMP: time exceeded rcvd from 10.0.12.2
ICMP: time exceeded rcvd from 10.0.12.2
ICMP: time exceeded rcvd from 10.0.12.2
ICMP: dst (10.0.12.1) port unreachable rcv from 10.0.23.3
ICMP: dst (10.0.12.1) port unreachable rcv from 10.0.23.3
Note that neither of these replies is sourced from the traceroute destination 3.3.3.3. Instead, the routers in the path generate the packets with the IP addresses assigned to their transit interfaces. This follows from a basic principle: when the router's control plane locally generates a packet (e.g. an OSPF hello, a Telnet request, etc.), the source address used is the primary IPv4 address of the outgoing interface returned by the routing table lookup.
For example in the above case, before R1 sends the packet to 3.3.3.3 it must first choose the source. To do so, the router internally checks the routing table as follows:
R1#show ip route 3.3.3.3
Routing entry for 3.3.3.3/32
Known via "ospf 1", distance 110, metric 3, type intra area
Last update from 10.0.12.2 on FastEthernet0/0.12, 00:37:13 ago
Routing Descriptor Blocks:
* 10.0.12.2, from 3.3.3.3, 00:37:13 ago, via FastEthernet0/0.12
Route metric is 3, traffic share count is 1
The outgoing interface for this packet is FastEthernet0/0.12. Next the router finds the IP address assigned to that link, and that becomes the source of the packet:
R1#show ip interface FastEthernet0/0.12
FastEthernet0/0.12 is up, line protocol is up
Internet address is 10.0.12.1/24
[snip]
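The two lookups above, route first and interface address second, can be condensed into a short Python sketch. This is only an illustration of the principle, not IOS internals: `ROUTES`, `INTERFACES`, and `select_source` are hypothetical names, and the connected 10.0.12.0/24 entry is an assumption inferred from the interface's /24 address.

```python
import ipaddress

# Assumed subset of R1's routing table: (prefix, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("3.3.3.3/32"), "FastEthernet0/0.12"),
    (ipaddress.ip_network("10.0.12.0/24"), "FastEthernet0/0.12"),
]

# Primary IPv4 address of each interface.
INTERFACES = {"FastEthernet0/0.12": "10.0.12.1"}


def select_source(destination, routes, interfaces):
    """Return the source address for a locally generated packet."""
    dst = ipaddress.ip_address(destination)
    # Step 1: longest-prefix match to find the outgoing interface.
    matches = [(net, ifname) for net, ifname in routes if dst in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    _, out_if = max(matches, key=lambda match: match[0].prefixlen)
    # Step 2: the primary address of that interface becomes the source.
    return interfaces[out_if]


print(select_source("3.3.3.3", ROUTES, INTERFACES))
```

R2 and R3 go through the same two steps when they build their ICMP replies towards 10.0.12.1, which is why each one answers from a transit interface address.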
Per the show ip interface output, R1 uses the source address 10.0.12.1 to generate the traceroute probes towards 3.3.3.3. On the reverse path, both R2 and R3 must do a routing lookup on 10.0.12.1 in order to reply, and each sources its reply from the outgoing interface that the lookup returns. Per the debug output on R1, these source addresses were 10.0.12.2 and 10.0.23.3 respectively. Now comes the original question: can we change the address the router replies from?
Unfortunately it's not as simple as saying ip traceroute source-interface or similar, as IOS has no such command. If we can't tell the router to generate the packet from a certain interface, what if we were just to change the source address of the packets inline? What other tool is available to us in IOS that we can use to modify packets in the data plane? The answer: Network Address Translation (NAT).
We could create a NAT policy on the routers in the traceroute transit path to match on ICMP TTL Exceeded and ICMP Port Unreachable traffic coming from the router, and then change the source address to anything of our choosing. Specifically this configuration would look as follows on R2:
R2:
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
interface FastEthernet0/0.12
 ip nat outside
!
interface FastEthernet0/0.23
 ip nat outside
!
ip nat inside source list TRACE_REPLIES interface Loopback0 overload
!
ip access-list extended TRACE_REPLIES
 permit icmp any any ttl-exceeded
 permit icmp any any port-unreachable
The end result is as follows:
R1#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
  1 2.2.2.2 4 msec 0 msec 0 msec
  2 10.0.23.3 4 msec * 0 msec

R3#traceroute 1.1.1.1
Type escape sequence to abort.
Tracing the route to 1.1.1.1
  1 2.2.2.2 0 msec 0 msec 0 msec
  2 10.0.12.1 4 msec * 0 msec
The locally originated traffic on R2 is treated as NAT Inside, and therefore matches the Inside to Outside policy. Note that R1 and R3's replies aren't NATed by R2, because the transit traffic is going between two NAT Outside interfaces, and there isn't an Outside to Outside policy defined. The specific translations on R2 can be seen as follows:
R2#show ip nat translations
Pro Inside global      Inside local       Outside local      Outside global
udp 2.2.2.2:4501       10.0.12.2:33434    10.0.12.1:49294    10.0.12.1:49294
udp 2.2.2.2:4502       10.0.12.2:33435    10.0.12.1:49295    10.0.12.1:49295
udp 2.2.2.2:4503       10.0.12.2:33436    10.0.12.1:49296    10.0.12.1:49296
udp 2.2.2.2:4504       10.0.23.2:33434    10.0.23.3:49205    10.0.23.3:49205
udp 2.2.2.2:4505       10.0.23.2:33435    10.0.23.3:49206    10.0.23.3:49206
udp 2.2.2.2:4506       10.0.23.2:33436    10.0.23.3:49207    10.0.23.3:49207
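The effect of the policy on R2 can be summarized as a small decision function. The Python below is a rough model under stated assumptions, not a description of how IOS NAT works internally: `IcmpPacket`, `translate`, and the `locally_generated` flag are hypothetical constructs, and the flag stands in for "treated as NAT Inside" in this configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IcmpPacket:
    source: str
    destination: str
    icmp_type: int
    icmp_code: int
    locally_generated: bool  # True if R2 itself built the packet


# What the TRACE_REPLIES access list permits, as (type, code) pairs.
TRACE_REPLIES = {
    (11, 0),  # ttl-exceeded
    (3, 3),   # port-unreachable
}
LOOPBACK0 = "2.2.2.2"


def translate(packet):
    """Rewrite the source of R2's own traceroute replies to Loopback0."""
    matches_acl = (packet.icmp_type, packet.icmp_code) in TRACE_REPLIES
    if packet.locally_generated and matches_acl:
        return replace(packet, source=LOOPBACK0)
    # Transit traffic between two NAT Outside interfaces is left alone.
    return packet


own_reply = IcmpPacket("10.0.12.2", "10.0.12.1", 11, 0, True)
transit_reply = IcmpPacket("10.0.23.3", "10.0.12.1", 3, 3, False)
print(translate(own_reply).source)
print(translate(transit_reply).source)
```

R2's own Time Exceeded leaves with a source of 2.2.2.2, while R3's Port Unreachable passes through unchanged, which matches the traceroute output on R1.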
Why would you ever want to do this? One valid case is when you don't advertise your transit links. For example, a Service Provider could use RFC 1918 addresses on its transit links and assign a public address to the Loopback of each router. Normally traffic isn't destined to the router, only through it, so there's no reason to advertise the transit links into IGP or BGP. When a traceroute goes through the router, however, it replies from an address that the source doesn't know, e.g. 192.168.1.100. Assuming the router has at least one public address, you could use NAT to change the source of the reply to an address that the traceroute source does know. This design is kind of a stretch, though.
The more practical application of this is as a stupid router trick in the CCIE Lab Exam. There are lots of interesting routing problems that can be solved by using NAT, so be sure to keep it in your toolkit for the exam.