This is the story of being bitten by something I completely forgot existed: ICMP Redirect.

I should have remembered it. It’s on the CCNP Route exam topics. I’ve studied it. I’ve been doing this a long time. Somehow I missed it. Thanks to the good people of Reddit (/r/networking) for solving this mystery.

This network example has been cleaned up a bit, but I’ve left in a lot of the stuff that made this a real-world network. That should help explain why I was so confused.

Our story starts with a radio station (HQ) and a transmitter up on top of a mountain (Remote). For years the mountain top had been uplinked with a T1 circuit. A long time ago, there had just been a dumb switch up there, and T1 LAN extenders had been used to extend VLAN 200. A few years ago, the dumb switch was replaced with a Cisco 3650 layer 3 switch, and VLAN 300 was created at the mountain.

A static route on the HQ switch sent traffic for VLAN 300 (10.30.200.0/24) over the T1, using a next hop of 10.1.200.254. That address was the Remote switch’s SVI on VLAN 200, so both switches had an interface on the same 10.1.200.0/24 subnet.

An audio-over-IP sending appliance was configured at HQ with an address 10.1.200.60 (on VLAN 200). A receiving AoIP device was configured on the mountain at 10.30.200.90. For a couple of years, UDP audio streamed between the two without a care in the world.

T1s are expensive, and there’s line of sight between HQ and the mountain, so a 100 Mbps microwave link was added. This looks just like a 100 meg Ethernet connection between the two switches. But instead of using a VLAN this time, the two switchports involved were turned into layer 3 ports and configured on a /30.

! HQ Switch (the Remote Switch end of the /30 is 10.5.1.2)
interface g1/3
 no switchport
 ip address 10.5.1.1 255.255.255.252

Since this is live radio, even in the middle of the night, cutting over to the new microwave link had to be done with minimal downtime. No problem. Just change the static route on the HQ switch, and change the default route on the Remote switch.

! HQ Switch -
ip route 10.30.200.0 255.255.255.0 10.5.1.2
no ip route 10.30.200.0 255.255.255.0 10.1.200.254

! Remote Switch -
ip route 0.0.0.0 0.0.0.0 10.5.1.1
no ip route 0.0.0.0 0.0.0.0 10.1.200.1

Wow! That was great! No dropped packets from the audio devices when that change was made! It’s almost too perfect… Well, it was. When I looked at the traffic on the T1, it should have dropped to almost nothing, but that audio traffic was still going. WTH?!

So I did a couple of packet captures and noticed something interesting. The audio was sure enough going over the T1, but other traffic from elsewhere in the network (like network monitoring traffic) was going over the microwave. Huh?!

This is where I really start to feel like I’m describing an elephant with my eyes closed. Maybe there’s something about these audio packets I don’t understand? Is there something about the Routing Information Base I don’t understand? Cisco Express Forwarding?

Before I went too far down the rabbit hole, the lead audio engineer suggested rebooting the audio appliance at HQ. There would be some dead air, but now it’s 1am, and there just aren’t many people listening to the radio. So we reboot the box, and now audio is going over the microwave! Yes! It works! I can go home now!

But the next morning, I was still confused. It worked, but WHY did rebooting the audio appliance help? The good people at Reddit told me why. ICMP Redirect.

Consider this diagram:

Host 1 is on the same subnet as Router 1 and Router 2. Host 1 wants to talk to Host 2. Realizing Host 2’s IP address is on a different subnet, it sends packets to its gateway (Router 1). Router 1 has a route to Host 2’s network, and that route is through Router 2.

Router 1 realizes this is sub-optimal. Why should Host 1 send packets to Router 1, just so Router 1 can send those packets right back out the same interface, over to Router 2? So Router 1 sends an ICMP Redirect message to Host 1. It says, hey, instead of sending your packets destined for Host 2 to me, just send them directly to Router 2.
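Here is a rough sketch of that decision in Python, just to make the conditions concrete. It is not how any particular router implements it. The function name, the interface names, and the 192.0.2.x and 198.51.100.x addresses are all made up for illustration, and real routers check more than this (for example, source-routed packets are typically excluded).

```python
import ipaddress

def should_send_redirect(src_ip, next_hop_ip, in_iface, out_iface,
                         iface_network, redirects_enabled=True):
    """Rough model of a router deciding whether to send an ICMP Redirect."""
    if not redirects_enabled:
        return False
    # The packet must leave through the same interface it arrived on.
    if in_iface != out_iface:
        return False
    # The sender and the better next hop must share that interface's subnet,
    # otherwise the sender could not reach the next hop directly.
    net = ipaddress.ip_network(iface_network)
    return (ipaddress.ip_address(src_ip) in net
            and ipaddress.ip_address(next_hop_ip) in net)

# Host 1 -> Router 1 -> Router 2, all on one subnet: a redirect makes sense.
print(should_send_redirect("192.0.2.10", "192.0.2.2",
                           "eth0", "eth0", "192.0.2.0/24"))    # True

# Same host, but the route now leaves via a different interface: no redirect.
print(should_send_redirect("192.0.2.10", "198.51.100.2",
                           "eth0", "eth1", "192.0.2.0/24"))    # False
```

The second call is the interesting one for this story: once the route points out a different interface, the router has no reason to send a redirect at all.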

It’s up to whoever wrote the TCP/IP stack for Host 1’s operating system to implement ICMP Redirects. But ideally, Host 1 gets that message, and all future packets for Host 2 are wrapped in a frame that has Router 2’s MAC address as the destination, not Router 1’s. Eventually Host 1 should let that learned route expire and check back in with its configured gateway, but again, when (or whether) that happens is up to whoever wrote the TCP/IP stack.

I had never run into this before, because I’ve rarely worked on a network that has hosts on the same subnet as multiple routers. It’s just not very good network design.

Back to the real world.
When I turned up the microwave link, I now expected the audio traffic to change paths because I changed the static routes. But the HQ audio appliance had been told by the HQ Switch, via ICMP Redirect, to send frames directly to Remote Switch’s MAC address on the 10.1.200.0/24 subnet. And since this was a continuous stream of audio traffic, who knows how long it might have been before it got any kind of updated instruction from HQ Switch. 5 minutes? 5 hours? Never?

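When we rebooted the audio appliance, it forgot the redirect and went back to sending packets to its configured gateway, HQ Switch. All was well after that because HQ Switch never felt the need to send another ICMP Redirect, since its route to the mountain now pointed out a different interface.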
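To convince myself of the sequence of events, here is a toy model of the appliance's side of things. This is a minimal sketch, not the appliance's real TCP/IP stack, which I never saw. The `Host` class and its method names are hypothetical, and I'm assuming the appliance's default gateway was HQ Switch at 10.1.200.1 (the address the Remote switch's old default route pointed to).

```python
class Host:
    """Toy host that remembers ICMP Redirects until it is rebooted."""

    def __init__(self, default_gateway):
        self.default_gateway = default_gateway
        # destination IP -> gateway learned from an ICMP Redirect
        self.redirect_cache = {}

    def receive_redirect(self, destination, better_gateway):
        self.redirect_cache[destination] = better_gateway

    def next_hop(self, destination):
        # A learned redirect wins over the configured default gateway.
        return self.redirect_cache.get(destination, self.default_gateway)

    def reboot(self):
        # Rebooting throws away everything learned from redirects.
        self.redirect_cache.clear()


appliance = Host(default_gateway="10.1.200.1")   # assumed: HQ Switch SVI

# Years ago: HQ Switch redirects the appliance to Remote Switch's SVI.
appliance.receive_redirect("10.30.200.90", "10.1.200.254")
print(appliance.next_hop("10.30.200.90"))   # 10.1.200.254 (over the T1)

# Cutover night: the static route changes on HQ Switch, but nothing
# tells the appliance, so it keeps using what it learned.
print(appliance.next_hop("10.30.200.90"))   # still 10.1.200.254

# Reboot: the cache is gone, traffic goes back to HQ Switch,
# which now forwards it over the microwave link.
appliance.reboot()
print(appliance.next_hop("10.30.200.90"))   # 10.1.200.1
```

A real stack would normally age these entries out on its own, which this sketch deliberately leaves out, since in my case I had no idea how long that would take.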

Could I have fixed this without rebooting the appliance? I’m not really sure.
Doing a shut / no shut on the switchport it was connected to would have been a good start.
I could have disabled ICMP Redirect on the SVI with this command:

HQSwitch(config)#interface vlan 200
HQSwitch(config-if)#no ip redirects

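Well, at this point, that’s kind of closing the barn door after the cow has escaped. It stops HQ Switch from sending new redirects, but it does nothing about the one the appliance had already learned.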

Probably best to just avoid network designs like this in the future, and remember that ICMP Redirect exists.

In the end, this was a classic case of missing a piece of information, the network not acting the way I expected it to, and questioning everything I ever knew.