Wednesday, August 11, 2021

PTP link drops traffic that exceeds MTU (with and without DF bit)

Troubleshooting an issue unique to one link.

Typical traffic tests OK: speed tests, web browsing, etc. are all normal. Bandwidth, packet loss, jitter, RTT, and iperf TCP/UDP tests are all as expected.

But certain traffic fails. Tests show that something about this link is dropping (not fragmenting) UDP/ICMP packets when they exceed the MTU. In my case, this causes problems with TFTP transactions and certain Aruba AP control traffic.

On all other routers, I can ping an internal host on the far side of the router's link with any packet size I want. As long as I leave the df-bit off, it fragments and succeeds:

    working-rtr#ping 10.100.3.37
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

    working-rtr#ping 10.100.3.37 size 1500
    Type escape sequence to abort.
    Sending 5, 1500-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms

    working-rtr#ping 10.100.3.37 size 2000
    Type escape sequence to abort.
    Sending 5, 2000-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/9 ms

Adding in the df-bit, I get expected results:

    working-rtr#ping 10.100.3.37 size 1500 df-bit
    Type escape sequence to abort.
    Sending 5, 1500-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    Packet sent with the DF bit set
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms

    working-rtr#ping 10.100.3.37 size 1501 df-bit
    Type escape sequence to abort.
    Sending 5, 1501-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    Packet sent with the DF bit set
    .....
    Success rate is 0 percent (0/5)
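
For reference, here is a minimal Python sketch of the behavior I expect from any IPv4 hop with a 1500-byte MTU. The function name `expected_result` is just something I made up for illustration, and it assumes the ping `size` counts the whole IP datagram, which matches the 1500/1501 df-bit results above.

```python
def expected_result(datagram_size, mtu=1500, df_bit=False):
    """Return what an IPv4 hop should do with a datagram of this total size."""
    if datagram_size <= mtu:
        return "forward"      # fits, goes out unchanged
    if df_bit:
        return "drop"         # too big and not allowed to fragment
    return "fragment"         # too big, split into multiple packets


# The four cases tested from the working router.
for size, df in [(1500, True), (1501, True), (1501, False), (2000, False)]:
    label = "df-bit" if df else "no df-bit"
    print(size, label, "->", expected_result(size, df_bit=df))
```

Only the "drop" case should ever show up as a failed ping, and only when the df-bit is set.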

But on the router behind this PTP, I get inconsistent results.

I can send 1500-byte pings with and without df-bit like normal:

    rtr-behind-ptp#ping 10.100.3.37
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 25/25/27 ms

    rtr-behind-ptp#ping 10.100.3.37 size 1500 df-bit
    Type escape sequence to abort.
    Sending 5, 1500-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    Packet sent with the DF bit set
    !!!!!

...and as expected, I can't send more than 1500 with the df-bit:

```
rtr-behind-ptp#ping 10.100.3.37 size 1501 df-bit
Type escape sequence to abort.
Sending 5, 1501-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)
```

But unlike all other routers, I can only get to 1504 without the df-bit. Beyond 1504 the pings fail.

    rtr-behind-ptp#ping 10.100.3.37 size 1501
    Type escape sequence to abort.
    Sending 5, 1501-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 8/8/10 ms

    rtr-behind-ptp#ping 10.100.3.37 size 1502
    Type escape sequence to abort.
    Sending 5, 1502-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 8/9/11 ms

    rtr-behind-ptp#ping 10.100.3.37 size 1503
    Type escape sequence to abort.
    Sending 5, 1503-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 8/9/11 ms

    rtr-behind-ptp#ping 10.100.3.37 size 1504
    Type escape sequence to abort.
    Sending 5, 1504-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 8/9/11 ms

    rtr-behind-ptp#ping 10.100.3.37 size 1505
    Type escape sequence to abort.
    Sending 5, 1505-byte ICMP Echos to 10.100.3.37, timeout is 2 seconds:
    .....
    Success rate is 0 percent (0/5)
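
To make the 1504 vs 1505 boundary concrete, here is a rough Python sketch of how I would expect these pings to be fragmented at a 1500-byte MTU. It assumes a 20-byte IP header with no options and that the ping `size` is the total IP datagram size; `fragment_sizes` is a made-up helper name, not anything from the routers.

```python
IP_HEADER = 20  # bytes, assuming no IP options


def fragment_sizes(datagram_size, mtu=1500):
    """Total size (header + data) of each IPv4 fragment for one datagram."""
    if datagram_size <= mtu:
        return [datagram_size]
    payload = datagram_size - IP_HEADER
    # Data in every fragment except the last must be a multiple of 8 bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while payload > 0:
        data = min(payload, max_data)
        sizes.append(IP_HEADER + data)
        payload -= data
    return sizes


for size in (1501, 1504, 1505, 2000):
    print(size, fragment_sizes(size))
```

Under those assumptions, a 1504-byte ping becomes a 1500-byte fragment plus a 24-byte fragment, and a 1505-byte ping becomes a 1500-byte fragment plus a 25-byte fragment. If the router really is fragmenting this way, nothing larger than 1500 bytes should reach the radios in either case, which is why the cutoff at 1504 confuses me.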

Routers on both sides of this link have typical/default interface configs, the same as other working links that have no problem passing pings > 1500 bytes.

This PTP link is built with Cambium PTP550 radios with a 1542-byte MTU on all units.
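
As a sanity check on that number, here is a small Python sketch of the on-wire Ethernet frame size for a full 1500-byte IP packet. The header sizes are standard Ethernet values, but whether the radio's 1542-byte MTU counts the Ethernet header, VLAN tags, or FCS is an assumption on my part; I have not confirmed how Cambium measures it. `frame_size` is a made-up helper name.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
VLAN_TAG = 4      # one 802.1Q tag
FCS = 4           # frame check sequence
RADIO_MTU = 1542  # value configured on the PTP550 radios


def frame_size(ip_size, vlan_tags=0, include_fcs=True):
    """Ethernet frame size in bytes for an IP packet of ip_size bytes."""
    size = ip_size + ETH_HEADER + VLAN_TAG * vlan_tags
    if include_fcs:
        size += FCS
    return size


# Untagged, single-tagged, and double-tagged frames carrying 1500 bytes of IP.
for tags in (0, 1, 2):
    size = frame_size(1500, vlan_tags=tags)
    print(tags, "tag(s):", size, "bytes, fits:", size <= RADIO_MTU)
```

Even counting the FCS and two VLAN tags, a full-size frame stays under 1542 bytes, so on paper the radios should have headroom for anything the routers send.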

The radios are strictly L2; they are not part of the routing path. The branch router's default route points directly to our core router. And again, all typical traffic routes fine.

Opened a case with Cambium support, but I don't see how this could be a Cambium issue, because I expect the router interface's 1500-byte MTU to trigger fragmentation before packets hit the PTP link. I assume I missed something?

Let me know if you need other details. Thanks!


