Feb. 9, 2009, 7:07 a.m.
posted by whitehat
Using traceroute to Test Connectivity
Another tool for network troubleshooting is the traceroute command. It lists every router hop between your server and the target server, which helps you verify that routing over the networks in between is correct.
The traceroute command works by sending a UDP packet destined for the target with a TTL of 1. The first router on the route decrements the TTL to 0, recognizes that the TTL has expired, and drops the packet, but also sends an ICMP time exceeded message back to the source. The traceroute program records the IP address of the router that sent the message and knows that it is the first hop on the path to the final destination. The traceroute program then tries again with a TTL of 2. The first hop sees nothing wrong with the packet, decrements the TTL to 1 as expected, and forwards the packet to the second hop on the path. Router 2 decrements the TTL to 0, drops the packet, and replies with an ICMP time exceeded message. traceroute now knows the IP address of the second router. This continues, with the TTL increased by one on each round, until the final destination is reached.
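The hop-by-hop discovery described above can be illustrated with a small simulation. This is only a sketch, not how the real traceroute program is implemented: it sends no packets, the path is a hypothetical list of addresses, and the function names send_probe and trace are invented for illustration. It starts at a TTL of 1, as standard traceroute implementations do.

```python
from typing import List, Tuple


def send_probe(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one probe along `path` (routers first, destination last).

    Returns the address that answered and the kind of reply.
    """
    for router in path[:-1]:
        ttl -= 1  # every router decrements the TTL before forwarding
        if ttl == 0:
            # TTL expired here: the router drops the packet and reports back
            return router, "time-exceeded"
    return path[-1], "destination-reached"


def trace(path: List[str], max_hops: int = 30) -> List[str]:
    """Discover the path one hop at a time by raising the TTL on each round."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, reply = send_probe(path, ttl)
        hops.append(responder)
        if reply == "destination-reached":
            break
    return hops


if __name__ == "__main__":
    # Hypothetical three-hop path, for illustration only
    for number, address in enumerate(trace(["10.0.0.1", "10.0.1.1", "192.0.2.50"]), 1):
        print(number, address)
```

Each round reveals exactly one more address, which is why a trace to a distant host takes as many rounds as there are hops.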
Sample traceroute Output
Here is a sample output for a query to 144.232.20.158. Notice that all the hop times are under 50 milliseconds (ms), which is acceptable:
[root@bigboy tmp]# traceroute -I 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte
packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110)
14.408 ms 14.064 ms 13.111 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 13.018 ms
12.887 ms 13.146 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 12.854 ms 13.035
ms 13.745 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 16.260 ms
15.618 ms 15.663 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 15.897 ms
15.785 ms 17.164 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 14.443 ms
16.279 ms 15.189 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.185 ms
15.857 ms 15.423 ms
8 sl-bb23-ana-6-0.another-isp-provider.net (144.232.20.158) 27.482 ms
26.306 ms 26.487 ms
[root@bigboy tmp]#
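Checking every probe time by eye gets tedious on long traces. As a rough sketch, the following pulls the millisecond values out of traceroute text and reports any above a limit. The 50 ms limit comes from the text above; the function names and the regular expression are assumptions for illustration, and the SAMPLE string holds two hops copied from the output above, line wrapping included.

```python
import re
from typing import List

# Two hops copied from the sample output, including the wrapped lines
SAMPLE = """\
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 12.854 ms 13.035
ms 13.745 ms
8 sl-bb23-ana-6-0.another-isp-provider.net (144.232.20.158) 27.482 ms
26.306 ms 26.487 ms
"""

# A number followed by whitespace (possibly a line break) and the unit "ms"
TIME_PATTERN = re.compile(r"(?<![\d.])(\d+(?:\.\d+)?)\s+ms\b")


def probe_times_ms(output: str) -> List[float]:
    """Return every probe time found in the output, in order of appearance."""
    return [float(value) for value in TIME_PATTERN.findall(output)]


def slow_probes(output: str, limit_ms: float = 50.0) -> List[float]:
    """Return the probe times that exceed limit_ms."""
    return [t for t in probe_times_ms(output) if t > limit_ms]


if __name__ == "__main__":
    print("slowest probe:", max(probe_times_ms(SAMPLE)), "ms")
    print("probes over the limit:", slow_probes(SAMPLE))
```

An empty result from slow_probes means every probe came back within the limit.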
Possible traceroute Messages
There are a number of possible message codes traceroute can give; these are listed in Figure.
traceroute Time Exceeded False Alarms
If there is no response within a 5-second timeout interval, an asterisk (*) is printed for that probe, as seen in the following example:
[root@bigboy tmp]# traceroute 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte
packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110)
14.304 ms 14.019 ms 16.120 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 12.971 ms
14.000 ms 14.627 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 15.521 ms 12.860
ms 13.179 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 13.991 ms
15.842 ms 15.728 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 16.133 ms
15.510 ms 15.909 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 16.510 ms
17.469 ms 18.116 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.212 ms
14.274 ms 15.926 ms
8 * * *
9 * * *
[root@bigboy tmp]#
Some devices block the UDP probe packets that traceroute sends by default, but allow ICMP packets. Using traceroute with the -I flag forces it to send ICMP packets instead, and these may get through. In this case the * * * status messages disappear:
[root@bigboy tmp]# traceroute -I 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte
packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110)
14.408 ms 14.064 ms 13.111 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 13.018 ms
12.887 ms 13.146 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 12.854 ms 13.035
ms 13.745 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 16.260 ms
15.618 ms 15.663 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 15.897 ms
15.785 ms 17.164 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 14.443 ms
16.279 ms 15.189 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.185 ms
15.857 ms 15.423 ms
8 sl-bb23-ana-6-0.another-isp-provider.net (144.232.20.158) 27.482 ms
26.306 ms 26.487 ms
[root@bigboy tmp]#
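The retry decision shown in the two traces above can be sketched in a few lines: look for hops where every probe timed out and, if any exist, suggest the same trace with -I. The function names are hypothetical, the code only builds the command rather than running it, and it assumes a traceroute that accepts the -I flag as in the example above.

```python
import re
from typing import List, Optional

# A hop line consisting only of a hop number and asterisks, such as "8 * * *"
SILENT_HOP = re.compile(r"\s*(\d+)\s+(?:\*\s*)+")


def silent_hops(output: str) -> List[int]:
    """Return the hop numbers at which no probe was answered."""
    hops = []
    for line in output.splitlines():
        match = SILENT_HOP.fullmatch(line)
        if match:
            hops.append(int(match.group(1)))
    return hops


def retry_command(target: str, output: str) -> Optional[List[str]]:
    """Suggest an ICMP-based retry if the trace contains silent hops."""
    if silent_hops(output):
        return ["traceroute", "-I", target]
    return None


if __name__ == "__main__":
    tail = "7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.212 ms\n8 * * *\n9 * * *\n"
    print(silent_hops(tail))
    print(retry_command("144.232.20.158", tail))
```

The returned list is in the form accepted by subprocess.run, should you want to execute the retry.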
traceroute Internet Slowness False Alarm
The following traceroute gives the impression that a Web site at 80.40.118.227 might be slow because there is congestion along the way at hops 6 and 7, where the response time is over 200 ms:
C:\>tracert 80.40.118.227
1 1 ms 2 ms 1 ms 66.134.200.97
2 43 ms 15 ms 44 ms 172.31.255.253
3 15 ms 16 ms 8 ms 192.168.21.65
4 26 ms 13 ms 16 ms 64.200.150.193
5 38 ms 12 ms 14 ms 64.200.151.229
6 239 ms 255 ms 253 ms 64.200.149.14
7 254 ms 252 ms 252 ms 64.200.150.110
8 24 ms 20 ms 20 ms 192.174.250.34
9 91 ms 89 ms 60 ms 192.174.47.6
10 17 ms 20 ms 20 ms 80.40.96.12
11 30 ms 16 ms 23 ms 80.40.118.227
Trace complete.
C:\>
This indicates only that the devices at hops 6 and 7 were slow to respond with ICMP TTL exceeded messages; it does not indicate congestion, latency, or packet loss. If any of those conditions existed, all points past the problematic link would show high latency. Many Internet routing devices give very low priority to traffic related to traceroute in favor of revenue-generating traffic.
traceroute Dies at the Router Just Before the Server
In this case the last device to respond to the traceroute just happens to be the router that acts as the default gateway of the server. The problem is not with the router, but with the server. Remember, you will only receive traceroute responses from functioning devices. Possible causes of this problem include the following:
C:\>tracert 80.40.100.18
Tracing route to 80.40.100.18 over a maximum of 30 hops
1 33 ms 49 ms 28 ms 192.168.1.1
2 33 ms 49 ms 28 ms 65.14.65.19
3 33 ms 32 ms 32 ms 81.25.68.252
4 47 ms 32 ms 31 ms 80.40.97.1
5 29 ms 28 ms 32 ms 80.40.96.114
6 * * * Request timed out.
7 ^C
C:\>
Always Get a Bidirectional traceroute
It is always best to get a traceroute from the source IP to the target IP and also from the target IP to the source IP. This is because the packet's return path from the target is sometimes not the same as the path taken to get there. The time traceroute reports for a hop is a round-trip time: it covers both the probe's journey to that hop and the journey of the hop's response back to the source.
Here is an example of one such case, using disguised IP addresses and provider names. There was once a routing issue between telecommunications carriers FastNet and SlowNet. When a user at IP address 64.25.175.200 did a traceroute to 40.16.106.32, a problem seemed to appear at the 10th hop with OtherNet. However, when a user at 40.16.106.32 did a traceroute to 64.25.175.200, latency showed up at hop 7, with the return path being very different. In this case, the real traffic congestion was occurring where FastNet handed off traffic to SlowNet in the second trace. The latency appeared at hop 10 on the first trace not because that hop was slow, but because that was the first hop whose return packet traveled back to the source via the congested route. Remember, traceroute gives the packet round-trip time:
Trace route to 40.16.106.32 from 64.25.175.200
1 0 ms 0 ms 0 [64.25.175.200]
2 0 ms 0 ms 0 [64.25.175.253]
3 0 ms 0 ms 0 border-from-40-tesser.my-isp-provider.net
[207.174.144.169]
4 0 ms 0 ms 0 [64.25.128.126]
5 0 ms 0 ms 0 p3-0.dnvtco1-cr3.another-isp-provider.net
[4.25.26.53]
6 0 ms 0 ms 0 p2-1.dnvtco1-br1.another-isp-provider.net
[4.24.11.25]
7 0 ms 0 ms 0 p15-0.dnvtco1-br2.another-isp-provider.net
[4.24.11.38]
8 30 ms 30 ms 30 p15-0.snjpca1-br2.another-isp-provider.net
[4.0.6.225]
9 30 ms 30 ms 30 p1-0.snjpca1-cr4.another-isp-provider.net
[4.24.9.150]
10 1252 ms 1212 ms 1202 h0.webhostinc2.another-isp-provider.net
[4.24.236.38]
11 1252 ms 1212 ms 1192 [40.16.96.11]
12 1262 ms 1212 ms 1192 [40.16.96.162]
13 1102 ms 1091 ms 1092 [40.16.106.32]
Trace route to 64.25.175.200 from 40.16.106.32
1 1 ms 1 ms 1 ms [40.16.106.3]
2 1 ms 1 ms 1 ms [40.16.96.161]
3 2 ms 1 ms 1 ms [40.16.96.2]
4 1 ms 1 ms 1 ms [40.16.96.65]
5 2 ms 2 ms 1 ms border8.p4-2.webh02-1.sfj.fastnet.net
[216.52.19.77]
6 2 ms 1 ms 1 ms core1.ge0-1-net2.sfj.fastnet.net
[216.52.0.65]
7 993 ms 961 ms 999 ms sjo-edge-03.inet.slownet.net
[208.46.223.33]
8 1009 ms 1008 ms 971 ms sjo-core-01.inet.slownet.net [205.171.22.29]
9 985 ms 947 ms 983 ms svl-core-03.inet.slownet.net [205.171.5.97]
10 1028 ms 1010 ms 953 ms [205.171.205.30]
11 989 ms 988 ms 985 ms p4-3.paix-bi1.another-isp-provider.net
[4.2.49.13]
12 1002 ms 1001 ms 973 ms p6-0.snjpca1-br1.another-isp-provider.net
[4.24.7.61]
13 1031 ms 989 ms 978 ms p9-0.snjpca1-br2.another-isp-provider.net
[4.24.9.130]
14 1031 ms 1017 ms 1017 ms p3-0.dnvtco1-br2.another-isp-provider.net
[4.0.6.226]
15 1027 ms 1025 ms 1023 ms p15-0.dnvtco1-br1.another-isp-provider.net
[4.24.11.37]
16 1045 ms 1037 ms 1050 ms p1-0.dnvtco1-cr3.another-isp-provider.net
[4.24.11.26]
17 1030 ms 1020 ms 1045 ms p0-0.cointcorp.another-isp-provider.net
[4.25.26.54]
18 1038 ms 1031 ms 1045 ms gw234.my-isp-provider.net [64.25.128.99]
19 1050 ms 1094 ms 1034 ms [64.25.175.253]
20 1050 ms 1094 ms 1034 ms [64.25.175.200]
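The rule of thumb from the slowness false alarm section (a real problem makes every later hop slow too) can be applied mechanically. The sketch below finds the first hop from which latency stays high through to the end of the trace. The function name is hypothetical, the 200 ms threshold is an assumption borrowed from the earlier example, and the lists hold the first probe time of each hop from the traces in this article.

```python
from typing import List, Optional


def first_persistent_slow_hop(rtts_ms: List[float], threshold_ms: float) -> Optional[int]:
    """Return the 1-based hop from which every hop to the end exceeds the
    threshold, or None if slow hops are followed by a fast one."""
    first = None
    for hop, rtt in enumerate(rtts_ms, start=1):
        if rtt > threshold_ms:
            if first is None:
                first = hop  # candidate start of a persistent slowdown
        else:
            first = None  # a later fast hop clears the earlier suspicion
    return first


# First probe time per hop, copied from the traces in this article
false_alarm = [1, 43, 15, 26, 38, 239, 254, 24, 91, 17, 30]
to_40_16_106_32 = [0, 0, 0, 0, 0, 0, 0, 30, 30, 1252, 1252, 1262, 1102]
to_64_25_175_200 = [1, 1, 2, 1, 2, 2, 993, 1009, 985, 1028, 989, 1002,
                    1031, 1031, 1027, 1045, 1030, 1038, 1050, 1050]

if __name__ == "__main__":
    for name, rtts in [("false alarm", false_alarm),
                       ("to 40.16.106.32", to_40_16_106_32),
                       ("to 64.25.175.200", to_64_25_175_200)]:
        print(name, "->", first_persistent_slow_hop(rtts, 200))
```

A hop number from one direction alone is still only a hint, since the delay may lie on the return path; comparing both directions, as described above, is what locates the congested link.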
ping and traceroute Troubleshooting Example
In this example, a ping to 186.9.17.153 gave a TTL timeout message. Ping TTLs will usually time out only if there is a routing loop in which the packet bounces between two routers on the way to the target. Each bounce decreases the TTL by 1 until it reaches 0, at which point you get the timeout. The routing loop was confirmed by the traceroute, which showed the packet bouncing between the routers at 186.40.64.94 and 186.40.64.93:
G:\>ping 186.9.17.153
Pinging 186.9.17.153 with 32 bytes of data:
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Ping statistics for 186.9.17.153:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
G:\>tracert 186.9.17.153
Tracing route to lostserver.my-isp-provider.net [186.9.17.153]
over a maximum of 30 hops:
1 <10 ms <10 ms <10 ms 186.217.33.1
2 60 ms 70 ms 60 ms rtr-2.my-isp-provider.net [186.40.64.94]
3 70 ms 71 ms 70 ms rtr-1.my-isp-provider.net [186.40.64.93]
4 60 ms 70 ms 60 ms rtr-2.my-isp-provider.net [186.40.64.94]
5 70 ms 70 ms 70 ms rtr-1.my-isp-provider.net [186.40.64.93]
6 60 ms 70 ms 61 ms rtr-2.my-isp-provider.net [186.40.64.94]
7 70 ms 70 ms 70 ms rtr-1.my-isp-provider.net [186.40.64.93]
8 60 ms 70 ms 60 ms rtr-2.my-isp-provider.net [186.40.64.94]
9 70 ms 70 ms 70 ms rtr-1.my-isp-provider.net [186.40.64.93]
...
...
...
Trace complete.
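A loop like the one in the trace above shows up as an address that appears more than once in the hop list. The following is a minimal sketch of that check; the function name is hypothetical, and it assumes the hop addresses have already been extracted from the tracert output, with unanswered hops recorded as "*".

```python
from typing import List, Optional


def find_routing_loop(hop_addresses: List[str]) -> Optional[List[str]]:
    """Return the repeating group of routers if an address is revisited,
    otherwise None."""
    first_seen = {}
    for index, address in enumerate(hop_addresses):
        if address == "*":
            continue  # unanswered hops carry no address to compare
        if address in first_seen:
            # Everything between the two sightings forms the loop
            return hop_addresses[first_seen[address]:index]
        first_seen[address] = index
    return None


if __name__ == "__main__":
    # Hop addresses taken from the tracert output above
    hops = ["186.217.33.1", "186.40.64.94", "186.40.64.93",
            "186.40.64.94", "186.40.64.93", "186.40.64.94"]
    print(find_routing_loop(hops))
```

For the trace above this reports the pair 186.40.64.94 and 186.40.64.93, the two routers passing the packet back and forth.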
This problem was solved by resetting the routing process on both routers. The problem was initially triggered by an unstable network link that caused frequent routing recalculations. The constant activity eventually corrupted the routing tables of one of the routers.
traceroute Web Sites
Many ISPs give their subscribers the ability to run a traceroute from purpose-built servers called looking glasses. A simple Web search for the phrase Internet looking glass will provide a long list of alternatives. Running a traceroute from a variety of locations can help identify whether the problem lies with the ISP of your Web server or with the ISP that provides your Internet access at home or work. A more convenient way of doing this is to use a site like traceroute.org, which provides a list of looking glasses sorted by country.
Possible Reasons for a Failed traceroute
A traceroute can fail to reach its intended destination for a number of reasons including the following: