How to get a useful trace with tcpdump


Something that I often see is people running tcpdump to get a trace of a TCP/IP problem. They then send this trace to someone or post it to a newsgroup and ask for help.
[root@localhost /]# tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size is 96 bytes

19:10:51.591178 IP 192.168.0.103.40333 > members.cox.net.http: P 1:532(531) ack 1 win 1460
19:10:51.668693 IP members.cox.net.http > 192.168.0.103.40333: . ack 532 win 24616
19:10:51.690368 IP members.cox.net.http > 192.168.0.103.40333: P 1:201(200) ack 532 win 24616
19:10:51.690504 IP 192.168.0.103.40333 > members.cox.net.http: . ack 201 win 1728
The problem with this is that the default output does not provide much information. You see source and destination addresses (numeric, or names if they can be resolved), port numbers or names, sequence and acknowledgment numbers, window size and options. This may be enough to resolve a problem; then again, it may not be. If it isn't, they have to reproduce the problem, which may not be easy to do, and run tcpdump again, this time with the arguments they should have used the first time.

The purpose of this article is to give you enough information to effectively use tcpdump, capturing all the information you will need the first time. I explain what arguments to use, how to build simple filters so only packets of interest are captured and save the trace so that it can be analyzed offline. You will not become an expert on using tcpdump and I don't say much of anything about actual protocol analysis but you should be able to get a useful trace the first time you try.

If you are interested in learning more about tcpdump try these links:

  1. tcpdump web site
  2. tcpdump man page

Save a trace file

First, if at all possible you want to write the raw data to a file. That way you are not limited to just the lines that remain in a terminal window. The "-w" argument can be used for that.
[root@localhost /]# tcpdump -w traceFile
listening on eth0, link-type EN10MB (Ethernet), capture size is 96 bytes
The above command will create a file named traceFile in the current directory and write the first 96 bytes of each packet to it. If the problem is purely a TCP/IP problem then the first 96 bytes of each packet are sufficient, but if the problem is at a higher layer, perhaps an NFS or CIFS problem, then you will need to capture more of the packet.

The -s argument can be used to indicate how much of the packet to capture. A value of 0 means that the entire packet should be captured.

[root@localhost /]# tcpdump -w traceFile -s 0
listening on eth0, link-type EN10MB (Ethernet), capture size is 65535 bytes
Note the change in capture size. If you are only going to run tcpdump for a short time or have a filter in place (more on that below) the above command is probably sufficient. However, there is the potential for the traceFile file to get very large. To prevent that you can tell tcpdump to create N files each approximately X million (not mega) bytes in size.
[root@localhost tcpdumps]# tcpdump -w traceFile -s 0 -W 5 -C 1
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
The above command will create 5 files named traceFile0 through traceFile4. When traceFile4 reaches its size limit of 1 million bytes, traceFile0 will be overwritten. The result is that you can run tcpdump forever and always have roughly the last 5 million bytes of trace data saved. When the problem occurs you stop the tcpdump process and you have captured the problem and the packets leading up to the problem.

There is one tricky point to this which has to do with file permissions. I've found that with my version of tcpdump (3.8 in the Fedora Core 4 distribution) you need to run it as root in order to go into promiscuous mode to capture all packets on the segment. When you do that it automagically changes the user and group that it runs in to pcap:pcap - BUT only after creating traceFile0. So the directory that the files are being written into must allow pcap:pcap access to create files or the process will terminate when it tries to create the second file.

[root@localhost /]# tcpdump -w traceFile -s 0 -W 5 -C 1
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
tcpdump: traceFile1: Permission denied
In addition, when the last file fills and tcpdump cycles back to file 0, it can't write to that file because the owner is root. I found it necessary to create the file first, change its permissions so that it was writable by everyone and then start tcpdump.
[root@localhost tcpdumps]# touch traceFile0
[root@localhost tcpdumps]# chmod 666 traceFile0
[root@localhost tcpdumps]# tcpdump -w traceFile -s 0 -W 5 -C 1
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
If you check on the files you will see something like this after the last file fills to max and tcpdump has cycled back to file 0.
[root@localhost tcpdumps]# ls -l
total 4352
-rw-rw-rw-  1 root root  400477 Sep 22 20:09 traceFile0
-rw-r--r--  1 pcap pcap 1000027 Sep 22 20:03 traceFile1
-rw-r--r--  1 pcap pcap 1000402 Sep 22 20:06 traceFile2
-rw-r--r--  1 pcap pcap 1000103 Sep 22 20:06 traceFile3
-rw-r--r--  1 pcap pcap 1000838 Sep 22 20:08 traceFile4
Note that the sizes are slightly greater than the 1 million byte limit specified. The limit is checked before the packet is written, so a large packet written to an almost full file will make the file slightly larger than the specified limit.

Another point to consider is that every time a file is closed and another opened you run the risk of losing some packets. One million bytes and 5 files are good for a demonstration, but fewer, larger files should be used during an attempt to capture a real problem. You should base the size you use on the amount of data you will need to capture (see filtering below to reduce this) and how much disk space you have. If you will be e-mailing the files, keep in mind that there may be limits on the size of an e-mail message.

Now that you have captured the data, what should you do with it? Ideally you can package up the traceFile files and send them to someone who can interpret them, along with the time of the problem, the IP addresses of the hosts concerned and the port numbers of the connection. Sometimes that is not possible, for example when posting the trace to a newsgroup.

Reading a trace file

If you can only send a printable copy of the trace the first thing you have to do is read the trace back in. You do that with the "-r" argument. Keep in mind that each file is a separate trace and will have to be processed separately.
[root@localhost /]# tcpdump -r traceFile0
reading from file traceFile0, link-type EN10MB (Ethernet)
19:10:51.591178 IP 192.168.0.103.40333 > members.cox.net.http: P 1:532(531) ack 1 win 1460
19:10:51.668693 IP members.cox.net.http > 192.168.0.103.40333: . ack 532 win 24616
19:10:51.690368 IP members.cox.net.http > 192.168.0.103.40333: P 1:201(200) ack 532 win 24616
19:10:51.690504 IP 192.168.0.103.40333 > members.cox.net.http: . ack 201 win 1728
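When the files come from a ring created with -W and -C, the file numbers do not tell you which file is oldest once the ring has wrapped; the modification times do. The following is only a sketch of one way to handle that: it prints one "tcpdump -r" command per file, oldest file first. The function name ring_read_cmds is made up for illustration, and the sketch assumes file names without spaces.

```sh
# ring_read_cmds PREFIX
# Hypothetical helper: print one "tcpdump -r" command for each ring file
# PREFIX0, PREFIX1, ... ordered from oldest to newest modification time.
# Assumes the file names contain no spaces or newlines.
ring_read_cmds() {
    # ls -tr sorts by modification time, oldest first.
    for f in $(ls -tr "$1"[0-9]*); do
        echo "tcpdump -r $f"
    done
}

# Usage, in the directory holding the trace files:
#   ring_read_cmds traceFile            # just show the commands
#   ring_read_cmds traceFile | sh > trace.txt
```

Piping the printed commands to sh runs them in capture order, so the printable output of all files ends up in one text file.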

Arguments to get more out of a trace file

But as stated above, the default output may or may not provide enough information. I suggest using the arguments "-nnvvvSe". The "nn" tells tcpdump not to resolve IP addresses and port numbers into names; I just prefer to see the numbers rather than the names. The "vvv" tells tcpdump to be as verbose as possible, displaying all the IP and TCP (or other protocol) fields. You never know when a minor field like the TTL, the ID value or one of the IP flags may point to the solution to a problem. The "S" (case is important) says that sequence and acknowledgment numbers should be displayed as absolute values instead of relative values. This comes in really handy when you are trying to correlate two traces taken at different locations and possibly starting at different points in the connection; with absolute sequence numbers displayed they are easy to match up. Of course, if you are not trying to correlate two traces this isn't important, but if you are always in the habit of using it, you will have it automatically the one time you need it. Finally, the "e" indicates that the Ethernet header should be displayed: the source and destination MAC addresses and the type. The MAC addresses are what are important, or at least what may be important.
[root@localhost /]# tcpdump -r traceFile0 -nnvvvSe
reading from file traceFile0, link-type EN10MB (Ethernet)
19:10:51.591178 00:e0:29:6e:18:0f > 00:0f:3d:4b:25:8c, ethertype IPv4 (0x0800), length 597: IP (tos 0x0, ttl  64, id 49033, offset 0, flags [DF], proto 6, length: 583) 192.168.0.103.40333 > 68.1.17.8.80: P [bad tcp cksum 1852 (->ca07)!] 3567474751:3567475282(531) ack 2950405661 win 1460
19:10:51.668693 00:0f:3d:4b:25:8c > 00:e0:29:6e:18:0f, ethertype IPv4 (0x0800), length 66: IP (tos 0x0, ttl  44, id 9432, offset 0, flags [DF], proto 6, length: 52) 68.1.17.8.80 > 192.168.0.103.40333: . [tcp sum ok] 2950405661:2950405661(0) ack 3567475282 win 24616
19:10:51.690368 00:0f:3d:4b:25:8c > 00:e0:29:6e:18:0f, ethertype IPv4 (0x0800), length 266: IP (tos 0x0, ttl  44, id 9433, offset 0, flags [DF], proto 6, length: 252) 68.1.17.8.80 > 192.168.0.103.40333: P [tcp sum ok] 2950405661:2950405861(200) ack 3567475282 win 24616
19:10:51.690504 00:e0:29:6e:18:0f > 00:0f:3d:4b:25:8c, ethertype IPv4 (0x0800), length 66: IP (tos 0x0, ttl  64, id 49035, offset 0, flags [DF], proto 6, length: 52) 192.168.0.103.40333 > 68.1.17.8.80: . [tcp sum ok] 3567475282:3567475282(0) ack 2950405861 win 1728
As you can see there is a lot more output.
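With absolute numbers, the first:last(nbytes) field can be checked by hand: the difference between the two sequence numbers is the payload length, and the ack that comes back from the other side should equal the second number. A small sketch using the first packet above follows. The function name seq_len is made up for illustration, and the mask is there only so that the subtraction still works if the 32-bit sequence number wraps.

```sh
# seq_len FIRST LAST
# Hypothetical helper: number of payload bytes covered by a
# "FIRST:LAST(nbytes)" field, modulo 2^32 so that a wrapped
# sequence number still gives the right answer.
seq_len() {
    echo $(( ($2 - $1) & 0xFFFFFFFF ))
}

seq_len 3567474751 3567475282    # first packet above, prints 531
```

The result matches the (531) in the trace, and the second packet acknowledges 3567475282, the end of that range.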

Getting the most out of a trace file

Of course, the above information is only useful if the problem is definitely in the TCP/IP or Ethernet layers. If it's in one of the application layers we still have no clue. The "-XX" argument will display the entire packet in both hex and ASCII. Most of the higher layers will not be interpreted by tcpdump but the data will be there to look at.
[root@localhost /]# tcpdump -r traceFile0 -nnvvvSeXX
reading from file traceFile0, link-type EN10MB (Ethernet)
19:10:51.591178 00:e0:29:6e:18:0f > 00:0f:3d:4b:25:8c, ethertype IPv4 (0x0800), length 597: IP (tos 0x0, ttl  64, id 49033, offset 0, flags [DF], proto 6, length: 583) 192.168.0.103.40333 > 68.1.17.8.80: P [bad tcp cksum 1852 (->ca07)!] 3567474751:3567475282(531) ack 2950405661 win 1460
        0x0000:  000f 3d4b 258c 00e0 296e 180f 0800 4500  ..=K%...)n....E.
        0x0010:  0247 bf89 4000 4006 630f c0a8 0067 4401  .G..@.@.c....gD.
        0x0020:  1108 9d8d 0050 d4a3 583f afdb 9e1d 8018  .....P..X?......
        0x0030:  05b4 1852 0000 0101 080a 3fac e808 0768  ...R......?....h
        0x0040:  83af 4745 5420 2f7e 6e64 6176 312f 2048  ..GET./~ndav1/.H
        0x0050:  5454 502f 312e 310d 0a48 6f73 743a 206d  TTP/1.1..Host:.m
        0x0060:  656d 6265 7273 2e63 6f78 2e6e 6574 0d0a  embers.cox.net..
        0x0070:  5573 6572 2d41 6765 6e74 3a20 4d6f 7a69  User-Agent:.Mozi
        0x0080:  6c6c 612f 352e 3020 2858 3131 3b20 553b  lla/5.0.(X11;.U;
        0x0090:  204c 696e 7578 2069 3638 363b 2065 6e2d  .Linux.i686;.en-
        0x00a0:  5553 3b20 7276 3a31 2e37 2e38 2920 4765  US;.rv:1.7.8).Ge
        0x00b0:  636b 6f2f 3230 3035 3035 3234 2046 6564  cko/20050524.Fed
        0x00c0:  6f72 612f 312e 302e 342d 3420 4669 7265  ora/1.0.4-4.Fire
        0x00d0:  666f 782f 312e 302e 340d 0a41 6363 6570  fox/1.0.4..Accep
        0x00e0:  743a 2074 6578 742f 786d 6c2c 6170 706c  t:.text/xml,appl
        0x00f0:  6963 6174 696f 6e2f 786d 6c2c 6170 706c  ication/xml,appl
        0x0100:  6963 6174 696f 6e2f 7868 746d 6c2b 786d  ication/xhtml+xm
        0x0110:  6c2c 7465 7874 2f68 746d 6c3b 713d 302e  l,text/html;q=0.
        0x0120:  392c 7465 7874 2f70 6c61 696e 3b71 3d30  9,text/plain;q=0
        0x0130:  2e38 2c69 6d61 6765 2f70 6e67 2c2a 2f2a  .8,image/png,*/*
        0x0140:  3b71 3d30 2e35 0d0a 4163 6365 7074 2d4c  ;q=0.5..Accept-L
        0x0150:  616e 6775 6167 653a 2065 6e2d 7573 2c65  anguage:.en-us,e
        0x0160:  6e3b 713d 302e 350d 0a41 6363 6570 742d  n;q=0.5..Accept-
        0x0170:  456e 636f 6469 6e67 3a20 677a 6970 2c64  Encoding:.gzip,d
        0x0180:  6566 6c61 7465 0d0a 4163 6365 7074 2d43  eflate..Accept-C
        0x0190:  6861 7273 6574 3a20 4953 4f2d 3838 3539  harset:.ISO-8859
        0x01a0:  2d31 2c75 7466 2d38 3b71 3d30 2e37 2c2a  -1,utf-8;q=0.7,*
        0x01b0:  3b71 3d30 2e37 0d0a 4b65 6570 2d41 6c69  ;q=0.7..Keep-Ali
        0x01c0:  7665 3a20 3330 300d 0a43 6f6e 6e65 6374  ve:.300..Connect
        0x01d0:  696f 6e3a 206b 6565 702d 616c 6976 650d  ion:.keep-alive.
        0x01e0:  0a49 662d 4d6f 6469 6669 6564 2d53 696e  .If-Modified-Sin
        0x01f0:  6365 3a20 5361 742c 2030 3320 4a75 6c20  ce:.Sat,.03.Jul.
        0x0200:  3230 3034 2030 333a 3131 3a33 3920 474d  2004.03:11:39.GM
        0x0210:  540d 0a49 662d 4e6f 6e65 2d4d 6174 6368  T..If-None-Match
        0x0220:  3a20 2237 3934 6339 2d31 3036 612d 3430  :."794c9-106a-40
        0x0230:  6536 3233 6562 220d 0a43 6163 6865 2d43  e623eb"..Cache-C
        0x0240:  6f6e 7472 6f6c 3a20 6d61 782d 6167 653d  ontrol:.max-age=
        0x0250:  300d 0a0d 0a                             0....
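Even without much protocol knowledge you can cross-check the hex dump against the decoded header line. As a sketch, the shell can convert a few fields from the dump above into decimal. The field positions assume an Ethernet frame carrying an IPv4 header without options, which is the case in this trace (the byte after the 0800 ethertype is 45), and hex2dec is a made-up helper name.

```sh
# hex2dec HEXDIGITS
# Hypothetical helper: print a hex value from the dump in decimal.
hex2dec() {
    printf '%d\n' "0x$1"
}

# Values copied from the dump above (rows 0x0010 and 0x0020).
hex2dec 0247    # IP total length, prints 583
hex2dec bf89    # IP id, prints 49033
hex2dec 40      # TTL, prints 64
hex2dec 9d8d    # TCP source port, prints 40333
hex2dec 0050    # TCP destination port, prints 80
```

Each result agrees with the verbose decode of the same packet: length 583, id 49033, ttl 64 and ports 40333 and 80.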

Filtering the data

Before turning to filters, consider how quickly trace files fill. A 100 Mbps network by definition can carry 100 million bits per second. As we all know the actual maximum is smaller than that, but it doesn't have to be much smaller. If you limit yourself to 10 trace files of 100 million bytes each, you have around 80 seconds on a busy network to stop the trace after the problem occurs before the interesting packets are overwritten. You can of course make the files bigger or use more of them, but this can put a significant dent in your disk space. Luckily there is a better way: filters.
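The figure of around 80 seconds is simply the size of the ring in bits divided by the line rate. A minimal sketch of that arithmetic, assuming the worst case of a fully saturated link and ignoring the small per-packet overhead of the file format (ring_seconds is a made-up name):

```sh
# ring_seconds FILES BYTES_PER_FILE LINK_BITS_PER_SECOND
# Hypothetical helper: seconds of traffic a ring of trace files can hold
# when the link is completely busy (integer division, rounds down).
ring_seconds() {
    echo $(( $1 * $2 * 8 / $3 ))
}

ring_seconds 10 100000000 100000000    # 10 files of 100 million bytes at 100 Mbps, prints 80
```

Plugging in your own file count, file size and link speed gives a rough lower bound on how long you have to stop the trace.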

Basically, you can tell tcpdump to ignore all packets but a specific subset. The filter expression can be quite complex (and there are more keywords than I list here) but these are the expressions I have found most useful:

  1. host A.B.C.D - any IP packet with either a source or destination address of A.B.C.D
  2. net A.B.C/n - any IP packet with either a source or destination address from the network A.B.C with "n" network bits
  3. port X - any TCP or UDP packet with either a source or destination port X
  4. ether broadcast - any packet with an Ethernet broadcast as its destination
  5. tcp - any TCP packet
  6. udp - any UDP packet
  7. icmp - any ICMP packet
  8. arp - any ARP packet
These expressions can be combined with "and" and "or". For example, I typically capture all the ICMP and ARP packets even when I am tracing a specific IP address. Problems in the network that can affect TCP connections are reported via ICMP packets, and if you filter only on the TCP packets you will not see these reports. By the same token, ARP packets can sometimes provide clues to the source of problems. My typical filter then looks like
host A.B.C.D or icmp or arp
or
host A.B.C.D and port X or arp or icmp

The whole command looks like

[root@localhost /]# tcpdump -w traceFile -s 0 -W 5 -C 1 host A.B.C.D and port X or arp or icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
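If you build the same kind of filter often, it can be generated. The sketch below is one possible way to do it, not part of tcpdump: build_filter is a made-up name, and the parentheses are added only to make the grouping explicit, since "and" and "or" have equal precedence in a filter expression and are evaluated left to right. Because parentheses are special to the shell, the expression has to be passed to tcpdump inside quotes.

```sh
# build_filter HOST [PORT]
# Hypothetical helper: print a filter that keeps traffic for HOST
# (optionally limited to PORT) plus all ARP and ICMP packets.
build_filter() {
    if [ -n "$2" ]; then
        echo "(host $1 and port $2) or arp or icmp"
    else
        echo "host $1 or arp or icmp"
    fi
}

# Usage (as root), with the address and port from the trace above:
#   tcpdump -w traceFile -s 0 -W 5 -C 1 "$(build_filter 192.168.0.103 80)"
```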

Be aware of missing data

One final point: when you terminate the trace, tcpdump will output the following three lines
	X packets captured
	Y packets received by filter
	Z packets dropped by kernel
The important number is Z. If Z is not zero, your trace is incomplete. It might not matter: the missing packets might have been dropped by the filter anyway, or they may be unimportant to diagnosing the problem. It is, however, something that should be communicated to the person analyzing the trace.
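If the capture runs unattended, the closing summary can be saved and checked afterwards. The following is a sketch under the assumption that the summary was redirected to a file when tcpdump was started (tcpdump normally writes it to standard error); the file name tcpdump.log and the function name dropped_by_kernel are made up for illustration.

```sh
# dropped_by_kernel
# Hypothetical helper: read tcpdump's closing summary on standard input
# and print the number from the "packets dropped by kernel" line.
dropped_by_kernel() {
    awk '/packets dropped by kernel/ { print $1 }'
}

# Usage (as root):
#   tcpdump -w traceFile -s 0 2> tcpdump.log
#   ... stop tcpdump after the problem occurs, then:
#   dropped_by_kernel < tcpdump.log
```

Anything other than 0 is worth mentioning when you hand the trace over.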
This page was last modified on 05-11-25
Send comments and suggestions to ndav1@cox.net