TCP-Reduce Documentation
Usage summary
Using the scripts is quite simple:
- tcp-reduce file >reduced-file
reduces the tcpdump trace file "file" to the reduced form described below.
- tcp-conn
is an internal script you should not run directly.
- tcp-summary reduced-file >summary-file
summarizes, on a per-TCP-protocol basis, a reduced file produced by
tcp-reduce.
Format of the reduced files
Reduced files produced by tcp-reduce summarize each connection
with a single ASCII line. This line contains 8 or 9 columns:
- timestamp when the connection began (first SYN packet)
- duration of the connection in seconds, or a ? if the
trace did not show the connection terminating (no FIN or RST packets seen)
- protocol used by the connection. This in general is derived
from the port number used by the responder to the initial SYN packet.
An exception is made for an initial SYN packet sent from TCP port 20,
which corresponds to the well-known port for ftp-data.
Unidentified ports are reported as other-XXXX if non-privileged
(> 1024) and priv-XXXX if privileged. If the unidentified traffic
is coincident with an ftp connection between the two hosts, then
it is reported as ftpdata-XXXX instead.
- bytes sent by originator of the connection, or ? if
not available (due to connection not terminating, or terminating with RST).
- bytes sent by responder to the connection, or ? if
not available.
- local host - the (possibly renumbered) local host
that participated in the connection. See below for discussion of
local and remote hosts.
- remote host - the (possibly renumbered) remote host
that participated in the connection.
- state that the connection ended in. This can be one of the
following:
- SF normal SYN/FIN completion
- REJ connection rejected - initial SYN elicited a RST in reply
- S0 state 0: initial SYN seen but no reply
- S1 state 1: connection established (SYN's exchanged),
nothing further seen
- S2 state 2: connection established, initiator has closed their side
- S3 state 3: connection established, responder has closed their side
- S4 state 4: SYN ack seen, but no initial SYN seen
- RSTOSn connection reset by the originator when it
was in state n
- RSTRSn connection reset by the responder when it
was in state n
- SS SYN seen for already partially-closed connection
- SH a state 0 connection was closed before we ever saw the
SYN ack
- SHR a state 4 connection was closed before we ever saw the
original SYN
- OOS1 SYN ack did not match initial SYN
- OOS2 initial SYN retransmitted with different sequence number
Note that connections ending in states S2 and S3 (or
terminated by RST's after being in one of these states; e.g., RSTOS3) may
have byte counts associated with them. These connections were "half-closed".
If the side that closed did so with a FIN packet, then the
FIN packet provides an accurate byte count for that side,
and a lower-bound byte count for the other side (from the sequence
number ack'd by the FIN). Thus you may trust one of the byte counts;
the other is probably equal to or just a bit below the final byte count,
though it could be far below if the connection persisted half-closed for
a long time.
- flags zero or more flags:
- L indicates the connection was initiated locally (i.e.,
the local host in column 6 is the one that began the connection)
- N indicates the connection was with a neighbor site.
See below for discussion of neighbor sites.
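The column layout above can be read with a few lines of Python. This is a minimal sketch, not part of the distributed scripts: it assumes the columns are separated by whitespace, that byte counts are integers, and that the optional flags column holds the flag letters together in one field. The names Connection, parse_reduced_line, and half_closed_side are hypothetical, and the sample line is made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    timestamp: float
    duration: Optional[float]    # None when the file shows "?"
    protocol: str
    orig_bytes: Optional[int]    # None when the file shows "?"
    resp_bytes: Optional[int]    # None when the file shows "?"
    local_host: str
    remote_host: str
    state: str
    flags: str = ""              # empty when the ninth column is absent


def _maybe(text, convert):
    """Convert a field, mapping the "?" placeholder to None."""
    return None if text == "?" else convert(text)


def parse_reduced_line(line: str) -> Connection:
    """Parse one line of a reduced file into a Connection record."""
    fields = line.split()
    if len(fields) not in (8, 9):
        raise ValueError(f"expected 8 or 9 columns, got {len(fields)}")
    return Connection(
        timestamp=float(fields[0]),
        duration=_maybe(fields[1], float),
        protocol=fields[2],
        orig_bytes=_maybe(fields[3], int),
        resp_bytes=_maybe(fields[4], int),
        local_host=fields[5],
        remote_host=fields[6],
        state=fields[7],
        flags=fields[8] if len(fields) == 9 else "",
    )


def half_closed_side(state: str) -> Optional[str]:
    """For half-closed states, name the side that closed; otherwise None.

    Per the note on S2 and S3, the byte count of the side that closed is
    the accurate one and the other side's count is a lower bound.
    """
    half_closed = state in ("S2", "S3") or (
        state.startswith(("RSTOS", "RSTRS")) and state[-1] in "23"
    )
    if not half_closed:
        return None
    return "originator" if state.endswith("2") else "responder"


# Made-up sample line for illustration, not taken from a real trace.
sample = "762902400.12 3.5 smtp 1200 340 1 192.0.2.7 SF L"
conn = parse_reduced_line(sample)
print(conn.protocol, conn.orig_bytes, "L" in conn.flags)
```

Keeping "?" as None rather than 0 lets later analysis distinguish "no bytes sent" from "byte count unknown".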
Configuring tcp-reduce
tcp-reduce has several configuration options:
- Which networks to treat as "local" - the function
is_local_site() in the internal script tcp_conn
returns true if the site passed to it should be considered "local",
and false otherwise. As distributed, this function matches the
networks belonging to the Lawrence Berkeley Laboratory,
where the script was developed.
- Which networks to treat as "neighbors" - the function
is_neighbor_net() in the internal script tcp_conn
returns true if the site passed to it should be considered a "neighbor",
and false otherwise. As distributed, this function matches the
networks belonging to the University of California at Berkeley, since
the link between LBL and UCB is very fast, and I wanted to be able to
flag traffic between the two sites as possibly atypical for wide-area
traffic.
If you don't care about discriminating between neighbor and non-neighbor
sites, you can just have the function always return 0.
- How to renumber hosts - if the variable
renumber_all is set at the beginning of the
tcp_conn script, then both local and remote
hosts will be renumbered. If only renumber_local is set (the
default), then just local hosts are renumbered. If neither
is set, then hosts have their full IP address in the reduced output.
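The configuration options above can be illustrated in Python. This is only a sketch of the idea and does not reproduce the code in the distributed script: the network lists use reserved documentation address ranges as placeholders, the functions take a dotted-quad host address, and the renumbering scheme (small integers in order of first appearance, with one numbering shared by local and remote hosts) is an assumption. Renumberer and output_hosts are hypothetical names.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges); substitute your own.
LOCAL_NETS = [ipaddress.ip_network("192.0.2.0/24")]
NEIGHBOR_NETS = [ipaddress.ip_network("198.51.100.0/24")]


def is_local_site(addr: str) -> bool:
    """True if addr falls inside one of the networks treated as local."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in LOCAL_NETS)


def is_neighbor_net(addr: str) -> bool:
    """True if addr falls inside one of the networks treated as neighbors."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in NEIGHBOR_NETS)


class Renumberer:
    """Map each distinct address to a small integer, in order of first appearance."""

    def __init__(self):
        self._ids = {}

    def __call__(self, addr: str) -> str:
        return str(self._ids.setdefault(addr, len(self._ids) + 1))


renumber = Renumberer()


def output_hosts(local_addr, remote_addr, renumber_local=True, renumber_all=False):
    """Return the (local, remote) host columns under the given settings."""
    local_out = renumber(local_addr) if (renumber_local or renumber_all) else local_addr
    remote_out = renumber(remote_addr) if renumber_all else remote_addr
    return local_out, remote_out


print(is_local_site("192.0.2.10"), is_neighbor_net("198.51.100.1"))
print(output_hosts("192.0.2.10", "203.0.113.5"))
```

To ignore the neighbor distinction, as suggested above, is_neighbor_net would simply return False for every address.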
Bogons
Sometimes a packet gets mangled by the network, and while it still
appears to be a SYN, FIN, or RST packet, its contents are obviously
in error. This particularly occurs when a packet is truncated by
the network. When tcp_conn encounters a mangled packet,
it reports it as a bogon to stderr and discards it.
Output generated by tcp-summary
The output of tcp-summary consists of six columns (plus a
header to remind you of what's in each column):
- TCP protocol
- Number of connections made or attempted using that protocol
- Number of kilobytes of user-level data transferred in both directions
- Successful - percentage of connections that terminated in state
SF, indicating normal SYN/FIN completion
- Local - percentage of connections initiated by local hosts
- Neighbor - percentage of connections in which the remote
site was from a neighbor network.
The output is sorted on the third column (kilobytes). priv-XXXX,
other-XXXX, and ftpdata-XXXX connections are lumped together
as three collective protocols.
Protocols for which fewer than insig_bytes bytes were transferred,
or fewer than insig_conn connections were made, will not be reported. These
variables are defined at the beginning of the script, and default to
500 KB and 100 connections, respectively.
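The per-protocol summary described above can be approximated in Python as follows. This is a sketch under stated assumptions, not the tcp-summary script itself: unknown ("?") byte counts are counted as zero, a kilobyte is taken to be 1024 bytes, the sort is in descending order, and a protocol is dropped when it falls below either threshold. The function names collective and summarize are hypothetical.

```python
from collections import defaultdict

INSIG_BYTES = 500 * 1024   # assumed reading of the "500 KB" default
INSIG_CONN = 100


def collective(protocol: str) -> str:
    """Lump priv-XXXX, other-XXXX, and ftpdata-XXXX into three collective names."""
    for prefix in ("priv-", "other-", "ftpdata-"):
        if protocol.startswith(prefix):
            return prefix + "XXXX"
    return protocol


def summarize(lines, insig_bytes=INSIG_BYTES, insig_conn=INSIG_CONN):
    """Return rows of (protocol, conns, KB, %SF, %local, %neighbor), largest KB first."""
    stats = defaultdict(
        lambda: {"conns": 0, "bytes": 0, "sf": 0, "local": 0, "neighbor": 0}
    )
    for line in lines:
        f = line.split()
        if len(f) not in (8, 9):
            continue  # skip anything that is not a reduced-file record
        s = stats[collective(f[2])]
        flags = f[8] if len(f) == 9 else ""
        s["conns"] += 1
        # Assumption: unknown ("?") byte counts contribute nothing to the total.
        s["bytes"] += sum(int(b) for b in f[3:5] if b != "?")
        s["sf"] += f[7] == "SF"
        s["local"] += "L" in flags
        s["neighbor"] += "N" in flags

    rows = []
    for proto, s in stats.items():
        if s["bytes"] < insig_bytes or s["conns"] < insig_conn:
            continue  # insignificant protocol, not reported
        n = s["conns"]
        rows.append((proto, n, s["bytes"] / 1024,
                     100 * s["sf"] / n, 100 * s["local"] / n,
                     100 * s["neighbor"] / n))
    rows.sort(key=lambda r: r[2], reverse=True)  # third column: kilobytes
    return rows
```

For a small test file, calling summarize(lines, insig_bytes=0, insig_conn=0) disables the thresholds so that every protocol is listed.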
Author
The scripts were written by Vern Paxson (vern@ee.lbl.gov) and
are copyrighted by the U.C. Regents as explained at the beginning
of tcp_conn.