SDSC-HTTP
-
Description
-
This trace contains a day's worth of all HTTP requests to the SDSC WWW server
located at the San Diego Supercomputer Center in San Diego, California.
-
Format
-
The logs are an ASCII file with one line per request, with the following
columns:
- host making the request. Hosts are identified as "N+H:",
where N is a number identifying the host's network, and H is
a number identifying the host within that network.
- timestamp in the format "DAY MON DD HH:MM:SS YYYY", where
DAY is the day of the week, MON is the name of the month,
DD is the day of the month, HH:MM:SS
is the time of day using a 24-hour clock, and YYYY is the year.
Times are Pacific Daylight Time.
- filename of the requested item. Empty for error conditions,
non-empty for normal transactions. If non-empty it always starts with '/'.
- operation performed by the server.
Successful operations always start with "Sent" and are terminated with a ':'.
These include: "cache search", "range search", "search results",
"WAIS search", "grep search", "text", "range", "binary", "cache",
"exec binary", "exec", "HTTP/1.0 header", "CGI output", "CGI output [nph]",
and "CGI redirection".
Failed operations do not start with "Sent" (they are still terminated with
a ':'). The filename associated with them is always empty. There
are many failure modes and we do not list them here.
- remainder of the transaction log. This is either the HTTP
request, or information associated with a failed operation.
Routines to parse the logs are available from webmaster@sdsc.edu.
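As an illustration of the column layout described above, here is a minimal
Python sketch of a line parser. It is not the routine distributed by SDSC.
The description does not give the exact separators between columns, so the
sketch assumes whitespace-separated columns in the listed order, with the
filename (when present) appearing as a single token starting with '/'. The
names Request, LINE_RE and parse_line are hypothetical, and the sample lines
in the usage note are invented for illustration rather than taken from the
trace.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Request(NamedTuple):
    network: int       # N in "N+H:"
    host: int          # H in "N+H:"
    when: datetime     # naive datetime; the trace uses Pacific Daylight Time
    filename: str      # empty string for failed operations
    operation: str     # text before the terminating ':'
    ok: bool           # True when the operation starts with "Sent"
    remainder: str     # HTTP request or failure details


# ASSUMED layout (separators are not specified in the description):
#   N+H: DAY MON DD HH:MM:SS YYYY [/filename] operation: remainder
LINE_RE = re.compile(
    r"^(?P<net>\d+)\+(?P<host>\d+):\s+"
    r"(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<file>/\S*\s+)?"
    r"(?P<op>[^:]*):\s*"
    r"(?P<rest>.*)$"
)


def parse_line(line: str) -> Optional[Request]:
    """Parse one log line; return None if it does not fit the assumed layout."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    # %a and %b expect English day and month names (the default C locale).
    when = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
    operation = m.group("op").strip()
    return Request(
        network=int(m.group("net")),
        host=int(m.group("host")),
        when=when,
        filename=(m.group("file") or "").strip(),
        operation=operation,
        ok=operation.startswith("Sent"),
        remainder=m.group("rest"),
    )
```

For example, under these assumptions
parse_line("12+3: Tue Aug 22 00:00:07 1995 /index.html Sent text: GET /index.html HTTP/1.0")
yields a Request with network 12, host 3, filename "/index.html" and
operation "Sent text". If the real trace delimits the filename differently,
only LINE_RE needs to change.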
-
Measurement
-
The logs were collected from 00:00:00 PDT through 23:59:41 PDT on Tuesday,
August 22, 1995, a total of 24 hours. There were 28,338 requests and no
known losses. Timestamps have 1-second resolution. Note that the
server used was the GN server, not the more common CERN or NCSA server.
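To put these figures in perspective, the following Python sketch derives the
average request rate from the numbers quoted above. The variable names are
illustrative only, and the result is a whole-day mean that says nothing
about how requests were distributed over the day.

```python
from datetime import datetime

# First and last timestamps and the request count, as quoted in the text.
first = datetime(1995, 8, 22, 0, 0, 0)
last = datetime(1995, 8, 22, 23, 59, 41)
requests = 28_338

span_seconds = (last - first).total_seconds()
rate = requests / span_seconds        # mean requests per second
mean_gap = span_seconds / requests    # mean seconds between requests

print(f"span: {span_seconds:.0f} s")
print(f"rate: {rate:.3f} requests/s")
print(f"mean gap: {mean_gap:.2f} s")
```

This works out to roughly one request every three seconds, so with 1-second
timestamp resolution several requests can share the same timestamp.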
-
Privacy
-
The sites making requests to the server have had their addresses renumbered
to preserve privacy. This was done by first splitting the address into
a network number and a host number, and then renumbering each. Networks
were numbered starting from 1 for the first network encountered in the
trace. Hosts were numbered starting from 1 for the first host encountered
in the trace for each network.
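The renumbering scheme described above can be sketched in Python as follows.
This is an illustration, not the tool actually used to prepare the trace.
It assumes each address has already been split into a (network, host) pair,
since the text does not say how the split was done; the function name
renumber and the placeholder identifiers in the usage note are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def renumber(addresses: Iterable[Tuple[Hashable, Hashable]]) -> List[str]:
    """Map (network, host) pairs to "N+H" labels in order of first appearance."""
    net_ids: Dict[Hashable, int] = {}                    # network -> N
    host_ids: Dict[Hashable, Dict[Hashable, int]] = {}   # network -> {host -> H}
    labels = []
    for net, host in addresses:
        # Networks are numbered from 1 in order of first appearance.
        n = net_ids.setdefault(net, len(net_ids) + 1)
        # Hosts are numbered from 1 independently within each network.
        hosts = host_ids.setdefault(net, {})
        h = hosts.setdefault(host, len(hosts) + 1)
        labels.append(f"{n}+{h}")
    return labels
```

For example, renumber([("netA", "h1"), ("netB", "h9"), ("netA", "h2"),
("netA", "h1")]) returns ["1+1", "2+1", "1+2", "1+1"]. The same host always
receives the same label, and host numbers restart at 1 within each network.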
-
Acknowledgements
-
The logs were collected by Joshua Polterock (webmaster@sdsc.edu),
Hans-Werner Braun (hwb@sdsc.edu), and K Claffy
(kc@sdsc.edu), all of the San Diego Supercomputer Center.
Please include a corresponding acknowledgement in publications analyzing
the logs.
-
Restrictions
-
The trace may be freely redistributed.
-
Distribution
-
Available from the Archive in compressed ASCII format (580 KB; 3.6 MB
uncompressed).