SDSC-HTTP

Description
This trace contains a day's worth of all HTTP requests to the SDSC WWW server located at the San Diego Supercomputer Center in San Diego, California.
Format
The logs are an ASCII file with one line per request, with the following columns:
  1. host making the request. Hosts are identified as N+H:, where N is a number identifying the host's network, and H is a number identifying the host within that network.
  2. timestamp in the format "DAY MON DD HH:MM:SS YYYY", where DAY is the day of the week, MON is the name of the month, DD is the day of the month, HH:MM:SS is the time of day using a 24-hour clock, and YYYY is the year. Times are Pacific Daylight Time.
  3. filename of the requested item. Empty for error conditions, non-empty for normal transactions. If non-empty it always starts with '/'.
  4. operation performed by the server.
    Successful operations always start with "Sent" and are terminated with a ':'. These include: "cache search", "range search", "search results", "WAIS search", "grep search", "text", "range", "binary", "cache", "exec binary", "exec", "HTTP/1.0 header", "CGI output", "CGI output [nph]", and "CGI redirection".
    Failed operations do not start with "Sent", although they are still terminated with ':'. The filename associated with them is always empty. There are many failure modes, and they are not listed here.
  5. remainder of the transaction log. This is either the HTTP request, or information associated with a failed operation.
Routines to parse the logs are available from webmaster@sdsc.edu.
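For readers without access to those routines, the field conventions above can be sketched in Python. This is a minimal sketch and not the SDSC parsing code: the function names are hypothetical, and it works on individual fields that have already been extracted, because the exact separators between columns are not spelled out here. The fixed UTC-7 offset reflects the statement that times are Pacific Daylight Time.

```python
from datetime import datetime, timedelta, timezone

# Pacific Daylight Time is UTC-7; the description says all times are PDT.
PDT = timezone(timedelta(hours=-7), "PDT")


def parse_host(field):
    """Split a host field of the form 'N+H:' into (network, host) integers."""
    network, host = field.rstrip(":").split("+")
    return int(network), int(host)


def parse_timestamp(field):
    """Parse 'DAY MON DD HH:MM:SS YYYY' into a timezone-aware datetime.

    Assumes English day and month abbreviations (the C locale), since
    %a and %b are locale-dependent.
    """
    return datetime.strptime(field, "%a %b %d %H:%M:%S %Y").replace(tzinfo=PDT)


def is_success(operation):
    """Successful operations start with 'Sent'; failed ones do not."""
    return operation.startswith("Sent")
```

For example, parse_host("12+3:") yields the pair (12, 3), and is_success("Sent text:") is true.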
Measurement
The logs were collected from 00:00:00 PDT through 23:59:41 PDT on Tuesday, August 22, 1995, a total of 24 hours. There were 28,338 requests and no known losses. Timestamps have 1-second resolution. Note that the server was the GN server, not the more common CERN or NCSA server.
Privacy
The sites making requests to the server have had their addresses renumbered to preserve privacy. This was done by first splitting the address into a network number and a host number, and then renumbering each. Networks were numbered starting from 1 for the first network encountered in the trace. Hosts were numbered starting from 1 for the first host encountered in the trace for each network.
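The renumbering scheme described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the procedure actually used at SDSC: the function name is hypothetical, and the input is assumed to be a sequence of (network, host) pairs that have already been split, since how the original addresses were split is not described here.

```python
def renumber(addresses):
    """Renumber (network, host) pairs in order of first appearance.

    Networks are numbered from 1 across the whole trace; hosts are
    numbered from 1 separately within each network.
    """
    net_ids = {}   # original network -> new network number
    host_ids = {}  # original network -> {original host -> new host number}
    renumbered = []
    for net, host in addresses:
        if net not in net_ids:
            net_ids[net] = len(net_ids) + 1
            host_ids[net] = {}
        hosts = host_ids[net]
        if host not in hosts:
            hosts[host] = len(hosts) + 1
        # Emit the identifier in the 'N+H:' form used in the trace.
        renumbered.append(f"{net_ids[net]}+{hosts[host]}:")
    return renumbered
```

With made-up inputs, two hosts on one network followed by a host on a second network and then a repeat of the first host would come out as 1+1:, 1+2:, 2+1:, and 1+1: again.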
Acknowledgements
The logs were collected by Joshua Polterock (webmaster@sdsc.edu), Hans-Werner Braun (hwb@sdsc.edu), and K Claffy (kc@sdsc.edu), all of the San Diego Supercomputer Center. Please include a corresponding acknowledgement in publications analyzing the logs.
Restrictions
The trace may be freely redistributed.
Distribution
Available from the Archive in compressed ASCII format (580 KB; 3.6 MB uncompressed).

