This trace contains approximately one year's worth of all HTTP requests to the University of Calgary's Department of Computer Science WWW server
located at Calgary, Alberta, Canada.
The logs are an ASCII file with one line per request, with the following columns:
- host making the request. Hosts are identified as either
local or remote where local is a host from the
University of Calgary, and remote is a host from outside of
the University of Calgary domain.
- timestamp in the format "DAY MON DD HH:MM:SS YYYY", where
DAY is the day of the week, MON is the name of the month,
DD is the day of the month, HH:MM:SS
is the time of day using a 24-hour clock, and YYYY is the year.
The timezone is -0700 between 30/Oct/1994:01:30:57 and 02/Apr/1995:03:03:26.
For all other requests, the timezone is -0600.
- filename of the requested item. Paths have been removed.
Modified filenames consist of two parts: num.type, where num
is a unique integer identifier, and type is the extension of
the requested file.
- HTTP reply code.
- bytes in the reply.
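
The column layout above can be sketched as a small parser. This is a minimal illustration, not the archive's own tooling. It assumes the fields are whitespace-separated, that the timestamp occupies five tokens, that the byte count is always an integer, and that the -0700 interval is inclusive at both ends; the names Request, utc_offset and parse_line are hypothetical. The exact delimiters in the trace file may differ, so check a few lines of the real data before relying on it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Boundaries of the -0700 period, taken from the description above.
# Treating both ends as inclusive is an assumption.
MST_START = datetime(1994, 10, 30, 1, 30, 57)
MST_END = datetime(1995, 4, 2, 3, 3, 26)


@dataclass
class Request:
    host: str            # "local" or "remote"
    timestamp: datetime  # timezone-aware local time
    filename: str        # "num.type", path already removed
    status: int          # HTTP reply code
    size: int            # bytes in the reply


def utc_offset(local_time: datetime) -> timezone:
    """Return -0700 inside the stated interval, -0600 otherwise."""
    hours = -7 if MST_START <= local_time <= MST_END else -6
    return timezone(timedelta(hours=hours))


def parse_line(line: str) -> Request:
    """Parse one request line under the assumed whitespace-separated layout."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    host = fields[0]
    # "DAY MON DD HH:MM:SS YYYY"; %a and %b expect English names (C locale).
    naive = datetime.strptime(" ".join(fields[1:6]), "%a %b %d %H:%M:%S %Y")
    stamp = naive.replace(tzinfo=utc_offset(naive))
    return Request(host, stamp, fields[6], int(fields[7]), int(fields[8]))


if __name__ == "__main__":
    # Hypothetical sample line, constructed for illustration only.
    print(parse_line("local Mon Oct 24 13:41:41 1994 1.html 200 150"))
```

Attaching the offset at parse time keeps later comparisons across the timezone change unambiguous, since timezone-aware datetimes compare by absolute time.
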
The logs were collected from October 24, 1994 through October 11, 1995,
a total of 353 days. There were 726,739 requests. Timestamps have 1-second resolution.
The sites making requests to the server have had their addresses removed
to preserve privacy. Paths have been removed. Files were numbered from
1 for the first file encountered in the trace. Files retain the original
file extension, so that the type of file can be determined.
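
Because each sanitized name keeps its original extension, the file type can be recovered by splitting on the last dot. The sketch below shows one way to tally requests by type. The helper names file_type and count_types are hypothetical, and lowercasing the extension is an assumption made so that, for example, "HTML" and "html" are counted together.

```python
from collections import Counter
from datetime import date


def file_type(filename: str) -> str:
    """Return the extension of a "num.type" name, or "" if there is none."""
    _num, sep, ext = filename.rpartition(".")
    return ext.lower() if sep else ""


def count_types(filenames) -> Counter:
    """Count how many names fall under each extension."""
    return Counter(file_type(name) for name in filenames)


# The stated collection period, counted inclusively of both end dates.
COLLECTION_DAYS = (date(1995, 10, 11) - date(1994, 10, 24)).days + 1

if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(count_types(["1.html", "2.gif", "3.HTML"]))
    print(COLLECTION_DAYS)
```

The inclusive day count reproduces the 353-day figure quoted above.
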
The logs were collected by Robert Fridman of the University of Calgary
Department of Computer Science,
and contributed by Martin Arlitt (firstname.lastname@example.org)
and Carey Williamson (email@example.com) of the University
This is one of six data sets analyzed in an upcoming paper by
M. Arlitt and C. Williamson, entitled
``Web Server Workload Characterization: The Search for Invariants'',
to appear in the proceedings of the
1996 ACM SIGMETRICS Conference on the Measurement and Modeling of
Computer Systems, Philadelphia, PA,
May 23-26, 1996. An
extended version of this paper is available on-line; see
DISCUS home page and the group's
Permission has been granted to make four of the six data sets discussed
in ``Web Server Workload Characterization: The Search for Invariants''
available. The four data sets are:
NASA-HTTP , and
The trace may be freely redistributed.
Available from the Archive in
ASCII format, 5.4 MB gzip compressed, 52.3 MB uncompressed.