Calgary-HTTP
-
Description
-
This trace contains approximately one year's worth of all HTTP requests to the University of Calgary's Department of Computer Science WWW server
located in Calgary, Alberta, Canada.
-
Format
-
The log is an ASCII file with one line per request. Each line has the
following columns:
- host making the request. Hosts are identified as either
local or remote, where local is a host from the
University of Calgary and remote is a host from outside
the University of Calgary domain.
- timestamp in the format "DAY MON DD HH:MM:SS YYYY", where
DAY is the day of the week, MON is the name of the month,
DD is the day of the month, HH:MM:SS
is the time of day using a 24-hour clock, and YYYY is the year.
The timezone is -0700 between 30/Oct/1994:01:30:57 and 02/Apr/1995:03:03:26.
For all other requests, the timezone is -0600.
- filename of the requested item. Paths have been removed.
Modified filenames consist of two parts, num.type, where num
is a unique integer identifier, and type is the extension of
the requested file.
- HTTP reply code.
- bytes in the reply.
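As an illustration of the column layout above, here is a minimal Python sketch of a line parser. It assumes the five columns are separated by whitespace, that the timestamp occupies exactly five tokens, and that filenames contain no spaces; the description does not state the delimiters, so check these assumptions against the trace itself. The names Request, parse_line and with_offset are hypothetical, the sample line in the usage note is made up, and treating a non-numeric byte count as None is also an assumption. The timezone rule is applied literally, with both boundary timestamps treated as inclusive.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class Request(NamedTuple):
    host: str            # "local" or "remote"
    timestamp: datetime  # naive local time, 1-second resolution
    filename: str        # anonymized "num.type" name
    status: int          # HTTP reply code
    size: Optional[int]  # bytes in the reply; None if the field is not numeric


# Boundaries of the -0700 period as quoted in the description (assumed inclusive).
MST_FIRST = datetime(1994, 10, 30, 1, 30, 57)
MST_LAST = datetime(1995, 4, 2, 3, 3, 26)


def parse_line(line: str) -> Request:
    """Parse one request line under the assumed whitespace-separated layout."""
    tokens = line.split()
    if len(tokens) != 9:
        raise ValueError(f"expected 9 whitespace-separated tokens, got {len(tokens)}")
    # Tokens 1..5 form the "DAY MON DD HH:MM:SS YYYY" timestamp.
    timestamp = datetime.strptime(" ".join(tokens[1:6]), "%a %b %d %H:%M:%S %Y")
    size = int(tokens[8]) if tokens[8].isdigit() else None
    return Request(tokens[0], timestamp, tokens[6], int(tokens[7]), size)


def with_offset(timestamp: datetime) -> datetime:
    """Attach the UTC offset stated in the description to a naive timestamp."""
    hours = -7 if MST_FIRST <= timestamp <= MST_LAST else -6
    return timestamp.replace(tzinfo=timezone(timedelta(hours=hours)))
```

For example, parse_line("local Mon Oct 24 14:41:17 1994 1.html 200 150") yields a Request whose timestamp can then be passed to with_offset to obtain a timezone-aware value.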
-
Measurement
-
The logs were collected from October 24, 1994 through October 11, 1995,
a total of 353 days. There were 726,739 requests. Timestamps have
1-second resolution.
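The 353-day figure counts both the first and the last day of collection. A short Python check of that arithmetic follows; the mean number of requests per day is derived here and is not stated in the original description, and it assumes requests were logged on every day of the period.

```python
from datetime import date

FIRST_DAY = date(1994, 10, 24)
LAST_DAY = date(1995, 10, 11)
TOTAL_REQUESTS = 726_739

# Add 1 because both endpoints are counted as collection days.
days = (LAST_DAY - FIRST_DAY).days + 1
requests_per_day = TOTAL_REQUESTS / days

print(days)                     # 353
print(round(requests_per_day))  # 2059
```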
-
Privacy
-
The sites making requests to the server have had their addresses removed
to preserve privacy. Paths have been removed. Files were numbered from
1 for the first file encountered in the trace. Files retain the original
file extension, so that the type of file can be determined.
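The numbering scheme described above can be sketched in Python as follows. This is not the tool used to prepare the trace: the class name FileAnonymizer is hypothetical, and the sketch assumes that identifiers are assigned sequentially in order of first appearance, that a repeated path receives the same identifier, and that a file without an extension is represented by its number alone.

```python
import posixpath


class FileAnonymizer:
    """Map each distinct path to "num.type", numbering from 1 in order of first use."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def anonymize(self, path: str) -> str:
        # Assign the next integer the first time a path is seen.
        if path not in self._ids:
            self._ids[path] = len(self._ids) + 1
        # splitext keeps the leading dot, or returns "" when there is no extension.
        extension = posixpath.splitext(path)[1]
        return f"{self._ids[path]}{extension}"
```

With a fresh instance, anonymizing "/index.html", then "/images/logo.gif", then "/index.html" again gives "1.html", "2.gif" and "1.html".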
-
Acknowledgements
-
The logs were collected by Robert Fridman of the University of Calgary
Department of Computer Science, and contributed by Martin Arlitt
(mfa126@cs.usask.ca) and Carey Williamson (carey@cs.usask.ca) of the
University of Saskatchewan.
-
Publications
-
This is one of six data sets analyzed in an upcoming paper by
M. Arlitt and C. Williamson, entitled
``Web Server Workload Characterization: The Search for Invariants'',
to appear in the proceedings of the 1996 ACM SIGMETRICS Conference on
the Measurement and Modeling of Computer Systems, Philadelphia, PA,
May 23-26, 1996. An extended version of this paper is available on-line;
see also the DISCUS home page and the group's publications.
-
Related
-
Permission has been granted to make four of the six data sets discussed
in ``Web Server Workload Characterization: The Search for Invariants''
available. The four data sets are:
Calgary-HTTP,
ClarkNet-HTTP,
NASA-HTTP, and
Saskatchewan-HTTP.
-
Restrictions
-
The trace may be freely redistributed.
-
Distribution
-
Available from the Archive in
ASCII format, 5.4 MB gzip compressed, 52.3 MB uncompressed.