HCI'98 Short Paper. September 1-4, 1998. Sheffield, UK.
SiteSeer: An Interactive Treeviewer for Visualizing Web Activity
Eric Sigman, Robert Farrell, and Mark Rosenstein
Bellcore
445 South St., Morristown NJ
07960, U.S.A.
ABSTRACT
SiteSeer is an interactive visualization tool designed to support the
work of web site analysts in such tasks as understanding web site
traffic patterns and effective advertisement placement. SiteSeer
integrates visualization of the content, structure, and utilization
aspects of a web space.
KEYWORDS
Visualization, World-Wide Web, Data Filtering, Fisheye View, Traffic Analysis
INTRODUCTION
The growth of commercial web sites has created a need for tools that
help with monitoring a site, restructuring it, and introducing new
content. These tasks depend on the user's knowledge of the site's
content and organization and of its visitors' current usage
patterns. SiteSeer is a prototype visualization tool designed for web
analysis. In essence, the tool is an interactive tree viewer that uses
a variety of techniques for visual emphasis, focusing, and information
filtering. This paper describes the tool and its application to two
tasks: understanding site traffic patterns and the effective placement
of advertisements.
EARLIER WORK
SiteSeer is an outgrowth of our work with AMIT (Animated Multiscale
Interactive Treeviewer) (Wittenburg & Sigman, 1997a, 1997b), and
retains many of its basic features. AMIT is a tool aimed at
integrating search and browsing on the World-Wide Web. It presents a
web space as a tree structure. Font scaling and tree pruning are used
to provide multifoveal fisheye views (Furnas, 1986), and animation
provides transitions between the user's customized views. AMIT has
been deployed for a web space of over 12,000 documents
(http://www.apparent-wind.com/sailing-page.html).
Initially, an off-line "web walker" collects documents by following
the outgoing link structure from a designated root node. The walker
generates a directed graph of that space, and then the system
represents this graph as a tree structure. In AMIT, the titles for
these documents are presented as nodes in a tree. The text collected
by the web walker is indexed by the Latent Semantic Indexing (LSI)
(Deerwester et al., 1990) module. At runtime, a user's query to
an LSI-based search engine returns a list of document hits along with
relevancy scores. AMIT generates a view of the tree pruned to show the
hits exceeding a threshold; the relevancy scores are reflected in the
font size used to render the node. Users customize the tree view
through direct manipulation. For example, users can select a set of
nodes as foci for a succeeding view. The new view will be reduced to
the selected nodes and their paths to the root.
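
The steps described above (link graph, tree, pruned view) can be sketched roughly as follows. This is a minimal illustration and not AMIT's actual implementation: the paper does not say how the graph is reduced to a tree, so a breadth-first spanning tree is assumed here, and the function names, the score threshold, the linear font-size mapping, and the toy data are all hypothetical.

```python
from collections import deque

def graph_to_tree(links, root):
    """Reduce a directed link graph to a tree (assumption: breadth-first
    spanning tree, so each document keeps the first parent that reaches it)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in parent:      # skip documents already placed in the tree
                parent[target] = node
                queue.append(target)
    return parent

def prune_to_foci(parent, foci):
    """Keep only the focus nodes and the nodes on their paths to the root."""
    keep = set()
    for node in foci:
        while node is not None and node not in keep:
            keep.add(node)
            node = parent[node]
    return keep

def font_size(score, min_pt=8, max_pt=24):
    """Map a relevancy score in [0, 1] linearly onto a font size (hypothetical scale)."""
    score = max(0.0, min(1.0, score))
    return round(min_pt + score * (max_pt - min_pt))

# Toy link graph: page -> pages it links to (hypothetical data).
links = {
    "home": ["boats", "races"],
    "boats": ["sails", "home", "knots"],
    "races": ["sails", "results"],
}
parent = graph_to_tree(links, "home")

scores = {"sails": 0.9, "results": 0.4, "boats": 0.1}    # stand-in for LSI scores
hits = [doc for doc, s in scores.items() if s >= 0.3]     # threshold is an assumption
view = prune_to_foci(parent, hits)                        # "knots" is pruned away
sizes = {doc: font_size(scores[doc]) for doc in hits}
```

The same `prune_to_foci` step covers both cases in the text: the foci can be search hits above a threshold or nodes the user selected by hand.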
SITESEER FOR WEB SITE ANALYSIS
SiteSeer extends AMIT to encompass a repository of traffic and ad
presentation data. This data is posted against the tree of hyperlinked
documents. For example, heavily trafficked documents are represented
by larger nodes. In this way, regions or pathways with high
utilization become readily apparent. Often, users are concerned with
characteristics of the traffic, such as the originating site, type of
site, or day of the week. SiteSeer provides easy-to-use filters to
extract this data through point-and-click dialog boxes.
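
As a rough sketch of how filtered traffic might be posted against the tree, the example below filters a request log by one of the characteristics named above and converts per-page counts into node sizes. The `Hit` record, its field names, the linear scaling, and the sample log are assumptions made for illustration; the paper does not describe SiteSeer's data model.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    """One page request; the field names are illustrative assumptions."""
    page: str
    origin: str       # originating site
    site_type: str    # type of site, e.g. "com" or "edu"
    weekday: str      # day of the week, e.g. "Mon"

def filter_hits(hits, **criteria):
    """Keep the hits whose fields equal every given criterion."""
    return [h for h in hits
            if all(getattr(h, field) == value for field, value in criteria.items())]

def node_sizes(hits, min_size=1.0, max_size=4.0):
    """Scale each page's node linearly by its traffic relative to the busiest page."""
    counts = Counter(h.page for h in hits)
    if not counts:
        return {}
    busiest = max(counts.values())
    return {page: min_size + (max_size - min_size) * n / busiest
            for page, n in counts.items()}

# Hypothetical log entries.
log = [
    Hit("/home", "search.example", "com", "Mon"),
    Hit("/home", "uni.example", "edu", "Mon"),
    Hit("/sports", "uni.example", "edu", "Tue"),
    Hit("/home", "uni.example", "edu", "Tue"),
]
edu_sizes = node_sizes(filter_hits(log, site_type="edu"))
```

Filters combine by passing several keyword arguments, for example `filter_hits(log, site_type="edu", weekday="Tue")`.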
SiteSeer was applied to a site where advertisements were dynamically
served to visitors. In this case, analysts want to know both how
frequently ads are viewed throughout the site and how effective they
are, as measured by the number of visitors clicking on the ad
banner. Typically, these analysts are seeking optimal advertisement
placement. A visualization that combines structure and traffic
supports this task. The user can formulate queries to filter ad data
by various parameters including those available to filter the traffic
data.
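
The two ad measures mentioned above, how often ads are viewed and how often they are clicked, can be combined per page into a click-through rate. The sketch below assumes a simple `(page, kind)` event format and made-up data; it illustrates the calculation only and is not taken from SiteSeer.

```python
from collections import defaultdict

def ad_stats(events):
    """Aggregate (page, kind) ad events into views, clicks, and click-through rate.

    `kind` is "view" when an ad was served on the page and "click" when a
    visitor clicked its banner; this event format is an assumption.
    """
    views = defaultdict(int)
    clicks = defaultdict(int)
    for page, kind in events:
        if kind == "view":
            views[page] += 1
        elif kind == "click":
            clicks[page] += 1
    # Only pages with at least one view are reported, so the division is safe.
    return {page: {"views": v, "clicks": clicks[page], "ctr": clicks[page] / v}
            for page, v in views.items()}

# Hypothetical events.
events = [
    ("/home", "view"), ("/home", "view"), ("/home", "view"), ("/home", "view"),
    ("/home", "click"),
    ("/sports", "view"), ("/sports", "view"),
    ("/sports", "click"),
]
stats = ad_stats(events)
best = max(stats, key=lambda page: stats[page]["ctr"])
```

In this toy data the home page shows more ads, but the sports page converts a larger share of its views into clicks, which is the kind of contrast a combined structure-and-traffic view is meant to expose.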
An important feature is the sequential visualization of a query
chain. Here, a view of the tree that results from a query can serve as
input to a subsequent query. For example, a user could first query for
the most frequently accessed documents, and then, holding that
structure fixed, query for ad data that would be overlaid on the
current view of the tree. An even more interesting example utilizes
the LSI search engine. Since LSI rates the similarities among the
documents, it is possible to create a view based on documents that
have similar content. A subsequent chained query can then be overlaid
on this view. Thus, for example, a user could request pages with
sports-related content, and then overlay traffic data to discover
promising regions of the site for placing sneaker ads.
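
A query chain of this kind can be pictured as function composition: the first query selects a set of pages, and the second attaches data to that set without changing it. The sketch below is a simplified illustration under that assumption. The function names and data are hypothetical, the similarity scores merely stand in for what an LSI engine would return, and a real view would also retain each page's path to the root.

```python
def most_visited(traffic, top_n):
    """First query: pick the top_n most frequently accessed pages."""
    ranked = sorted(traffic, key=traffic.get, reverse=True)
    return set(ranked[:top_n])

def similar_pages(similarity, threshold):
    """Content query: pages whose similarity score meets the threshold."""
    return {page for page, s in similarity.items() if s >= threshold}

def overlay(view, data):
    """Chained query: attach data to the pages already in the view, holding
    the set of pages fixed. Pages without data get None."""
    return {page: data.get(page) for page in view}

# Hypothetical data.
traffic = {"/home": 900, "/scores": 400, "/shoes": 350, "/about": 20}
ad_clicks = {"/home": 12, "/shoes": 30}
sports_similarity = {"/scores": 0.8, "/shoes": 0.6, "/about": 0.1}

# Chain A: most visited pages, then ad data overlaid on that view.
busy_with_ads = overlay(most_visited(traffic, 2), ad_clicks)

# Chain B: sports-related pages, then traffic overlaid to find placement candidates.
sports_traffic = overlay(similar_pages(sports_similarity, 0.5), traffic)
```

Because `overlay` never adds or removes pages, the structure chosen by the first query stays fixed while different data sets are swapped in on top of it.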
From the experience with SiteSeer, perhaps the most interesting future
direction is to consider the issues of "path analysis." SiteSeer is
limited to showing access to documents along a structural path, and
does not show the actual traversal behavior of visitors. A
visualization of these traversals would likely help answer questions
about how people actually navigate the site, and so help improve the
site for visitors.
REFERENCES
Deerwester, S., Dumais, S.T., Landauer, T.K., Furnas, G.W. and
Harshman, R.A. (1990) Indexing by latent semantic
analysis. Journal of the American Society for Information Science, 41(6).
Furnas, G.W. (1986) Generalized fisheye views. In the
Proceedings of Human Factors in Computing Systems, CHI '86. (Boston,
MA, April).
Wittenburg, K. and Sigman, E. (1997a) Integration of Browsing,
Searching, and Filtering in an Applet for Web Information
Access. In the Proceedings of Human Factors in Computing Systems,
CHI '97. (Atlanta, GA, March).
Wittenburg, K. and Sigman, E. (1997b) Visual Focusing Techniques in
a Treeviewer for Web Information Access. In the Proceedings of the
IEEE Symposium on Visual Languages, VL '97. (Capri, Italy, September).