CODATA Euro-American Workshop
Visualization of Information and Data: Where We Are and Where Do We Go From Here?
24-25 June 1997


Web: the Next Platform for Information Visualization?

Audris Mockus
amockus@research.bell-labs.com
www.bell-labs.com/home/~audris
December 22, 1997

Introduction

A widely accepted browser user interface, simple and uniform URL access to information and applications, and the immense amount of information available on the Internet and on intranets make the Web the most desired platform for information visualization. This paper explores the information visualization capabilities of the standard Web platform by describing a Web application for exploring large version control databases.

An important goal of this investigation was to provide visualization tools for software engineers in our organization. The tools had to deal with a complex and frequently changing data source: a version control system of a large software project. This led us to choose standard Web infrastructure, such as common gateway interface (CGI) scripts to process and retrieve the data, HTML and JavaScript [Goodman, 1996] to provide hypertext and form-based interactions, and Java [Flanagan, 1996] applets to provide interactive graphical views.

The paper describes an application for analyzing a large version control database. Software creation and maintenance processes are both complicated and interdependent, which makes their analysis difficult. Important questions about software changes reflect the rich structure of the software development process and involve people (developers, organizations), source code (files, modules, subsystems), software releases (features, minor and major releases), the changes themselves (atomic changes, Maintenance Requests), time frame (hour of the day, day of the week, month of the year, calendar time), and any combination of the above. Answering those questions requires inspecting and comparing profiles of many of the objects involved: developers, modules, and Maintenance Requests (MRs), to name a few. The ProfileView shows 50 to 100K object profiles simultaneously, selects appropriate positions and dynamic visual representations for the objects to support visual classification, and uses zooming and panning to support drill-down inspection.

Classification of Programmers

ProfileView is composed of several applets: ProfileView, Eye, and control applets appearing as choice widgets in Figure 1. The applets are composed on a page using HTML. The version control database is accessed by specifying the relevant URL (pointing to a CGI script) as a parameter to the applets. The ProfileView applet at left shows all the profiles, the Eye applet at top right is used to select relevant data subsets, and control applets at bottom right are embedded within explanatory text and are used as menus to select the type of visual representation and other parameters.

The ProfileView applet shows profiles of 51 objects corresponding to developers. The subset of 51 developers (out of 509) was selected because of the difficulty of showing all developers on a small 3 by 6 inch static image. The data for each developer represent daily activity (hourly numbers of changes, in CDT) collected over a 12-year period. The clock icon is chosen to represent the hourly count profiles. It shows the trace that a 24-hour clock would draw with the end of its hour hand if the length of that hand represented the number of changes made during the particular hour. Other iconic representations may be more appropriate for different types of data (e.g., a time series icon for calendar time). To facilitate the classification task, the icons are ordered by the similarity of their appearance to the selected developer at the bottom left. To preserve the linear ordering in the two-dimensional window, the icons are laid out along a space-filling curve [Peano, 1890]. To make the visual representation robust to extreme values, a square root transformation of the counts is used. To adjust for the total time each developer was making changes (not all the developers were present continuously during the 12-year period considered), each profile is scaled individually. The main result of the classification task is the peak close to midnight (appearing as a panhandle) common to most developers. Some developers do not submit changes from 9AM to 1PM (first from left on the second row from top) and some submit changes only during this short period (top left).
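
The clock icon geometry described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the ProfileView source: the class name ClockIcon and its method names are hypothetical, and scaling each profile by its own maximum is an assumption, since the text says only that each profile is scaled individually.

```java
/** Hypothetical helper: turns hourly change counts into clock-icon vertices. */
final class ClockIcon {
    private ClockIcon() {}

    /**
     * Square-root transform followed by per-profile scaling to [0, 1].
     * Scaling by the profile's own maximum is an assumption of this sketch.
     * An all-zero profile stays all zero.
     */
    static double[] scaledRadii(int[] hourlyCounts) {
        double[] radii = new double[hourlyCounts.length];
        double max = 0.0;
        for (int h = 0; h < hourlyCounts.length; h++) {
            radii[h] = Math.sqrt(hourlyCounts[h]); // damp extreme values
            max = Math.max(max, radii[h]);
        }
        if (max > 0.0) {
            for (int h = 0; h < radii.length; h++) {
                radii[h] /= max; // each profile is scaled individually
            }
        }
        return radii;
    }

    /**
     * Vertices of the trace drawn by the tip of a clock hand whose length is
     * the scaled count for that hour. Hour 0 points up, hours advance
     * clockwise, and screen y grows downward.
     */
    static double[][] vertices(int[] hourlyCounts, double cx, double cy, double size) {
        double[] radii = scaledRadii(hourlyCounts);
        int n = radii.length;
        double[][] pts = new double[n][2];
        for (int h = 0; h < n; h++) {
            double angle = 2.0 * Math.PI * h / n;
            pts[h][0] = cx + size * radii[h] * Math.sin(angle);
            pts[h][1] = cy - size * radii[h] * Math.cos(angle);
        }
        return pts;
    }
}
```

Calling ClockIcon.vertices(counts, cx, cy, size) with a 24-element array yields 24 points that can be joined into a closed polygon. A developer who works mostly around midnight produces a polygon stretched toward the top of the icon.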

In addition to visual classification tasks, the visualization is designed to answer questions concerning the group behavior and aggregate properties of the objects. The larger rectangular window in the Eye applet (Figure 1) represents the 24 hours of the day on the horizontal axis and three maintenance activities (fault fixes, new functionality, and cleanup) on the vertical axis. The iris, represented by a smaller light window within the Eye applet, selects the numbers of changes made between hours 10 and 12 for fault fixes and new functionality. The icons are colored by the busiest hour because the maximum is chosen as the aggregation function. Moving and resizing the iris animates the icons in the ProfileView applet to show the behavior of all profiles.
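
One way to read the iris selection with the maximum as aggregation function is sketched below in Java. The class IrisSelection, the counts[activity][hour] layout, and the decision to sum the selected activities within each hour before taking the maximum are all assumptions made for illustration. The text does not specify how the applets store or combine the data.

```java
/** Hypothetical sketch of an iris selection over an activity-by-hour grid. */
final class IrisSelection {
    private final int hourFrom;     // first selected hour (inclusive)
    private final int hourTo;       // last selected hour (exclusive)
    private final int activityFrom; // first selected activity row (inclusive)
    private final int activityTo;   // last selected activity row (exclusive)

    IrisSelection(int hourFrom, int hourTo, int activityFrom, int activityTo) {
        this.hourFrom = hourFrom;
        this.hourTo = hourTo;
        this.activityFrom = activityFrom;
        this.activityTo = activityTo;
    }

    /**
     * Returns the selected hour with the most changes, summed over the
     * selected activities, or -1 if the selection contains no changes.
     * counts[a][h] is the number of changes of activity a made in hour h.
     * Ties are resolved in favor of the earlier hour.
     */
    int busiestHour(int[][] counts) {
        int bestHour = -1;
        int bestTotal = 0;
        for (int h = hourFrom; h < hourTo; h++) {
            int total = 0;
            for (int a = activityFrom; a < activityTo; a++) {
                total += counts[a][h];
            }
            if (total > bestTotal) {
                bestTotal = total;
                bestHour = h;
            }
        }
        return bestHour;
    }
}
```

With rows 0, 1, and 2 standing for fault fixes, new functionality, and cleanup, new IrisSelection(10, 12, 0, 2) corresponds to the iris in Figure 1. The returned hour could then be mapped to a color for each developer's icon, and recomputing it whenever the iris moves gives the animation effect.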


Figure 1: Screen shot of the browser window with ProfileView at left, Eye at top right, and control applets within explanatory text.

Discussion of the Web Platform

Implementing the ProfileView application on the Web platform raised two main issues: the relatively low bandwidth of the Web's client-server architecture and compliance with the browser's style of user interface. Web users expect multi-threaded applications that can show the data as it arrives, hence the views must know how to display only part of the data. This requires receiving meta-data, such as headers and other information, before the data itself. Although an applet is conceived as a stand-alone application, constructing a page from many communicating component applets can save download time by sharing the same data between applets, and it simplifies prototyping. In effect, HTML becomes a scripting language for composing multiple views into one page or into multiple pages. In our case we used static members of a class to share the data and to communicate between different applets on the same page or on different pages.
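
The idea of sharing data through static members, while letting views draw partial data as it arrives, can be sketched as follows. This is an illustrative Java sketch written in current Java syntax, not the code used in ProfileView: the class SharedProfiles, its View interface, and all method names are hypothetical. It also assumes that the cooperating applets are loaded by the same class loader, since static members are shared only within one loaded copy of a class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared store whose static members are seen by all views. */
final class SharedProfiles {
    /** Callback implemented by each view that wants to redraw partial data. */
    interface View {
        void rowsAvailable(int rowCount);
    }

    private static final List<int[]> rows = new ArrayList<>();
    private static final List<View> views = new ArrayList<>();
    private static String[] header; // meta-data that arrives before the rows

    private SharedProfiles() {}

    static synchronized void setHeader(String[] columns) {
        header = columns.clone();
    }

    static synchronized String[] header() {
        return header == null ? null : header.clone();
    }

    static synchronized void register(View view) {
        views.add(view);
    }

    /** Called by the loading thread for each row as it arrives. */
    static void addRow(int[] row) {
        List<View> toNotify;
        int count;
        synchronized (SharedProfiles.class) {
            rows.add(row.clone());
            count = rows.size();
            toNotify = new ArrayList<>(views);
        }
        // Notify outside the lock so a slow view cannot block the loader.
        for (View view : toNotify) {
            view.rowsAvailable(count);
        }
    }

    /** Copy of the rows received so far, safe to draw from another thread. */
    static synchronized List<int[]> snapshot() {
        return new ArrayList<>(rows);
    }
}
```

A single loader thread would call setHeader once and then addRow for each record read from the CGI script's response, while each view registers itself and repaints from snapshot() whenever it is notified. The data is downloaded once no matter how many views display it.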

The main advantage from a tool developer's perspective is portability across hardware and operating system platforms. This drastically reduces development and maintenance effort and cost. Java is a high-level language with graphics (AWT), network, and multitasking capabilities. Different applications are easy to create by scripting with HTML or JavaScript and using the ProfileView, Eye, and control applets together with CGI scripts that access the version control database.

Summary

This paper describes a visual classification tool implemented on the Web platform. The tool has been applied to classify the daily activities of different programmers in a large software development project. The main results are a demonstration that the Web is a feasible platform for information visualization and the introduction of a tool for visual classification of complex objects and behaviors.

References

[Flanagan, 1996] Flanagan, D. (1996). Java in a Nutshell. O'Reilly & Associates, Sebastopol, CA.

[Goodman, 1996] Goodman, D. (1996). JavaScript Handbook. IDG Books Worldwide, Inc., Foster City, CA.

[Peano, 1890] Peano, G. (1890). Sur une courbe, qui remplit toute une aire plane. Mathematische Annalen, 36:157-160. English translation in Selected Works of Giuseppe Peano (1973). Hubert C. Kennedy, Ed., University of Toronto Press.

