The Partial Occlusion Effect: Utilizing Semi-transparency in 3D Human Computer Interaction

SHUMIN ZHAI, WILLIAM BUXTON, PAUL MILGRAM

University of Toronto

zhai@ie.toronto.edu buxton@dgp.toronto.edu milgram@ie.toronto.edu

ABSTRACT

This study investigates user performance when using semi-transparent tools in interactive 3D computer graphics environments. We hypothesize that when the user moves a semi-transparent surface in a 3D graphic display, the partial occlusion effect introduced through semi-transparency acts as an effective cue in target localization — an essential component in many 3D manipulation tasks. This hypothesis was tested in a controlled experiment in which subjects were asked to acquire dynamic 3D targets (virtual fish) with a 3D cursor. In the experiment, cursors with and without semi-transparent surfaces were compared in monoscopic and stereoscopic displays. Statistically significant effects for trial completion time, error rate and error magnitude were observed for stereopsis and partial occlusion. The partial occlusion cue was effectively used by subjects in both monoscopic and stereoscopic displays. It was no less effective than stereopsis for successful 3D target acquisition. Subjects' performance in each of the conditions improved with learning, but their relative ranking remained the same. Subjective evaluations also supported the conclusions drawn from performance measures. The experimental results and their implications are discussed, with emphasis on the relative, discrete nature of the semi-transparency cue and on interactions between depth cues. The paper concludes with a review of a number of existing and potential future applications of semi-transparency in human computer interaction.

Categories and Subject Descriptors: H.1.2 [Models and Principles]: User/Machine Systems - human factors; H.5.2 [Information Interface and Presentation]: User Interfaces-input devices and strategies, interaction styles; I.3.6 [Computer Graphics]: Methodology and Techniques-interaction techniques; I.3.7 [Computer Graphics]: Three dimensional Graphics and Realism-virtual reality.

General Terms: Human Factors, Experimentation, Design, Measurement.

Additional Key Words and Phrases: semi-transparency, translucency, partial occlusion, stereopsis, depth perception, 3D interfaces.

1. INTRODUCTION

With the advent of modern workstations and increasingly high performance personal computers, applications requiring flexible manipulation of 3D data are moving from restricted domains to mainstream applications. Examples of such applications include information visualization [Card, Robertson, and Mackinlay 1991], virtual environments and telepresence [Ellis, Kaiser, and Grunwald 1991; Zeltzer 1992], computer-aided design [Majchrzak, Chang, Barfield, Eberts, and Salvendy 1987], telerobotics [Sheridan 1992] and entertainment. With this move to 3D, however, we see a breakdown in many of the interaction techniques that have traditionally been used in 2D direct manipulation systems. Tasks such as target acquisition, positioning, dragging, pursuit tracking, sweeping out regions, orienting, and navigating present new challenges to the interface designer. In response to these changes, a body of research is developing which is beginning to address some of these interaction issues. Representative examples are found in [Evans, Tanner, and Wein 1981; SIGGRAPH 1986; Chen, Mountford, and Sellen 1988; Mackinlay, Card, and Robertson 1990; Ware 1990; Zhai and Milgram 1993; Jacob, Sibert, McFarlane, and Mullen 1994; Hinckley, Pausch, Goble, and Kassell 1994a].

One of the key challenges in 3D interface design is to effectively reveal spatial relationships among objects within a 3D space, particularly in the depth dimension, so that the user can perceive, locate and manipulate such objects with respect to each other effortlessly. This paper addresses one particular 3D mechanism, the partial occlusion effect, which can be introduced by the use of semi-transparent surfaces as a means of improving 3D interaction performance. After a brief review of various depth cues in human perception and their exploitation in corresponding 3D display techniques, the paper presents a formal experimental study of the semi-transparency effect in a 3D manipulation task. The experimental results are discussed with particular emphasis on the semi-transparency characteristics and the modeling of multiple depth cues. Finally, some existing and future potential applications of the interactive semi-transparency effect are described.

2. PRESENTING DEPTH INFORMATION

A variety of techniques are commonly used in computer interfaces for presenting 3D information. Almost all of these techniques can be linked to the depth cues identified in psychological research on human perception in the natural environment [see Haber and Hershenson 1973; Kaufman 1974; Wickens, Todd, and Seidler 1989; McAllister 1993 for reviews of depth cue theory]. The most commonly exploited depth cues include occlusion, perspective, shadows, texture, binocular disparity, motion parallax and active movement. To put the study of semi-transparency into perspective, we briefly review these depth cues and their applications to 3D computer interfaces.

Occlusion (also called interposition) is one of the most dominant cues in depth perception. Objects appearing closer to the viewer occlude other objects which are further away from the viewer. In 3D computer graphics, the importance of occlusion has long been recognized, most commonly through the use of hidden line/surface removal techniques.

Stereopsis, produced from binocular disparity when viewing 3D objects in natural environments, is a strong depth cue, particularly when the perceived objects are relatively close to the viewer [Yeh 1993]. Various techniques have been devised to create stereopsis on a 2D screen [Arditi 1986; McAllister 1993]. Currently, the most common method uses liquid-crystal time-multiplexed shuttering glasses. The effectiveness of stereoscopic displays strongly depends on the particular experimental task to which they are applied and on technical implementation variables such as shutter frequency and the binocular geometric model.

Perspective and relative size cues, which account for objects further away producing smaller retinal images than closer objects, are commonly exploited in 3D graphics [Foley, van Dam, Feiner, and Hughes 1990]. Perspective cues are particularly effective when the displayed scene has parallel lines, as noted in [Brooks 1988].

Operating on the same principle as in perspective and size cues, the densities of surface features (texture) increase for more distant surface elements. Texture cues are therefore also described as detail perspective [Kaufman, 1974].

The shadow of a 3D object is also often an effective depth cue. Herndon and colleagues [Herndon, Zeleznik, Robbins, Conner, Snibbe, and van Dam 1992], for example, explicitly exploit shadows for 3D interaction. In their design, shadows are projected on walls and floors of a 3D environment so that the user can control object movement in each dimension selectively by choosing and moving the shadows. The use of shadows is also an important element of the information visualization display proposed by Robertson, Mackinlay, and Card [1991].

Motion parallax. When an object moves in space relative to an observer, the resulting motion parallax produces a sensation of depth. This effect is also frequently exploited in graphical displays. For example, Sollenberger and Milgram [1993] showed the usefulness of the kinetic depth effect in graphically visualizing the connectivity of complex structures such as blood vessels in the brain.

Active movement. Depth information obtained by actively altering a viewer’s own viewpoint is often referred to as a movement cue. Motivated by the Gibsonian ecological approach, Smets and colleagues [Smets 1992; Overbeeke and Stratmann 1988] demonstrated the advantages of the active observer, for whom images on a screen were drawn according to tracked head movements, in comparison with the passive subject, whose head movements were not coupled to the displayed image. In a path-tracing experiment, Arthur and colleagues [Arthur, Booth, and Ware 1993; Ware and Arthur 1993] found that while subjects’ task completion times with a head-tracking display and a stereoscopic set-up were similar, their error rates were significantly lower in the head-tracking condition.

As we can see, many of the depth cues have been carefully investigated and consciously applied to graphical displays. The relative strengths of various depth cues have also been studied. In one early cue conflict study, Schriever [1925] compared the relative influences of binocular disparity, perspective, shading and occlusion, and showed, among other things, the dominance of occlusion over disparity information. Edge-occlusion domination was also reported in [Braunstein, Anderson, Rouse and Tittle, 1986]. More recently, Wickens, Todd and Seidler [1989], in a review of the depth combination literature, concluded that motion, disparity and occlusion are the most powerful depth cues for computer displays.

We noticed that yet another phenomenon, partial occlusion, produced by semi-transparent surfaces can also be a strong depth cue. Whenever a semi-transparent surface overlaps another object, the viewer will see the overlapped object in lower contrast (partially occluded) (Figure 1). A typical example of this phenomenon in everyday life is the silk stocking; hence we also refer to the partial occlusion phenomenon as the "silk" effect.

Figure 1. Portions of an object appearing in front of (the protruding fin) or behind the semi-transparent "silk" surface are perceived as such according to their different levels of contrast.

The partial occlusion effect due to semi-transparency is closely related to the total occlusion (interposition) cue. Although occlusion is the dominant cue in depth perception, it is often difficult to use in 3D interaction tasks, because distal objects are completely obscured by the proximal, opaque surface, leaving the user in uncertainty about what objects are in the background. A semi-transparent surface, on the other hand, allows the user to see objects both in front of and behind it. The research question here is whether partial occlusion is still a depth cue that can be readily perceived by human viewers. In other words, can viewers easily comprehend the depth relation between a semi-transparent surface and other objects that are in front of or behind it? Answers to such questions are not readily available in the literature, possibly because semi-transparency is not experienced very commonly in the natural environment.

We hypothesize that human viewers can perceive the depth position of a semi-transparent surface in relation to other objects, due to the fact that objects in front of the surface appear at different contrast levels compared with objects behind it. Furthermore, we believe that this relative, discrete depth cue is particularly useful in 3D interaction, because as users gradually move a semi-transparent surface through an object, they can perceive the immediate and continuous change in the object's appearance. This suggests a potentially powerful mechanism for users to locate objects in 3D interaction tasks.

It is also important to note that semi-transparency is relatively easy to implement with today’s computer systems, which further increases the justification for its careful study and wider application in computer interfaces. In fact, numerous examples of applying semi-transparency can already be found in HCI designs. Some of these applications will be reviewed in section 5. The effectiveness of such applications has seldom been studied formally, however. Is the hypothesis that partial occlusion is a useful depth cue true? If so, how powerful is it expected to be, relative to other commonly used 3D techniques such as stereoscopic presentation? What are some of the characteristics, limitations and constraints of semi-transparency? In order to answer some of these questions, we carried out a quantitative experimental study on the use of semi-transparency for manual interaction in a 3D environment.

3. METHOD

3.1 Experimental Task

In each trial of the experiment, a graphically rendered angel fish moved randomly in x, y, and z dimensions (no rotations) within a 3D virtual environment (Figure 2). Subjects were asked to manipulate a 3D cursor (Figure 3), with or without the silk surface, to envelop the fish and to "grasp" it when the fish was perceived to be completely inside the cursor. Subjects wore a special glove (Figures 2 and 4) as an input device (section 3.1.3). Grasping was done simply by closing the hand naturally. If the fish was entirely inside the cursor volume, the trial was successful and the fish stayed "caught." The time score of each trial was displayed to the subjects, along with a short beep. If the fish was not completely inside the cursor when grasped, the fish disappeared. In this case (considered a "miss"), a long beep was sounded and error magnitudes in each of the x, y, and z dimensions were displayed, along with the message "Missed!". Each new trial was activated when the subjects pressed the spacebar on the keyboard. Subjects were instructed to complete each trial as quickly as possible and catch as many fish as possible (making as few errors as possible). No preference was given to either speed or accuracy.

Figure 2. The experimental set-up.

Figure 3. Use of a "silk" covering over a rectangular volume cursor in order to obtain occlusion-based depth cues. An object at point A is seen through two layers of "silk", and thus is perceived to be behind the cursor. An object at point B is seen through only one layer and thus is perceived as inside the cursor's volume. An object at point C is not occluded by the silk at all, and so is seen to be in front of the volume cursor.

Figure 4. The input glove.

Although presented as a game (which was greatly enjoyed by the subjects), the “virtual fishing” task is essentially a 3D dynamic target acquisition task, comprising both perception and manipulation in 3-space. Note that in this study target acquisition was taken as an experimental scenario to test user performance in perceiving and positioning objects in 3D, which are often fundamental elements in many of the 3D interaction tasks, such as acquiring objects, moving, dragging, rotating, stretching or sculpting them, or navigating along a desired trajectory. The silk cursor is not necessarily a practical 3D target selection technique in the narrow sense. Designating or selecting 3D targets as a practical task does not necessarily require much depth information, and there are simple and effective techniques for that purpose. For instance, the subject can easily "shoot" at a graphical fish with a line of ray trace, or a virtual "spotlight" as described in [Liang and Green, 1994].

3.1.1 Experimental Platform. The experiment was conducted using the MITS (Manipulation In Three Space) system developed by the authors. MITS is a desktop stereoscopic virtual environment, developed in C and making use of the GL graphics library. The experiment described here was carried out on an SGI IRIS Crimson/VGX graphics workstation. The MITS system automatically records a broad range of information during the experiment, so that an experiment can be entirely "replayed" afterwards if required. MITS also manages the timing and execution of the experiment, including presentation of instructions to subjects, so that experiments can be run with minimal interference or bias from the experimenter.

The origin of the {x, y, z} coordinates of the MITS virtual environment was located at the center of the computer screen surface, with the positive x axis pointing to the right, the y axis pointing upwards and the z axis pointing towards the viewer. All objects were drawn using perspective projection and were modeled in units of centimeters, where 1 cm in the virtual fish tank corresponded to 1 cm in the real world for any line segment appearing within the same plane as the surface of the screen. The graphics update rate was controlled at 15 Hz in this experiment.

3.1.2. The targets and their motion. Each of the targets (“angel fish”) used in this experiment had a flat body, except for two fins and two eyes protruding from the body (Figure 1). The angle between any fin and the body was 30 degrees. The size and color of the fish changed from trial to trial. The x (from lips to tail), y (vertical) and z (from left fin tip to right fin tip) dimensions of the largest (“adult”) fish were 10 cm, 15 cm and 1.3 cm respectively. The smallest (“baby”) fish was 30 percent of the size of the largest adult fish.

The fish movements were driven by independent forcing functions in the x, y and z dimensions. Such inputs, based on suitable combinations of sinusoidal functions, generate smooth and subjectively unpredictable motion, and are employed quite frequently in manual tracking research [Poulton 1974]. In this experiment, the particular forcing functions applied to the fish motions were:

where t was the time from the beginning of each test (see section 3.3 on experimental design and procedure for the definition of a test), A = 4.55 cm, p = 2, and f0 = 0.02 Hz. The phase terms for the x, y and z dimensions (i = 0, 1, ..., 5) were pseudo-random numbers, ranging uniformly between 0 and 2π. This design resulted in fish motions which were sufficiently unpredictable to the subjects and different from trial to trial, but repeatable for each test and between experimental conditions.
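The sketch below illustrates one way such a forcing function could be generated. The exact equations are not reproduced in this text, so the form used here is an assumption: a sum of six sinusoids whose amplitudes fall as A·p^(-i) and whose frequencies rise as f0·p^i, which is a common construction in manual tracking work and is compatible with the parameters quoted above. The function names, the seeding scheme and the 0 to 2π phase range are likewise illustrative choices, not the authors' implementation.

```python
import math
import random

A_CM = 4.55      # amplitude A quoted in the text, in cm
P = 2.0          # ratio p quoted in the text
F0_HZ = 0.02     # base frequency f0 quoted in the text, in Hz
N_TERMS = 6      # i = 0, 1, ..., 5


def make_phases(seed):
    """Draw one pseudo-random phase per term, uniform between 0 and 2*pi."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N_TERMS)]


def forcing(t, phases):
    """Assumed form: sum over i of A * p**-i * sin(2*pi*f0*p**i * t + phase_i)."""
    return sum(
        A_CM * P ** -i * math.sin(2.0 * math.pi * F0_HZ * P ** i * t + phases[i])
        for i in range(N_TERMS)
    )


def fish_position(t, seed=0):
    """Independent x, y, z displacements, each with its own seeded phase set."""
    return tuple(forcing(t, make_phases(seed * 3 + axis)) for axis in range(3))
```

Because the phases come from a seeded generator, calling `fish_position` with the same seed reproduces the same path, which mirrors the requirement that motion be repeatable for each test and between conditions while still looking unpredictable to the subject.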

3.1.3 The cursor and the input. The cursor used to capture the fish was a rectangular box of size 11.3 cm, 16.3 cm and 2.6 cm in x, y and z dimensions respectively (Figure 3). Two versions of the cursor were used in the experiment. One was a wireframe cursor that had no surfaces (totally transparent, see Figure 5) and the other was a silk cursor (Figures 1 and 6-8). The silk cursor had exactly the same geometry as the wireframe cursor but its surfaces were all semi-transparent. The intensity, $I$, of the semi-transparent surface was rendered by blending the cursor color (source) intensity, $I_s$, with the destination color intensity, $I_d$ [Foley, et al. 1990], according to:

$$I = \alpha I_s + (1 - \alpha) I_d$$

Although $I_s$ was chosen to be white in this experiment, different color compositions may be more suitable for other particular applications.

If the blending coefficient $\alpha = 1$, the cursor is totally opaque and therefore completely occludes objects behind it. If $\alpha = 0$, the cursor is totally transparent and no partial occlusion cues are available. On the basis of pilot experiments, we determined a suitable coefficient of $\alpha = 0.38$ for all surfaces of the cursor, except for the back surface, which was set at $\alpha = 0.6$. These values resulted in partial occlusion states (i.e., in front, between two layers, and behind two layers of silk surface) that were judged to be satisfactorily distinguishable.
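To make the three partial occlusion states concrete, the following sketch computes the intensity that would reach the viewer for an object in front of, inside, or behind the cursor. It assumes the standard interpolated-transparency rule (coefficient times source plus one minus coefficient times destination), intensities normalized to the range 0 to 1, and a line of sight that passes only through the front and back faces; the function names and the sample object intensity of 0.2 are hypothetical.

```python
ALPHA_FRONT = 0.38    # coefficient for all cursor surfaces except the back
ALPHA_BACK = 0.60     # coefficient for the back surface
SILK_INTENSITY = 1.0  # white source intensity, normalized to [0, 1]


def blend(alpha, source, destination):
    """Interpolated transparency: alpha * source + (1 - alpha) * destination."""
    return alpha * source + (1.0 - alpha) * destination


def seen_intensity(object_intensity, state):
    """Intensity reaching the viewer for an object in a given depth state.

    state is "front" (no silk layer), "inside" (front layer only),
    or "behind" (back layer first, then front layer).
    """
    value = object_intensity
    if state == "behind":
        value = blend(ALPHA_BACK, SILK_INTENSITY, value)
    if state in ("inside", "behind"):
        value = blend(ALPHA_FRONT, SILK_INTENSITY, value)
    return value


for state in ("front", "inside", "behind"):
    print(state, round(seen_intensity(0.2, state), 4))
```

Under these assumptions a dark object is progressively washed toward white as it passes behind one and then two layers, which is the stepwise change in appearance that the partial occlusion cue relies on.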

Figure 5. A fish and the wireframe cursor.

Figure 6. A fish in front of the silk cursor.

Figure 7. A fish behind the silk cursor.

Figure 8. A fish completely inside of the cursor.

The transparency interpolation was realized by means of blendfunction(sfactr, dfactr) in the SGI GL library. Note that the actual sequencing of rendering commands is critical to the transparency effect. Polygons further away from the user's viewpoint must be drawn before polygons closer to the user.
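The importance of drawing order can be illustrated with a small single-pixel sketch. It does not use the GL library; it simply applies the interpolated blending rule in software to a hypothetical scene, with surfaces described as (z, intensity, alpha) tuples and z increasing toward the viewer as in the MITS coordinate system. All names and scene values are illustrative assumptions.

```python
def blend(alpha, source, destination):
    """Interpolated transparency: alpha * source + (1 - alpha) * destination."""
    return alpha * source + (1.0 - alpha) * destination


def composite(surfaces, background=0.0):
    """Draw surfaces in list order over the background and return the result.

    Each surface is (z, intensity, alpha); z grows toward the viewer.
    """
    value = background
    for _z, intensity, alpha in surfaces:
        value = blend(alpha, intensity, value)
    return value


def back_to_front(surfaces):
    """Sort so the surface farthest from the viewer (smallest z) comes first."""
    return sorted(surfaces, key=lambda s: s[0])


# Hypothetical scene: an opaque fish in front of the cursor's front silk face.
scene = [
    (5.0, 0.2, 1.0),    # fish, opaque, nearest the viewer
    (1.3, 1.0, 0.38),   # front silk surface
    (-1.3, 1.0, 0.60),  # back silk surface
]

correct = composite(back_to_front(scene))  # fish drawn last, keeps its own intensity
wrong = composite(scene)                   # fish drawn first, silk is blended over it
```

With back-to-front ordering the fish in front of the cursor keeps its full contrast, whereas drawing it first lets both silk layers wash it out, so it would wrongly appear to be behind the cursor.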

The wireframe cursor as used in the experiment (Figure 5) can obviously be improved by drawing line segments or cross hairs on the cursor surface so that the cursor appears like a fishing net. The resulting effect is also a form of "partial occlusion". When the fishing net mesh is dense enough, it will appear semi-transparent. In fact, this is one of the approaches, often called the "screen door" approach, for implementing semi-transparency in computer graphics [Foley et al. 1990, page 755]. In order to investigate the effect of partial occlusion, we chose two special levels of occlusion: the wireframe cursor represents the extreme case of no occlusion, while the silk cursor exhibits an optimized degree of partial occlusion.

In the experiment, the cursor was driven by a custom-designed glove based on an Ascension Technology Bird™ equipped with a clutch, as shown in Figures 2 and 4. The glove operated in position control mode, with a Control/Display ratio of 1:1, as determined in previous research [Zhai and Milgram 1993b]. The "home" positions of the glove corresponded to a cursor location of (0, 0, 0) and were calibrated to make the subject most comfortable when using the glove. The Bird receiver and the clutch were at the center of the user’s hand to best allow the user to “grasp” a fish by means of finger/hand abduction. The Bird has six degrees of freedom, that is, translations in the x, y, z dimensions and roll, pitch, yaw around the x, y, z axes. Since only translations were needed in this task, rotational signals were disabled for this experiment.

3.1.4 The display. The fishing task was displayed on an SGI monitor with a resolution of 1280 by 1024 pixels (Model No. HL7965KW-SG). Monitor brightness and contrast were adjusted so as to minimize ghosting images for the stereoscopic displays and thereby optimize the stereoscopic effect. The experimental room was darkened throughout the experiment. The gamma correction value was set at 1.70.

Two modes of display were used in the experiment: stereoscopic and monoscopic projection. In the stereoscopic case, subjects wore 120 Hz flicker-free stereoscopic CrystalEyes™ glasses (Model No. CE-1), manufactured by StereoGraphics.

3.2 Experimental Conditions and Hypotheses

The primary goal of our experiment was to evaluate the effectiveness of the partial occlusion cue in 3D interaction. Stereoscopic projection has been found to be a powerful technique in displaying depth information [Wickens, et al. 1989; Yeh and Silverstein 1992; McAllister 1993] and has been used often as the control condition in studying other types of 3D display techniques [e.g. Sollenberger and Milgram 1993; Arthur, et al. 1993]. This experiment was designed to allow comparison of the relative performance of stereopsis versus partial occlusion for the interaction task. We were also interested in learning how the two sources of cues interact. Thus, two display modes (monoscopic versus stereoscopic) and two types of cursor (silk cursor versus wireframe cursor) were included in the experiment, resulting in four conditions: silk cursor with stereo display (SilkStereo), wireframe cursor with stereo display (WireframeStereo), silk cursor with mono display (SilkMono), and wireframe cursor with mono display (WireframeMono).

Clearly, the WireframeMono case, the baseline condition, was the most difficult one, since neither partial occlusion nor stereopsis was present for judging depth relations. The subjects had to rely on occlusions between the edges of the cursor and the fish. They tended to move the cursor so that the fish first appeared to be located between the edges of the cursor in the z dimension (Figure 5) and then to adjust the cursor slightly in the x and y dimensions to bring the fish into the center of the cursor before grasping.

In the WireframeStereo case, subjects no longer had to depend on edge occlusion. Because the stereoscopic cue gave them a strong 3D sensation, they could judge the depth dimension directly and simultaneously with their judgment along the x and y dimensions.

In the SilkMono case, portions of the target appeared with different contrast ratios when they were located in front of (Figure 6), behind (Figure 7) or inside the cursor (Figure 8). The subjects tended to use the semi-transparency cue interactively, by moving the silk cursor first through the target to observe the continuous change of target appearance (see Figure 1; the portion of the fin in high contrast changes as the cursor moves) and then grasping immediately after the front surface of the silk cursor moved in front of the fish fin. The ability to judge where the semi-transparent surface is relative to the target through interactive movement is critical to the power of the partial occlusion cue. Without this interactive effect, subjects would not be able to tell when the back fin of the fish is inside of the cursor (see Figure 8).

In the SilkStereo case, subjects had the advantage of both the stereo cue and the partial occlusion cue. We expected SilkStereo to be the most efficient case and WireframeMono to be the least efficient. Whether SilkStereo would be significantly superior to WireframeStereo would reveal whether the partial occlusion cue provides depth information in addition to the stereo cue. Also of particular interest to us was whether the SilkMono case (partial occlusion cue alone) would generate superior, or at least comparable, performance scores relative to the WireframeStereo case (stereo cue alone), which would confirm the potentially powerful advantages of semi-transparency on its own.

Stated formally, our hypotheses for this particular class of localization tasks were:

1. Partial occlusion improves performance over no occlusion (wireframes);

2. Stereoscopic display improves performance over monoscopic display;

3. The partial occlusion cue is no less powerful than the stereo cue;

4. The two cues enhance each other and performance is best when both cues are present.

3.3. Experimental Design and Procedure

Twelve paid subjects were recruited through advertising on campus. The subjects were screened using the Bausch and Lomb Orthorator visual acuity and stereopsis tests. Subjects' ages ranged from 18 to 36, with the majority in their early and mid-20’s. One of the 12 subjects was left handed and the rest were right handed, as determined by the Edinburgh inventory [Oldfield 1971]. Subjects were asked to wear the input glove on their dominant hand.

A balanced within-subjects design was used. Each of the 12 subjects was randomly assigned a unique order of the four conditions (SilkStereo, WireframeStereo, SilkMono, WireframeMono) using a hyper-Graeco-Latin square pattern, which resulted in every condition being presented an equal number of times as first, second, third and final condition.

Following a 2-minute demonstration of all four experimental conditions, the experiment with each subject was divided into four sessions, with one experimental condition in each session. There was a 1-minute rest period between every two sessions. Each session comprised 5 tests. Each test consisted of 15 trials of fish catching. Test 1 started when the subject had no experience with the particular experimental condition. Tests 2, 3, 4, and 5 started after the subjects had 3, 6, 9 and 12 minutes' worth of experience respectively. Practice trials filled the gap following a test and before the next test began, so that each test always started when the subject had a fixed amount of practice with the particular experimental condition (e.g. 6 minutes for Test 3). At the end of each test, the number of fish caught and missed (as both an absolute number and a relative percentage) and mean trial completion time were displayed to the subject.

At the end of the experiment, a short questionnaire was administered to assess users' subjective preferences for all experimental conditions.

3.4 Performance Measures

Task performance was measured by trial completion time, error rate and error magnitude. Trial completion time was defined as the time duration from the beginning of the trial to the moment when the subject grasped. Error rate was defined as the percentage of fish missed in a test (15 trials). Whenever a fish was missed, the error magnitude was defined as the Euclidean summation of errors (portions of the body outside of the cursor) in the x, y, z dimensions respectively:

$$E = \sqrt{e_x^2 + e_y^2 + e_z^2}$$

where $e_x$, $e_y$ and $e_z$ denote the errors in the x, y and z dimensions.

Note that error magnitude was not a primary measure, for two reasons. First, the subjects’ task was to capture the fish as quickly as possible; minimizing error magnitude was not an explicit requirement. Second, error magnitude was defined only when the subject missed the fish. We included error magnitude to gather a complete set of performance measures.
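As an illustration of how such a measure might be computed, the sketch below treats both the fish and the cursor as axis-aligned boxes, takes the per-axis error as the distance by which the fish protrudes beyond the cursor, and combines the three errors as a root sum of squares. Reading "Euclidean summation" as a root sum of squares, the box representation, and all function names are assumptions made for this example.

```python
import math


def axis_error(target_min, target_max, cursor_min, cursor_max):
    """Distance by which the target protrudes beyond the cursor along one axis."""
    below = max(0.0, cursor_min - target_min)
    above = max(0.0, target_max - cursor_max)
    return below + above


def error_magnitude(target_box, cursor_box):
    """Root sum of squares of the per-axis errors.

    Each box is ((x_min, x_max), (y_min, y_max), (z_min, z_max)) in cm.
    """
    errors = [
        axis_error(t_min, t_max, c_min, c_max)
        for (t_min, t_max), (c_min, c_max) in zip(target_box, cursor_box)
    ]
    return math.sqrt(sum(e * e for e in errors))


# Cursor of 11.3 x 16.3 x 2.6 cm centered on the origin.
cursor = ((-5.65, 5.65), (-8.15, 8.15), (-1.3, 1.3))
# Hypothetical adult fish (10 x 15 x 1.3 cm) sticking out 0.5 cm toward the viewer.
fish = ((-5.0, 5.0), (-7.5, 7.5), (0.5, 1.8))
miss_distance = error_magnitude(fish, cursor)
```

A result of zero corresponds to a fish lying entirely inside the cursor volume, which is the success criterion for a trial; in the hypothetical example only the z dimension contributes to the error.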

3.5 Experimental Results

Data from 3600 experimental trials (i.e., 12 (subjects) x 2 (cursor types) x 2 (display modes) x 5 (tests) x 15 (trials per test)) were collected during the experiment. Repeated measure analyses of variance were conducted through the multivariate approach [Bock 1975] to test the statistical significance of the individual effects and their interactions under each of the three performance measures. All of the analyses were conducted first with the original performance scores, followed by examination of the variances of the different effects and model residuals. As is common in human subject experiments, the data on trial completion times, error rates and error magnitudes collected here were not normally distributed, but rather skewed towards lower values. In order to increase the validity of the statistical analysis [Howell 1992], logarithmic transformations were applied to the trial completion time and error magnitude data and a square root transformation was applied to the error rate data. These transformations made the data meet the variance analysis assumptions of normality and homogeneity of variance. Statistical results reported below were based on the transformed data, even though graphs are presented in the original scale for ease of comprehension. Greenhouse-Geisser and Huynh-Feldt adjustment epsilon values were calculated to estimate the potential correlation in repeated measure designs but no critical differences were found between the original and the adjusted probability values. The probabilities reported below were therefore left unadjusted. The following are the primary results of the statistical analysis.
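The transformations described above are simple to express in code. The following sketch shows one possible form; the base of the logarithm is not stated in the text, so the natural logarithm is assumed here, and the function names are hypothetical. The last line merely reproduces the trial count implied by the design.

```python
import math


def transform_time(seconds):
    """Log transform for trial completion time (natural log assumed)."""
    return math.log(seconds)


def transform_error_magnitude(cm):
    """Log transform for error magnitude; only defined for misses (cm > 0)."""
    return math.log(cm)


def transform_error_rate(percent):
    """Square-root transform for the percentage of fish missed in a test."""
    return math.sqrt(percent)


# Trial count implied by the design: subjects x cursors x displays x tests x trials.
total_trials = 12 * 2 * 2 * 5 * 15
```

Both transformations compress large values more than small ones, which is why they are commonly applied to skewed measures before an analysis of variance.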

3.5.1 Trial Completion Time. Variance analysis indicated that cursor type (silk vs. wireframe cursor: F(1,11) = 66.47, p<.0001), display mode (stereo vs. mono display: F(1,11) = 15.0, p < .005), learning phase (F(4,44) = 21.59, p<.0001), trial number (different fish size and 3D location: F(14,154) = 12.55, p<.0001), cursor x display interaction (F(1,11) = 6.68, p < .05), and cursor x display x phase interaction (F(4, 44) = 4.0, p <.01) all significantly affected trial completion time.

Figure 9. Trial completion times as a function of cursor type and display mode.

Figure 9 illustrates the effect of cursor type and display mode on trial completion time. Multiple contrast tests showed that the silk cursor produced significantly shorter completion times than the wireframe cursor, for both monoscopic and stereoscopic displays (Table 1). With regard to the magnitude of the differences, the mean completion time with the silk cursor was 48.4% shorter than that of the wireframe cursor in monoscopic display and 28.1% shorter in stereoscopic display. Finally, the mean completion time for SilkMono (partial occlusion cue alone) was 18.1% shorter than for WireframeStereo (stereo cue alone), although this difference was not statistically significant (p = .28). These results suggest that, under the experimental conditions, the use of semi-transparent surfaces brought significant benefit to task performance as measured by completion time, and that the power of partial occlusion through semi-transparency was comparable to, if not stronger than, that of stereopsis.

Table 1. Multiple contrast tests of mean completion times

3.5.2 Error Rate. As illustrated in Figure 10, the pattern of the error rate data as a function of cursor type and display mode is very similar to that of the trial completion time data. The statistically significant factors affecting error rate were cursor type (F(1,11) = 92.16, p<.0001), display mode (F(1,11) = 14.48, p < .01), and cursor type x display mode interaction (F(1,11) = 7.47, p < .05). Neither learning phase nor any interactions between learning phase and other factors were significant.

Figure 10. Error rate as a function of cursor type and display mode.

Multiple contrast tests showed that the silk cursor produced significantly fewer errors than the wireframe cursor, both for monoscopic displays and for stereoscopic displays (Table 2). Regarding the actual differences in magnitude, for monoscopic displays the mean error rate of the silk cursor was 59% lower than that of the wireframe cursor. For stereoscopic displays the mean error rate with the silk cursor was 36.7% lower than with the wireframe cursor. For the case of partial occlusion cue alone (SilkMono) the mean error rate was 19.5% lower than for the stereo cue alone (WireframeStereo), but this difference was not statistically significant (p = .21). As with the trial completion time data, the error rate data suggest that the partial occlusion cue indeed improved performance relative to the control condition, and that it was no less powerful than the stereopsis cue.

Table 2. Multiple comparison tests of mean error rate

3.5.3 Error Magnitude. The effects of cursor type and display mode on error magnitude are shown in Figure 11. When examining the error magnitude data, please bear in mind that error magnitude was defined only when an error was made (i.e., a target was missed), and that fewer errors occurred in some conditions than in others, as indicated. The variance analysis showed that error magnitude was significantly affected by cursor type (F(1,11) = 11.37, p < .01), display mode (F(1,11) = 18.19, p < .001), and learning phase (F(4,44) = 3.97, p < .01). No significant between-factor interactions of any order were found.

Figure 11. Error magnitude as a function of cursor type and display mode.

Multiple contrast tests (Table 3) showed that the silk cursor produced significantly lower error magnitudes than the wireframe cursor, both for monoscopic displays and for stereoscopic displays. For monoscopic displays the mean error magnitude of the silk cursor was 15.1% smaller than that of the wireframe cursor. For stereoscopic displays the mean error magnitude of the silk cursor condition was 41.5% smaller than that of the wireframe cursor.

Table 3. Multiple comparison tests of mean error magnitude

In contrast to the trial completion time and error rate data, it appears that when an error did occur, the stereo cue was more effective than the partial occlusion cue in reducing the error magnitude. The SilkMono mode (partial occlusion cue alone) produced a larger mean error magnitude (as well as a larger deviation, Figure 11) than the WireframeStereo mode (stereo cue alone). However, this difference was not statistically significant (p = .97).

3.5.4 Learning Effects and Final Phase Results. As indicated in the variance analyses above, learning phase was a significant factor for trial completion time and error magnitude, but not error rate. It also interacted significantly with cursor display combinations, as measured by trial completion time. This subsection describes the performance changes as learning progressed, and the results in the final phase of the experiment.

Figure 12. Time performance for each of four conditions at each learning phase.

Figure 12 shows trial completion time data for each technique as a function of the learning phase. It shows clearly that the relative scores between the different conditions were ordinally consistent over all experimental phases. Subjects improved their time scores for the SilkStereo, SilkMono and WireframeStereo modes as they gained more experience, and presumably more confidence. Little improvement in completion time was evident with the WireframeMono condition, however.

Variance analysis was conducted on trial completion time data in the final learning phase (Test 5 in Figure 12). The statistical conclusions were the same as those drawn from the overall data above: cursor type (F(1,11) = 90.8, p<.0001), display mode (F(1,11) = 21.5, p < .001), cursor type and display mode interaction (F(1,11) = 17.3, p < .005), and trial number (F(14, 154) = 6.4, p<.0001) all significantly affected trial completion time. Results of the multiple contrast comparisons for the final phase completion time data also agreed with the results from the overall data (Table 1): SilkStereo vs. SilkMono (p = .27) and SilkMono vs. WireframeStereo (p = .32) were not significantly different; all other pairwise comparisons were significant (p<0.05). Mean trial completion time reductions due to the partial occlusion effect in the final phase are as follows. For mono displays, SilkMono (mean 2.064 sec.) was 52.8% shorter than WireframeMono (mean 4.376 sec.). For stereo displays, SilkStereo (mean 1.850 sec.) was 20.6% shorter than WireframeStereo (mean 2.329 sec.).
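The percentage reductions quoted for the final phase can be reproduced from the reported means. The following is a minimal sketch; the function name relative_reduction and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def relative_reduction(baseline: float, treatment: float) -> float:
    """Return the percentage by which `treatment` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - treatment) / baseline * 100.0


# Final-phase mean trial completion times in seconds, as reported in the text.
final_phase_means = {
    "WireframeMono": 4.376,
    "SilkMono": 2.064,
    "WireframeStereo": 2.329,
    "SilkStereo": 1.850,
}

# Benefit of the silk cursor within each display mode.
mono_reduction = relative_reduction(
    final_phase_means["WireframeMono"], final_phase_means["SilkMono"]
)
stereo_reduction = relative_reduction(
    final_phase_means["WireframeStereo"], final_phase_means["SilkStereo"]
)

print(f"mono: {mono_reduction:.1f}%  stereo: {stereo_reduction:.1f}%")
```

Rounded to one decimal place, the two values agree with the 52.8% and 20.6% figures given above.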

Figure 13. Error rate for each of four conditions at each learning phase.

Figure 13 presents the error rate data as a function of learning phase. Again the relative rank of each mode was consistent across all five phases of the experiment. Interestingly, however, in contrast to the completion time data (Figure 12), error rate for the WireframeMono condition showed the most obvious improvement over the experiment. A small amount of improvement was also found in the SilkMono condition, but essentially none in the SilkStereo and WireframeStereo modes. Variance analysis for the final (Test 5) phase error rate data showed that cursor type (F(1,11) = 26.6, p <.0005) and display mode (F(1,11) = 6.05, p < .05) were both significant factors but the cursor type and display mode interaction (F(1, 11) = 1.53, p = .24) was not significant. Multiple contrast comparisons showed that final phase error rate with WireframeMono was significantly higher than in the other three cases (p <0.05). Other contrasts were not significant (p>0.05), however. Mean error rate reductions resulting from the partial occlusion effect in the final phase are as follows. For the mono display, error rate with SilkMono (mean 13.9%) was 60.8% lower than WireframeMono (mean 35.0%). For the stereo display, SilkStereo (mean 13.9%) was 26.5% lower than WireframeStereo (mean 18.9%). Note that the lowest average error rate (13.9%) was still greater than the error rates found in typical 2D target acquisition studies. There are probably two reasons for this. One is that the task was more difficult than usual, not only because it was performed in 3D but also because the target (fish) was always moving. The second reason is related to the instructions given to the subjects, who were told to "catch as many fish as possible and complete each trial as quickly as possible." No emphasis was given to ensuring that no fish were missed.

Comparing Figure 12 with Figure 13 reveals important information about speed accuracy tradeoff patterns with respect to learning. For the WireframeMono mode, subjects had more than a 35% error rate, which apparently caused them to focus on improving the accuracy aspect of the task at the expense of time performance. In the other three cases (SilkStereo, SilkMono, and WireframeStereo), subjects already had less than a 25% error rate and it appears that they were more satisfied with this level of accuracy, and thus were devoting more effort to reducing their trial completion times.

The error magnitude data were not suitable for statistical analysis as a function of each learning phase, since very few errors occurred for some of the phase and technique combinations.

3.5.5 Subjective Preferences. Figure 14 shows the mean scores for the subjective evaluation data collected after the experiment. On the average, SilkStereo was the most preferred and WireframeMono was the least preferred, with SilkMono ranked higher than WireframeStereo. Statistically, significantly different preference scores were found across conditions through repeated measure variance analysis (F(3,33) = 74.23, p<0.0001). The results of the multiple contrast tests are summarized in Table 4, and show that subjects’ preferences between every pair of techniques were significantly different (including WireframeStereo vs. SilkMono). Interestingly, the subjective evaluation data in this experiment were consistent with the acquired performance measures (completion time and error rate) in trend but were more sensitive in detecting differences between conditions.

Figure 14. Mean scores for subjective evaluation.

Table 4. Multiple contrast test results of mean subjective evaluation scores

3.5.6 Summary of Results. The experiment largely confirmed our initial hypotheses. In terms of all three measures of performance (trial completion time, error rate and error magnitude), both stereopsis through binocular disparity and partial occlusion through semi-transparency were significantly beneficial to the manual 3D localization task. The partial occlusion cue was effectively used by subjects in both display modes: it significantly improved users' performance not only in the monoscopic display, which had little depth information available, but also in the stereoscopic display, which already had the powerful stereo cue. Comparing the two cues, partial occlusion was no less powerful than stereopsis for successful 3D target acquisition. Learning improved subjects' performance with each of the techniques but the relative rank of the techniques remained unchanged throughout the experiment. Subjective evaluations supported the conclusions drawn from performance measures.

4. DISCUSSION

4.1 Properties of Semi-transparency: Discrete, Relational Depth Cueing

Two particular properties of semi-transparency are especially relevant to practical HCI applications. One of these is the fact that a semi-transparent surface does not completely block the view of any object which it (partially) occludes. This eliminates one of the disadvantages of the powerful total occlusion cue and permits the user to maintain awareness of the background information.

The second property relates to the fact that, similar to the total occlusion cue, partial occlusion through semi-transparency provides relational and discrete depth information about the position of a semi-transparent surface relative to other objects. This is in contrast to stereoscopic displays, which provide continuous, quantitative depth information. As illustrated in Figure 1 and Figures 6 to 8, the silk covering the volume cursor directly reveals whether an object is in front of the cursor, within it, or behind it. When an object is behind a semi-transparent surface, however, the user will not be able to tell by how much the object is separated from the surface in space. For some tasks, such as making an absolute judgment of distance, the discrete nature of the partial occlusion cue may represent a shortcoming, whereas for others it will be a distinct advantage since the user does not have to make a qualitative decision based on quantitative, continuous information. This was precisely the case in the experiment described here, where the objective was to manipulate the cursor so that it totally enveloped the fish being hunted. This is clearly a discrete task, as the subjects were instructed simply to capture the fish and not necessarily to center the cursor on it as accurately as possible. This contention is supported by evidence from the experiment: in Figures 9 and 10 we see that semi-transparency appears to be a slightly more effective cue than binocular disparity for successful target acquisition. However, upon examining Figure 11, we note that the mean error magnitude and variance of the SilkMono case were larger than those of the WireframeStereo case. The implication of this is that although fewer errors were made under the SilkMono condition relative to the WireframeStereo condition, the magnitude of those fewer errors must have been relatively larger than in the WireframeStereo case, which is consistent with the distinction between discrete and continuous depth information.
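The discrete, relational nature of the cue can be illustrated with a small sketch. It assumes, purely for illustration, that the object is reduced to a single point and that only its depth along the viewing axis matters; the names DepthRelation and depth_relation are hypothetical and do not come from the experimental software.

```python
from enum import Enum


class DepthRelation(Enum):
    IN_FRONT = "in front of the cursor"
    WITHIN = "within the cursor"
    BEHIND = "behind the cursor"


def depth_relation(object_depth: float, cursor_near: float, cursor_far: float) -> DepthRelation:
    """Classify a point object against a volume cursor along the viewing axis.

    Depth grows away from the viewer. The cursor occupies the interval
    [cursor_near, cursor_far]. Only the ordinal relation is returned, which
    mirrors what the silk surfaces reveal: no distance information.
    """
    if cursor_near > cursor_far:
        raise ValueError("cursor_near must not exceed cursor_far")
    if object_depth < cursor_near:
        return DepthRelation.IN_FRONT
    if object_depth > cursor_far:
        return DepthRelation.BEHIND
    return DepthRelation.WITHIN


# Two objects at very different distances behind the cursor give the same answer.
print(depth_relation(12.0, 5.0, 10.0).value)
print(depth_relation(50.0, 5.0, 10.0).value)
```

Both calls report that the object is behind the cursor, which is exactly the limitation noted above: the cue says which side of the surface the object is on, but not by how much.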

We also point out that, although static semi-transparent surfaces provide primarily discrete cues, continuous depth information can nevertheless be acquired when semi-transparent surfaces are used as a dynamic interactive medium. That is, when the silk cursor is moved through another 3D object, the user may estimate the object's depth in a number of ways, including estimating the distance traveled, timing and kinesthesia.

4.2 Interactions among depth cues: Modeling of 3D performance

The manipulation of two sources of depth information in this experiment, occlusion and binocular disparity, brings to the fore an important theoretical question: When multiple sources of depth information are provided, how does the visual system judge actual depth information and how does performance change accordingly? Our visual system could either select one of the multiple sources or integrate them to form a decision. Two classes of models have been applied to address this issue, additive models and multiplicative models [Bruno and Cutting 1988; Sollenberger 1993]. An additive model represents the fact that either depth cue can improve performance on its own and when both sources of information are present simultaneously the resulting performance improvement is a simple summation of the benefit from the two sources individually. A multiplicative model describes the fact that the two sources of information can interact, causing a combined effect either greater or less than the additive effects. In their study of the combination of relative size, projection height, occlusion, and motion parallax, Bruno and Cutting [1988] concluded that additive models produced the best fit to their experimental data. In a series of experiments with motion parallax (kinetic depth) and binocular disparity, Sollenberger [1993] found some evidence for a multiplicative model with greater than additive effects for his path-tracing task.

In the present experiment, we found that task performance as measured by trial completion time and error rate was also compatible with a multiplicative model, but with less than additive effects. As shown in Figures 9 and 10, a strong interaction was found between display mode and cursor type for both trial completion time and error rate. That is, both stereo display alone (i.e. WireframeStereo) and partial occlusion alone (i.e. SilkMono) greatly improved performance relative to WireframeMono, but the further improvement from SilkMono to SilkStereo (i.e. with both cues present) was marginal, suggesting the dominance of the partial occlusion cue in this task.
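To make the "less than additive" pattern concrete, the sketch below compares an additive prediction with the observed mean for the condition in which both cues are present. It uses the final-phase mean completion times reported in Section 3.5.4 (the only cell means given in the text), and it assumes a simple additive combination on the raw time scale; the function names are illustrative, and the calculation is not the variance analysis reported in the paper.

```python
def additive_prediction(baseline: float, cue_a_alone: float, cue_b_alone: float) -> float:
    """Predicted score with both cues if the two individual benefits simply sum."""
    benefit_a = baseline - cue_a_alone
    benefit_b = baseline - cue_b_alone
    return baseline - benefit_a - benefit_b


def interaction_contrast(baseline: float, cue_a_alone: float,
                         cue_b_alone: float, both: float) -> float:
    """Benefit of cue A without cue B minus its benefit with cue B present.

    The contrast is zero under a purely additive model. For a lower-is-better
    measure such as completion time, a positive value means the combined
    benefit is less than additive.
    """
    return (baseline - cue_a_alone) - (cue_b_alone - both)


# Final-phase mean completion times in seconds (Section 3.5.4).
wireframe_mono, silk_mono = 4.376, 2.064
wireframe_stereo, silk_stereo = 2.329, 1.850

predicted = additive_prediction(wireframe_mono, silk_mono, wireframe_stereo)
contrast = interaction_contrast(wireframe_mono, silk_mono, wireframe_stereo, silk_stereo)

print(f"additive prediction for SilkStereo: {predicted:.3f} s")
print(f"observed SilkStereo mean:           {silk_stereo:.3f} s")
print(f"interaction contrast:               {contrast:.3f} s")
```

The additive prediction is far below the observed SilkStereo mean, and the contrast is well above zero, which is the direction described above. A raw-scale additive prediction ignores any floor on completion time, so the numbers should be read as an illustration of the pattern rather than as a fitted model.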

For cases in which targets were missed, on the other hand, the pattern of error magnitudes (Figure 11) conformed with an additive model. No interaction was found between display mode and cursor type (F(1,11) =0.0004, p = .97).

4.3 Future Work

Although this study has convincingly demonstrated some of the advantages of semi-transparency applied to 3D human computer interfaces, a number of issues related to semi-transparency remain to be explored in the future. First, transparency is actually a continuous variable, ranging from total transparency to total occlusion. In the present experiment, the selected transparency value (determined by testing sample values during our pilot study) was compared against total transparency (the wireframe case). In practice, however, the optimal level of transparency may vary for different applications. Future work should compare all levels of transparency, including total occlusion (opaque).
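The continuum from total transparency to total occlusion can be sketched with ordinary linear color interpolation between a surface and the background behind it. This is an assumption made for illustration: the function name blend and the example colors are hypothetical, and the rendering method and transparency value actually used in the experiment are not restated here.

```python
from typing import Tuple

Color = Tuple[float, float, float]


def blend(surface_rgb: Color, background_rgb: Color, opacity: float) -> Color:
    """Linearly interpolate between background and surface color.

    opacity = 0.0 gives total transparency (only the background is seen),
    opacity = 1.0 gives total occlusion (only the surface is seen).
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must lie in [0, 1]")
    return tuple(
        opacity * s + (1.0 - opacity) * b
        for s, b in zip(surface_rgb, background_rgb)
    )


# Sweep a white surface over a black background across the continuum.
for opacity in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(opacity, blend((1.0, 1.0, 1.0), (0.0, 0.0, 0.0), opacity))
```

A future comparison of transparency levels would amount to treating the opacity parameter as an experimental factor rather than a fixed value.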

Secondly, as mentioned before, partial occlusion may also be realized by drawing solid line segments (or cross hairs) on the cursor surface so that the cursor appears like a fishing net. This method also represents a continuum, along the dimension of line density. We expect that the partial occlusion provided by this "net" cursor approach will be inferior to the color interpolation method; however, a formal experiment would be worthwhile to carry out.

Finally, we used a dynamic target acquisition task to test the concept of semi-transparency as a general interaction mechanism. Although independent of the theme of this paper, an interesting issue related to the target acquisition task is the effect of the relative size of the volume (or area) cursor versus the target. A separate study has been carried out to model this effect through Fitts' law, and has been reported elsewhere [Kabbash and Buxton, 1995].

5. APPLICATIONS

Although the present work is probably the first experimental research that explicitly studied the power of semi-transparency in 3D interactive graphical systems, semi-transparency has in fact already been used in practice by many HCI designers and researchers. Visual artists have also long made use of semi-transparency to enhance three dimensional sensation in their graphical designs. With a little attention, we can see examples of semi-transparency applications in TV broadcasting almost every day, particularly in leading graphics of a program. This section reviews a few existing applications of semi-transparency in HCI. Several future applications are also proposed. The entire application spectrum of semi-transparency is too broad to be covered in this section. We will restrict ourselves to only a very few examples. Novel and successful applications of semi-transparency in various domains will depend upon designers' creativity and thorough understanding of specific task constraints.

5.1 Information Visualization

A number of interactive systems developed at Xerox PARC for the purpose of information visualization make use of semi-transparency in rendering 3D objects. Two examples of this are the "cone tree" (Figure 15) [Robertson et al 1991] and the "spiral calendar" (Figure 16) [Card, Pirolli, and Mackinlay 1994]. In the case of the cone tree, as the authors state: "The body of each cone is shaded transparently, so that the cone is easily perceived yet does not block the view of cones behind." In addition, the different contrast ratios of the semi-transparent cones also provide cues regarding the inter-relationships of the cones in the depth dimension. In the case of the "spiral calendar" the use of semi-transparent surfaces helps the user to perceive the spatial relationships among the different calendar cards. For example, the card for "June" appears to be in front of the card for "1993", due to the use of size, perspective and partial occlusion cues.

Figure 15. The cone tree: the semi-transparent cone bodies reveal spatial interrelationships in the depth dimension (from [Robertson et al 1991], reprinted with permission).

Figure 16. The Spiral Calendar: the semi-transparent surface improves the spatial structure of the interface (from [Card et al, 1994], reprinted with permission).

5.2 Surgical Visualization

In a system developed for neurosurgical visualization, Hinckley and colleagues [Hinckley, Pausch, Goble, and Kassell 1994b] used a semi-transparent graphical representation of a cutting plane to enable surgeons to clearly visualize the spatial relationships among portions of the planned surgical procedure (Figure 17). Note that the surgeon can see parts of the organ both in front of and behind the cutting plane.

Figure 17. A semi-transparent cutting plane for surgical planning, enabling the user to see parts of the organ in front of and behind the cutting plane (from [Hinckley et al, 1994b], reprinted with permission).

5.3 Six DOF Pursuit Tracking

Pursuit tracking is an effective experimental research paradigm for the study of human motor skills [Poulton 1974]. It also encompasses important elements of some practical tasks, such as those found in space teleoperation. Tracking a target in six degrees-of-freedom (6 DOF) motion requires a well-designed display to reveal mismatches (tracking errors) between the pursued target and the tracking cursor. Zhai and Milgram [1994] designed a 3D cursor with semi-transparent surfaces for a 6 DOF tracking experiment that required subjects to follow and capture a target moving randomly in 6 DOF. As shown in Figure 18, the partial occlusion cue helps the user to detect both translational (Figure 18a) and orientational (Figure 18b) errors between the target and the cursor. Used in conjunction with stereoscopic presentation, the silk cursor generated satisfactory results. On average, tracking errors in that experiment were much lower than previously found in a similar experiment with conventional 3D displays [Massimino, Sheridan, and Roseborough 1989].

Figure 18. Tracking a 3D target with a "silk cursor"; translational (a) and rotational (b) differences between the cursor and the target are effectively revealed with the silk surfaces.

5.4 2 1/2 Dimensional Interfaces

Although semi-transparency is useful in 3D interfaces, it can also be used in 2D, multi-layered interfaces (2.5D interfaces). One example of this is the "tool glass" by Bier and colleagues [Bier, Stone, Pier, Buxton, and DeRose 1993; Kabbash, Buxton, and Sellen 1994]. Figure 19 shows one example of a tool glass, in which a user can move the color palette over an object and simultaneously view both the object to be colored and the colors available for selection.

Figure 19. Color selection "tool glass": the user can superimpose the semi-transparent color palette on a target object and click through the color selected for drawing (Courtesy of Paul Kabbash).

Another potential application of semi-transparency is with User Interface (UI) widgets, which are devices such as pull down menus and dialogue boxes that are designed to facilitate user computer interaction. Conventional widgets often obscure the very objects on which the user wishes to focus attention. One way to solve this problem could be to use a semi-transparent background when constructing the UI widget so that the user can control the widget while still seeing the objects underneath. SilkWidgets is a sketching program developed by the authors at Alias Research Inc. (Figure 20) to test the concept of semi-transparent widgets. In SilkWidgets, UI widgets such as pull down menus, popup menus, and help sheets are all constructed with a semi-transparent background. One obvious issue in applying semi-transparent widgets is the possible interference between information contained in the widgets and the objects underneath. This has been addressed in [Harrison, Ishii, Vicente, Buxton 1995] and [Harrison, Zhai, Vicente, Buxton 1994].

Figure 20. Semi-transparent pop-up menu in SilkWidgets. (Copyright Alias Research Inc.)

5.5 Using Transparency as State Display

Semi-transparency has also been used in ways other than introducing depth cues per se to display "state" in 3D interaction. Venolia [1993] used semi-transparency as interactive feedback for the state of 3D objects. Whenever an object was "touched," i.e., a small cursor moved into the object, the object changed from opaque to semi-transparent. The user could see not only the touched object, but also the cursor inside the object.

In addition to the sample of existing applications above, we would also like to propose a few examples of future applications.

5.6 3D Tool Glass

The see-through interface widgets ("tool glass") developed by Bier and colleagues [Bier et al, 1993] have proven to be quite advantageous in 2D interaction tasks [Kabbash et al, 1994]. The key concepts underlying "tool glass" include (1) making widgets (semi)transparent, so that the user can see through the widgets and superimpose them onto objects to be manipulated, and (2) involving both of the user's hands co-operatively in the interaction task (with the non-dominant hand positioning the tool glass widgets and the dominant hand selecting items on the tool glass). In light of the results reported in the present paper, extension of see-through interface widgets to 3D interactions is an obvious next step. According to this concept, a user would move a set of semi-transparent tool glass widgets in 3D space with the non-dominant hand while using the dominant hand in a coordinated fashion to complete the manipulation task.

5.7 Virtual Reality

Developments in virtual reality (VR) systems have largely been in the direction of rendering more and more realistic environments with computer graphics. In parallel with those developments, silk surfaces are also expected to see increasing usage in VR systems, where a large number of interactive widgets could be drawn with semi-transparent surfaces. One example of this is the hand metaphor often used in VR applications as a representation of a user's "own-hand" input image. Such a "cursor" can be drawn either in solid color or in wireframe. However, given the various manipulative functions of the hand representation (many of which involve occlusion of underlying objects), rendering the hand in semi-transparency, as illustrated in Figure 21, is expected to be beneficial. Note that the use of interactive silk surfaces goes beyond simple replication of a real world phenomenon, since in the real world one cannot move a silk surface to pass through solid objects.

Figure 21. A "silk magic hand" for VR applications.

5.8 Virtual Fixtures

Virtual fixtures are a class of haptic and auditory aids proposed by Rosenberg [1993] as a means of improving teleoperation performance. We suggest that the notion of virtual fixtures can also be implemented using semi-transparent graphical rendering, either combined with or independently of haptic and auditory aids. One example of such a semi-transparent fixture is a 2D plane that specifies the boundary beyond which movement of a robot arm becomes dangerous. Such a tool could relieve a human operator of the task of memorizing the locations of boundaries of similar real warning zones. Another advantage of a semi-transparent virtual fixture is that the user could simultaneously maintain awareness of what lies on the other side of the warning plane. Obviously such fixtures could also be very useful for virtual environment models that simulate 3D real world tasks such as operations in space, undersea operations, work in hazardous nuclear environments, and tele-surgery.
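As an illustration of the warning plane idea, a boundary check can be written as a signed distance test against a plane. This is a hypothetical sketch: the function names, the convention that the plane normal points toward the dangerous side, and the optional margin parameter are all assumptions introduced here, not features of any existing fixture system.

```python
import math
from typing import Sequence


def signed_distance(point: Sequence[float], plane_point: Sequence[float],
                    plane_normal: Sequence[float]) -> float:
    """Signed distance from `point` to the plane, positive on the side the normal points to."""
    length = math.sqrt(sum(n * n for n in plane_normal))
    if length == 0.0:
        raise ValueError("plane_normal must be non-zero")
    return sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal)) / length


def violates_fixture(point: Sequence[float], plane_point: Sequence[float],
                     plane_normal: Sequence[float], margin: float = 0.0) -> bool:
    """True when the point is beyond the plane, or closer to it than `margin` on the safe side.

    Assumes the normal points toward the dangerous region.
    """
    return signed_distance(point, plane_point, plane_normal) > -margin


# Warning plane at z = 1 with the dangerous region at larger z.
plane_point, plane_normal = (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)
print(violates_fixture((0.0, 0.0, 0.5), plane_point, plane_normal))
print(violates_fixture((0.0, 0.0, 1.2), plane_point, plane_normal))
```

In a display, the same test could drive a change in the rendering of the semi-transparent plane when the arm approaches it, while the plane itself continues to show what lies beyond the boundary.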

5.9 Telerobotic Control

In order to off-load human operators from the task of continually having to control a telerobot in real time, some researchers have developed the technique of planning the slave robot movements by means of a graphical/virtual robot model [Bejczy, Kim, and Venema 1990; Funda, Lindsay, and Paul 1992; Zhai and Milgram 1991]. Such a "phantom robot" [Bejczy, et al. 1990] is usually drawn in a solid color or wireframe. A "silk phantom robot" (Figure 22) drawn in semi-transparency could allow the operator to see objects behind the robot and to better visualize operations, particularly when the robot is in close proximity to obstacles and targets.

Figure 22. The "silk phantom robot" for robot manipulation (Courtesy of Anu Rastogi)

In conclusion, in this paper we have proposed partial occlusion through semi-transparency as a potentially powerful depth cue for computer interface applications, alongside such established 3D graphic techniques as perspective projection, stereoscopic displays, motion parallax and viewpoint tracking. In an experimental investigation of the partial occlusion cue, we have demonstrated its merits relative to the important stereoscopic cue in a 3D target acquisition task. For tasks in which 3D localization is a critical component, semi-transparency is expected to play a potentially very useful role in the future, not only in conventional computer graphic applications but also in such areas as telerobotic control and virtual reality.

ACKNOWLEDGMENTS

We would like to thank the members of the Input Research Group (IRG) (http://www.dgp.toronto.edu/IRG/irg.html) and the Ergonomics in Teleoperation and Control (ETC) Lab (http://vered.rose.toronto.edu) at the University of Toronto, who provided the facilities within which this work was undertaken. We wish to thank George Fitzmaurice, Ken Hinckley, and Ferdie Poblete for their comments and assistance in conducting this research. We also wish to thank the anonymous referees of CHI'94 for their constructive comments on our earlier report of this work, which motivated much of the re-analyses and discussions in this writing. The thorough and helpful reviews of the TOCHI referees have greatly improved the presentation of the current paper. Primary support for this work has come from the Information Technology Research Centre of Ontario, the Defence and Civil Institute of Environmental Medicine, the Natural Sciences and Engineering Research Council of Canada, and Xerox PARC. Additional support has been provided by Alias Research Inc, Digital Equipment Corp., and Apple Computer Inc. This support is gratefully acknowledged.

REFERENCES

Arditi, A. (1986). Binocular vision. In K. R. Boff, L. Kaufman, and J. P. Thomas (Eds.), Handbook of Perception and Human Performance (pp. 23-1, 23-41). New York: John Wiley and Sons.

Arthur, K., Booth, K., and Ware, C. (1993). Evaluating 3D task performance for fish tank virtual worlds. ACM Transactions on Information Systems, 11(3), 239-265.

Bejczy, A. K., Kim, W. S., and Venema, S. C. (1990). The phantom robot: predictive displays for teleoperation with time delay. In Proceedings of IEEE International Conference on Robotics and Automation, (pp. 546-551). Cincinnati, Ohio: IEEE.

Bier, E. A., Stone, M. C., Pier, K., Buxton, W., and DeRose, T. D. (1993). Toolglass and magic lenses: the see-through interface. In Proceedings of SIGGRAPH 93.

Bock, R. D. (1975). Multivariate Statistical Methods in Behavioral Research. New York: McGraw-Hill Book Company.

Brooks, F. P., Jr. (1988). Grasping reality through illusion - Interactive graphics serving science. In Proceedings of CHI'88: ACM Conference on Human Factors in Computing Systems.

Bruno, N., and Cutting, J. E. (1988). Minimodularity and the perception of layout. Journal of Experimental Psychology: General, 117, 161-170.

Card, S., Robertson, G., and Mackinlay, J. (1991). The information visualizer. In Proceedings of CHI '91: ACM conference on Human Factors in Computing Systems, (pp. 181-194).

Card, S. K., Pirolli, P., and Mackinlay, J. D. (1994). The cost-of-knowledge characteristic function: display evaluation for direct-walk dynamic information visualizations. In Proceedings of CHI'94: ACM Conference on Human Factors in Computing Systems, (pp. 238-244). Boston, MA.

Chen, M., Mountford, S. J., and Sellen, A. (1988). A study in interactive 3-D rotation using 2-D control devices. In Proceedings of ACM Siggraph’88, 22.

Ellis, S. R., Kaiser, M. K., and Grunwald, A. J. (Eds.). (1991). Pictorial Communication in Virtual and Real Environments. London: Taylor and Francis.

Evans, K., Tanner, P., and Wein, M. (1981). Tablet-based valuators that provide one, two, or three degrees of freedom. Computer Graphics, 15(3), 91-97.

Foley, J. D., van Dam, A., Feiner, S. K., and Hughes, J. F. (1990). Computer Graphics Principles and Practice. Reading, MA: Addison-Wesley.

Funda, J., Lindsay, T. S., and Paul, R. P. (1992). Teleprogramming: toward delay invariant remote manipulation. Presence - Teleoperators and Virtual Environments, 1(1).

Haber, R. N., and Hershenson, M. (1973). The psychology of visual perception. New York: Holt, Rinehart and Winston.

Harrison, B. L., Ishii, H., Vicente, K. J., and Buxton, W. (1995). Evaluation of a display design space: transparent layered user interfaces. To appear in Proceedings of CHI'95: ACM Conference on Human Factors in Computing Systems, Denver.

Harrison, B. L., Zhai, S., Vicente, K. J., and Buxton, W. (1994). Semi-transparent “silk” user interface objects: supporting focused and divided attention. CEL Technical Report, Department of Industrial Engineering, University of Toronto.

Herndon, K. P., Zelaznik, R. C., Robbins, D. C., Conner, D. B., Snibbe, S. S., and van Dam, A. (1992). Interactive shadows. In Proceedings of ACM Symposium on User Interface Software and Technology, (pp. 1-6). Monterey, California.

Hinckley, K., Pausch, R., Goble, J. C., and Kassell, N. F. (1994a). A survey of design issues in spatial input. In Proceedings of ACM Conference on User Interface Software and Technology 1994.

Hinckley, K., Pausch, R., Goble, J. C., and Kassell, N. F. (1994b). Passive real-world interface props for neurosurgical visualization. In Proceedings of CHI'94: ACM conference on Human Factors in Computing Systems, Boston.

Howell, D. C. (1992). Statistical methods for psychology (Third ed.). Boston: PWS-Kent Publishing Company.

Jacob, R. J. K., Sibert, L. E., McFarlane, D. C., and Mullen, M. P. (1994). Integrality and separability of input devices. ACM Transactions on Computer-Human Interaction, 1(1), 3-26.

Kabbash, P., Buxton, W., and Sellen, A. (1994). Two-handed input in a compound task. In Proceedings of CHI'94: ACM conference on Human Factors in Computing Systems, (pp. 417-423). Boston, USA.

Kabbash, P., and Buxton, W. (1995). The "Prince" technique: Fitts' law and selection using area cursor. In Proceedings of CHI'95: ACM Conference on Human Factors in Computing Systems. Denver, Colorado.

Kaufman, L. (1974). Sight and mind - an introduction to visual perception. London: Oxford University Press.

Liang, J. and Green, M. (1994). JDCAD: A Highly Interactive 3D Modeling System. Computers and Graphics, 18(4), 499-506.

Mackinlay, J. D., Card, S., and Robertson, G. G. (1990). Rapid controlled movement through a virtual 3D workspace. Computer Graphics, 24(3).

Majchrzak, A., Chang, T.-C., Barfield, W., Eberts, R., and Salvendy, G. (1987). Human Aspects of Computer-Aided Design. Philadelphia & London: Taylor & Francis.

Massimino, M. J., Sheridan, T. B., and Roseborough, J. B. (1989). One hand tracking in six degrees of freedom. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, (pp. 498-503).

McAllister, D. F. (Ed.). (1993). Stereo computer graphics and other true 3D technologies. Princeton, New Jersey: Princeton University Press.

Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97-113.

Overbeeke, C. J., and Stratmann, M. H. (1988). Space through movement. Ph.D. Thesis, Delft University of Technology.

Poulton, E. C. (1974). Tracking skill and manual control. New York: Academic Press.

Robertson, G. G., Mackinlay, J. D., and Card, S. K. (1991). Cone trees: animated 3D visualizations of hierarchical information. In Proceedings of CHI'91: ACM Conference on Human Factors in Computing Systems, (pp. 189-194). New Orleans, Louisiana.

Rosenberg, L. B. (1993). Virtual fixtures: perceptual tools for telerobotic manipulation. In Proceedings of IEEE Virtual Reality Annual International Symposium (VRAIS'93), (pp. 76-82). Seattle.

Sheridan, T. B. (1992). Telerobotics, Automation, and Human Supervisory Control. Cambridge, Massachusetts: The MIT Press.

ACM SIGGRAPH. (1986). Proceedings of the 1986 Workshop on Interactive 3D Graphics.

Smets, G. J. F. (1992). Designing for telepresence: the interdependence of movement and visual perception implemented. In Proceedings of 5th IFAC/IFIP/IFORS/IEA symposium on analysis, design, and evaluation of man-machine systems, The Hague, The Netherlands.

Sollenberger, R. L. (1993). Combining depth information: theory and implications for design of 3D displays. Ph.D. Thesis, University of Toronto, Department of Psychology.

Sollenberger, R. L., and Milgram, P. (1993). Effects of stereoscopic and rotational displays in a three-dimensional path-tracing task. Human Factors, 35(3), 483-499.

Venolia, D. (1993). Facile 3D direct manipulation. In Proceedings of INTERCHI'93: ACM Conference on Human Factors in Computing Systems, (pp. 31-36). Amsterdam, The Netherlands.

Ware, C. (1990). Using hand position for virtual object placement. The Visual Computer, 6, 245-253.

Ware, C., and Arthur, K. (1993). Fish tank virtual reality. In Proceedings of INTERCHI'93: ACM Conference on Human Factors in Computing Systems, (pp. 37-42). Amsterdam, The Netherlands: ACM.

Wickens, C. D., Todd, S., and Seidler, K. (1989). Three-dimensional displays: Perception, implementation and applications. CSERIAC Technical Report 89-001, Wright Patterson Air Force Base, Ohio.

Yeh, Y. Y. (1993). Visual and perceptual issues in stereoscopic display. In D. F. McAllister (Ed.), Stereo computer graphics and other true 3D technologies (pp. 50-70). Princeton, New Jersey: Princeton University Press.

Yeh, Y. Y., and Silverstein, L. D. (1992). Spatial judgments with monoscopic and stereoscopic presentation of perspective displays. Human Factors, 34(5), 583-600.

Zeltzer, D. (1992). Autonomy, Interaction, and Presence. Presence: Teleoperators and Virtual Environments, 1(1), 127-132.

Zhai, S., and Milgram, P. (1991). A telerobotic virtual control system. In Proceedings of SPIE Vol. 1612 Cooperative Intelligent Robotics in Space II, (pp. 311-320). Boston: SPIE-The International Society for Optical Engineering.

Zhai, S., and Milgram, P. (1993). Human Performance Evaluation of Manipulation Schemes in Virtual Environments. In Proceedings of VRAIS'93: the first IEEE Virtual Reality Annual International Symposium, Seattle, USA.

Zhai, S., and Milgram, P. (1994). Asymmetrical spatial accuracy in 3D tracking. In Proceedings of The Human Factors and Ergonomics Society 38th Annual Meeting, Nashville, Tennessee.

Zhai, S., Buxton, W., and Milgram, P. (1994). The "silk cursor": investigating transparency for 3D target acquisition. In Proceedings of CHI'94: ACM Conference on Human Factors in Computing Systems, (pp. 459-464). Boston: ACM.