3D User Interface Design — Part 2: Interaction Techniques

The second article in the series is dedicated to interaction techniques. Designed to process the information gathered and filtered via input and output devices, they play a crucial role in 3D user interface design: they determine how users can interact with objects in the virtual environment.

Methods for selecting, manipulating and navigating among 3D objects, whether directly or indirectly, are designed to give users a sense of spatial awareness and let them interact with virtual objects in a way that is natural to human cognition. Multiple studies have examined spatial awareness in virtual environments, and their findings inform how object interaction is incorporated into the interface. This article gives an overview of the techniques used to construct 3D UIs for large-scale 3D virtual environments (VEs).

Navigation

Navigation is one of the most frequent user tasks in large-scale 3D environments. It presents a variety of challenges: providing efficient and comfortable movement between distant locations, maintaining spatial awareness, and keeping the navigation process light enough that users can focus on more important tasks. Navigation divides into two components: travel (the motor component) and wayfinding (the cognitive component). Generally speaking, navigation tasks fall into three categories:

Exploration – navigation without any specific target

Search tasks – involving movement to a particular target location

Maneuvering – short-range, high-precision movements meant to put the viewpoint in a position that makes it easier to perform a specific task

Moving the viewpoint from one location to another is conceptually a simple task. In immersive virtual reality, viewpoint orientation is usually handled by head tracking, so only techniques for setting the viewpoint position need to be considered. Most published travel interaction techniques fall into one of five common metaphors:

Physical movement – using the motion of the user's body to move through the environment. Examples include wide-area motion tracking, walking in place, stationary bicycles and treadmills. These techniques are appropriate when the application requires physical exertion while traveling or when an enhanced sense of presence is needed.

Manual viewpoint manipulation – the user's hand motions control the viewpoint. For example, the user can "grab the air" and pull themselves along as if on a virtual rope. Another variant uses a selected object as a center point for motion. These techniques are efficient and easy to learn, but they can also cause fatigue. (Minimal sketches of this and several of the following metaphors appear after the list.)

Steering – the direction of motion is specified continuously. Common variants are gaze-directed steering, in which the user's head orientation determines the direction of travel, and pointing, in which the hand orientation does. Steering is a general and efficient metaphor.

Target-based travel – the user specifies the destination and the system moves the viewpoint there. The system may perform a transitional movement between the starting point and the destination, or it may teleport, jumping the user directly to the new location. From the user's perspective, target-based techniques are very simple.

Route planning – the user specifies a path through the environment, and the system moves the viewpoint along it. To plan a route, the user can manipulate icons or draw paths on a map or in the environment itself. These techniques let the user control travel while remaining free to perform other tasks at the same time.
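
To make these metaphors concrete, the sketch below implements four of them in simplified form: grab-the-air manual manipulation, steering, target-based travel and route following. Everything here is illustrative: positions and directions are plain (x, y, z) tuples, and none of the names come from any particular VR API.

```python
import math

def normalize(v):
    """Return v scaled to unit length (zero vectors pass through as zeros)."""
    mag = math.sqrt(sum(c * c for c in v)) or 1.0
    return tuple(c / mag for c in v)

class GrabTheAir:
    """Manual viewpoint manipulation: while grabbing, the viewpoint moves
    opposite to the hand's displacement, as if pulling on a virtual rope."""
    def __init__(self):
        self.last_hand = None

    def update(self, viewpoint, hand, grabbing):
        if not grabbing:
            self.last_hand = None
            return viewpoint
        if self.last_hand is None:          # the grab just started
            self.last_hand = hand
        delta = tuple(h - l for h, l in zip(hand, self.last_hand))
        self.last_hand = hand
        return tuple(v - d for v, d in zip(viewpoint, delta))

def steer(viewpoint, forward, speed, dt, moving):
    """Steering: translate along the head's (gaze-directed) or the hand's
    (pointing) forward vector while the travel input is active."""
    if not moving:
        return viewpoint
    d = normalize(forward)
    return tuple(p + speed * dt * c for p, c in zip(viewpoint, d))

def travel_to(start, target, elapsed, duration, teleport=False):
    """Target-based travel: teleport, or interpolate over a fixed duration
    (linear here; real systems often ease in and out to reduce discomfort)."""
    if teleport or elapsed >= duration:
        return target
    alpha = elapsed / duration              # transition progress, 0..1
    return tuple(s + alpha * (t - s) for s, t in zip(start, target))

def follow_route(waypoints, distance):
    """Route planning: return the point `distance` meters along the
    user-drawn polyline of waypoints."""
    for a, b in zip(waypoints, waypoints[1:]):
        seg = math.dist(a, b)
        if seg > 0 and distance <= seg:
            alpha = distance / seg
            return tuple(p + alpha * (q - p) for p, q in zip(a, b))
        distance -= seg
    return waypoints[-1]                    # past the end: stop at the goal
```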

Besides the choice of metaphor, other design decisions for travel range from speed control to constraints and pathways that assist the user. Numerous experiments have been run (study 1 / study 2 / study 3 / study 4), yielding quantitative evaluations of the techniques and a foundation for design guidelines. For example, designers should consider whether travel is a goal in itself or part of another task: because of their simplicity, target-based techniques let the user concentrate on the task that matters. Another significant finding is that user training and instruction can be as critical as the technique chosen; users with well-developed navigation strategies maintain their spatial orientation far better than those with rudimentary ones.

Travel is accompanied by its cognitive counterpart, wayfinding: the process of defining a path through an environment, in which spatial knowledge is used to build up a cognitive map of the environment. Spatial knowledge consists of landmark, procedural and survey knowledge, and how it is acquired and used depends on factors such as the reference frame (first-person egocentric or third-person exocentric) and the travel technique.

The extra degrees of freedom within a VE can easily cause disorientation. Users have a wide range of spatial abilities, and wayfinding support is important during VE travel. In the case of VE training, with the goal of transferring knowledge from the VE to the real world, the application and environment should be designed to support the transfer (using techniques discussed below) – otherwise, the training session can quickly become counterproductive.

Wayfinding support can be categorized as user-centered or environment-centered. User-centered support consists of elements that help a person orient themselves, such as a wide field of view, visual motion and vestibular (real-motion) cues, and non-visual channels such as sound; investigating these factors more closely may also grant a greater sense of presence. Environment-centered support splits into structural organization and markers. Structural organization refers to how legible and accessible the various portions of the VE are; Ingram & Benford's legibility techniques exemplify this idea. Markers carry real-world wayfinding conventions over to virtual environments: artificial cues such as maps, compasses and grids, architectural cues such as lighting, color and texture, and natural cues such as a visible horizon and atmospheric perspective.
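
As a small example of implementing an artificial cue, a heads-up compass needs only the viewpoint's forward vector. A minimal sketch, assuming a right-handed coordinate system with -z pointing north (an arbitrary convention for illustration):

```python
import math

def compass_bearing(forward):
    """Map the horizontal component of the view direction to a 0-360 degree
    bearing for a compass HUD (0 = north, 90 = east)."""
    x, _, z = forward
    return math.degrees(math.atan2(x, -z)) % 360.0
```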

Maps have produced measurable gains in wayfinding performance in VEs. Further evaluation is needed to determine how the environment influences which cues are effective, since most cues have so far been applied to environments closely modeled on real-world ones. In any case, wayfinding support must be implemented carefully for an environment to be navigable.

Selection and Manipulation

The interface for 3D manipulation in VEs should enable the user to perform at least one of three basic tasks: (1) selecting, (2) positioning and (3) rotating objects. Because direct hand manipulation is a major interaction modality not only in 3D virtual worlds but also in natural physical environments, the design of selection and manipulation techniques has a profound effect on the quality of the entire VE user interface.

The classical manipulation technique provides the user with a 3D cursor shaped like a hand: the "classical" virtual hand technique. The virtual hand mirrors the movements of the tracker, letting the user select and move objects in the VE by simply touching them. The method is intuitive because it copies real-life action, but it only works on objects within arm's reach.
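
Reduced to its core, virtual-hand selection is a containment test between the tracked hand and each object's bounds. A minimal sketch, using bounding spheres and illustrative names:

```python
import math

def touch_select(hand_pos, objects):
    """objects: list of (name, center, radius) bounding spheres.
    Returns the name of the first object the virtual hand touches, or None."""
    for name, center, radius in objects:
        if math.dist(hand_pos, center) <= radius:
            return name
    return None
```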

Several techniques have been proposed to overcome this limitation, including the Go-Go technique, which applies a non-linear mapping to the user's hand extension. While the real hand is closer than a threshold distance D, the virtual hand follows it one-to-one; once the hand extends beyond D, the mapping becomes non-linear and the virtual arm grows. Different mapping functions vary the control-display gain between the real and virtual hands.
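
The mapping itself is compact. In this sketch the threshold D and gain k are illustrative tuning values, not the published constants:

```python
def gogo_distance(r_real, D=0.4, k=10.0):
    """Map the real hand's distance from the body (meters) to the virtual
    hand's distance. Linear within D, quadratic growth beyond it."""
    if r_real < D:
        return r_real                          # 1:1 region: precise nearby work
    return r_real + k * (r_real - D) ** 2      # the virtual arm "grows"

# The virtual hand is then placed along the body-to-hand direction at the
# returned distance, so nearby manipulation stays precise while distant
# objects become reachable.
```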

A popular way to select and manipulate objects in VEs is a virtual ray originating from the user's virtual hand: if the ray intersects an object, the object can be grabbed and manipulated. Several variations of ray-casting have been devised to help users select remote or very small objects. Replacing the ray with a spotlight enlarges the selection area, at the cost of having to disambiguate when several objects fall inside the cone. The aperture technique uses a conic ray whose direction is defined by the user's eye position and a hand sensor; moving the hand closer to or farther from the eye adjusts the size of the selection volume. The image-plane family of interaction techniques extends this concept further.
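
A minimal ray-casting sketch against bounding spheres (production implementations intersect actual geometry; all names here are illustrative):

```python
import math

def ray_sphere(origin, direction, center, radius):
    """`direction` must be normalized. Returns the hit distance along the ray,
    or None on a miss (rays starting inside a sphere are not handled)."""
    oc = tuple(c - o for o, c in zip(origin, center))
    proj = sum(a * b for a, b in zip(oc, direction))   # along-ray distance to closest point
    miss_sq = sum(a * a for a in oc) - proj * proj     # squared ray-to-center distance
    if proj < 0 or miss_sq > radius * radius:
        return None
    return proj - math.sqrt(radius * radius - miss_sq)

def raycast_select(origin, direction, objects):
    """objects: list of (name, center, radius). Returns the nearest hit name."""
    hits = [(d, name) for name, c, r in objects
            if (d := ray_sphere(origin, direction, c, r)) is not None]
    return min(hits)[1] if hits else None
```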

Other techniques manipulate scale rather than reach. In the 3DM immersive modeller, the user can "grow" or "shrink" to work on objects of various sizes. The World-In-Miniature (WIM) technique provides a hand-held miniature copy of the VE; virtual objects are manipulated through their representations in the WIM.
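
The core of WIM manipulation is a change of coordinate frame: a proxy's position inside the miniature maps back to full scale. This sketch ignores the miniature's rotation for brevity, and its parameters are illustrative (a wim_scale of 0.01 means the miniature is 1/100 the size of the world):

```python
def wim_to_world(proxy_pos, wim_origin, wim_scale, world_origin):
    """Map a proxy's position inside the hand-held miniature back to the
    corresponding full-scale world position (rotation omitted for brevity)."""
    offset = tuple((p - o) / wim_scale for p, o in zip(proxy_pos, wim_origin))
    return tuple(o + d for o, d in zip(world_origin, offset))
```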

A number of attempts have been made to integrate and combine the best features of multiple manipulation techniques. The Virtual Tricorder (Wloka & Greenfield, 1995), for instance, combines ray-casting selection and manipulation with navigation and level-of-detail control in one universal tool. The HOMER technique, scaled-world grab and Voodoo Dolls are further examples.
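
As one example, the characteristic element of HOMER is what happens after the ray-casting selection: the virtual hand jumps to the object, and the object's distance from the user then tracks the hand's distance with a ratio fixed at selection time. A minimal sketch of that distance mapping (the function name is mine; rotations simply follow the hand):

```python
def homer_object_distance(d_obj_init, d_hand_init, d_hand_now):
    """Scaled reach: small hand movements move a distant object far, with the
    scale factor d_obj_init / d_hand_init frozen at the moment of selection."""
    return d_obj_init * (d_hand_now / d_hand_init)
```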

The sheer range of interaction techniques can be daunting to the developer, but the choice can be guided by some general rules. Finding the "best" technique is not always possible, since efficiency varies with the task and the environment, and non-realistic, "magic" techniques often outperform their real-world counterparts. Introducing constraints and reducing degrees of freedom is also recommended. Evaluation of virtual manipulation techniques provides quantitative feedback and remains an important area of research.

System Control

In system control, a command is issued to change either the state of the system or the mode of interaction. There are therefore some similarities between system control and object selection: issuing a command always involves selecting an element from a set.

Much attention has been given to issuing commands in desktop applications, but desktop interaction styles such as pull-down menus and command-line input are not always applicable in VEs. The basic challenge is that a normally one- or two-dimensional task becomes three-dimensional, which significantly reduces the effectiveness of the usual techniques: touching a menu item floating in space is harder than selecting one on the desktop, both because of the extra dimension and because of the lack of physical constraints such as a desk surface under a mouse. There are very few evaluation results for system control techniques, and most implementations have been ad hoc, with little structure applied to the problem.

System control techniques for immersive VEs can be sorted into four categories:

• graphical menus

• voice commands

• gestural interaction

• tools

There are also hybrid approaches that combine several of these types. Ideally, system control is integrated unobtrusively into the overall interaction, so that it neither impedes the action nor pulls focus from the user's task. "Modeless" interaction, in which mode transitions happen seamlessly, serves this goal best. A natural spatial reference to the user's body or head can also make the system control interface easier to access.
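
One concrete example of a graphical menu with a natural body reference is a ring menu, in which the items circle the user's hand and wrist roll selects among them, reducing a 3D task to a single rotational degree of freedom. A minimal sketch (the mapping details are illustrative):

```python
import math

def ring_menu_item(items, wrist_roll):
    """Map the wrist roll angle (radians) to the highlighted menu item,
    dividing the full turn evenly among the items."""
    slot = 2 * math.pi / len(items)
    index = int(round(wrist_roll / slot)) % len(items)
    return items[index]

# e.g. ring_menu_item(["copy", "paste", "delete", "undo"], 1.6) -> "paste"
```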

The spatial-reference guideline applies mostly to graphical menus, but tools likewise benefit from a solid spatial context. A multimodal system control interface can also permit a more fluid integration of system control into the workflow. When accessing a menu, the user must choose a command, and if the set of commands is large, the items need to be organized appropriately; context-sensitive menus or clearly labeled hierarchies of (sub)menus can help. To avoid mode errors, which can significantly disrupt an application's workflow, feedback should be provided at every stage, beginning with the selection of a command and continuing afterward.

Further research is needed to evaluate system control techniques within realistic application domains and to determine their performance and applicability.

2D Interaction in 3D Environments

A common mistake when designing 3D user interfaces is to assume that, because the virtual world lets users create, select and manipulate objects in three dimensions, 3D interaction is the only way to operate them. In fact, 2D interaction offers significant advantages for certain tasks. When haptic or tactile displays are absent, 2D interaction on a physical surface provides useful feedback, particularly for tasks such as creating objects, writing and annotating. The most efficient selection techniques tend to be fundamentally 2D, even though subsequent manipulation may call for 3D interaction. By exploiting the strengths of both 2D and 3D techniques, designers can create interfaces suited to each type of application that are more intuitive and easier to use.

Seamless integration of 2D and 3D interaction techniques is an essential design principle, from both a physical and a logical perspective. Physically, 2D and 3D devices should be integrated so that users can switch between them effortlessly. Logically, the application should know whether a device is currently being used for 2D or for 3D interaction; contextual information of this kind reduces the cognitive load on the user.

2D/3D interfaces can be roughly sorted into three groups. All of them require a physical surface for 2D input; what distinguishes them is how that surface is used.

Fully immersive — With a fully immersive display, the user cannot physically see the 2D surface even though it is part of the setup. A graphical representation of the surface must therefore be provided so the user can interact with it in the virtual world.

Semi-immersive — The second category covers applications that use semi-immersive displays such as workbenches. Users interact directly with a physical 2D surface, either on the display surface of the workbench itself or on a tracked, transparent tablet held in the hand. In the latter case the graphics are projected onto the primary display but appear as if they were on the tablet's surface.

Non-immersive — A separate 2D display is used, such as a handheld computer, smartphone or tablet.

Final Thoughts

Creating effective interaction techniques is crucial for designing intuitive and user-friendly 3D user interfaces. As 3D interfaces become more prevalent in emerging tech and daily use, it is important to stay current with the latest trends and advancements in design. One such trend is the growing use of natural user interfaces, which rely on gestures and voice commands to interact with 3D environments.

The third article of this series will be dedicated to interaction design philosophy: the overarching design principles and strategies behind effective 3D user interfaces. Understanding the philosophy and theory behind interaction design lets us anticipate future trends to some degree and adjust our choice of techniques accordingly.