Object Manipulation | About Face 2.0(c) The Essentials of Interaction Design

Like controls, data objects on the screen, particularly graphical objects in drawing and modeling programs, can be manipulated by clicking and dragging. Objects (other than icons, which were discussed in the previous chapter) depend on click-and-drag motions for three main operations: repositioning, resizing, and reshaping.

Repositioning

Repositioning is the simple act of clicking on an object and dragging it to a new location. The most significant design issue regarding repositioning is that it usurps the place of other direct-manipulation idioms. The repositioning function demands the click-and-drag action, making it unavailable for other purposes. If the object is repositionable, the meaning of click-and-drag is taken and cannot be devoted to some other action, like scrolling or pressing a button, within the object itself.

The most common solution to this conflict is to dedicate a specific physical area of the object to the repositioning function. For example, you can reposition a window in Windows or on the Macintosh by clicking and dragging its title bar. The rest of the window is not pliant for repositioning, so the click-and-drag idiom is available for functions within the window, as you would expect. The only hint of the window's capability to be dragged is the color of the title bar, a subtle visual hint that is purely idiomatic: There is no way to intuit the presence of the idiom. But the idiom is very effective, which demonstrates the efficacy of idiomatic interface design. In general, however, you need to provide more explicit visual hinting of an area's pliancy. (Title bars could, for instance, use a slight shift in brightness as a pliancy hint, or you could use cursor hinting.) The cost of this solution is the number of pixels devoted to the title bar. Mitigating this cost is the fact that the title bar does multiple duty as a program identifier, active status indicator, and repository for certain other system-standard controls such as minimize, maximize, and close functions.
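As a rough illustration of dedicating one region of an object to repositioning, the following sketch classifies a mouse-down point as landing in the title bar, in the content area, or outside the window. The Window class, the 24-pixel title bar height, and the role names are assumptions made for this example, not any windowing system's actual API.

```python
from dataclasses import dataclass


@dataclass
class Window:
    x: int
    y: int
    width: int
    height: int
    title_bar_height: int = 24  # hypothetical value, chosen for illustration


def drag_role(win, px, py):
    """Decide what a click-and-drag starting at (px, py) should mean.

    Returns "move" inside the title bar, "content" elsewhere inside the
    window, and None when the point is outside the window.
    """
    inside_x = win.x <= px < win.x + win.width
    inside_y = win.y <= py < win.y + win.height
    if not (inside_x and inside_y):
        return None
    if py < win.y + win.title_bar_height:
        return "move"      # the drag repositions the window
    return "content"       # the drag is free for functions inside the window
```

Because only the title bar answers "move", drags that begin anywhere else remain available for scrolling, selecting, or pressing controls inside the window.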

Before an object can be moved, it must be selected. This is why selection must take place on the mouse-down transition: The user can then drag an object without first having to click and release to select it and then click and drag to reposition it. It feels much more natural to simply click the object and drag it to where you want it in one easy motion.

This creates a problem for moving contiguous data. In Word, for example, Microsoft uses a clumsy click-wait-click operation to drag chunks of text. You must click and drag to select a section of text and then wait a second or so and click and drag again to move it. This is unfortunate, but there is no good alternative for contiguous selection. If Microsoft were willing to dispense with its meta-key idioms for extending the selection, those same meta-keys could be used to select a sentence and drag it in a single movement, but this still wouldn't solve the problem of selecting and moving arbitrary chunks of text.

Resizing and reshaping

On the desktops of Windows and similar GUIs, there isn't really any functional difference between resizing and reshaping. The user adjusts a rectangular window's size and aspect ratio at the same time by clicking and dragging a dedicated control. On the Macintosh, there is a special resizing control on each window in the lower-right corner. Dragging this control allows the user to change both the height and width of the rectangle. Windows 3.x eschewed this idiom in favor of using the frame surrounding each window to allow resizing from any frame edge. It offered both generous visual hinting and cursor hinting, so it was easily discovered. Windows 95 added to this a reshaping-resizing control remarkably similar to the Macintosh's lower-right-corner reshaper/resizer, although it narrowed its window frame (resizing from the frame still works, though). Today Windows retains the frame and its cursor hinting, but virtually no visual hinting of the frame remains. For users who notice the cursor hinting, it provides the best of both worlds.

Such idioms are appropriate for resizing windows, but when the object to be resized is a graphical element in a drawing or modeling program, it is not acceptable to permanently superimpose such controls on objects. A resizing idiom for graphical objects must be visually bold to differentiate itself from parts of the drawing, especially the object it controls, and it must be respectful of the user's view of the object and the area around it. The resizer must also not obscure the resizing action.

A popular idiom accomplishes these goals; it consists of eight little black squares positioned one at each corner of a rectangular object and one centered on each side. These little black squares, shown in Figure 24-1, are called resize handles (or, simply, handles).

Figure 24-1: The selected object has eight handles, one at each corner and one centered on each side. The handles indicate selection and are a convenient idiom for resizing and reshaping the object. Handles are sometimes implemented with pixel inversion, but in a multicolor universe they can get lost in the clutter.

Handles are a boon to designers because they can also indicate selection. This is a naturally symbiotic relationship because an object must usually be selected to be resizable.

The handle centered on each side moves only that side, while the other sides remain motionless. The handles on the corners simultaneously move both the sides they touch, an interaction which is quite visually intuitive.
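The behavior of the eight handles can be sketched in a few lines. This is a minimal illustration under assumed names: the Rect class, the compass-style handle labels ("n", "se", and so on), and the function names are all invented here, and the sketch omits practical details such as enforcing a minimum size.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    right: float
    bottom: float


def handle_positions(r):
    """Return the eight handle centers, keyed by compass-style names."""
    cx = (r.left + r.right) / 2
    cy = (r.top + r.bottom) / 2
    return {
        "nw": (r.left, r.top), "n": (cx, r.top), "ne": (r.right, r.top),
        "w": (r.left, cy), "e": (r.right, cy),
        "sw": (r.left, r.bottom), "s": (cx, r.bottom), "se": (r.right, r.bottom),
    }


def drag_handle(r, handle, dx, dy):
    """Return the rectangle after dragging one handle by (dx, dy).

    A side handle moves only its own side; a corner handle moves both
    sides it touches. Movement that a handle cannot use is ignored.
    """
    left, top, right, bottom = r.left, r.top, r.right, r.bottom
    if "w" in handle:
        left += dx
    if "e" in handle:
        right += dx
    if "n" in handle:
        top += dy
    if "s" in handle:
        bottom += dy
    return Rect(left, top, right, bottom)
```

Dragging the "e" handle therefore changes only the right edge, regardless of any vertical mouse movement, while dragging "se" changes the right and bottom edges together.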

Handles tend to obscure the object they represent, so they don't make very good permanent controls. This is why we don't see them on top-level resizable windows (although windows in some versions of Sun's Open Look GUI come close). For that situation, frame or corner resizers are better idioms. If the selected object is larger than the screen, the handles may not be visible. If they are hidden off-screen, not only are they unavailable for direct manipulation, but they are useless as indicators of selection.

Notice that the assumption in this discussion of handles is that the object under scrutiny is rectangular or can be easily bounded by a rectangle. Certainly in the Windows world, things that are rectangular are easy for programs to handle, and non-rectangular things are best handled by enclosing them in a bounding rectangle. If the user is creating an organization chart this may be fine, but what about reshaping more complex objects? There is a very powerful and useful variant of the resize handle: a vertex handle.

Many programs draw objects on the screen with polylines. A polyline is a graphics programmer's term for a multisegment line defined by an array of vertices. If the last vertex is identical to the first vertex, it is a closed form and the polyline forms a polygon. When the object is selected, the program, rather than placing eight handles as it does on a rectangle, places one handle on top of every vertex of the polyline. The user can then drag any vertex of the polyline independently and actually change one small aspect of the object's internal shape rather than affecting it as a whole. This is shown in Figure 24-2.
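A minimal sketch of vertex handles follows, assuming a polyline stored as a list of (x, y) tuples. The function names and the 4-pixel hit radius are illustrative choices, not taken from any particular drawing program.

```python
import math


def is_closed(vertices):
    """A polyline is a polygon when its last vertex repeats its first."""
    return len(vertices) > 2 and vertices[0] == vertices[-1]


def hit_vertex(vertices, px, py, radius=4.0):
    """Return the index of the first vertex handle under (px, py), or None."""
    for i, (vx, vy) in enumerate(vertices):
        if math.hypot(px - vx, py - vy) <= radius:
            return i
    return None


def drag_vertex(vertices, index, new_pos):
    """Return a copy of the polyline with one vertex moved.

    For a closed form, the first and last vertices are kept identical
    so that the polygon stays closed.
    """
    moved = list(vertices)
    moved[index] = new_pos
    if is_closed(vertices) and index in (0, len(vertices) - 1):
        moved[0] = moved[-1] = new_pos
    return moved
```

Only the dragged vertex changes, so only the segments that meet at that vertex are reshaped; the rest of the object is untouched.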

Figure 24-2: These are vertex handles, so named because there is one handle for each vertex of the polygon. The user can click and drag any handle to reshape the polygon, one segment at a time. This idiom is useful for drawing programs, but it may have application in desktop productivity programs, too.

Freeform objects in PowerPoint are rendered with polylines. If you click on a freeform, it is given a bounding rectangle with the standard eight handles. If you right-click on the freeform and choose Edit Points from the context menu, the bounding rectangle disappears and vertex handles appear instead. It is important that both these idioms are available, as the former is necessary to scale the image in proportion, whereas the latter is necessary to fine-tune the shape.

Resizing and reshaping meta-key variants

In the context of dragging, a meta-key is often used to constrain the drag to an orthogonal direction. This type of drag is called a constrained drag, and is shown in Figure 24-3.

Figure 24-3: When a drag is constrained, usually by holding down the Shift key, the object is only dragged along one of the four axes shown here. The program selects which one by the direction of the initial movement of the mouse, an implementation of the drag threshold discussed later in the chapter.

A constrained drag is a drag whose path is limited to a straight line up, down, left, right, or at a 45-degree angle, regardless of the path along which the user actually moves the mouse. Usually, the Shift meta-key is used, but this convention varies from program to program. Constrained drags are extremely helpful in drawing programs, particularly when drawing neatly organized diagrams. The predominant motion of the first few millimeters of the drag determines the angle of the drag. If the user begins dragging on a predominantly horizontal axis, for example, the drag will henceforth be constrained to the horizontal axis. Some programs interpret constraints differently, letting the user shift angles in mid-drag by dragging the mouse across a threshold.
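One way to implement this behavior is to wait until the mouse has moved a small distance, pick the axis that best matches that initial movement, and from then on project the mouse position onto that axis. The sketch below assumes screen coordinates in pixels with y increasing downward; the 4-pixel threshold, the axis names, and the function names are hypothetical choices for illustration.

```python
import math

S = math.sqrt(0.5)

# Unit direction vectors for the four constraint axes (y grows downward).
AXES = {
    "horizontal": (1.0, 0.0),
    "vertical": (0.0, 1.0),
    "diagonal_down": (S, S),
    "diagonal_up": (S, -S),
}

DRAG_THRESHOLD = 4.0  # pixels of movement before an axis is chosen (assumed)


def pick_axis(start, current):
    """Choose the axis most aligned with the initial movement.

    Returns None while the mouse is still within the drag threshold.
    """
    dx, dy = current[0] - start[0], current[1] - start[1]
    if math.hypot(dx, dy) < DRAG_THRESHOLD:
        return None
    return max(AXES, key=lambda name: abs(dx * AXES[name][0] + dy * AXES[name][1]))


def constrain(start, current, axis_name):
    """Project the current mouse position onto the chosen axis through start."""
    ux, uy = AXES[axis_name]
    dx, dy = current[0] - start[0], current[1] - start[1]
    t = dx * ux + dy * uy  # signed distance travelled along the axis
    return (start[0] + t * ux, start[1] + t * uy)
```

A program that locks the axis would call pick_axis once and reuse the result; one that lets the user shift angles in mid-drag could call it again on every mouse move.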

The Paint program that comes with Windows doesn't constrain drags when moving an object around, but it does constrain the drawing of a few shapes, like lines and circles. Most drawing programs (like PowerPoint) that treat their graphics as objects instead of bits (as Paint does) allow constrained drags, and more sophisticated paint applications like Adobe Photoshop support the constrained drag idiom.

The use of meta-keys gives rise to a curious question: Where in the drag does the meta-key become meaningful? In other words, must the meta-key be held down when the drag begins — when the mouse button descends — or is it merely necessary for the meta-key to be pressed at some point during the drag? Or must the meta-key remain pressed at the time the user releases the mouse button? The best answer is this: The user should be able to switch to and receive visual feedback of constrained drag by pressing the meta-key at any time after he starts to drag. If he lets go of the meta-key during the drag, it reverts to unconstrained drag. Finally, if the computer detects that the meta-key is held down at the instant when the mouse button is released, the constrained drag is confirmed. This is true in PowerPoint and Paint, for example.
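The rule recommended above amounts to sampling the meta-key on every mouse event rather than only at mouse-down. The following sketch shows that logic in isolation. The DragSession class and its method names are invented for this example, and for brevity the constraint is simplified to horizontal or vertical only.

```python
def constrain_to_axis(start, pos):
    """Snap pos to the horizontal or vertical line through start,
    whichever matches the larger component of the movement."""
    dx, dy = pos[0] - start[0], pos[1] - start[1]
    if abs(dx) >= abs(dy):
        return (pos[0], start[1])
    return (start[0], pos[1])


class DragSession:
    """Tracks one drag from mouse-down to mouse-up."""

    def __init__(self, start):
        self.start = start

    def preview(self, pos, meta_down):
        """Position to draw as feedback for a mouse-move event.

        The meta-key is checked on every event, so the user can switch
        into or out of the constraint at any time during the drag.
        """
        if meta_down:
            return constrain_to_axis(self.start, pos)
        return pos

    def release(self, pos, meta_down):
        """Final position: constrained only if the key is down at mouse-up."""
        return self.preview(pos, meta_down)
```

Because release applies the same test as preview, what the user sees at the moment of mouse-up is what is committed.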

In an interesting bit-drawing variant, Paint also allows drag constraints in its pencil tool; any time the meta-key is held down during dragging, the constraint affects what is drawn during the drag. Mouse-up stops the flow of digital ink.

3D object manipulation

Working with precision on three-dimensional objects presents considerable interaction challenges for users equipped with 2D input devices and displays. Some of the most interesting research in UI design involves trying to develop better paradigms for 3D input and control. So far, however, there seem to be no real revolutions, but merely evolutions of 2D idioms extended into the world of 3D.

Most 3D applications are concerned either with precision drafting (for example, architectural CAD) or with 3D animation. When models are being created, animation presents problems similar to those of drafting. An additional layer of complexity is added, however, in making these models move and change over time. Often, animators create models in specialized applications and then load these models into different animation tools.

There is such a depth of information about 3D-manipulation idioms that an entire chapter or even an entire book could be written about them. We will thus briefly address some of the broader issues of 3D object manipulation.

DISPLAY ISSUES AND IDIOMS

Perhaps the most significant issue in 3D interaction on a 2D screen is the lack of parallax, the binocular cue that allows us to perceive depth. Without resorting to expensive, esoteric goggle peripherals, designers are left with a small bag of tricks with which to conquer this problem. Another important issue is one of occlusion: near objects obscuring far objects. These navigational issues, along with some of the input issues discussed in the next section, are probably a large part of the reason virtual reality hasn't yet become the GUI of the future.

MULTIPLE VIEWPOINTS Use of multiple viewpoints is perhaps the oldest method of dealing with both of these issues, but it is, in many ways, the least effective from an interaction standpoint. Nonetheless, most 3D modeling applications present multiple views on the screen, each displaying the same object or scene from a different angle. Typically, there is a top view, a front view, and a side view, each aligned on an absolute axis, which can be zoomed in or out. There is also usually a fourth view, an orthographic or perspective projection of the scene, the precise parameters of which can be adjusted by the user. When these views are provided in completely separate windows, each with its own frame and controls, this idiom becomes quite cumbersome: Windows invariably overlap each other, getting in each other's way, and valuable screen real estate is squandered with repetitive controls and window frames. A better approach is to use a multipane window that permits 1-, 2-, 3-, and 4-pane configurations (the 3-pane configuration has one big pane and 2 smaller panes). Configuration of these views should be as close to single-click actions as possible, using a toolbar or keyboard shortcut.

The shortcoming of multiple viewpoint displays is that they require the user to look in several places at the same time to figure out the position of an object. Forcing the user to locate something in a complex scene by looking at it from the top, side, and front, and then to triangulate in his head in real time, is a bit much to expect, even from modeling whizzes. Nonetheless, multiple viewpoints are helpful for precisely aligning objects along a particular axis.

BASELINE GRIDS, DEPTHCUEING, SHADOWS, AND POLES Baseline grids, depthcueing, shadows, and poles are idioms that help to get around some of the problems created by multiple viewpoints. The idea behind these idioms is to allow users to successfully perceive the location and movement of objects in a 3D scene projected in an orthographic or perspective view.

Baseline grids provide virtual floors and walls to a scene, one for each axis, which serve to orient users. This is especially useful when (as is usually the case) the camera viewpoint can be freely rotated.

Depthcueing is a means by which objects deeper in the field of view appear dimmer. This effect is typically continuous, so even a single object's surface will exhibit depthcueing, giving useful clues about its size, shape, and extent. Depthcueing, when used on grids, helps disambiguate the orientation of the grid in the view.
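A simple form of depthcueing scales a color's brightness linearly with distance from the viewer. The sketch below is one assumed formulation: the function name, the near and far parameters, and the 0.2 minimum intensity are illustrative, and real renderers may use other falloff curves.

```python
def depth_cue(color, depth, near, far, min_intensity=0.2):
    """Dim an (r, g, b) color according to its depth in the view.

    Points at `near` keep full brightness; points at `far` or beyond
    are dimmed to `min_intensity`. Depths in between are interpolated
    linearly, so the effect is continuous across a single surface.
    """
    t = (depth - near) / (far - near)
    t = min(1.0, max(0.0, t))                    # clamp to the [near, far] range
    intensity = 1.0 - t * (1.0 - min_intensity)
    return tuple(round(c * intensity) for c in color)
```

Applying the function per vertex or per pixel, rather than once per object, is what lets a single surface or grid line fade as it recedes.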

One method used by some 3D applications for positioning objects is the idea of shadows — outlines of selected objects projected onto the grids as if a light is shining perpendicularly to each grid. As the user moves the object in 3D space, she can track, by virtue of these shadows or silhouettes, how she is moving (or sizing) the object in each dimension.

Shadows work pretty well, but all those grids and shadows can get in the way visually. An alternative is the use of a single floor grid and a pole. Poles work in conjunction with a horizontally oriented grid. When the user selects an object, a vertical line extends from the center of the object to the grid. As she moves the object, the pole moves with it, but the pole remains vertical. The user can see where in 3D space she is moving the object by watching where the base of the pole moves on the surface of the grid (x and y axes), and also by watching the length and orientation of the pole in relation to the grid (z axis).
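Both idioms reduce to simple projections. The sketch below assumes an axis-aligned bounding box for the selected object, grids lying on the planes z = 0, y = 0, and x = 0, and a floor at z = 0; the function names and dictionary keys are invented for illustration.

```python
def shadows(box_min, box_max):
    """Project an object's bounding box onto the three grid planes.

    Each shadow is a 2D rectangle given as (min corner, max corner),
    obtained by dropping the coordinate perpendicular to that grid.
    """
    (x0, y0, z0), (x1, y1, z1) = box_min, box_max
    return {
        "floor": ((x0, y0), (x1, y1)),       # light along z, onto the xy grid
        "back_wall": ((x0, z0), (x1, z1)),   # light along y, onto the xz grid
        "side_wall": ((y0, z0), (y1, z1)),   # light along x, onto the yz grid
    }


def pole(center, floor_z=0.0):
    """Return the pole for an object: its base on the floor grid and its length."""
    x, y, z = center
    return {"base": (x, y, floor_z), "length": abs(z - floor_z)}
```

The pole conveys the same position as the three shadows with far less visual clutter: the base gives x and y, and the length gives z.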

GUIDELINES AND OTHER RICH VISUAL HINTS The idioms described in the previous section are all examples of rich visual modeless feedback, which we will discuss in detail in Chapter 34. However, for some applications lots of grids and poles may be overkill. For example, @Last Software's SketchUp is an architectural sketching program where users can lay down their own drafting lines using tape measure and protractor tools and, as they draw out their sketches, get color-coded hinting that keeps them oriented to the right axes. Users can also turn on a blue-gradient sky and a ground color to help keep them oriented. Because the application is focused on architectural sketching, not general-purpose 3D modeling or animation, the designers were able to pull off a spare, powerful, and simple interface that is easy both to learn and to use (see Figure 24-4).

Figure 24-4: @Last Software's SketchUp is a gem of an application that combines powerful 3D architectural sketching capability with smooth interaction, rich feedback, and a manageable set of design tools. Users can set sky color and real-world shadows according to location, orientation, and time of day and year. These not only help in presentation, but help orient the user while building. Users also can lay down 3D grid and measurement guides just as in a 2D sketching application; the protractor tool is visible above. Camera rotate and zoom functions are cleverly mapped to the mouse scroll wheel, allowing fluid access while using other tools. ToolTips provide textual hints that assist in drawing lines and aligning objects.

WIRE FRAMES AND BOUNDING BOXES Wire frames and bounding boxes solve problems of object visibility. In the days of slower processors, all objects needed to be represented as wire frames because computers weren't fast enough to render solid surfaces in real time. It is fairly common these days for modeling applications to render a rough surface for selected objects, while leaving unselected objects as wire frames. Transparency would also work, but is still very computing-intensive. In highly complex scenes, it is sometimes necessary or desirable, but not ideal, to render only the bounding boxes of non-selected objects.

INPUT ISSUES AND IDIOMS

3D applications make use of many idioms such as drag handles and vertex handles that have been adapted from 2D to 3D. However, there are some special issues surrounding 3D input.

DRAG THRESHOLDS One of the fundamental problems with direct manipulation in a 2D projection of a 3D scene is the problem of translating 2D motions of the cursor in the plane of the screen into a more meaningful movement in the virtual 3D space.

In a 3D projection, a different kind of drag threshold is required to differentiate among movements along three axes, not just two. Typically, up and down mouse movements translate into movement along one axis, whereas 45-degree-angle drags are used for each of the other two axes. SketchUp provides color-coded hinting in the form of dotted lines when the user drags parallel to a particular axis, and it also hints with ToolTips. In a 3D environment, rich feedback in the form of cursor and other types of hinting becomes a necessity.
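The mapping just described can be sketched as follows. The on-screen directions assigned to each axis are assumptions for this example (z drawn straight up, x and y drawn along the two downward diagonals, with screen y increasing downward); a real application would derive them from the current camera projection. The 4-pixel threshold and the function names are likewise hypothetical.

```python
import math

S = math.sqrt(0.5)

# Assumed on-screen direction of each 3D axis (screen y grows downward).
SCREEN_AXES = {
    "z": (0.0, -1.0),   # vertical mouse movement
    "x": (S, S),        # 45 degrees toward lower right
    "y": (-S, S),       # 45 degrees toward lower left
}


def pick_3d_axis(dx, dy, threshold=4.0):
    """Choose a 3D axis from the initial 2D mouse movement, or None
    if the mouse has not yet moved past the drag threshold."""
    if math.hypot(dx, dy) < threshold:
        return None
    return max(
        SCREEN_AXES,
        key=lambda name: abs(dx * SCREEN_AXES[name][0] + dy * SCREEN_AXES[name][1]),
    )


def drag_to_3d(dx, dy, axis):
    """Convert a 2D mouse delta into an (x, y, z) delta along one axis."""
    ux, uy = SCREEN_AXES[axis]
    amount = dx * ux + dy * uy  # movement measured along the axis on screen
    return tuple(amount if name == axis else 0.0 for name in ("x", "y", "z"))
```

The name of the chosen axis is also what the program would use to drive hinting, for example by coloring a dotted guide line to match that axis.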

THE PICKING PROBLEM The other significant problem in 3D manipulation is known as the picking problem. Because objects need to be in wireframe or otherwise transparent when assembling scenes, it becomes difficult to know which of many overlapping items the user wants to select when she mouses over them. Locate highlighting can help, but is insufficient because the object may be completely occluded by others. Group selection is even trickier.

Many 3D applications resort to less direct techniques, such as an object list or object hierarchy that users can select from outside of the 3D view. Although this kind of interaction has its uses, there are more direct approaches.

For example, hovering over a part of a scene could open a ToolTip-like menu that lets users select one or more overlapping objects (this menu wouldn't be necessary in the simple case of one unambiguous object). If individual facets, vertices, or edges can be selected, each should hint at its pliancy as the mouse rolls over it.
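The menu idea can be sketched as a two-step decision: gather every object under the cursor, then select directly when there is exactly one candidate and offer a menu otherwise. The sketch assumes each object already has a screen-space bounding rectangle and a depth from the camera; the SceneObject class, its fields, and the returned action names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    screen_rect: tuple  # (left, top, right, bottom) in screen pixels
    depth: float        # distance from the camera


def candidates_at(objects, px, py):
    """Return all objects whose screen rectangle contains the cursor,
    ordered from nearest to farthest."""
    hits = [
        o for o in objects
        if o.screen_rect[0] <= px <= o.screen_rect[2]
        and o.screen_rect[1] <= py <= o.screen_rect[3]
    ]
    return sorted(hits, key=lambda o: o.depth)


def pick(objects, px, py):
    """Decide how to respond to a hover or click at (px, py)."""
    names = [o.name for o in candidates_at(objects, px, py)]
    if not names:
        return ("none", [])
    if len(names) == 1:
        return ("select", names)   # unambiguous: no menu needed
    return ("menu", names)         # let the user choose among overlaps
```

Listing candidates from nearest to farthest means that fully occluded objects still appear in the menu, which locate highlighting alone cannot offer.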

Although it doesn't address the issue directly, a smooth and simple way to navigate around a scene can also ameliorate the picking problem. SketchUp has mapped both zoom and orbit functions to the mouse scroll wheel. Spin the wheel to zoom in towards or away from the central zero point in 3D space; press and hold the wheel to switch from whatever tool you are using to orbit mode, which allows the camera to circle around the central axes in any direction. This fluid navigation makes manipulation of an architectural model almost as easy as rotating it in your hand.

OBJECT ROTATION, CAMERA MOVEMENT, ROTATION, AND ZOOM One more issue specific to 3D applications is the number of spatial manipulation functions that can be performed. Objects can be repositioned, resized, and reshaped in three axes. They can also be rotated in three axes. Beyond this, the camera viewpoint can be rotated in place or revolved around a focal point, also in three axes. Finally, the camera's field of view can be zoomed in and out.
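To make one of these functions concrete, the sketch below revolves a camera around a focal point about the vertical axis. It is a deliberately reduced illustration: it assumes z is the vertical axis and handles only one of the three rotation axes, and the function name and parameters are hypothetical.

```python
import math


def orbit_camera(camera, focus, yaw_degrees):
    """Revolve the camera position around a vertical (z) axis through focus.

    The camera's height and its horizontal distance from the focal point
    are preserved; only its bearing around the focal point changes.
    """
    angle = math.radians(yaw_degrees)
    ox, oy = camera[0] - focus[0], camera[1] - focus[1]
    x = focus[0] + ox * math.cos(angle) - oy * math.sin(angle)
    y = focus[1] + ox * math.sin(angle) + oy * math.cos(angle)
    return (x, y, camera[2])
```

Rotating the camera in place, by contrast, leaves its position unchanged and alters only the direction in which it looks, which is why the two operations need separate controls.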

This means that the assignment of meta-keys and keyboard shortcuts is critical in 3D applications. (Obviously, some of these controls can be put in toolbars, but dedicated users will almost exclusively use the keyboard to control these modes.) But there is another problem: It can be difficult to tell the difference between camera transformations and object transformations by looking at a camera viewpoint, even though the actual difference between the two can be quite significant. One way around this problem is to include a thumbnail, absolute view of the scene in a corner of the screen. It could be enlarged or reduced as needed, and could provide a reality check and global navigation method in case the user gets lost in space (note that this kind of thumbnail view is useful for navigating large 2D diagrams as well).