There is a widespread belief that some models are more "natural" than others. For example, the standard reference work on computer graphics by Foley et al. [2] claims that:
Later, the same text states:
One color model, TekHVC, has even been patented by Tektronix Corporation [9].
"Natural" and "intuitive" can be defined many ways. We could compare the effects of color models on user performance using any number of criteria. How quickly can an expert perform color selection? How accurately can an expert match? How much practice does it take to arrive at a specified level of proficiency, with proficiency defined in terms of time and accuracy? How well does a color model help a user select harmonious or functionally useful colors?
There is surprisingly little empirical data comparing color models. Schwarz, Cowan, and Beatty [6] performed the major experiment to date (referred to in the remainder of this paper as "the Schwarz study"). They compared five models and found many significant differences between them. In particular, they found the RGB model to be the fastest yet least accurate, while the HSV model was amongst the slowest and most accurate. The slower speed of the HSV model suggests it is not so "tractable" as commonly believed. This belief is so strongly rooted that the above quote from Foley immediately follows their description of the Schwarz results, yet the apparent contradiction is unremarked.
The arguments for or against a given color model are typically presented abstractly, without consideration of the interface which represents that model to the user. The model and the interface cannot be separated, because the factors affecting performance of a color model are complex and strongly affected by the interface: perception, screen representation, learning, and time vs. accuracy tradeoffs. The complexity of the issue and the perplexing nature of some of the Schwarz results make the area worth revisiting. We begin with a careful analysis of the empirical results to date.
Murch [5] summarizes a color-matching experiment comparing the number of steps subjects took to match colors using the RGB, HSL, and Swedish Natural Color System models. A mixture of experienced and inexperienced subjects was used. For inexperienced subjects the HSL system is reported to require the fewest steps. Unfortunately, no details of this experiment were ever published, so we can neither analyze his results nor attempt to replicate them.
As mentioned earlier, the Schwarz study is the current reference work for empirical data about color models. In this study, five different color models were compared: LAB, HSV, Opponent, YIQ, and RGB. Each color model was implemented with two input methods. The experimental design was five by two factorial, with color model and input method as the between-subjects variables.
While the experiment studied two methods of input, both were limited in similar ways. In each case, the screen representation was identical. For each match, a "target" color was displayed in a rectangle on the screen and the subject used the interface to set the color of a second "controlled" rectangle to match the first one as closely as possible (Color Plate 1).
Color Plate 1: the screen representation for the Schwarz experiment
The interfaces differed in how the subject used a puck on a graphics tablet to control the three parameters of the color model. In one interface, called "3*1d", each parameter was controlled by horizontal motion (Figure 1). Three puck buttons selected which parameter was being actively controlled. In the second method, called "2d + 1d", horizontal motion controlled one parameter and vertical motion another. Pressing a button made horizontal motion control the third parameter.
Figure 1: a "3*1d" input scheme. All parameters are controlled by horizontal puck movement, with three separate buttons selecting the active parameter. (Adapted from Fig. 6 of [6].)
A given subject used one color model / interface combination to match five colors six times. The authors analyzed the time required by the subjects to match, the accuracy of the final match, relationship of color model axes to accuracy in various perceptual attributes of the target, time to reach a given level of accuracy, and learning. Accuracy was measured in color distance units (cdus) of the LAB color space. The amount of data reported in their work is tremendous; the following two statistically significant results stand out:
In neither interface was the subject given any indication of the dimensions of each parameter of the space. Furthermore, the subjects were given no indication of how close they were to the boundaries of each axis. The controlled color simply stopped changing without warning when the subject moved the puck outside the range of one of the coordinates. Schwarz et al. refer to this behavior as "clipping". We conjecture the subjects may have been slower and less accurate because they had so little information available to orient themselves within the color space. This inadequate representation may even have affected some color models more than others, biasing the comparison of models.
The lack of visual representation was compounded by the use of relative coordinates for the puck. This meant there was no fixed mapping between the location of the puck on its tablet and the value of a given parameter, eliminating another means the subjects might have used to orient themselves in the space. For example, the puck location gave no indication of how close they were to being clipped. The subjects couldn't orient themselves by "feel" (the location of their arm) nor visually (by looking at the puck).
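The combination of relative coordinates and clipping can be illustrated with a minimal sketch. The function name, the normalized 0 to 1 parameter range, and the step sizes are assumptions made for illustration; they are not taken from the Schwarz apparatus.

```python
def apply_relative_motion(value, delta, lo=0.0, hi=1.0):
    """Update one parameter from a relative puck displacement.

    The result is clipped silently at the axis bounds, so further motion
    past a bound changes nothing (hypothetical normalized range).
    """
    return max(lo, min(hi, value + delta))


# The subject pushes the puck to the right four times, then back once.
value = 0.5           # current parameter value
puck_travel = 0.0     # net physical displacement of the puck
history = []
for delta in (0.25, 0.25, 0.25, 0.25, -0.25):
    value = apply_relative_motion(value, delta)
    puck_travel += delta
    history.append((puck_travel, value))
```

In this sketch the parameter reaches its upper bound after the second step and stays there, while the puck keeps travelling. After the final step the puck has moved a net 0.75 units but the parameter has risen by only 0.25, so neither the puck position nor the arm position tells the subject where the parameter lies within its range.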
While the subjects were initially instructed in the effects of each parameter, they had no visible representation of what the parameters did. It is difficult to infer the effect of a parameter from simply moving the puck and observing the results, because the kind of change in the controlled color varies depending upon what color is currently displayed. For example, in the HSV model the saturation parameter will vary the amount of red if the controlled color has a red hue, but will vary the amount of blue if the controlled color has a blue hue. Similar interactions between the parameters occur in the other color models.
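The hue dependence of the saturation parameter can be seen in a short sketch. It assumes the standard hexcone HSV conversion provided by Python's colorsys module, which need not match the exact HSV implementation used in either experiment; the helper name is hypothetical.

```python
import colorsys


def rgb_change_from_saturation(hue, s_from, s_to, value=1.0):
    """Return the per-channel RGB change caused by moving only saturation.

    hue, saturation and value are in the range 0..1, as colorsys expects.
    """
    before = colorsys.hsv_to_rgb(hue, s_from, value)
    after = colorsys.hsv_to_rgb(hue, s_to, value)
    return tuple(round(a - b, 6) for a, b in zip(after, before))


red_hue = 0.0
blue_hue = 2.0 / 3.0

# The same slider movement (saturation 0.5 -> 1.0) at two different hues.
change_at_red = rgb_change_from_saturation(red_hue, 0.5, 1.0)
change_at_blue = rgb_change_from_saturation(blue_hue, 0.5, 1.0)
```

Under these assumptions the identical saturation adjustment removes green and blue when the hue is red, but removes red and green when the hue is blue. A subject watching only the controlled rectangle therefore sees a different kind of change for the same motion.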
In summary, the Schwarz study contradicts widely-held folklore about the superiority of the HSV color model. A question remains: Will these effects recur in an interface with higher feedback?
We were interested in how feedback in the interface affects the usage of color models; our experiment used two interfaces which had increasing levels of visual feedback. To address our first two concerns about representation of the color space, both of our interfaces displayed the location of the current color within the color model. Each of the three parameters of the model was represented by a slider, and an arrow indicated the current value of the associated parameter within its total range (see Color Plates 2 and 3). The user controlled the current color either by dragging the arrow along the slider with the mouse or by clicking directly on some row of the slider. Both of these methods were explicitly demonstrated to the user during the instructional phase.
Color Plate 2: the "position-only" interface
Color Plate 3: the "position+effect" interface
We wanted to see if the interface, and in particular the visual representation of the effects of each parameter, measurably affected users' performance with a color model. To compare the influence of such a representation, we used two different formats of sliders. One format, called "position-only" (Color Plate 2), gave no indication of what each parameter did: the interior of the sliders was a constant gray at all times. The second format, called "position+effect" (Color Plate 3), filled each slider with a range of colors. Each pixel row on the slider displayed the color that the controlled rectangle would take if the arrow were moved to that row. A user could look at the slider and know what effect it would have if it were moved to any point.
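As a rough illustration of how a "position+effect" slider could be filled, the sketch below computes one color per pixel row by varying a single parameter while holding the other two at their current values. It assumes an HSV model converted with Python's colorsys module and a row order running from the minimum to the maximum parameter value; the function name, row count, and ordering are hypothetical rather than details of our implementation.

```python
import colorsys


def slider_fill(current, axis, rows):
    """Return the RGB color shown on each pixel row of one slider.

    current: the (h, s, v) triple of the controlled color, each in 0..1.
    axis:    index of the parameter this slider controls (0, 1 or 2).
    rows:    number of pixel rows in the slider (at least 2).
    """
    colors = []
    for row in range(rows):
        params = list(current)
        params[axis] = row / (rows - 1)   # value the arrow would select
        colors.append(colorsys.hsv_to_rgb(*params))
    return colors


# Saturation slider for a fully saturated, full-value red.
fill = slider_fill((0.0, 1.0, 1.0), axis=1, rows=3)
```

Because each slider depends on the other two parameters, all three fills would need to be recomputed whenever the controlled color changes.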
The experiment was run on an Apple Macintosh IIfx with an Apple 8.24GC accelerated graphics card capable of representing 16 million colors. A SuperMac PressView 21 display was used, set to a white point of D65. This monitor is designed to be used in color-critical applications, has stable color rendition characteristics, and can be set for various calibrated white point and gamma values. We recalibrated the monitor using the SuperMatch calibration tool several times during the experiment to maintain the chromaticities of our target colors. All of our target colors were represented in the RGB space of the standard EBU phosphor chromaticities used in this monitor, and our RGB and HSV models used these phosphor chromaticities as the basis of their axes.
We selected thirty colors. Six of these were from the original Schwarz experiment. The remaining twenty-four were taken from the MacBeth ColorChecker chart [4], a standard reference chart for tests of color rendition. Twelve of these are representative of colors commonly found in natural and office environments (flesh tones, sky blue, and common office colors), six are the additive and subtractive color primaries, and the final six are an achromatic ramp from black to white.
Our data collection program logged the current slider positions and the value of the controlled color every tenth of a second. In particular, the final reading of each match indicated the total time and distance between the controlled and target colors.
The experimenter demonstrated how to manipulate the sliders using the mouse. To replicate a situation of use similar to what most users typically encounter, no abstract explanation of the color model was given to subjects. They were not explicitly told what the three parameters of their color model were, nor were the parameters named on the screen. In the instructions, subjects were asked to learn how the different sliders affected the color during the course of the experiment. We used Schwarz' wording to describe how closely the subjects should try to match the target color: "continue to refine the match until you think they are the same color or until it becomes extremely difficult to get the colors any closer to each other."
After the instruction period, subjects were given ten minutes to practice using the system. The experimenter was present for the first match and then left them alone for the remainder of their practice. The colors used during the practice time were different from those used in the actual experiment. After the practice period, the experimenter returned and started the sequence of thirty experimental colors. Subjects had three minutes to complete a single match. If they did not finish in three minutes, the match was ended by the program and the subjects moved on to the next color. Subjects performed this sequence alone and at their own pace. Times for this phase of the experiment ranged from twenty minutes to an hour and a half, with most subjects taking about forty-five minutes. A concluding questionnaire asked about the subjects' satisfaction and comfort with the experiment.
The high variance of this population may have masked
some effects of color model and feedback. With twenty-four
subjects in each color model or feedback condition, we have
a 95% chance, i.e., beta = .05, of detecting differences of 25
seconds or more between the mean times of the two color
models. Accurately detecting differences of, say, 10 seconds
would have required over 100 subjects in each condition,
far more than was practically possible. Considering accuracy
of final match, our experiment could detect differences of 3
cdus or more between the mean accuracies of the two color
models. While our sample size may not be large enough to
detect moderate differences in time and accuracy between the
models, we feel it is sufficient to detect the kind of major
differences which folklore ascribes to color model.
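The sample size figure can be checked with a back-of-the-envelope sketch. It assumes that the variance, significance level, and power are held fixed, in which case the number of subjects needed per condition scales roughly with the inverse square of the difference to be detected. The function name is hypothetical, and this is an approximation rather than the power calculation we actually performed.

```python
import math


def subjects_needed(n_known, diff_known, diff_target):
    """Scale a known sample size to a smaller detectable difference.

    Assumes n is proportional to 1 / difference**2 at fixed variance,
    significance level and power.
    """
    return math.ceil(n_known * (diff_known / diff_target) ** 2)


# 24 subjects per condition detect 25 seconds; how many for 10 seconds?
n_for_ten_seconds = subjects_needed(24, 25, 10)
```

Under this approximation about 150 subjects per condition would be needed, which is consistent with the statement that over 100 would have been required.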
Comparing the conclusions of the Schwarz study and our
own suggests which factors determine the influence of color
model on a color matching task. We consider three factors:
practice effects, levels of visual feedback of the interfaces,
and user selection strategies.
Practice Effects
It takes time to learn a color model. Perhaps differences
between color models would emerge after the subjects had
more practice. Although in both Schwarz and our present
study subjects completed thirty trials, the Schwarz study
used the same five colors repeated six times. Our study
used thirty different colors to more nearly represent the wide
variety that users typically select. We observed that in all
our conditions subjects had high variance, suggesting that
they were still in early stages of learning. Reducing
variance with more practice would perhaps create significant
differences in our experiment. For example, RGB with
high feedback, which has a mean nearly ten seconds less
than the other conditions, might be significantly faster (see
Table 2).
To discover user strategies we must look at the paths individual users took through the color space from the initial gray towards the target color. A color model is supposed to help a user predict which adjustments to the sliders will take the controlled color closer to the target. We sought a measure of how directed a subject's activity was. We define a move to be a single adjustment to a slider, from the time the mouse button was pressed inside the slider to the time the button was released. We characterize a move by its final color, the value of the controlled color at the time the button was released.
If a user has learned a color model and is actively using it to plan a path towards the target color, most of the moves will end with the controlled color closer to the target. We used the notion of moves to quantify how much backtracking the subjects did. If most of a subject's moves end closer to the target, this is evidence that the subject is actively using the color model.
We broke every trial down into its constituent moves and computed the distance of the ending color from the target. Table 4 lists the percentage of total moves which ended with the subject closer to the target.
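The move-based measure can be sketched as follows. The sketch assumes that colors are given as LAB-like triples, that distance is Euclidean, and that "closer" means closer than the color at the start of the move (the end of the previous move, or the initial gray for the first move). The function name and the sample coordinates are hypothetical.

```python
import math


def fraction_closer(start, move_ends, target):
    """Fraction of moves whose final color is closer to the target
    than the color the move started from."""
    if not move_ends:
        return 0.0
    previous = math.dist(start, target)
    closer = 0
    for end in move_ends:
        current = math.dist(end, target)
        if current < previous:
            closer += 1
        previous = current   # the next move starts where this one ended
    return closer / len(move_ends)


# One invented trial: initial gray, four moves, the third one backtracks.
start = (50.0, 0.0, 0.0)
target = (60.0, 20.0, 0.0)
moves = [(55.0, 0.0, 0.0), (55.0, 10.0, 0.0), (55.0, 40.0, 0.0), (60.0, 20.0, 0.0)]
share = fraction_closer(start, moves, target)
```

A value near 1.0 would indicate directed movement towards the target, while a value near 0.5 indicates that moves helped about as often as they hurt.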
The subjects only moved closer to their target on a little more than half of their moves. This admittedly crude measure suggests that the subjects were not using the color model to predict where their moves were going but instead adopted some sort of simple hill-climbing strategy. This hypothesis is also supported by the significant improvement in accuracy for the "position+effect" interface, which allows the subject to predict the effect of the next move by looking at the screen display rather than reasoning based upon the color model. The subjects were more accurate when they could better predict their next move.
Our research explores the precise effects of these two factors and their interactions. While color selection is different from color matching, both tasks involve navigation through a color space, and we believe that the roles of color model and feedback in navigation are comparable in both tasks. Once controlled experiments have clarified the relationships between color model and feedback, more qualitative field studies can be used to examine their effects in actual contexts of use.