The discourse around digital color is complex and confusing. If you care about color at all and you work in digital, you should probably spend at least a whole day of your life on The Hitchhiker’s Guide to Digital Colour, which gives several messianic fucks about making this difficult stuff clearer. Or, if you prefer non-expletive-drenched, vintage HTML, head straight for the work of Charles Poynton, starting with his Color FAQ and Gamma FAQ. More sources at the end of this note.
What follows is a potted and probably mangled summary, written mostly to make sure that we have wrapped our heads around some of the basics… yes, another of those posts.
Color & light
Light is a fundamental entity of physics whose properties are scientifically measurable, mathematically describable, and typically linear in their relationships (1+1=2).
In contrast, human color perception is a subjective and partly psychological phenomenon, evolved to be useful, highly sensitive to the surrounding environment, and subject to all sorts of non-linearities and optical illusion type weirdnesses.
Radiance, luminance & tone
Bridging the two discourses, the physical and the perceptual, requires that we pay close attention to linguistic ambiguity and terminological nuance. We need to be able to talk about the physical brightness of a light, termed its radiance, measured in watts per steradian per square meter, and amenable to linear math. We also need to be able to talk about the perceived brightness of a light, termed its luminance, measured in candelas per square meter, and the value of which is adjusted to reflect the varying sensitivity of the human eye to different wavelengths of light. (Green light looks brighter than red or blue light of the same radiance.)
Finally, we also need to be able to talk about the perceptual response to the brightness of a light, termed lightness (or tone), which takes into account the fact that human subjects perceive luminance in a way that is both logarithmic, with much greater discrimination amongst dark tones than amongst light tones, and also relational, since our perceptual judgements of tone are affected by the luminance of the surrounding environment.
(Spoiler: this is why we typically encode a so-called gamma value and why a range of gamma values are in use).
Most perceptual concepts, including color, require this type of unpicking, and sometimes some daunting but not-really-very-complicated-math linking the several domains.
Color & chromaticity
Color, that is to say perceived color, is a subjective experience affected by many environmental and psychological factors, including the levels of ambient light, the luminance and lightness of nearby objects, the perceived color of adjacent objects, the location in the subject’s field of vision in which the colored object happens to appear, and also an important perceptual phenomenon known as chromatic adaptation whereby the subject gradually ignores the impact of imperfectly white light sources so as to always appear to see a white object as white.
Chromaticity, on the other hand, is an absolute and objective measure of color. A chromaticity is a coordinate in a color model that defines the ratios in which three primary wavelengths of light need to be mixed for a standard human observer to judge the mixed light as matching the color of a monochromatic source of light of a single wavelength.
For example, given three lights outputting particular wavelengths of red, green, and blue light, in what quantities must they be mixed to match the output of a pure yellow light emitting radiation of a single wavelength? This is an empirical question, and was answered empirically in the late 1920s in a series of experiments in which human subjects with normal color vision in a controlled environment were asked to color match the combined output of three primary lights to monochromatic spectral lights at 5 nm intervals of wavelength. Red, green, and blue primary lights were used because the red, green and blue photoreceptive cells (also known as cones) in the human retina are stimulated by light of those wavelengths.
Luminance is typically excluded when defining chromaticity because doing so yields a convenient result. If, instead of expressing red, green, and blue primaries as quantities (which would have a brightness) we express them as ratios (which could therefore have any brightness), and as ratios that add up to one, we only need two values to define a chromaticity because the third value can always be inferred (b = 1 – r – g). Which then means that the chromaticity map, like the CIE RGB chromaticity diagram below, can be presented as a two-dimensional area (a projected plane) rather than as a three-dimensional solid. Convenient, because 3D maps are quite hard to read. That said, whilst chromaticity coordinates may be plotted two dimensionally in this way, a third dimension does need to be added whenever we wish to bring luminance back into the picture.
(It is worth emphasizing here that the coordinates of a chromaticity together with a specified luminance do not define a color response in the subject. They merely define the properties of the stimulus. This allows for absolute color specification and enables scientifically calibrated color matching, but says nothing about how light of a given chromaticity and luminance will actually be perceived by a human subject).
The CIE 1931 RGB chromaticity diagram
This diagram is essentially a summary of the results of the color matching experiments conducted in the late 1920s.
In this diagram, the horizontal axis is lowercase ‘r’, which is a normalized version of uppercase R, such that r = R / (R + G + B), where R represents the quantity of light from the red primary. The vertical axis is ‘g’, a normalized version of G, such that g = G / (R + G + B), where G represents the quantity of light from the green primary. The value of b representing normalized blue light does not need to be graphically represented because it can always be inferred (b = 1 – r – g).
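The normalization r = R / (R + G + B) can be sketched in a few lines of Python. The function name and the sample quantities are made up for illustration; the point is only that scaling all three quantities together changes brightness but not chromaticity.

```python
def rg_chromaticity(R, G, B):
    """Normalize the quantities R, G, B to chromaticity ratios (r, g, b)."""
    total = R + G + B
    if total == 0:
        raise ValueError("chromaticity is undefined when R + G + B is zero")
    r = R / total
    g = G / total
    b = 1.0 - r - g  # inferred rather than stored: the three ratios sum to one
    return r, g, b


# Hypothetical sample: the same mix of primaries at two brightness levels.
dim = rg_chromaticity(2.0, 1.0, 1.0)
bright = rg_chromaticity(20.0, 10.0, 10.0)
print(dim)
print(bright)
```

Both calls give the same (r, g, b), which is why the diagram only needs two axes.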
The line defining the left-leaning, upside-down, U-shaped curve, termed the spectral locus, is a plot, in terms of r and g, of the levels of the three primary lights needed to match all of the monochromatic wavelengths of light within the visible spectrum of electromagnetic radiation from 380 nm to 700 nm. This line is simply the plot of the 1920s experiments. There are no wavelength numbers listed on the line closing the curve at the bottom of the diagram, for the good reason that although these purple lights can be mixed, none of them exist as monochromatic spectral lights (because no light of a single wavelength can tickle our cones in just the right way).
All of the points plotted within the spectral locus are chromaticities that can be achieved by mixing varying quantities of two or more monochromatic spectral lights and hence all the chromaticities which are distinguishable to the human eye. (Although not all definable chromaticities are perceptually distinguishable from each other). It is worth noting that there are an infinite number of wavelength combinations, each termed a spectral power distribution, that correspond to each chromaticity. The explanation is that an infinite number of spectral power distributions can stimulate the cones in the human retina to the same degree: each type of cone can be stimulated by different wavelengths of light, yet only the aggregate stimulus of that cone type impacts color perception. Differing spectral power distributions which map to the same chromaticity are known as metamers of each other. Same color, different physical basis.
Saturation is maximal on the spectral locus, where light of a given wavelength is unmixed with light of any other wavelength. Saturation decreases as the coordinate moves away from the spectral locus and other wavelengths of light are mixed in.
(This unpicks how the concept of saturation, which is concerned with the purity of spectral light, differs from the concept of luminance, which is concerned with the perceived brightness of a chromaticity and hence ultimately explained by the physical radiance of the light being emitted. This also helps explain how highly saturated blues can look much darker than highly saturated greens).
In the RGB chromaticity diagram above, the red primary is located at (1,0), the green primary at (0,1), and the blue primary at (0,0). Together they define a triangle, contained within which are all the chromaticities that can be defined with positive contributions of light from the three primaries. At the centre of this triangle at (0.33, 0.33) is the point at which levels of light from the three primaries are equivalent. This point is achromatic: white at maximum luminance levels, gray at lower luminance levels. It is also, and consequently, the point of minimal saturation.
But what to make of all the chromaticities with negative values of r, those inside the spectral locus but outside the RGB triangle? There seems to be a vast swathe of blues, greens, and turquoises that cannot be mixed from the three primary lights. First, notice the empirical and mathematical point that the gamut of all human perceivable colors does not form a triangular area. Consequently, there are no three primaries that can contain it. Not real ones, at any rate. In practice, negative levels of red light were achieved in the empirical color matching experiments by adding red light to the monochromatic spectral light that was the target of the color match, the practical and linear light equivalent of subtracting red light from the side of the three primary lights whose brightness was being adjusted in an attempt to match the monochromatic target.
In order to eliminate negative numbers from the chromaticity coordinate system, and achieve some other practical benefits, the RGB chromaticity diagram was mathematically transformed into a different two dimensional space, yielding the widely utilized xyY chromaticity model.
The CIE 1931 xyY chromaticity diagram
The math isn’t complicated: it’s simply a matrix transformation. But it’s probably easier to grok it on the RGB diagram above. The new horizontal axis is the ‘x’ axis between Cb and Cr. The new vertical axis is the ‘y’ axis between Cb and Cg. Cb becomes the new origin, the new (0,0), Cr the new (1,0), Cg the new (0,1). As you might have guessed by now, Cr, Cg, and Cb are the new primaries. They are, however, imaginary primaries, as they lie outside the spectral locus. They are not real chromaticities, and yet they work perfectly fine as mathematical constructs.
The transformed diagram with the new x and y axes is shown below. Now there are no negative numbers! This is still a tristimulus model. There is a z, but it is not graphically represented, because it can always be inferred (z = 1 – x – y).
One of the clever things about this transformation is that the Y value (capital ‘Y’) is equivalent to luminance. Which helps us to understand better what the x, y, and z values are. Like r, g, and b, they are normalized values, derived from X, Y, and Z. Definitionally, Y is luminance. Z happens to be very similar to B, so approximately representative of blue light. And X, X is just the mathematical stew of everything else left over.
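As a rough sketch of that matrix transformation, the snippet below uses the commonly published CIE 1931 RGB-to-XYZ coefficients. These numbers are an assumption brought in from outside this note, so check them against a colorimetry reference before relying on them. The helper names are made up for illustration.

```python
# Commonly published CIE 1931 RGB -> XYZ coefficients (assumed, not from this note).
SCALE = 1.0 / 0.17697
RGB_TO_XYZ = [
    [0.49000, 0.31000, 0.20000],
    [0.17697, 0.81240, 0.01063],
    [0.00000, 0.01000, 0.99000],
]


def cie_rgb_to_xyz(R, G, B):
    """Apply the 3x3 matrix to a CIE RGB triple and return (X, Y, Z)."""
    return tuple(SCALE * (row[0] * R + row[1] * G + row[2] * B) for row in RGB_TO_XYZ)


def xy_chromaticity(X, Y, Z):
    """Normalize XYZ to (x, y); z is implied as 1 - x - y."""
    total = X + Y + Z
    return X / total, Y / total


samples = [("red", (1, 0, 0)), ("green", (0, 1, 0)), ("blue", (0, 0, 1)), ("equal mix", (1, 1, 1))]
for name, rgb in samples:
    x, y = xy_chromaticity(*cie_rgb_to_xyz(*rgb))
    print(f"{name:9s} x={x:.4f} y={y:.4f}")
```

With these coefficients each of the three real primaries lands at non-negative x and y, and an equal mix of the three lands at x = y = 1/3.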
In this model, x and y plot chromaticity. If you add Y as a third dimension, you have chromaticity plus luminance in a 3D map. For any level of luminance, all chromaticities possible at that luminance are given by intersection with the xy plane. Neat! Hence the xyY color model and the xyY chromaticity diagram.
(Internal dialogue: if y is normalized Y, such that y = Y / (X + Y + Z), isn’t luminance already captured in the two-dimensional plot? Not in any useful way. In this normalization process luminance (Y) gets divided by the sum of luminance plus two mathematical placeholders for color information (X and Z) that are conceptually dissimilar to luminance. A normalized value for height doesn’t tell you much about height, if the two other concepts are weight and age. The normalized y can be thought to ‘contain’ luminance information only in the sense that it can be mathematically converted back to Y).
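To make that concrete, here is a minimal sketch of the round trip between XYZ and xyY. The function names and sample values are illustrative assumptions; the formulas follow directly from x = X / (X + Y + Z) and y = Y / (X + Y + Z).

```python
def xyz_to_xyY(X, Y, Z):
    """Split XYZ into chromaticity (x, y) plus luminance Y."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for X + Y + Z == 0")
    return X / total, Y / total, Y


def xyY_to_xyz(x, y, Y):
    """Rebuild XYZ from chromaticity (x, y) and luminance Y."""
    if y == 0:
        raise ValueError("cannot recover XYZ when y == 0")
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    return X, Y, Z


# Hypothetical stimuli: same chromaticity, second one at half the luminance.
full = xyz_to_xyY(0.2, 0.4, 0.4)
half = xyz_to_xyY(0.1, 0.2, 0.2)
print(full)
print(half)
print(xyY_to_xyz(*full))
```

The two stimuli share x and y and differ only in Y, so the lowercase y on its own says nothing about how bright either one is.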
The triangle superimposed on the CIE xyY diagram above is simply the sRGB color specification. The corners of the triangle specify the chromaticity coordinates of the sRGB primaries, and the corresponding triangle defines the sRGB gamut, which is the range of all possible colors that can be achieved by mixing those primaries.
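A gamut triangle is easy to test against in code. The sketch below assumes the usual published sRGB primary chromaticities and uses a standard same-side test for a point in a triangle; the function names are made up, and the out-of-gamut sample is only an approximate green near the spectral locus.

```python
# sRGB primaries as (x, y) chromaticities, as commonly published (assumed here).
SRGB_PRIMARIES = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]


def _cross(o, a, p):
    """Z component of the cross product of (a - o) and (p - o)."""
    return (a[0] - o[0]) * (p[1] - o[1]) - (a[1] - o[1]) * (p[0] - o[0])


def in_gamut(xy, primaries=SRGB_PRIMARIES):
    """True if xy lies inside or on the triangle formed by the primaries."""
    r, g, b = primaries
    signs = [_cross(r, g, xy), _cross(g, b, xy), _cross(b, r, xy)]
    # Inside means the point is on the same side of all three edges.
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


d65_white = (0.3127, 0.3290)      # the sRGB white point
saturated_green = (0.08, 0.83)    # approximate, near the locus around 520 nm
print(in_gamut(d65_white))
print(in_gamut(saturated_green))
```

The white point sits inside the triangle; the saturated green does not, which is the visual story the diagram tells.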
This illustrates the importance of the CIE 1931 xyY model. It has its kinks and foibles (especially, that it is not perceptually uniform) but every digital color specification is defined in relation to the CIE 1931 xyY model. It is the foundation of scientific color matching and the ground truth of digital color.
Every pixel on a modern screen contains a red, a green, and a blue light. The exact chromaticity of red, green, and blue emitted by those lights is important, because it determines the range of possible colors (chromaticities) that can be produced by the screen (its gamut).
The red LED always emits light of the same chromaticity. You can vary the quantity of light emitted, the physical radiance of the LED, and hence the luminance of that LED, resulting in darker or lighter reds, but these reds are all, definitionally, the same chromaticity.
(This is almost but not quite the same thing as saying that you can make the LED more or less bright by varying the current applied to it, but in doing so you cannot thereby change its spectral power distribution. Which is good, because that claim is false. Varying the current to an LED does sometimes affect its spectral power distribution, and in these cases this extra data about the LED is yet another thing that needs to be baked into the screen’s color transformation functions).
You can, of course, change the chromaticity of the pixel by mixing the light emitted by the three LEDs in different ratios. Bright red + bright green = bright yellow. And so on.
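Here is a small sketch of that mixing, assuming the three lights are the sRGB primaries and using the commonly published linear-sRGB-to-XYZ coefficients (an outside assumption, as are the function names). Drive levels are linear light, from 0 to 1.

```python
# Linear sRGB -> XYZ (D65) coefficients as commonly published (assumed here).
SRGB_TO_XYZ = [
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
]


def mix_to_xyY(r, g, b):
    """Mix linear light levels r, g, b and return (x, y, Y)."""
    X, Y, Z = (row[0] * r + row[1] * g + row[2] * b for row in SRGB_TO_XYZ)
    total = X + Y + Z
    if total == 0:
        raise ValueError("black has no chromaticity")
    return X / total, Y / total, Y


print("full red  ", mix_to_xyY(1.0, 0.0, 0.0))
print("half red  ", mix_to_xyY(0.5, 0.0, 0.0))
print("full green", mix_to_xyY(0.0, 1.0, 0.0))
print("yellow    ", mix_to_xyY(1.0, 1.0, 0.0))
```

Dimming the red light halves Y but leaves x and y alone. Adding green moves the chromaticity to a point between the two primaries. Note also how unequal the luminance coefficients in the middle row are: 0.7152 for green against 0.0722 for blue.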
Color space = ‘gamut’ + white point + transfer functions (‘gamma’)
A digital color space defines three things.
#1 The chromaticity of the three primary colors, red, green, and blue, and hence the range of all possible chromaticities that may be achieved by varying the ratios of light emitted from red, green and blue LEDs with those chromaticities. This range of possible chromaticities is typically known as the gamut. Some choices of primaries enable large ranges of possible chromaticities: these are known as wide gamuts. Other choices enable smaller ranges: narrower gamuts.
#2 The chromaticity of the white point. Typically, this is the chromaticity when equal and maximum levels of red, green and blue light are emitted, but white points can be specified to be warmer (redder) or colder (bluer) than this.
#3 A set of color transformation functions that enable chromaticity, luminance, and tone values to be mapped to linear values expressing the physical radiance of light emitted from three LEDs with spectral distributions matching the chromaticity of the three specified primaries. Chromaticity drives the proportional mix of emitted light. Tone (and luminance) the absolute quantity of radiance.
You don’t have to specify all three things together, but any specification that omits one of these three things should not be called a color space.
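One way to picture the three ingredients is as a single record. The sketch below is a hypothetical data structure, not any library's API, filled in with the commonly published sRGB values. Note that sRGB's transfer function is piecewise rather than a pure power law, though overall it approximates a gamma of 2.2.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Chromaticity = Tuple[float, float]


def srgb_encode(linear: float) -> float:
    """Linear light (0..1) to an sRGB-encoded value (0..1)."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055


def srgb_decode(encoded: float) -> float:
    """sRGB-encoded value (0..1) back to linear light (0..1)."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


@dataclass(frozen=True)
class ColorSpace:
    name: str
    red: Chromaticity     # 1: primaries, which fix the gamut
    green: Chromaticity
    blue: Chromaticity
    white: Chromaticity   # 2: white point
    encode: Callable[[float], float]  # 3: transfer functions
    decode: Callable[[float], float]


SRGB = ColorSpace(
    name="sRGB",
    red=(0.64, 0.33),
    green=(0.30, 0.60),
    blue=(0.15, 0.06),
    white=(0.3127, 0.3290),
    encode=srgb_encode,
    decode=srgb_decode,
)

print(SRGB.encode(0.5))
print(SRGB.decode(SRGB.encode(0.5)))
```

Leave out any one of the fields and the record no longer tells a display what light to emit.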
Gamma and gamma correction
We noted earlier that the human perception of luminance is non-linear. We perceive differences between darker shades more readily than between lighter shades. The relationship is often described as logarithmic, and this is broadly true of all human perception and sensation. We have much better powers of discrimination at lower levels of light, sound, touch, heat, pain, and so on than at higher levels.
In practice, the luminance response is usually modelled as a power law rather than a true logarithm. It has been captured with varying degrees of precision over the last hundred years, but we perceive tone (lightness and darkness) as approximately luminance to the power of 1 / 2.2, that is luminance to the power of roughly 0.45.
The reason this is interesting from the standpoint of color models and digital color spaces is that it enables information about perceived tone to be compressed and hence stored and computed more efficiently.
By encoding luminance levels with the same exponent as that found in human perception (1 / 2.2), the color model records logarithmically more information about darker tones than it does about lighter tones. The encodings of the color model can then easily be decoded by reversing the transfer function, so that tone values are expanded back to luminance values by raising them to the power 2.2 (and subsequently to physical radiance values that will be communicated to the LEDs of some display device).
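A minimal sketch of the idea, assuming a pure power-law transfer function with gamma 2.2 and 8-bit storage (real color spaces often use slightly different curves). It counts how many of the 256 available codes describe the darkest tenth of the luminance range. All names are illustrative.

```python
GAMMA = 2.2


def encode(luminance, gamma=GAMMA):
    """Luminance (0..1) to an encoded tone value (0..1)."""
    return luminance ** (1.0 / gamma)


def decode(tone, gamma=GAMMA):
    """Encoded tone value (0..1) back to luminance (0..1)."""
    return tone ** gamma


def codes_below(luminance, gamma=GAMMA, levels=256):
    """Count integer codes whose decoded luminance is at or below the limit."""
    top = levels - 1
    return sum(1 for code in range(levels) if decode(code / top, gamma) <= luminance)


dark_codes_gamma = codes_below(0.1)
dark_codes_linear = codes_below(0.1, gamma=1.0)
print(dark_codes_gamma, dark_codes_linear)
```

Under these assumptions, 90 of the 256 codes land in the darkest tenth of the luminance range, against 26 for a linear encoding.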
The value 2.2 is typically termed gamma in this context. Gamma values for compressing and expanding luminance values vary between color spaces, in the range 1.8-2.6. The most common gammas are 2.2, typically used for web-based content, 2.4, used for TV, and 2.6, used for theatrical cinema. The fundamental precept here is that perception of luminance is relational to the viewing environment, so that the lower the background luminance, the brighter the screen will appear to be. Cinema is designed to be viewed in near darkness, TV in subdued conditions with limited backlight, and web content in much brighter environments.
This offers two possible workflow practices.
The first is to edit and color grade content in lighting conditions the same as those in which it is designed to be viewed, so relatively bright environments for web-based content, then encode and decode with a constant gamma of 2.2, representative of human perception of luminance in everyday conditions. Analogously, you would edit TV content in more subdued conditions, with no direct sunlight, and encode and decode with a constant gamma of 2.4. Feels fairly common sense.
The second path is to choose a middling, subdued editing environment, encode with a gamma of 2.4, then decode with a gamma of 2.2 for web content and 2.6 for theatrical content. This has the effect of lightening images for web content and darkening content for cinema to stop it looking washed out. Bit weirder, but makes it clear that you need to lighten content for light environments and darken it for dark environments.
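The arithmetic behind that second path can be checked quickly. This sketch assumes pure power-law encoding and decoding, and the mid-grey value is a hypothetical sample.

```python
def end_to_end(luminance, encode_gamma, decode_gamma):
    """Encode with one gamma, decode with another, return displayed luminance."""
    tone = luminance ** (1.0 / encode_gamma)
    return tone ** decode_gamma


mid_grey = 0.18  # hypothetical scene luminance on a 0..1 scale
web = end_to_end(mid_grey, 2.4, 2.2)
tv = end_to_end(mid_grey, 2.4, 2.4)
cinema = end_to_end(mid_grey, 2.4, 2.6)
print(f"web={web:.4f} tv={tv:.4f} cinema={cinema:.4f}")
```

Matching gammas hand back the original value. Decoding with the lower gamma lifts the grey and decoding with the higher gamma pushes it down, which is the lighten-for-light, darken-for-dark behaviour described above.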
The key thing in all of this is to be intentional. If you forget to decode gamma-encoded content, apply gamma correction to content that has not been gamma-encoded, or apply the wrong gamma in decoding, the tonality of the image will look unrealistically off, and no amount of image post-processing and color grading will be able to rectify the mistake.
One area in which to be especially vigilant is precisely that of image manipulation: it is essential to manipulate content that has not been gamma encoded or subjected to any other transfer functions which might render luminance relations non-linear. Otherwise, the application of linear math to non-linear relationships results in color fringing, color casts, gradient artefacts and other screwy optical effects. If your content has been gamma-encoded, you need to make sure that any software you’re using decodes using the same gamma before it manipulates the imagery with math.
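To see why, take the simplest possible manipulation: a 50/50 blend of a black pixel and a white pixel. The sketch assumes a pure gamma of 2.2 and uses made-up function names.

```python
GAMMA = 2.2


def decode(tone, gamma=GAMMA):
    """Encoded value (0..1) to linear luminance (0..1)."""
    return tone ** gamma


def encode(luminance, gamma=GAMMA):
    """Linear luminance (0..1) to encoded value (0..1)."""
    return luminance ** (1.0 / gamma)


def blend_naive(a, b):
    """Average two encoded values directly (the mistake)."""
    return (a + b) / 2


def blend_linear(a, b, gamma=GAMMA):
    """Decode to linear light, average, then re-encode."""
    return encode((decode(a, gamma) + decode(b, gamma)) / 2, gamma)


black, white = 0.0, 1.0
naive = blend_naive(black, white)
correct = blend_linear(black, white)
print(f"naive={naive:.4f} linear-light={correct:.4f}")
print(f"luminance shown by the naive blend: {decode(naive):.4f}")
```

The naive blend stores 0.5, which decodes to only about 22% of white's luminance rather than the intended 50%. The linear-light blend stores roughly 0.73 instead.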
Most of the time, ‘linear’ in the context of digital color merely means linear gamma, which means a gamma value of 1.0, which effectively means that no gamma encoding has been applied, since raising everything to the power 1/1 leaves every number the same. It’s not a totally unhelpful use of the term: linear math needs linear gamma for the effects of light manipulation to be computed correctly.
- The Hitchhiker’s Guide to Digital Colour, mostly 2020-21, Troy something-or-other
- Gamma and Color FAQs, 1996, Charles Poynton
- A Beginner’s Guide to (CIE) Colorimetry, 2016, Chandler Abraham
- The CIE XYZ and xyY Color Spaces, 2010, Douglas A. Kerr
- Color Rendering of Spectra, 1996, John Walker
- CIE Color Space, retrieved 2021, Wikipedia
- Understanding Gamma Correction, retrieved 2021, Sean McHugh
- What Every Coder Should Know About Gamma, 2016, John Novak
- Why Do We Perceive Logarithmically?, 2013, Lav R. Varshney and John Z. Sun