Color composition is critical in visual applications in art, design, and visualization. Over the centuries, different theories about how colors interact with each other have been proposed [Westland et al., 2007]. While it is arguable whether a universal and comprehensive color theory will ever exist, most previous proposals share in common the use of a color wheel (with hue parameterized by angle) to explain pleasing color combinations in geometric terms. In the digital world, the color wheel often serves as a user interface to visualize and manipulate colors. This has been explored in the literature for specific applications in design [Adobe, 2018] and image editing [Cohen-Or et al., 2006].
In this paper, we embrace color wheels to present a new framework in which color composition concepts are easy and intuitive to formulate, solve for, visualize, and interact with, for applications in art, design, or visualization. Our approach is based on palettes and relies on palette-based image decompositions. To fully realize it as a powerful image editing tool, we introduce an extremely efficient yet simple new image decomposition algorithm.
We define our color relationships in the CIE LCh color space (the cylindrical projection of CIE Lab). Contrary to previous work using HSV color wheels, the LCh color space ensures that perceptual effects are accounted for with no additional processing. For example, a simple global rotation of hue in LCh-space (but not HSV-space) preserves the perceived lightness or gradients in color themes and images.
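The lightness-preserving property of a global hue rotation can be illustrated with a minimal sketch. It assumes colors are already given in CIE Lab; the function names (`lab_to_lch`, `lch_to_lab`, `rotate_hue`) and the sample colors are hypothetical and chosen only for illustration.

```python
import math

def lab_to_lch(L, a, b):
    """CIE Lab -> LCh: chroma is the radius, hue the angle in degrees."""
    C = math.hypot(a, b)
    h = math.degrees(math.atan2(b, a)) % 360.0
    return L, C, h

def lch_to_lab(L, C, h):
    """LCh -> CIE Lab (inverse of lab_to_lch)."""
    rad = math.radians(h)
    return L, C * math.cos(rad), C * math.sin(rad)

def rotate_hue(lab_colors, degrees):
    """Rotate every color's hue by the same angle; L and C are left untouched."""
    out = []
    for L, a, b in lab_colors:
        L, C, h = lab_to_lch(L, a, b)
        out.append(lch_to_lab(L, C, (h + degrees) % 360.0))
    return out

# Made-up color theme in Lab coordinates.
theme = [(60.0, 40.0, 10.0), (35.0, -20.0, 30.0), (80.0, 5.0, -45.0)]
rotated = rotate_hue(theme, 90.0)
```

Each rotated color keeps its original lightness and chroma, and only its hue angle changes.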
To represent color information, we adopt the powerful palette-oriented point of view [Mellado et al., 2017] and propose to work with color palettes of arbitrary numbers of swatches. Unlike hue histograms, color palettes or swatches can come from a larger variety of sources (extracted from images, directly from user input, or from generative algorithms) and capture the 3D nature of LCh in a compact way. They provide intuitive interfaces and visualizations as well.
Color palettes also simplify the modelling and formulation of relationships between colors. This last point enables the simplification of harmonic templates and other relationships into a set of a few 3D axes that capture color structure in a meaningful and compact way. This is useful for various color-aware tasks. We demonstrate applications to color harmonization and color transfer. Instead of using the sector-based templates from Matsuda [Tokumaru et al., 2002] (appropriate for hue histograms) we derive our harmonic templates from classical color theory [Itten, 1970; Birren and Cleland, 1969] (see Figures 2 and 14). We also propose new color operations using this axes-based representation. Our proposed framework can be used by other palette-based systems and workflows, either for palette improvement or image editing.
At the core of our and other recent approaches [Chang et al., 2015; Tan et al., 2016; Aksoy et al., 2017; Zhang et al., 2017] to image editing, images are decomposed into a palette and associated per-pixel compositing or mixing parameters. We propose a new, extremely efficient yet simple and robust algorithm to do so. Our approach is inspired by the geometric palette extraction technique of Tan et al. [2016]. We consider the geometry of 5D RGBXY-space, which captures color as well as spatial relationships and eliminates numerical optimization. After an initial palette is extracted (given an RMSE reconstruction threshold), the user can edit the palette and obtain new decompositions instantaneously. Our algorithm is extremely efficient even for very high resolution images (100 megapixels), 20x faster than the state of the art [Aksoy et al., 2017]. Working code is provided in Section 3. Our algorithm is a key contribution which enables our approach and many other applications proposed in the literature.
In summary, this paper makes the following contributions:
A new palette-based color harmonization framework, general enough to model classical harmonic relationships, new color composition operations, and a compact structure for other color-aware applications, also applicable to video.
An extremely efficient, geometric approach for decomposing an image into spatially coherent additive mixing layers by analyzing the geometry of an image in RGBXY-space. Its performance is virtually independent from the size of the image or palette. Our decomposition can be re-computed instantaneously for a new RGB palette, allowing designers to edit the decomposition in real-time.
Three large-scale, wide-ranging perceptual studies on the perception of harmonic colors and our algorithm.
We demonstrate other applications like color transfer, greatly simplified by our framework.
2. Related Work
Many works are related to our contributions and their applications. In the following we cover the most relevant ones.
2.1. Color Harmonization
Many existing works have applied concepts from traditional color theory for artists to improve the color composition of digital images. In their seminal paper, Cohen-Or et al. [2006] use hue histograms and harmonic templates, defined as sectors of hue-saturation in HSV color space [Tokumaru et al., 2002], to model and manipulate color relationships. They fit a template (optimal or arbitrary) over the image histogram, so they can shift hues accordingly to harmonize colors or composites from several sources. Additional processing is needed to ensure spatial smoothness. Several authors have built on this work, extending or improving parts of the framework. Sawant and Mitra extended it to video, focusing on temporal coherence between successive frames. Improvements to the original fitting have been proposed based on the number of pixels for each HSV value [Huo and Tan, 2009], the visual saliency [Baveye et al., 2013], the extension and visual weight of each color [Baveye et al., 2013], or geodesic distances [Li et al., 2015]. Tang et al. reduce some artifacts in the recoloring of Cohen-Or et al. [2006]. Chamaret et al. define and visualize a per-pixel harmony measure to guide interactive user edits.
Instead of using hue histograms from images, our framework is built on top of color palettes, independently of their source. Given the higher level of abstraction of palettes, we simplify harmonic templates to arrangements of axes in chroma-hue space (from LCh), interpreted and derived directly from classical color theory [Itten, 1970; Birren and Cleland, 1969]. This simpler and more general representation yields more intuitive metrics that are easier to solve and enable a wider range of applications. When working with images, this approach fits naturally with our proposed palette extraction and image decomposition, giving very efficient and robust image recoloring. Related to our approach, Mellado et al. [2017] are also able to pose harmonization as a set of constraints within their general constrained optimization framework. Our new templates, posed in LCh space, could be added as additional constraints.
2.2. Palette Extraction and Image Decomposition
A straightforward approach consists of using a k-means method to cluster the existing colors in an image in RGB space [Chang et al., 2015; Phan et al., 2017; Zhang et al., 2017]. A different approach consists of computing and simplifying the convex hull enclosing all the color samples [Tan et al., 2016], which provides more general palettes that better represent the existing color gamut of the image. A similar observation was made in the domain of hyperspectral image unmixing [Craig, 1994]. (With hyperspectral images, palette sizes are smaller than the number of channels, so the problem is one of fitting a minimum-volume simplex around the colors. The vertices of a high-dimensional simplex become a convex hull when the data is projected to lower dimensions.) Morse et al. work in HSL space, using a histogram to find the dominant hues and then the shades and tints within them. Human perception has also been taken into account in other works, which train regression models on crowd-sourced datasets [O’Donovan et al., 2011; Lin and Hanrahan, 2013]. Some physically-based approaches try to extract wavelength-dependent parameters to model the original pigments used in paintings [Tan et al., 2017; Aharoni-Mack et al., 2017]. Our work builds on top of Tan et al. [2016], adding a fixed reconstruction error threshold for automatic extraction of palettes of optimal size, as described in Section 3.1.
For recoloring applications, it is also critical to find a mapping between the extracted color palette and the image pixels. Recent work is able to decompose the input image into separate layers according to a palette. Tan et al. [2016] extract a set of ordered translucent RGBA layers, based on an optimization over the standard alpha blending model. Order-independent decompositions can be achieved using additive color mixing models [Aksoy et al., 2017; Lin et al., 2017a; Zhang et al., 2017]. For the physically-based palette extraction methods mentioned previously [Tan et al., 2017; Aharoni-Mack et al., 2017], layers correspond to the extracted multi-spectral pigments. We prefer a full decomposition to a (palette-based) edit transfer approach like that of Chang et al. [2015]. With a full decomposition, edits are trivial to apply and spatial edits become possible (though we do not explore spatial edits in this work). We present a new, efficient method for layer decomposition, based on the additive color mixing model (Section 3.2). Our approach leverages 5D RGBXY-space geometry to enforce spatial smoothness on the layers. This geometric approach is significantly more efficient than previous approaches in the literature, easily handling images up to 100 megapixels in size.
2.3. Color Transfer
We also explore color transfer as an application of our work. Color transfer is a vast field with contributions from the vision and graphics communities. As such, we describe only the work most closely related to our approach. Hou et al. conceptualize and apply color themes as hue histograms in HSV space. Wang et al. solve an optimization that simultaneously considers a desired color theme, texture-color relationships, and automatic or user-specified color constraints. Phan et al. [2017] explored the order of colors within palettes to establish correspondences and enable interpolation. Nguyen et al. find a group color theme from multiple palettes from multiple images using a modified k-means clustering method, and use it to recolor all the images in a consistent way. Han et al. [2013; 2017] compute a distance metric between palettes in the color mood space, and then sort and match colors from palettes according to their brightness. Munshi et al. match colors between palettes according to their distance in Lab space. Based on our harmonic templates, palettes, and the LCh color space, we propose several intuitive metrics for color transfer that take into account human perception for goals like colorfulness, preservation of original colors, or harmonic composition. The final image recoloring is performed using our layer decomposition.
3. Palette extraction and image decomposition
A good palette for image editing is one that closely captures the underlying colors the image was made with (or could have been made with), even if those colors do not appear in their purest form in the image itself. Tan et al. [2016] observed that the color distributions from paintings and natural images take on a convex shape in RGB space. As a result, they proposed to compute the convex hull of the pixel colors. The convex hull tightly wraps the observed colors. Its vertex colors can be blended with convex weights (non-negative and summing to one) to obtain any color in the image. The convex hull may be overly complex, so they propose an iterative simplification scheme to a user-desired palette size. After simplification, the vertices become a palette that represents the colors in the image.
We extend the work of Tan et al. [2016] in two ways. First, we propose a simple, geometric layer decomposition method that is orders of magnitude more efficient than the state of the art. Working code for our entire decomposition algorithm can be written in under 50 lines (Figure 4). Second, we propose a simple scheme for automatic palette size selection.
3.1. Palette Extraction
In Tan et al. [2016], the convex hull of all pixel colors is computed and then simplified to a user-chosen palette size. To summarize their approach, the convex hull is simplified greedily as a sequence of constrained edge collapses [Garland and Heckbert, 1997]. An edge is collapsed to a point constrained to strictly add volume [Sander et al., 2000] while minimizing the distance to its incident faces. The edge whose collapse adds the least overall volume is chosen next, greedily. After each edge is collapsed, the convex hull is recomputed, since the new vertex could indirectly cause other vertices to become concave (and therefore redundant). Finally, simplification may result in out-of-gamut colors, or points that lie outside the RGB cube. As a final step, Tan et al. [2016] project all such points to the closest point on the RGB cube. This is the source of reconstruction error in their approach; some pixels now lie outside the simplified convex hull and cannot be reconstructed.
We improve upon this procedure with the observation that the reconstruction error can be measured geometrically, even before layer decomposition, as the RMSE of every pixel’s distance to the simplified convex hull. (Inside pixels naturally have distance 0.) Therefore, we propose a simple automatic palette size selection based on a user-provided RMSE reconstruction error tolerance ( in our experiments). For efficiency, we divide RGB-space into bins (a total of bins). We measure the distance from each non-empty bin to the simplified convex hull, weighted by the bin count. We start measuring the reconstruction error once the number of vertices has been simplified to 10. By doing this, we are able to obtain palettes with an optimal number of colors automatically. This removes the need for the user to choose the palette size manually, leading to better layer decompositions.
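The error measure and stopping rule can be sketched as follows. This is a minimal illustration, not our implementation: it assumes the distance from each non-empty bin to the simplified hull has already been computed elsewhere, and the function names, the sample numbers, and the tolerance of 0.008 are hypothetical values chosen only for the example.

```python
import math

def weighted_rmse(distances, counts):
    """RMSE of per-bin distances to the hull, each bin weighted by its pixel count."""
    total = sum(counts)
    if total == 0:
        return 0.0
    squared = sum(c * d * d for d, c in zip(distances, counts))
    return math.sqrt(squared / total)

def pick_palette_size(errors_by_size, tolerance):
    """Smallest palette size whose RMSE stays within the tolerance.

    errors_by_size maps a vertex count to the weighted RMSE measured after
    simplifying the hull to that many vertices (hypothetical input). If no
    size satisfies the tolerance, the largest measured size is returned.
    """
    ok = [size for size, err in errors_by_size.items() if err <= tolerance]
    return min(ok) if ok else max(errors_by_size)

# Four bins: most pixels lie inside the hull (distance 0), a few lie outside.
error = weighted_rmse([0.0, 0.0, 0.02, 0.1], [900, 50, 40, 10])

# Made-up errors measured while simplifying from 10 vertices downwards.
errors = {10: 0.001, 9: 0.002, 8: 0.004, 7: 0.007, 6: 0.015}
size = pick_palette_size(errors, tolerance=0.008)
```

With these made-up numbers the selected size is 7, the last simplification step before the error exceeds the tolerance.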
(If non-constant palette colors were acceptable, instead of clipping one could cast a ray from each pixel towards the out-of-gamut vertex; the intersection of the ray with the RGB cube would be the palette color for that pixel. There would be zero reconstruction error. The stopping criteria could be the non-uniformity of a palette color, measured by the area of the RGB cube surface intersected with the simplified convex hull itself.)
3.2. Image decomposition via RGBXY convex hull
From their extracted palettes, Tan et al. [2016] solved a non-linear optimization problem to decompose an image into a set of ordered, translucent RGBA layers suitable for the standard “over” compositing operation. While this decomposition is widely applicable (owing to the ubiquity of “over” compositing), the optimization is quite lengthy due to the recursive nature of the compositing operation, which manifests as a polynomial whose degree is the palette size. Others have instead opted for additive mixing layers [Aksoy et al., 2017; Lin et al., 2017a; Zhang et al., 2017] due to their simplicity. A pixel’s color is a weighted sum of the palette colors.
In this work, we adopt linear mixing layers as well. We provide a fast and simple, yet spatially coherent, geometric construction.
Any point $p$ inside a simplex (a triangle in 2D, a tetrahedron in 3D, etc.) has a unique set of barycentric coordinates, or convex additive mixing weights $w_i$ such that $p = \sum_i w_i c_i$, where the mixing weights $w_i$ are non-negative and sum to one, and $c_i$ are the vertices of the simplex. In our setting, the simplified convex hull is typically not a simplex, because the palette has more than 4 colors. There still exist convex weights for arbitrary polyhedra, known as generalized barycentric coordinates [Floater, 2015], but they are typically non-unique. A straightforward technique to find generalized barycentric coordinates is to first compute a tessellation of the polyhedron (in our case, the simplified convex hull) into a collection of non-overlapping simplices (tetrahedra in 3D). For example, the Delaunay generalized barycentric coordinates for a point can be computed by performing a Delaunay tessellation of the polyhedron. The barycentric coordinates of whichever simplex the point falls inside of are the generalized barycentric coordinates. For a 3D point in general position in the interior, the mixing weights will have at most 4 non-zero weights, which corresponds to the number of vertices of a tetrahedron.
This is the approach taken by Tan et al. [2016] for their as-sparse-as-possible (ASAP) technique to extract layers. Because Tan et al. [2016] considered recursive “over” compositing, users provided a layer or vertex order; they tessellated the simplified convex hull by connecting all its (triangular) faces to the first vertex, which corresponds to the background color. This simple star tessellation is valid for any convex polyhedron. In the additive mixing scenario, no order is provided; we discuss the choice of tessellation below. Because the weights are assigned purely based on the pixel’s colors, however, this approach predictably suffers from spatial coherence artifacts (Figure 7). The colors of spatially neighboring pixels may belong to different tetrahedra. As a result, ASAP layers produce speckling artifacts during operations like recoloring (Figure 7).
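As a concrete illustration of barycentric coordinates in a single tetrahedron, the following minimal sketch solves the 3x3 linear system with Cramer's rule. The helper names (`det3`, `sub`, `barycentric`) and the example tetrahedron are hypothetical; a practical implementation would rely on a tessellation library rather than this hand-written solver.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def sub(p, q):
    """Component-wise difference p - q."""
    return tuple(x - y for x, y in zip(p, q))

def barycentric(p, tet):
    """Weights (w0, w1, w2, w3) with p = sum(w_i * tet[i]) and sum(w_i) = 1."""
    v0, v1, v2, v3 = tet
    e1, e2, e3, r = sub(v1, v0), sub(v2, v0), sub(v3, v0), sub(p, v0)
    # Solve w1*e1 + w2*e2 + w3*e3 = r with Cramer's rule. A determinant is
    # unchanged by transposition, so the edge vectors can be passed as rows.
    d = det3((e1, e2, e3))
    if d == 0:
        raise ValueError("degenerate tetrahedron")
    w1 = det3((r, e2, e3)) / d
    w2 = det3((e1, r, e3)) / d
    w3 = det3((e1, e2, r)) / d
    return (1.0 - w1 - w2 - w3, w1, w2, w3)

# Tetrahedron spanned by black, red, green, and blue in RGB space.
tet = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
weights = barycentric((0.2, 0.3, 0.1), tet)
inside = all(w >= 0.0 for w in weights)
```

A point lies inside the tetrahedron exactly when all four weights are non-negative, which is the sign check used to locate the containing simplex.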
To provide spatial coherence, our key insight is to extend this approach to 5D RGBXY-space, where XY are the coordinates of a pixel in image space, so that spatial relationships are considered along with color in a unified way (Figure 3). We first compute the convex hull of the image in RGBXY-space. We then compute Delaunay generalized barycentric coordinates (weights) for every pixel in the image in terms of the 5D convex hull. Pixels that have similar colors or are spatially adjacent will end up with similar weights, meaning that our layers will be smooth both in RGB and XY-space. These mixing weights form an $N \times M$ matrix $W_{RGBXY}$, where $N$ is the number of image pixels and $M$ is the number of RGBXY convex hull vertices. We also compute $W_{RGB}$, the Delaunay barycentric coordinates (weights) for the RGBXY convex hull vertices in the 3D simplified convex hull. We use the RGB portion of each RGBXY convex hull vertex, which always lies inside the RGB convex hull. Due to the aforementioned out-of-gamut projection step when computing the simplified RGB convex hull, however, an RGBXY convex hull vertex may occasionally fall outside it. We set its weights to those of the closest point on the 3D simplified convex hull. $W_{RGB}$ is an $M \times P$ matrix, where $P$ is the number of vertices of the simplified RGB convex hull (the palette colors).
The final weights $W$ for the image are obtained via matrix multiplication: $W = W_{RGBXY} W_{RGB}$, which is an $N \times P$ matrix that assigns each pixel weights solely in terms of the simplified RGB convex hull. These weights are smooth both in color and image space. To decompose an image with a different RGB-palette, one only needs to recompute $W_{RGB}$ and then perform matrix multiplication. Computing $W_{RGB}$ is extremely efficient, since it depends only on the palette size and the number of RGBXY convex hull vertices. It is independent of the image size and allows users to experiment with image decompositions based on interactive palette editing (Figure 10 and the supplemental materials).
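The two-stage structure can be illustrated with a toy example. In this sketch the names `W_rgbxy` (pixels by RGBXY hull vertices) and `W_rgb` (RGBXY hull vertices by palette colors) are hypothetical, the numbers are made up, and the plain-Python `matmul` stands in for the sparse matrix product a practical implementation would presumably use.

```python
def matmul(A, B):
    """Plain-Python product of two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# 2 pixels expressed over 3 RGBXY hull vertices (each row sums to one).
W_rgbxy = [[0.5, 0.5, 0.0],
           [0.0, 0.25, 0.75]]

# 3 hull vertices expressed over a 2-color palette (each row sums to one).
W_rgb = [[1.0, 0.0],
         [0.5, 0.5],
         [0.0, 1.0]]

# Per-pixel weights over the palette: 2 pixels x 2 palette colors.
W = matmul(W_rgbxy, W_rgb)

# Editing the palette only changes the small second matrix;
# the per-pixel matrix is reused unchanged.
W_rgb_edited = [[0.8, 0.2],
                [0.5, 0.5],
                [0.2, 0.8]]
W_edited = matmul(W_rgbxy, W_rgb_edited)
```

Because both factors have rows that sum to one, every row of the product also sums to one, so each pixel remains a convex combination of palette colors after a palette edit.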
At first glance, any tessellation of 3D RGB-space has approximately the same weight sparsity (4 non-zeros). In practice, the “line of greys” between black and white is critically important. Any pixel near the line of greys can be expressed as the weighted combination of vertices in a number of ways (e.g. via any tessellation). It is perceptually important that the line of greys be 2-sparse in terms of an approximately black and white color, and that nearby colors be nearly 2-sparse. If not, then grey pixels would be represented as mixtures of complementary colors; any change to the palette that didn’t preserve the complementarity relationships would turn grey pixels colorful (Figure 7). This tinting is perceptually prominent and undesirable. (For pathological images containing continuous gradients between complementary colors, this tinting behavior would perhaps be desired.)
To make the line of greys 2-sparse in this way, the tessellation should ensure that an edge is created between the darkest and lightest color in the palette. Such an edge is typically among the longest possible edges through the interior of the polyhedron, as the luminance in an image often varies more than chroma or hue. As a result, the Delaunay tessellation frequently excludes the most desirable edge through the line of greys. We propose to use a star tessellation. If either a black or white palette color is chosen as the star vertex, it will form an edge with the other. We choose the darkest color in the palette as the star vertex. This strategy is simple and robust and extends naturally to premultiplied alpha RGBA images.
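A star tessellation is straightforward to construct from the hull faces. The sketch below is illustrative only: it assumes the hull is given as triangles of vertex indices, and it measures darkness with Rec. 709 luma, which is an assumption made for the example rather than a statement of how the darkest color is selected in our implementation. The function names are hypothetical.

```python
def luma(rgb):
    """Rec. 709 luma, used here as a stand-in measure of darkness (an assumption)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def star_tessellation(vertices, faces):
    """Tetrahedra joining the darkest vertex to every hull face not touching it.

    vertices: list of RGB tuples; faces: hull triangles as index triples.
    """
    star = min(range(len(vertices)), key=lambda i: luma(vertices[i]))
    tets = [(star,) + tuple(face) for face in faces if star not in face]
    return star, tets

# A bipyramid palette: black and white apexes over a red/green/blue triangle.
vertices = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0),
            (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 3), (0, 3, 4), (0, 4, 2),
         (1, 2, 3), (1, 3, 4), (1, 4, 2)]
star, tets = star_tessellation(vertices, faces)
```

In this example every tetrahedron contains both the black and the white vertex, so the edge along the line of greys is part of the tessellation.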
We also experimented with a variety of strategies to choose the tessellation such that the resulting layer decomposition is as sparse as possible: RANSAC line fitting, PCA on the RGB point cloud, and finding the longest edge. We evaluated the decompositions with several sparsity metrics ([Tan et al., 2016; Aksoy et al., 2017; Levin et al., 2008], as well as the fraction of pixels with transparency above a threshold). Ultimately, tinting was more perceptually salient than changes in sparsity, and our proposed tessellation algorithm is simple and robust.
The primary means to assess the quality of layers is to apply them for some purpose, such as recoloring, and then identify artifacts, such as noise, discontinuities, or surprisingly affected regions. Figure 6 compares recolorings created with our layers to those from Aksoy et al. [2017], Tan et al. [2016], and Chang et al. [2015]. Our approach generates recolorings without discontinuities (the sky in (b), second row), undesirable changes (the top of the chair in (c), third row), or noise.
We have no explicit guarantees about the sparsity of our weights. $W_{RGB}$, the weights of the RGBXY convex hull vertices in terms of the palette, is as sparse as possible to reconstruct 3D colors (4 non-zeros per vertex). $W_{RGBXY}$, the weights of the pixels in terms of the RGBXY convex hull vertices, has 6 non-zeros per pixel among the (typically) 2000–5000 RGBXY convex hull vertices, which is also as sparse as possible to recover a point in RGBXY-space. The sparsity of the product of the two matrices depends on which 3D tetrahedra the 6 RGBXY convex hull vertices fall into. Nevertheless, it can be seen that our results’ sparsity is almost as good as that of Tan et al. [2016].
Figure 5 shows a direct comparison between our additive mixing layers and those of Aksoy et al. [2017]. In contrast with our approach, the approach of Aksoy et al. [2017] has trouble separating colors that appear primarily in mixture. As a result, it sometimes creates an overabundance of layers, which makes recoloring tedious, since multiple layers must be edited.
Our decomposition algorithm is able to reproduce input images virtually indistinguishably from the originals. For the 100 images in Figure 8, our RGBXY method’s RGB-space RMSE is typically . Aksoy et al. [2017]’s algorithm reconstructs images with zero error, since their palettes are color distributions rather than fixed colors.
We evaluate our RGB tessellation in Figure 7. In this experiment, we generate a random recoloring by permuting the colors in the palette. The RGB-space star triangulation approach is akin to Tan et al. [2016]’s ASAP approach with the black color chosen to be the first layer. The lack of spatial smoothness is apparent when comparing the RGB-only decompositions in the top row with the RGBXY decompositions in the bottom row. The decompositions using the Delaunay generalized barycentric coordinates (left column) result in undesirable tinting for colors near the line of greys. Additional examples can be found in the supplemental materials.
Throughout the remainder of the paper, all our results rely on our proposed layer decomposition.
In Figure 8, we compare the running time of additive mixing layer decomposition techniques. We ran our proposed RGBXY approach on 100 images under 6 megapixels with an average palette size of 6.95 and median palette size of 7. Computation time for our approaches includes palette selection (RGB convex hull simplification). Because of its scalability, we also ran our proposed RGBXY approach on an additional 70 large images between 6 and 12 megapixels, and an additional 6 extremely large images containing 100 megapixels (not shown in the plot). The 100 megapixel images took on average 12.6 minutes to compute. Peak memory usage was 15 GB. For further improvement, our approach could be parallelized by dividing the image into tiles, since the convex hull of a set of convex hulls is the same as the convex hull of the underlying data. A working implementation of the RGBXY decomposition method can be found in Figure 4 (48 lines of code). The “Layer Updating” step is nearly instantaneous: re-computing the layer decomposition for a new palette takes from a few milliseconds up to a few tens of milliseconds for 10 MP images.
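The tiling property can be checked on a small example. The sketch below uses 2D points and Andrew's monotone chain algorithm as a stand-in for the 5D RGBXY hull, which is a simplification made purely for illustration; the function names and the synthetic point set are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_by_tiles(points, tile_size):
    """Hull each tile separately, then hull the union of the tile hulls."""
    candidates = []
    for start in range(0, len(points), tile_size):
        candidates.extend(convex_hull(points[start:start + tile_size]))
    return convex_hull(candidates)

# Deterministic synthetic integer points, so all arithmetic is exact.
points = [((i * 37) % 101, (i * 53) % 89) for i in range(200)]
tiled = hull_by_tiles(points, 25)
direct = convex_hull(points)
```

Each tile's hull could be computed independently, for example on separate cores, before the final and much smaller hull computation over the collected tile vertices.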
Our running times were generated on a 2015 13” MacBook Pro with a 2.9 GHz Intel Core i5-5257U CPU and 16 GB of RAM. Our layer decomposition approach was written in non-parallelized Python using NumPy/SciPy and their wrapper for the QHull convex hull and Delaunay tessellation library [Barber et al., 1996]. Our layer updating was written in OpenCL.
To our knowledge, the algorithm of Aksoy et al. [2017] is the fastest previous work. The performance data for their algorithm is as reported in their paper and appears to scale linearly in the number of pixels. Their algorithm was implemented in parallelized C++. Aksoy et al. [2017] reported that their approach took 4 hours and 25 GB of memory to decompose a 100 megapixel image. The sole performance data point for Zhang et al. [2017] is also as reported in their paper.
We also compare our approach to a variant of Tan et al. [2016]’s optimization. We modified their reconstruction term to the simpler, quadratic one that matches our additive mixing layer decomposition scenario. With that modification, all energy terms become quadratic. However, because the sparsity term is not positive definite, it is not a straightforward Quadratic Programming problem; we minimize it with L-BFGS-B and increase the solver’s default termination thresholds, since RGB colors have low precision (gradient and function tolerance ). This approach was also written in Python using NumPy/SciPy. The performance of the modified Tan et al. [2016] approach is somewhat unpredictable, perhaps owing to the varying palette sizes.
The fast performance of our approach is due to the fact that the number of RGBXY convex hull vertices is virtually independent of the image size and entirely independent of the palette size. Finding the simplex that contains a point is extremely efficient (a matrix multiply followed by a sign check) and scales well. Our algorithm’s performance is more correlated with the number of RGBXY convex hull vertices and tessellated simplices. This explains the three red dots somewhat above the others in the performance plot.
In contrast, optimization-based approaches typically have parameters to tune, such as the balance between terms in the objective function, iteration step size, and termination parameters.
Interactive Layer Decompositions
4. Color Harmonization
In the following we describe our palette-based approach to color harmonization and color composition. Our work is inspired by the same concepts and goals as related previous work [Cohen-Or et al., 2006]. However, we also aim for a simpler and more compact representation that can express additional operations and be applied directly to palettes. First, we explain how we fit and enforce classical harmonic templates. Next, we describe how our framework can be used for other color composition operations.
4.1. Template fitting
Figure 2 shows our new axis-based templates compared to the sector-based ones from Tokumaru et al. [2002]. For our results in this paper we use seven templates $T_m$, $m = 1, \dots, 7$. A template is defined by its axes $T_{m,j}(\alpha)$, where $j$ is the index of each axis (the total number of axes varies between templates), and $\alpha$ is an angle of rotation in hue. While our templates are valid in any circular (or cylindrical) color space (e.g. HSV), we apply them in LCh-space (Lightness, Chroma, and hue) to match human perception.
Given an image $I$ and its extracted color palette $P$, we seek to find the $T_m(\alpha)$ that is closest to the colors in $P$ in the Ch plane. For that, we find the closest axis to each color, and solve for the global rotation and additional angles that define the template. We define the distance between a palette and a template as:
$$D(P, T_m(\alpha)) = \sum_{i=1}^{|P|} W(P_i)\, L(P_i)\, C(P_i)\, \left| H(P_i) - T_{m,j(i)}(\alpha) \right| \qquad (1)$$
where $T_{m,j(i)}(\alpha)$ is the axis of template $T_m(\alpha)$ that is closest to palette color $P_i$, and $|\cdot|$ measures the difference in hue angle. Note that for the analogous template, any palette color inside that arc area will be zero distance to the template. $W(P_i)$ is the contribution of color $P_i$ to all the pixels in the image, computed as the sum of all the weights for layer $i$ and normalized by the total number of pixels in the image. $W(P_i)$ promotes the template to be better aligned with the relevant colors of the image. When using color palettes that do not come from images, $W(P_i)$ is the same for each color and can be discarded. The lightness $L(P_i)$ and chroma $C(P_i)$ of the color are also used as weights so that we measure the arc distance around the color wheel (the angular change scaled by radius). The darker the color or the less saturated, the smaller the perceived change per hue degree.
Since the search space is a finite range in 1D, we use a brute-force search to find the optimal global rotation angle fitting a template $T_m$ to a palette $P$:
$$\alpha^*_m = \arg\min_{\alpha} D(P, T_m(\alpha)) \qquad (2)$$
Monochrome, complementary, triad and square templates have only one degree of freedom, so we search the global rotation every 1 degree in $[0^\circ, 360^\circ)$. For analogous, single split and double split we allow an additional degree of freedom $\beta$ (the angle between axes), which we search over a limited range of degrees. In this case, $(\alpha^*_m, \beta^*_m) = \arg\min_{\alpha, \beta} D(P, T_m(\alpha, \beta))$, with $\alpha^*_m$ being the optimal global rotation and $\beta^*_m$ the optimal angle between axes. Given that palettes are typically small (less than 10 colors), our brute force search is very fast (less than a second).
Once a template is fit, we harmonize the input image by using $T_m(\alpha^*_m)$ to move the colors in $P$ closer to the axis assignment that minimizes Equation 1. We leverage the image decomposition to recolor the image. Because we use a spatially coherent image decomposition, no additional work is needed to prevent discontinuous recoloring as in Cohen-Or et al. [2006]. Figure 11 shows different harmonic templates enforced over the same input image. Additional examples can be found in the supplementary material. Users can control the strength of harmonization via an interpolation parameter $\lambda \in [0, 1]$, where $\lambda = 0$ leaves the palette unchanged and $\lambda = 1$ fully rotates each palette color to lie on its matched axis (Figure 12). In the LCh color space, this affects hue alone.
Depending on the colors in $P$, some templates are a better fit than others as measured by Equation 1. We can determine the optimal template automatically:
$$T^* = \arg\min_{m} D(P, T_m(\alpha^*_m)) \qquad (3)$$
Depending on the palette size or its distribution, some axes may end up without any color assigned to them. We deem those cases not compliant with the intended balance of the harmonic template and remove them from this automatic selection.
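The fitting, selection, and harmonization steps can be sketched together for the single-degree-of-freedom templates. This is a minimal illustration under stated assumptions: the axis offsets in `TEMPLATES` follow the classical definitions (axes 180, 120, and 90 degrees apart), palette colors are (L, C, h) tuples with hue in degrees, the weights are made-up layer contributions, and all function names are hypothetical. The templates with an additional angle between axes are omitted.

```python
def hue_diff(h1, h2):
    """Smallest absolute difference between two hue angles, in degrees."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)

# Axis offsets (degrees) for the single-degree-of-freedom templates (assumed).
TEMPLATES = {
    "monochrome": [0.0],
    "complementary": [0.0, 180.0],
    "triad": [0.0, 120.0, 240.0],
    "square": [0.0, 90.0, 180.0, 270.0],
}

def nearest_axis(h, offsets, alpha):
    """Index of the template axis closest in hue to h."""
    return min(range(len(offsets)),
               key=lambda j: hue_diff(h, (alpha + offsets[j]) % 360.0))

def distance(palette, weights, offsets, alpha):
    """Weighted hue distance of an LCh palette to a template rotated by alpha."""
    total = 0.0
    for (L, C, h), w in zip(palette, weights):
        nearest = min(hue_diff(h, (alpha + o) % 360.0) for o in offsets)
        total += w * L * C * nearest
    return total

def fit(palette, weights, offsets):
    """Brute-force search over alpha in whole degrees; returns (alpha, distance)."""
    candidates = ((a, distance(palette, weights, offsets, float(a)))
                  for a in range(360))
    return min(candidates, key=lambda pair: pair[1])

def best_template(palette, weights):
    """Best-fitting template among those whose axes all receive a color."""
    best = None
    for name, offsets in TEMPLATES.items():
        alpha, dist = fit(palette, weights, offsets)
        used = {nearest_axis(h, offsets, alpha) for _, _, h in palette}
        if len(used) < len(offsets):
            continue  # an axis without any color: excluded from selection
        if best is None or dist < best[2]:
            best = (name, alpha, dist)
    return best

def harmonize(palette, offsets, alpha, strength):
    """Rotate each hue toward its nearest axis; strength 0 keeps it, 1 snaps it."""
    out = []
    for L, C, h in palette:
        axis = (alpha + offsets[nearest_axis(h, offsets, alpha)]) % 360.0
        delta = (axis - h + 180.0) % 360.0 - 180.0  # signed shortest rotation
        out.append((L, C, (h + strength * delta) % 360.0))
    return out

palette = [(60.0, 40.0, 10.0), (50.0, 30.0, 195.0), (70.0, 20.0, 350.0)]
weights = [0.5, 0.3, 0.2]  # made-up layer contributions
name, alpha, dist = best_template(palette, weights)
harmonized = harmonize(palette, TEMPLATES[name], alpha, 1.0)
```

For this made-up palette the complementary template is selected, and full-strength harmonization moves the three hues onto two opposite axes while leaving lightness and chroma unchanged.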
Figure 13 shows the best fitting template for a set of images, and the fully harmonized result. More examples can be found in the supplementary material. We compare our results with harmonizations from previous works in Figure 17. While our results are clearly different, they are arguably more balanced. Cohen-Or et al. [2006] demonstrated harmonization between different parts of an image using masks or harmonization of image composites. We provide comparisons for this scenario in Figure 18.
4.2. Beyond hue
Our compact representation using palettes and axis-based templates allows us to formulate full 3D color harmonization operations easily.
Apart from hue, some authors have described harmony in terms of lightness and chroma as well [Moon and Spencer, 1944; Birren and Cleland, 1969; Tokumaru et al., 2002]. While histogram-based approaches may be non-trivial to extend to these additional dimensions, our approach generalizes to them easily. Figure 14 shows our interpretation of the most typical LC templates defined in the literature. Analogous to our hue templates, we use to find the optimal for each template , and the best fitting template .
Snapping colors to a template requires finding the 2D line that best fits the LC distribution of the colors over a narrow hue band. To do that, we minimize a weighted sum of the perpendicular distances from each color to the axis of , using the same weights as in Subsection 4.1. Specifically:
For , the optimal position for the vertical axis after the optimization is .
For , the optimal horizontal axis is .
For and , we look for the axis pivoting from and . We search for the axis rotation by brute force every to find the optimal
For , the diagonal line equation is , where is and is . Then optimal displacement
For , the line equation is . Then optimal displacement .
For all templates, after line fitting we find the two extreme colors for the axis, and space the remaining ones evenly between those.
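The final spacing step can be sketched in the lightness-chroma plane as follows. The sketch assumes the fitted line is given by a point and a direction, that colors are stored as (chroma, lightness) pairs, and that the original ordering of colors along the line is preserved; these conventions and the name `space_evenly_on_axis` are assumptions made for illustration.

```python
def space_evenly_on_axis(points, origin, direction):
    """Project points onto a line, then spread them evenly between the two extremes.

    points:    list of (chroma, lightness) pairs
    origin:    a point on the fitted line
    direction: direction vector of the fitted line (need not be unit length)
    """
    norm = (direction[0] ** 2 + direction[1] ** 2) ** 0.5
    ux, uy = direction[0] / norm, direction[1] / norm
    # Signed position of each projected point along the line.
    t = [(p[0] - origin[0]) * ux + (p[1] - origin[1]) * uy for p in points]
    order = sorted(range(len(points)), key=lambda i: t[i])
    t_min, t_max = t[order[0]], t[order[-1]]
    n = len(points)
    result = [None] * n
    for rank, i in enumerate(order):
        s = t_min if n == 1 else t_min + (t_max - t_min) * rank / (n - 1)
        result[i] = (origin[0] + s * ux, origin[1] + s * uy)
    return result
```

For example, three colors on a vertical axis at lightness 20, 30, and 80 would be moved to lightness 20, 50, and 80.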
As can be seen, are defined primarily for a single axis, and so they are directly applicable to monochromatic and analogous hue templates. For multi-axis templates, specific arrangements are described by Munsell [Birren and Cleland, 1969] for complementary schemes, in terms of visual balance between the two axes, pivoting around a neutral point. We implement this idea by applying pivoting around for each axis. This approach can handle an arbitrary number of axes, although for palettes of optimal size, it is sometimes difficult to find more than one color per axis. Figure 15 shows examples of LC harmonization. It is worth mentioning that while our hue harmonization is always able to produce colorful results that preserve shading and contrast, harmonizing lightness and chroma may produce an unwanted loss of contrast when enforcing templates other than the optimal .
As part of his seminal work on color composition for design, Itten [1970] described additional pleasing color arrangements to create contrast. Unlike with sector-based templates, it is straightforward to model them with our axis-based representation. Here is the exhaustive list of Itten’s additional contrasting color arrangements and how they fit into our framework:
Hue: Triad template aligned with the RGB primaries. No need to solve for .
Light-dark: analogous or monochrome template, plus .
Complementary: same as complementary hue template.
Simultaneous: complementary template, plus the axis with the smaller overall scales down its chroma proportionally to .
Saturation: analogous or monochrome template, plus .
Extension: solve for L so the total sum of for each axis in is the same.
Cold-warm: a complementary template whose axis is aligned perpendicular to the cold-warm divide. The cold-warm divide is the complementary axis from red to cyan as seen in Figure 16.
5. Perceptual Study
We conducted a set of wide-ranging perceptual studies on harmonic colors and our harmonization algorithm. participants took part in our studies, with 31% self-reporting as having some knowledge in color theory. We obtained between and ratings per template, depending on the study. In our first study, we performed an end-to-end evaluation of our image harmonization algorithm and that of Cohen-Or et al. [2006]. To disentangle image content from color, we conducted a second evaluation on our harmonized palettes alone. Finally, to disentangle our algorithm from the percept of color harmony, we conducted a study evaluating the perception of archetypal harmonic color schemes.
In our experiments, we avoided the use of Likert scales, because the anchoring or range is unclear. While a given harmonic scheme can be applied with varying strength ( in Section 4.1), different harmonic schemes are incomparable. If shown all harmonized images in a gallery, participants may develop anchors for the Likert scale between templates. If shown harmonized images one-at-a-time in sequence, the same phenomenon would occur, but the anchors would develop dynamically across the sequence.
Therefore, all of our experiments are based on 2-alternative forced-choice (2AFC) questions [Cunningham and Wallraven, 2011]. Participants were shown two images and asked to choose which of the two has the most harmonic colors (Figure 19). The instructions explained that, “Harmonic colors are often perceived as balanced or pleasing to the eye.” In all experiments, a participant saw every stimulus (pair of images) twice. We used blockwise randomization so that, for each image, all stimuli were seen once before they were seen a second time. We used rejection sampling to guarantee that no stimulus was seen twice back-to-back. The initial left/right arrangement of the pair was random. For balance, the pair was shown in the opposite arrangement the second time. We did not discard data from participants who answered inconsistently. If a participant could not decide, they were expected to choose randomly.
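The presentation scheme can be sketched as below. This is a simplified illustration, not the survey code used in the study: it treats all stimuli as a single group (the per-image blocking is omitted), and the name `build_trial_sequence` is hypothetical.

```python
import random


def build_trial_sequence(pairs, rng):
    """Return a list of (left, right) trials in which every pair appears twice.

    pairs: list of (a, b) stimuli
    rng:   a random.Random instance
    """
    first = list(range(len(pairs)))
    rng.shuffle(first)
    second = list(range(len(pairs)))
    # Rejection sampling: reshuffle the second block until it does not start
    # with the pair that ended the first block.
    while True:
        rng.shuffle(second)
        if len(pairs) < 2 or second[0] != first[-1]:
            break
    # Random initial left/right arrangement per pair.
    flipped = [rng.random() < 0.5 for _ in pairs]
    trials = []
    for i in first:
        a, b = pairs[i]
        trials.append((b, a) if flipped[i] else (a, b))
    for i in second:
        a, b = pairs[i]
        # Opposite arrangement on the second showing, for balance.
        trials.append((a, b) if flipped[i] else (b, a))
    return trials
```

Within a block every pair appears exactly once, so a back-to-back repeat can only occur at the block boundary, which is the case the rejection loop rules out.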
All stimuli and study data can be found in the supplemental materials.
5.1. Image and Palette Harmonization
In our first experiment, we evaluated the output of image harmonization. Each survey compared an unmodified image to various harmonization algorithms: our monochromatic, complementary, single split, triad, double split, square, analogous, and two LC harmonization algorithms (monochromatic and complementary), and the output of Cohen-Or et al. [2006]. For all algorithms, we compared the unmodified image to the harmonized image. For our harmonization output, we also compared the unmodified image to the harmonization applied 50% (), and the harmonization applied 50% to the harmonization applied 100% (). We did not compare different templates directly.
We hypothesized that the harmonized images would be preferred, perhaps weakly, by viewers. We further hypothesized that this preference would vary by template, and that the preference would decrease when applying templates which lead to smaller changes in the output. If the palette change to match the metric is small, then the harmonized image may be indistinguishable from the original. In 2AFC experiments, this causes participants to choose randomly, so the preference tends towards 50/50.
We ran our experiment on 25 images, 9 of which had output from Cohen-Or et al. [2006]. We recruited participants via Amazon Mechanical Turk, 29% of whom reported having some knowledge in color theory. Individuals with impaired color vision were asked not to participate in the study. We sought 1000 ratings per template in order to detect an effect size of approximately 55% with a factor-of-10 correction for multiple comparisons (Šidák or Bonferroni) due to our 10 harmonization algorithms. To obtain 1000 or more ratings per pair of images, we obtained ratings from 20 individuals for each of the harmonizations of the 16 images without output from Cohen-Or et al. [2006], and from 60 individuals for each of the harmonizations of each of the 9 images with output from Cohen-Or et al. [2006]. (Each individual rated each pair twice.)
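To make the sample-size reasoning concrete, the sketch below tests a 2AFC preference count against chance with a Bonferroni correction. It assumes an exact one-sided binomial test and a significance level of 0.05, neither of which is stated in the text; the function names are hypothetical.

```python
from math import comb


def binomial_tail(successes, trials):
    """Exact one-sided P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    return sum(comb(trials, k) for k in range(successes, trials + 1)) / 2 ** trials


def is_significant(successes, trials, comparisons=10, alpha=0.05):
    """Bonferroni correction: compare the p-value against alpha / comparisons."""
    return binomial_tail(successes, trials) < alpha / comparisons
```

Under these assumptions, a 55% preference observed over 1000 ratings (550 choices) passes the corrected threshold, whereas a 52% preference (520 choices) does not.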
The most notable observation about this first study is that participants overall preferred the original images to harmonizations and a preference for to (Figure 20, left). While any given harmonization was not preferred to the original across all images, there was substantial variation per-image. For example, an analogous template fared better on some images versus others. Participants with knowledge about color theory had a statistically significant () stronger preference for harmonized images (3.7% overall).
In addition to our 9 harmonization templates, we also evaluated the harmonization result of Cohen-Or et al. [2006] on a subset of 9 images. Because we only have the optimal harmonization result of Cohen-Or et al. [2006], we compared preference rates to our automatically-chosen optimal harmonization (Equation 3) and to the harmonization template most preferred by participants in the perceptual study (Figure 21).
Harmonizing the colors of natural images was noted as a limitation by Cohen-Or et al. [2006] due to our expectations. In their output, they used masks to avoid, for example, affecting human skin. (We do not.) However, several of the images in our study were not natural images, and this had no apparent effect on ratings (for our technique and that of Cohen-Or et al.). To investigate whether the image content was biasing participants’ perception, we performed a second perceptual study that repeated the experiment, replacing every image with its palette. For this study, we recruited participants via Amazon Mechanical Turk, 32.5% of whom reported having some knowledge in color theory. Because the method of Cohen-Or et al. [2006] is not palette-based, we omitted it from the study since there are no before/after palettes to evaluate. Therefore, to obtain 1000 ratings per comparison, we obtained ratings from 20 individuals for each of the harmonizations for all 25 images.
In this experiment, the harmonizations were judged significantly better than when displaying images (Figure 20, left). The harmonizations on average were preferred to the original palettes (). Our monochromatic (), monochromatic LC (), and complementary ( with factor-of-nine Bonferroni correction) harmonizations produced palettes preferred to the originals. Participants with knowledge about color theory had a statistically significant () stronger preference for harmonized palettes (4.5% on average). Among knowledgeable participants, each template’s harmonized palettes were preferred to the originals of the time. The same three templates (monochromatic, monochromatic LC, and complementary) were preferred with statistical significance; when restricted to knowledgeable participants (344 ratings per template), our study had insufficient power to conclude whether the preference for additional specific templates differed significantly from chance.
We expected all of our harmonized results to be judged more harmonic than the input. Since many harmonizations of the same image or palette were shown to participants, there may have been a familiarity bias towards the more common original image or palette. To eliminate this as well as any biases stemming from the complexity of the image palettes, we performed an additional study.
5.2. Perception of Archetypal Color Harmony
To evaluate whether color harmony can be perceived in an archetypal setting, we evaluated the following basic templates in a controlled manner: monochromatic, complementary, triad, and square. We generated random monochromatic, complementary, triad, and square palettes with one, one, two, and three colors, respectively. For the complementary, triad, and square palettes, we randomly generated one color and then spaced the rest at intervals of 180°, 120°, and 90°, respectively, around the color wheel. We used the LCh color space, which is a cylindrical parameterization of the Lab color space. All colors had luminance 60 and chroma 100. The monochromatic template consisted of two colors. The first color had luminance 60, chroma 100, and randomly chosen hue. The second color was obtained by scaling the luminance and chroma of the first by two random factors in the range . We generated 15 palettes for each of the four categories.
To eliminate any familiarity bias, each of the 60 palettes was paired with a unique, random palette. Each palette was shown exactly twice with the same pairing. The random palettes shared the first color with their paired harmonic palette. For random palettes, we obtained the remaining color(s) by randomly sampling hues. We ordered the remaining colors according to their hue relative to the first color around the color wheel. For random palettes paired with complementary, triad, or square palettes, luminance and chroma were uniform across the entire palette. For random palettes paired with monochromatic palettes, the remaining color shared the same luminance and chroma as the second color in the monochromatic palette. We used rejection sampling to ensure that we did not accidentally generate a palette fitting one of the harmonic templates. (No two colors can be less than 23 units apart in Lab space, which is 10 times the just noticeable difference.)
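The stimulus generation for the multi-hue templates can be sketched as follows. This is an illustration under assumptions: harmonic palettes are taken to be evenly spaced in hue, only the minimum Lab distance constraint is enforced during rejection sampling (the check against fitting a harmonic template is omitted), and all function names are hypothetical.

```python
import math
import random


def lch_to_lab(lightness, chroma, hue_deg):
    """Convert cylindrical LCh coordinates to Cartesian Lab."""
    h = math.radians(hue_deg)
    return (lightness, chroma * math.cos(h), chroma * math.sin(h))


def harmonic_hues(base_hue, count):
    """`count` hues evenly spaced around the wheel, starting at base_hue."""
    return [(base_hue + k * 360.0 / count) % 360.0 for k in range(count)]


def random_hues(base_hue, count, rng, lightness=60.0, chroma=100.0, min_dist=23.0):
    """Random palette sharing its first hue with the paired harmonic palette.

    Resamples until every pair of colors is at least `min_dist` apart in Lab.
    """
    while True:
        hues = [base_hue] + [rng.uniform(0.0, 360.0) for _ in range(count - 1)]
        labs = [lch_to_lab(lightness, chroma, h) for h in hues]
        far_enough = all(math.dist(p, q) >= min_dist
                         for i, p in enumerate(labs) for q in labs[i + 1:])
        if far_enough:
            # Order the sampled colors by hue offset from the first color.
            rest = sorted(hues[1:], key=lambda h: (h - base_hue) % 360.0)
            return [base_hue] + rest
```

A triad palette would then be `harmonic_hues(base, 3)` and its paired random palette `random_hues(base, 3, rng)`, both at luminance 60 and chroma 100.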
We recruited participants via Amazon Mechanical Turk, 38% of whom reported having some knowledge in color theory. Each participant saw all pairs of palettes with the aforementioned randomization scheme (120 questions) and presentation (Figure 19). Each of the four templates therefore received evaluations versus a random palette.
The monochromatic and square templates were perceived to be significantly more harmonic than random palettes (Figure 22). However, random palettes were perceived as more harmonic than complementary and triad templates. In this study, participants with knowledge about color theory did not significantly differ in their judgments from participants without knowledge. We conjecture that the complementary and triad templates created the most contrast, which may have been the primary phenomenon participants considered when evaluating harmony; in other words, strong contrast was perceived to be disharmonious. This experiment suggests that perceptual uniformity in hue intervals may not be consciously perceived as “balanced or pleasing to the eye.”
6. Video Harmonization
Our methods can naturally extend to video by simply applying our image decomposition and harmonization on each frame independently. In this case, given the properties of our extracted palettes, we first compute a global palette for each sequence of frames, aiming at a more coherent layer decomposition without additional processing beyond the proposed framework. We describe the overall pipeline in Algorithm 1.
We show examples of video harmonization in Figure 23. Videos can be found in the supplementary material.
7. Color Transfer
Our palette extraction, image decomposition, and harmonic templates enable new approaches to color transfer. Harmonic templates carry important information about the color distribution in a palette or an image. We propose to transfer that information between palettes and images.
Given an input image and a reference image , we already know how to extract their palettes and , and estimate their optimal templates, and . After the fitting, we can compute the weight of each axis of the template as the sum of the weights of each color assigned to it. With this, we have an estimate of the main axis for each template, namely the one with the greatest influence on the image. This simple procedure helps to establish a straightforward match between palettes, something we can leverage to find the global rotation that aligns with . Next, we apply to globally rotate the colors of and then we harmonize them with the target’s template with . This method achieves results where is recolored so it is harmonic with respect to , taking into account the overall relevance of each color of the palette. Figure 24 shows results of this approach. We found that this method is good for matching dominant colors, which works better for content without real reference colors (e.g. graphic design or man-made objects).
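The main-axis matching can be sketched as below. The sketch assumes each palette color already has an axis assignment and a weight from the template fit; the names `axis_weights`, `main_axis`, and `aligning_rotation` are hypothetical.

```python
def axis_weights(assignment, weights, axis_count):
    """Sum the weights of the palette colors assigned to each axis."""
    totals = [0.0] * axis_count
    for axis_index, weight in zip(assignment, weights):
        totals[axis_index] += weight
    return totals


def main_axis(axes, assignment, weights):
    """Hue angle of the axis with the largest accumulated weight."""
    totals = axis_weights(assignment, weights, len(axes))
    return axes[max(range(len(axes)), key=lambda i: totals[i])]


def aligning_rotation(input_main, reference_main):
    """Global rotation (degrees, in [0, 360)) moving the input main axis onto the reference's."""
    return (reference_main - input_main) % 360.0
```

The resulting rotation would be added to every input palette hue before harmonizing toward the reference template.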
When the final results should better preserve the original colors, a more conservative method can be formulated. In this case, we harmonize the input image colors directly to the best-fitting template for the reference image , without any global rotation. We match palette colors to template axes according to Equation 1. After changing the hues of with any of the proposed methods, we attempt to match lightness and chroma between palettes by scaling the lightness and chroma of each palette color so that the averages of the input and reference palette colors match.
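The lightness and chroma matching can be sketched as a pair of uniform scalings. This assumes unweighted averages and applies no gamut clamping, both of which are simplifications; the function names are hypothetical.

```python
def match_average(values, reference_values):
    """Scale `values` so that their mean equals the mean of `reference_values`."""
    mean_in = sum(values) / len(values)
    mean_ref = sum(reference_values) / len(reference_values)
    if mean_in == 0.0:
        return list(values)  # a zero mean cannot be rescaled
    scale = mean_ref / mean_in
    return [v * scale for v in values]


def match_lightness_chroma(palette, reference):
    """Match average L and C of `palette` to `reference`; hue is left untouched.

    Both arguments are lists of (L, C, h) tuples.
    """
    lightness = match_average([p[0] for p in palette], [r[0] for r in reference])
    chroma = match_average([p[1] for p in palette], [r[1] for r in reference])
    return [(l, c, p[2]) for l, c, p in zip(lightness, chroma, palette)]
```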
We have presented a very efficient, intuitive and capable framework for color composition. It allows us to formulate previous and novel approaches to color harmonization and color transfer with very robust results. Our palette manipulations can be plugged into any palette-based system. Our image decomposition can be used generally by artists for manual editing or in other algorithms. Our large-scale perceptual study provides important data and insights into the perception of color harmony.
During our performance tests for the image decomposition, we found isolated cases where the computation of the 5D convex hull takes somewhat longer than usual. We believe it is due to very specific color distributions ( out of tested images), but we would like to study the phenomenon in more depth.
There are also cases for the templated color transfer where the input palette tries to match a reference palette with a higher number of axes. This is usually a case of colorization (adding more colors than the existing ones) that we currently handle with varying success depending on the input color palette. These cases may need more elaborate formulations for the transfer.
Because there is not a universal color theory, the concepts we leverage for our methods may not work for everybody. In fact, we already saw clear differences in our results with respect to previous work, even building on top of comparable foundations. Our perceptual study has revealed potential problems with the percept of color harmony that affect all work on color harmony. This exposes the need for additional perceptual studies evaluating the perceived quality of results from different algorithms. This also exposes the need for intuitive frameworks like ours, enabling users to use and interact with color harmony, despite only passing familiarity, so that they might find something indeed balanced and pleasing to the eye.
8.2. Future work
Inspired by Lin et al. [2017b], we wish to explore the use of superpixels to see whether we can achieve greater speedups. We also wish to explore parallel and approximate convex hull algorithms.
Other color-aware applications
We believe that our templates may carry semantic structure that we would like to keep exploring in the future. Among others, we believe this can enable higher level and more intuitive image search algorithms, where images or palettes can be used transparently to retrieve other images and color themes for design.
- Adobe  Adobe. 2018. Adobe Color CC. (2018). http://color.adobe.com
- Aharoni-Mack et al.  Elad Aharoni-Mack, Yakov Shambik, and Dani Lischinski. 2017. Pigment-Based Recoloring of Watercolor Paintings. In NPAR.
- Aksoy et al.  Yağiz Aksoy, Tunç Ozan Aydin, Aljoša Smolić, and Marc Pollefeys. 2017. Unmixing-based soft color segmentation for image manipulation. ACM Transactions on Graphics (TOG) 36, 2 (2017), 19.
- Arbelot et al.  B. Arbelot, R. Vergne, T. Hurtut, and J. Thollot. 2016. Automatic Texture Guided Color Transfer and Colorization. In Proceedings of the Joint Symposium on Computational Aesthetics and Sketch Based Interfaces and Modeling and Non-Photorealistic Animation and Rendering (Expresive ’16). Eurographics Association, Aire-la-Ville, Switzerland, Switzerland, 21–32. http://dl.acm.org/citation.cfm?id=2981324.2981328
- Barber et al.  C. Bradford Barber, David P. Dobkin, and Hannu Huhdanpaa. 1996. The Quickhull Algorithm for Convex Hulls. ACM Trans. Math. Softw. 22, 4 (Dec. 1996), 469–483.
- Baveye et al.  Yoann Baveye, Fabrice Urban, Christel Chamaret, Vincent Demoulin, and Pierre Hellier. 2013. Saliency-Guided Consistent Color Harmonization.. In CCIW. 105–118.
- Birren and Cleland  F. Birren and T.M. Cleland. 1969. A grammar of color: a basic treatise on the color system of Albert H. Munsell. Van Nostrand Reinhold Co. https://books.google.com/books?id=LnUaAAAAMAAJ
- Chamaret et al.  Christel Chamaret, Fabrice Urban, and Lionel Oisel. 2014. Harmony-guided image editing. In Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2171–2173.
- Chang et al.  Huiwen Chang, Ohad Fried, Yiming Liu, Stephen DiVerdi, and Adam Finkelstein. 2015. Palette-based Photo Recoloring. ACM Trans. Graph. 34, 4 (July 2015).
- Cohen-Or et al.  Daniel Cohen-Or, Olga Sorkine, Ran Gal, Tommer Leyvand, and Ying-Qing Xu. 2006. Color Harmonization. In ACM SIGGRAPH 2006 Papers (SIGGRAPH ’06). ACM, New York, NY, USA, 624–630. https://doi.org/10.1145/1179352.1141933
- Craig  Maurice D Craig. 1994. Minimum-volume transforms for remotely sensed data. IEEE Transactions on Geoscience and Remote Sensing 32, 3 (May 1994), 542–552. https://doi.org/10.1109/36.297973
- Cunningham and Wallraven  Douglas W Cunningham and Christian Wallraven. 2011. Experimental design: From user studies to psychophysics. CRC Press.
- Floater  Michael S Floater. 2015. Generalized barycentric coordinates and applications. Acta Numerica 24 (2015), 161–214.
- Garland and Heckbert  Michael Garland and Paul S. Heckbert. 1997. Surface Simplification Using Quadric Error Metrics. 209–216.
- Han et al.  Yu Han, Chen Xu, George Baciu, Min Li, and Md Robiul Islam. 2017. Cartoon and texture decomposition-based color transfer for fabric images. IEEE Transactions on Multimedia 19, 1 (2017), 80–92.
- Han et al.  Yu Han, Dejun Zheng, George Baciu, Xiangchu Feng, and Min Li. 2013. Fuzzy region competition-based auto-color-theme design for textile images. Textile Research Journal 83, 6 (2013), 638–650. https://doi.org/10.1177/0040517512452953
- Hou and Zhang  Xiaodi Hou and Liqing Zhang. 2007. Color conceptualization. In Proceedings of the 15th ACM international conference on Multimedia. ACM, 265–268.
- Huo and Tan  Xing Huo and Jieqing Tan. 2009. An improved method for color harmonization. In Image and Signal Processing, 2009. CISP’09. 2nd International Congress on. IEEE, 1–4.
- Itten  Johannes Itten. 1970. The elements of color. John Wiley & Sons.
- Levin et al.  Anat Levin, Alex Rav-Acha, and Dani Lischinski. 2008. Spectral matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 10 (2008), 1699–1712.
- Li et al.  Xujie Li, Hanli Zhao, Guizhi Nie, and Hui Huang. 2015. Image recoloring using geodesic distance based color harmonization. Computational Visual Media 1, 2 (2015), 143–155.
- Lin et al. [2017a] Sharon Lin, Matthew Fisher, Angela Dai, and Pat Hanrahan. 2017a. LayerBuilder: Layer Decomposition for Interactive Image and Video Color Editing. arXiv preprint arXiv:1701.03754 (2017).
- Lin et al. [2017b] Sharon Lin, Matthew Fisher, Angela Dai, and Pat Hanrahan. 2017b. LayerBuilder: Layer Decomposition for Interactive Image and Video Color Editing. CoRR abs/1701.03754 (2017). arXiv:1701.03754 http://arxiv.org/abs/1701.03754
- Lin and Hanrahan  Sharon Lin and Pat Hanrahan. 2013. Modeling how people extract color themes from images. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 3101–3110.
- Mellado et al.  Nicolas Mellado, David Vanderhaeghe, Charlotte Hoarau, Sidonie Christophe, Mathieu Brédif, and Loic Barthe. 2017. Constrained palette-space exploration. ACM Transactions on Graphics (TOG) 36, 4 (2017), 60.
- Moon and Spencer  Parry Moon and Domina Eberle Spencer. 1944. Geometric Formulation of Classical Color Harmony. J. Opt. Soc. Am. 34, 1 (Jan 1944), 46–59. https://doi.org/10.1364/JOSA.34.000046
- Morse et al.  Bryan S Morse, Daniel Thornton, Qing Xia, and John Uibel. 2007. Image-based color schemes. In Image Processing, 2007. ICIP 2007. IEEE International Conference on, Vol. 3. IEEE, III–497.
- Munshi and Singh  Bharat Munshi and Navjyoti Singh. 2015. Palette based colour transfer using image segmentation. In Image and Vision Computing New Zealand (IVCNZ), 2015 International Conference on. IEEE, 1–6.
- Nguyen et al.  RMH Nguyen, B Price, S Cohen, and MS Brown. 2017. Group-Theme Recoloring for Multi-Image Color Consistency. In Computer Graphics Forum, Vol. 36. Wiley Online Library, 83–92.
- O’Donovan et al.  Peter O’Donovan, Aseem Agarwala, and Aaron Hertzmann. 2011. Color compatibility from large datasets. ACM Transactions on Graphics (TOG) 30, 4 (2011), 63.
- Phan et al.  Huy Phan, Hongbo Fu, and Antoni Chan. 2017. Color Orchestra: Ordering Color Palettes for Interpolation and Prediction. IEEE Transactions on Visualization and Computer Graphics (2017).
- Pitié et al.  François Pitié, Anil C. Kokaram, and Rozenn Dahyot. 2007. Automated Colour Grading Using Colour Distribution Transfer. Comput. Vis. Image Underst. 107, 1-2 (July 2007), 123–137. https://doi.org/10.1016/j.cviu.2006.11.011
- Sander et al.  Pedro V. Sander, Xianfeng Gu, Steven J. Gortler, Hugues Hoppe, and John Snyder. 2000. Silhouette Clipping. In Proceedings of ACM SIGGRAPH. 327–334.
- Sawant and Mitra  N. Sawant and N. J. Mitra. Color Harmonization for Videos. In Indian Conference on Computer Vision, Graphics and Image Processing. 576–582.
- Sunkavalli et al.  Kalyan Sunkavalli, Micah K Johnson, Wojciech Matusik, and Hanspeter Pfister. 2010. Multi-scale image harmonization. In ACM Transactions on Graphics (TOG), Vol. 29. ACM, 125.
- Tan et al.  Jianchao Tan, Stephen DiVerdi, Jingwan Lu, and Yotam Gingold. 2017. Pigmento: Pigment-Based Image Analysis and Editing. arXiv preprint arXiv:1707.08323 (2017).
- Tan et al.  Jianchao Tan, Jyh-Ming Lien, and Yotam Gingold. 2016. Decomposing Images into Layers via RGB-space Geometry. ACM Trans. Graph. 36, 1, Article 7 (Nov. 2016), 14 pages. https://doi.org/10.1145/2988229
- Tang et al.  Zhen Tang, Zhenjiang Miao, and Yanli Wan. 2010. Image composition with color harmonization. In 25th International Conference of Image and Vision Computing New Zealand.
- Tokumaru et al.  Masataka Tokumaru, Noriaki Muranaka, and Shigeru Imanishi. 2002. Color design support system considering color harmony. In Fuzzy Systems, 2002. FUZZ-IEEE’02. Proceedings of the 2002 IEEE International Conference on, Vol. 1. IEEE, 378–383.
- Tsai et al.  Y.-H. Tsai, X. Shen, Z. Lin, K. Sunkavalli, X. Lu, and M.-H. Yang. 2017. Deep Image Harmonization. ArXiv e-prints (Feb. 2017). arXiv:cs.CV/1703.00069
- Wang et al.  Baoyuan Wang, Yizhou Yu, Tien-Tsin Wong, Chun Chen, and Ying-Qing Xu. 2010. Data-driven image color theme enhancement. In ACM Transactions on Graphics (TOG), Vol. 29. ACM, 146.
- Westland et al.  Stephen Westland, Kevin Laycock, Vien Cheung, Phil Henry, and Forough Mahyar. 2007. Colour harmony. JAIC-Journal of the International Colour Association 1 (2007).
- Zhang et al.  Qing Zhang, Chunxia Xiao, Hanqiu Sun, and Feng Tang. 2017. Palette-Based Image Recoloring Using Color Decomposition Optimization. IEEE Transactions on Image Processing 26, 4 (2017), 1952–1964.