In this post, we are going to talk about color quantization. Color quantization is a technique for reducing the number of distinct colors used in an image and, as stated on Wikipedia, it is important for displaying images on devices that support a limited number of colors and for efficiently compressing certain kinds of images. As a side effect, the colors in the resulting image vary less.

In other words, if we want an image to be made of just 5 colors, a color quantization technique analyzes the image pixel by pixel, maps each pixel to one of those 5 colors, and rebuilds the image accordingly.

One of the most widely used techniques for color quantization is K-means. K-means is an unsupervised machine learning algorithm that tries to group similar items into clusters. The number of clusters is denoted by K. It is a very simple algorithm that follows these 5 steps:

  1. Select K, the number of clusters.
  2. Randomly choose the initial center of each cluster.
  3. Assign each sample to the cluster whose center is closest to it.
  4. Once all samples are assigned to a cluster, recompute the center of each cluster as the mean of its samples.
  5. If the new centers do not change much from the previous centers, the algorithm ends; otherwise, start again from step 3 using the new centers.
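
To make the five steps concrete, here is a minimal NumPy sketch of them. It is only an illustration under a few assumptions: the function name `kmeans_sketch`, its parameters (`max_iter`, `eps`, `seed`) and the choice to keep the old center when a cluster ends up empty are ours, not part of any library.

```python
import numpy as np

def kmeans_sketch(points, k, max_iter=20, eps=1.0, seed=0):
    """Illustrative k-means following the five steps (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=np.float64)

    def assign(centers):
        # Step 3: index of the nearest center for every sample.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        return dist.argmin(axis=1)

    # Step 2: pick k distinct samples as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign(centers)
        # Step 4: recompute each center as the mean of its samples
        # (assumption: an empty cluster keeps its previous center).
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        # Step 5: stop once no center has moved by more than eps.
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift <= eps:
            break
    # Final assignment against the final centers.
    return assign(centers), centers
```

Calling `kmeans_sketch(pixels, 5)` on an n x 3 array of pixel colors would return one label per pixel and the 5 cluster centers. In practice a library implementation is the better choice, since it is faster and tries several initializations.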

In our specific case, i.e. color quantization, the K value corresponds to the number of colors we want to use for the final image, and each sample is the color of one pixel (for more information about K-means, you can follow this link). The code below applies it with OpenCV:

import cv2
import numpy as np

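#read the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread("gent.jpg")
if img is None:
    raise FileNotFoundError("could not read gent.jpg")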

#reshape the image into an n x 3 float32 array: one row per pixel, one column per channel
reshaped_img = np.float32(img).reshape((-1, 3))

#defining criteria:
#cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER means that we want the algorithm to stop either when the centers move by less than epsilon (1.0) or after the maximum number of iterations (20), whichever comes first.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)

#number of clusters
K = 5

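#apply cv2.kmeans: run it 10 times with random initial centers and keep the most compact result
#(ret is the compactness of that result, labels holds the cluster index of each pixel)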
ret, labels, centers = cv2.kmeans(reshaped_img, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

#convert the centers from float to int
centers = np.uint8(centers)

#select the color according to the labels
result_image = centers[labels.flatten()]

#reshape the image to the original size
result_image = result_image.reshape(img.shape)

#concatenate the original and the quantized image side by side
concat_image = np.concatenate((img, result_image), axis=1)

#show the image until a key is pressed, then close the window
cv2.imshow("result", concat_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

If we execute the above code, we obtain the following result:

Figure: Color Quantization

In conclusion, color quantization is a powerful technique that we can use for many tasks, such as segmentation and compression. We used K-means (where we specify just the number of clusters we want in the color space) as the technique for obtaining the final image, but we can also use other kinds of clustering techniques, or a simple lookup table (LUT) in which we explicitly define the colors we want in the final image and map the colors of the original image according to the table.
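
As a rough illustration of the lookup-table idea, the sketch below maps every pixel to the nearest color of a palette that we choose ourselves. The function name `quantize_to_palette` and the nearest-color rule (squared Euclidean distance in the image's own channel order) are assumptions made for this example; a real LUT could use any mapping rule.

```python
import numpy as np

def quantize_to_palette(image, palette):
    """Map each pixel to its nearest palette color (hypothetical helper)."""
    palette = np.asarray(palette, dtype=np.float32)
    # One row per pixel, one column per channel, as in the K-means code.
    pixels = np.float32(image).reshape((-1, 3))
    # Squared Euclidean distance from every pixel to every palette color.
    distances = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    nearest = distances.argmin(axis=1)
    # Replace each pixel by its palette color and restore the image shape.
    return np.uint8(palette)[nearest].reshape(image.shape)
```

For an image loaded with `cv2.imread`, the palette colors have to be given in BGR order, for example `quantize_to_palette(img, [[0, 0, 0], [255, 255, 255], [0, 0, 255]])` for black, white and red. Unlike K-means, nothing is learned from the image here: the palette is fixed in advance, so the result depends entirely on how well the chosen colors suit the picture.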