Color Recognition for Rubik's Cube Robot

01/11/2019 ∙ by Shenglan Liu, et al.

In this paper, we propose three methods for color recognition of Rubik's cube: one offline method and two online methods. Scatter balance & extreme learning machine (SB-ELM), an offline method, is proposed to illustrate the efficiency of training-based methods. We also introduce the concept of color drifting, which indicates that offline methods are ineffective and cannot work well under continuously changing circumstances. By contrast, dynamic weight label propagation is proposed for labeling block colors from the known center block colors of the Rubik's cube. Furthermore, weak label hierarchic propagation, another online method, is proposed for the case where no color information is known and only the weak labels of the center blocks are used in color recognition. Finally, we design a Rubik's cube robot and construct a dataset to illustrate the efficiency and effectiveness of our online methods and to show the ineffectiveness of the offline method caused by color drifting in our dataset.




1 Introduction

The Rubik’s cube puzzle has long been a hot topic in intelligence competitions for children and adults. In the service robot field, solving the Rubik’s cube efficiently is a challenge for computer vision. A software scheme for solving a randomly scrambled cube includes detection, color recognition and a solving method. The Rubik’s cube puzzle can also be considered a sequential manipulation problem for service robots [1]. For example, optical time-of-flight pre-touch sensors are used to grasp the Rubik’s cube and achieve highly precise sequential manipulation.

In recent years, excellent research has been devoted to fast algorithms for solving the Rubik’s cube puzzle. Kociemba [2] proposed an algorithm close to God’s algorithm. Korf [3] [4] applied tree search and graph search to improve the efficiency of solving the puzzle. Rokicki et al. [5] showed that the diameter of the Rubik’s cube group is twenty, a ground-breaking result for the puzzle. These algorithms focus on solving the cube, whether by humans or by robots. However, in a service robot the solution is the last procedure, and many procedures must be completed before it. In a cube service robot, we first detect the location of the cube with a camera. Color recognition must then be performed once the color blocks of the cube are obtained, which is one of the most important problems in the whole procedure. The sequential manipulation of cooperating service robots requires correct color information for each face of the Rubik’s cube, because incorrect color information gives a wrong initialization to the solving algorithm. However, to the best of our knowledge, few works focus on color recognition for the Rubik’s cube puzzle. Cezary Zielinski et al. [6] introduce a full Rubik’s cube robot project, including grasping the cube, color recognition, the solving algorithm and mechanical arm control; it uses the HSV color space to obtain the color block distribution and illustrates the importance of color recognition in a cube-solving robot. In computer vision, many image processing methods are relevant to the color recognition of Rubik’s cube (CRRC) problem [7] [8] [9]. Although we can obtain numerous images of cubes, it is hard to cover all the situations caused by complex environments, e.g. changing illumination, cube material and abrasion from long-term use. Therefore, most pre-trained models cannot achieve satisfactory performance on the CRRC problem, a phenomenon we call color drifting.

Figure 1: Cube-solving Robot

In this paper, we collect images of Rubik’s cubes under different illumination and with different materials, and label them using the perceptual visual interactive learning (PVIL) method [10], forming a dataset (RC dataset) that contains 348 images and offers two HSV features. To solve the CRRC problem, we introduce one model and propose two models, covering offline and online approaches. The offline method, called scatter balance & extreme learning machine (SB-ELM), uses supervised dimensionality reduction and classification to enhance robustness and to account for different usage environments in the CRRC problem. The online methods fall into the following two approaches.

(1) Weak label hierarchic propagation (WLHP), based on the weak supervision hypothesis [11], is proposed in this paper for the special structure of the problem, in which each face of the Rubik’s cube has nine color blocks.

(2) Dynamic weight label propagation (DWLP), which is motivated by the idea of KNN, assumes that the center block color of each face of the Rubik’s cube is known. Here, the faces with the orange and blue center blocks face the camera in the initial state on the robot arm (See Fig.


The rest of this paper is organized as follows. The offline and online methods for color recognition of the Rubik’s cube are proposed in Section 2. Our new Rubik’s cube dataset for color recognition is introduced and analyzed in Section 3. In Section 4, the experimental results are listed and analyzed. The last section concludes the paper.

2 Proposed Method

2.1 Overview

In this section, we propose two HSV features to represent the color blocks of the Rubik’s cube. Based on these features, one offline approach and two online approaches are introduced. The offline method, called scatter balance & extreme learning machine (SB-ELM), uses supervised dimensionality reduction and classification. The online methods are weak label hierarchic propagation (WLHP) and dynamic weight label propagation (DWLP). WLHP relies on the fact that no two of the six center color blocks of the Rubik’s cube have the same color, while DWLP uses the specific colors of the six center blocks.

2.2 Features

Before color recognition, it is necessary to select suitable features to represent color blocks. The RGB color space is often used in computers to represent color. Compared to RGB, however, the HSV color space makes it easier to recognize colors because it simulates human perception of color [12] [13] [14]. We propose two features built in the HSV color space: 3DHSV and 16DHSV.

3DHSV: Since the pixels in one color block are concentrated in the HSV color space, we only need the coordinates of the pixel value with the highest frequency in the histogram as the feature. As Fig. 2 shows, for each dimension of the HSV color space, 3DHSV computes the histogram over the color block and takes the value of the fullest bin as one dimension of the feature.
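The 3DHSV computation can be sketched as follows. This is a minimal illustration, assuming one histogram bin per integer level and a color block given as a list of integer (h, s, v) pixels; the function name `hsv_mode_feature` and the bin width are assumptions made for the example, not details from the paper.

```python
from collections import Counter

def hsv_mode_feature(pixels):
    """Return a 3-D feature: for each HSV channel, the most frequent value.

    pixels: iterable of (h, s, v) integer tuples from one color block.
    Assumption: one histogram bin per integer level.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("empty color block")
    feature = []
    for channel in range(3):
        histogram = Counter(p[channel] for p in pixels)
        # most_common(1) yields the fullest bin and its count
        value, _count = histogram.most_common(1)[0]
        feature.append(value)
    return tuple(feature)

# A block that is mostly one color, plus a few outlier pixels (e.g. reflection)
block = [(10, 200, 150)] * 60 + [(90, 50, 30)] * 4
print(hsv_mode_feature(block))
```

Because only the fullest bin of each channel is kept, the few outlier pixels in the example do not influence the feature.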

16DHSV: 3DHSV might ignore key information in certain situations, such as under reflection. To improve on 3DHSV, we design an uneven histogram feature based on the distribution of color blocks [6]. Uneven histogram statistics, with bins chosen according to the color distribution characteristics, are computed over the color blocks to obtain the 16DHSV feature.
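An uneven histogram feature of this kind might look like the sketch below. The bin edges, and the split into 8 hue bins, 4 saturation bins and 4 value bins, are hypothetical choices made only so that the result has 16 dimensions; the paper does not state the actual bins. OpenCV-style ranges (hue in [0, 180), saturation and value in [0, 256)) are also assumed.

```python
from bisect import bisect_right

# Hypothetical bin edges (NOT from the paper): 8 + 4 + 4 = 16 bins in total.
HUE_EDGES = [0, 8, 20, 35, 75, 100, 130, 160, 180]
SAT_EDGES = [0, 40, 100, 180, 256]
VAL_EDGES = [0, 60, 130, 200, 256]

def uneven_histogram(values, edges):
    """Normalized histogram over bins of unequal width defined by `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v < edges[-1]:
            # bisect_right finds the first edge greater than v
            counts[bisect_right(edges, v) - 1] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def hsv16_feature(pixels):
    """Concatenate the uneven histograms of the H, S and V channels."""
    pixels = list(pixels)
    feature = []
    for channel, edges in enumerate((HUE_EDGES, SAT_EDGES, VAL_EDGES)):
        feature += uneven_histogram([p[channel] for p in pixels], edges)
    return feature

block = [(10, 200, 150)] * 3 + [(90, 50, 30)]
print(len(hsv16_feature(block)))
```

Unlike the single-mode feature, the histogram keeps the share of pixels that fall outside the dominant bin, which is the kind of information a reflection would otherwise hide.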

2.3 Scatter balance & extreme learning machine

We use a novel supervised dimensionality reduction method, ALDE (Angle Linear Discriminant Embedding) [15], to extract the features, with ELM (Extreme Learning Machine) [16] as the classifier. ALDE redefines the within-class scatter matrix and the between-class scatter matrix by introducing the CO-Angle measurement. Thus ALDE is biased toward neither the within-class scatter nor the between-class scatter. ALDE can be transformed into an optimization problem over a transformation matrix, stated in terms of the newly defined within-class scatter matrix, the newly defined between-class scatter matrix and the identity matrix. The two scatter matrices are built from the number of classes, the number of all samples, the number of samples in each class, the mean of all samples and the mean of the samples in each class, with a unitizing operation applied in the process.

Figure 2: Description of the 3DHSV

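Furthermore, ELM is a feed-forward neural network with a single hidden layer, which has extremely fast training speed and high recognition accuracy. Following [16], ELM with $L$ hidden nodes and $N$ training samples can be mathematically modeled as:

$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \dots, N$$

where $x_j$ denote the samples, $t_j$ denote their labels, and $g$ is the activation function, while $w_i$ denote the input weights, $b_i$ denote the bias parameters, and $\beta_i$ are the output weights. A sigmoid function is used as the activation function:

$$g(x) = \frac{1}{1 + e^{-x}}$$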
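The ELM classifier described here can be sketched with NumPy as below, following the general recipe of [16]: input weights and biases are drawn at random and left fixed, and only the output weights are solved for, here through the Moore-Penrose pseudoinverse. The class name `ELM`, the hidden-layer size, the uniform initialization range and the one-hot target encoding are illustrative assumptions rather than the settings used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ELM:
    """Single-hidden-layer ELM sketch: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden          # assumed size, not from the paper
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Random input weights and biases stay fixed after initialization
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = sigmoid(X @ self.W + self.b)   # hidden-layer outputs
        self.beta = np.linalg.pinv(H) @ T  # output weights by pseudoinverse
        return self

    def predict(self, X):
        H = sigmoid(np.asarray(X, dtype=float) @ self.W + self.b)
        return self.classes_[np.argmax(H @ self.beta, axis=1)]

# Toy usage with two well-separated groups of 2-D features
X = [[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [1, 1], [0.9, 1], [1, 0.9], [0.9, 0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = ELM().fit(X, y)
print(model.predict(X))
```

Training involves no iterative optimization, which is where the fast training speed comes from; in SB-ELM the inputs would be the features after dimensionality reduction rather than raw coordinates as in this toy example.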


2.4 Weak label hierarchic propagation

Weak label hierarchic propagation (WLHP) exploits the special structure of the cube state to obtain data with weak labels, and the labels are then spread by hierarchical propagation to solve the CRRC problem.

In fact, solving the CRRC problem only requires the distribution of the cube blocks, so the problem can be simplified: we only have to determine which face each cube block belongs to, without knowing its specific color. Moreover, no two of the six center color blocks of the cube have the same color. WLHP therefore takes the features of the six center color blocks as the starting points of propagation, and then finds the other color blocks that belong to the same face as each center color block by hierarchic propagation.

Input: the set of 54 features for cube color blocks, $X = \{x_1, \dots, x_{54}\}$;
Output: the recognition results, $Y = \{y_1, \dots, y_{54}\}$;
for $k = 1, \dots, 6$ do
     let $c_k$ be the index of the center block of face $k$;
     $y_{c_k} = k$;
     remove $x_{c_k}$ from $X$;
     get the indices of the 2 nearest neighbors of $x_{c_k}$ in $X$ by KNN, stored in $N_k$;
     for $i \in N_k$ do
         $y_i = k$;
         remove $x_i$ from $X$;
     end for
end for
for $k = 1, \dots, 6$ do
     for $i \in N_k$ do
         get the indices of the 3 nearest neighbors of $x_i$ in $X$ by KNN, stored in $M$;
         for $j \in M$ do
              $y_j = k$;
              remove $x_j$ from $X$;
         end for
     end for
end for
return $Y$;
Algorithm 1 Weak label hierarchic propagation.

The implementation of the hierarchic propagation is shown in Alg. 1, where $X$ and $Y$ are the sets of features and labels of the cube color blocks, respectively. The subscript of $x$ or $y$ denotes the index of a cube color block, and $N_k$ and $M$ record the intermediate nearest-neighbor indices. Initially, $X$ contains the color block features of a cube obtained online, 54 features in total. Hierarchical propagation handles the situation where the feature of a center color block deviates from the center of its color category.
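The hierarchic propagation of Alg. 1 might be implemented along the following lines. This is a sketch under stated assumptions: features are plain tuples, the distance is Euclidean, the indices of the six center blocks are passed in by the caller, and the names `wlhp` and `nearest` are hypothetical. Ties are broken by index so the result is deterministic.

```python
import math

def nearest(idx, pool, feats, k):
    """Indices of the k features in `pool` closest to feats[idx] (Euclidean)."""
    return sorted(pool, key=lambda j: (math.dist(feats[idx], feats[j]), j))[:k]

def wlhp(feats, centers):
    """Assign a face label 0..5 to each block.

    feats:   list of 54 feature tuples
    centers: indices of the six center blocks (assumed known from position)
    """
    labels = [None] * len(feats)
    pool = set(range(len(feats))) - set(centers)   # unlabeled blocks
    first_level = {}
    # Level 1: every center claims its 2 nearest unlabeled blocks.
    for face, c in enumerate(centers):
        labels[c] = face
        first_level[face] = nearest(c, pool, feats, 2)
        for j in first_level[face]:
            labels[j] = face
            pool.discard(j)
    # Level 2: every level-1 block claims its 3 nearest remaining blocks,
    # giving 1 + 2 + 2 * 3 = 9 blocks per face.
    for face in range(len(centers)):
        for i in first_level[face]:
            for j in nearest(i, pool, feats, 3):
                labels[j] = face
                pool.discard(j)
    return labels

# Toy usage: six tight, well-separated clusters of nine features each
feats = [(30.0 * f + 0.1 * k, 0.2 * k, 0.0) for f in range(6) for k in range(9)]
centers = [9 * f for f in range(6)]
print(wlhp(feats, centers))
```

In the toy data each center sits at the edge of its cluster rather than in the middle, which is the situation the two-level scheme is meant to handle: the second level starts from blocks that are already inside the cluster.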

2.5 Dynamic weight label propagation

Label propagation is a semi-supervised machine learning algorithm that assigns labels to previously unlabeled data points [17]. Dynamic weight label propagation uses dynamic weights that depend on the labels. For the CRRC problem, the data size is very small (54 features), so KNN is used as the measure for label propagation rather than constructing a graph. A weight matrix, one per label, is used to adjust the distance between a sample carrying that label and a sample without a label (Eq. 5).

Figure 3: Dynamic weight label propagation

As we can see in Fig. 3, during the propagation of the green label, if we directly calculate the distance from the labeled green sample to the other samples without the weight matrix, it is hard to choose the correct samples. In DWLP, the distance is calculated in the form of Eq. 5, which makes them easy to recognize. The matrix is dynamically determined by the label during the propagation process. For example, the weight matrix of the white label is used when propagating the white label. In this way, the distribution characteristics of the data are well utilized, and a high recognition rate can be obtained.
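The effect of a label-dependent weight matrix can be illustrated with a small sketch. It assumes a diagonal weight matrix (one weight per HSV dimension) and a weighted Euclidean distance; the exact form used in the paper is not recoverable from the text here, and the weights, features and function names below are hypothetical.

```python
import math

def weighted_distance(x_labeled, x_unlabeled, weights):
    """Weighted Euclidean distance; `weights` is the diagonal of the
    weight matrix for the label being propagated (assumed form)."""
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, x_labeled, x_unlabeled)))

def propagate_label(center, pool, feats, weights, k=8):
    """Pick the k unlabeled blocks closest to the labeled center block
    under the weights of that center's label."""
    ranked = sorted(pool, key=lambda j: (weighted_distance(feats[center], feats[j], weights), j))
    return ranked[:k]

# Hypothetical (h, s, v) features: block 1 has the same hue as the center
# but is much darker; block 2 is similarly bright but has a different hue.
feats = [(60, 200, 100), (62, 200, 40), (75, 200, 95)]
print(propagate_label(0, [1, 2], feats, (1.0, 1.0, 1.0), k=1))    # unweighted
print(propagate_label(0, [1, 2], feats, (10.0, 1.0, 0.01), k=1))  # hue emphasized
```

With equal weights the brighter block of a different hue is the nearest neighbor; once hue is emphasized and value is discounted for this label, the darker block of the same hue is chosen instead.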

3 Rubik’s Cube Dataset

3.1 Overview

To evaluate the performance of the color classification algorithms, we have collected a set of Rubik’s cube images under different environmental circumstances. In this section, we first introduce the collection process of the Rubik’s cube dataset, then explain how the images are labeled with the perceptual visual interactive learning (PVIL) method [10].

3.2 Samples collection

We put the cube in the hands of the robot and use the camera to take pictures in different situations. Images are collected in groups. Each group represents the state of one cube and consists of three images, and each image in a group contains two faces of the cube, as Fig. 4(a) shows.

To facilitate the extraction of color features, it is necessary to separate the cube from the background, so we manually mark the six corners of the cube and then use a perspective transformation to convert the irregular surfaces into regular squares, as Fig. 4(b) shows.
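The perspective step can be sketched with NumPy by solving for the 3x3 homography between the four marked corners of one face and a unit square, then mapping the nine block centers back into the image. The corner ordering, the function names and the choice to sample block centers instead of warping the whole image are assumptions made for this illustration; with six marked corners, each of the two visible faces would use its own four corners.

```python
import numpy as np

def homography(src, dst):
    """3x3 matrix mapping four src points onto four dst points (h33 = 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Apply a homography to a 2-D point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

def block_centers(quad, size=3):
    """Image coordinates of the size x size block centers of one face.

    quad: face corners in the image, ordered top-left, top-right,
          bottom-right, bottom-left (assumed ordering).
    """
    unit = [(0, 0), (1, 0), (1, 1), (0, 1)]
    H = homography(unit, quad)
    return [apply_h(H, ((c + 0.5) / size, (r + 0.5) / size))
            for r in range(size) for c in range(size)]

# Hypothetical corner coordinates of one face in a 640 x 480 image
quad = [(100, 50), (400, 80), (420, 380), (90, 330)]
for center in block_centers(quad):
    print(tuple(round(float(v), 1) for v in center))
```

Pixels sampled around each returned center can then be fed to the feature extraction; a library routine such as OpenCV's perspective warp would serve the same purpose when the whole rectified face is needed.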

(a) one group images
(b) regular squares
Figure 4: Separate the cube surfaces from background

The entire dataset consists of 348 images with the same resolution (640 × 480), with information on a total of 18792 color blocks. The dataset covers the following 5 circumstances: cube A in bright, cube A in dark, cube B in bright, cube B illuminated from right, and cube B illuminated from above. A and B are cubes with two different backgrounds. Fig. 5 shows examples of the different circumstances.

(a) cube A in bright
(b) cube A in dark
(c) cube B in bright
(d) cube B illuminated from right
(e) cube B illuminated from above
Figure 5: Examples in Rubik’s Cube Dataset
Figure 6: The process of PVIL

3.3 Label samples

To obtain labels for the Rubik’s cube dataset, we adopt a visual labeling method called PVIL labeling [10]. Compared with the original manual labeling method, the PVIL method is stable in accuracy and efficient in time cost. The process of PVIL includes the following 4 steps, as shown in Fig. 6: (a) mapping the original data into a feature subspace; (b) visualizing the feature vectors; (c) judging and selecting the data that fall into one group; (d) obtaining the labeling results.

4 Experiment

In this section, we evaluate the proposed methods on the Rubik’s cube dataset introduced above. The methods are divided into an offline method and online methods. SB-ELM is the offline method and uses 16DHSV features. WLHP and DWLP are online methods based on 3DHSV features. In our experiments, we use A to E to denote the five circumstances in the Rubik’s cube dataset, in the following order: cube A in bright, cube A in dark, cube B in bright, cube B illuminated from right, cube B illuminated from above.

4.1 Accuracy of offline method

Training samples   A       B       C       D       E       Total
50                 85.75   96.06   82.03   78.07   76.15   84.54
100                97.76   97.88   82.39   79.17   76.86   84.47
150                96.94   98.88   78.52   78.64   79.04   87.02
200                96.96   98.98   82.34   80.34   78.42   87.22
250                98.80   99.13   82.34   81.37   79.64   85.03
300                97.54   94.56   81.33   80.10   79.18   87.23
Table 1: Accuracy (%) of SB-ELM with 16DHSV

We test the accuracy of SB-ELM in a single circumstance and in multiple circumstances, respectively. The accuracy of SB-ELM under a single environmental circumstance is high, while the accuracy under multiple circumstances is relatively low. There are two reasons: (1) it is hard to generalize over all the circumstances caused by a complex environment; (2) when the brightness of the light changes, most of the features of orange and red shift in the color space, which makes the distributions of orange and red overlap. As we can see in Tab. 1, the accuracy of SB-ELM in circumstances A and B, trained with 250 samples, reaches around 99%. At the same time, the accuracy in multiple circumstances is between 84% and 88%.

4.2 Accuracy of online methods

We test the accuracy of the online methods in a single circumstance and across all circumstances, respectively. The accuracy of KNN is used as a baseline. Compared with KNN, WLHP has a higher recognition rate because it takes into account that the center color block feature may not lie at the center of the data. More specifically, if KNN is used to select the 8 neighbors of the center color block at once, some of the 8 neighbors may be misidentified, whereas WLHP reduces the number of neighbors selected at each step, effectively avoiding misjudgment. At the same time, the neighbors of neighbors are used as the remaining color blocks to ensure the correct number of identifications. It can be seen from Tab. 2 that WLHP gets better results than KNN on the whole dataset and in every circumstance except D (77.70% versus 79.01%).

Method   A       B       C       D       E       Total
KNN      95.26   95.11   85.19   79.01   77.68   86.45
WLHP     96.22   95.85   87.27   77.70   80.70   87.55
WLHP*    97.63   99.85   99.61   89.86   94.15   96.22
DWLP     100.0   99.26   96.84   93.88   98.15   97.63
Table 2: Accuracy (%) of online methods with 3DHSV

The optimized WLHP, denoted WLHP*, is another version of the online method. The process is as follows: find the white block among the center color blocks based on the saturation dimension of 3DHSV, then identify the other white color blocks from the white center color block. When identifying the other color blocks, the weight of the hue dimension is increased in the distance between two features. Increasing this weight takes the shift of the color distribution into account more fully, which makes the colors easier to discriminate. The experimental results in Tab. 2 show that the accuracy of WLHP* on the whole dataset is 8.67 percentage points higher than that of WLHP (96.22% versus 87.55%).
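The two ingredients of WLHP* can be sketched as below. Both functions rest on assumptions: white is taken to be the center block with the lowest saturation, since the text only says the saturation dimension is used, and `hue_weight=4.0` is a placeholder value, not the weight used in the experiments. Hue is treated as a plain number here, without wrap-around.

```python
def find_white_center(center_feats):
    """Index of the center block with the lowest saturation.

    center_feats: list of six (h, s, v) 3DHSV features.
    Assumption: white is the least saturated center.
    """
    return min(range(len(center_feats)), key=lambda i: center_feats[i][1])

def hue_weighted_distance(a, b, hue_weight=4.0):
    """Euclidean distance with the hue difference weighted more heavily.

    hue_weight is a hypothetical value chosen for illustration.
    """
    dh, ds, dv = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    return (hue_weight * dh * dh + ds * ds + dv * dv) ** 0.5

# Hypothetical center features; the last one has very low saturation
centers = [(5, 220, 180), (15, 210, 200), (30, 200, 210),
           (60, 190, 150), (110, 230, 140), (0, 20, 230)]
print(find_white_center(centers))
print(hue_weighted_distance((5, 220, 180), (15, 210, 200)))
```

The weighted distance would replace the plain distance in the nearest-neighbor search for the five non-white colors, so that neighboring hues such as red and orange are pushed further apart.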

Inspired by WLHP*, DWLP makes full use of color information. DWLP utilizes the distribution characteristics of the six colors in the HSV color space to increase the degree of color differentiation. A color-dependent weight matrix is used to calculate the distance between two features when propagating different labels. With these improvements, DWLP achieves good recognition in the experiment. As we can see in Tab. 2, the accuracy of DWLP (97.63%) is the highest across the entire dataset.

The accuracy of the online methods is higher than that of the offline method because the relative distribution of the data is stable regardless of light changes; for example, the brightness of orange is always higher than that of red. For the CRRC problem, the experimental results illustrate that the online methods, especially DWLP, perform better in varied environments. As shown in Tab. 1 and Tab. 2, the accuracy of SB-ELM across the entire dataset is only around 87%, while the accuracy of DWLP is around 97%.

5 Conclusion

CRRC is a very important issue for Rubik’s cube robots, which can also be treated as a sequential manipulation problem. Therefore, CRRC deserves in-depth analysis. In this paper, we point out the color drifting problem of CRRC. Furthermore, we construct a dataset to compare our proposed offline and online methods, which illustrates the effectiveness of the online methods. Meanwhile, to verify our viewpoint, we design a Rubik’s cube robot that works in a real-world environment. Most users, who do not know the operational principle of our Rubik’s cube robot, find the experimental results satisfactory.


  • [1] W. Kasprzak, W. Szynkiewicz, and L. Czajka, “Rubik’s cube reconstruction from single view for service robots,” Machine Graphics & Vision, vol. 15, no. 3-4, pp. 451–9, 2006.
  • [2] Herbert Kociemba, “Close to god’s algorithm,” Cubism for Fun, vol. 28, no. April, pp. 10–13, 1992.
  • [3] Richard E Korf, “Depth-first iterative-deepening: An optimal admissible tree search,” Artificial intelligence, vol. 27, no. 1, pp. 97–109, 1985.
  • [4] Richard E Korf, “Linear-time disk-based implicit graph search,” Journal of the ACM (JACM), vol. 55, no. 6, pp. 26, 2008.
  • [5] T. Rokicki, H. Kociemba, M. Davidson, and J. Dethridge, “The diameter of the rubik’s cube group is twenty,” SIAM REVIEW, vol. 56, no. 4, pp. 645–670, 2014.
  • [6] Cezary Zielinski, Tomasz Winiarski, Wojciech Szynkiewicz, Maciej Staniak, Witold Czajewski, and Tomasz Kornuta, “Mrroc++ based controller of a dual arm robot system manipulating a rubik’s cube,” 12 2018.
  • [7] Shenglan Liu, Jun Wu, Lin Feng, Yang Liu, Hong Qiao, Wenbo Luo, Muxin Sun, and Wei Wang, “Perceptual uniform descriptor and ranking on manifold: a bridge between image representation and ranking for image retrieval,” arXiv, p. 14, 2016.
  • [8] AM. Ferman, AM. Tekalp, and R. Mehrotra, “Robust color histogram descriptors for video segment retrieval and identification,” IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 11, no. 5, pp. 497–508, 2002.
  • [9] Gh. Liu and JY. Yang, “Content-based image retrieval using color difference histogram,” PATTERN RECOGNITION, vol. 46, no. 1, pp. 188–198, 2013.
  • [10] Shenglan Liu, Xiang Liu, Yang Liu, Lin Feng, Hong Qiao, Jian Zhou, and Yang Wang, “Perceptual visual interactive learning,” arXiv preprint arXiv:1810.10789, 2018.
  • [11] Martin Rajchl, Matthew C. H. Lee, Franklin Schrans, Alice Davidson, Jonathan Passerat-Palmbach, Giacomo Tarroni, Amir Alansary, Ozan Oktay, Bernhard Kainz, and Daniel Rueckert, “Learning under distributed weak supervision,” 2016.
  • [12] Ritendra Datta, Jia Li, and James Ze Wang, “Content-based image retrieval—approaches and trends of the new age,” in ACM Sigmm International Workshop on Multimedia Information Retrieval, 2005, pp. 253–262.
  • [13] Jing Huang, S Ravi Kumar, Mandar Mitra, Wei-Jing Zhu, and Ramin Zabih, “Image indexing using color correlograms,” in Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on. IEEE, 1997, pp. 762–768.
  • [14] Liang Zheng, Shengjin Wang, Ziqiong Liu, and Qi Tian, “Packing and padding: Coupled multi-index for accurate image retrieval,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 1939–1946.
  • [15] Shenglan Liu, Lin Feng, and Hong Qiao, “Scatter balance: An angle-based supervised dimensionality reduction,” IEEE transactions on neural networks and learning systems, vol. 26, no. 2, pp. 277–289, 2015.
  • [16] Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew, “Extreme learning machine: theory and applications,” Neurocomputing, vol. 70, no. 1-3, pp. 489–501, 2006.
  • [17] Xiaojin Zhu and Zoubin Ghahramani, “Learning from labeled and unlabeled data with label propagation,” 2002.