Today, as non-renewable energy sources such as oil and coal approach exhaustion, solar energy is attracting worldwide attention. Photovoltaic solar cells are the main products that convert solar energy into electric energy. In intelligent manufacturing, solar cell defect inspection is an essential step that guarantees high product quality [1, 2, 3]. Automated solar cell defect detection is one of the most direct applications of artificial intelligence algorithms in an industrial setting, and it has long been of considerable interest to both industrial and academic researchers. Traditional methods [4, 5] mainly rely on filters or feature descriptors, which have two apparent limitations. The first is poor generalization: a specific filter or feature descriptor must be selected for each task, and prior knowledge is required in the process. The second is weak robustness to disturbance, especially in complex industrial settings, where these approaches are easily affected by illumination, imaging quality, and complex backgrounds. Thus, traditional methods are inevitably being replaced by newer and better algorithms.
Recently, deep learning has attracted considerable attention from researchers. Its advantages include high accuracy, good generalization, and strong anti-interference ability. It has seen many successful industrial applications, such as face recognition, cityscape segmentation, style transfer, and defect detection [2, 9]. A deep learning algorithm automatically extracts the semantic and textural features of the input image, without a specific feature extractor being designed for each task.
GAN-based deep learning models [10, 11, 12] perform excellently at eliminating image noise: they can filter out the noise disturbance while preserving image details. Another successful application of GANs is image generation [13, 14, 15], which can be roughly divided into two settings: object-to-object and object-to-object-free (or object-free-to-object). The first is usually called style transfer, which migrates some attributes of an object in the source domain to an object in the target domain, such as apple to orange, zebra to horse, or summer to winter. The second is from "yes" to "no" (or from "no" to "yes") [19, 20], meaning that an object contained in the input image is removed in the output image of the GAN. An example of the proposed algorithm is shown in the first row of Fig. 1.
Yes2no (defect2defect-free) can be viewed as filtering: given a defective input image, the generator of the GAN filters out the defect while almost completely preserving the background texture. The generated image can then be subtracted from the input image to eliminate the background, so that only the defect region is retained. By thresholding the difference image, the defect region can be segmented.
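This subtract-and-threshold idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation; the function name and the threshold value of 40 are arbitrary choices for the toy example.

```python
import numpy as np

def segment_defect(defective, generated_defect_free, threshold=40):
    """Subtract the generated defect-free image from the input defective
    image, then threshold the absolute difference to obtain a binary
    defect mask. Both images are uint8 grayscale arrays of equal shape."""
    diff = np.abs(defective.astype(np.int16)
                  - generated_defect_free.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy example: a flat background with one dark crack-like line.
background = np.full((8, 8), 120, dtype=np.uint8)
defective = background.copy()
defective[3, :] = 30                      # simulated defect row
mask = segment_defect(defective, background)
```

If the generator truly leaves the background unchanged, the difference image is near zero everywhere except the defect, so a single global threshold suffices.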
No2yes (defect-free2defect) can be viewed as defective dataset augmentation, which plays an important role in industrial defect inspection. Industrial datasets differ from natural scene datasets such as ImageNet and COCO. In the industrial manufacturing process, defect samples are difficult to collect and account for only a small proportion of the data, whereas defect-free samples are easy to collect. Utilizing defect-free samples to generate realistic defect samples is therefore very valuable for small-sample defect detection.
To accomplish the above two tasks with a single algorithm, several problems must be considered. Firstly, one-to-one annotation of defective and defect-free images is difficult to obtain, so a supervised learning network cannot be trained. This problem can be solved by the Cycle Consistency Generative Adversarial Network (CycleGAN), which employs an adversarial loss and a cycle-consistency loss to carry out the mutual transformation of two image domains (defect and defect-free) without requiring pixel-wise annotation. Secondly, a key problem in yes2no and no2yes is how to keep the background unchanged, whether defects are removed or generated. To overcome this problem, we propose the strong identity (SI) loss to constrain the textural similarity between the input image and the generated image. In other words, except for the defect region, the background texture stays as consistent as possible before and after image generation, as shown in Fig. 1. The main contributions of this paper can be summarized as follows:
A new image generation algorithm named Strong Identity GAN (SIGAN) is proposed, which employs a new strong identity (SI) loss to generate high-quality defective or defect-free images while keeping the background almost unchanged.
This paper provides a new idea of using a GAN for defect segmentation, accomplished by thresholding the difference between the input defective image and the generated defect-free image.
SIGAN can be used not only for defect segmentation, but also for small-sample defective dataset augmentation. Experimental results show that the defect segmentation method achieves better performance than many state-of-the-art methods. Moreover, defect classification models trained with datasets augmented by SIGAN perform substantially better than those trained without augmentation.
Almost all public industrial image datasets contain only defective images, ignoring the value of defect-free images. This paper releases a solar cell EL image dataset named EL-2019, which contains 260 defective images and 280 defect-free images.
This paper is organized as follows: Section II presents an overview of related work. Section III gives the details of the proposed methods. Section IV presents extensive experiments. Finally, Section V concludes this study.
II Related Work
II-A Defect Segmentation
Defect segmentation methods can be roughly divided into two categories: filter-based methods and CNN-based methods.
In filter-based approaches, a filter is employed to remove either the high-frequency or the low-frequency information. After the defect is filtered out, the filtered image is subtracted from the original input image to accomplish defect segmentation; alternatively, the filter can directly remove the background while retaining the defects. Based on these strategies, several filter-based approaches have been proposed to segment defects in EL images. Tsai et al. introduced a Fourier image reconstruction technique to detect defects in solar cell EL images: the spatial image is first transformed into a spectral image; next, the spectral frequencies associated with the defect are removed by thresholding; finally, the defect region is identified by evaluating the gray-level differences between the original image and its reconstruction. Anwer et al. proposed an improved anisotropic diffusion filter, which smooths the image while preserving edge texture and performs well in micro-crack defect detection. Chen et al. applied a 2D Hessian-based enhancement filter to obtain linear structures and blob structures; a novel structure similarity measure (SSM) function built on the identification functions of the two structures then highlights crack defects (linear structures) while suppressing crystal grains (blob structures). This approach is effective for micro-crack identification and outperforms earlier methods. Subsequently, Chen et al. proposed a steerable evidence filter (SEF) to detect crack defects in solar cell EL images. SEF is an oriented filter that is robust to arbitrary texture orientations and is therefore better at detecting cracks of various shapes.
Recently, Convolutional Neural Networks (CNNs) have achieved good performance in image segmentation, and researchers have proposed many excellent algorithms [28, 29, 30, 31, 32, 33], such as UNet, DeepLabv3, and DANet. Different from these supervised methods, which require pixel-wise annotations for training, the approach proposed in this paper is weakly supervised and based on image generation. SIGAN does not need one-to-one corresponding annotations; it only requires two image domains (defect and defect-free) for training. It is similar in spirit to the filter-based methods: to accomplish defect segmentation, SIGAN employs a deep network to filter out the defect region while retaining the background, using the GAN to play the role of the filter.
II-B Image Generation
Many studies have employed GAN-based image generation [17, 18, 19, 20] to augment image datasets. Liu et al. proposed a multi-stage GAN to generate defective fabric images: a conditional GAN synthesizes plausible defective patches, and another GAN-based model fuses these patches into a raw high-resolution defect-free image. The generated defective images augment the dataset, which is then used to fine-tune a segmentation network to better segment the defects. To improve surface defect recognition, Niu et al. proposed a surface defect-generation adversarial network (SDGAN) to augment the defect dataset. SDGAN employs defect-free images to generate high-quality and diverse defective images, which are used to improve the classification of surface defects.
Different from the above methods, this paper provides a new way of using a GAN for solar cell defect segmentation, accomplished by thresholding the difference between the input defective image and the generated defect-free image. A key problem, however, is how to ensure the consistency of the background (except for the defect area) before and after image generation. To solve this problem, this paper proposes the strong identity (SI) loss, which constrains the background details of the generated image to be as similar as possible to those of the input image outside the defect region. As shown in Fig. 2, the gray histogram, brightness, and contrast of the same background region are close to each other before and after image generation with the proposed SIGAN. The background thus remains almost unchanged while the defective area is removed, which verifies the effectiveness of the SI loss. Since the proposed SIGAN accomplishes the mutual transformation between defective and defect-free images, it can also be used for defective dataset augmentation (given a defect-free input, SIGAN generates a defective image), which is validated by the image classification results in this paper.
III The Proposed Method
This section first introduces the architecture of the proposed SIGAN, which can generate defective or defect-free images. Next, we present the details of the defect segmentation approach based on defect2defect-free and the defect augmentation approach based on defect-free2defect.
III-A The SIGAN
The goal of the proposed SIGAN is to accomplish the mutual transformation between defective and defect-free images (defect2defect-free and defect-free2defect) while keeping the image background almost unchanged. SIGAN includes three types of losses: the adversarial loss matches the distribution of the generated images to the data distribution of the target domain; the cycle consistency loss guarantees the mutual transformation between defective and defect-free images; and the proposed strong identity (SI) loss constrains the generated image to be as similar as possible to the input image except for the defect region. The network architectures of the generator and discriminator are then introduced in detail.
III-A1 Adversarial Loss
Adversarial loss plays a vital role in cross-domain image generation. For the mapping from source domain A to target domain B, the adversarial loss is defined as follows:

\mathcal{L}_{GAN}(G, D_B, A, B) = \mathbb{E}_{b \sim p_{data}(b)}[\log D_B(b)] + \mathbb{E}_{a \sim p_{data}(a)}[\log(1 - D_B(G(a)))]
where p_{data}(a) denotes the image distribution of domain A and p_{data}(b) denotes the image distribution of domain B. As shown in Fig. 3, given a source domain A and a target domain B, the generator G tries to make the generated image G(a) as similar as possible to the images in target domain B, while the discriminator D_B aims to distinguish the generated image G(a) from the real domain image b. In short, the generator G tries to cheat the discriminator D_B, but the discriminator tries not to let the generated fake image be classified as real. This is a min-max problem, i.e., \min_G \max_{D_B} \mathcal{L}_{GAN}(G, D_B, A, B): during training, the generator minimizes the loss while the discriminator maximizes it. The mapping from domain B to domain A is defined analogously, i.e., \min_F \max_{D_A} \mathcal{L}_{GAN}(F, D_A, B, A).
III-A2 Cycle Consistency Loss
Cycle consistency loss ensures that SIGAN accomplishes the mutual transformation between domains A and B simultaneously: 1) defect-free domain A to defective domain B, and 2) defective domain B to defect-free domain A. It is defined as follows:

\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{a \sim p_{data}(a)}[\|F(G(a)) - a\|_1] + \mathbb{E}_{b \sim p_{data}(b)}[\|G(F(b)) - b\|_1]
where \|\cdot\|_1 denotes the L1 norm. For example, a \rightarrow G(a) \rightarrow F(G(a)) \approx a is a cycle, which brings the image G(a) generated by generator G back to the original image a through generator F, as shown in Fig. 3. We want the reconstructed image F(G(a)) to be as similar as possible to the original input image a; thus, the L1 norm is employed to measure the cycle consistency between F(G(a)) and a. The reverse cycle b \rightarrow F(b) \rightarrow G(F(b)) \approx b should also be satisfied.
For a given image, no ground truth is required in the training process of SIGAN; instead, only the two domains (defect and defect-free) are needed. The cycle consistency loss is one of the key components that allow the two domains to be transformed into each other without one-to-one corresponding annotations for training.
III-A3 Strong Identity Loss
Whether generating a defective or a defect-free image, a key problem in solar cell EL image generation is how to ensure the consistency of the background (except for the defect area) before and after generation. To solve this problem, this paper proposes the strong identity (SI) loss, which constrains the background of the generated image to be as similar as possible to that of the input image. The SI loss plays a vital role in the defect segmentation task, where the generated defect-free image is subtracted from the original defective image so that the defect region is retained and the background is eliminated. For generator G, the SI loss is denoted as:

\mathcal{L}_{SI}(G) = \mathbb{E}_{a \sim p_{data}(a)}[\|G(a) - a\|_1] + \mathbb{E}_{b \sim p_{data}(b)}[\|G(b) - b\|_1]
where a and b are the inputs and G(a) and G(b) are the outputs of generator G. For generator F, the SI loss is denoted as:

\mathcal{L}_{SI}(F) = \mathbb{E}_{a \sim p_{data}(a)}[\|F(a) - a\|_1] + \mathbb{E}_{b \sim p_{data}(b)}[\|F(b) - b\|_1]
where a and b are the inputs and F(a) and F(b) are the outputs of generator F. In the experiments, the L1 norm is utilized to measure the SI loss. There are two reasons for applying the SI loss between the input and output of each generator to keep the background almost unchanged. Firstly, the adversarial loss magnifies the difference between the generated image and the original input image to accomplish the domain transformation, while the SI loss pulls the two images together, so that training reaches a compromise between the two losses. Secondly, as shown in Fig. 2, the gray histogram of the input image is similar to that of the defective image generated by SIGAN. For each generator, we want the data distributions of the input and output images to be as consistent as possible; thus, the L1 norm is utilized to measure the consistency before and after image generation.
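The effect of the identity term can be illustrated numerically: an edit confined to a small defect region incurs a much smaller L1 penalty than a wholesale background change. The sketch below uses made-up toy images; only the L1 averaging mirrors the loss definition above.

```python
import numpy as np

def si_loss(inp, out):
    """Per-pixel L1 identity term ||G(x) - x||_1 averaged over the image
    (a sketch of one term; the full SI loss sums this over both inputs
    of each generator)."""
    return float(np.mean(np.abs(out.astype(float) - inp.astype(float))))

# If the generator only edits a small defect region, the SI loss stays
# small; changing the whole background is penalized proportionally.
img = np.full((16, 16), 100.0)
defect_removed = img.copy()
defect_removed[8, :] += 80        # edit a single row (the "defect")
background_shift = img + 80       # edit every pixel
```

Here `si_loss(img, defect_removed)` is 5.0 (one edited row out of 16) versus 80.0 for the global shift, which is why the SI loss discourages background drift without forbidding local defect edits.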
III-A4 Total Loss
The total SIGAN loss consists of the adversarial loss, the cycle consistency loss, and the SI loss, and is defined as follows:

\mathcal{L}(G, F, D_A, D_B) = \mathcal{L}_{GAN}(G, D_B, A, B) + \mathcal{L}_{GAN}(F, D_A, B, A) + \lambda_1 \mathcal{L}_{cyc}(G, F) + \lambda_2 (\mathcal{L}_{SI}(G) + \mathcal{L}_{SI}(F))
where the terms \lambda_1 and \lambda_2 represent the relative importance of the cycle consistency loss and the SI loss with respect to the adversarial loss. In the total loss, \mathcal{L}_{GAN} guarantees that one domain can be transformed into the other; \mathcal{L}_{cyc} ensures that the two domains can be converted into each other (defect2defect-free and defect-free2defect); and \mathcal{L}_{SI} makes the generated image similar to the input image. Note that domain A and domain B must have similar data distributions, which is an important precondition for the application of the proposed method.
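Putting the pieces together, the weighted objective can be sketched as a simple combination. The function name and the example loss values are illustrative; the weights λ1 = 10 and λ2 = 5 follow the settings reported later in Section IV-C.

```python
def sigan_total_loss(l_gan_g, l_gan_f, l_cyc, l_si_g, l_si_f,
                     lam1=10.0, lam2=5.0):
    """Total SIGAN objective: both adversarial terms, the cycle
    consistency loss weighted by lam1, and the two SI losses
    weighted by lam2."""
    return l_gan_g + l_gan_f + lam1 * l_cyc + lam2 * (l_si_g + l_si_f)

# Example with made-up component values:
total = sigan_total_loss(0.7, 0.6, 0.05, 0.02, 0.03)
```

The large λ1 keeps the cycle constraint dominant so the domain mapping stays invertible, while λ2 trades background fidelity against the adversarial pressure to change the image.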
III-A5 Generator Network
Many previous works [34, 35] adopt UNet as the generator architecture, and it is employed here by both generators G and F. As shown in Fig. 4, UNet is an encoder-decoder network: the input features are downsampled layer by layer until a bottleneck, and the process is then reversed. Low-level layers contain more textural features and high-level layers contain more semantic features; the skip connections ensure that the high-level layers contain rich texture and semantic features simultaneously.
Furthermore, we introduce a non-local module at the end of UNet to help the generator reconstruct more realistic images. The non-local module has been successfully applied to capture global context information for pixel-based image generation [37, 38, 39]: each output pixel is generated based on its relationships with all other pixels. It is defined as:

y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j) g(x_j)
where x is the input feature, y is the output feature of the non-local module, f(\cdot, \cdot) computes the pairwise relationship between positions i and j, g(\cdot) is a feature embedding, and C(x) is a normalization factor.
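The operation can be sketched in NumPy for a flattened feature map. This is a simplification in which the learnable embeddings are taken as the identity (the real module learns projection weights for f and g); the softmax of pairwise dot products plays the role of f(x_i, x_j)/C(x).

```python
import numpy as np

def non_local_block(x):
    """Minimal embedded-Gaussian non-local operation on a feature map x
    of shape (N, C): each output position is a weighted sum over all
    positions, with weights given by a softmax of pairwise dot products.
    Identity embeddings are assumed for brevity."""
    sim = x @ x.T                            # (N, N) pairwise similarities
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stabilization
    w = np.exp(sim)
    w /= w.sum(axis=1, keepdims=True)        # softmax = f / C(x)
    return w @ x                             # y_i = sum_j w_ij * g(x_j)

features = np.random.rand(6, 4)
out = non_local_block(features)              # same shape as the input
```

Because every output row mixes information from all N positions, the receptive field is global in a single step, unlike stacked convolutions.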
III-A6 Discriminator Network
The discriminator network contains five convolutional layers, which classify an image as real or fake; its purpose is to avoid being cheated by the generator. As shown in Fig. 3, discriminator D_B tries to distinguish the generated image G(a) from the real image b, and discriminator D_A tries to distinguish the generated image F(b) from the real image a. Table I presents the network details of the discriminator: batch normalization (BN) and LeakyReLU are applied to normalize and activate the convolutional layers, respectively. This discriminator has low computational complexity and is effective at discriminating images. Moreover, as mentioned above, discriminators D_A and D_B share the same network architecture.
III-B Defect Segmentation Approach Based on Defect2defect-free
As shown in Fig. 5, when a defective image is input to generator F, a defect-free image is output. Generator F acts like a filter that removes the defect region while retaining the background. The defect region can then be acquired by subtracting the output defect-free image from the input defective image; by setting an appropriate gray intensity threshold, the defect region can be detected. An important premise is that the background must not change much during image generation, so that it can be eliminated by the subtraction. The proposed SI loss plays a vital role in keeping the background almost unchanged: the adversarial loss magnifies the difference between the generated image and the original input to accomplish the domain transformation, while the SI loss pulls the difference down, and the two reach a compromise.
III-C Defect Augmentation Based on Defect-free2defect
Generator G can transform a defect-free image into a defective image, which can be used to augment a small-sample dataset. Deep CNN models are data-driven and require a large amount of data for training; however, defective images are very difficult to obtain in the industrial manufacturing process. Thus, using defect-free images to generate defective images is of great research value. Our method provides a solution that employs easily available defect-free images to generate defective images under complex background disturbance. As shown in Fig. 6, when a defect-free image with a random textural distribution is input, the generator outputs a defective crack image that is difficult to distinguish from a real one with the naked eye. To validate the effectiveness of the generated dataset, we combine the generated defective images with the original dataset to train and test a classification model; the detailed experimental results are presented in Section IV-F.
IV Experiments
In this section, several experiments are carried out to evaluate the performance of the proposed SIGAN. Firstly, we introduce the dataset distribution. Secondly, the evaluation metrics used to quantify the experimental results are presented. Thirdly, we describe the detailed experimental settings. Finally, the experimental evaluations of image generation, defect segmentation, and defect augmentation are presented.
IV-A Dataset
The dataset used to evaluate the proposed method was collected during the manufacturing process of multicrystalline solar cells. The solar cell EL images were captured by a WP-US146 near-infrared camera with a SONY ICX825 chip. The raw images, with a resolution of 1024×1024 pixels, are cropped into patches of 128×128 pixels. The defects cover two categories, i.e., crack and finger interruption. As shown in Fig. 7, a crack defect appears as a linear shape randomly distributed in the image, while a finger interruption defect presents a cylindrical shape located vertically in the image. Both types of defects are easily disturbed by the complex background, which shows dark and irregular areas. Moreover, as shown in Fig. 7(a), the defect-free images also contain heterogeneous complex background disturbance.
The dataset distribution is illustrated in Table II. This dataset, which we name EL-2019, is employed to evaluate the performance of the image generation and defect segmentation tasks. For image preprocessing, the defective image set alone is insufficient to train a general model, and too little training data leads to overfitting. Thus, we use mirroring, flipping, and contrast normalization to augment the defective images. The solar cell EL image dataset is available at: https://github.com/binyisu/EL-2019.
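The augmentation operations mentioned above can be sketched as follows. This is a minimal NumPy illustration of mirroring, flipping, and min-max contrast normalization; the actual preprocessing pipeline may differ in its parameters.

```python
import numpy as np

def augment(img):
    """Return simple augmentations of a uint8 grayscale image:
    horizontal mirror, vertical flip, and min-max contrast
    normalization stretched to the full [0, 255] range."""
    mirrored = img[:, ::-1]
    flipped = img[::-1, :]
    f = img.astype(np.float64)
    stretched = (f - f.min()) / max(f.max() - f.min(), 1e-8) * 255.0
    return [mirrored, flipped, stretched.astype(np.uint8)]

sample = np.arange(16, dtype=np.uint8).reshape(4, 4)
augmented = augment(sample)
```

Each geometric augmentation preserves defect shape statistics while multiplying the effective number of defective training samples.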
IV-B Evaluation Metrics
The Frechet Inception Distance (FID) is a metric that calculates the distance between feature vectors computed for real and generated images, summarizing how similar the two groups of image features are. The features are extracted with the Inceptionv3 image classification model. Lower scores indicate that the two groups of images are more similar, i.e., have more similar statistics; a perfect score of 0.0 means the two groups are identical. The FID score is widely used to evaluate the quality of images generated by GANs, and lower scores have been shown to correlate well with higher image quality. It is defined as:

FID = \|\mu_r - \mu_g\|^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\right)
where \mathrm{Tr}(\cdot) is the sum of the elements on the diagonal of the matrix, \mu_r and \mu_g are the mean values, and \Sigma_r and \Sigma_g the covariances, of the real image features and the generated image features extracted by the pre-trained Inceptionv3 model.
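Given the two feature sets, the FID formula above can be computed with NumPy alone. The trace of the matrix square root is obtained from the eigenvalues of \Sigma_r \Sigma_g, which is similar to the positive semi-definite matrix \Sigma_r^{1/2} \Sigma_g \Sigma_r^{1/2} and therefore has non-negative real eigenvalues; this is a sketch, and production code often uses `scipy.linalg.sqrtm` instead.

```python
import numpy as np

def fid(feat_real, feat_gen):
    """Frechet Inception Distance between two feature sets of shape
    (n_samples, n_features):
    ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^{1/2})."""
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_g = np.cov(feat_gen, rowvar=False)
    # Tr((Sigma_r Sigma_g)^{1/2}) = sum of sqrt of its eigenvalues.
    eig = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0, None)).sum()
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)
```

Identical feature sets give a score of (numerically) zero, matching the "perfect score of 0.0" interpretation above.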
| Symbol | Definition |
| N_gt | the number of defect pixels in the ground truth |
| N_dt | the number of pixels detected by the segmentation method |
| N_tp | the number of common pixels in N_gt and N_dt |
To evaluate the performance of defect segmentation, completeness (cpt), correctness (crt), and F-score are introduced. As illustrated in Table III, N_gt denotes the number of defect pixels in the ground truth, N_dt represents the number of pixels detected by the segmentation algorithm, and N_tp is the number of pixels common to N_gt and N_dt. The term cpt = N_tp / N_gt illustrates the completeness of the segmentation method, and the term crt = N_tp / N_dt indicates the correctness of the segmentation results. The F-score = 2 · cpt · crt / (cpt + crt) comprehensively evaluates the performance of a segmentation algorithm as the weighted harmonic mean of the cpt and crt indicators. The higher the F-score, the more effective the segmentation algorithm.
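These three indicators can be computed directly from binary masks, as in the NumPy sketch below, where the ground-truth pixel count, detected pixel count, and their intersection correspond to the quantities in Table III. The function name is illustrative.

```python
import numpy as np

def segmentation_scores(gt_mask, pred_mask):
    """Completeness, correctness, and F-score from binary masks:
    cpt = |TP| / |GT|, crt = |TP| / |Pred|,
    F = 2 * cpt * crt / (cpt + crt)."""
    gt = gt_mask.astype(bool)
    pred = pred_mask.astype(bool)
    tp = np.logical_and(gt, pred).sum()
    cpt = tp / max(gt.sum(), 1)          # guard against empty masks
    crt = tp / max(pred.sum(), 1)
    f = 2 * cpt * crt / max(cpt + crt, 1e-12)
    return cpt, crt, f

# Toy example: 4 ground-truth pixels, prediction hits 2 and adds 2 extra.
gt = np.zeros((4, 4), dtype=np.uint8); gt[1, :] = 1
pred = np.zeros((4, 4), dtype=np.uint8); pred[1, :2] = 1; pred[2, :2] = 1
cpt, crt, f = segmentation_scores(gt, pred)
```

A method that over-segments scores high cpt but low crt, and vice versa; the harmonic mean penalizes either imbalance.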
IV-C Implementation Details
The proposed method is validated on a server with an Intel Core i7-10700 CPU and an NVIDIA GeForce RTX3090 GPU. The batch size is set to 4. In the total SIGAN loss, the balancing terms \lambda_1 and \lambda_2 are set to 10 and 5, respectively. A learning rate of 0.0002 is used to train all networks; it remains unchanged for the first 30 epochs and is then linearly decreased to zero over the next 30 epochs. The parameters are optimized with the adaptive moment estimation (Adam) algorithm, which dynamically adjusts the learning rate to prevent parameter oscillation. Moreover, before training, all images are normalized and resized to 256×256 pixels.
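The learning-rate schedule just described can be written as a small helper, shown here as an illustrative sketch of the stated schedule (constant for 30 epochs, then a linear decay to zero over the next 30).

```python
def learning_rate(epoch, base_lr=2e-4, hold_epochs=30, decay_epochs=30):
    """Piecewise schedule: base_lr for the first `hold_epochs` epochs,
    then a linear decay to zero over the following `decay_epochs`."""
    if epoch < hold_epochs:
        return base_lr
    step = epoch - hold_epochs
    return base_lr * max(0.0, 1.0 - step / decay_epochs)
```

For example, the rate is 2e-4 through epoch 29, halves by epoch 45, and reaches zero at epoch 60.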
IV-D Image Generation
Image generation with the proposed SIGAN plays a vital role in the two defect inspection tasks that follow: defect segmentation based on defect2defect-free and defect augmentation based on defect-free2defect, as shown in Fig. 6 and Fig. 7. The following subsection quantifies the quality of image generation using the FID indicator.
The proposed SIGAN is compared with different loss-function configurations for defect-free image generation on the EL-2019 dataset. As illustrated in Table IV, by introducing the strong identity loss \mathcal{L}_{SI} into SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}) (CycleGAN is equivalent to SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc})), the FID score of defect-free image generation is improved from 102.88 to 86.33. The reason is that the proposed SI loss makes the backgrounds of the two image domains consistent, including color, texture, brightness, and contrast. The defective image background is retained and only the defect area is removed, which greatly improves the authenticity of the generated defect-free image. Thus, SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc} + \mathcal{L}_{SI}) achieves better performance than SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}).
Fig. 8 provides a qualitative analysis of the visual improvement. Fig. 8(a), (b), and (c) show the original defective images, the defect-free images generated by SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}), and the defect-free images generated by SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc} + \mathcal{L}_{SI}), respectively. Comparing (a) with (b), it is clear that the background changes along with the removal of the defective region: without the SI loss constraint, SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}) has too much freedom, which alters the background of the generated image. In contrast, comparing (a) with (c), the background texture is almost completely retained while the defect is almost completely removed. Furthermore, the brightness and contrast are closer before and after image generation, which proves the effectiveness of the SI loss in the proposed SIGAN. Defect2defect-free is the basis for defect segmentation.
For crack and finger interruption defective image generation, as illustrated in Table IV, the FID scores are improved from 156.25 and 91.12 to 100.05 and 77.84, respectively, indicating that SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc} + \mathcal{L}_{SI}) generates more realistic images than SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}). Moreover, as shown in Fig. 9, we employ the same three defect-free images (yellow box and blue box) to generate crack and finger interruption defects. For crack defects, the defective images generated by SIGAN (\mathcal{L}_{GAN} + \mathcal{L}_{cyc}) lack diversity, and the brightness and contrast change considerably. The proposed SIGAN avoids these problems: it adds the defect region while keeping the image background almost unchanged. This shows that it is feasible to use a defect-free image to generate a fairly realistic defective image, which is of great significance for small-sample data augmentation in industrial defect inspection.
IV-E Defect Segmentation Based on Defect2defect-free
For defect segmentation, the proposed SIGAN removes the defect region while almost completely retaining the background. The defect region is then obtained by subtracting the generated defect-free image from the input defective image. The detection process is similar to that of filter-based spectral-domain methods, which inspect defects through their spectral responses.
IV-E1 Qualitative Evaluation
In the qualitative evaluation shown in Fig. 10, the novel SIGAN is compared with Tsai's Fast Fourier Transform (FFT) method, the Gabor filter, Chen's Steerable Evidence Filter (SEF) method, and CycleGAN. Some key parameters of the four comparison methods are tuned to achieve their best performance. Seven defective images with different shapes, contrasts, and brightness levels are presented in the first column of Fig. 10: (a)-(d) are crack images and (e)-(g) are finger interruption images. The segmentation results are shown in the following five columns, and the ground truths are listed in the last column.
Tsai's FFT method assumes that the defect texture is high-frequency information and the background is low-frequency. However, some background textures are also high-frequency, such as the dark strip at the bottom of Fig. 10 (e1) and the heterogeneous background in Fig. 10 (f1); thus, the FFT method easily mistakes background for defect. The Gabor method is a linear filter for edge detection whose frequency and orientation representations are similar to those of the human visual system, and it is well suited to representing and separating textural edges. To achieve better segmentation performance, we employ convolution kernels with 8 orientations and 6 scales to filter the image. Because crack defect edges are clear, the crack segmentation results are better than those of the FFT. However, as shown in Fig. 10 (f2), the edges of finger interruption defects are frequently confused with the background, so background is easily segmented as defect, and the Gabor results remain unsatisfactory. Among the filter-based methods, Chen's SEF achieves the best segmentation results. SEF is an efficient oriented filter; to improve robustness to intensity variation, it takes spatial distance into account in addition to orientation. As shown in Fig. 10 (d2), SEF performs better than the other filters under weaker intensity changes. Moreover, as shown in Fig. 10 (f3) and (g3), its segmentation of finger interruption defects with weak edge intensity changes is better than that of FFT and Gabor. Nevertheless, some background regions are still detected as defects.
The CycleGAN method also uses the difference between the images before and after generation to achieve defect segmentation. Owing to the absence of an identity loss constraint, the background changes too much; the results are shown in Fig. 10 (a4)-(g4). Compared with the above four approaches, the novel SIGAN extracts the defect region more accurately and completely without background disturbance, as presented in Fig. 10 (a5)-(g5). Compared with FFT, Gabor, SEF, and CycleGAN, SIGAN better removes the complex background while retaining a more complete defect area, which verifies its effectiveness for defect segmentation.
IV-E2 Quantitative Evaluation
The quantitative evaluation is presented in Table V. This experiment evaluates the two defect types (crack and finger interruption) with five approaches, using the cpt, crt, and F-score metrics. Tsai's FFT method performs badly on finger interruption defects (38.30% F-score), because the gray intensity of a finger interruption is close to that of the complex background, which easily confuses the FFT filter. The Gabor filter uses convolution kernels with 8 orientations and 6 scales and responds well to edge gradients; however, the edge gradients of the complex background are similar to the defect gradients, so it also achieves a poor segmentation result (54.97% total F-score). The recently proposed steerable evidence filter (SEF) obtains a 69.08% total F-score, higher than the other filters: linear crack defects are highlighted against the surrounding complex background because SEF responds strongly to linear textures.
For CycleGAN, the segmentation result on crack defects is poor (17.89% F-score). The reason, as shown in Fig. 8, is that although the crack defect can be partially removed, the background texture of the generated defect-free image changes considerably, so the segmentation results contain much more background, which reduces the defect inspection performance. For the proposed SIGAN, the strong identity loss ensures that the backgrounds before and after generation are as similar as possible, so SIGAN keeps the background almost unchanged while removing the defects. The total cpt, crt, and F-score of SIGAN are 92.03%, 88.73%, and 90.34%, respectively, better than the other four methods, which validates the effectiveness of the proposed method. The F-score values of the different methods are also shown in Fig. 11.
IV-E.3 Time Efficiency
The time efficiency evaluation is presented in Table VI and was conducted on a server with an Intel Core i7-10700 CPU and an NVIDIA GeForce RTX 3090. Note that the GAN-based approaches use the GPU to process the image. Owing to GPU acceleration, SIGAN processes an image in 62 ms, which is no slower than the other methods.
| Method | FFT | Gabor | SEF | CycleGAN | SIGAN |
| --- | --- | --- | --- | --- | --- |
| Average time (ms) | 72 | 94 | 86 | 62 | 62 |
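A measurement of this kind can be reproduced with a simple timing harness. This is a sketch: the warm-up and run counts are illustrative, and for GPU models one would additionally synchronize the device before reading the clock.

```python
import time

def avg_time_ms(fn, arg, n_warmup=5, n_runs=50):
    """Average wall-clock time of fn(arg) in milliseconds."""
    for _ in range(n_warmup):          # exclude one-off costs (caches, GPU init)
        fn(arg)
    t0 = time.perf_counter()
    for _ in range(n_runs):
        fn(arg)
    return (time.perf_counter() - t0) * 1000.0 / n_runs

# Example with a trivial stand-in for a segmentation model
elapsed = avg_time_ms(lambda x: sum(x), list(range(1000)))
```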
[Table VII header: Generated dataset — Defect-free / Crack / Finger interruption]
[Table VIII header: Defect Type / Method / Original dataset / Augmented dataset]
IV-F Defect Augmentation Based on Defect-free2defect
As listed in Table VII, the augmented dataset is used to validate the defective-image augmentation effect of SIGAN. The training data include 50 real crack and 50 real finger-interruption defect images. We apply 200 defect-free images to generate 200 fake crack images and 200 fake finger-interruption images, which are employed to augment the small-sample dataset. To validate the effectiveness of defective-image augmentation with SIGAN, the augmented dataset is compared with the original dataset on an image classification task. Classification performance is evaluated with precision, recall, and F-measure, which have meanings similar to cpt, crt, and F-score in Table III, respectively, except that pixels are replaced by images.
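For reference, the image-level indicators reduce to the standard formulas over true positives, false positives, and false negatives. The counts in the example below are made up for illustration.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from image-level counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f = (2 * precision * recall / (precision + recall)
         if (precision + recall) else 0.0)
    return precision, recall, f

p, r, f = precision_recall_f(tp=45, fp=5, fn=5)  # hypothetical counts -> 0.9 each
```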
The experimental results are presented in Table VIII. Four CNN-based methods (ResNet50, MobileNet, InceptionV3, and DenseNet121) are used to evaluate the image classification results. With the augmented dataset, the total F-measure of the four methods increases by 0.83%, 0.91%, 0.46%, and 0.79%, respectively. In the experiments, SIGAN-based data augmentation significantly increases the recognition rate of solar cell EL defects, mainly owing to improvements in the data distribution and the training process. First, regarding the data distribution, augmentation with SIGAN makes the border between the defect and defect-free distributions clearer, which improves recognition of samples near the border, such as images with weak defect features caused by poor or uneven lighting. Images generated by SIGAN have better quality and diversity, so the proposed method sharpens the border and further reduces misclassification. Second, regarding the training process, the augmented data distribution is more diverse, so the models are less prone to overfitting during training, which increases recognition accuracy.
In this paper, a novel SIGAN is proposed for solar cell EL defect segmentation and augmentation without paired labeled images for training. It provides a new idea for applying GANs to industrial defect segmentation. SIGAN can generate both defective and defect-free images while keeping the complex background almost unchanged. The generated defect-free image can be subtracted from the input defective image to realize defect segmentation. Moreover, the generated defective images can be used to augment solar cell EL image datasets, which can then be employed to fine-tune CNN models. Experimental results show that the proposed SIGAN outperforms other methods on defect segmentation in EL images and is simultaneously effective for augmenting small-sample solar cell defect datasets.
A limitation of SIGAN is that it can only generate EL image patches of 256×256 pixels; generating raw EL images of 1024×1024 pixels is left for future work.
-  D. Tsai, S. Wu and W. Chiu, “Defect Detection in Solar Modules Using ICA Basis Images,” IEEE Trans. Ind. Inform., vol. 9, no. 1, pp. 122–131, Feb. 2013.
-  B. Su, H. Chen, P. Chen, G. Bian, K. Liu and W. Liu, “Deep Learning-Based Solar-Cell Manufacturing Defect Detection With Complementary Attention Network,” IEEE Trans. Ind. Inform., vol. 17, no. 6, pp. 4084–4095, Jun. 2021.
-  B. Su, H. Chen, K. Liu and W. Liu, “RCAG-Net: Residual Channelwise Attention Gate Network for Hot Spot Defect Detection of Photovoltaic Farms,” IEEE Trans. Instrum. Meas., vol. 70, no. 2, pp. 1-14, 2021.
-  B. Su, H. Chen, Y. Zhu, W. Liu and K. Liu, “Classification of Manufacturing Defects in Multicrystalline Solar Cells With Novel Feature Descriptor,” IEEE Trans. Instrum. Meas., vol. 68, no. 12, pp. 4675–4688, Dec. 2019.
-  K. Liu, H. Yan, K. Meng, H. Chen and H. Sajid, “Iterating Tensor Voting: A Perceptual Grouping Approach for Crack Detection on EL Images,” IEEE Trans. Autom. Sci. Eng., doi: 10.1109/TASE.2020.2988314.
-  R. He, J. Cao, L. Song, Z. Sun and T. Tan, “Adversarial Cross-Spectral Face Completion for NIR-VIS Face Recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 5, pp. 1025–1037, May. 2020.
-  Q. Geng, H. Zhang, X. Qi, G. Huang, R. Yang and Z. Zhou, “Gated Path Selection Network for Semantic Segmentation,” IEEE Trans. Image Process., vol. 30, no. 1, pp. 2436–2449, 2021.
-  M. Cheng, X. Liu, J. Wang, S. Lu, Y. Lai and P. Rosin, “Structure-Preserving Neural Style Transfer,” IEEE Trans. Image Process., vol. 29, no. 8, pp. 909–920, Aug. 2020.
-  B. Su, H. Chen, and Z. Zhou, “BAF-Detector: An Efficient CNN-Based Detector for Photovoltaic Cell Defect Detection,” IEEE Trans. Ind. Electron., doi: 10.1109/TIE.2021.3070507.
-  I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., “Generative Adversarial Networks,” in Proc. Adv. Neural Inf. Process. Syst., Jun. 2014, pp. 2672–2680.
-  Z. Li, J. Huang, L. Yu, Y. Chi and M. Jin, “Low-Dose CT Image Denoising Using Cycle-Consistent Adversarial Networks,” in Proc. IEEE Nucl. Sci. Symp. Med. Imaging Conf., Apr. 2019, pp. 1-3.
-  H. Zhang, V. Sindagi and V. M. Patel, “Image De-Raining Using a Conditional Generative Adversarial Network,” IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 11, pp. 3943-3956, Nov. 2020.
-  X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao and C. Chen, “ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks,” in Proc. Eur. Conf. Comput. Vis., Sep. 2019, pp. 63–79.
-  L. Rout, I. Misra, S. M. Moorthi and D. Dhar, “S2A: Wasserstein GAN with Spatio-Spectral Laplacian Attention for Multi-Spectral Band Synthesis,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jul. 2020, pp. 727-736.
-  Z. Yi, Z. Chen, H. Cai, W. Mao, M. Gong and H. Zhang, “BSD-GAN: Branched Generative Adversarial Network for Scale-Disentangled Representation Learning and Image Synthesis,” IEEE Trans. Image Process., vol. 29, no. 8, pp. 9073-9083, Aug. 2020.
-  Y. Zhang, Y. Zhang and W. Cai, “A Unified Framework for Generalizable Style Transfer: Style and Content Separation,” IEEE Trans. Image Process., vol. 29, no. 1, pp. 4085–4098, Jan. 2020.
-  K. Liu, Y. Li, J. Yang, Y. Liu and Y. Yao, “Generative Principal Component Thermography for Enhanced Defect Detection and Analysis,” IEEE Trans. Instrum. Meas., vol. 69, no. 10, pp. 8261-8269, Oct. 2020.
-  F. Yu, X. Wu, J. Chen and L. Duan, “Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning,” IEEE Trans. Image Process., vol. 28, no. 11, pp. 5308–5321, Nov. 2019.
-  J. Liu, C. Wang, H. Su, B. Du and D. Tao, “Multistage GAN for Fabric Defect Detection,” IEEE Trans. Image Process., vol. 29, no. 12, pp. 3388–3400, Jan. 2020.
-  S. Niu, B. Li, X. Wang and H. Lin, “Defect Image Sample Generation With GAN for Improving Defect Recognition,” IEEE Trans. Autom. Sci. Eng., vol. 17, no. 3, pp. 1611–1622, Jul. 2020.
-  D. Jia, D. Wei, R. Socher, et al., “ImageNet: A large-scale hierarchical image database,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Aug. 2009, pp. 248–255.
-  O. Vinyals, A. Toshev, S. Bengio and D. Erhan, “Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 4, pp. 652–663, Apr. 2017.
-  J. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Dec. 2017, pp. 2242-2251.
-  D. Tsai, S. Wu, and W. Li, “Defect detection of solar cells in electroluminescence images using Fourier image reconstruction,” Sol. Energy Mater. Sol. Cells., vol. 99, no. 4, pp. 250-262, Apr. 2012.
-  S. A. Anwar and M. Z. Abdullah, “Micro-crack detection of multicrystalline solar cells featuring an improved anisotropic diffusion filter and image segmentation technique,” EURASIP J. Image Video Process., vol. 2014, no. 1, pp. 1-17, Dec. 2014.
-  H. Chen, H. Zhao, D. Han, W. Liu, P. Chen, and K. Liu, “Structure-aware-based crack defect detection for multicrystalline solar cells”, Measurement, vol. 151, no. 2, pp. 101–115, Feb. 2020.
-  H. Chen, H. Zhao, D. Han, and W. Liu, “Accurate and robust crack detection using steerable evidence filtering in electroluminescence images of solar cells,” Opt. Lasers Eng., vol. 118, no. 19, pp. 22-33, Jul. 2019.
-  G. Wang, M. Zuluaga, and W. Li et al., “DeepIGeoS: A Deep Interactive Geodesic Framework for Medical Image Segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 7, pp. 1559-1572, Jul. 2019.
-  Q. Yu, Y. Shi, J. Sun, Y. Gao, J. Zhu and Y. Dai, “Crossbar-Net: A Novel Convolutional Neural Network for Kidney Tumor Segmentation in CT Images,” IEEE Trans. Image Process., vol. 28, no. 8, pp. 4060-4074, Aug. 2019.
-  T. Nakazawa and D. V. Kulkarni, “Anomaly detection and segmentation for wafer defect patterns using deep convolutional encoder-decoder neural network architectures in semiconductor manufacturing,” IEEE Trans. Semicond. Manuf., vol. 32, no. 2, pp. 250–256, May 2019.
-  O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., Nov. 2015, pp. 234–241.
-  L. Chen, Y. Zhu, F. Schroff, and H. Adam, “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation,” in Proc. Eur. Conf. Comput. Vis., Aug. 2018, pp. 833–851.
-  J. Fu et al., “Dual Attention Network for Scene Segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2019, pp. 3146–3154.
-  P. Isola, J. Zhu,T. Zhou, and A. Efros, “Image-to-Image Translation with Conditional Adversarial Networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Nov. 2017, pp. 5967–5976.
-  T. T. Nguyen, T. N. Chi, M. D. Hoang, H. N. Thai and T. N. Duc, “3D Unet Generative Adversarial Network for Attenuation Correction of SPECT Images,” in Proc. Int. Conf. Rec. Adv. Signal Process., TeleCommun. & Comput., Aug. 2020, pp. 93-97.
-  X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Dec. 2018, pp. 7794–7803.
-  H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, “Self-Attention Generative Adversarial Networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Jun. 2019, pp. 12744–12753.
-  Z. Mi, X. Jiang, T. Sun and K. Xu, “GAN-Generated Image Detection With Self-Attention Mechanism Against GAN Generator Defect,” IEEE J. Sel. Top. Signal Process., vol. 14, no. 5, pp. 969–981, Aug. 2020.
-  Z. Wu, J. Li, Y. Wang, Z. Hu and M. Molinier, “Self-Attentive Generative Adversarial Network for Cloud Detection in High Resolution Remote Sensing Images,” IEEE Geosci. Remote Sens. Lett., vol. 17, no. 10, pp. 1792–1796, Oct. 2020.
-  M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “Gans trained by a two time-scale update rule converge to a local Nash equilibrium,” in Proc. Adv. Neural Inf. Process. Syst., Dec. 2017, pp. 6626–6637.
-  D. P. Kingma and J. L. Ba, “Adam: A Method for Stochastic Optimization,” in Proc. Int. Conf. Learn. Represent., Jan. 2015, pp. 1–15.
-  D. Dunn and W. Higgins, “Optimal Gabor filters for texture segmentation,” IEEE Trans. Image Process., vol. 4, no. 7, pp. 947-964, Apr. 1994.
-  K. He et al., “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Dec. 2016, pp. 770–778.
-  A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications,” 2017. arXiv:1704.04861. [Online]. Available: http://arxiv.org/abs/1704.04861
-  C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Dec. 2016, pp. 2818–2826.
-  G. Huang, Z. Liu, M. Laurens, and W. Kilian, “Densely Connected Convolutional Networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Nov. 2017, pp. 2261-2269.