I Introduction
Steganography is a means of covert communication in which secret information is embedded into some form of digital media, such as an image, video or text file [3]. In multimedia security, steganography is a critical research topic [4]. Images are generally preferred as the embedding medium because minute changes in an image are imperceptible to the human eye [4]. The capacity of a steganographic algorithm is the amount of data that can be embedded in an image before there is a noticeable visual change [5]. Steganalysis is the process of detecting whether a given image has information hidden in it [27]. The task can therefore be cast as a binary classification problem. To detect whether an image is embedded with information, we propose the use of an ensemble color space model. Recently, an ensemble colorspace model [1] was shown to obtain excellent results on large-scale image classification datasets such as ImageNet [2]. Based on [1], we propose a novel steganalysis approach.
We use a colorspace approach to determine whether an image is hiding information. We use ColorNet [1] and take the final activation map from each colorspace. We use weighted averaging to obtain a single feature map from the individual feature maps generated by each colorspace. It was shown in [1] that each color space yields features specific to itself, which should help us detect minute changes in the image. We then use a Levy flight-based grey wolf optimization method (a metaheuristic approach) to select a smaller subset of features. Using these features, we classify the given image into one of two classes: containing concealed information or not.
II Related Work
II-A Steganography
Steganography algorithms can be classified broadly into four categories: 1) cover image size-based algorithms, 2) embedding domain-based algorithms, 3) nature of retrieval-based algorithms, and 4) adaptive steganographic algorithms. In the case of 2D images, the information is embedded onto the 2D plane of the cover image. This embedding can be done over transform domain coefficients (such as discrete cosine transforms, Fourier transforms, etc.) or in the spatial domain (an example is LSB). The 3D approaches essentially follow the same general procedure. However, the procedure is repeated on multiple planes (for instance, an RGB color image has 3 planes in which information can be embedded). Image steganography on 3D images can be performed in the geometrical domain [5], the representation domain [6] or the topological domain [7]. Transform-based steganographic algorithms include the discrete Fourier transform (DFT) [9], discrete cosine transform (DCT), discrete wavelet transform [10] and complex wavelet transform [11], among others. Here, the frequency coefficients obtained after applying the transform are used to hide secret bits. In addition to improving security, these algorithms are robust to image compression, cropping, scaling, etc. Of late, machine learning approaches have been proposed, such as SVM (Support Vector Machine) [12], genetic algorithm approaches [13] and neural network-based steganography [14]. Though these are black-box approaches, they have shown good results.
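The spatial-domain case mentioned above (LSB) can be illustrated with a minimal sketch. It assumes an 8-bit grayscale cover and sequential embedding, one message bit per pixel; the function names `lsb_embed` and `lsb_extract` are hypothetical and are not taken from any of the cited works.

```python
import numpy as np

def lsb_embed(cover: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Write one message bit into the least significant bit of each leading pixel."""
    flat = cover.flatten()  # flatten() returns a copy, so the cover stays untouched
    if bits.size > flat.size:
        raise ValueError("message longer than cover capacity")
    # Clear the lowest bit, then set it to the message bit
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits.astype(np.uint8)
    return flat.reshape(cover.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Read back the first n_bits least significant bits."""
    return stego.flatten()[:n_bits] & 1

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)   # toy cover image
message = rng.integers(0, 2, size=16, dtype=np.uint8)       # toy secret bits
stego = lsb_embed(cover, message)
recovered = lsb_extract(stego, message.size)
```

Each pixel changes by at most one intensity level, which is why such embedding is visually imperceptible and why detection has to rely on learned or statistical features.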
II-B Steganalysis
Steganalysis attempts either to identify a stego image (an image in which information is hidden) or to extract the secret information. Our method deals with the former. We treat the problem at hand as a classification problem, wherein each image either contains hidden information or does not. There are two basic approaches to steganalysis: signature steganalysis and statistical steganalysis. Signature steganalysis searches for patterns, or signatures, characteristic of particular steganographic algorithms. The statistical approach looks for statistical evidence that information is being hidden.
Signature steganalysis is further classified into specific embedding [16] and universal blind steganalysis [15]. Specific embedding approaches are impractical because we need to know which steganography approach has been used to embed the information. Hence, universal blind steganalysis [8,17] is preferred. These approaches rely on the extraction of high-dimensional features. However, high-dimensional features bring the curse of dimensionality, so the feature set needs to be reduced. Commonly used feature selection methods include wrappers and filters. Filters are less complex; however, they perform poorly. Wrapper methods evaluate feature subsets using predictive models [18]. However, wrappers are complex and time-consuming.
To overcome this, metaheuristic approaches have been deployed. These approaches solve optimization problems by mimicking natural phenomena [19, 20]. It was seen that Grey Wolf Optimization (GWO) performed better than other metaheuristic approaches for solving nonlinear problems in a multidimensional space [19]. However, it has a slow convergence rate and at times gets trapped in local optima. It has been seen that GWO can be improved by modifying its parameter A to obtain a quicker convergence rate, better convergence precision and higher agility for global searching.
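As a point of reference for the filter methods mentioned above, the following is a minimal sketch of one simple filter: ranking features by their absolute Pearson correlation with the class label. The scoring rule and the name `filter_select` are illustrative assumptions; the cited works may use different criteria.

```python
import numpy as np

def filter_select(features: np.ndarray, labels: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k features most correlated (in absolute value) with the labels."""
    x = features - features.mean(axis=0)
    y = labels - labels.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    # Guard against constant features, which would give a zero denominator
    scores = np.abs(x.T @ y) / np.where(denom == 0, 1.0, denom)
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200).astype(float)   # toy cover/stego labels
features = rng.normal(size=(200, 10))                  # toy feature matrix
features[:, 3] += 2.0 * labels                         # only feature 3 carries label information
selected = filter_select(features, labels, k=2)
```

A filter of this kind scores each feature independently of any classifier, which keeps it cheap but blind to interactions between features; this is the weakness that wrapper and metaheuristic methods try to address.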
III Proposed Approach
III-A Overall architecture and effect of using color spaces
We consider steganalysis as a two-class classification problem. The overall architecture is shown in Figure 1. The experimental analysis, along with details regarding the training set, is explained in the next section. Recently, the effect of color spaces on image classification has been explored [1]. It was seen that individual color spaces capture classification features specific to themselves. This prompted us to consider whether such features could reveal that secret information is embedded in an image. ColorNet [1], being an ensemble model that can extract features specific to each colorspace, was an excellent choice for determining whether an image has information hidden in it. The output of ColorNet is a high-dimensional vector, which makes execution computationally intensive. To reduce the number of features, we use an optimization approach for feature selection.
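The introduction describes fusing the per-colorspace activation maps by weighted averaging. A minimal sketch of that fusion step follows; the color space names, the weights and the feature length of 128 are hypothetical placeholders, since the text does not specify them, and random vectors stand in for the actual ColorNet activations.

```python
import numpy as np

def fuse_feature_maps(maps: dict, weights: dict) -> np.ndarray:
    """Weighted average of same-shaped feature vectors, one per color space."""
    names = sorted(maps)
    w = np.array([weights[n] for n in names], dtype=float)
    stacked = np.stack([maps[n] for n in names])   # shape: (n_spaces, n_features)
    return np.average(stacked, axis=0, weights=w)

rng = np.random.default_rng(0)
# Hypothetical color spaces and weights, for illustration only
maps = {name: rng.normal(size=128) for name in ("RGB", "HSV", "LAB")}
weights = {"RGB": 0.5, "HSV": 0.25, "LAB": 0.25}
fused = fuse_feature_maps(maps, weights)
```

The fused vector has the same length as each individual map, so the subsequent feature selection operates on a single vector per image.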
III-B Optimization process for feature selection
III-B1 Feature selection using LF-Grey Wolf optimization
In GWO, the head of the pack is the alpha ($\alpha$). The next level of the hierarchy is the beta ($\beta$), followed finally by the delta ($\delta$). GWO models this social hierarchy and mathematically formulates the hunting procedure as an optimization problem. If $X_p(t)$ and $X(t)$ represent the positions of the prey and a wolf at iteration $t$, the encircling process [19] can be modeled with two coefficients $A$ and $C$ as shown in (1). $A$ and $C$ are calculated by (2).
$$D = |C \cdot X_p(t) - X(t)|, \qquad X(t+1) = X_p(t) - A \cdot D \tag{1}$$
$$A = 2a \cdot r_1 - a, \qquad C = 2 \cdot r_2 \tag{2}$$
Here, $r_1$ and $r_2$ are random vectors in [0,1], and $a$ is a parameter that decreases linearly from 2 to 0 over the iterations and also helps to control the step size $D$ of a grey wolf. The end of the hunting process is modeled by decreasing the value of $A$, which in turn depends on $a$. Once $a$ reaches zero, the wolves have stopped moving. The linear decrease in $A$ favors exploitation of the search space with minimal exploration. Hence, the search can become trapped in a local optimum.
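The encircling update described above can be sketched as follows, assuming the standard GWO form of the two equations: D = |C·Xp(t) − X(t)|, X(t+1) = Xp(t) − A·D, with A = 2a·r1 − a and C = 2·r2. The function names are hypothetical, and the sketch covers a single wolf and a single prey position rather than the full pack hierarchy.

```python
import numpy as np

def encircle_step(x, x_prey, a, rng):
    """One encircling update: D = |C*Xp - X|, X_new = Xp - A*D."""
    r1 = rng.random(x.shape)   # random vector in [0, 1)
    r2 = rng.random(x.shape)   # random vector in [0, 1)
    A = 2.0 * a * r1 - a       # A lies in [-a, a]
    C = 2.0 * r2               # C lies in [0, 2)
    D = np.abs(C * x_prey - x)
    return x_prey - A * D

def linear_a(t, max_iter):
    """Parameter a decreasing linearly from 2 at t = 0 to 0 at t = max_iter."""
    return 2.0 * (1.0 - t / max_iter)

rng = np.random.default_rng(0)
x_prey = np.array([1.0, -2.0, 0.5])   # toy prey (best solution) position
x_wolf = np.array([4.0, 3.0, -1.0])   # toy wolf position
x_mid = encircle_step(x_wolf, x_prey, linear_a(50, 100), rng)
x_last = encircle_step(x_wolf, x_prey, linear_a(100, 100), rng)
```

With a = 0 the coefficient A vanishes and the update returns the prey position itself, which matches the remark that the wolves stop moving once a reaches zero.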
The size of the aggregated feature map creates an issue in terms of the complexity of the algorithm and the overall time needed for execution. To deal with this, we propose the use of Levy flight-based grey wolf optimization (LFGWO) for feature selection, based on the Levy probability function in (3). Here, $\mu$ represents the position parameter, $\gamma$ represents the scale parameter and $s$ represents the collection of samples in the distribution. The first case of the equation holds for all positive values of $s$ greater than $\mu$, and the function is 0 otherwise.
$$L(s, \gamma, \mu) = \begin{cases} \sqrt{\dfrac{\gamma}{2\pi}} \, \exp\left[-\dfrac{\gamma}{2(s-\mu)}\right] \dfrac{1}{(s-\mu)^{3/2}}, & 0 < \mu < s < \infty \\ 0, & \text{otherwise} \end{cases} \tag{3}$$
The parameter $A$ is modified by the Levy flight function as $A = L(S) \cdot r_1$. This makes $A$ decrease nonlinearly. $S$ is the position of the wolf and $r_1$ is a random vector.
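The modified coefficient can be sketched as follows. The sketch assumes the standard Levy density with location mu and scale gamma, sqrt(gamma / 2π) · exp(−gamma / (2(s − mu))) / (s − mu)^{3/2} for s > mu and 0 otherwise, and assumes that L is applied element-wise to the wolf position S. Both assumptions, as well as the default values gamma = 1 and mu = 0, are illustrative and may differ from the settings in [21].

```python
import math
import numpy as np

def levy_pdf(s: float, gamma: float = 1.0, mu: float = 0.0) -> float:
    """Levy density with location mu and scale gamma; zero for s <= mu."""
    if s <= mu:
        return 0.0
    d = s - mu
    return math.sqrt(gamma / (2.0 * math.pi)) * math.exp(-gamma / (2.0 * d)) / d ** 1.5

def levy_A(position: np.ndarray, rng, gamma: float = 1.0, mu: float = 0.0) -> np.ndarray:
    """Coefficient A = L(S) * r1, applied element-wise to a wolf position S (assumption)."""
    r1 = rng.random(position.shape)
    L = np.array([levy_pdf(float(s), gamma, mu) for s in position])
    return L * r1

rng = np.random.default_rng(0)
A = levy_A(np.array([0.5, 1.0, 4.0, -1.0]), rng)   # toy wolf position
```

Because the density has a heavy tail, A no longer shrinks linearly with the iteration count, which is the property the text relies on to escape local optima.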
III-B2 Choice of optimization function
The selection of LFGWO is based on the statistical results obtained in [21]. It was seen that, for 15 defined benchmark functions, the Wilcoxon rank-sum test showed LFGWO to outperform existing optimization approaches in terms of mean fitness values.
Figure 2 compares LFGWO with Grey Wolf Optimization (GWO), the gravitational search algorithm (GSA), particle swarm optimization (PSO) and fast evolutionary programming (FEP), using a box plot and a graph showing how quickly the best fitness value converges with respect to the number of iterations. The box plot corresponds to the benchmark function defined in equation 4, and the convergence graph to the function defined in equation 5.
(4) 
(5) 
IV Experimental Analysis
IV-A Datasets and training
The most commonly used steganalysis datasets are BOSSbase [22] and BOWS2 [23]. Each contains 10000 grayscale images. However, the proposed approach depends on color, and as such, we use a dataset with color images. Hence, starting with the 10000 images of the BOSSbase [22] dataset, we generate a dataset by following the process in [24]. We downsampled the full-resolution images to a size of 512x512. We then followed the process in [25], so that training and testing were conducted in a similar environment. In [25], two datasets were created by using two demosaicing algorithms, Patterned Pixel Grouping (PPG) and Adaptive Homogeneity-Directed (AHD), and named BOSS-PPG-LAN and BOSS-AHD-LAN respectively. Further, by removing the downsampling method, we can obtain two more datasets: BOSS-PPG-CRP and BOSS-AHD-CRP. By pairing a demosaicing algorithm with bilinear or bicubic kernels, we obtain four more datasets: BOSS-PPG-BIL, BOSS-PPG-BIC, BOSS-AHD-BIL and BOSS-AHD-BIC.
We train our model using mini-batch stochastic gradient descent with the following parameters: learning rate: 0.0001, weight decay: 0.0005, step size: 5000, momentum: 0.75, gamma: 0.75, batch size: 32, maximum iterations: $40 \times 10^4$. The trained model was tested every 5000 iterations, and accuracy was measured at $40 \times 10^4$ iterations. Four state-of-the-art color steganography algorithms, HILL, SUNIWARD, CMD-C-SUNIWARD and CMD-C-HILL, were used as attacking targets for experimental analysis. The embedding payload was set to 0.2 bpc (bits per channel/band pixel) and 0.4 bpc. In order to select the most challenging scenarios and to follow similar conditions for result comparison, we followed the process used in WISERNet [25].
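The listed step size and gamma suggest a stepwise learning rate decay. The sketch below assumes a step policy in which the learning rate is multiplied by gamma once every step-size iterations; the text does not state the policy explicitly, so this interpretation and the constant names are assumptions.

```python
BASE_LR = 1e-4           # learning rate
GAMMA = 0.75             # multiplicative decay factor
STEP_SIZE = 5000         # iterations between decays
MAX_ITER = 40 * 10**4    # maximum iterations
TEST_INTERVAL = 5000     # the model is tested every 5000 iterations

def step_lr(iteration: int) -> float:
    """Learning rate under an assumed step policy: BASE_LR * GAMMA^(iteration // STEP_SIZE)."""
    return BASE_LR * GAMMA ** (iteration // STEP_SIZE)

# Number of test points over the whole training run
n_checkpoints = MAX_ITER // TEST_INTERVAL
```

Under these settings the model is evaluated 80 times during training, once per decay step.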
IV-B Results comparison
To compare our results, we considered three deep learning approaches for color steganalysis that are widely considered state of the art: WISERNet [25], Deep Hierarchical Representations (DHR) [26] and DeepCNN (DCNN) [27]. Experiments were conducted on the same datasets and using similar resources for a fair comparison. Popular steganography methods such as SUNIWARD [28], MiPOD [29] and HILL [30] operate within the additive embedding distortion minimization framework [31]. Recently, CMD-C was proposed [32] by adapting the CMD approach to color images. We denote the CMD-C method using SUNIWARD and HILL as CMD-C-SUNIWARD and CMD-C-HILL respectively. Although DHR [26] and DCNN [27] can be executed with channel-wise convolution, normal convolution and input concatenation as seen in [25], we show results only for normal convolution, as WISERNet [25] outperforms DHR and DCNN in all cases. We also compare results with Pixel Vector Cost (PVC) [33] and channel gradient correlation (CGC) [34].
The batch size and number of iterations were the same for all the comparisons. The other parameters were set as described in the original papers. In each experiment, 75 percent of the images, i.e., 7500 images, were used for training and the remaining 2500 images were used for testing. All experiments were performed 10 times and the average test accuracy is reported. Table 1 compares the results of our approach with WISERNet (WNet) [25], DHR [26], DCNN [27], CGC [34] and PVC [33] on BOSS-PPG-LAN (BPL), BOSS-PPG-BIC (BPBc), BOSS-PPG-BIL (BPBl), BOSS-AHD-BIC (BABc) and BOSS-AHD-BIL (BABl) with 0.2 bpc, and Table 2 does the same with 0.4 bpc. As can be seen, the proposed method outperforms the other state-of-the-art methods in all but one case in each table, and the increase in detection accuracy is largest on the datasets produced with patterned pixel grouping.
Table 1: Average detection accuracy at a payload of 0.2 bpc.
Dataset  DHR     DCNN    WNet    CGC     PVC     Proposed
BPL      0.6474  0.6562  0.7139  0.7231  0.7120  0.7741
BPBc     0.6589  0.7124  0.7318  0.7278  0.7657  0.7912
BPBl     0.7611  0.7487  0.8033  0.8120  0.8068  0.8316
BABc     0.6614  0.6627  0.7369  0.7168  0.7211  0.7368
BABl     0.7622  0.7647  0.8022  0.7981  0.7764  0.8044
Table 2: Average detection accuracy at a payload of 0.4 bpc.
Dataset  DHR      DCNN    WNet    CGC     PVC     Proposed
BPL      0.7568   0.7941  0.8361  0.8268  0.8148  0.8724
BPBc     0.7732   0.8068  0.8435  0.8314  0.8514  0.8814
BPBl     0.87211  0.9045  0.9169  0.9165  0.9056  0.9381
BABc     0.7728   0.8141  0.8448  0.8412  0.8378  0.8468
BABl     0.8738   0.9067  0.9144  0.9044  0.9022  0.9088
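The comparison in the two tables above (0.2 bpc and 0.4 bpc) can be checked with a short script that computes, for each dataset, the margin between the proposed method and the best competing method. The values are copied from the tables as printed; the helper names `margins` and `losses` are hypothetical.

```python
# Columns: DHR, DCNN, WNet, CGC, PVC, Proposed (values copied from the tables)
TABLE_1 = {  # 0.2 bpc
    "BPL":  [0.6474, 0.6562, 0.7139, 0.7231, 0.7120, 0.7741],
    "BPBc": [0.6589, 0.7124, 0.7318, 0.7278, 0.7657, 0.7912],
    "BPBl": [0.7611, 0.7487, 0.8033, 0.8120, 0.8068, 0.8316],
    "BABc": [0.6614, 0.6627, 0.7369, 0.7168, 0.7211, 0.7368],
    "BABl": [0.7622, 0.7647, 0.8022, 0.7981, 0.7764, 0.8044],
}
TABLE_2 = {  # 0.4 bpc
    "BPL":  [0.7568, 0.7941, 0.8361, 0.8268, 0.8148, 0.8724],
    "BPBc": [0.7732, 0.8068, 0.8435, 0.8314, 0.8514, 0.8814],
    "BPBl": [0.87211, 0.9045, 0.9169, 0.9165, 0.9056, 0.9381],
    "BABc": [0.7728, 0.8141, 0.8448, 0.8412, 0.8378, 0.8468],
    "BABl": [0.8738, 0.9067, 0.9144, 0.9044, 0.9022, 0.9088],
}

def margins(table):
    """Proposed accuracy minus the best baseline accuracy, per dataset."""
    return {name: row[-1] - max(row[:-1]) for name, row in table.items()}

def losses(table):
    """Datasets on which some baseline beats the proposed method."""
    return [name for name, m in margins(table).items() if m < 0]

for label, table in (("0.2 bpc", TABLE_1), ("0.4 bpc", TABLE_2)):
    for name, m in margins(table).items():
        print(f"{label} {name}: {m:+.4f}")
```

The only negative margins are BABc at 0.2 bpc and BABl at 0.4 bpc, in both cases against WNet, which is consistent with the statement that the proposed method wins in all but one case per table.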
Further experimental analysis is done by mixing datasets as shown in [27]. Table 3 shows how the datasets were mixed. We label the mixed datasets with Roman numerals for simplicity when comparing steganalyzers in Tables 4 and 5. BPL, BPBc, BPBl, BABc, BABl and BAL are abbreviations of BOSS-PPG-LAN, BOSS-PPG-BIC, BOSS-PPG-BIL, BOSS-AHD-BIC, BOSS-AHD-BIL and BOSS-AHD-LAN respectively.
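Table 3: Composition of the mixed datasets (✓ = included, - = not included).
Name     BPL  BPBc  BPBl  BABc  BABl  BAL
Set-I    ✓    ✓     ✓     -     -     -
Set-II   -    -     -     ✓     ✓     ✓
Set-III  ✓    -     -     -     -     ✓
Set-IV   ✓    ✓     ✓     ✓     ✓     ✓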
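The mixing scheme in the table above can be written down as a small configuration. The dictionary below mirrors the check marks as printed; the hyphenated set names and the helper `membership_row` are illustrative choices.

```python
ALL_DATASETS = ["BPL", "BPBc", "BPBl", "BABc", "BABl", "BAL"]

# Each mixed set lists the datasets it draws from, as marked in the table
MIXED_SETS = {
    "Set-I":   ["BPL", "BPBc", "BPBl"],    # all PPG-demosaiced datasets
    "Set-II":  ["BABc", "BABl", "BAL"],    # all AHD-demosaiced datasets
    "Set-III": ["BPL", "BAL"],             # the two LAN datasets
    "Set-IV":  list(ALL_DATASETS),         # everything
}

def membership_row(set_name):
    """Boolean row for one mixed set, in the column order of ALL_DATASETS."""
    return [d in MIXED_SETS[set_name] for d in ALL_DATASETS]
```

Set-I and Set-II partition the six datasets by demosaicing algorithm, while Set-III mixes the two algorithms under the same resampling kernel.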
As in Tables 1 and 2, Table 4 compares results on the above-mentioned mixtures of datasets with 0.2 bpc, and Table 5 compares the results with 0.4 bpc. As can be seen, the proposed method outperforms recent state-of-the-art approaches on all four mixed sets, by between 0.012 and 0.031 in accuracy over the best competing method.
Table 4: Average detection accuracy on the mixed datasets at a payload of 0.2 bpc.
Dataset  DHR     DCNN    WNet    CGC     PVC     Proposed
Set-I    0.7237  0.7259  0.7675  0.7712  0.7734  0.8029
Set-II   0.7214  0.7217  0.7714  0.7710  0.7684  0.8026
Set-III  0.6722  0.6865  0.7284  0.7412  0.7388  0.7648
Set-IV   0.7164  0.7182  0.7671  0.7782  0.7684  0.8048
Table 5: Average detection accuracy on the mixed datasets at a payload of 0.4 bpc.
Dataset  DHR     DCNN    WNet    CGC     PVC     Proposed
Set-I    0.8241  0.8289  0.8594  0.8788  0.8641  0.9041
Set-II   0.8231  0.8417  0.8806  0.8762  0.8661  0.9021
Set-III  0.7812  0.7892  0.8316  0.8411  0.8421  0.8598
Set-IV   0.8161  0.8214  0.8893  0.8796  0.8812  0.9013
V Conclusion
With recent developments in color-based steganography algorithms, a powerful steganalyzer is needed. It was recently shown that an ensemble model of colorspaces has a significant impact on classification results. We propose StegColNet as a powerful color image steganalyzer. We employ an ensemble colorspace strategy to determine whether an image is hiding information. We use ColorNet and take the final activation map from each colorspace. We use weighted averaging to obtain a single feature map from the feature maps generated by each colorspace. We then use a Levy flight-based grey wolf optimization method to select a smaller subset of features. Using these features, we classify the given image into one of two classes: containing concealed information or not.
References

[1] Gowda, S.N. and Yuan, C., 2018, December. ColorNet: Investigating the importance of color spaces for image classification. In Asian Conference on Computer Vision (pp. 581-596). Springer, Cham.
[2] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K. and Fei-Fei, L., 2009, June. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). IEEE.
[3] Kahn, D., 1996, May. The history of steganography. In International Workshop on Information Hiding (pp. 1-5). Springer, Berlin, Heidelberg.
[4] Cheddad, A., Condell, J., Curran, K. and Mc Kevitt, P., 2010. Digital image steganography: Survey and analysis of current methods. Signal Processing, 90(3), pp.727-752.
[5] Li, N., Hu, J., Sun, R., Wang, S. and Luo, Z., 2017. A high-capacity 3D steganography algorithm with adjustable distortion. IEEE Access, 5, pp.24457-24466.
[6] Tsai, Y.Y., 2014. An adaptive steganographic algorithm for 3D polygonal models using vertex decimation. Multimedia Tools and Applications, 69(3), pp.859-876.
[7] Cheng, Y.M. and Wang, C.M., 2006. A high-capacity steganographic approach for 3D polygonal meshes. The Visual Computer, 22(9-11), pp.845-855.
[8] Chakraborty, S., Jalal, A.S. and Bhatnagar, C., 2017. LSB based non blind predictive edge adaptive image steganography. Multimedia Tools and Applications, 76(6), pp.7973-7987.
[9] Jayaram, P., Ranganatha, H.R. and Anupama, H.S., 2011. Information hiding using audio steganography–a survey. The International Journal of Multimedia and Its Applications (IJMA) Vol, 3, pp.86-96.
[10] Kumar, V. and Kumar, D., 2018. A modified DWT-based image steganography technique. Multimedia Tools and Applications, 77(11), pp.13279-13308.
[11] Narasimmalou, T. and Joseph, R.A., 2012, March. Discrete wavelet transform based steganography for transmitting images. In IEEE-International Conference On Advances In Engineering, Science And Management (ICAESM-2012) (pp. 370-375). IEEE.
[12] Gowda, S.N., 2016, September. Innovative enhancement of the Caesar cipher algorithm for cryptography. In 2016 2nd International Conference on Advances in Computing, Communication and Automation (ICACCA) (Fall) (pp. 1-4). IEEE.
[13] Chang, C.C., Yu, Y.H. and Hu, Y.C., 2008, December. Hiding secret data into an AMBTC-compressed image using genetic algorithm. In Second International Conference on Future Generation Communication and Networking Symposia (Vol. 3, pp. 154-157). IEEE.
[14] Gowda, S.N., 2016, October. Using Blowfish encryption to enhance security feature of an image. In 2016 6th International Conference on Information Communication and Management (ICICM) (pp. 126-129). IEEE.
[15] Luo, X.Y., Wang, D.S., Wang, P. and Liu, F.L., 2008. A review on blind detection for image steganography. Signal Processing, 88(9), pp.2138-2157.
[16] Fridrich, J. and Goljan, M., 2004, June. On estimation of secret message length in LSB steganography in spatial domain. In Security, Steganography, and Watermarking of Multimedia Contents VI (Vol. 5306, pp. 23-35). International Society for Optics and Photonics.
[17] Kodovsky, J., Fridrich, J. and Holub, V., 2012. Ensemble classifiers for steganalysis of digital media. IEEE Transactions on Information Forensics and Security, 7(2), pp.432-444.
[18] Deng, H. and Runger, G., 2012, June. Feature selection via regularized trees. In The 2012 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.
[19] Chhikara, R.R., Sharma, P. and Singh, L., 2018. An improved dynamic discrete firefly algorithm for blind image steganalysis. International Journal of Machine Learning and Cybernetics, 9(5), pp.821-835.
[20] Yao, X., Liu, Y. and Lin, G., 1999. Evolutionary programming made faster. IEEE Transactions on Evolutionary Computation, 3(2), pp.82-102.
[21] Pathak, Y., Arya, K.V. and Tiwari, S., 2019. Feature selection for image steganalysis using levy flight-based grey wolf optimization. Multimedia Tools and Applications, 78(2), pp.1473-1494.
[22] Bas, P., Filler, T. and Pevný, T., 2011, May. “Break our steganographic system”: the ins and outs of organizing BOSS. In International Workshop on Information Hiding (pp. 59-70). Springer, Berlin, Heidelberg.
[23] Piva, A. and Barni, M., 2007, February. The first BOWS contest: break our watermarking system. In Security, Steganography, and Watermarking of Multimedia Contents IX (Vol. 6505, p. 650516). International Society for Optics and Photonics.
[24] Goljan, M., Fridrich, J. and Cogranne, R., 2014, December. Rich model for steganalysis of color images. In 2014 IEEE International Workshop on Information Forensics and Security (WIFS) (pp. 185-190). IEEE.
[25] Zeng, J., Tan, S., Liu, G., Li, B. and Huang, J., 2018. WISERNet: Wider separate-then-reunion network for steganalysis of color images. arXiv preprint arXiv:1803.04805.
[26] Ye, J., Ni, J. and Yi, Y., 2017. Deep learning hierarchical representations for image steganalysis. IEEE Transactions on Information Forensics and Security, 12(11), pp.2545-2557.
[27] Xu, G., 2017, June. Deep convolutional neural network to detect J-UNIWARD. In Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security (pp. 67-73). ACM.
[28] Holub, V., Fridrich, J. and Denemark, T., 2014. Universal distortion function for steganography in an arbitrary domain. EURASIP Journal on Information Security, 2014(1), p.1.
[29] Sedighi, V., Cogranne, R. and Fridrich, J., 2016. Content-adaptive steganography by minimizing statistical detectability. IEEE Transactions on Information Forensics and Security, 11(2), pp.221-234.
[30] Li, B., Wang, M., Huang, J. and Li, X., 2014, October. A new cost function for spatial image steganography. In IEEE International Conference on Image Processing (ICIP) (pp. 4206-4210). IEEE.
[31] Fridrich, J. and Filler, T., 2007, February. Practical methods for minimizing embedding impact in steganography. In Security, Steganography, and Watermarking of Multimedia Contents IX (Vol. 6505, p. 650502). International Society for Optics and Photonics.
[32] Tang, W., Li, B., Luo, W. and Huang, J., 2016. Clustering steganographic modification directions for color components. IEEE Signal Processing Letters, 23(2), pp.197-201.
[33] Qin, X., Li, B., Tan, S. and Zeng, J., 2019. A novel steganography for spatial color images based on pixel vector cost. IEEE Access, 7, pp.8834-8846.
[34] Kang, Y., Liu, F., Yang, C., Xiang, L., Luo, X. and Wang, P., 2019. Color image steganalysis based on channel gradient correlation. International Journal of Distributed Sensor Networks, 15(5), p.1550147719852031.