Steganography is often described as the art of hiding secret messages in a medium [4, 3], such as images or videos. Steganalysis is the technique of detecting secret messages hidden by a steganographic algorithm [3, 28]. As new steganographic algorithms are developed, so are steganalysis techniques that detect information about the hidden message. The focus of steganographic algorithms is to hide the existence of the message; the focus of steganalysis algorithms, on the other hand, is to reveal it. Therefore, previous studies suggested that researchers should consider existing steganalysis techniques when developing a steganographic algorithm [4, 3].
In cryptography, the secret message is encrypted so that attackers or unwanted parties cannot read it. If an attacker decrypts the secret message, the cryptographic system is broken. In steganography, by contrast, if an attacker reveals the existence of the concealed secret message, the purpose of steganography is defeated. Therefore, in steganography, concealing the existence of the secret message is more important than making the encryption harder to break. In other words, the main focus of cryptography is to protect information from being read by unwanted parties, whereas the focus of steganography is to hide the existence of the information from them.
Images, videos, text files, and PDF files are common media for steganography. Images are the most popular, because image steganography is simpler and provides comparatively higher hiding (embedding) capacity. In image steganography, the Least Significant Bit (LSB) modification method is considered pioneering work. Note that LSB modification and LSB matching have different application areas: LSB modification is popular in the uncompressed domain, while LSB matching is popular in the compressed domain. It is worth mentioning that the detection processes of these techniques also differ.
Several innovative steganographic algorithms have been developed over the last few decades, such as matrix embedding (the F5 algorithm), modified matrix embedding, BCH coding, and trellis-coded quantization. While these methods claim better resistance against statistical attacks (statistical steganalysis), many researchers have developed methods to break them.
Westfeld mentioned that the statistical attack is the most popular attack in steganography. Hence, researchers should check steganographic algorithms against statistical attacks first [4, 3]. Other steganalysis techniques, such as the calibrated statistics attack [11, 16], should be considered as well. Nowadays, however, most steganographic techniques, such as the F5 algorithm, modified matrix embedding, and the secure steganographic algorithm, are designed to resist statistical attacks.
It is important to make a steganographic algorithm secure against statistical attacks, more specifically first-order statistical attacks; that is, to keep the histogram of the JPEG image coefficients intact or only slightly altered. In general, the JPEG image coefficient histogram is bell-shaped. Therefore, several steganographic algorithms, for example OutGuess, the F5 algorithm, and secure steganography, attempt to keep the histogram bell-shaped after hiding the data. However, preserving the histogram of the stego image (i.e., the altered image) may harm its quality or cause higher distortion [2, 26, 22].
To minimize the distortion of the stego image, Kim et al. developed the Modified Matrix Embedding (MME) algorithm. In their work, the authors used JPEG images as the medium for hiding secret messages. The MME algorithm identifies which LSB to modify in order to hide a secret message; this part is the same as in the F5 algorithm. In general, modifying the LSBs of an image using the MME algorithm produces less distortion than the F5 algorithm. Yet extensive analysis could still reveal the existence of a secret message because of the LSB modification.
Pevny and Fridrich explained the JPEG image distortion caused by the rounding operation during image compression. During compression, an image goes through several operations, such as the Discrete Cosine Transform (DCT) and quantization. The details of JPEG image compression are available in the literature. Among these operations, rounding converts the quantized coefficients to integers, which are known as the JPEG coefficients.
Pevny and Fridrich described how the rounding operation contributes to image distortion, and how to reduce the rounding error and ultimately the distortion in the compressed JPEG image. Kim et al. applied the idea of reducing the rounding error to matrix embedding (the F5 algorithm). The resulting MME algorithm is well known for producing less distortion than the F5 algorithm.
Although the MME algorithm outperforms the F5 algorithm by reducing the rounding error, the position of the candidate coefficient is still fixed by the matrix embedding technique, without any flexibility. As a result, the error is not reduced to its minimum. The proposed technique searches for the best candidate coefficient, the one that minimizes the rounding error, and thereby overcomes this drawback of the MME algorithm.
The proposed technique uses a block of coefficients to hide a single bit of the secret message. It modifies only one coefficient in the block: the candidate coefficient is whichever one produces the least distortion after hiding the secret message bit. Because the method works on blocks of coefficients and minimizes the embedding distortion, it is called Minimum Distortion Embedding (MDE).
The rest of this paper is organized as follows. Section II presents related work. Section III describes the proposed method in detail, including the encoding and decoding techniques. Section IV compares the distortion of F5/MME with that of MDE. The experimental results and comparisons are presented in Section V. Finally, Section VI concludes the study and provides future research directions.
II Existing methods
II-A JPEG image
During JPEG image compression, an uncompressed image passes through the JPEG encoder, where it is transformed using the DCT. In the encoder, each channel of the image is divided into $8 \times 8$ blocks, also known as JPEG blocks. Let $x_{ij}$, where $i, j \in \{0, 1, \ldots, 7\}$, be a pixel value of an $8 \times 8$ image channel block, and let $d_{ij}$, where $i, j \in \{0, 1, \ldots, 7\}$, be the corresponding coefficient of the DCT of that block.
Note that the first coefficient after the DCT, $d_{00}$, is known as the DC coefficient, and the remaining 63 coefficients of an $8 \times 8$ block are known as AC coefficients. If the quantization matrix is denoted as $Q = (q_{ij})$, then after quantization and before rounding the coefficients can be expressed as
$$c'_{ij} = \frac{d_{ij}}{q_{ij}}, \tag{1}$$
and after rounding as
$$c_{ij} = \mathrm{round}(c'_{ij}). \tag{2}$$
Clearly, there is a difference between $c'_{ij}$ and $c_{ij}$ because of the rounding operation, which can be expressed as
$$r_{ij} = c_{ij} - c'_{ij}. \tag{3}$$
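The quantization and rounding steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the sample values, and the quantization steps are hypothetical, and the rounding error is assumed to be the rounded value minus the unrounded value, a sign convention chosen to agree with the signs in the worked MDE example later in the paper.

```python
import math


def round_half_away(x: float) -> int:
    """Round to the nearest integer, ties away from zero.

    Python's built-in round() rounds ties to even, so a helper is used here.
    """
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(dct_coeffs, q_steps):
    """Return (unrounded, rounded, rounding_error) lists for a run of coefficients.

    Assumed convention: rounding_error = rounded - unrounded.
    """
    unrounded = [d / q for d, q in zip(dct_coeffs, q_steps)]
    rounded = [round_half_away(u) for u in unrounded]
    errors = [c - u for c, u in zip(rounded, unrounded)]
    return unrounded, rounded, errors


# Hypothetical DCT values and quantization steps, for illustration only.
unrounded, rounded, errors = quantize([-34.0, 21.0, 7.0], [16, 11, 10])
print(rounded)
```

The rounding error never exceeds 0.5 in magnitude, which is why a later embedding change of one unit dominates the total distortion of a modified coefficient.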
II-B F5 algorithm
The F5 steganographic method was proposed by Westfeld. It is considered one of the first data hiding methods that requires few modifications during the data hiding process. The F5 algorithm is based on the matrix encoding technique, in which $n$ non-zero AC coefficients are used to hide $k$ secret message bits by modifying at most one coefficient. For example, 7 non-zero coefficients are used to hide 3 message bits. Since secret bits are either 0 or 1, $n$ can be computed from $k$ as
$$n = 2^{k} - 1, \tag{4}$$
where $n$ is the number of non-zero coefficients and $k$ is the number of secret message bits.
It is important to note that the F5 algorithm cannot be broken using first-order statistics or histogram-based steganalysis. Histogram-based steganalysis compares the histogram of the original (unaltered) image with that of the stego (modified) image. Hence, the F5 method is considered one of the best steganographic methods. However, the F5 method has its limits as well; for example, it does not provide freedom to select the position of the candidate coefficient. In other words, the position of the candidate coefficient is fixed.
Also, the F5 algorithm increases the number of zero coefficients in the stego image and cannot minimize the distortion in the stego image. However, it is important to note that the F5 algorithm is more secure than most algorithms when it hides a small amount of secret message in a JPEG image, and it is secure against Chi-square ($\chi^2$) analysis [23, 5].
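Example 1. Let the LSBs of the non-zero AC coefficients be denoted by $c = [c_1, c_2, \ldots, c_n]$, where $c_i \in \{0, 1\}$, and let the secret message bits be denoted by $m = [m_1, m_2, \ldots, m_k]$, where $m_j \in \{0, 1\}$. Suppose there are 3 bits of secret message ($k = 3$, $n = 7$); all their non-zero combinations can be expressed as the columns of the matrix $H$:
$$H = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$
Since the coefficient vector $c$ is one-dimensional, it must be transposed before it can be multiplied by $H$. Note that only non-zero coefficients are considered in this process. See Eq. 5 for more details:
$$H c^{T} \pmod 2 \tag{5}$$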
II-B1 Encoding of F5 algorithm
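The position ($p$) of the candidate coefficient, i.e., the coefficient that must change to hide the secret message bits, is found using Eq. 6: the message bits $m$ are subtracted, modulo 2, from the result of Eq. 5, where $H$ is the matrix of Example 1 and $c$ is the vector of coefficient LSBs:
$$p = H c^{T} \oplus m^{T} \tag{6}$$
Read as a binary number, $p$ gives the index of the coefficient to change; if $p = 0$, no coefficient needs to be modified. After getting the position of the candidate coefficient, 1 is subtracted if the coefficient is positive and 1 is added if the coefficient is negative.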
II-B2 Decoding of F5 algorithm
From the stego image, all non-zero coefficients are extracted, and the LSBs of the extracted coefficients, $c'$, are multiplied by the matrix $H$ to get the secret message bits $m$. See Eq. 7 for more details:
$$m^{T} = H c'^{T} \pmod 2 \tag{7}$$
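Example 2. Suppose there are 7 non-zero AC coefficients, [5 2 3 1 -2 -5 -1], whose corresponding LSBs are $c$ = [1 0 1 1 0 1 1], and the secret message bits are $m$ = [1 0 1]. Let $H$ be the $3 \times 7$ binary matrix whose $j$-th column is the binary representation of $j$. Then
$$H c^{T} = [1\ 1\ 1]^{T} \pmod 2,$$
and the position of the coefficient is
$$p = H c^{T} \oplus m^{T} = [0\ 1\ 0]^{T},$$
which is 2 in decimal.
Thus, using the matrix, it is easy to find that the second coefficient needs to be modified. It is positive, so 1 is subtracted, and after modification the coefficients become [5 1 3 1 -2 -5 -1].
To decode the message, the LSBs of the stego coefficients, $c'$ = [1 1 1 1 0 1 1], are multiplied by $H$:
$$H c'^{T} = [1\ 0\ 1]^{T} \pmod 2,$$
which recovers the secret message bits [1 0 1].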
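The matrix-encoding steps of Sections II-B1 and II-B2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than Westfeld's implementation: the function names are hypothetical, the matrix is assumed to have the binary representation of j as its j-th column, the LSB of a coefficient is assumed to be the parity of its absolute value (which reproduces the "second coefficient" result of Example 2), and the case where a modified coefficient becomes zero is not handled.

```python
def parity_check_matrix(k):
    """k x (2**k - 1) binary matrix; column j (1-based) is j in binary, MSB on top."""
    n = 2 ** k - 1
    return [[(j >> (k - 1 - row)) & 1 for j in range(1, n + 1)] for row in range(k)]


def lsb(coeff):
    """Assumed LSB convention: parity of the absolute value."""
    return abs(coeff) % 2


def syndrome(H, bits):
    """Matrix-vector product H * bits^T modulo 2."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def f5_embed(coeffs, message):
    """Hide len(message) bits in 2**k - 1 non-zero coefficients, changing at most one."""
    k = len(message)
    if len(coeffs) != 2 ** k - 1:
        raise ValueError("need exactly 2**k - 1 coefficients")
    H = parity_check_matrix(k)
    s = syndrome(H, [lsb(c) for c in coeffs])
    diff = [a ^ b for a, b in zip(s, message)]
    position = int("".join(map(str, diff)), 2)  # 0 means no change is needed
    stego = list(coeffs)
    if position:
        c = stego[position - 1]
        # Decrease the magnitude by one: subtract from positive, add to negative.
        stego[position - 1] = c - 1 if c > 0 else c + 1
    return stego, position


def f5_extract(coeffs, k):
    """Recover k message bits from 2**k - 1 stego coefficients."""
    return syndrome(parity_check_matrix(k), [lsb(c) for c in coeffs])


stego, position = f5_embed([5, 2, 3, 1, -2, -5, -1], [1, 0, 1])
print(position, stego, f5_extract(stego, 3))
```

With the inputs of Example 2, the sketch selects position 2, changes the coefficient 2 to 1, and extraction returns the bits [1, 0, 1]. Note that the position is dictated entirely by the matrix, which is the inflexibility the proposed method addresses.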
II-C Modified Matrix Embedding (MME)
The MME algorithm works exactly the same way as the F5 algorithm (explained in Section II-B), except that, after finding the candidate coefficient's position, MME uses the rounding error (Eq. 3) to decide how to modify the coefficient, following Eq. 8.
Again, it is important to note that MME reduces the distortion by modifying the candidate coefficient in the way that produces the least distortion for that particular coefficient.
III Proposed MDE method
The proposed method is simple; it builds on the idea of MME explained in Section II-C. However, it searches for the candidate coefficient that provides the least distortion, which overcomes the limitation of the MME algorithm. Therefore, in theory, this method can outperform both the F5 and MME algorithms.
In its first step, the proposed method gathers all the non-zero AC coefficients in a single array ($C$) and divides the array into small coefficient blocks $B_i$, where $i = 1, 2, 3, \ldots, n_b$ and $n_b$ is the total number of blocks, which equals the total number of secret message bits. The block size $s$ is obtained by dividing the length of the coefficient array by the number of secret message bits (see Eq. 9). Let the number of secret message bits be $n_m$; then
$$s = \left\lfloor \frac{|C|}{n_m} \right\rfloor \tag{9}$$
The second step is to find the coefficient that produces the least distortion in the block.
In step three, after finding the best candidate in the block, the proposed algorithm modifies that coefficient following the rule given in Eq. 8.
Suppose a block is [5 2 3 1 -2 -5 -1]. The sum of its coefficients is 3, which is an odd number. So, if this block needs to hide 1 as the secret message bit, nothing needs to be done.
However, if this block needs to hide 0 as the secret message bit, the sum needs to be changed to an even number (i.e., 2 or 4). This can be done by adding 1 to, or subtracting 1 from, any one of the coefficients. Because this method tries to reduce the distortion as much as possible, it looks for the coefficient whose modification produces the least distortion and applies the change to it.
III-A Encoding of MDE algorithm
The encoding scheme is simple and easy to implement. It works as follows:
1. Make the number of blocks the same as the number of hidden message bits.
2. Compute the sum of all coefficients of each block. If the sum is odd, the block represents hidden message bit 1; if the sum is even, it represents hidden message bit 0.
3. If the block does not already represent the required bit, modify one of its coefficients following the least-distortion rule.
III-B Decoding of MDE algorithm
The decoding scheme works as follows:
If the sum of the coefficients of a block is odd, the hidden secret message bit is 1; if the sum is even, the hidden secret message bit is 0.
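The parity rule used by the MDE encoder and decoder can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the block size is assumed to be the integer quotient of the array length and the number of message bits, and any leftover coefficients at the end of the array are assumed to be unused.

```python
def split_blocks(coeffs, n_bits):
    """Split the non-zero AC coefficient array into n_bits equal-sized blocks.

    Assumption: block size is len(coeffs) // n_bits; trailing coefficients are ignored.
    """
    if n_bits <= 0 or n_bits > len(coeffs):
        raise ValueError("n_bits must be between 1 and len(coeffs)")
    size = len(coeffs) // n_bits
    return [coeffs[i * size:(i + 1) * size] for i in range(n_bits)]


def block_bit(block):
    """Bit represented by a block: 1 if the coefficient sum is odd, 0 if even.

    Python's % returns a non-negative result here, so negative sums work too.
    """
    return sum(block) % 2


def needs_change(block, bit):
    """True when one coefficient must be modified for the block to carry `bit`."""
    return block_bit(block) != bit


def mde_decode(coeffs, n_bits):
    """Read one hidden bit per block."""
    return [block_bit(b) for b in split_blocks(coeffs, n_bits)]


block = [5, 2, 3, 1, -2, -5, -1]
print(block_bit(block), needs_change(block, 1), needs_change(block, 0))
```

For the block [5 2 3 1 -2 -5 -1] the sum is 3, so the block already carries bit 1 and would need a single one-unit change to carry bit 0. Because any coefficient in the block can absorb that change, the encoder is free to choose the one with the least distortion.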
Example 3. Suppose the block size is 5, the non-zero AC coefficients before rounding are -0.6994, 0.8534, 1.7352, 1.6229, -2.6861, and after rounding the DCT coefficients become -1, 1, 2, 2, -3.
For the given block, the rounding errors are -0.3006, 0.1466, 0.2648, 0.3771, -0.3139. If the modifications are made following Eq. 8, the modified coefficients are (-1-1), (1+1), (2-1), (2-1), (-3+1), that is, -2, 2, 1, 1, -2. The errors between the original coefficients (before rounding) and the modified coefficients are then -1.3006, 1.1466, -0.7352, -0.6229, and 0.6861. Clearly, the best candidate is the second-to-last (fourth) coefficient, whose error has the smallest magnitude, 0.6229.
So, the proposed method selects the second-to-last coefficient, because it produces the least distortion among all the coefficients, and modifies it to hide the secret message bit.
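The candidate search in this example can be reproduced with a short Python sketch. The modification rule below is an assumption inferred from the numbers in the example, not a transcription of Eq. 8: each coefficient is moved one unit toward its unrounded value, unless that would turn it into zero, in which case it is moved one unit away from zero. All function names are hypothetical.

```python
def candidate_change(unrounded, rounded):
    """Value a coefficient would take if it were chosen to flip the block parity.

    Assumed rule: step one unit toward the unrounded value, but never create a zero.
    """
    step = 1 if unrounded > rounded else -1
    if rounded + step == 0:
        step = -step
    return rounded + step


def mde_modify_block(unrounded_block, rounded_block):
    """Flip the parity of a block by changing the coefficient with the least error.

    Returns (stego_block, best_index, errors), where errors[i] is the modified
    value minus the unrounded value for candidate i.
    """
    modified = [candidate_change(u, c) for u, c in zip(unrounded_block, rounded_block)]
    errors = [m - u for m, u in zip(modified, unrounded_block)]
    best = min(range(len(errors)), key=lambda i: abs(errors[i]))
    stego = list(rounded_block)
    stego[best] = modified[best]
    return stego, best, errors


unrounded = [-0.6994, 0.8534, 1.7352, 1.6229, -2.6861]
rounded = [-1, 1, 2, 2, -3]
stego, best, errors = mde_modify_block(unrounded, rounded)
print(best, stego, [round(e, 4) for e in errors])
```

With the values of the example, the sketch picks index 3 (the fourth coefficient) and changes it from 2 to 1, so the block sum goes from 1 (odd) to 0 (even). A matrix-embedding scheme would instead be forced to change whichever position its syndrome dictates, for instance the first coefficient with an error of magnitude 1.3006.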
IV Comparison of distortion between F5/MME and MDE
In general, there is a 50% chance that the MME method produces less distortion than the F5 algorithm; in the remaining cases, the F5 algorithm happens to modify the coefficient in a way that produces the same distortion as MME. This follows from the logic of uniform distribution. As the secret message bits are either 0 or 1, in many cases the coefficients do not need to be modified at all, because they already represent the secret message bit.
In addition, both F5 and MME compute a fixed position for the candidate coefficient to modify. Another coefficient might produce less distortion than the candidate coefficient, but the F5 and MME methods cannot pick that coefficient to modify and hide data. In contrast, the proposed MDE method finds the candidate coefficient that produces the least amount of distortion after hiding the secret message bit.
Therefore, based on the above discussion, the MDE method can outperform both the F5 and MME methods by producing less distortion than either. In sum, MDE produces less error, so the probability of detecting the existence of a secret message is lower for MDE than for F5. The following section shows experimental comparisons between MDE and the F5 method, as F5 is the base method.
V Experimental Results and Comparisons
V-A Analysis using error probability
An image database was used to test both the F5 method and the proposed MDE method. The database contains over 1,173 images. A Support Vector Machine (SVM) was trained using the features of the original images from the database and then tested on those images after they were modified using the F5 and MDE methods. The results for F5 and MDE were checked separately to compare the error probability. A higher SVM error probability for one method indicates that the method has better strength against steganalysis.
The error probability can be defined using the following equation:
$$P_E = \frac{1}{2}\left(P_{FP} + P_{MD}\right)$$
where $P_E$ is the error probability, and $P_{FP}$ and $P_{MD}$ are the probabilities of false positive and missed detection, respectively. Note that if a method reaches 50% error probability, the SVM is unable to determine whether any secret message is hidden in the image. Therefore, an error probability of 50% is desired.
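As a small illustration of this measure, the sketch below computes an error probability from classifier counts. It assumes the common definition in which the false-positive rate on cover images and the missed-detection rate on stego images are averaged with equal weight; the function name, its parameters, and the counts in the usage line are hypothetical and are not taken from the experiments.

```python
def error_probability(false_positives, missed_detections, n_cover, n_stego):
    """Average of the false-positive and missed-detection rates (assumed definition).

    false_positives:   cover images wrongly flagged as stego
    missed_detections: stego images wrongly accepted as cover
    """
    if n_cover <= 0 or n_stego <= 0:
        raise ValueError("both test sets must be non-empty")
    p_fp = false_positives / n_cover
    p_md = missed_detections / n_stego
    return 0.5 * (p_fp + p_md)


# Hypothetical counts: a detector that is wrong on half of each class is at chance.
print(error_probability(50, 50, 100, 100))
```

Under this definition, a value of 0.5 corresponds to random guessing by the classifier and a value near 0 corresponds to reliable detection, which is why larger values favor the embedding method.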
The proposed method and the F5 method were compared at two JPEG image quality factors (QF), QF = 50 and QF = 75. Information is lost during the image compression process, and the amount of loss is characterized by the quality factor. A higher quality factor means less information loss; for example, QF = 50 means more information loss than QF = 75. At QF = 50, the F5 algorithm produced a smaller steganalysis error probability than the proposed MDE algorithm. For example, with 5% data hiding capacity, the F5 algorithm produced a steganalysis error probability of 23.0835%, while the proposed MDE method produced 44.8041% (see Table I). Again, a higher error probability means the method is less detectable.
Likewise, with 10% data hiding capacity and QF = 50, the F5 algorithm produced a steganalysis error probability of 4.5997%, while the proposed MDE method produced 33.0494%. With 15% data hiding capacity and QF = 50, the F5 algorithm produced 2.0443% and the proposed method produced 18.9949%. With 20% data hiding capacity and QF = 50, the F5 algorithm produced 0.5111% and the proposed MDE method produced 4.4293% (see Table I).
Similarly, with 5% data hiding capacity and QF = 75, the F5 algorithm produced a steganalysis error probability of 18.3986%, and the proposed MDE method produced 44.9744%. With 10% data hiding capacity and QF = 75, the F5 algorithm produced 2.1295% and the proposed MDE method produced 33.3049%. With 15% data hiding capacity and QF = 75, the F5 algorithm produced 0.6814% and the proposed MDE method produced 17.4617%. Finally, with 20% data hiding capacity and QF = 75, the F5 algorithm produced 0.2555% and the proposed method produced 3.8330% (see Table I).
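|Table I: Steganalysis by Error Probability (EP), in %|
|Quality factor|Method|5% capacity|10% capacity|15% capacity|20% capacity|
|QF = 50|F5|23.0835|4.5997|2.0443|0.5111|
|QF = 50|MDE|44.8041|33.0494|18.9949|4.4293|
|QF = 75|F5|18.3986|2.1295|0.6814|0.2555|
|QF = 75|MDE|44.9744|33.3049|17.4617|3.8330|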
V-B Analysis using embedding rate
Both the MDE and F5 methods were tested with a Support Vector Machine, and the following comparison is based on the resulting steganalysis detection results. During performance testing, the error probability and the embedding rate were considered at both QF = 50 and QF = 75. At both quality factors, the proposed MDE method achieved better performance than the F5 method. A distortion function was defined following the guidelines provided by Filler and Fridrich:
$$D(x, y) = \sum_{i} \rho_i(x_i, y_i)$$
where $x$ and $y$ are the cover and stego coefficients, and $\rho_i$ are the cost functions satisfying $\rho_i(x_i, y_i) = 0$ whenever $x_i = y_i$.
V-C Visual analysis
Visual analysis is another important steganalysis technique that helps to determine whether an image was artificially altered [29, 8]. Because the proposed method produces less distortion than the F5 method, the stego image produced by the proposed MDE method does not show any visual artifacts, and visual inspection does not reveal the existence of a secret message (see Fig. 1 and Fig. 2).
VI Conclusion
The results of this study suggest that the proposed MDE method resists steganalysis better than the F5 method. In steganography, resistance to attacks is more important than capacity, although this method also has better hiding capacity. The main advantage of the proposed method is the freedom to modify any coefficient, which results in better stego image quality and higher resistance against attacks.
As for future study, other steganographic algorithms should be included in the comparison. In addition, more feature-based steganalysis should be considered, as more features may increase the chance of detection. Moreover, a future study could look for a way to increase the hiding capacity while keeping the distortion low.
References
[1] (2019) Secure mobile computing authentication utilizing hash, cryptography and steganography combination. Journal of Information Security and Cybercrimes Research (JISCR) 2 (1).
[2] (2008) Secure steganographic method. pp. 141–145.
[3] (2009) An improved steganography covert channel. In International Conference on Advanced Software Engineering and Its Applications, pp. 176–187.
[4] (2010) Concurrent covert communication channels. In Advances in Computer Science and Information Technology, pp. 203–213.
[5] (2011) Steganographic covert communication channels and their detection. MS thesis, Kent State University.
[6] (1996) Techniques for data hiding. IBM Systems Journal 35 (3.4), pp. 313–336.
[7] (2019) Effect of JPEG quality on steganographic security. In Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, pp. 47–56.
[8] (2008) Enhancing steganography in digital images. In 2008 Canadian Conference on Computer and Robot Vision, pp. 326–332.
[9] (2010) Minimizing embedding impact in steganography using trellis-coded quantization. In Media Forensics and Security II, Vol. 7541, pp. 754105.
[10] (2001) Detecting LSB steganography in color, and gray-scale images. IEEE MultiMedia 8 (4), pp. 22–28.
[11] Quantitative steganalysis of digital images: estimating the secret message length. Multimedia Systems 9 (3), pp. 288–302.
[12] (2002) Steganalysis of JPEG images: breaking the F5 algorithm. In International Workshop on Information Hiding, pp. 310–323.
[13] (2011) Steganalysis of content-adaptive steganography in spatial domain. In International Workshop on Information Hiding, pp. 102–117.
[14] (2013) Multivariate Gaussian model for designing additive distortion for steganography. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2949–2953.
[15] (2007) Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities. In Proceedings of the 9th Workshop on Multimedia & Security, pp. 3–14.
[16] (2004) Feature-based steganalysis for JPEG images and its implications for future design of steganographic schemes. In International Workshop on Information Hiding, pp. 67–81.
[17] (1998) Exploring steganography: seeing the unseen. Computer 31 (2), pp. 26–34.
[18] (2006) Modified matrix encoding technique for minimal distortion steganography. In International Workshop on Information Hiding, pp. 314–327.
[19] (2012) Uniform distribution of sequences. Courier Corporation.
[20] (2019) Binary image steganalysis based on histogram of structuring elements. IEEE Transactions on Circuits and Systems for Video Technology.
[21] (2007) Merging Markov and DCT features for multi-class JPEG steganalysis. In Security, Steganography, and Watermarking of Multimedia Contents IX, Vol. 6505, pp. 650503.
[22] (2001) Defending against statistical steganalysis. In USENIX Security Symposium, Vol. 10, pp. 323–336.
[23] (2019) Impact of steganography on JPEG file size. In 2019 27th Iranian Conference on Electrical Engineering (ICEE), pp. 1869–1873.
[24] (1996) Perceptual adaptive JPEG coding. In Proceedings of 3rd IEEE International Conference on Image Processing, Vol. 1, pp. 901–904.
[25] Less detectable JPEG steganography method based on heuristic optimization and BCH syndrome coding. In Proceedings of the 11th ACM Workshop on Multimedia and Security, pp. 131–140.
[26] (2001) F5 - a steganographic algorithm. In Information Hiding: 4th International Workshop, IH 2001, Pittsburgh, PA, USA, April 25-27, 2001. Proceedings, Vol. 2137, pp. 289.
[27] (2019) A natural steganography embedding scheme dedicated to color sensors in the JPEG domain.
[28] (2014) An analysis of LSB based image steganography techniques. In 2014 International Conference on Computer Communication and Informatics, pp. 1–4.
[29] (2004) Cyber warfare: steganography vs. steganalysis. Communications of the ACM 47 (10), pp. 76–82.
[30] (2018) A ROI-based high capacity reversible data hiding scheme with contrast enhancement for medical images. Multimedia Tools and Applications 77 (14), pp. 18043–18065.
[31] (2015) Defining embedding distortion for motion vector-based video steganography. Multimedia Tools and Applications 74 (24), pp. 11163–11186.