1 Introduction
Biclustering has become a prevalent and useful data mining technique among researchers for analyzing data. It has been applied to a wide variety of applications such as bioinformatics, information retrieval, text mining, dimensionality reduction, recommender systems, electoral data analysis, disease identification, association rule discovery in databases, and many more [1]. Among these, bioinformatics [2] [3] has taken particular advantage of biclustering for the analysis of gene expression data. During a biological process under different experimental conditions, genes are examined by their expression levels. The data is presented in matrix form, with rows representing genes and columns representing experimental conditions. The aim is to group genes and conditions into a submatrix to obtain crucial biological information, such as the identification of coregulated patterns among genes. A bicluster B can be represented as
$$B = \{\, b_{ij} \mid i \in I,\ j \in J \,\} \tag{1}$$
where $b_{ij}$ refers to the expression level of instance $i$ under sample $j$, $I \subseteq \{1, \dots, n\}$ and $J \subseteq \{1, \dots, m\}$, $n$ is the number of instances, and $m$ is the number of attributes. Biclustering involves finding the maximal submatrices in a data matrix with maximum coherency. Since biclustering is an NP-hard problem, various heuristic and metaheuristic approaches have been used in the literature to find better solutions [4].
The traditional clustering algorithms give equal importance to all the columns. These algorithms include k-means clustering [5] [6], self-optimal clustering [7], improved mountain clustering [8], fuzzy C-means clustering [9], unsupervised fuzzy clustering [10], etc. Each algorithm has its own advantage. Despite their usefulness, they are not very helpful in a variety of problems. For example, in gene expression analysis, not every gene takes part in every condition. Thus, biclustering, which captures combinatorial regulation and joint patterns of gene expression, is essential to understand the complex nature of genes. In [11], a plethora of solutions to perform biclustering has been presented. Each algorithm in this pool has its own distinctive approach, heuristic or statistical, with its own merits and demerits. No single approach can be expected to be well suited for all types of data. A given problem should therefore be tackled with several suitable algorithms and the best result retained. This generates the need for a comprehensive biclustering toolbox where various algorithms can be tested, validated, and visualized. Toolboxes can be compared in terms of the following:
Number of algorithms embedded in the toolbox.

Number of validation indices present for qualitative analysis of generated biclusters.

Number of visualization methods available for generated biclusters.

User-friendly interface of the toolbox.
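Table I: Summary of biclustering toolboxes in terms of available algorithms, validation indices, and visualization methods.
| Toolbox | Algorithms | Validation Indices | Visualization Methods | Platform |
|---|---|---|---|---|
| BicAT [12] | CC [23], ISA [27], OPSM [26], xMotif [30] | None | Heat Map [48] | Java |
| BiVisu [16] | | | | MATLAB |
| BicOverlapper 2.0 [13] | Visualization Toolbox | None | Venn-like Diagrams | R, Java |
| Expander [14] | SAMBA [46] | None | Heat Map | Java |
| BAT [17] | BiHEA [47] | Pairwise Gene Analysis | Heat Map, Numerical Matrix | Java |
| BiBench [18] | | Jaccard Index [40], F-measure | | Python |
| BiClust [19] | BiMax, CC, Plaid | Jaccard Index, Constant Variance | | R |
| BicNET [15] | BicNET | None | Biclustering Network Data | Java |
| MTBA [20] | | | Heat Map, Gene Plot | MATLAB |
| CoClust [21] | | None | | Python |
| BicPAMS [22] | | None | | Java |
| BIDEAL | | | | MATLAB |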
Based on the above-mentioned features, it can be summarized that a toolbox must be diverse in nature. In the past decade, the growing demand for biclustering algorithms has led to intense research on developing toolboxes for biclustering. This paper proposes a user-friendly toolbox named “BIDEAL”, which incorporates biclustering algorithms, validation indices, and visualization methods. Table I summarizes various biclustering toolboxes in terms of available algorithms, validity indices, and visualization methods. Considering the visualization methods, i.e. the presentation of results for the generated biclusters, BicAT [12], BicOverlapper 2.0 [13], Expander [14], and BicNET [15] provide only a single visualization method. On the other hand, BiVisu [16], BAT [17], BiBench [18], BiClust [19], MTBA [20], CoClust [21], BicPAMS [22], and BIDEAL have multiple methods of visualization. Among these, CoClust and BIDEAL offer the largest number of visualization methods. By default, BIDEAL provides bicluster results as a numerical matrix. Another important feature of a toolbox is the set of validation indices used to check the quality of the obtained biclusters. BiVisu, BAT, BiBench, and BiClust offer only one or two validation indices, whereas BIDEAL has six, the maximum among the listed toolboxes. A Graphical User Interface (GUI) that allows various algorithms to be executed on a single platform simplifies the process. The user-friendly interface of BIDEAL makes testing a new dataset easy without any prior knowledge of backend programming. On the other hand, BiBench, BiVisu, BiClust, CoClust, and MTBA require some familiarity with programming. Moreover, BicAT only allows the execution of algorithms with default parameter settings, which is a constraint, whereas BIDEAL allows these parameters to be changed.
Contributions: This paper introduces the proposed BIDEAL toolbox, its necessity, and importance in comparison with other existing toolboxes in literature. Table II summarizes the comparison of the features available in BIDEAL with respect to existing toolboxes in the literature. In summary, the features of BIDEAL are as follows:

It is developed to integrate the largest number of biclustering algorithms, validation indices, and visualization methods (over existing toolboxes) on a single platform.

It also includes preprocessing methods.

It has a more user-friendly interface than other existing biclustering toolboxes.

To demonstrate the usefulness of BIDEAL, it has been evaluated on four standard datasets and the resulting validation indices have been compared.
To the best of our knowledge, no existing biclustering toolbox has all these features incorporated on a single platform.
The paper is organized as follows: Section 2 presents a brief introduction to the biclustering algorithms embedded in BIDEAL, Section 3 describes the validation indices, Section 4 illustrates the GUI of BIDEAL, and Section 5 provides the results on four standard datasets using BIDEAL. Finally, Section 6 concludes the paper.
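Table II: Comparison of the features available in BIDEAL with existing toolboxes.
| Features | BicAT | BiVisu | BicOverlapper 2.0 | Expander | BAT | BiBench | BiClust | BicNET | MTBA | CoClust | BicPAMS | BIDEAL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No. of Algorithms | 5 | 1 | 1 | 1 | 1 | 4 | 3 | 1 | 12 | 3 | 5 | **17** |
| No. of Validation Indices | 0 | 2 | 0 | 0 | 1 | 2 | 2 | 0 | 4 | 0 | 0 | **6** |
| No. of Visualization Methods | 1 | 2 | 1 | 1 | 3 | 3 | 3 | 2 | 2 | 4 | 2 | **5** |
| Graphical User Interface (GUI) | Yes | No | Yes | Yes | Yes | No | No | Yes | No | No | Yes | Yes |
The values shown in bold represent the best feature among all the toolboxes.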
2 BIDEAL: Ready for use Biclustering Algorithms
This section provides a brief overview of biclustering algorithms embedded in BIDEAL.
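Cheng and Church (CC) [23] proposed an algorithm to process expression data on the basis of the Mean Squared Residue (MSR) score, which for a bicluster with row set $I$ and column set $J$ is given as
$$\mathrm{MSR}(I, J) = \frac{1}{|I|\,|J|} \sum_{i \in I} \sum_{j \in J} \left( b_{ij} - b_{iJ} - b_{Ij} + b_{IJ} \right)^2 \tag{2}$$
where $b_{iJ}$, $b_{Ij}$, and $b_{IJ}$ denote the row mean, the column mean, and the overall mean of the bicluster, respectively. MSR measures the coherency of genes and conditions using these mean values and is used to extract biclusters. Another effective algorithm, FLexible Overlapped biClustering (FLOC) [24], performs probabilistic moves to find overlapped biclusters, which are further refined using the MSR score, and is designed to overcome the effect of missing values in biclusters. Missing values often create random disturbances which affect the quality and slow down the identification of biclusters. Compared with CC, the biclusters acquired by FLOC are reported to be better, i.e. larger submatrices with smaller MSR.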
Dhillon [25] used Bipartite Spectral Graph Partitioning (BSGP) to model the data matrix as a bipartite graph. It is based on an exhaustive bicluster enumeration approach, which tries to find minimum-cut vertex partitions in a bipartite graph between rows and columns. It is quite expensive in terms of time and memory. The BSGP approach can be represented as
(3) 
The Order Preserving SubMatrices (OPSM) [26] algorithm finds submatrices whose expression levels are in strictly increasing linear order. The algorithm uses a heuristic approach for biclustering. A submatrix is said to be order preserving if, under some permutation of the conditions, the expression values of every gene are linearly increasing or decreasing.
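The order-preserving property described above can be checked directly for a candidate submatrix. The following is a minimal sketch, not the OPSM search heuristic itself: it only tests whether a single column permutation makes every row strictly increasing (the decreasing case is the reversed permutation). The function name is hypothetical and not part of BIDEAL.

```python
from typing import List, Optional, Sequence


def order_preserving_permutation(rows: Sequence[Sequence[float]]) -> Optional[List[int]]:
    """Return a column order under which every row is strictly increasing, or None."""
    if not rows:
        return None
    first = rows[0]
    # The only possible candidate is the order that sorts the first row.
    order = sorted(range(len(first)), key=lambda j: first[j])
    for row in rows:
        values = [row[j] for j in order]
        # Any tie or inversion breaks the strict ordering.
        if any(a >= b for a, b in zip(values, values[1:])):
            return None
    return order


if __name__ == "__main__":
    submatrix = [[1.0, 3.0, 2.0], [10.0, 30.0, 20.0], [0.1, 0.5, 0.2]]
    print(order_preserving_permutation(submatrix))
```

Because the ordering is strict, the first row fixes the only admissible permutation, so the check is linear in the size of the submatrix after one sort.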
Another approach, the Iterative Signature Algorithm (ISA) proposed by Bergmann et al. [27], is based on coherently overlapped biclusters, also referred to as Transcription Modules (TM), and extracts biclusters from the gene expression data matrix by iterative search.
In [28], Kluger et al. proposed a spectral technique known as kSpectral to find biclusters based on the eigenvectors of the data matrix. First, the dataset is normalized and then singular value decomposition is applied to the microarray data, where piecewise-constant eigenvectors reveal the checkerboard patterns in the submatrix. Finally, k-means clustering is applied to obtain the checkerboard structures from the data matrix.
In [29], the authors presented the information-theoretic (ITL) formulation for biclustering. In this formulation, an optimization approach is followed where the numbers of row and column clusters are constraints and the task is to maximize the mutual information between the clustered random variables. It can reduce the problem of high dimensionality and sparsity.
Murali et al. [30] proposed a representation for gene expression data called conserved gene expression motifs, or xMotifs. The algorithm tries to find large conserved gene expression motifs in the given discretized data matrix, using a greedy approach that looks for conserved rows. A submatrix is said to be a conserved motif if the expression level of each gene is consistent within the submatrix. Comparing distinct motifs across conditions reveals genes that are conserved in multiple conditions but are in different states in different conditions.
In Plaid [31], a bicluster is assumed to follow a statistical model, and binary least squares is used to fit the bicluster membership parameters. In this model, the data matrix is considered as a superposition of layers, where each layer is a subset of genes and conditions of the data matrix. The fit of the data to the plaid model can be expressed as
(4) 
Binary Inclusion Maximal (BiMax) is based on a fast divide-and-conquer approach [32]. It tries to find all the inclusion-maximal biclusters, i.e. biclusters which contain only 1s and cannot be extended by any further row or column. The algorithm requires discretization of the gene expression matrix into a binary matrix by choosing a threshold.
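To make the binarization step and the inclusion-maximality condition concrete, the sketch below thresholds a small matrix and tests whether a given all-ones submatrix can still be extended. It is an illustrative check only, not the divide-and-conquer enumeration of BiMax; the function names and the choice of `>=` for the threshold comparison are assumptions.

```python
from typing import List, Sequence, Set


def binarize(data: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Map each value to 1 if it reaches the threshold, else 0 (assumed convention)."""
    return [[1 if value >= threshold else 0 for value in row] for row in data]


def is_all_ones(binary: Sequence[Sequence[int]], rows: Set[int], cols: Set[int]) -> bool:
    """True if every selected cell of the binary matrix equals 1."""
    return all(binary[i][j] == 1 for i in rows for j in cols)


def is_inclusion_maximal(binary: Sequence[Sequence[int]], rows: Set[int], cols: Set[int]) -> bool:
    """True if the submatrix is all ones and no extra row or column keeps it all ones."""
    if not rows or not cols or not is_all_ones(binary, rows, cols):
        return False
    for i in range(len(binary)):
        if i not in rows and all(binary[i][j] == 1 for j in cols):
            return False  # row i could be added
    for j in range(len(binary[0])):
        if j not in cols and all(binary[i][j] == 1 for i in rows):
            return False  # column j could be added
    return True


if __name__ == "__main__":
    expression = [[0.9, 0.8, 0.1], [0.7, 0.6, 0.2], [0.1, 0.9, 0.3]]
    binary = binarize(expression, threshold=0.5)
    print(binary, is_inclusion_maximal(binary, {0, 1}, {0, 1}))
```

The threshold directly controls how many 1s survive, which is why the number of biclusters BiMax reports is sensitive to this preprocessing choice.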
Large Average SubMatrix (LAS) [33] is a statistically motivated algorithm which uses a Gaussian null model for gene expression data. It finds the bicluster with the largest significance score, which is defined as
(5) 
The mean of the discovered submatrix is then subtracted from its elements to form a residual matrix. The search is repeated iteratively on the residual matrix until the significance score (5) falls below a predefined threshold.
Hochreiter et al. [34] presented a multiplicative-model biclustering algorithm, Factor Analysis for Bicluster Acquisition (FABIA), that takes linear dependencies between genes and conditions into account. In this model, the rows of a bicluster are multiples of each other, and so are its columns. FABIA models the data matrix as the sum of biclusters and additive noise. Here, the linear dependency of subsets of rows and columns can be described by the outer product $\lambda z^{T}$ of two vectors. The overall model is given by
$$X = \sum_{i=1}^{p} \lambda_i z_i^{T} + \Upsilon \tag{6}$$
where $p$ is the number of biclusters and $\Upsilon$ is the additive noise.
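As a small illustration of the multiplicative model, the sketch below builds a noise-free data matrix as a sum of outer products of sparse row and column vectors. The function names and the toy vectors are hypothetical; this is neither part of BIDEAL nor the FABIA fitting procedure, only the forward model under the stated assumption of zero noise.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def outer(u: Sequence[float], v: Sequence[float]) -> Matrix:
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def sum_of_biclusters(row_vectors: Sequence[Sequence[float]],
                      col_vectors: Sequence[Sequence[float]]) -> Matrix:
    """Sum the outer products of paired row and column vectors (no noise term)."""
    if len(row_vectors) != len(col_vectors) or not row_vectors:
        raise ValueError("need the same positive number of row and column vectors")
    n_rows, n_cols = len(row_vectors[0]), len(col_vectors[0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for u, v in zip(row_vectors, col_vectors):
        layer = outer(u, v)
        for i in range(n_rows):
            for j in range(n_cols):
                total[i][j] += layer[i][j]
    return total


if __name__ == "__main__":
    # One bicluster covering genes 0 and 1 and condition 1; zeros mark non-members.
    print(sum_of_biclusters([[1.0, 2.0, 0.0]], [[0.0, 3.0]]))
```

In the toy example, the two member rows are multiples of each other, which is the linear dependency the multiplicative model is meant to capture.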
In [35], bit patterns are extracted from the data matrix using a two-phase process known as the BitBit algorithm. The first phase includes a novel encoding process that divides the columns of the data matrix into segments of a certain length determined by the minimum number of columns. In the second phase, biclustering of bit patterns takes place using selective search. Each pair of rows generates a pattern. In BitBit, the comparison between rows takes place at the bit level. To avoid excessive computation, BiSim [36] uses an iterative approach instead of the divide-and-conquer approach of BiMax, avoiding recursion and additional traversals of the matrix.
Wang et al. [37] proposed the Modular Singular Value Decomposition Multi-Objective Evolutionary biclustering (MSVD) algorithm. MSVD splits the gene expression data matrix into a set of submatrices of equal dimensions in a non-overlapping manner. Then, it projects the data onto the desired number of eigenvalues and applies k-means clustering to cluster them.
Another algorithm, QUalitative BIClustering (QUBIC) [38], which is based on a graph-theoretic approach, is also embedded in BIDEAL. In QUBIC, the expression level of genes under multiple conditions is expressed in a qualitative or semi-qualitative manner as an integer value.
Tchagang et al. proposed ROBA [39], which uses basic linear algebra techniques. There are three main steps in this algorithm. The first step involves preprocessing the data to handle missing values and noise. The second step decomposes the given data matrix into binary matrices. The last step involves identification based on the type of bicluster.
3 BIDEAL: Accessible Validation Indices for Performance Measures
Various validation indices are used as performance measures to check the quality of biclusters, as described in the following subsections.
3.1 Jaccard Index
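The Jaccard index [40] compares the biclusters obtained by applying two biclustering algorithms by measuring the overlap between them. It takes the value 0 if the biclusters share nothing and 1 if they are identical, with intermediate values indicating partial overlap. The Jaccard index is defined as
$$J(B_1, B_2) = \frac{|B_1 \cap B_2|}{|B_1 \cup B_2|} \tag{7}$$
where $B_1$ and $B_2$ are the two biclusters being compared.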
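A minimal sketch of the Jaccard index for two biclusters follows. It assumes that each bicluster is given as a pair of row and column index lists and that overlap is counted at the level of matrix cells; this representation and the function names are assumptions made for illustration and may differ from how BIDEAL computes the index.

```python
from typing import Iterable, Set, Tuple

Bicluster = Tuple[Iterable[int], Iterable[int]]


def bicluster_cells(rows: Iterable[int], cols: Iterable[int]) -> Set[Tuple[int, int]]:
    """All (row, column) cells covered by a bicluster."""
    col_list = list(cols)
    return {(i, j) for i in rows for j in col_list}


def jaccard_index(first: Bicluster, second: Bicluster) -> float:
    """Size of the intersection of the two cell sets divided by the size of their union."""
    cells_a = bicluster_cells(*first)
    cells_b = bicluster_cells(*second)
    union = cells_a | cells_b
    if not union:
        return 0.0  # two empty biclusters: treated as dissimilar by convention here
    return len(cells_a & cells_b) / len(union)


if __name__ == "__main__":
    print(jaccard_index(([0, 1], [0, 1]), ([1, 2], [0, 1])))
```

In the example, the two biclusters share one row over the same two columns, so 2 of the 6 covered cells overlap.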
3.2 SB Score
The differential coexpression ranking score, a.k.a. SB score, was proposed in [41]. Consider two biclusters and , where is formed by gene under the first set of conditions and is formed by the same gene under a second set of conditions. Chia et al. proposed an algorithm to compare the goodness of gene w.r.t. two non-identical sets of conditions. If is a good gene, then there will be coexpression between gene and the first set of conditions and differential coexpression between gene and the second set of conditions. The differential coexpression of can be measured as
(8) 
where is used to offset the large ratios.
3.3 Constant Variance
In [7], the variance of the genes/conditions is considered, where the variance is the average of the sum of Euclidean distances between the rows and columns of the bicluster. The higher the variance, the lower the quality of the bicluster. The variance is given by
(9) 
3.4 Sign Variance
3.5 Hausdorff Distance
The Hausdorff distance [42] calculates the distance between a pair of submatrices obtained from the gene expression data matrix. It is the largest distance from an element of one bicluster to the nearest element of the other, and thus signifies dissimilarity. Mathematically, it can be written as
$$H(B_1, B_2) = \max\{\, h(B_1, B_2),\ h(B_2, B_1) \,\}, \qquad h(B_1, B_2) = \max_{a \in B_1} \min_{b \in B_2} \lVert a - b \rVert \tag{10}$$
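The Hausdorff distance can be sketched for two small point sets as follows. The sketch assumes that each bicluster is represented as a list of equal-length numeric vectors (for example, its rows) and that the Euclidean distance is used; both choices and the function names are assumptions for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def directed_hausdorff(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Largest distance from a point in `source` to its nearest point in `target`."""
    if not source or not target:
        raise ValueError("both point sets must be non-empty")
    return max(min(math.dist(p, q) for q in target) for p in source)


def hausdorff_distance(first: Sequence[Point], second: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(first, second), directed_hausdorff(second, first))


if __name__ == "__main__":
    a = [(0.0, 0.0), (1.0, 0.0)]
    b = [(0.0, 0.0), (4.0, 0.0)]
    print(hausdorff_distance(a, b))
```

The directed distance is not symmetric, which is why the maximum over both directions is taken; a value of zero means every point of each set coincides with a point of the other.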
3.6 Mean Squared Residue
To calculate the mean squared residue, the mean square error (MSE) of each bicluster is calculated [23]. The overall MSE is then obtained by taking the mean of the individual values.
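The computation described above can be sketched in a few lines. The sketch follows the usual Cheng and Church definition, in which the residue of a cell is its value minus the row mean and the column mean plus the overall mean of the bicluster; the function names and the representation of a bicluster as row and column index lists are assumptions for illustration.

```python
from typing import Sequence, Tuple


def mean_squared_residue(data: Sequence[Sequence[float]],
                         rows: Sequence[int],
                         cols: Sequence[int]) -> float:
    """Mean squared residue of the bicluster defined by the given rows and columns."""
    n_r, n_c = len(rows), len(cols)
    if n_r == 0 or n_c == 0:
        raise ValueError("bicluster must contain at least one row and one column")
    sub = [[data[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r
    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_r * n_c)


def overall_msr(data: Sequence[Sequence[float]],
                biclusters: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Mean of the per-bicluster MSR values."""
    scores = [mean_squared_residue(data, rows, cols) for rows, cols in biclusters]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    additive = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
    print(mean_squared_residue(additive, [0, 1, 2], [0, 1, 2]))
```

A perfectly additive bicluster, where rows differ only by a constant shift, has a residue of zero, so lower values indicate more coherent biclusters.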
4 BIDEAL: Key Features and GUI
BIDEAL integrates various biclustering algorithms into a standalone graphical user interface (GUI) application developed using MATLAB. It runs on both Windows and Linux operating systems. BIDEAL includes several functions to preprocess the raw data and to validate and visualize the biclusters. The key features of BIDEAL are as follows:
4.1 Data Preprocessing
BIDEAL includes four preprocessing methods, i.e. filtering, binarization, discretization, and normalization. Filtering is used to eliminate the effect of Not a Number (NaN) spots and missing values from the data. Binarization is used to convert a numerical feature vector into a Boolean one; it is mostly useful for downstream probabilistic estimators which assume that the input data is distributed according to a multivariate Bernoulli distribution. Discretization, a.k.a. quantization or binning, is used to transform continuous features into discrete values. In some datasets, continuous features are not linearly correlated with the target and cannot be handled by feature selection methods. In such cases, obtaining an interpretable explanation of such features is not easy. However, this type of data may benefit from discretization because it transforms a dataset of continuous attributes into one with only nominal attributes. Normalization is used for scaling the individual samples to have unit norm. In general, min-max and z-score normalization are used when the data come from a normal distribution. However, biomedical data and most clinical research data do not follow the normal distribution because they are mostly skewed. For this purpose, logarithmic transformation, bistochastization, and independent rescaling of rows and columns are used. The log transformation decreases the variability of the data. Bistochastization repeatedly normalizes the matrix until convergence so that all rows and columns have the same mean value, whereas in independent row and column normalization the rows sum to one constant and the columns sum to a different constant [28].
4.2 Largest Number of Biclustering Algorithms
For bicluster generation, 17 biclustering algorithms have been embedded in BIDEAL, the largest number among all the toolboxes listed in Table II. This provides the flexibility to select biclustering algorithms according to the nature of the data. Having all algorithms available on a single platform allows the data to be analyzed with minimal effort.
4.3 Initial Parameter Setting of Algorithms
Without prior knowledge of the algorithms, parameter setting is quite challenging for a novice user. BIDEAL initializes the parameters with the values provided in the original published work, which users can easily change if needed.
4.4 Robust Bicluster Generation
BIDEAL offers several ways to ensure smooth and robust bicluster generation. For example, the filtering option is available to reduce the effect of NaN and missing values present in the dataset.
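One possible form of such a filtering step is sketched below. The paper does not specify the filtering rule used in BIDEAL, so the policy here is an assumption: rows with too many missing entries are dropped, and the remaining missing entries are replaced by the row mean. The function names and the `max_missing_fraction` parameter are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def is_missing(value: Optional[float]) -> bool:
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def filter_missing(data: Sequence[Sequence[Optional[float]]],
                   max_missing_fraction: float = 0.5) -> List[List[float]]:
    """Drop rows with too many missing values; impute the rest with the row mean."""
    cleaned: List[List[float]] = []
    for row in data:
        if not row:
            continue  # skip empty rows
        present = [v for v in row if not is_missing(v)]
        missing_fraction = 1 - len(present) / len(row)
        if not present or missing_fraction > max_missing_fraction:
            continue  # too little information left in this row
        row_mean = sum(present) / len(present)
        cleaned.append([row_mean if is_missing(v) else v for v in row])
    return cleaned


if __name__ == "__main__":
    nan = float("nan")
    raw = [[1.0, nan, 3.0], [nan, nan, 5.0], [2.0, 4.0, 6.0]]
    print(filter_missing(raw))
```

Dropping heavily incomplete rows before imputation keeps a few observed values from dominating an entire gene profile.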
4.5 Identification of Cluster Type
BIDEAL offers validation indices to determine the type of biclusters. For example, the constant variance can identify constant biclusters, whereas sign variance identifies biclusters with coherent sign changes on rows and columns.
4.6 Similarity Measures
BIDEAL offers two validation indices, i.e. the Jaccard index and the Hausdorff distance, to measure the similarity and dissimilarity, respectively, between two biclusters. The value of the Jaccard index of a particular biclustering algorithm varies from 0 to 1 depending upon the level of similarity. The Hausdorff distance, widely used in several applications, can also measure the distance between two distinct biclusters. For example, on the Yeast dataset, Jaccard index values were calculated w.r.t. the CC algorithm, and the results obtained from the other algorithms were found to be dissimilar from CC, as the Jaccard index values were very low for all other algorithms.
4.7 UserFriendly Interface
BIDEAL offers a user-friendly GUI which is easy to use for bicluster analysis, including generation, visualization, and validation. The unique features of this interface are:

BIDEAL is a self-contained, concise toolbox with all the relevant information present in it. It provides immediate visual results and shows the effect of each action.

In many cases, the installation of a toolbox depends on other components, such as the language runtime, which in general are not shipped with the toolbox package. To ease installation, the standalone executable files of BIDEAL are packaged with the MATLAB Compiler Runtime. This enables the user to just click and install the ready-to-use biclustering algorithms.
4.8 Implementation and GUI
BIDEAL has been developed using MATLAB, which integrates various features into a standalone application. The GUI of the BIDEAL toolbox involves the following steps for bicluster generation, validation, and visualization:

The home page of BIDEAL is shown in Fig. 1. First, the dataset should be loaded. It can be either a sample dataset or a user-defined dataset.

The data can be preprocessed using filtering, binarization, normalization, or discretization.

Select the required algorithm to generate biclusters. The user is prompted for input parameters; otherwise, BIDEAL uses the default values.

The generated results can be saved in a .mat file.

Click the Bicluster Visualization button on the home page to visualize the biclusters. Any of the three options available on the visualization page, i.e. heat map, cluster plot, or gene profile, can be clicked to visualize the result.

Click the Bicluster Quality Index button to access the validation indices. The validation page displays the result for an individual bicluster or for all biclusters.

Press the Reset button to return to the home page.
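Table III: Number of biclusters obtained using the biclustering algorithms embedded in BIDEAL.
| Datasets | CC | BSGP | OPSM | ISA | kSpectral | ITL | xMotif | Plaid | FLOC | BiMax | LAS | FABIA | BitBit | BiSim | MSVD | QUBIC | ROBA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Yeast [23] | 100 | 10 | 16 | 16 | 0 | 1 | 97 | 4 | 20 | 75 | 20 | 2 | 212 | 1547 | 13 | 10 | 10104 |
| ALL vs. AML [43] | 1 | 0 | 37 | 500 | 0 | 100 | 89 | 4 | 20 | 100 | 52 | 5 | 0 | 0 | 100 | 0 | 32591 |
| GDS205 [44] | 1 | 7 | 7 | 13 | 6 | 0 | 5 | 0 | 20 | 11 | 5 | 5 | 0 | 1 | 3 | 10 | 3925 |
| GDS301 [45] | 1 | 0 | 10 | 0 | 0 | 100 | 39 | 0 | 20 | 100 | 5 | 4 | 1 | 1 | 1 | 0 | 0 |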
5 BIDEAL: Testing and Validations on Benchmark Datasets
Generating patterns or biclusters for gene expression profiling on a single platform reduces user effort. Hence, BIDEAL offers a user-friendly interface to reduce the burden of bicluster formation. In this section, experiments and validation on four benchmark datasets using BIDEAL are presented. The four datasets used are the Saccharomyces cerevisiae cell cycle dataset (Yeast) [23] with genes and conditions, the Leukemia (ALL vs. AML) dataset [43] with genes and conditions, the Mammary tissue profile dataset (GDS205) [44] with genes and conditions, and the Ligand screen in B cells dataset (GDS301): Epstein-Barr virus-induced molecule-1 [45] with genes and conditions. The biclusters formed on these four benchmark datasets are further validated using the validation indices available in BIDEAL, as depicted in Fig. 2. Table III tabulates the number of biclusters obtained using the biclustering algorithms embedded in BIDEAL. Since the Yeast [23] and ALL vs. AML [43] datasets are already preprocessed, only GDS205 and GDS301 were preprocessed before executing the biclustering algorithms. In the following subsections, the findings are discussed in detail.
5.1 Saccharomyces Cerevisiae Cell Cycle (Yeast) Dataset
The Yeast dataset [23] comprises genes and conditions. The objective of this dataset is the identification of genes whose mRNA levels are regulated by the cell cycle. The number of biclusters generated on the Yeast dataset using BIDEAL is reported in Table III. The table shows that, among all algorithms, ROBA generates the highest number of biclusters whereas kSpectral fails to produce any bicluster, because it did not find any distinctive checkerboard pattern in the Yeast dataset. ROBA, on the other hand, uses simple linear algebraic methods instead of complex optimization and extracted the highest number of biclusters, i.e. 10104. Since the ranking of biclustering algorithms is application specific, their utility cannot be measured by the number of biclusters alone; for example, BiSim forms 1547 biclusters whereas ITL and FABIA extracted only 1 and 2 biclusters, respectively. However, all of them have their own biological significance. CC forms 100 biclusters which cover approximately genes and approx. of conditions. Fig. 3 shows a sample heat map and gene plot for a bicluster generated on the Yeast dataset using the CC algorithm. BSGP and QUBIC reported 10 biclusters each, whereas FABIA and Plaid had very few biclusters with fewer genes and conditions. BitBit gave 212 biclusters while kSpectral failed to produce any bicluster, which signifies that its model does not fit the given dataset. OPSM and ISA reported the same number of biclusters. Considering the quality of the obtained biclusters, the biclusters obtained using BiSim, FABIA, and kSpectral had no similarity w.r.t. CC in terms of the Jaccard index. On the other hand, ITL, Plaid, BitBit, and ISA had very low similarity. BSGP and MSVD gave higher similarity, while ROBA had the maximum similarity among all. According to the sign variance metric, the biclusters obtained using CC, Plaid, ISA, and FABIA were less coherent, while ROBA, BSGP, and BiMax gave strongly coherent biclusters. LAS, BiSim, and MSVD gave biclusters of average coherence. When measuring the quality of biclusters using constant variance, it was inferred that BSGP, MSVD, and BiMax formed better biclusters, while ISA and Plaid gave higher values of constant variance, indicating lower quality biclusters. LAS, CC, BitBit, ITL, and FLOC gave biclusters of average quality.
5.2 Leukemia (ALL vs. AML) Dataset
The Leukemia dataset comprises two subtypes of leukemia, i.e. Acute Myeloid Leukemia (AML) and Acute Lymphoblastic Leukemia (ALL). It has genes and conditions. For the ALL vs. AML dataset, ROBA again reported the highest number of biclusters, i.e. 32591, due to its ability to extract more than one type of bicluster from a given dataset. As mentioned earlier, different biclustering algorithms are able to extract specific patterns from a dataset. For example, BSGP works better when the dataset can be modelled efficiently using a bipartite graph, whereas kSpectral is well known for extracting checkerboard patterns in data. For this dataset, neither pattern was applicable, and therefore 0 biclusters were reported. On the other hand, BitBit and BiSim are known to search for patterns in less time by traversing the binarized data matrix with tuned parameters. As shown in Table III, BSGP, kSpectral, BitBit, and BiSim failed to produce any bicluster. BiMax successfully extracted 100 inclusion-maximal biclusters from this dataset. It is interesting to note that ITL, BiMax, and MSVD produced the same number of biclusters, i.e. 100, although their objective functions differ. ITL tries to preserve mutual information, whereas BiMax follows a divide-and-conquer strategy and MSVD is inspired by linear algebra techniques. CC formed only one bicluster, which contains all genes and conditions. LAS, OPSM, and xMotif produced 52, 37, and 89 biclusters, respectively. FABIA and Plaid extracted only 5 and 4 biclusters, respectively, due to the presence of fewer conditions and few layers as per the plaid model. Considering the Jaccard index similarity, xMotif and CC values were high. CC and xMotif had a negative SB score, which indicates differential coexpression. According to sign variance, CC gave coherent biclusters as it had the lowest value, while the high values of FLOC and BiMax indicate less coherent biclusters. The rest of the algorithms generated biclusters with average coherency. From the constant variance values, it can be inferred that ISA gave very low quality biclusters.
5.3 Mammary Tissue Profile (GDS205) Dataset
GDS205 [44] comprises genes and conditions. For this dataset, ROBA again produced a high number of biclusters, i.e. 3925. This indicates that there are overlapping gene and sample sets where genes are involved in several biological pathways. The rest of the biclustering algorithms embedded in BIDEAL extracted approximately 12 biclusters only. BiMax successfully extracted 11 subsets of genes and conditions, whereas BiSim extracted only 1 bicluster. FABIA extracted only 5 biclusters, which signifies that the GDS205 dataset is not influenced by a heavy-tailed distribution. For this dataset, the advantage of the FLOC algorithm over CC is clearly shown: FLOC produced 20 biclusters without being affected by random interference, whereas CC produced only 1 bicluster. BSGP and OPSM both gave 7 biclusters, indicating the presence of order-preserving submatrices in GDS205. kSpectral and xMotif produced 6 and 5 biclusters, respectively. LAS and MSVD discovered 5 and 3 biclusters, respectively. QUBIC identified the checkerboard pattern present in the data. For this dataset, ITL, Plaid, and BitBit failed to provide any bicluster. Plaid did not find any shift biclusters in this dataset, whereas ITL failed to find co-entropy-based subsets of genes and conditions. Considering the validity of these biclusters, in terms of sign variance, CC and QUBIC produced very low values, i.e. more coherent biclusters, whereas the biclusters produced by LAS were not coherent and hence had a high sign variance. According to the constant variance, CC and QUBIC produced the best biclusters, but FLOC gave a high value of constant variance, which means that the quality of its biclusters was not good. As for the other datasets, Jaccard indices were calculated w.r.t. CC. They show that the biclusters formed by BSGP and MSVD had the lowest similarity with the biclusters formed by CC. It can also be concluded that CC and QUBIC produced better biclusters for this dataset.
5.4 Ligand Screen in B Cells (GDS301) Dataset
The GDS301 dataset comprises genes and 11 conditions collected by culturing B cells with ligands to perform temporal analysis. As shown in Table III, BiMax produced the maximum number of biclusters, i.e. 100. This signifies that 100 biclusters consisting of 1s were found by enumeration. ITL also discovered the same number of biclusters by extracting mutual information between genes and conditions. BSGP, kSpectral, and Plaid failed to produce any bicluster. Plaid discovers interesting patterns in multivariate data, whereas kSpectral identifies biclusters only if genes are coregulated with expression levels. FABIA extracted 4 biclusters. CC, ISA, BitBit, and BiSim all reported one bicluster containing all genes and conditions. This means that these algorithms failed to extract patterns from the dataset. MSVD also formed one bicluster in which all conditions were present but only genes were matched. In terms of the Jaccard index, BitBit and BiSim had maximum similarity with CC, whereas ITL and BiMax had less similarity with CC. In terms of sign variance, xMotif and CC gave coherent biclusters, but the biclusters formed by FLOC were not coherent enough. Constant variance values were mostly similar; FABIA produced the maximum constant variance among all.
5.5 Biological Significance
The biological significance of biclustering algorithms refers to the identification of subset of genes clustered with similar subset of conditions to form a pattern or bicluster. The biclusters are useful for disease identification, biomarkers generation, genedrug association, etc. The reliability of these biclusters are justified using various evaluation measures. BIDEAL provides constant variance and sign variance as evaluation measures to check the coherency, significance, and reliability of biclusters obtained using various biclustering algorithms . In terms of coherency, for Yeast dataset, biclusters generated using FLOC, Bimax, LAS, and ITL algorithms had low sign variance and constant variance. In ALL vs. AML dataset, most of the algorithms failed to generate significantly coherent biclusters except CC and xMotif algorithms. In GDS205 dataset, CC and BiSim algorithms produced coherent biclusters whereas, in GDS301 dataset, CC, ITL, and ISA algorithms produced coherent biclusters. Another evaluation measure, i.e. SB score, is also embedded in BIDEAL. The SB score was quite low for Yeast dataset except for the biclusters generated using BSGP algorithm. It shows that the obtained biclusters had more coexpression level for two conditions among genes. In ALL vs. AML dataset, generated biclusters have differential coexpression among genes and conditions because the value of SB score was almost absent. GDS205 dataset reported the high value of SB score which signifies the more coexpression ranking among genes w.r.t. two sets of conditions. In each dataset, at least one algorithm had reported similar bicluster as CC algorithm, for example ITL in case of ALL vs. AML dataset, whereas BiSim in GDS205 dataset. 
As presented in Table III, ROBA gave an exceptionally large number of biclusters across the datasets, which means that overlapping biclusters were generated; FABIA and Plaid resulted in fewer biclusters for all datasets; and FLOC generated a constant number of biclusters. For GDS301, only CC, OPSM, ITL, xMotif, FLOC, BiMax, LAS, ISA, MSVD, and FABIA produced some result, and BiSim and BitBit were quite similar to CC. In the case of the Yeast dataset, kSpectral failed to produce any bicluster, while ITL, Plaid, and BitBit gave no bicluster on the GDS205 dataset. Most of the biclusters formed using xMotif, BiSim, QUBIC, BSGP, and CC are of the type that indicates clusters with strong instance and attribute effects. The biclusters generated by MSVD, FLOC, ISA, and BiMax are of T-type; hence, these biclusters have a strong instance effect.
5.6 Execution Time and Size of Dataset
The proposed toolbox integrates various biclustering algorithms on a single platform; therefore, to measure the execution time, one needs to note the execution time of each algorithm. Since the complexity of the biclustering problem depends on the dataset and the objective function, the execution time can vary from a few seconds to hours. For example, on the Yeast dataset, CC, xMotif, and BiMax take less than 5 seconds to compute biclusters; BSGP, ISA, kSpectral, and FLOC take around 1 minute; BitBit and QUBIC extract biclusters in 30 minutes; and BiSim executes in 90 minutes. Moreover, regarding the maximum file size that can be handled, the proposed toolbox has been validated for datasets of up to 25 MB. The tests were performed on the Yeast dataset (198 KB), the ALL vs. AML dataset (656 KB), the GDS205 dataset (120 KB), and the GDS301 dataset (25 MB).
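Per-algorithm timing of the kind described above can be sketched as follows. This is only an illustrative harness under stated assumptions: each algorithm is represented by a hypothetical Python callable that takes the data matrix, and the two stub functions stand in for real biclustering routines; none of these names belong to BIDEAL.

```python
import time


def time_algorithms(algorithms, data, repeats=1):
    """Return {name: best wall-clock seconds} for each callable in `algorithms`."""
    timings = {}
    for name, run in algorithms.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(data)  # result is discarded; only the elapsed time is kept
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


# Hypothetical stand-ins for biclustering algorithms (assumptions, not real methods).
def fast_stub(data):
    return [sum(row) for row in data]


def slow_stub(data):
    time.sleep(0.05)  # simulate a more expensive search
    return [sum(row) for row in data]


toy_matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
timings = time_algorithms(
    {"fast_stub": fast_stub, "slow_stub": slow_stub}, toy_matrix, repeats=3
)

# Report algorithms from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s")
```

Taking the best of several repeats reduces the influence of transient system load, which matters when the reported times span seconds to hours.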
6 Conclusions
The “BIDEAL” toolbox proposed in this paper has been developed to generate, validate, and visualize biclusters from any data on a single platform. It integrates well-known biclustering algorithms, validation indices, and visualization methods for comprehensive data interpretation. Additionally, it provides a preprocessing module to remove outliers and NaN spots from the data, which helps to rectify issues related to null values, discrete matrices, etc. The proposed toolbox has been tested and validated on four benchmark gene expression datasets, i.e., Yeast, ALL vs. AML, GDS205, and GDS301. It was inferred that each algorithm of BIDEAL can generate a distinct set of biclusters from the same data; therefore, the selection of an appropriate technique is required. The diverse nature of BIDEAL, with its various validation indices and visualization methods, has proven effective for selecting the best biclusters. Information retrieval from data mainly depends on the type of local patterns, i.e., whether the data contains overlapping and constant biclusters or is noisy. We hope that the availability of BIDEAL will help the research community through widespread use of biclustering algorithms to identify coherent groups in data, which is very useful in disease subtype identification. Furthermore, the toolbox can help cater to data analysis needs, and it is being offered free to the community.
References
[1] S. C. Madeira and A. L. Oliveira, “Biclustering algorithms for biological data analysis: a survey,” IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 1, no. 1, pp. 24-45, Jan.-March 2004.
[2] V. Singh, N. K. Verma, and Y. Cui, “Type-2 fuzzy PCA approach in extracting salient features for molecular cancer diagnostics and prognostics,” IEEE Trans. on NanoBioscience, vol. 18, no. 3, pp. 482-489, July 2019.
[3] R. K. Sevakula, V. Singh, N. K. Verma, C. Kumar, and Y. Cui, “Transfer learning for molecular cancer classification using deep neural networks,” IEEE/ACM Trans. Comput. Biol. Bioinf., 2018. (Early Access)
[4] B. Pontes, R. Giráldez, and J. S. Aguilar-Ruiz, “Biclustering on expression data: A review,” Journal of Biomedical Informatics, vol. 57, pp. 163-180, 2015.
[5] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proc. of 5th Berkeley symposium on mathematical statistics and probability, vol. 1, no. 14, pp. 281-297, 1967.
[6] S. C. Johnson, “Hierarchical clustering schemes,” Psychometrika, vol. 32, no. 3, pp. 241-254, 1967.
[7] N. K. Verma and A. Roy, “Self optimal clustering technique using optimized threshold function,” IEEE Syst. Journal, vol. 99, pp. 1-14, 2013.
[8] N. K. Verma, A. Roy, and Y. Cui, “Improved mountain clustering algorithm for gene expression data analysis,” Journal of Data Mining and Knowledge Discovery, vol. 2, no. 1, pp. 30-35, 2011.
[9] J. C. Bezdek, R. Ehrlich, and W. Full, “FCM: The fuzzy c-means clustering algorithm,” Journal Computers and Geosciences, vol. 10, no. 2-3, pp. 191-203, 1984.
[10] A. B. Geva and D. H. Kerem, “Forecasting generalized epileptic seizures from the EEG signal by wavelet analysis and dynamic unsupervised fuzzy clustering,” IEEE Trans. on Biomedical Engg., vol. 45, no. 10, pp. 1205-1216, 1998.
[11] N. K. Verma, S. Bajpai, A. Singh, A. Nagrare, S. Meena, and Y. Cui, “A comparison of biclustering algorithms,” in Int. Conf. on Systems in Medicine and Biology, pp. 90-97, 2010.
[12] S. Barkow et al., “BicAT: A biclustering analysis toolbox,” Bioinformatics, vol. 22, pp. 1282-1283, 2006.
[13] R. Santamaría, R. Therón, and L. Quintales, “BicOverlapper 2.0: Visual analysis for gene expression,” Bioinformatics, vol. 30, no. 12, pp. 1785-1786, 2014.
[14] R. Shamir et al., “EXPANDER - An integrative program suite for microarray data analysis,” Bioinformatics, vol. 6, no. 1, pp. 232, 2005.
[15] R. Henriques and S. C. Madeira, “BicNET: Flexible module discovery in large-scale biological networks using biclustering,” Algorithms for Molecular Biology, vol. 11, no. 1, pp. 1, 2011.
[16] K. O. Cheng et al., “BiVisu: Software tool for bicluster detection and visualization,” Bioinformatics, vol. 23, no. 17, pp. 2342-2344, 2007.
[17] C. A. Gallo, J. S. Dussaut, J. A. Carballido, and I. Ponzoni, “BAT: A new biclustering analysis toolbox,” LNCS in Advances in Bioinfo. and Compt. Biology, pp. 67-70, 2010.
[18] K. Eren, “Application of biclustering algorithms to biological data,” Diss. The Ohio State University, 2012.
[19] S. Kaiser and F. Leisch, “BiClust: A toolbox for biclustering analysis in R,” 2008.
[20] J. Gupta, S. Singh, and N. K. Verma, “MTBA: MATLAB toolbox for biclustering analysis,” IEEE Workshop on Computational Intelligence: Theories, Applications and Future Directions, IIT Kanpur, India, pp. 148-152, 2013.
[21] R. François, M. Stanislas, and N. Mohamed, “CoClust: A python package for co-clustering,” Journal of Statistical Software, vol. 88, no. 7, pp. 1-29, 2018.
[22] H. Rui, F. Ferreira, and S. Madeira, “BicPAMS: Software for biological data analysis with pattern-based biclustering,” BMC Bioinformatics, vol. 18, no. 1, 2017.
[23] Y. Cheng and G. Church, “Biclustering of expression data,” Conf. on Intelligent Systems for Molecular Biology, vol. 8, pp. 93-103, 2000.
[24] J. Yang, H. Wang, W. Wang, and P. S. Yu, “An improved biclustering method for analyzing gene expression profiles,” Int. Journal on Artificial Intelligence Tools, vol. 14, no. 5, pp. 771-789, 2005.
[25] I. S. Dhillon, “Co-clustering documents and words using bipartite spectral graph partitioning,” Int. Conf. on Knowl. discovery and data mining, pp. 269-274, 2001.
[26] A. Ben-Dor et al., “Discovering local structure in gene expression data: the order-preserving submatrix problem,” Int. Conf. on Computational biology, vol. 10, pp. 49-57, 2000.
[27] S. Bergmann, J. Ihmels, and N. Barkai, “Iterative signature algorithm for the analysis of large-scale gene expression data,” Physical Review E, vol. 67, no. 3, pp. 031902, 2003.
[28] Y. Kluger, R. Basri, J. T. Chang, and M. Gerstein, “Spectral biclustering of microarray data: coclustering genes and conditions,” Genome research, vol. 13, no. 4, pp. 703-716, 2003.
[29] I. S. Dhillon, S. Mallela, and D. S. Modha, “Information-theoretic co-clustering,” Int. Conf. on Knowl. discovery and data mining, pp. 89-98, 2003.
[30] T. M. Murali and S. Kasif, “Extracting conserved gene expression motifs from gene expression data,” in Proc. of Pacific Symposium Biocomputing, vol. 3, pp. 77-88, 2003.
[31] L. Lazzeroni and A. Owen, “Plaid models for gene expression data,” Statistica Sinica, vol. 12, pp. 61-86, 2002.
[32] A. Prelić et al., “A systematic comparison and evaluation of biclustering methods for gene expression data,” Bioinformatics, vol. 22, no. 9, pp. 1122-1129, 2006.
[33] A. A. Shabalin, V. J. Weigman, C. M. Perou, and A. B. Nobel, “Finding large average submatrices in high dimensional data,” The Annals of Applied Statistics, pp. 985-1012, 2009.
[34] S. Hochreiter et al., “FABIA: Factor analysis for bicluster information acquisition,” Bioinformatics, vol. 26, no. 12, pp. 1520-1527, 2010.
[35] D. S. Rodriguez-Baena, A. J. Perez-Pulido, and J. S. Aguilar-Ruiz, “A biclustering algorithm for extracting bit-patterns from binary datasets,” Bioinformatics, vol. 27, no. 19, pp. 2738-45, 2011.
[36] N. Noureen and M. A. Qadir, “BiSim: A simple and efficient biclustering algorithm,” Int. Conf. on Soft Computing and Pattern Recognition, pp. 1-6, 2009.
[37] D. Wang and Zheng, “MSVD-MOEB algorithm applied to cancer gene expression data,” Int. Conf. on Awareness Science and Technology (iCAST), pp. 119-124, 2015.
[38] L. Guojun, Q. Ma, H. Tang, A. H. Paterson, and Y. Xu, “QUBIC: A qualitative biclustering algorithm for analyses of gene expression data,” Nucleic acids research, pp. gkp491, 2009.
[39] A. B. Tchagang and A. H. Tewfik, “Robust biclustering algorithm (ROBA) for DNA microarray data analysis,” Proc. IEEE/SP 13th Workshop on Statistical Signal Processing, pp. 984-989, 2005.
[40] M. Filippone, F. Masulli, and S. Rovetta, “Stability and performances in biclustering algorithms,” Int. Meeting on Comput. Intelligence Methods for Bioinformatics and Biostatistics, pp. 91-101, 2008.
[41] B. K. H. SB and R. K. M. Karuturi, “Differential co-expression framework to quantify goodness of biclusters and compare biclustering algorithms,” Algorithms for Molecular Biology, vol. 5, no. 1, pp. 23, 2010.
[42] N. K. Verma, E. Dutta, and Y. Cui, “Hausdorff distance and global silhouette index as novel measures for estimating quality of biclusters,” Int. Conf. on Bioinformatics and Biomedicine, pp. 267-272, 2015.
[43] T. R. Golub, “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring,” Science, vol. 286, no. 5439, pp. 531-537, 1999.
[44] S. P. Suchyta et al., “Bovine mammary gene expression profiling using a cDNA microarray enhanced for mammary-specific transcripts,” Physiol Genomics, vol. 16, no. 1, pp. 8-18, 2003. Available Online: https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS205.
[45] Available Online: https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS301.
[46] A. Tanay, R. Sharan, and R. Shamir, “Discovering statistically significant biclusters in gene expression data,” Bioinformatics, vol. 18, pp. 136-144, 2002.
[47] C. A. Gallo, J. A. Carballido, and I. Ponzoni, “Bihea: A hybrid evolutionary approach for microarray biclustering,” Symposium on Bioinformatics, Springer, pp. 36-47, 2009.
[48] L. Wilkinson and M. Friendly, “The history of the cluster heat map,” The American Statistician, vol. 63, no. 2, pp. 179-184, 2009.