1 Introduction
Among data analysis techniques, Formal Concept Analysis (FCA) is a useful knowledge representation framework for describing and summarizing data. As the central data structure of FCA, the concept lattice is an effective tool for knowledge discovery: it depicts the generalization and specialization relations between formal concepts in a hierarchical structure. Concept lattices have been widely used in many areas, such as data mining, machine learning and information retrieval [4, 9, 17, 16, 23, 24].
The main research topics on concept lattices include lattice construction [14, 1, 2, 8, 15, 21, 19, 20], rule extraction [11, 12, 13, 18] and lattice reduction [13, 10, 22]. A challenging problem in computing formal concepts is that a typical data set may have a great number of them. It is well known that the number of formal concepts can grow exponentially with the size of the input context, and the problem of determining this number is #P-complete [7].
In the FCbO algorithm, Outrata and Vychodil [14] introduced the idea of closing a concept before its descendants are computed, which allows the descendants to fully inherit the attributes of the parent. In the spirit of ‘best-of-breed’ research, this idea was integrated into the InClose2 algorithm [1].
Considering the formal context as a matrix, a row holds all the attributes of an object and a column holds all the objects of an attribute. Further, all the objects of a formal concept are called its extent and all its attributes are called its intent. In the InClose2, InClose3 and InClose4 algorithms [1, 2, 3], intents are stored in a linked-list tree structure and extents are stored in a linearised 2-dimensional array. The context is stored as a horizontal bit array to optimise for RAM and cache memory, which also allows multiple context cells to be processed by a single 32-bit or 64-bit operator.
In the InClose2, InClose3 and InClose4 algorithms, each row of the context is packed into bits: the first bit records the first cell of the row, the second bit the second cell, the third bit the third cell, and so on. The main shortcoming of these algorithms is that the extent of a concept is not stored as a 32-bit (or 64-bit) bit array, so they process the intersection of the extent of a concept and a column of the context only one object at a time.
A crucial improvement in our algorithm is that both the context and the extent of a concept are stored as vertical bit arrays to optimise for RAM and cache memory, which significantly reduces both the time and the space cost. Each column of the context, and likewise the extent of a concept, is packed into bits in the same way, one bit per object. Thus multiple context cells are processed by a single 32-bit or 64-bit operator when finding the intersection of the extent of a concept and a column of the context.
The second important improvement is the following. The core procedure of the InClose2 algorithm is ComputeConceptsFrom((A,B),y), which uses a queue held in local arrays [1, 2]. In most cases the local queues are empty, so their space is wasted. In our algorithm, the queue is optimised into one global queue, which greatly reduces the space used by the core procedure.
After a brief description of formal concepts, this paper illustrates how formal concepts are computed by the InClose2 algorithm. Using a simple example, the basic recursive process of InClose2 is shown line by line. Using the same notation and style, we present a new variant called InClose5. The key differences between the algorithms are then compared to highlight where the efficiencies arise.
The paper is organized as follows. In Section 2, we review the necessary notions concerning formal concepts and their basic properties. In Section 3, we study how formal concepts are computed using the InClose2 algorithm. In Section 4, we present the InClose5 algorithm and report experimental results for the InClose2, InClose4 and InClose5 algorithms. Finally, the paper is concluded in Section 5.
2 Basic notions and properties
In this section, we review some basic notions and properties of FCA used in this paper. The definitions of a formal context and its operators are given first.
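Definition 1. [5] Let $(G, M, I)$ be a formal context, where $G$ is a set of objects, $M$ is a set of attributes, and $I \subseteq G \times M$ is a binary relation between $G$ and $M$. Here each $g \in G$ is called an object, and each $m \in M$ is called an attribute. If an object $g$ has an attribute $m$, we write $gIm$ or $(g, m) \in I$.
Definition 2. [5] Let $(G, M, I)$ be a formal context. For any $A \subseteq G$ and $B \subseteq M$, a pair of positive operators are defined by:
$A^{\uparrow} = \{ m \in M \mid gIm \text{ for all } g \in A \}$,
$B^{\downarrow} = \{ g \in G \mid gIm \text{ for all } m \in B \}$.
Based on the above operators, formal concepts and concept lattices are defined as follows.
Definition 3. [5] Let $(G, M, I)$ be a formal context. For any $A \subseteq G$, $B \subseteq M$, if $A^{\uparrow} = B$ and $B^{\downarrow} = A$, then $(A, B)$ is called a formal concept, where $A$ is called the extent of the formal concept, and $B$ is called the intent of the formal concept. For any formal concepts $(A_1, B_1)$ and $(A_2, B_2)$, one can define the partial order as follows:
$(A_1, B_1) \le (A_2, B_2) \Leftrightarrow A_1 \subseteq A_2 \ (\Leftrightarrow B_2 \subseteq B_1)$.
The family of all formal concepts of $(G, M, I)$ is a complete lattice, and it is called a concept lattice and denoted by $L(G, M, I)$.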
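Let $(G, M, I)$ be a formal context. For any $A, A_1, A_2 \subseteq G$ and $B, B_1, B_2 \subseteq M$, the following properties hold for the operators of Definition 2:
(1) $A_1 \subseteq A_2 \Rightarrow A_2^{\uparrow} \subseteq A_1^{\uparrow}$, $B_1 \subseteq B_2 \Rightarrow B_2^{\downarrow} \subseteq B_1^{\downarrow}$;
(2) $A \subseteq A^{\uparrow\downarrow}$, $B \subseteq B^{\downarrow\uparrow}$;
(3) $A^{\uparrow} = A^{\uparrow\downarrow\uparrow}$, $B^{\downarrow} = B^{\downarrow\uparrow\downarrow}$;
(4) $A \subseteq B^{\downarrow} \Leftrightarrow B \subseteq A^{\uparrow}$;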
(5) ;
(6) ;
Typically a table of 0s and 1s is used to represent a formal context, with 1s indicating binary relations between objects (rows) and attributes (columns). The following is a simple example of a formal context:
Table 1. A formal context
Object | Attribute columns (left to right)
1      | 0  1  1  0  0
2      | 1  1  0  0  0
3      | 1  0  0  0  0
4      | 0  0  0  0  1
5      | 0  0  0  1  1
6      | 0  0  1  1  1
Formal concepts in a table of 0s and 1s can be visualised as closed rectangles of 1s, where the rows and columns in the rectangle are not necessarily contiguous. In Table 1, for example, the four cells where objects 5 and 6 meet the last two attribute columns form a concept whose rectangle has height 2 and width 2. Similarly, the three 1s of objects 4, 5 and 6 in the last column form a concept whose rectangle has height 3 and width 1. The two 1s of objects 1 and 6 in the third column also form a concept, a rectangle of height 2 and width 1 whose rows are not contiguous.
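The rectangle view can be checked mechanically. The following is a minimal sketch in C, assuming the rows of Table 1 are packed into bitmasks with the attribute columns numbered 0 to 4 from the left and object g mapped to bit g-1. The names ROWS, common_attributes, common_objects and is_formal_concept are illustrative only and do not come from any InClose implementation. A pair is a formal concept exactly when each side determines the other.

```c
#include <assert.h>
#include <stdint.h>

enum { N_OBJECTS = 6, N_ATTRS = 5 };

/* Rows of Table 1: bit m of ROWS[g] is set when object g+1 has the
   attribute in column m (columns numbered 0..4; an assumed numbering). */
static const uint32_t ROWS[N_OBJECTS] = {
    0x06, /* object 1: 0 1 1 0 0 */
    0x03, /* object 2: 1 1 0 0 0 */
    0x01, /* object 3: 1 0 0 0 0 */
    0x10, /* object 4: 0 0 0 0 1 */
    0x18, /* object 5: 0 0 0 1 1 */
    0x1C  /* object 6: 0 0 1 1 1 */
};

/* Attributes shared by every object in `objects` (bit g = object g+1). */
uint32_t common_attributes(uint32_t objects) {
    uint32_t attrs = (1u << N_ATTRS) - 1u;
    for (int g = 0; g < N_OBJECTS; g++) {
        if (objects & (1u << g)) {
            attrs &= ROWS[g];
        }
    }
    return attrs;
}

/* Objects that have every attribute in `attrs`. */
uint32_t common_objects(uint32_t attrs) {
    uint32_t objects = 0;
    for (int g = 0; g < N_OBJECTS; g++) {
        if ((ROWS[g] & attrs) == attrs) {
            objects |= 1u << g;
        }
    }
    return objects;
}

/* (objects, attrs) is a formal concept when each set determines the other. */
int is_formal_concept(uint32_t objects, uint32_t attrs) {
    assert(objects < (1u << N_OBJECTS) && attrs < (1u << N_ATTRS));
    return common_attributes(objects) == attrs &&
           common_objects(attrs) == objects;
}
```

For instance, is_formal_concept(0x30, 0x18) tests objects 5 and 6 against the last two columns, and is_formal_concept(0x21, 0x04) tests objects 1 and 6 against the third column; both return 1 for this encoding.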
In fact, it is not easy to compute the formal concepts given a formal context. Next we will address this problem.
3 Computation of formal concepts
A formal concept can be obtained by applying the operator of Definition 2 to a set of attributes to get its extent, and then applying the dual operator to that extent to get the intent.
If this procedure is applied to every possible subset of the attribute set, then all the concepts in the context are obtained. However, the number of formal concepts can be exponential in the size of the input context, and the problem of determining this number is #P-complete [7]. An efficient algorithm is therefore required to compute all the formal concepts of a formal context.
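As a baseline, the exhaustive procedure can be written down directly for a context as small as Table 1. The sketch below is an illustration under stated assumptions, not part of InClose: rows are packed into bitmasks (attribute columns numbered 0 to 4 from the left, object g mapped to bit g-1), and all helper names are hypothetical. It closes each of the 2^5 attribute subsets and counts a subset only when it equals its own closure, so each concept is counted once.

```c
#include <assert.h>
#include <stdint.h>

enum { N_OBJECTS = 6, N_ATTRS = 5 };

/* Rows of Table 1 as attribute bitmasks (assumed encoding). */
static const uint32_t ROWS[N_OBJECTS] = {0x06, 0x03, 0x01, 0x10, 0x18, 0x1C};

/* Extent of an attribute set: objects whose row contains all of `attrs`. */
static uint32_t extent_of(uint32_t attrs) {
    uint32_t objects = 0;
    for (int g = 0; g < N_OBJECTS; g++) {
        if ((ROWS[g] & attrs) == attrs) {
            objects |= 1u << g;
        }
    }
    return objects;
}

/* Intent of an object set: attributes common to all objects in it. */
static uint32_t intent_of(uint32_t objects) {
    uint32_t attrs = (1u << N_ATTRS) - 1u;
    for (int g = 0; g < N_OBJECTS; g++) {
        if (objects & (1u << g)) {
            attrs &= ROWS[g];
        }
    }
    return attrs;
}

/* Try every attribute subset; a subset is the intent of a concept
   exactly when closing it gives the subset back. */
int count_concepts_brute_force(void) {
    assert(N_ATTRS < 31); /* subsets are enumerated in a 32-bit counter */
    int count = 0;
    for (uint32_t attrs = 0; attrs < (1u << N_ATTRS); attrs++) {
        if (intent_of(extent_of(attrs)) == attrs) {
            count++;
        }
    }
    return count;
}
```

For Table 1 this counts 10 concepts, including the top concept with empty intent and the bottom concept with empty extent. The loop visits 2^n subsets for n attributes, which is why this approach does not scale.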
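By combining the advantages of the InClose and FCbO algorithms, InClose2 is very efficient [1, 2]. The InClose2 algorithm, described line by line below, is invoked with an initial concept $(A, B)$, whose extent $A$ contains all the objects and whose intent $B$ is empty, and an initial attribute $y = n - 1$, where there are $n$ columns in the formal context.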
Line 1 – Iterate across the context, from the starting attribute $y$ down to attribute 0 (the first column).
Line 2 – Skip attributes already in $B$, as intents now inherit all of their parent’s attributes.
Line 3 – Form an extent $C$ by intersecting the current extent $A$ with the next column of objects in the context.
Line 4 and Line 5 – If the extent formed, $C$, equals the extent, $A$, of the concept whose intent is currently being processed, then add the current attribute $j$ to the intent being processed, $B$.
Line 7 – Otherwise, check whether $C$ is contained in any new concept in the queue.
Line 8 – If $C$ is not contained, place the new extent $C$ and the location where it was found, $j$, in a queue for later processing.
Line 13 – The queue is processed by obtaining each new extent $C$ and the associated location $j$ from the queue.
Line 14 – Each new partial intent, $D$, inherits all the attributes from its completed parent intent, $B$, along with the attribute, $j$, where its extent was found.
Line 15 – Call ComputeConceptsFrom to compute child concepts from $j - 1$ and to complete the intent $D$.
Because the extent of a concept is not stored as a 32-bit (or 64-bit) bit array, Line 3 of ComputeConceptsFrom processes the intersection of the extent of a concept and a column of the context only one object at a time, which greatly increases the running time of InClose2. This is the main disadvantage of the InClose2 algorithm.
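The line-by-line description can be condensed into a short sketch. The code below is an illustration under stated assumptions rather than Andrews' implementation: each extent is held in a single bitmask (bit g-1 for object g), which sidesteps the object-at-a-time intersection discussed above; COLS encodes the columns of Table 1 with attributes numbered 0 to 4; empty extents are skipped; and the queue is processed in the order in which it was filled. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { N_ATTRS = 5 };

/* Columns of Table 1: COLS[j] holds the objects that have attribute j. */
static const uint32_t COLS[N_ATTRS] = {0x06, 0x03, 0x21, 0x30, 0x38};

static int n_concepts;

/* Canonicity test: reject C if an already visited attribute k > j that is
   not in the current intent B contains every object of C. */
static int is_canonical(uint32_t C, uint32_t B, int j) {
    for (int k = j + 1; k < N_ATTRS; k++) {
        if (!(B & (1u << k)) && (C & ~COLS[k]) == 0) {
            return 0;
        }
    }
    return 1;
}

static void compute_concepts_from(uint32_t A, uint32_t B, int y) {
    assert(y < N_ATTRS);
    uint32_t queue_extent[N_ATTRS]; /* local queue: new extents ...   */
    int queue_attr[N_ATTRS];        /* ... and where they were found */
    int queued = 0;

    for (int j = y; j >= 0; j--) {            /* Line 1 */
        if (B & (1u << j)) continue;          /* Line 2 */
        uint32_t C = A & COLS[j];             /* Line 3 */
        if (C == A) {
            B |= 1u << j;                     /* Lines 4 and 5 */
        } else if (C != 0 && is_canonical(C, B, j)) { /* Line 7 */
            queue_extent[queued] = C;         /* Line 8 */
            queue_attr[queued] = j;
            queued++;
        }
    }
    n_concepts++; /* (A, B) is now a completed concept */

    for (int i = 0; i < queued; i++) {            /* Line 13 */
        uint32_t D = B | (1u << queue_attr[i]);   /* Line 14 */
        compute_concepts_from(queue_extent[i], D, queue_attr[i] - 1); /* Line 15 */
    }
}

/* Count the concepts of Table 1 that have a non-empty extent. */
int count_concepts(void) {
    n_concepts = 0;
    compute_concepts_from(0x3F, 0, N_ATTRS - 1);
    return n_concepts;
}
```

Calling count_concepts() on this encoding of Table 1 visits 9 concepts, that is, every concept with a non-empty extent. Note that each invocation reserves two local arrays of N_ATTRS entries even when it queues nothing.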
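For example, applying the InClose2 algorithm to the formal context in Table 1 gives the results in Table 2. Number the attribute columns 0 to 4 from left to right. In the first call of ComputeConceptsFrom, attributes 4, 2, 1 and 0 pass the IsCannonical() test. For attribute 3, the extent formed is $C = \{5, 6\}$, and $C \subseteq \{4, 5, 6\}$, the column of attribute 4, so attribute 3 fails the IsCannonical() test.
In the call of ComputeConceptsFrom for the concept $(\{4, 5, 6\}, \{4\})$, attribute 3 passes the IsCannonical() test, which gives the concept $(\{5, 6\}, \{3, 4\})$ as its child concept. Similarly, we get $(\{6\}, \{2, 3, 4\})$ as the child concept of $(\{5, 6\}, \{3, 4\})$, $(\{1\}, \{1, 2\})$ as the child concept of $(\{1, 6\}, \{2\})$, and $(\{2\}, \{0, 1\})$ as the child concept of $(\{1, 2\}, \{1\})$.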
Object | Attribute columns (left to right)
1      | 0  1  1  0  0
2      | 1  1  0  0  0
3      | 1  0  0  0  0
4      | 0  0  0  1  0
5      | 0  0  0  1  1
6      | 0  0  1  1  1
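For this second context, with the attribute columns again numbered 0 to 4 from left to right, all of attributes 4, 3, 2, 1 and 0 pass the IsCannonical() test in the first call of ComputeConceptsFrom.
When ComputeConceptsFrom is called with $(A, B) = (\{5, 6\}, \{4\})$ and $y = 3$, we get $C = A$, where $C = \{5, 6\} \cap \{4, 5, 6\} = \{5, 6\}$, thus $B = \{3, 4\}$.
Similarly, we get concept $(\{6\}, \{2, 3, 4\})$ as the child concept of $(\{5, 6\}, \{3, 4\})$, $(\{1\}, \{1, 2\})$ as the child concept of $(\{1, 6\}, \{2\})$, and $(\{2\}, \{0, 1\})$ as the child concept of $(\{1, 2\}, \{1\})$.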
Figure 1 shows the call tree of ComputeConceptsFrom. Here one vertex stands for one execution of ComputeConceptsFrom and one edge for a call of ComputeConceptsFrom launched from the queue. As in any tree, the number of vertices equals the number of edges plus one. During the first execution of ComputeConceptsFrom, 5 calls are launched from the queue. Over the whole tree, however, each execution of ComputeConceptsFrom launches slightly fewer than one call from the queue on average.
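As a worked check of this average, assume the call tree has one vertex per concept, $N$ in total. This identification, and the use of the Mushroom concept count reported in the experiments of Section 4, are assumptions made here for illustration:

```latex
\[
\frac{\#\text{edges}}{\#\text{vertices}} = \frac{N-1}{N} = 1 - \frac{1}{N},
\qquad
N = 233{,}101 \;\Rightarrow\; \frac{233{,}100}{233{,}101} \approx 0.999996 .
\]
```

So however wide the first level of the tree is, the average number of queued calls per execution stays just below one.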
Specifically, the local queue of ComputeConceptsFrom is implemented as follows. First, int Bchildren[MAX_COLS] stores the locations of the attributes that will spawn new concepts. Second, int Cnums[MAX_COLS] stores the concept numbers of the spawned concepts. Both arrays are sized by the constant MAX_COLS in every execution, although on average only about one entry is used, so the local queue uses memory very inefficiently.
Figure 1. The call tree of ComputeConceptsFrom
In Line 3 of ComputeConceptsFrom, if the extent formed, $C$, is empty, InClose3 stores the current attribute $j$, which can then be ignored in concepts of subsequent levels. This is the main improvement from InClose2 to InClose3. Further, InClose4 is a 64-bit version that can build and output concept trees in JSON (JavaScript Object Notation), a lightweight data representation format.
4 InClose5 algorithm
In the InClose2, InClose3 and InClose4 algorithms [1, 2, 3], extents are stored in a linearised 2-dimensional array, so a concept of $k$ objects occupies $k$ integers. Furthermore, in ComputeConceptsFrom, the core procedure of InClose2, intersecting the current extent with the next column of objects in the context is the most time-consuming operation. This inspires us to store both the context and the extents of concepts as vertical bit arrays.
Technically, let the $m$ rows of a context be divided into $\lfloor (m - 1)/32 \rfloor + 1$ blocks of 32 rows each (64 rows when 64-bit integers are used), where $\lfloor x \rfloor$ is the largest integer that is less than or equal to $x$. We only store the blocks that contain objects (that is, the blocks with a nonzero block value), each as a block number and a block value.
For example, a column of more than 32 and at most 64 rows is divided into 2 blocks. In the first block value, the first bit stands for the first row, the second bit for the second row, the third bit for the third row, and so on. If all the objects of the column lie in the first 32 rows, the column is stored as a single pair of block number and block value. We do not store the second block, as the second block value is 0.
In the InClose4 algorithm, by contrast, a concept of $k$ objects is stored as $k$ integers, which give the locations of all its objects (the locations of all the 1s).
In the best case of InClose5, a concept of 32 objects that fall in one block occupies only 2 integers, and one bitwise AND operation processes 32 objects when 32-bit integers are used. In the worst case of InClose5, every stored block holds a single object, so each object costs 2 integers (block number and block value), while InClose4 stores the same column with 1 integer per object.
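The block storage described above can be sketched as follows. This is a minimal illustration assuming 32-bit blocks, 0-based block numbers, and the least significant bit standing for the first row of a block; the function name pack_column and its parameters are hypothetical and not taken from the InClose5 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a 0/1 column of n_rows cells into (block number, block value)
   pairs of 32-bit words, skipping blocks whose value is 0.
   Bit b of block k stands for row 32*k + b (assumed bit order).
   Returns the number of uint32_t words written to out. */
size_t pack_column(const unsigned char *cells, size_t n_rows,
                   uint32_t *out, size_t out_cap) {
    size_t used = 0;
    for (size_t block = 0; block * 32 < n_rows; block++) {
        uint32_t value = 0;
        for (size_t b = 0; b < 32 && block * 32 + b < n_rows; b++) {
            if (cells[block * 32 + b]) {
                value |= (uint32_t)1 << b;
            }
        }
        if (value != 0) {
            assert(used + 2 <= out_cap); /* caller must provide room */
            out[used++] = (uint32_t)block;
            out[used++] = value;
        }
    }
    return used;
}
```

A 40-row column with 1s only in rows 0, 2 and 3 packs into the single pair (0, 13), whereas a column with 1s in rows 0 and 32 needs two pairs, which is the worst-case ratio of 2 integers per object.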
With InClose2 or InClose4 algorithm to process mushroom data [6], extents of all concepts will occupy bytes of memory. In contrast, InClose5 algorithm only needs bytes of memory when stored as 32bit integer, and bytes of memory when stored as 64bit integer. From the results above, the space complexity of InClose5 is much better than that of InClose2 and InClose4.
The core procedure of the InClose2 algorithm is ComputeConceptsFrom((A,B),y), which uses a queue held in local arrays [1, 2]. Specifically, int Bchildren[MAX_COLS] stores the locations of the attributes that will spawn new concepts, and int Cnums[MAX_COLS] stores the concept numbers of the spawned concepts. The number of times the function is called equals the number of concepts, so each call queues fewer than one child on average and most of the local queue entries stay empty. This inspires us to link all the local queues together into one global queue and to use the concept number as the index into the queue. Thus we only need to store the location of the attribute where the new extent was found.
The InClose5 algorithm is presented as follows. It is invoked with an initial concept $(A, B)$, whose extent $A$ contains all the objects and whose intent $B$ is empty, an initial attribute $y = n - 1$, where there are $n$ columns in the formal context, and an initially empty Bparent.
Line 1 – Bparent contains the parent intent and the attributes that can be ignored in concepts of subsequent levels. The child concept inherits these attributes from the parent.
Line 2 – Iterate across the context, from the starting attribute $y$ down to attribute 0 (the first column).
Line 3 – Skip attributes already in Bchild.
Line 4 – Form an extent $C$ by intersecting the current extent $A$ with the next column of objects in the context. It is implemented in C as follows.
unsigned int* Ac = startA[c];           //pointer to start of current extent
unsigned int* aptr = startA[highc];     //pointer to start of next extent to be created
int sizeAc = startA[c+1] - startA[c];   //calculate the size of current extent
/* iterate across object blocks in current extent to find them in current column */
for (int i = sizeAc/2; i > 0; i--) {
    if (context0[*Ac][j] & *(Ac+1)) {
        *aptr = *Ac;                            //add object block number to new extent (intersection)
        aptr++;
        *aptr = context0[*Ac][j] & *(Ac+1);     //add object block value to new extent
        aptr++;
    }
    Ac += 2;                                    //move to next object block
}
Line 5 and Line 6 – If the extent formed, $C$, is empty, then put $j$ in Bchild, so that it can be ignored in concepts of subsequent levels.
Line 8 and Line 9 – If the extent formed, $C$, equals the extent, $A$, of the concept whose intent is currently being processed, then add the current attribute $j$ to the intent being processed, $B$, and also put $j$ in Bchild.
Line 11 – Otherwise, check whether $C$ is contained in any new concept in the queue.
Line 12 – If $C$ is not contained, place the location $j$ in a global queue for later processing. It is implemented as follows.
Bchildren[highc-1] = j;   //note where (attribute column) it was found,
nodeParent[highc] = c;    //note the parent concept number and
startA[++highc] = aptr;   //note the start of the new extent in A.
Line 18 – The queue is processed by obtaining each new extent $C$ and the associated location $j$ from the queue.
Line 19 – Each new partial intent, $D$, inherits all the attributes from its completed parent intent, $B$, along with the attribute, $j$, where its extent was found and the attributes that can be ignored in concepts of subsequent levels.
Line 20 – Call ComputeConceptsFrom to compute child concepts from $j - 1$ and to complete the intent $D$.
Lines 18, 19 and 20 are implemented in C as follows.
// here numchildrenStart is stored as highc-1 at the beginning of
// ComputeConceptsFrom; i is the loop index over the global queue
for (i = highc-2; i >= numchildrenStart; i--) {
    startB[i+1] = bptr;   // set the start of the intent in B tree
    // note that i+1 is the number of the new extent
    ComputeConceptsFrom(i+1, Bchildren[i]-1, Bchild);
}
As both the context and the extent of a concept are stored as vertical bit arrays, when an extent is formed in Line 4, up to 32 (64) context cells are processed by a single 32-bit (64-bit) AND operation. In InClose3, the extent of a concept is not stored as a 32-bit (or 64-bit) bit array, so InClose3 processes the context only one cell at a time. In Line 12, we only place the location $j$ in a global queue for later processing, while InClose3 keeps many local queues that are mostly empty. The time and space costs are thus greatly reduced. However, the logical structure of the InClose5 algorithm is almost the same as that of the InClose3 algorithm, so please refer to [2] for the correctness of the InClose5 algorithm.
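The block-wise intersection of Line 4 can also be written as a standalone function, which makes the one-AND-per-block behaviour easy to test. The sketch below is an assumption-based restatement, not the authors' code: CONTEXT is a hypothetical array indexed as [block][attribute] that holds the single 32-bit block of each column of Table 1 (object g mapped to bit g-1, attributes numbered 0 to 4), and intersect_extent is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { N_ATTRS = 5, N_BLOCKS = 1 };

/* Vertical bit-array context: CONTEXT[block][j] is the block value of
   attribute column j for that block of rows (Table 1 fits in one block). */
static const uint32_t CONTEXT[N_BLOCKS][N_ATTRS] = {
    {0x06, 0x03, 0x21, 0x30, 0x38}
};

/* Intersect an extent, stored as (block number, block value) pairs, with
   attribute column j. Surviving pairs are written to out, which must have
   room for extent_words words. Returns the number of words written. */
size_t intersect_extent(const uint32_t *extent, size_t extent_words,
                        int j, uint32_t *out) {
    assert(extent_words % 2 == 0);
    assert(j >= 0 && j < N_ATTRS);
    size_t written = 0;
    for (size_t i = 0; i < extent_words; i += 2) {
        uint32_t block = extent[i];
        assert(block < N_BLOCKS);
        /* one AND handles up to 32 objects of this block at once */
        uint32_t common = CONTEXT[block][j] & extent[i + 1];
        if (common != 0) {
            out[written++] = block;  /* keep the block number */
            out[written++] = common; /* keep the shared objects */
        }
    }
    return written;
}
```

With the extent {4, 5, 6} stored as the pair (0, 0x38), intersecting with attribute 3 yields the pair (0, 0x30), that is, objects 5 and 6, while intersecting with attribute 1 yields an empty extent of zero words.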
Considering the formal context as a matrix, when the matrix is transposed, the concepts of the new matrix are symmetric to those of the original matrix: extents and intents swap roles. However, there are 8124 columns in the transposed mushroom data, so the depth of the recursive calls of ComputeConceptsFrom increases greatly, and so does the cost of the computation.
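The symmetry under transposition can be checked on a small context. The following sketch assumes rows packed into bitmasks and counts concepts by exhaustive closure, which is only feasible for tiny contexts; the names transpose_context and count_concepts_by_closure are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the roles of objects and attributes: cols[m] receives bit g
   exactly when rows[g] has bit m set. */
void transpose_context(const uint32_t *rows, int n_objects, int n_attrs,
                       uint32_t *cols) {
    assert(n_objects <= 32 && n_attrs <= 32);
    for (int m = 0; m < n_attrs; m++) {
        cols[m] = 0;
        for (int g = 0; g < n_objects; g++) {
            if (rows[g] & (1u << m)) {
                cols[m] |= 1u << g;
            }
        }
    }
}

/* Count concepts by closing every attribute subset and keeping the
   subsets that equal their own closure. */
int count_concepts_by_closure(const uint32_t *rows, int n_objects,
                              int n_attrs) {
    assert(n_objects <= 32 && n_attrs < 32);
    int count = 0;
    for (uint32_t attrs = 0; attrs < (1u << n_attrs); attrs++) {
        uint32_t closure = (1u << n_attrs) - 1u;
        for (int g = 0; g < n_objects; g++) {
            if ((rows[g] & attrs) == attrs) {
                closure &= rows[g]; /* object g belongs to the extent */
            }
        }
        if (closure == attrs) {
            count++;
        }
    }
    return count;
}
```

For the context of Table 1 (6 objects, 5 attributes) and its transpose (5 objects, 6 attributes), both counts come out as 10, as the symmetry requires.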
With one global queue, the InClose5 algorithm can process the transposed mushroom data, although it takes much longer. In contrast, as local queues use too much memory, the InClose4 algorithm cannot process the transposed mushroom data.
Some experiments were done to compare the running times of the InClose2, InClose4 and InClose5 algorithms. The experimental results are given in Table 5. Here the mushroom data and nursery data are from [6]. The experiments were carried out on a laptop computer with an Intel Core i5-2450M 2.50 GHz processor and 8 GB of RAM.
Table 5. Experimental results
            Mushroom   Nursery   Transposed Mushroom   Transposed Nursery
#concepts   233,101    154,055   233,101               154,055
InClose2    0.424      0.123     -                     -
InClose4    0.388      0.132     -                     -
InClose5    0.195      0.073     102.536               105.531
From Table 5, one can see that for the Mushroom data, InClose4 is faster than InClose2 and InClose5 is the fastest. For the Nursery data, InClose4 has no advantage over InClose2, as its 30 columns are far fewer than 64. As local queues use too much memory, neither InClose2 nor InClose4 can process the transposed mushroom data.
5 Conclusions and future work
In the InClose2, InClose3 and InClose4 algorithms, intents are stored in a linked-list tree structure and extents are stored in a linearised 2-dimensional array. This data structure is simple and effective. A crucial improvement in our algorithm is that both the context and the extents of concepts are stored as vertical bit arrays to optimise for RAM and cache memory, which also significantly reduces the time for processing the extents of concepts.
The object-oriented concept lattice is an extension of the concept lattice [25], and it is more difficult to construct. In the future, we will apply the data structures and techniques of these algorithms to object-oriented concept lattices, attribute-oriented concept lattices and so on.
References
 [1] S. Andrews, Inclose2, a high performance formal concept miner, in: S. Andrews, S. Polovina, R. Hill, B. Akhgar (Eds.), Conceptual Structures for Discovering Knowledge – Proceedings of the 19th International Conference on Conceptual Structures (ICCS), Springer, 2011, pp. 50–62.
 [2] S. Andrews, A ‘Best-of-Breed’ approach for designing a fast algorithm for computing fixpoints of Galois Connections, Information Sciences 295 (20) (2015) 633–649.
 [3] S. Andrews, InClose4 Program, 2017, https://sourceforge.net/projects/inclose/files/InClose/.
 [4] V.G. Blinova, D.A. Dobrynin, V.K. Finn, S.O. Kuznetsov, E.S. Pankratova, Toxicology analysis by means of the JSM-method, Bioinformatics 19 (10) (2003) 1201–1207.
 [5] B. Ganter, R. Wille, Formal Concept Analysis: Mathematical Foundations, Springer-Verlag, New York, 1999.
 [6] A. Frank, A. Asuncion, UCI Machine Learning Repository, 2010, http://archive.ics.uci.edu/ml.
 [7] S.O. Kuznetsov, On computing the size of a lattice and related decision problems, Order 18 (4) (2001) 313–321.
 [8] S.O. Kuznetsov, S.A. Obiedkov, Comparing performance of algorithms for generating concept lattices, J. Exp. Theor. Artif. Intell. 14 (2–3) (2002) 189–216.
 [9] S.O. Kuznetsov, Machine learning and formal concept analysis, in: Concept Lattices, Proceedings of the Second International Conference on Formal Concept Analysis, ICFCA 2004, Sydney, Australia, February 23–26, 2004, pp. 287–312.
 [10] S.O. Kuznetsov, S.A. Obiedkov, C. Roth, Reducing the representation complexity of lattice-based taxonomies, in: Conceptual Structures: Knowledge Architectures for Smart Applications, Proceedings of the 15th International Conference on Conceptual Structures, ICCS 2007, Sheffield, UK, July 22–27, 2007, pp. 241–254.
 [11] J. Li, C. Mei, Y. Lv, Incomplete decision contexts: approximate concept construction, rule acquisition and knowledge reduction, Int. J. Approx. Reason. 54 (1) (2013) 149–165.
 [12] J. Li, C. Mei, L. Wang, J. Wang, On inference rules in decision formal contexts, Int. J. Comput. Intell. Syst. 8 (1) (2015) 175–186.
 [13] J. Li, C. Mei, J. Wang, X. Zhang, Rule-preserved object compression in formal decision contexts using concept lattices, Knowl.-Based Syst. 71 (2014) 435–445.
 [14] J. Outrata, V. Vychodil, Fast algorithm for computing fixpoints of Galois connections induced by object-attribute relational data, Information Sciences 185 (1) (2012) 114–127.
 [15] P. Osicka, Algorithms for computation of concept trilattice of triadic fuzzy context, in: Advances in Computational Intelligence, Proceedings of the 14th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, IPMU 2012, Catania, Italy, July 9–13, 2012, pp. 221–230 (Part III).
 [16] N. Pasquier, Y. Bastide, R. Taouil, L. Lakhal, Efficient mining of association rules using closed itemset lattices, Inf. Syst. 24 (1) (1999) 25–46.
 [17] J. Poelmans, D.I. Ignatov, S. Viaene, G. Dedene, S.O. Kuznetsov, Text mining scientific papers: a survey on FCA-based information retrieval research, in: Advances in Data Mining. Applications and Theoretical Aspects, Proceedings of the 12th Industrial Conference, ICDM 2012, Berlin, Germany, July 13–20, 2012, pp. 273–287.
 [18] Z. Pei, D. Ruan, D. Meng, Z. Liu, Formal concept analysis based on the topology for attributes of a formal context, Information Sciences 236 (2013) 66–82.
 [19] J. Qi, W. Liu, L. Wei, Computing the set of concepts through the composition and decomposition of formal contexts, in: International Conference on Machine Learning and Cybernetics, Proceedings, ICMLC 2012, Xian, Shaanxi, China, July 15–17, 2012, pp. 1326–1332.
 [20] J. Qi, T. Qian, L. Wei, The connections between three-way and classical concept lattices, Knowl.-Based Syst. 91 (2016) 143–151.
 [21] J. Qi, L. Wei, Z. Li, A partitional view of concept lattice, in: Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, Proceedings of the 10th International Conference, RSFDGrC 2005, Regina, Canada, August 31–September 3, 2005, pp. 74–83 (Part I).
 [22] R. Ren, L. Wei, The attribute reductions of three-way concept lattices, Knowl.-Based Syst. 99 (2016) 92–102.
 [23] M. Shao, H. Yang, W. Wu, Knowledge reduction in formal fuzzy contexts, Knowl.-Based Syst. 73 (2015) 265–275.
 [24] Q. Wan, L. Wei, Approximate concepts acquisition based on formal contexts, Knowl.-Based Syst. 75 (2015) 78–86.
 [25] Y.Y. Yao, Concept lattices in rough set theory, Processing Nafips 04 IEEE Meeting of the Fuzzy Information, Canada: IEEE, September 27, 2004: 796–801.