1 Introduction
Semantic similarity measures serve as the foundation of knowledge discovery and management processes such as ontology matching, ontology alignment and mapping, and ontology merging [Shvaiko and Euzenat2013]. Ontological concept similarity can be based on different approaches: (i) string matching of concept labels (i.e. lexical similarity) [Stoilos et al.2005], (ii) external lexical resource/ontology based matching (i.e. lexico-semantic similarity) [Rada et al.1989a], (iii) graph-based matching using lexicons such as WordNet [Stuckenschmidt2007] (i.e. structural similarity), (iv) property analysis (as in FCA-based similarity [Cimiano et al.2005]) or instance analysis (as in Jaccard similarity [Jaccard1998]) based matching over a large sample of concept instance occurrences (i.e. instance-driven similarity), (v) matching based on statistical analysis of attribute-value or distribution analysis within fixed context windows of concepts over large corpora (i.e. statistical similarity) [Li and Clifton1994], and (vi) model-theoretic matching of formal concept descriptions (i.e. formal semantic similarity) [Alsubait et al.2014]. It can be argued that, in comparison to the other approaches, formal semantic similarity measure modeling has not received equal research attention. Nevertheless, the existing literature is significant, and can be broadly classified into two approaches: (i) Propositional Logics based [NienhuysCheng1998, Ramon and Bruynooghe1998], and (ii) Description Logics (DL) based [Alsubait et al.2014, Lehmann and Turhan2012, Stuckenschmidt2007, Fanizzi and d’Amato2006, Borgida et al.2005]. The former requires: (a) representation of ontologies (mostly in RDFS/OWL format) in First Order Predicate Logic, (b) a set of axioms (or domain knowledge, mostly as upper ontologies/thesauri), and (c) a SAT solver that checks the satisfiability of disjointness of concept pairs. The latter approach, on the other hand, does not necessarily require any formal language transformation or satisfiability checker.
In this paper we propose an algebraic similarity measure, called BitSim, that computes the semantic similarity of a pair of concepts defined in the DL fragment described in Section 3.1. The motivation is to formulate a formal semantic similarity measure that provides: (i) a platform for fast, scalable, and accurate semantic similarity computation of DL concepts, and (ii) a sound and complete correspondence with the conventional semantic interpretation of DL. BitSim is algebraic in the sense that it maps a given pair of concept codes (called bitcodes), instead of concept DL definitions/axioms, to a positive real space. For this we define a novel algebraic interpretation function, called bit-interpretation, that maps a concept definition to a unique string, called a bitcode, belonging to a language defined over a novel algebraic alphabet. We prove that the bit-interpretation has complete correspondence with the conventional semantic interpretation. We also show that BitSim is highly adaptive to any kind of similarity measure that relies on set operations. As an example, we show how BitSim can be plugged into the Jaccard similarity index. The contributions of the paper are as follows:
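Bit-interpretation: a novel algebraic semantic interpretation function for the DL fragment.

Proof of the mathematical correspondence of the bit-interpretation with the conventional semantic interpretation of the DL fragment.

BitSim: a novel semantic similarity measure based on the bit-interpretation.

Comparative analysis of the properties of BitSim against contemporary DL based similarity measures.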
2 Related Work
DL based similarity measures, as described in the introduction, can be further subdivided into: (i) taxonomic analysis, (ii) structural analysis [Tongphu and Suntisrivaraporn2014, Ontañón and Plaza2012, Joslyn et al.2008, Hariri et al.2006], (iii) language approximation [Stuckenschmidt2007, Tserendorj et al.2008, Groot et al.2005, Brandt et al.2002], and (iv) model-theoretic analysis [Distel et al.2014, Alsubait et al.2014, Lehmann and Turhan2012, Borgida et al.2005]. The most common approach to DL based similarity measure modeling adopts taxonomic analysis as proposed in [Rada et al.1989b, Resnik and others1999, Jiang and Conrath1997, Wu and Palmer1994, Lin1998]. These techniques can be further subdivided, as mentioned in the introduction, into graph-traversal approaches [Rada et al.1989b] and Information-Content approaches [Resnik and others1999]. However, these methods can work on a generalized ontology and hence are not sensitive to DL definitions.
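To make the graph-traversal idea concrete, the following is a minimal sketch of an edge-counting measure in the spirit of [Rada et al.1989b]. The toy taxonomy, the function names, and the 1/(1+d) normalization are assumptions made for illustration; they are not taken from the cited work. Note that only the is-a graph is consulted, which is why such measures are insensitive to DL definitions.

```python
from collections import deque

# Hypothetical toy is-a taxonomy (child -> list of parents); names are illustrative only.
IS_A = {
    "Dog": ["Mammal"],
    "Cat": ["Mammal"],
    "Mammal": ["Animal"],
    "Bird": ["Animal"],
    "Animal": [],
}

def edge_distance(a, b, is_a=IS_A):
    """Shortest path length between a and b, treating is-a links as undirected edges."""
    neighbours = {c: set(parents) for c, parents in is_a.items()}
    for child, parents in is_a.items():
        for p in parents:
            neighbours[p].add(child)
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None  # the two concepts are not connected

def path_similarity(a, b):
    """Map a distance d to (0, 1] via 1 / (1 + d); this normalization is an assumed convention."""
    d = edge_distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)
```

For instance, `edge_distance("Dog", "Cat")` counts the two edges through `Mammal`, regardless of how `Dog` and `Cat` might be defined in a TBox.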
In structural analysis based approaches, a similarity measure is designed to capture the semantic equivalence of the description trees of the definitions of DL concept pairs. One way of achieving this is to calculate the degree of homomorphism between such trees, as proposed in [Tongphu and Suntisrivaraporn2014]. A refinement graph based anti-unification approach has been proposed in [Ontañón and Plaza2012] for computing instance similarity. Approximation based techniques aim at converting a given DL expression into an approximation in another, less expressive DL language. In [Stuckenschmidt2007] upper and lower approximation interpretations have been defined over a sub-vocabulary. The sub-vocabulary can be formed either by removing concept atoms from a given definition or by replacing them with structurally simpler concepts [Groot et al.2005]. Another technique, proposed in [Noia et al.2004], converts a user query into a DL expression and tries to classify the match as an exact match, a potential match (i.e., a match might happen if some concept atoms and operators are added), or a partial match (where the user query and the answer/description found in the knowledge base are in conflict).
One of the pioneering works on model-theoretic interpretation based similarity can be found in [Borgida et al.2005]. The work shows the inherent difficulty of measuring the similarity of DL concepts using conventional taxonomic analysis based techniques. It then uses an Information-Content based approach to evaluate the similarity of two concept definitions. A framework has been proposed by [Lehmann and Turhan2012] for similarity computation of concepts defined in ELH. In this work, a Jaccard Index [Jaccard1998] based approach is followed that compares the common parents of a concept pair using a fuzzy connector (i.e. a similarity score aggregation function). A similar Jaccard Index based approach has recently been proposed in [Alsubait et al.2014]. Another recent approach has been proposed in [Distel et al.2014]. The work emphasizes the necessity of the triangle inequality property for a formal semantic similarity measure. It defines two versions of a relaxation function for computing the dissimilarity of DL concepts. However, it can be proved that the triangle inequality is not always valid and hence is not a necessary condition.
3 Preliminaries
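3.1 A Description Logics Fragment
We hereby define the semantic interpretation of the DL fragment considered in this paper (an extension of which also interprets the current OWL 2 specification). The syntax follows the conventional DL syntax defined in [Baader2003]. Let $\cdot^{\mathcal{I}}$ be an interpretation function and $\Delta^{\mathcal{I}}$ be the universal domain. For an atomic concept $A$, concepts $C$, $D$, and roles $R$, $S$, the interpretation is defined as:

$\mathcal{AL}$ (Attributive Language):

Atomic concept: $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$

Role: $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$

Atomic Negation: $(\neg A)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus A^{\mathcal{I}}$

Top Concept: $\top^{\mathcal{I}} = \Delta^{\mathcal{I}}$

Bottom Concept: $\bot^{\mathcal{I}} = \emptyset$

Conjunction: $(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$

Value Restriction: $(\forall R.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \forall b.\ (a,b) \in R^{\mathcal{I}} \rightarrow b \in C^{\mathcal{I}}\}$

Limited Existential Restriction: $(\exists R.\top)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \exists b.\ (a,b) \in R^{\mathcal{I}}\}$


$\mathcal{E}$ (Full Existential Restriction): $(\exists R.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \exists b.\ (a,b) \in R^{\mathcal{I}} \wedge b \in C^{\mathcal{I}}\}$

$\mathcal{C}$ (Concept Negation): $(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$

$\mathcal{H}$ (Role Hierarchy): $R \sqsubseteq S$ holds iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$

Role Union: $(R \sqcup S)^{\mathcal{I}} = R^{\mathcal{I}} \cup S^{\mathcal{I}}$

Role Intersection: $(R \sqcap S)^{\mathcal{I}} = R^{\mathcal{I}} \cap S^{\mathcal{I}}$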
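The set-theoretic reading of the constructors listed above can be illustrated over a small finite domain. The sketch below is an assumption-laden toy: the domain, the concept and role names, and the tuple-based expression syntax are all hypothetical and chosen only to show how each constructor maps to a set operation.

```python
# Toy finite interpretation; the domain, concept names and role names are hypothetical.
DOMAIN = {"a", "b", "c", "d"}
CONCEPTS = {"Person": {"a", "b", "c"}, "Doctor": {"b"}}
ROLES = {"hasChild": {("a", "b"), ("a", "c"), ("c", "d")}, "knows": {("b", "a")}}

def role_interp(r):
    """Extension of a role: a role name, or ("or", R, S) / ("and", R, S)."""
    if isinstance(r, str):
        return ROLES[r]
    left, right = role_interp(r[1]), role_interp(r[2])
    if r[0] == "or":
        return left | right   # role union
    if r[0] == "and":
        return left & right   # role intersection
    raise ValueError(f"unknown role constructor: {r[0]}")

def successors(role, x):
    """All y such that (x, y) belongs to the role extension."""
    return {y for (s, y) in role if s == x}

def interp(expr):
    """Extension of a concept expression given as a nested tuple.

    Supported forms: ("top",), ("bottom",), ("atom", A), ("not", C),
    ("and", C, D), ("all", R, C), ("some", R, C).
    """
    op = expr[0]
    if op == "top":
        return set(DOMAIN)
    if op == "bottom":
        return set()
    if op == "atom":
        return set(CONCEPTS[expr[1]])
    if op == "not":
        return DOMAIN - interp(expr[1])
    if op == "and":
        return interp(expr[1]) & interp(expr[2])
    if op in ("all", "some"):
        role, filler = role_interp(expr[1]), interp(expr[2])
        if op == "all":   # value restriction: every successor lies in the filler
            return {x for x in DOMAIN if successors(role, x) <= filler}
        return {x for x in DOMAIN if successors(role, x) & filler}  # existential restriction
    raise ValueError(f"unknown concept constructor: {op}")
```

Elements without any successor satisfy a value restriction vacuously, so `interp(("all", "hasChild", ("atom", "Person")))` also contains the childless elements of the toy domain.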
3.2 Formal Similarity Measure
In this section we state the algebraic properties of a formal semantic similarity measure as given in [Lehmann and Turhan2012]. Let $C$ be a DL concept in a given terminology (TBox) $\mathcal{T}$.
Definition 1: A semantic similarity measure is a function defined as follows:
where
Properties of Similarity Measure: Arguably, a similarity measure should satisfy the following properties (* denotes that a property is not universally adopted as a necessary condition; such a property also cannot be proven to be valid in all types of algebraic spaces):
(1) 
(2) 
(3) 
(4) 
(5) 
(6) 
(7) 
(8)  
(9) 
(10) 
It should be noted that the aforementioned necessary properties may not be sufficient and hence, detailed theoretical analysis has to be done on sufficiency.
4 Formal Language for Concept Coding
In this section we define the formal language over which the proposed algebraic interpretation function is defined. We first define the bits that form its alphabet, as follows:
Definition 2: A base alphabet is an alphabet defined as $\{0, 1\}$, where

0 is the empty symbol. It is also called the potential bit, since it generates all other symbols (i.e. bits).

1 is the base bit, called the property bit, signifying the presence of a property at the string position that it holds.
Definition 3: An bit operator () is a set of operators on the base bits defined as: .
We now define a very important semantics for potential bit (i.e. 0) as follows:
(1) 
Definition 4: An derived alphabet () is an alphabet defined as: where
Based on the above definition the following observations can be made (using de Morgan’s law):
A further analysis shows that has a partial ordering (as shown in figure 1). It is interesting to note that .
Definition 5.1: An bitalphabet () is defined as
: .
It is to be noted that satisfies commutativity and double negation over .
Definition 5.2: A quantifieralphabet () is defined as
.
The algebraic space of is defined as below:
Definition 5.3: A rolealphabet () is defined as
.
The algebraic space of is defined as below:
Definition 6.1: A base bitcode () is defined as where is bit concatenation operator; ; ; .
Definition 6.2: A derived bitcode () is defined as = () () where is string concatenation operator.
It can be observed that the definition of a derived bitcode is recursive. We defer the explanation and utility of this definition to Section 5.3.
Definition 7: is defined as .
5 Encoding Concept
5.1 Motivation
The motivation behind BitSim is to develop a formal, efficient, and scalable matchmaking system that can be applied to DL based knowledge bases. Unlike other DL based similarity measures, BitSim was designed to satisfy all the necessary properties defined in Section 3.2, with special emphasis on structural dependency and strict monotonicity.
At the same time, BitSim is computed over bitcodes rather than over DL concept definitions. This significantly improves the computational speed, since BitSim essentially becomes a function over bitcode pairs (such bitcodes can be computed and stored offline in the knowledge base). Since computation is performed on pairs of bits holding the same position in the two bitcodes, we can split bitcodes into chunks of constant size and compute similarity over concept pairs on parallel computational platforms. This makes BitSim highly scalable. Further optimization is possible by caching the similarity results of bitcode chunks that are frequently visited.
We will also show that a partial ordering holds at the bit level. This allows application-oriented assignment of similarity scores to bit pairs at the lowest granularity. Also, BitSim is highly adaptive to all types of similarity measures that use set-theoretic operations on DL concepts.
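The chunking and caching idea described in this section can be sketched as follows. This is only an illustration under stated assumptions: bitcodes are modeled as plain Python strings, the chunk width `CHUNK` is arbitrary, and `chunk_score` is a placeholder that counts matching positions; it does not reproduce the bit-level similarity of BitSim.

```python
from functools import lru_cache

CHUNK = 8  # chunk width in symbols; an arbitrary choice for this sketch

@lru_cache(maxsize=None)
def chunk_score(x, y):
    """Placeholder per-chunk score: number of positions holding the same symbol.

    This stands in for a bit-level similarity; it is NOT the function defined in the paper.
    """
    return sum(1 for p, q in zip(x, y) if p == q)

def chunks(code, size=CHUNK):
    """Split a bitcode string into consecutive pieces of at most `size` symbols."""
    return [code[i:i + size] for i in range(0, len(code), size)]

def positional_similarity(code_a, code_b):
    """Aggregate the per-chunk scores of two equal-length bitcodes into [0, 1]."""
    if len(code_a) != len(code_b):
        raise ValueError("bitcodes must have the same length")
    if not code_a:
        return 1.0
    total = sum(chunk_score(x, y) for x, y in zip(chunks(code_a), chunks(code_b)))
    return total / len(code_a)
```

Because each chunk pair is scored independently of the others, the `sum` over chunks could be distributed across workers, and `lru_cache` lets chunk pairs that recur across many concept pairs be scored only once.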
5.2 Encoding Atomic Concepts
Before we show that the bit-interpretation has complete correspondence with the conventional semantic interpretation, we first provide the foundational axioms that help us to encode atomic concepts as bitcodes. For that we need to define the proposed algebraic interpretation function (also called bit-interpretation).
Definition 8: Bit-interpretation is a function defined as follows:
: ;
We hereby define the bitcode of an arbitrary atomic concept using the following two axioms:
(1) 
(2) 
where k is the k-th position (counted in increasing order from right to left) in the bitcode. In the second axiom, this position holds the significant property bit.
We now show the method to encode inclusion axioms on atomic concepts using the following two axioms:
(3) 
(4) 
The above four axioms ensure that a simple boolean intersection between the bitcodes of any pair of atomic concepts generates the bitcode of their least common subsumer (lcs) atomic concept. An example encoding instance is illustrated in Figure 2. For atomic roles we can use the same principle of encoding with an analogous alphabet.
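The effect claimed for the four axioms, namely that a boolean intersection of two atomic bitcodes yields the bitcode of their least common subsumer, can be illustrated with a simplified binary encoding. The sketch below is an assumption, not the paper's axioms: it uses only the symbols 0 and 1, represents bitcodes as Python integers, and assumes a tree-shaped taxonomy with hypothetical concept names. Each concept receives its own property bit at a unique position (counted from the right) and inherits all bits of its parent.

```python
# Hypothetical tree-shaped taxonomy: child -> parent (None marks the root).
PARENT = {"Animal": None, "Mammal": "Animal", "Bird": "Animal",
          "Dog": "Mammal", "Cat": "Mammal"}

def encode(parent=PARENT):
    """Give each atomic concept its own property bit and inherit the parent's bits."""
    position = {c: k for k, c in enumerate(parent)}  # k-th position, counted from the right
    codes = {}

    def code(c):
        if c not in codes:
            inherited = 0 if parent[c] is None else code(parent[c])
            codes[c] = inherited | (1 << position[c])
        return codes[c]

    for c in parent:
        code(c)
    return codes

def lcs(a, b, codes):
    """Bitwise AND keeps exactly the shared property bits; return the concept owning that code."""
    common = codes[a] & codes[b]
    return next(c for c, w in codes.items() if w == common)

def subsumed_by(a, b, codes):
    """a is subsumed by b when a carries every property bit of b."""
    return (codes[a] & codes[b]) == codes[b]
```

With this encoding, `lcs("Dog", "Cat", encode())` returns `"Mammal"`. The lookup in `lcs` relies on the tree assumption; in a taxonomy with multiple inheritance the intersection need not coincide with the code of a single named concept.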
5.3 Encoding Derived Concepts
For encoding derived concepts, we cannot attain completeness using only. This is because, for a bounded number of atomic concepts (say, n) we need a mechanism to encode distinct and disjoint concepts, in the worst case. However, with only 11 bits in , we can only encode distinct concepts. It is because of this reason that we need to use . The method is to have nested encoding for bit operations over certain special bit pairs. These operations do not have direct mapping to the algebraic lattice shown in figure 1. Instead they map to intermediate and discrete compound bits (). which can be represented in terms of . We define as follows:
Definition 9: A compound bit is defined as:
= , where the algebra of is defined as shown in figure 3.
For any derived concept , we can state the following:
(5) 
Based on axiom 5 we can state that:
Lemma 1.
For any binary operation between and , the length of both the operands, and , must be same; where k: kth position of in .
Lemma 2.
For any binary operation between and , the length of the resultant has growth = , where and n is length of operand .
Lemma 3.
If , then
Lemma 4.
If , then
Lemma 5.
If , then
Lemma 6.
If , then
We now postulate the following axioms for operations over derived concepts:
(6) 
(7) 
(8) 
(9) 
It is to be noted that, in the above axioms, a bit can be either a simple or a compound bit. We now prove the mathematical correspondence between the bit-interpretation and the conventional semantic interpretation.
Lemma 7.
k: kth position in
Proof.
Proof can be derived from the lattice structure of (see figure 1.) ∎
Following the above lemma we can state that:
Theorem 1.
Theorem 2.
Proof.
, if ; else converse over . ∎
Theorem 3.
Proof.
From axiom 7 and 8, we can show that is unique. This is because is unique. In other words, the algebra has complete correspondence with . ∎
Theorem 4.
, is unique.
Proof.
Follows from axiom 4 and lemmas 1–6. ∎
Theorem 5.
has complete correspondence with
Proof.
The proof follows from theorems 1–3. ∎
6 BitSim Similarity Measure
6.1 Outline
In this section we provide a generic definition of BitSim. We first define similarity at the bit level as follows:
Definition 10: : ; where .
It is to be noted that can be as well. One can see that has a total order (see figure 4). In order to compute similarity at a bitcode level, we define an aggregation function called . There are two parameters that should influence the value output of : (i) and (ii) codegenerativity (). We define codegenerativity as follows:
Definition 11: Codegenerativity () of any of a concept is the total number of distinct and disjoint concepts that are covered by the .
As an example, ; (i.e. ). We now define the similarity measure at a bitcode level (we call it as follows:
Definition 12: : { } .
We now postulate the following axioms:
(10) 
(11) 
(12) 
6.2 BitSim: Property Analysis
In this section we show that BitSim satisfies the necessary conditions: (i) reflexivity, (ii) maximality, (iii) equivalence closure, (iv) equivalence invariance, (v) structural dependency, (vi) subsumption preservation, (vii) reverse subsumption preservation, and (viii) strict monotonicity. The first two properties hold trivially. The following theorems show that the rest of the properties hold:
Theorem 6.
Equivalence closure and equivalence invariance hold true for BitSim.
Proof.
Follows from theorem 1 and theorem 4. ∎
Theorem 7.
Structural dependency holds true for BitSim.
Proof.
Under the condition of structural dependency, for two concepts and the that is generated for them will have a length, say l, that grows exponentially as the number of inner intersections in the definition of both and tends to . Therefore, except for the word length of the invariant concepts in the definition of and , all of and will be exact. Hence, () ∎
Theorem 8.
Strict monotonicity holds true for BitSim.
Proof.
When more than one concepts (say, ) subsume two (say, ) out of three arbitrary concepts (say, ), while only one (say, ) subsumes all three, then () (). This is because, since is property based measure, will inherit more common bits than . ∎
We now show how BitSim can be adapted to a third-party similarity measure as follows:
Definition 13:  ()
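As an illustration of plugging bitcodes into a set-based measure such as the Jaccard index [Jaccard1998], the sketch below treats each bitcode as the set of positions holding a property bit. It is a simplification under explicit assumptions: bitcodes are restricted to the symbols 0 and 1 and modeled as Python integers, and the function name `jaccard_bits` is hypothetical. It does not reproduce Definition 13.

```python
def jaccard_bits(code_a: int, code_b: int) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the property bits set in each code."""
    union = code_a | code_b
    if union == 0:
        return 1.0  # two empty codes are treated as identical (a convention assumed here)
    shared = code_a & code_b
    return bin(shared).count("1") / bin(union).count("1")
```

Set intersection and union become bitwise AND and OR, so the measure needs only two bit operations and two population counts per concept pair.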
7 Discussion
As can be seen, the proposed bit-interpretation can be adapted as an alternative paradigm for DL subsumption reasoning. Since bitcodes can be mapped to a boolean space, bit operations can be performed at high speed, including on distributed and parallel platforms. At the same time, various caching techniques can be applied efficiently. In the future we will explore these research prospects and other possibilities such as probabilistic reasoning, ABox reasoning, and reasoning over more expressive DLs.
8 Conclusion
In this paper we have proposed BitSim, an algebraic similarity measure for DL concept definitions. We have shown that BitSim satisfies all the necessary algebraic properties recommended for a formal similarity measure. Being based on the bit-interpretation, BitSim is highly sensitive to the standard DL interpretation. Furthermore, BitSim is highly adaptive to any similarity measure that uses set-theoretic operations.
References
 [Alsubait et al.2014] Tahani Alsubait, Bijan Parsia, and Uli Sattler. Measuring similarity in ontologies: A new family of measures. In Knowledge Engineering and Knowledge Management, pages 13–25. Springer, 2014.
 [Baader2003] Franz Baader. The description logic handbook: theory, implementation, and applications. Cambridge university press, 2003.
 [Borgida et al.2005] Alexander Borgida, Thomas J Walsh, and Haym Hirsh. Towards measuring similarity in description logics. Description Logics, 147, 2005.
 [Brandt et al.2002] Sebastian Brandt, Ralf Küsters, and Anni-Yasmin Turhan. Approximation and difference in description logics. In KR, pages 203–214, 2002.
 [Cimiano et al.2005] Philipp Cimiano, Andreas Hotho, and Steffen Staab. Learning concept hierarchies from text corpora using formal concept analysis. J. Artif. Intell. Res.(JAIR), 24:305–339, 2005.
 [Distel et al.2014] Felix Distel, Jamal Atif, and Isabelle Bloch. Concept dissimilarity with triangle inequality. KR’14, 2014.
 [Fanizzi and d’Amato2006] Nicola Fanizzi and Claudia d’Amato. A similarity measure for the description logic. Proceedings of CILC, pages 26–27, 2006.
 [Groot et al.2005] Perry Groot, Heiner Stuckenschmidt, and Holger Wache. Approximating description logic classification for semantic web reasoning. In The Semantic Web: Research and Applications, pages 318–332. Springer, 2005.
 [Hariri et al.2006] Babak Bagheri Hariri, Hassan Abolhassani, and Ali Khodaei. A new structural similarity measure for ontology alignment. In SWWS, pages 36–42, 2006.
 [Jaccard1998] P Jaccard. Etude comparative de la distribution florale dans une portion des alpes et des jura. In Bull Soc Vaudoise Sc Nat, volume 37, pages 547–579, 1998.
 [Jiang and Conrath1997] Jay J Jiang and David W Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. arXiv preprint cmp-lg/9709008, 1997.
 [Joslyn et al.2008] Cliff Joslyn, Alex Donaldson, and Patrick R Paulson. Evaluating the structural quality of semantic hierarchy alignments. In International Semantic Web Conference (Posters & Demos), volume 401, 2008.

 [Lehmann and Turhan2012] Karsten Lehmann and Anni-Yasmin Turhan. A framework for semantic-based similarity measures for ELH-concepts. In Logics in Artificial Intelligence, pages 307–319. Springer, 2012.
 [Li and Clifton1994] Wen-Syan Li and Chris Clifton. Semantic integration in heterogeneous databases using neural networks. In VLDB, volume 94, pages 12–15, 1994.
 [Lin1998] Dekang Lin. An information-theoretic definition of similarity. In ICML, volume 98, pages 296–304, 1998.
 [NienhuysCheng1998] Shan-Hwei Nienhuys-Cheng. Distances and limits on Herbrand interpretations. In Inductive Logic Programming, pages 250–260. Springer, 1998.
 [Noia et al.2004] Tommaso Di Noia, Eugenio Di Sciascio, Francesco M Donini, and Marina Mongiello. A system for principled matchmaking in an electronic marketplace. International Journal of Electronic Commerce, 8(4):9–37, 2004.
 [Ontañón and Plaza2012] Santiago Ontañón and Enric Plaza. Similarity measures over refinement graphs. Machine learning, 87(1):57–92, 2012.
 [Rada et al.1989a] Roy Rada, Hafedh Mili, Ellen Bicknell, and Maria Blettner. Development and application of a metric on semantic nets. Systems, Man and Cybernetics, IEEE Transactions on, 19(1):17–30, 1989.
 [Rada et al.1989b] Roy Rada, Hafedh Mili, Ellen Bicknell, and Maria Blettner. Development and application of a metric on semantic nets. Systems, Man and Cybernetics, IEEE Transactions on, 19(1):17–30, 1989.
 [Ramon and Bruynooghe1998] Jan Ramon and Maurice Bruynooghe. A framework for defining distances between firstorder logic objects. In Inductive Logic Programming, pages 271–280. Springer, 1998.
 [Resnik and others1999] Philip Resnik et al. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. J. Artif. Intell. Res.(JAIR), 11:95–130, 1999.
 [Shvaiko and Euzenat2013] Pavel Shvaiko and Jérôme Euzenat. Ontology matching: state of the art and future challenges. Knowledge and Data Engineering, IEEE Transactions on, 25(1):158–176, 2013.
 [Stoilos et al.2005] Giorgos Stoilos, Giorgos Stamou, and Stefanos Kollias. A string metric for ontology alignment. In The Semantic Web–ISWC 2005, pages 624–637. Springer, 2005.
 [Stuckenschmidt2007] Heiner Stuckenschmidt. Partial matchmaking using approximate subsumption. In Proceedings of the National Conference on Artificial Intelligence, page 1459, 2007.
 [Tongphu and Suntisrivaraporn2014] Suwan Tongphu and Boontawee Suntisrivaraporn. On desirable properties of the structural subsumption-based similarity measure. 4th Joint International Semantic Technology Conference (JIST), 2014.
 [Tserendorj et al.2008] Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approximate OWL-reasoning with Screech. In Web Reasoning and Rule Systems, pages 165–180. Springer, 2008.
 [Wu and Palmer1994] Zhibiao Wu and Martha Palmer. Verbs semantics and lexical selection. In Proceedings of the 32nd annual meeting on Association for Computational Linguistics, pages 133–138. Association for Computational Linguistics, 1994.