1 Introduction
Hypergraphs were introduced in Berge and Minieka (1973). A hypergraph is defined as a family of nonempty subsets, called hyperedges, of a set of vertices. Elements of a set are unique; hence the elements of a given hyperedge are also unique in a hypergraph.
Multisets extend sets by allowing duplication of elements. As mentioned in Singh et al. (2007), N.G. de Bruijn proposed to Knuth the term multiset to replace a variety of existing terms, such as bag or weighted set. Multisets are used in database modelling: in Albert (1991) the foundations of an extension of relational algebra were introduced to manipulate bags (see also Klug (1982)) by studying the algebraic properties of bags. Queries over such bags have been studied in a series of articles (see references in Grumbach et al. (1996)): bags are heavily used in database queries as searching for duplicates is a costly operation. In Hernich and Kolaitis (2017) information integration under bag semantics is studied, as well as the tractability of some algorithmic problems over bag semantics: they showed that the GLAV (Global-And-Local-As-View) mapping problem for two databases becomes intractable under such semantics. In Radoaca (2015a) and Radoaca (2015b), the author extensively studies multisets and proposes to represent them by two kinds of Venn diagrams. Multisets are also used in P-computing in the form of labelled multisets, called membranes; see Păun (2006) for more details.
Taking advantage of this allowance for duplication, we construct in this article an extension of hypergraphs called hyper-bag-graphs (hbgraphs for short). There are two main reasons for such an extension. The first is that multisets are extensively used in databases as they allow the presence of duplicates (Lamperti et al. (2000)), removing duplicates (and thus obtaining sets and hypergraphs) being an expensive operation. The second is that natural hbgraphs, i.e. hbgraphs based on multisets with nonnegative integer multiplicity values, allow results on the hbgraph adjacency tensor: hypergraphs being a particular case of hbgraphs, hypergraph adjacency tensors other than the ones proposed in Banerjee et al. (2017) and Ouvrard et al. (2017) can be built by giving a meaningful interpretation to the steps taken during their construction via hbgraphs.
Section 2 gives the mathematical background, including the main definitions on hypergraphs and multisets. Section 3 gives the mathematical construction of hyper-bag-graphs (or hbgraphs). Section 4 gives an algebraic description of hbgraphs and its consequences for the adjacency tensor of hypergraphs. Section 5 gives results on the constructed tensors. Section 6 evaluates the constructed tensors and proceeds to a final choice of hypergraph adjacency tensor. Section 7 outlines future work.
2 Mathematical background
2.1 Hypergraphs
As mentioned in Ouvrard et al. (2017), hypergraphs are suited to modelling collaboration networks (Newman (2001a, b)), co-author networks (Grossman and Ion (1995), Taramasco et al. (2010)), chemical reactions (Temkin et al. (1996)), genomes (Chauve et al. (2013)), VLSI design (Karypis et al. (1999)) and other applications. More generally, hypergraphs are well suited to retaining information on how entities are grouped. Hypergraphs succeed in capturing $p$-adic relationships. In Berge and Minieka (1973), Stell (2012) and Bretto (2013) hypergraphs are defined in different ways. In this article, the definition of Bretto (2013) is used, as it does not require the union of the hyperedges to cover the vertex set:
Definition 2.1.
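An (undirected) hypergraph $\mathcal{H}=(V,E)$ on a finite set of vertices (or nodes) $V=\{v_1,\dots,v_n\}$ is defined as a family of hyperedges $E=(e_1,\dots,e_p)$ where each hyperedge is a nonempty subset of $V$.
A weighted hypergraph is a triple $\mathcal{H}_w=(V,E,w)$, where $(V,E)$ is a hypergraph and $w$ is a mapping that associates each hyperedge $e\in E$ to a real number $w(e)$.
The 2-section of a hypergraph $\mathcal{H}=(V,E)$ is the graph $[\mathcal{H}]_2=(V,E')$ such that: for all distinct $u,v\in V$, $\{u,v\}\in E'$ if and only if there exists $e\in E$ with $u\in e$ and $v\in e$.
Let $k\in\mathbb{N}^*$. A hypergraph is $k$-uniform if all its hyperedges have the same cardinality $k$.
A directed hypergraph is a hypergraph where each hyperedge $e_i$ admits a partition into two nonempty subsets, called the source, written $e_{s\,i}$, and the target, written $e_{t\,i}$, with $e_{s\,i}\cap e_{t\,i}=\emptyset$.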
Definition 2.2.
Let $\mathcal{H}=(V,E)$ be a hypergraph.
The degree of a vertex is the number of hyperedges it belongs to. For a vertex $v_i$, it is written $d_i$ or $\deg(v_i)$. It holds: $\deg(v_i)=\left|\{e\in E : v_i\in e\}\right|$.
In this article only undirected hypergraphs will be considered. Hyperedges link one or more vertices together. Broadly speaking, hyperedges play in hypergraphs the role that edges play in graphs.
2.2 Multisets
2.2.1 Generalities
Basics on multisets are given in this section, based mainly on Singh et al. (2007).
Definition 2.3.
Let $A$ be a set of distinct objects. Let $W\subseteq\mathbb{R}^+$.
Let $m$ be a mapping from $A$ to $W$.
Then $A_m=(A,m)$ is called a multiset, or mset or bag, on $A$.
$A$ is called the ground or the universe of the multiset $A_m$, and $m$ is called the multiplicity function of the multiset $A_m$.
$A_m^\star=\{x\in A : m(x)\neq 0\}$ is called the support, or root or carrier, of $A_m$.
The elements of the support of a mset are called its generators.
A multiset where $W\subseteq\mathbb{N}$ is called a natural multiset.
We write $\mathcal{M}(A)$ for the set of all multisets of universe $A$.
Some extensions of multisets exist where the multiplicity function can have its range in $\mathbb{Z}$; these are called hybrid sets in Loeb (1992). Other extensions exist, such as fuzzy multisets (Syropoulos (2000)).
Several notations of msets exist. One common notation, which we will use, if $A=\{x_1,\dots,x_n\}$ is the ground of a mset $A_m$, is to write: $A_m=\{x_1^{m_1},\dots,x_n^{m_n}\}$
where $m_i=m(x_i)$.
Another notation is: $A_m=[x_1^{m_1},\dots,x_n^{m_n}]$
or even: $A_m=[x_1,\dots,x_n]_{m_1,\dots,m_n}$
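If $A_m$ is a natural multiset, another notation is: $A_m=\{\{\underbrace{x_1,\dots,x_1}_{m_1\text{ times}},\dots,\underbrace{x_n,\dots,x_n}_{m_n\text{ times}}\}\}$
which amounts to an unordered list.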
Remark 2.1.

Two msets can have same support and same support objects multiplicities but can differ by their universe.
Also to be equal two msets must have same universe, same support and same multiplicity function. 
The multiplicity function corresponds to a weight that is associated to objects of the universe.

Multiplicity in natural multisets can also be interpreted as a duplication of support elements. In this case, a mset can be viewed as an unordered list with repetition. In a natural multiset the $m(x)$ copies of a generator $x$ of the support are called elements of the multiset.

Some definitions of multisets also consider $W\subseteq\mathbb{R}$, which could lead to interesting applications. We do not develop this case here.
Definition 2.4.
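Let $A_m=(A,m)$ be a mset.
The m-cardinality of $A_m$, written $\#_m A_m$, is defined as: $\#_m A_m=\sum_{x\in A} m(x)$.
The cardinality of $A_m$, written $\# A_m$, is defined as: $\# A_m=\left|A_m^\star\right|$.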
Remark 2.2.
In general multisets, m-cardinality and cardinality are two separate notions. For instance, $A=\{a^{1},b^{1}\}$, $B=\{a^{0.5},b^{1.5}\}$ and $C=\{a^{2},b^{3}\}$ all have the same cardinality, 2, while the m-cardinality of $C$, 5, differs from that of $A$ and $B$, 2.
In natural multisets, m-cardinality and cardinality are equal if and only if the multiplicity of each element in the support is 1, i.e. if the natural multiset is a set. This does not generalize to general multisets; see $A$ and $B$ of the former example.
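The distinction between support, cardinality and m-cardinality can be sketched in a few lines of Python. The representation below (a dictionary mapping every element of the universe to its multiplicity) and the three sample msets are illustrative assumptions, not notation taken from the article.

```python
def support(mset):
    """Elements of the universe with nonzero multiplicity."""
    return {x for x, mult in mset.items() if mult != 0}


def m_cardinality(mset):
    """Sum of the multiplicities over the universe."""
    return sum(mset.values())


def cardinality(mset):
    """Number of elements in the support."""
    return len(support(mset))


# Hypothetical msets on the universe {a, b, c}; B is not a natural multiset.
mset_a = {"a": 1, "b": 1, "c": 0}
mset_b = {"a": 0.5, "b": 1.5, "c": 0}
mset_c = {"a": 2, "b": 3, "c": 0}

for name, mset in (("A", mset_a), ("B", mset_b), ("C", mset_c)):
    print(name, cardinality(mset), m_cardinality(mset))
```

Keeping zero-multiplicity keys in the dictionary makes the universe explicit, which matters because two msets with the same support can still differ by their universe.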
Definition 2.5.
Two msets $A_{m_1}$ and $B_{m_2}$ are said to be cognate if they have the same support.
They are not necessarily equal: for instance, $\{a^{1},b^{2}\}$ and $\{a^{3},b^{1}\}$ are cognate but different.
Definition 2.6.
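Let $A=(U,m_A)$ and $B=(U,m_B)$ be two msets on the same universe $U$.
If $A^\star=\emptyset$, $A$ is called the empty mset and written $\emptyset$.
$A$ is said to be included in $B$, written $A\subseteq B$, if for all $x\in U$: $m_A(x)\leqslant m_B(x)$. In this case, $A$ is called a submset of $B$.
The union of $A$ and $B$ is the mset $A\cup B$ of universe $U$ and of multiplicity function $m_{A\cup B}$ such that for all $x\in U$: $m_{A\cup B}(x)=\max\left(m_A(x),m_B(x)\right)$.
The intersection of $A$ and $B$ is the mset $A\cap B$ of universe $U$ and of multiplicity function $m_{A\cap B}$ such that for all $x\in U$: $m_{A\cap B}(x)=\min\left(m_A(x),m_B(x)\right)$.
The sum of $A$ and $B$ is the mset $A\uplus B$ of universe $U$ and of multiplicity function $m_{A\uplus B}$ such that for all $x\in U$: $m_{A\uplus B}(x)=m_A(x)+m_B(x)$.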
Proposition 2.1.
$\cup$, $\cap$ and $\uplus$ are commutative and associative laws on msets of the same universe. $\cup$ and $\uplus$ have the empty mset of the same universe as identity element.
$\uplus$ is distributive over $\cup$ and $\cap$.
$\cup$ and $\cap$ are distributive over one another.
$\cup$ and $\cap$ are idempotent.
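As a quick check of these algebraic laws, the operations of Definition 2.6 can be written pointwise on the multiplicity functions. This is a minimal sketch assuming msets are stored as dictionaries sharing the same keys (the common universe); the function names and sample values are hypothetical.

```python
def union(a, b):
    """Pointwise maximum of the multiplicities."""
    return {x: max(a[x], b[x]) for x in a}


def intersection(a, b):
    """Pointwise minimum of the multiplicities."""
    return {x: min(a[x], b[x]) for x in a}


def msum(a, b):
    """Pointwise sum of the multiplicities."""
    return {x: a[x] + b[x] for x in a}


def included(a, b):
    """True when a is a submset of b."""
    return all(a[x] <= b[x] for x in a)


# Three hypothetical msets on the universe {x, y, z}.
A = {"x": 2, "y": 0, "z": 1}
B = {"x": 1, "y": 3, "z": 1}
C = {"x": 0, "y": 1, "z": 4}

# The sum distributes over the union, and union over intersection.
print(msum(A, union(B, C)) == union(msum(A, B), msum(A, C)))
print(union(A, intersection(B, C)) == intersection(union(A, B), union(A, C)))
```

Python's `collections.Counter` offers `|`, `&` and `+` with the same semantics for natural multisets, but it drops zero counts and so loses track of the universe, which is why plain dictionaries are used here.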
Definition 2.7.
Let $A_m$ be a mset.
The power set of $A_m$, written $\mathcal{P}(A_m)$, is the multiset of all submsets of $A_m$.
2.2.2 Copyset of a multiset
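Let us consider a multiset $A_m=(A,m)$ where the range of the multiplicity function is a subset of $\mathbb{N}$. An equivalent definition (see Syropoulos (2000)) is to give a couple $\langle A_0,\rho\rangle$ where $A_0$ is the set of all instances (including copies) of the elements of $A_m$, with an equivalency relation $\rho$ where: $\forall x,x'\in A_0:\ x\,\rho\,x' \Leftrightarrow \exists!\,c\in A:\ x\,\rho\,c\ \wedge\ x'\,\rho\,c$.
Definition 2.8.
Two elements $x$ and $x'$ of $A_0$ such that $x\,\rho\,x'$ are said to be copies of one another. The unique $c\in A$ is called the original element. $x$ and $x'$ are said to be copies of $c$.
Also $A_0/\rho$ is isomorphic to the support $A_m^\star$ and each equivalency class of original element $c$ contains $m(c)$ instances.
Definition 2.9.
The set $A_0$ is called a copy-set of the multiset $A_m$.
Remark 2.3.
A copy-set for a given multiset is not unique. The sets of equivalency classes of two couples $\langle A_0,\rho\rangle$ and $\langle A_0',\rho'\rangle$ of a given multiset are isomorphic.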
2.2.3 Algebraic representation of a multiset
We suppose given a natural multiset $A_m$ of universe $A=\{x_1,\dots,x_n\}$ and multiplicity function $m$. It yields: $A_m=\{x_1^{m(x_1)},\dots,x_n^{m(x_n)}\}$.
Vector representation:
A multiset can be conveniently represented by a vector whose length is the cardinality of the universe and whose coefficients are the multiplicities of the corresponding elements.
Definition 2.10.
The vector representation of the multiset $A_m$ is the vector $\overrightarrow{A_m}=\left(m(x_1),\dots,m(x_n)\right)^\top$.
This representation requires $n$ space and has $n-\# A_m$ null elements.
The sum of the elements of $\overrightarrow{A_m}$ is $\#_m A_m$.
This representation will be useful later when considering families of multisets in order to build the incidence matrix.
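The vector representation can be illustrated as follows. The sketch assumes the universe is given as an ordered list and the mset as a dictionary that may omit zero-multiplicity elements; both choices, and the names used, are illustrative.

```python
def vector_representation(universe, mset):
    """Multiplicities listed in the order of the universe (0 when absent)."""
    return [mset.get(x, 0) for x in universe]


# Hypothetical natural multiset {x1^2, x3^1} on a universe of four elements.
universe = ["x1", "x2", "x3", "x4"]
mset = {"x1": 2, "x3": 1}

vector = vector_representation(universe, mset)
print(vector)       # multiplicities in universe order
print(sum(vector))  # equals the m-cardinality of the multiset
```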
Hypermatrix representation:
An alternative representation is built by using a symmetric hypermatrix. This approach is needed to reach our goal of constructing an adjacency tensor for general hypergraphs.
Definition 2.11.
The unnormalized hypermatrix representation of the multiset $A_m$ is the symmetric hypermatrix $A_u=\left(a_{u,i_1\dots i_r}\right)$ of order $r=\#_m A_m$ and dimension $n$ such that $a_{u,i_1\dots i_r}=1$ if $\{\{x_{i_1},\dots,x_{i_r}\}\}=A_m$. The other elements are null.
Hence the number of nonzero elements in $A_u$ is $\dfrac{r!}{\prod_{x\in A_m^\star} m(x)!}$ out of the $n^r$ elements of the representation.
The sum of the elements of $A_u$ is then: $\dfrac{r!}{\prod_{x\in A_m^\star} m(x)!}$.
To achieve a normalisation, we enforce the sum of the elements of the hypermatrix to be the m-rank $r$ of the multiset it encodes. It yields:
Definition 2.12.
The normalized hypermatrix representation of the multiset $A_m$ is the symmetric hypermatrix $A=\left(a_{i_1\dots i_r}\right)$ of order $r=\#_m A_m$ and dimension $n$ such that $a_{i_1\dots i_r}=\dfrac{\prod_{x\in A_m^\star} m(x)!}{(r-1)!}$ if $\{\{x_{i_1},\dots,x_{i_r}\}\}=A_m$. The other elements are null.
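The counting argument behind the normalisation can be checked numerically. The sketch below stores only the nonzero entries of the hypermatrix in a dictionary keyed by 0-based index tuples; this sparse layout, the function name and the sample multisets are assumptions made for illustration. It enumerates permutations explicitly, so it is only practical for small m-cardinalities.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def hypermatrix_representation(universe, mset, normalized=True):
    """Nonzero entries {index tuple: value} of the hypermatrix of a natural mset."""
    position = {x: i for i, x in enumerate(universe)}  # 0-based indices
    # List each element as many times as its multiplicity.
    expanded = [position[x] for x in universe for _ in range(mset.get(x, 0))]
    r = len(expanded)  # m-cardinality, i.e. the order of the hypermatrix
    if r == 0:
        raise ValueError("the empty multiset is not handled by this sketch")
    value = Fraction(1)
    if normalized:
        multiplicities = [m for m in mset.values() if m > 0]
        value = Fraction(prod(factorial(m) for m in multiplicities),
                         factorial(r - 1))
    # Distinct permutations of the expanded list are the nonzero positions.
    return {indices: value for indices in set(permutations(expanded))}


universe = ["x1", "x2", "x3", "x4"]
mset_a = {"x1": 2, "x3": 1}   # m-cardinality 3
mset_b = {"x1": 2, "x2": 2}   # m-cardinality 4

unnormalized_a = hypermatrix_representation(universe, mset_a, normalized=False)
normalized_a = hypermatrix_representation(universe, mset_a)
normalized_b = hypermatrix_representation(universe, mset_b)
print(len(unnormalized_a), sum(normalized_a.values()), sum(normalized_b.values()))
```

Exact fractions are used so that the sum of the entries can be compared with the m-cardinality without rounding issues.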
3 Hbgraphs
Hyper-bag-graphs, hbgraphs for short, are introduced in this section. Hbgraphs extend hypergraphs by allowing hyperedges to be msets. The goal of this section is to revisit some of the definitions and results found in Bretto (2013) for hypergraphs and extend them to hbgraphs.
3.1 Generalities
3.1.1 First definitions
Definition 3.1.
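Let $V=\{v_1,\dots,v_n\}$ be a nonempty finite set.
A hyper-bag-graph, or hbgraph, is a family of msets with universe $V$ and support a subset of $V$. The msets are called the hbedges and the elements of $V$ the vertices.
We write $E=(e_j)_{1\leqslant j\leqslant p}$ for the family of hbedges and $\mathcal{H}=(V,E)$ for such a hbgraph.
We consider for the remainder of the article a hbgraph $\mathcal{H}=(V,E)$, with $V=\{v_1,\dots,v_n\}$ and $E=(e_j)_{1\leqslant j\leqslant p}$ the family of its hbedges.
Each hbedge $e_j$ is of universe $V$ and has a multiplicity function associated to it: $m_{e_j}:V\rightarrow W$ where $W\subseteq\mathbb{R}^+$. When the context makes it clear, the notation $m_j$ is used for $m_{e_j}$ and $m_{ij}$ for $m_{e_j}(v_i)$.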
Definition 3.2.
A hbgraph is said to have no repeated hbedges if: for all $j_1\neq j_2$, $e_{j_1}\neq e_{j_2}$.
Definition 3.3.
A hbgraph where each hbedge is a natural mset is called a natural hbgraph.
Remark 3.1.
For a general hbgraph each hbedge has to be seen as a weighted system of vertices, where the weights of each vertex are hbedge dependent.
In a natural hbgraph the multiplicity function can be viewed as a duplication of the vertices.
Definition 3.4.
The order of a hbgraph $\mathcal{H}=(V,E)$, written $O(\mathcal{H})$, is: $O(\mathcal{H})=|V|$.
Its size is the cardinality of $E$.
Definition 3.5.
The empty hbgraph is the hbgraph with an empty set of vertices.
The trivial hbgraph is the hbgraph with a nonempty set of vertices and an empty family of hbedges.
If $\bigcup_{e\in E} e^\star=V$, then the hbgraph is said to have no isolated vertices. Otherwise, the elements of $V\setminus\bigcup_{e\in E} e^\star$ are called the isolated vertices. They correspond to vertices which have zero multiplicity in every hbedge.
Remark 3.2.
A hypergraph is a natural hbgraph where the vertices of the hbedges have multiplicity one for any vertex of their support and zero otherwise.
3.1.2 Support hypergraph
Definition 3.6.
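The support hypergraph of a hbgraph $\mathcal{H}=(V,E)$ is the hypergraph whose vertices are the ones of the hbgraph and whose hyperedges are the supports of the hbedges, in a one-to-one way. We write it $\underline{\mathcal{H}}=(V,\underline{E})$, where $\underline{E}=\left(e^\star\right)_{e\in E}$.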
Remark 3.3.
Given a hypergraph, an infinite set of hbgraphs can be generated that all have this hypergraph as support. To each of these hbgraphs corresponds a hbedge family: to each support of these hbedges corresponds at least a hyperedge in the hypergraph and reciprocally to each hyperedge corresponds at least a hbedge in each hbgraph of the infinite set.
To have uniqueness, the considered hypergraph and hbgraphs should respectively have no repeated hyperedge and no repeated hbedge.
3.1.3 m-uniform hbgraphs
Definition 3.7.
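The m-range of a hbgraph $\mathcal{H}$, written $r_m(\mathcal{H})$, is by definition: $r_m(\mathcal{H})=\max_{e\in E}\#_m e$.
The range of a hbgraph $\mathcal{H}$, written $r(\mathcal{H})$, is the range of its support hypergraph $\underline{\mathcal{H}}$.
The m-co-range of a hbgraph $\mathcal{H}$, written $cr_m(\mathcal{H})$, is by definition: $cr_m(\mathcal{H})=\min_{e\in E}\#_m e$.
The co-range of a hbgraph $\mathcal{H}$, written $cr(\mathcal{H})$, is the co-range of its support hypergraph $\underline{\mathcal{H}}$.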
Definition 3.8.
A hbgraph is said to be $k$-m-uniform if all its hbedges have the same m-cardinality $k$.
A hbgraph is said to be $k$-uniform if its support hypergraph is $k$-uniform.
Proposition 3.1.
A hbgraph $\mathcal{H}$ is $k$-m-uniform if and only if: $r_m(\mathcal{H})=cr_m(\mathcal{H})=k$.
Proof.
Immediate.
∎
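Proposition 3.1 translates directly into a test on the m-range and m-co-range. In the following sketch a hbgraph is reduced to a list of hbedges, each a dictionary from vertex to multiplicity; this encoding and the sample hbedges are assumptions for illustration only.

```python
def m_cardinality(hbedge):
    """Sum of the multiplicities of the hbedge."""
    return sum(hbedge.values())


def cardinality(hbedge):
    """Size of the support of the hbedge."""
    return sum(1 for mult in hbedge.values() if mult > 0)


def m_range(hbedges):
    return max(m_cardinality(e) for e in hbedges)


def m_co_range(hbedges):
    return min(m_cardinality(e) for e in hbedges)


def is_m_uniform(hbedges):
    """m-uniform exactly when m-range and m-co-range coincide."""
    return m_range(hbedges) == m_co_range(hbedges)


def is_uniform(hbedges):
    """Uniform when the support hypergraph is uniform."""
    return len({cardinality(e) for e in hbedges}) == 1


# Hypothetical hbgraphs.
both = [{"a": 2, "b": 1}, {"b": 1, "c": 2}]      # 3-m-uniform and 2-uniform
m_only = [{"a": 2, "b": 1}, {"c": 3}]            # 3-m-uniform, not uniform
neither = [{"a": 1}, {"a": 1, "b": 1}]

print(is_m_uniform(both), is_uniform(both))
print(is_m_uniform(m_only), is_uniform(m_only))
```

The second example shows that m-uniformity does not imply uniformity of the support hypergraph.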
3.1.4 HB-star and m-degree
Definition 3.9.
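The HB-star of a vertex $v\in V$ is the multiset, written $H(v)$, defined as: $H(v)=\left\{e^{m_e(v)} : e\in E \wedge v\in e^\star\right\}$.
Remark 3.4.
The support of the HB-star $H(v)$ of a vertex $v$ of a hbgraph $\mathcal{H}$ is exactly the star of this vertex in the support hypergraph $\underline{\mathcal{H}}$.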
Definition 3.10.
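The m-degree of a vertex $v\in V$ of a hbgraph $\mathcal{H}$, written $\deg_m(v)$, is defined as: $\deg_m(v)=\#_m H(v)=\sum_{e\in E} m_e(v)$.
The maximal m-degree of a hbgraph $\mathcal{H}$ is written $\Delta_m$.
The degree of a vertex $v\in V$ of a hbgraph $\mathcal{H}$, written $\deg(v)$, corresponds to the degree of this vertex in the support hypergraph $\underline{\mathcal{H}}$.
The maximal degree of a hbgraph $\mathcal{H}$ is written $\Delta$ and corresponds to the maximal degree of the support hypergraph $\underline{\mathcal{H}}$.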
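A short sketch shows how the HB-star, m-degree and degree relate. The multiplicities below are the ones tabulated in Example 3.1, reading rows as the vertices v1 to v7 and columns as the hbedges e1 to e4; that reading of the table, the dictionary encoding and the function names are assumptions.

```python
# Hbedges as {vertex: multiplicity}; absent vertices have multiplicity 0.
hbedges = {
    "e1": {"v1": 2, "v4": 2, "v5": 1},
    "e2": {"v2": 3, "v3": 1},
    "e3": {"v3": 1, "v5": 2},
    "e4": {"v6": 1},
}
vertices = ["v1", "v2", "v3", "v4", "v5", "v6", "v7"]


def hb_star(vertex):
    """Multiset of the hbedges containing the vertex, with its multiplicity in each."""
    return {name: e[vertex] for name, e in hbedges.items() if e.get(vertex, 0) > 0}


def m_degree(vertex):
    """m-cardinality of the HB-star."""
    return sum(hb_star(vertex).values())


def degree(vertex):
    """Degree in the support hypergraph: size of the support of the HB-star."""
    return len(hb_star(vertex))


print([m_degree(v) for v in vertices])
print([degree(v) for v in vertices])
```

The computed m-degrees reproduce the row sums of the table in Example 3.1.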
Definition 3.11.
A hbgraph having all of its vertices of the same m-degree $k$ is said to be m-regular or $k$-m-regular.
A hbgraph is said to be regular if its support hypergraph is regular.
3.1.5 Dual of a hbgraph
Definition 3.12.
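Considering a hbgraph $\mathcal{H}=(V,E)$, its dual is the hbgraph $\tilde{\mathcal{H}}=(\tilde{V},\tilde{E})$ with a set of vertices $\tilde{V}=\{x_1,\dots,x_p\}$ which is in bijection $f$ with the set of hbedges of $\mathcal{H}$: for all $x_j\in\tilde{V}$, $x_j=f(e_j)$.
And the set of hbedges $\tilde{E}=(X_1,\dots,X_n)$ is in bijection $g$, where $X_i=g(v_i)=\left\{x_j^{m_{e_j}(v_i)} : v_i\in e_j^\star\right\}$, with the set of vertices of $\mathcal{H}$.
Switching from the hbgraph to its dual:
 | $\mathcal{H}$ | $\tilde{\mathcal{H}}$
Vertices | $v_i$, $1\leqslant i\leqslant n$ | $x_j=f(e_j)$, $1\leqslant j\leqslant p$
Edges | $e_j$, $1\leqslant j\leqslant p$ | $X_i=g(v_i)$, $1\leqslant i\leqslant n$
Multiplicity | $v_i\in e_j$ with $m_{e_j}(v_i)$ | $x_j\in X_i$ with $m_{X_i}(x_j)=m_{e_j}(v_i)$
Uniformity | $k$-m-uniform | $k$-m-regular
Regularity | $k$-m-regular | $k$-m-uniform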
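The exchange of roles between vertices and hbedges can be sketched by transposing the multiplicities. The encoding (a list of vertex names and a dictionary of named hbedges), the function names and the sample hbgraph are illustrative assumptions.

```python
def dual(vertices, hbedges):
    """Dual hbgraph: one vertex per hbedge, one hbedge per vertex."""
    dual_vertices = list(hbedges)
    dual_hbedges = {
        v: {name: e[v] for name, e in hbedges.items() if e.get(v, 0) > 0}
        for v in vertices
    }
    return dual_vertices, dual_hbedges


def m_degree(vertex, hbedges):
    return sum(e.get(vertex, 0) for e in hbedges.values())


# Hypothetical 3-m-uniform hbgraph.
vertices = ["a", "b", "c"]
hbedges = {"e1": {"a": 2, "b": 1}, "e2": {"b": 1, "c": 2}}

dual_vertices, dual_hbedges = dual(vertices, hbedges)
print(dual_hbedges)
# Every vertex of the dual has m-degree 3: the dual is 3-m-regular.
print([m_degree(x, dual_hbedges) for x in dual_vertices])
```

With this encoding, taking the dual twice gives back the original hbgraph as long as it has no empty hbedge; an isolated vertex becomes an empty hbedge of the dual.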
3.2 Additional concepts for natural hbgraphs
3.2.1 Numbered copy hypergraph of a natural hbgraph
In natural hbgraphs the hbedge multiplicity functions have their range in the set of natural numbers. The vertices in a hbedge with multiplicities strictly greater than 1 can be seen as copies of the original vertex.
Going further with this approach, copies have to be understood as "numbered" copies. Let $e_i$ and $e_j$ be two hbedges. Let $v$ be a vertex of multiplicity $m_i$ in $e_i$ and $m_j$ in $e_j$. $e_i\cap e_j$ will hold $\min(m_i,m_j)$ copies: the ones "numbered" from 1 to $\min(m_i,m_j)$. The remaining copies will be held either in $e_i$ xor in $e_j$, depending on which one has the highest multiplicity of $v$.
More generally, we define the numbered-copy-set of a multiset:
Definition 3.13.
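Let $A_m=\{x_i^{m_i} : 1\leqslant i\leqslant n\}$.
The numbered copy-set of $A_m$ is the copy-set $\breve{A}_m=\left\{\left[x_{i\,j}\right]_{m_i} : 1\leqslant i\leqslant n\right\}$ where $\left[x_{i\,j}\right]_{m_i}$ is a shortcut to indicate the numbered copies of the original element $x_i$, $x_{i\,1}$ to $x_{i\,m_i}$, and $j$ is designated as the copy number of the element $x_{i\,j}$.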
Definition 3.14.
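Let $\mathcal{H}=(V,E)$ be a natural hbgraph.
Let $V=\{v_1,\dots,v_n\}$ be the vertices of the hbgraph. Let $E=(e_1,\dots,e_p)$ be the hbedges of the hbgraph and, for $1\leqslant j\leqslant p$, $m_{e_j}$ the multiplicity function of $e_j$.
The maximum multiplicity function of $\mathcal{H}$ is the function $m:V\rightarrow\mathbb{N}$ defined for all $v\in V$ by: $m(v)=\max_{e\in E} m_e(v)$.
Definition 3.15.
Let $\mathcal{H}=(V,E)$ be a natural hbgraph where $V$ is the vertex set and $E$ is the hbedge family of the hbgraph.
Let $m$ be the maximum multiplicity function.
Let us consider the numbered-copy-set $\breve{V}_m$ of the multiset $V_m=(V,m)$.
Then each hbedge $e_j$ is associated to a copy-set / equivalency relation $\langle e_{j\,0},\rho_j\rangle$ whose elements are in $\breve{V}_m$ with copy number as small as possible for each vertex in $e_j$.
Then $\mathcal{H}_0=(V_0,E_0)$, where $V_0=\breve{V}_m$ and $E_0=(e_{j\,0})_{1\leqslant j\leqslant p}$, is a hypergraph called the numbered-copy-hypergraph of $\mathcal{H}$.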
Proposition 3.2.
A numbered-copy-hypergraph is unique for a given hbgraph.
Proof.
It is immediate by the way the numbered-copy-hypergraph is built from the hbgraph.
∎
Allowing the duplicates to be numbered prevents ambiguities; nonetheless it has to be seen as a conceptual approach, as duplicates are entities that are not discernible.
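The construction of the numbered-copy-hypergraph can be sketched as follows, assuming a natural hbgraph given as a dictionary of named hbedges and representing the copy of a vertex as a pair (vertex, copy number). The names and the sample hbgraph are hypothetical.

```python
def numbered_copy_hypergraph(vertices, hbedges):
    """Return (copy vertices, hyperedges) using the smallest copy numbers."""
    # Maximum multiplicity of each vertex over all hbedges.
    max_mult = {
        v: max((e.get(v, 0) for e in hbedges.values()), default=0)
        for v in vertices
    }
    copy_vertices = {(v, k) for v in vertices for k in range(1, max_mult[v] + 1)}
    # Each hbedge takes the copies numbered 1..multiplicity of each vertex.
    copy_edges = {
        name: {(v, k) for v, mult in e.items() for k in range(1, mult + 1)}
        for name, e in hbedges.items()
    }
    return copy_vertices, copy_edges


vertices = ["a", "b"]
hbedges = {"e1": {"a": 2, "b": 1}, "e2": {"a": 3}}

copy_vertices, copy_edges = numbered_copy_hypergraph(vertices, hbedges)
print(sorted(copy_vertices))
# The two hyperedges share the copies of "a" numbered 1 and 2.
print(sorted(copy_edges["e1"] & copy_edges["e2"]))
```

Because every hbedge uses the smallest available copy numbers, the intersection of two hyperedges holds as many copies of a vertex as the minimum of its two multiplicities.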
3.2.2 Paths, distance and connected components
Defining a path in a hbgraph is not straightforward as vertices are duplicated in a hbgraph. The duplicate of a vertex strictly inside a path must be at the intersection of two consecutive hbedges.
Definition 3.16.
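A strict m-path $x_0e_1x_1\dots e_sx_s$ in a hbgraph from a vertex $x$ to a vertex $y$ is a vertex / hbedge alternation with hbedges $e_1$ to $e_s$ and vertices $x_0$ to $x_s$ such that $x_0=x$, $x_s=y$, $x\in e_1$ and $y\in e_s$ and that for all $i$ with $1\leqslant i\leqslant s-1$, $x_i\in e_i\cap e_{i+1}$.
A large m-path $x_0e_1x_1\dots e_sx_s$ from a vertex $x$ to a vertex $y$ is a vertex / hbedge alternation with hbedges $e_1$ to $e_s$ and vertices $x_0$ to $x_s$ such that $x_0=x$, $x_s=y$, $x\in e_1$ and $y\in e_s$ and that for all $i$ with $1\leqslant i\leqslant s-1$, $x_i\in e_i\cup e_{i+1}$.
$s$ is called in both cases the length of the m-path from $x$ to $y$.
Vertices from $x_1$ to $x_{s-1}$ are called interior vertices of the m-path.
$x_0$ and $x_s$ are called extremities of the m-path.
If the extremities are different copies of the same object, then the m-path is said to be an almost cycle.
If the extremities designate exactly the same copy of one object, the m-path is said to be a cycle.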
Remark 3.5.

For a strict m-path, there are: $\prod_{i=1}^{s-1} m_{e_i\cap e_{i+1}}(x_i)$
possibilities of choosing the interior vertices along a given m-path and: $m_{e_1}(x_0)\,m_{e_s}(x_s)\prod_{i=1}^{s-1} m_{e_i\cap e_{i+1}}(x_i)$
possible strict m-paths between the extremities.

For a large m-path, there are: $\prod_{i=1}^{s-1} m_{e_i\cup e_{i+1}}(x_i)$
possibilities of choosing the interior vertices along a given m-path and: $m_{e_1}(x_0)\,m_{e_s}(x_s)\prod_{i=1}^{s-1} m_{e_i\cup e_{i+1}}(x_i)$
possible large m-paths between the extremities.

As large m-paths between two extremities through a given sequence of interior vertices and hbedges include the strict m-paths, we often write m-path for large m-path.

If an m-path exists from $x$ to $y$ then an m-path also exists from $y$ to $x$.
Definition 3.17.
An m-path in a hbgraph corresponds to a unique path in the support hypergraph of the hbgraph, called the support path.
Proposition 3.3.
All m-paths that traverse the same hbedges and whose intermediate and extremity vertices are copies of the same vertices share the same support path.
The notion of distance is similar to the one defined for hypergraphs.
Definition 3.18.
Let $x$ and $y$ be two vertices of a hbgraph. The distance $d(x,y)$ from $x$ to $y$ is the minimal length of an m-path from $x$ to $y$ if such an m-path exists. If no m-path exists, $x$ and $y$ are said to be disconnected and $d(x,y)=+\infty$.
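Since every m-path has a support path, the distance can be computed by a breadth-first search on the support hypergraph. The sketch below makes two assumptions that the text does not state: consecutive hbedges of a path share the interior vertex in their supports, and the distance from a vertex to itself is taken as 0. The hbedges reuse the multiplicities tabulated in Example 3.1 (rows read as vertices, columns as hbedges).

```python
from collections import deque
from math import inf

hbedges = {
    "e1": {"v1": 2, "v4": 2, "v5": 1},
    "e2": {"v2": 3, "v3": 1},
    "e3": {"v3": 1, "v5": 2},
    "e4": {"v6": 1},
}


def distance(hbedges, source, target):
    """Minimal number of hbedges on a path from source to target, or inf."""
    if source == target:
        return 0  # convention assumed for this sketch
    supports = [{v for v, m in e.items() if m > 0} for e in hbedges.values()]
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        vertex, dist = queue.popleft()
        for s in supports:
            if vertex not in s:
                continue
            for nxt in s - seen:  # vertices reachable through this hbedge
                if nxt == target:
                    return dist + 1
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return inf


print(distance(hbedges, "v1", "v5"))
print(distance(hbedges, "v1", "v2"))
print(distance(hbedges, "v1", "v6"))
```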
Definition 3.19.
A hbgraph is said connected if its support hypergraph is connected, disconnected otherwise.
Definition 3.20.
A connected component of a hbgraph is a maximal set of vertices such that every pair of vertices of the component has an m-path between them.
Remark 3.6.
A connected component of a hbgraph is a connected component of one of its copy hypergraphs.
Definition 3.21.
The diameter of a hbgraph $\mathcal{H}$, written $\operatorname{diam}(\mathcal{H})$, is defined as: $\operatorname{diam}(\mathcal{H})=\max_{x,y\in V} d(x,y)$.
3.2.3 Adjacency
Definition 3.22.
Let $k$ be a positive integer.
Let us consider $k$ vertices, not necessarily distinct, belonging to $V$.
Let $A_m$ denote the mset consisting of these $k$ vertices, with multiplicity function $m$.
The $k$ vertices are said to be $k$-adjacent in $\mathcal{H}$ if there exists $e\in E$ such that $A_m\subseteq e$.
Considering a hbgraph $\mathcal{H}$ of m-range $\overline{k}$, the hbgraph cannot handle more than $\overline{k}$-adjacency. This maximal $k$-adjacency is called the $\overline{k}$-adjacency of $\mathcal{H}$.
Definition 3.23.
Let us consider a hbedge $e$ in $\mathcal{H}$.
Vertices in the support of $e$ are said to be $e^\star$-adjacent.
Vertices in the hbedge $e$ with nonzero multiplicity, counted with their multiplicities, are said to be $e$-adjacent.
Remark 3.7.

$e^\star$-adjacency does not support redundancy of vertices.

$e$-adjacency allows the redundancy of vertices.

The only case of equality is where the hbedge has all its vertices of multiplicity at most 1.
Definition 3.24.
Two hbedges are said incident if their support intersection is not empty.
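The adjacency and incidence notions of this subsection reduce to simple multiset tests. The following sketch assumes a natural hbgraph stored as a dictionary of named hbedges; the function names and data are illustrative.

```python
from collections import Counter


def are_k_adjacent(vertex_list, hbedges):
    """True if the mset of the listed vertices is included in some hbedge."""
    wanted = Counter(vertex_list)  # multiplicity of each listed vertex
    return any(
        all(e.get(v, 0) >= mult for v, mult in wanted.items())
        for e in hbedges.values()
    )


def are_incident(e1, e2):
    """True if the supports of the two hbedges intersect."""
    support1 = {v for v, m in e1.items() if m > 0}
    support2 = {v for v, m in e2.items() if m > 0}
    return bool(support1 & support2)


# Hypothetical natural hbgraph.
hbedges = {"e1": {"a": 2, "b": 1}, "e2": {"b": 1, "c": 1}, "e3": {"d": 1}}

print(are_k_adjacent(["a", "a", "b"], hbedges))  # 3-adjacency within e1
print(are_k_adjacent(["a", "a", "a"], hbedges))  # "a" never has multiplicity 3
print(are_incident(hbedges["e1"], hbedges["e2"]))
```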
3.2.4 Sum of two hbgraphs
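Let $\mathcal{H}_1=(V_1,E_1)$ and $\mathcal{H}_2=(V_2,E_2)$ be two hbgraphs.
The sum of the two hbgraphs $\mathcal{H}_1$ and $\mathcal{H}_2$ is the hbgraph written $\mathcal{H}_1+\mathcal{H}_2$, defined as the hbgraph that has:

$V_1\cup V_2$ as vertex set, where the hbedges are obtained from the hbedges of $E_1$ and $E_2$ with the same multiplicity for vertices of $V_1$ (respectively $V_2$), but such that for each hbedge in $E_1$ (respectively $E_2$) the universe is extended to $V_1\cup V_2$ and the multiplicity function is extended such that $m(v)=0$ for all $v\in V_2\setminus V_1$ (respectively $v\in V_1\setminus V_2$);

$E_1+E_2$ as hbedge family, i.e. the family constituted of the elements of $E_1$ and of the elements of $E_2$.
This sum is said to be direct if $E_1+E_2$ does not contain any new pair of repeated hbedges other than the ones already existing in $E_1$ and those already existing in $E_2$. In this case the sum is written $\mathcal{H}_1\oplus\mathcal{H}_2$.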
3.3 An example
Example 3.1.
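Considering $\mathcal{H}=(V,E)$, with $V=\{v_1,v_2,v_3,v_4,v_5,v_6,v_7\}$ and $E=(e_1,e_2,e_3,e_4)$ with: $e_1=\{\{v_1^2,v_4^2,v_5^1\}\}$, $e_2=\{\{v_2^3,v_3^1\}\}$, $e_3=\{\{v_3^1,v_5^2\}\}$, $e_4=\{\{v_6^1\}\}$.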
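It holds:
        | $e_1$ | $e_2$ | $e_3$ | $e_4$ | m-degree |
$v_1$   |   2   |   0   |   0   |   0   |    2     | 2
$v_2$   |   0   |   3   |   0   |   0   |    3     | 3
$v_3$   |   0   |   1   |   1   |   0   |    2     | 1
$v_4$   |   2   |   0   |   0   |   0   |    2     | 1
$v_5$   |   1   |   0   |   2   |   0   |    3     | 2
$v_6$   |   0   |   0   |   0   |   1   |    1     | 1
$v_7$   |   0   |   0   |   0   |   0   |    0     | 0
m-cardinality |   5   |   4   |   3   |   1   |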
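Therefore the order of $\mathcal{H}$ is $7$ and its size is $4$.
$v_7$ is an isolated vertex.
$e_1$ and $e_3$ are incident, as well as $e_2$ and $e_3$. $e_4$ is not incident to any hbedge.
$v_1$, $v_4$ and $v_5$ are $e^\star$-adjacent as they hold in $e_1$.
$v_1^2$, $v_4^2$ and $v_5^1$ are $e$-adjacent as they hold in $e_1$.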
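The dual of $\mathcal{H}$ is the hbgraph $\tilde{\mathcal{H}}=(\tilde{V},\tilde{E})$ with:

$\tilde{V}=\{x_1,x_2,x_3,x_4\}$ with $x_j=f(e_j)$ for $1\leqslant j\leqslant 4$;

$\tilde{E}=(X_1,X_2,X_3,X_4,X_5,X_6,X_7)$ with: $X_1=X_4=\{\{x_1^2\}\}$, $X_2=\{\{x_2^3\}\}$, $X_3=\{\{x_2^1,x_3^1\}\}$, $X_5=\{\{x_1^1,x_3^2\}\}$, $X_6=\{\{x_4^1\}\}$, $X_7=\emptyset$.

$\tilde{\mathcal{H}}$ has duplicated hbedges and one empty hbedge.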
4 Algebraic representation of a hbgraph
4.1 Incidence matrix of a hbgraph
A mset is well defined by giving its universe, its support and its multiplicity function. We have seen that a mset can be represented by a vector called the vector representation of the mset.
Hbedges of a given hbgraph have all the same universe.
Definition 4.1.
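Let $n$ and $p$ be two positive integers.
Let $\mathcal{H}=(V,E)$ be a nonempty hbgraph, with vertex set $V=\{v_1,\dots,v_n\}$ and $E=(e_1,\dots,e_p)$.
The matrix $H=\left[m_{e_j}(v_i)\right]_{1\leqslant i\leqslant n,\ 1\leqslant j\leqslant p}$ is called the incidence matrix of the hbgraph $\mathcal{H}$.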
This incidence matrix is used extensively in Ouvrard et al. (2018b) for diffusion by exchanges in hbgraphs.
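The incidence matrix stacks the vector representations of the hbedges as columns. The sketch below rebuilds it from the multiplicities tabulated in Example 3.1, assuming rows of that table are the vertices v1 to v7 and columns the hbedges e1 to e4; the list-of-lists layout and the function name are illustrative choices.

```python
vertices = ["v1", "v2", "v3", "v4", "v5", "v6", "v7"]
hbedges = [
    {"v1": 2, "v4": 2, "v5": 1},   # e1
    {"v2": 3, "v3": 1},            # e2
    {"v3": 1, "v5": 2},            # e3
    {"v6": 1},                     # e4
]


def incidence_matrix(vertices, hbedges):
    """Row i, column j holds the multiplicity of vertex i in hbedge j."""
    return [[e.get(v, 0) for e in hbedges] for v in vertices]


H = incidence_matrix(vertices, hbedges)
row_sums = [sum(row) for row in H]            # m-degrees of the vertices
column_sums = [sum(col) for col in zip(*H)]   # m-cardinalities of the hbedges
print(row_sums)
print(column_sums)
```

Row sums give the m-degrees and column sums the m-cardinalities, so both can be read off the same matrix.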
4.2 e-adjacency tensor of a natural hbgraph
To build the e-adjacency tensor of a natural hbgraph $\mathcal{H}=(V,E)$ without repeated hbedge, with vertex set $V=\{v_1,\dots,v_n\}$ and hbedge family $E=(e_1,\dots,e_p)$, we use an approach similar to the one used in Ouvrard et al. (2017), based on the strong link between cubical symmetric tensors and homogeneous polynomials.
Definition 4.2.
An elementary hbgraph is a hbgraph that has only one non repeated hbedge in its hbedge family.
Claim 4.1.
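Let $\mathcal{H}=(V,E)$ be a hbgraph with no repeated hbedge.
Then: $\mathcal{H}=\bigoplus_{e\in E}\mathcal{H}_e$
where $\mathcal{H}_e=(V,(e))$ is the elementary hbgraph associated to the hbedge $e$.
Proof.
Let $e_1\in E$ and $e_2\in E$ with $e_1\neq e_2$. As $\mathcal{H}$ has no repeated hbedge, $\mathcal{H}_{e_1}+\mathcal{H}_{e_2}$ does not contain new pairs of repeated elements. Thus $\mathcal{H}_{e_1}+\mathcal{H}_{e_2}$ is a direct sum.
A straightforward iteration over the elements of $E$ leads to the result.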
∎
We need first to define hypermatrices for the adjacency of an elementary hbgraph and of a muniform hbgraph.
4.2.1 Normalised adjacency tensor of an elementary hbgraph
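We consider an elementary hbgraph $\mathcal{H}_e=(V,(e))$ where $V=\{v_1,\dots,v_n\}$ and $e$ is a multiset of universe $V$ and multiplicity function $m$. The support of $e$ is $e^\star=\{v_{j_1},\dots,v_{j_k}\}$ by considering, without loss of generality: $1\leqslant j_1<\dots<j_k\leqslant n$.
$e$ is the multiset: $e=\left\{v_{j_1}^{m_{j_1}},\dots,v_{j_k}^{m_{j_k}}\right\}$ where $m_j=m(v_j)$.
The normalised hypermatrix representation of $e$, written $A_e$, describes uniquely the mset $e$. Thus the elementary hbgraph $\mathcal{H}_e$ is also uniquely described by $A_e$, as $e$ is its unique hbedge. $A_e$ is of rank $r=\#_m e$ and dimension $n$.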
Hence, the definition:
Definition 4.3.
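Let $\mathcal{H}_e=(V,(e))$ be an elementary hbgraph with $V=\{v_1,\dots,v_n\}$ and $e$ the multiset $\left\{v_{j_1}^{m_{j_1}},\dots,v_{j_k}^{m_{j_k}}\right\}$ of m-rank $r$, universe $V$ and multiplicity function $m$.
The normalised adjacency hypermatrix of an elementary hbgraph $\mathcal{H}_e$ is the normalised representation of the multiset $e$: it is the symmetric hypermatrix $A_e=\left(a_{i_1\dots i_r}\right)$ of rank $r$ and dimension $n$ where the only nonzero elements are: $a_{\sigma(i_1)\dots\sigma(i_r)}=\dfrac{m_{j_1}!\cdots m_{j_k}!}{(r-1)!}$
where $\{\{v_{i_1},\dots,v_{i_r}\}\}=e$ and $\sigma$ ranges over the permutations of the indices.
In an elementary hbgraph the $\overline{k}$-adjacency corresponds to $e$-adjacency. This hypermatrix encodes the $\overline{k}$-adjacency of the elementary hbgraph; as the $\overline{k}$-adjacency corresponds to $e$-adjacency in such a hbgraph, it also encodes the $e$-adjacency of the elementary hbgraph.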
4.2.2 hbgraph polynomial
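Homogeneous polynomial associated to a hypermatrix:
With an approach similar to that of Ouvrard et al. (2017), where full details are given, let $\varepsilon_1,\dots,\varepsilon_n$ denote the canonical basis of $\mathbb{K}^n$.
$\left(\varepsilon_{i_1}\otimes\dots\otimes\varepsilon_{i_k}\right)_{1\leqslant i_1,\dots,i_k\leqslant n}$ is a basis of the space of tensors of order $k$ on $\mathbb{K}^n$, where $\otimes$ is the Segre outer product.
A tensor $\mathcal{A}$ is associated to a hypermatrix $A=\left(a_{i_1\dots i_k}\right)$ by writing $\mathcal{A}$ as: $\mathcal{A}=\sum_{1\leqslant i_1,\dots,i_k\leqslant n} a_{i_1\dots i_k}\,\varepsilon_{i_1}\otimes\dots\otimes\varepsilon_{i_k}$.
Considering $n$ variables $z_i$ attached to the $n$ vertices $v_i$ and $z=\sum_{i=1}^{n} z_i\varepsilon_i$, the multilinear matrix product $(z,\dots,z).A$ is a polynomial: $P(z_1,\dots,z_n)=\sum_{1\leqslant i_1,\dots,i_k\leqslant n} a_{i_1\dots i_k}\,z_{i_1}\cdots z_{i_k}$
of degree $k$.
Elementary hbgraph polynomial:
Considering an elementary hbgraph $\mathcal{H}_e=(V,(e))$ with $V=\{v_1,\dots,v_n\}$ and $e=\left\{v_{j_1}^{m_{j_1}},\dots,v_{j_k}^{m_{j_k}}\right\}$ the multiset of m-rank $r$, universe $V$ and multiplicity function $m$.
Using the normalised adjacency hypermatrix $A_e$, which is symmetric, we can write the reduced version of its attached homogeneous polynomial $P_e$: $P_e(z_1,\dots,z_n)=r\,z_{j_1}^{m_{j_1}}\cdots z_{j_k}^{m_{j_k}}$.
Hbgraph polynomial:
Considering a hbgraph $\mathcal{H}=(V,E)$ with no repeated hbedge, with $V=\{v_1,\dots,v_n\}$ and $E=(e_1,\dots,e_p)$.
This hbgraph can be summarized by a polynomial of degree $r_m(\mathcal{H})$: $P(z_1,\dots,z_n)=\sum_{i=1}^{p} c_{e_i}P_{e_i}(z_1,\dots,z_n)$
where $c_{e_i}$ is a technical coefficient. $P$ is called the hbgraph polynomial. The choice of $c_{e_i}$ is made in order to retrieve the m-degree of the vertices from the adjacency tensor.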
4.2.3 e-adjacency hypermatrix of a m-uniform natural hbgraph
We now extend to m-uniform hbgraphs the adjacency hypermatrix obtained in the case of an elementary hbgraph.
In the case of a $r$-m-uniform natural hbgraph with no repeated hbedge, each hbedge has the same m-cardinality $r$. Hence the $\overline{k}$-adjacency of a m-uniform hbgraph corresponds to $r$-adjacency, where $r$ is the m-rank of the hbgraph. The adjacency tensor of the hbgraph has rank $r$ and dimension $n$. The elements of the adjacency hypermatrix are: $a_{i_1\dots i_r}$
with $1\leqslant i_1,\dots,i_r\leqslant n$.
The associated hbgraph polynomial is homogeneous of degree $r$.
We obtain the definition of the adjacency tensor of a m-uniform hbgraph by summing the adjacency tensors attached to each hbedge, with a coefficient equal to 1 for each hbedge.
Definition 4.4.
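Let $\mathcal{H}=(V,E)$ be a hbgraph with $V=\{v_1,\dots,v_n\}$.
The adjacency hypermatrix of a $r$-m-uniform hbgraph is the hypermatrix $A_{\mathcal{H}}=\left(a_{i_1\dots i_r}\right)_{1\leqslant i_1,\dots,i_r\leqslant n}$ defined by: $A_{\mathcal{H}}=\sum_{e\in E}A_e$
where $A_e$ is the adjacency hypermatrix of the elementary hbgraph associated to the hbedge $e=\left\{v_{j_1}^{m_{j_1}},\dots,v_{j_k}^{m_{j_k}}\right\}\in E$.
The only nonzero elements of $A_e$ are the elements whose indices are obtained by permutation of the multiset $\left\{j_1^{m_{j_1}},\dots,j_k^{m_{j_k}}\right\}$, and they are all equal to $\dfrac{m_{j_1}!\cdots m_{j_k}!}{(r-1)!}$.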
Remark 4.1.
When a $k$-m-uniform hbgraph has 1 as multiplicity for every vertex in the support of each of its hbedges, this hbgraph is a $k$-uniform hypergraph: in this case, we retrieve the degree-normalized tensor defined in Cooper and Dutle (2012).
Claim 4.2.
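The m-degree of a vertex $v_j$ in a $r$-m-uniform hbgraph $\mathcal{H}$ of adjacency hypermatrix $A_{\mathcal{H}}=\left(a_{i_1\dots i_r}\right)$ is: $\deg_m(v_j)=\sum_{1\leqslant j_2,\dots,j_r\leqslant n} a_{j\,j_2\dots j_r}$.
Proof.
$\sum_{j_2,\dots,j_r} a_{j\,j_2\dots j_r}$ has nonzero terms only for the hbedges that contain $v_j$. Such a hbedge $e=\left\{v_j^{m_j},v_{l_2}^{m_{l_2}},\dots,v_{l_k}^{m_{l_k}}\right\}$ contributes to $a_{j\,j_2\dots j_r}$ exactly when the multiset $\{\{v_j,v_{j_2},\dots,v_{j_r}\}\}$ is the multiset $e$. For each $e$ such that $v_j\in e^\star$ there are $\dfrac{(r-1)!}{(m_j-1)!\,m_{l_2}!\cdots m_{l_k}!}$ possible permutations of the indices $j_2$ to $j_r$, each carrying the value $\dfrac{m_j!\,m_{l_2}!\cdots m_{l_k}!}{(r-1)!}$, so that $e$ contributes $m_j=m_e(v_j)$ to the sum.
Also: $\sum_{j_2,\dots,j_r} a_{j\,j_2\dots j_r}=\sum_{e\in E\,:\,v_j\in e^\star} m_e(v_j)=\deg_m(v_j)$.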
∎
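Claim 4.2 can be verified numerically on small m-uniform hbgraphs. The sketch below assumes that each hbedge contributes the value given by the product of the factorials of its multiplicities divided by (r-1)!, as in the normalised representation of a multiset, and stores the hypermatrix sparsely as a dictionary keyed by 0-based index tuples. Function names and sample hbgraphs are hypothetical, and the explicit enumeration of permutations is only meant for small r.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def adjacency_hypermatrix(hbedges):
    """Sparse hypermatrix of an m-uniform natural hbgraph without repeated hbedge.

    hbedges: list of dicts {vertex index: multiplicity}.
    """
    m_cardinalities = {sum(e.values()) for e in hbedges}
    if len(m_cardinalities) != 1:
        raise ValueError("the hbgraph is not m-uniform")
    r = m_cardinalities.pop()
    entries = {}
    for e in hbedges:
        expanded = [v for v, mult in e.items() for _ in range(mult)]
        value = Fraction(prod(factorial(m) for m in e.values()), factorial(r - 1))
        for indices in set(permutations(expanded)):
            entries[indices] = entries.get(indices, 0) + value
    return entries


def m_degree_from_hypermatrix(entries, vertex):
    """Sum of the entries whose first index is the given vertex."""
    return sum(value for indices, value in entries.items() if indices[0] == vertex)


# Hypothetical 4-m-uniform hbgraph on vertices 0, 1, 2.
hbedges = [{0: 2, 1: 2}, {0: 1, 2: 3}]
A = adjacency_hypermatrix(hbedges)
print([m_degree_from_hypermatrix(A, v) for v in range(3)])
```

The printed values coincide with the m-degrees obtained by summing the multiplicities of each vertex over the hbedges.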
4.2.4 Elementary operations on hbgraphs
In Ouvrard et al. (2017), we describe two elementary operations that are used in the hypergraph uniformisation process. We describe here two similar operations and some additional operations for hbgraphs.
Operation 4.1.
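Let $\mathcal{H}=(V,E)$ be a hbgraph. Let $w_1$ be a constant weight function on hbedges with constant value $1$. The weighted hbgraph $\mathcal{H}_1=(V,E,w_1)$ is called the canonical weighted hbgraph of $\mathcal{H}$. The mapping $\phi_{cw}:\mathcal{H}\mapsto\mathcal{H}_1$ is called the canonical weighting operation.
Operation 4.2.
Let $\mathcal{H}_1=(V,E,w_1)$ be a canonical weighted hbgraph. Let $c>0$. Let $w_c$ be a constant weight function on hbedges with constant value $c$. The weighted hbgraph $\mathcal{H}_c=(V,E,w_c)$ is called the $c$-dilatated hbgraph of $\mathcal{H}$. The mapping $\phi_{c\text{-}d}:\mathcal{H}_1\mapsto\mathcal{H}_c$ is called the $c$-dilatation operation.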
Operation 4.3.
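Let $\mathcal{H}_w=(V,E,w)$ be a weighted hbgraph. Let $y\notin V$ be a new vertex. The $y$-complemented hbgraph of $\mathcal{H}_w$ is the hbgraph $\tilde{\mathcal{H}}_{\tilde{w}}=(\tilde{V},\tilde{E},\tilde{w})$ where $\tilde{V}=V\cup\{y\}$, $\tilde{E}=\left(\xi(e)\right)_{e\in E}$, with the map $\xi$ such that for all $e\in E$, $\xi(e)=\tilde{e}$ and $\tilde{e}$ is the multiset of universe $\tilde{V}$ with $m_{\tilde{e}}(x)=m_e(x)$ for $x\in V$ and $m_{\tilde{e}}(y)=r_m(\mathcal{H})-\#_m e$, and the weight function $\tilde{w}$ is such that for all $e\in E$: $\tilde{w}(\xi(e))=w(e)$. The mapping $\phi_{y\text{-}c}:\mathcal{H}_w\mapsto\tilde{\mathcal{H}}_{\tilde{w}}$ is called the $y$-complemented operation.
Operation 4.4.
Let $\mathcal{H}_w=(V,E,w)$ be a weighted hbgraph. Let $y\notin V$ be a new vertex. Let $\alpha\in\mathbb{N}^*$. The $y^\alpha$-vertex-increased hbgraph of $\mathcal{H}_w$ is the hbgraph $\mathcal{H}^+_{w^+}=(V^+,E^+,w^+)$ where $V^+=V\cup\{y\}$, $E^+=\left(\xi(e)\right)_{e\in E}$, with the map $\xi$ such that for all $e\in E$, $\xi(e)=e^+$ and $e^+$ is the multiset of universe $V^+$ with $m_{e^+}(x)=m_e(x)$ for $x\in V$ and $m_{e^+}(y)=\alpha$, and the weight function $w^+$ is such that for all $e\in E$: $w^+(\xi(e))=w(e)$. The mapping $\phi_{y^\alpha\text{-}v}:\mathcal{H}_w\mapsto\mathcal{H}^+_{w^+}$ is called the $y^\alpha$-vertex-increasing operation.
Operation 4.5.
The merged hbgraph of a family $\left(\mathcal{H}_i\right)_{i\in I}$ of weighted hbgraphs with $\mathcal{H}_i=(V_i,E_i,w_i)$ is the weighted hbgraph $\hat{\mathcal{H}}=(\hat{V},\hat{E},\hat{w})$ with vertex set $\hat{V}=\bigcup_{i\in I}V_i$, with hbedge family $\hat{E}=\left(\xi(e)\right)_{e\in\biguplus_{i\in I}E_i}$ (where $\biguplus_{i\in I}E_i$ is the family obtained with all elements of each family $E_i$), with the map $\xi$ such that for all $e\in E_i$, $\xi(e)=\hat{e}$ and $\hat{e}$ is the multiset of universe $\hat{V}$ with $m_{\hat{e}}(x)=m_e(x)$ for $x\in V_i$ and $m_{\hat{e}}(x)=0$ otherwise, and such that for all $e\in E_i$, $\hat{w}(\xi(e))=w_i(e)$. The mapping $\phi_m:\left(\mathcal{H}_i\right)_{i\in I}\mapsto\hat{\mathcal{H}}$ is called the merging operation.
Operation 4.6.
Decomposing a hbgraph $\mathcal{H}=(V,E)$ into a family of hbgraphs $\left(\mathcal{H}_i\right)_{i\in I}$, where $\mathcal{H}_i=(V,E_i)$, such that $\mathcal{H}=\bigoplus_{i\in I}\mathcal{H}_i$ is called the decomposition operation.
Remark 4.2.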
The direct sum of two hbgraphs appears as a merging operation of the two hbgraphs.
Definition 4.5.
Let and be two hbgraphs.
Let .
is said preserving adjacency if vertices of that are adjacent in are either adjacent vertices in or the maximal subset of these vertices that are in </