This dataset contains knowledge-base relation triples and textual mentions of Freebase entity pairs. The triples are a subset of the FB15k dataset (Bordes et al., NIPS 2013), which was originally derived from Freebase. The textual mentions were obtained from 200M sentences of the ClueWeb12 corpus, together with the FACC1 Freebase entity mention annotations.
The FB15k dataset suffered from major test leakage through inverse relations: a large number of test triples could be obtained by inverting triples in the training set. To create a dataset without this flaw, FB15k-237 was introduced, a subset of FB15k from which inverse relations were removed.
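
To make the leakage concrete, the sketch below flags test triples whose entity pair already appears, in reversed order, in the training set. It is only an illustration under stated assumptions: the function name `leaked_by_inversion`, the toy entities, and the relation names are hypothetical, and the reversed-pair check is a rough heuristic, not the filtering procedure actually used to build FB15k-237.

```python
from typing import Iterable, List, Tuple

# A triple is (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def leaked_by_inversion(train: Iterable[Triple], test: Iterable[Triple]) -> List[Triple]:
    """Return test triples (h, r, t) for which training holds some (t, r', h).

    Assumption: any reversed entity pair counts as a potential inverse,
    regardless of which relation links it. This over-approximates true
    inverse relations and is meant only as an illustration.
    """
    # Every ordered (head, tail) pair seen in training, ignoring the relation.
    train_pairs = {(h, t) for h, _, t in train}
    # Flag a test triple when its reversed pair was seen in training.
    return [(h, r, t) for h, r, t in test if (t, h) in train_pairs]


# Toy data with hypothetical entities and relations.
train = [
    ("film_a", "directed_by", "person_x"),
    ("person_y", "born_in", "city_z"),
]
test = [
    ("person_x", "directed", "film_a"),    # reverse of a training triple
    ("person_y", "works_in", "city_q"),    # no reversed pair in training
]

leaked = leaked_by_inversion(train, test)
print(leaked)
```

On this toy data only the first test triple is flagged, since a model could answer it simply by memorising the reversed training triple.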