Knowledge Graphs (KGs) are an essential component of neuro-symbolic AI. KG Embedding Models (KGEMs) represent the elements of a KG (its entities and relations) in a vector space to enable efficient processing and reasoning over knowledge. Most KGEMs are evaluated against datasets derived from the Freebase KG: FB15k and FB15k-237. In this paper, we identify limitations in these datasets with respect to Compound Value Types (CVTs), which are nodes introduced in Freebase as a substitute for n-ary relations. In FB15k and FB15k-237, CVTs have been removed, thereby eliminating valuable information. To evaluate whether KGEMs can learn semantically accurate representations of entities and relations in Freebase, we introduce a new dataset named FB15k-CVT, which reintroduces the deleted CVT nodes. In a preliminary evaluation, we assess the limitations of baseline KGEMs (TransE, DistMult) in the presence of CVTs. The evaluation suggests that KGEMs based on tensor decomposition are more promising than translational models but, most of all, it calls for further experiments with KGEMs that can answer conjunctive queries or that preserve logical entailment.
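To make the role of CVTs concrete, the following minimal sketch shows what can be lost when CVT nodes are removed and each two-hop path through a CVT is collapsed into a single edge. It is an illustration only, not the procedure used to build FB15k or FB15k-CVT: the entities (`alice`, `bob`, `carol`), the CVT nodes (`m1`, `m2`), the relation names, and the helper `collapse_cvts` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy graph: "m1" and "m2" are CVT nodes, each standing for
# one marriage event (an n-ary fact linking two spouses and a start date).
triples = [
    ("alice", "spouse_s", "m1"),
    ("m1", "spouse", "bob"),
    ("m1", "from", "2001"),
    ("alice", "spouse_s", "m2"),
    ("m2", "spouse", "carol"),
    ("m2", "from", "2010"),
]
cvt_nodes = {"m1", "m2"}


def collapse_cvts(triples, cvt_nodes):
    """Replace each two-hop path through a CVT node with one concatenated edge."""
    incoming = defaultdict(list)  # CVT node -> [(head, relation), ...]
    outgoing = defaultdict(list)  # CVT node -> [(relation, tail), ...]
    kept = []
    for h, r, t in triples:
        if t in cvt_nodes:
            incoming[t].append((h, r))
        elif h in cvt_nodes:
            outgoing[h].append((r, t))
        else:
            kept.append((h, r, t))  # ordinary triple, left untouched
    for cvt, heads in incoming.items():
        for h, r1 in heads:
            for r2, t in outgoing.get(cvt, []):
                kept.append((h, f"{r1}.{r2}", t))
    return kept


flat = collapse_cvts(triples, cvt_nodes)
for triple in flat:
    print(triple)
```

In the flattened output, `alice` is linked to two spouses and two dates, but nothing records which date belongs to which marriage. That grouping is the kind of information a CVT node carries.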
Authors
- Bibliographic Reference
- Mouloud Iferroudjene, Victor Charpenay, Antoine Zimmermann. FB15k-CVT: A Challenging Dataset for Knowledge Graph Embedding Models. NeSy 2023, 17th International Workshop on Neural-Symbolic Learning and Reasoning, Jul 2023, Siena, Italy. pp.381-394. ⟨emse-04081543⟩
- Department
- Département Informatique et systèmes intelligents
- HAL Collection
- Ecole Nationale Supérieure des Mines de Saint-Etienne; Institut Mines Télécom; Université de Clermont; CNRS - Centre national de la recherche scientifique; FAYOL - Institut Henri Fayol; Laboratoire d'Informatique, de Modélisation et d'optimisation des Systèmes; FAYOL / ISCOD : Informatique pour les Systèmes Coopératifs Ouverts et Décentralisés; composantes instituts telecom; Clermont Auvergne INP
- HAL Identifier
- 4140335
- Institution
- Ecole Nationale Supérieure des Mines de St Etienne; Courbon Software; Institut national polytechnique Clermont Auvergne
- Laboratory
- Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes
- Published in
- France
Table of Contents
- 1 Introduction
- 2 Compound Value Types in Freebase
- 2.1 Removing Compound Value Types
- 2.2 Reintroducing Compound Value Types
- 3 The FB15k-CVT dataset
- 3.1 Creation of the set of triples
- 3.2 Train/validation/test split
- 4 Experiments and preliminary results
- 4.1 From link prediction to path prediction
- 4.2 Experiments Settings
- 4.3 Experiments Results
- 5 Conclusion
- A Extended results of the experiment
- B Related Work