Description
Knowledge Graphs are used in an increasing number of applications. Although considerable human effort has been invested in making knowledge graphs available in multiple languages, most knowledge graphs are in English. Additionally, regional facts are often available only in the language of the corresponding region. This lack of multilingual knowledge limits the porting of machine learning models to different languages. To alleviate this drawback, we previously proposed THOTH, an approach for translating and enriching knowledge graphs across languages. THOTH extracts bilingual alignments between a source and a target knowledge graph and learns how to translate from one to the other by relying on two different recurrent neural network models along with knowledge graph embeddings. We evaluated THOTH extrinsically by comparing the German DBpedia with the German translation of the English DBpedia on two tasks: fact-checking and entity linking. In addition, we ran a manual intrinsic evaluation of the translation. Our results showed that THOTH is a promising approach that achieves a translation accuracy of 88.56%. Moreover, its enrichment improves the quality of the German DBpedia significantly: we report +18.4% accuracy for fact validation and +19% F1 for entity linking.
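To make the idea of bilingual alignments more concrete, here is a minimal sketch of how aligned triple pairs could be collected from two knowledge graphs. It is not THOTH's actual implementation: it assumes cross-lingual links between entities and predicates are already given as dictionaries, and the function name `align_triples` and all identifiers in the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def align_triples(
    source: List[Triple],
    target: List[Triple],
    entity_links: Dict[str, str],
    predicate_links: Dict[str, str],
) -> Tuple[List[Tuple[Triple, Triple]], List[Triple]]:
    """Split source triples into aligned pairs and unaligned triples.

    A source triple is aligned when its subject, predicate and object all
    have a cross-lingual link and the linked triple exists in the target KG.
    (Assumption for illustration; real alignment extraction may differ.)
    """
    target_set = set(target)
    aligned, unaligned = [], []
    for s, p, o in source:
        mapped = (entity_links.get(s), predicate_links.get(p), entity_links.get(o))
        if None not in mapped and mapped in target_set:
            aligned.append(((s, p, o), mapped))
        else:
            unaligned.append((s, p, o))
    return aligned, unaligned


# Toy data with made-up identifiers (not real DBpedia resources).
english = [
    ("en:Berlin", "en:capitalOf", "en:Germany"),
    ("en:Munich", "en:locatedIn", "en:Germany"),
]
german = [("de:Berlin", "de:hauptstadtVon", "de:Deutschland")]
entity_links = {
    "en:Berlin": "de:Berlin",
    "en:Germany": "de:Deutschland",
    "en:Munich": "de:München",
}
predicate_links = {"en:capitalOf": "de:hauptstadtVon"}

aligned, unaligned = align_triples(english, german, entity_links, predicate_links)
```

In this sketch, the aligned pairs would play the role of parallel training data for a translation model, while the unaligned source triples are the candidates a trained model could translate to enrich the target graph.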
Goals:
In this GSoC project, the goal is not to officially enrich the DBpedia KG but rather to investigate THOTH with other neural network architectures and different knowledge graph embedding techniques, with the aim of improving downstream NLP tasks such as Machine Translation and Question Answering.
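As an illustration of what a knowledge graph embedding technique computes, the sketch below scores a triple in the style of TransE, where a relation is modeled as a translation from the head embedding to the tail embedding. TransE is used here only as a well-known example; it is an assumption, not necessarily the technique used in THOTH, and the two-dimensional vectors are hand-picked toy values rather than trained embeddings.

```python
import math
from typing import Sequence


def transe_score(
    head: Sequence[float], relation: Sequence[float], tail: Sequence[float]
) -> float:
    """TransE-style plausibility: negative Euclidean distance of h + r from t.

    Scores closer to zero mean the triple is considered more plausible.
    """
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Hand-picked toy embeddings (hypothetical, not learned from data).
berlin = [1.0, 0.0]
capital_of = [0.0, 1.0]
germany = [1.0, 1.0]
france = [3.0, 1.0]

plausible = transe_score(berlin, capital_of, germany)
implausible = transe_score(berlin, capital_of, france)
```

With these toy values the first triple scores higher than the second, which is the kind of signal a translation or fact-validation model could use alongside the text.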
Impact:
The project may allow users to artificially enrich low-resource DBpedia KGs for use in essential NLP tasks and/or to augment knowledge graph-based machine learning models.
Warm-up tasks:
- Read the papers:
  - Transformer:
  - Survey on Knowledge Graph Embeddings: http://ceur-ws.org/Vol-2377/paper_4.pdf
Mentors
Diego Moussallem
Keywords
Neural Networks, NLP, Semantic Web