🤖 A Neural QA Model for DBpedia: Compositionality - GSoC2020


Neural SPARQL Machine is a project that aims to build an end-to-end system for answering questions posed by users who are not versed in writing SPARQL queries.

DBpedia currently hosts billions of data points and their relations in RDF format. Accessing this data is difficult for a lay user who does not know how to write a SPARQL query. This proposal builds upon an existing system (https://github.com/AKSW/NSpM/tree/master), which tries to make this enormous body of linked data available to a larger user base in their natural language (currently restricted to English), by improving, extending and amending the existing codebase.
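To make the gap between a question and its query concrete, here is a minimal sketch of how a natural language question can be paired with a SPARQL query through a shared placeholder. The template strings, the `<A>` placeholder convention, the `instantiate` helper and the choice of `dbo:author` are illustrative assumptions, not code taken from the repository.

```python
# Hypothetical question/query template pair sharing one placeholder.
# The placeholder name "<A>" and both templates are assumptions for illustration.
QUESTION_TEMPLATE = "Who is the author of <A>?"
QUERY_TEMPLATE = (
    "PREFIX dbo: <http://dbpedia.org/ontology/> "
    "SELECT ?x WHERE { <A> dbo:author ?x }"
)


def instantiate(label, uri):
    """Fill the placeholder with an entity label (question) and its URI (query)."""
    question = QUESTION_TEMPLATE.replace("<A>", label)
    query = QUERY_TEMPLATE.replace("<A>", f"<{uri}>")
    return question, query


question, query = instantiate("Dracula", "http://dbpedia.org/resource/Dracula")
print(question)
print(query)
```

Many such pairs, generated for different entities, form the kind of parallel data a translation model could be trained on. A lay user would only ever see the question side.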

Source Code and Documentation

The latest code-base is available here: https://github.com/dbpedia/neural-qa


To better understand the project please look into the following links:

  1. [GSoC 2018] Aman’s Blog: https://amanmehta-maniac.github.io/
  2. [GSoC 2019] Anand’s Blog: https://anandpanchbhai.com/A-Neural-QA-Model-for-DBpedia/

Reading Material:

  1. SPARQL as a Foreign Language: https://arxiv.org/abs/1708.07624
  2. Neural Machine Translation for Query Construction and Composition: https://arxiv.org/abs/1806.10478
  3. Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs: https://arxiv.org/pdf/1907.09361.pdf

Warm up tasks:

  1. Read through the blogs and the reading list to get a good understanding of the project and its code.
  2. Run the pipelines in the gsoc/anand folder of the repository mentioned above for a particular ontology.


Now that you have a good understanding of the current state of the project, we suggest that you build your proposal around some of the following points. Feel free to bring your own solutions to the problems that the project faces.

  1. Structure of the questions.
    Basic Graph Pattern (BGP)
    1. subordinate clauses or genitive (which / that / of / ’s)
    2. con-/disjunctions (and / or / as well as)
    3. modifiers (which + mod / what + mod / demonyms)
    4. comparative (more than / -er than)
    5. superlative (most … / -est)
    6. numeric / quantitative (how many / long / tall)
  2. Tackling out-of-vocabulary words
  3. Using word embeddings
  4. Integrating fastText
  5. Updating the code-base to Python 3
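As a pointer for the out-of-vocabulary and fastText items above, here is a minimal sketch of the character n-gram decomposition that fastText-style models rely on. An unseen word still shares subword units with known words, so it can receive a usable representation. The function name `char_ngrams` is hypothetical, and the sketch leaves out the rest of the model (hashing of n-grams, the whole-word token, and the embedding lookup itself).

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers.

    The "<" and ">" markers distinguish prefixes and suffixes from
    word-internal character sequences.
    """
    token = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams


# Trigrams only, to keep the output short.
trigrams = char_ngrams("where", 3, 3)
print(trigrams)

# A word missing from the vocabulary still overlaps with a known one.
shared = set(char_ngrams("birthplace")) & set(char_ngrams("birthplaces"))
print(len(shared) > 0)
```

In a full model, the vector of a word would be built from the vectors of its n-grams, which is one possible way to handle entity labels that never appeared in the training questions.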

Feel free to contact us for more information. We eagerly look forward to working with you and contributing towards making data accessible to all.
