Template Discovery for Neural Question Answering over DBpedia - GSoC2022

This project started in 2018 as ‘A Neural QA Model for DBpedia’ and is now heading into its 5th consecutive year at Google Summer of Code.

Introduction

Neural SPARQL Machines (NSpM) aim to build an end-to-end system that answers questions posed by users who are not versed in writing SPARQL queries.

Billions of relationships on the Web are currently expressed in the RDF format. Accessing such data is difficult for lay users who do not know how to write a SPARQL query. This GSoC project builds upon the NSpM question answering system, which tries to make this vast body of linked data accessible to a larger user base in their natural language (as of now restricted to English). The work consists of improving, extending and amending the existing codebase, which resides at the link below.

Documentation

Related work

The first 3 papers introduce and elaborate on Neural SPARQL Machines. Work number 3 was carried out by our GSoC 2019 student and published at KGSWC 2020. The 4th paper is an almost-complete survey of related approaches.

  1. SPARQL as a Foreign Language
  2. Neural Machine Translation for Query Construction and Composition
  3. Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition
  4. Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs

GSoC Blogs

You may also check which problems past GSoC contributors worked on:

  1. [GSoC 2018] Aman’s Blog: https://amanmehta-maniac.github.io/
  2. [GSoC 2019] Anand’s Blog: A Neural QA Model for DBpedia | Making data accessible to everyone
  3. [GSoC 2020] Zheyuan’s Blog: https://baiblanc.github.io/
  4. [GSoC 2021] Siddhant’s Blog: Documenting my GSoC’21 journey at DBpedia | Neural-QA-Model-for-DBpedia

Warm-up tasks

  1. Read through the blogs and the reading list to get a good understanding of the code and of the project as a whole.
  2. Run the pipelines in the ./gsoc/anand and ./gsoc/zheyuan folders of the base repository using examples of your choice.

Your proposal

Now that you have a good understanding of the current state of the project, we suggest you build your proposal around some of the following points. Feel free to bring your own solutions to the problems the project faces.

  1. How can we automatically build the right question from the property label only?
    • example a) from <s> dbo:birthPlace <o> infer where was <s> born?
    • example b) from <s> dbo:timeZone <o> infer what time zone is <s> in?
  2. How can we automatically build question-query templates that feature one or more of the following?
    • subordinate clauses or genitive: which / that / of / ’s
    • con-/disjunctions: and / or / as well as
    • modifiers: which + mod / what + mod / demonyms
    • comparative: more than / -er than
    • superlative: most … / -est
    • numeric / quantitative: how many / long / tall
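
To make the two questions above concrete, here is a minimal Python sketch of a naive baseline. It is not taken from the NSpM codebase: the function names, the fixed "what is the ... of" phrasing, and the use of `<s>` as the placeholder are assumptions made for illustration only. It shows how far the property label alone gets you, and what a simple conjunction template could look like.

```python
import re


def label_from_property(prop):
    """Turn a prefixed property such as 'dbo:birthPlace' into 'birth place'."""
    local_name = prop.split(":", 1)[1]
    # Split the camelCase local name into its lowercase words.
    words = re.findall(r"[A-Z]?[a-z]+", local_name)
    return " ".join(word.lower() for word in words)


def baseline_template(prop):
    """Hypothetical baseline: one fixed question pattern per property."""
    label = label_from_property(prop)
    question = f"what is the {label} of <s>?"
    query = f"SELECT ?o WHERE {{ <s> {prop} ?o }}"
    return question, query


def conjunction_template(prop_a, prop_b):
    """Hypothetical 'and' template that joins two properties of one subject."""
    label_a = label_from_property(prop_a)
    label_b = label_from_property(prop_b)
    question = f"what are the {label_a} and the {label_b} of <s>?"
    query = f"SELECT ?a ?b WHERE {{ <s> {prop_a} ?a . <s> {prop_b} ?b }}"
    return question, query


if __name__ == "__main__":
    print(baseline_template("dbo:birthPlace"))
    print(conjunction_template("dbo:birthPlace", "dbo:timeZone"))
```

The baseline yields "what is the birth place of <s>?" rather than the more natural "where was <s> born?", which is exactly the gap point 1 asks you to close.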

Consider experimenting with advanced approaches such as:

Project size

The size of this project can be either medium or large. Please state in your proposal the number of total project hours you intend to dedicate to it (175 or 300).

Mentors

@tsoru, @panchbhai1969, @nausheenfatma

Feel free to contact us for more information. We eagerly look forward to working with you and contributing towards making data accessible to all.

Hello @tsoru @panchbhai1969, I am Saurav Joshi, pursuing a bachelor's degree in Computer Engineering in Mumbai, India. I am very interested in contributing to this project, as it focuses on NLP, knowledge graphs and the semantic web, which are the areas I love. I hope to have a great summer working on this project.

Everyone who is interested in submitting a proposal for this project please follow the steps below:

  • Prepare a Google Docs draft along the lines of this example of an excellent proposal that was accepted a few years ago.
  • Share the draft proposal with my account (mommi84 at gmail dot com).
  • Address the comments that the other mentors and I will leave.
  • Submit the proposal to the official GSoC platform.

Important message to @sauravjoshi23 and anyone interested in the project.

We are 7 DAYS away from the contributor application deadline, and we have so far received 1 (ONE) application for this project, so there is still a good chance of being accepted into this year’s programme.

Please follow the steps below as soon as possible if you wish to get mentors’ feedback before your submission.

Hi @tsoru, I mailed you the link to my draft yesterday. Could you please check your mail to see whether you have received it?

Hi @sauravjoshi23. Yes, I did indeed receive your proposal, thank you. The other mentors and I will try and give you our feedback within the next few days.

Hi @souravjoshi,

I have added a few comments on your doc. The other mentors might add more as well.

I have replied to the comments. Thank you @nausheenfatma