🤖 A Neural QA Model for DBpedia: Compositionality - GSoC2020

panchbhai1969 · February 4, 2020, 4:36pm

Introduction

Neural SPARQL Machine is a project that deals with building an end-to-end system to answer questions posed by users who are not versed in writing SPARQL queries.

Currently DBpedia hosts billions of data points and corresponding relations in the RDF format. Accessing such data is difficult for a lay user who does not know how to write a SPARQL query. This proposal tries to build upon an existing system (https://github.com/AKSW/NSpM/tree/master), which tries to make this vast body of linked data available to a larger user base in their natural languages (now restricted to English), by improving, extending and amending the existing codebase.
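
To make the gap between a question and a query concrete, here is a minimal Python 3 sketch of the template idea behind this kind of system: a question template is paired with a SPARQL query template, and both are filled with the same entity. The template text, the `<A>` placeholder, the `fill_template` helper and the choice of `dbo:birthPlace` are illustrative assumptions, not code taken from the NSpM repository.

```python
# Illustrative only: a hypothetical question/query template pair.
# "<A>" marks the slot where an entity is substituted.
QUESTION_TEMPLATE = "What is the birth place of <A>?"
QUERY_TEMPLATE = "SELECT ?x WHERE { <A> dbo:birthPlace ?x }"


def fill_template(label, resource):
    """Return a (question, query) pair for one entity.

    label    -- human-readable entity name used in the question
    resource -- prefixed DBpedia resource name used in the query
    """
    question = QUESTION_TEMPLATE.replace("<A>", label)
    query = QUERY_TEMPLATE.replace("<A>", resource)
    return question, query


question, query = fill_template("Albert Einstein", "dbr:Albert_Einstein")
print(question)
print(query)
```

A lay user only ever writes the first line; the second is what would have to be sent to a SPARQL endpoint, with the `dbo:` and `dbr:` prefixes declared.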

Source Code and Documentation

The latest code-base is available here: GitHub - dbpedia/neural-qa: 📚 A Neural QA Model for DBpedia using Neural SPARQL Machines.

Blogs

To better understand the project please look into the following links:

[GSoC 2018] Aman’s Blog: https://amanmehta-maniac.github.io/
[GSoC 2019] Anand’s Blog: A Neural QA Model for DBpedia | Making data accessible to everyone

Reading Material:

SPARQL as a Foreign Language: arXiv:1708.07624
Neural Machine Translation for Query Construction and Composition: arXiv:1806.10478
Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs: https://arxiv.org/pdf/1907.09361.pdf

Warm up tasks:

Read through the blogs and the reading list to get a good understanding of the project and its code.
Run the pipelines in the gsoc/anand folder of the repository mentioned above for a particular ontology.

Ideas

Now that you have a good understanding of the current state of the project, we suggest that you build your proposal around some of the following points. Feel free to bring your own solutions to the problems that the project faces.

Structure of the questions.
Basic Graph Pattern (BGP)
1. subordinate clauses or genitive (which / that / of / ’s)
2. con-/disjunctions (and / or / as well as)
3. modifiers (which + mod / what + mod / demonyms)
4. comparative (more than / -er than)
5. superlative (most … / -est)
6. numeric / quantitative (how many / long / tall)
Tackling out-of-vocabulary words
Using word embeddings
Integrating fastText
Updating the code-base to Python 3
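
For the out-of-vocabulary idea above, the following is a minimal Python 3 sketch of the subword approach used by fastText: a word is wrapped in boundary markers and broken into character n-grams, so an unseen word still shares pieces with known words. The `char_ngrams` helper is hypothetical and written from scratch for illustration; it does not call the fastText library, and the 3 to 6 character range simply mirrors fastText's default setting.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word, fastText-style.

    The word is wrapped in "<" and ">" so that prefixes and suffixes
    produce n-grams distinct from those in the middle of a word.
    """
    wrapped = "<" + word + ">"
    ngrams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


# A word seen in training and a hypothetical unseen variant of it.
known = set(char_ngrams("monument"))
unseen = set(char_ngrams("monuments"))
shared = known & unseen

print(char_ngrams("where", 3, 3))
print(len(shared), "n-grams shared between 'monument' and 'monuments'")
```

In a real embedding model each n-gram would have its own vector and the word vector would be built from them, which is what lets an unseen word get a usable representation.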

Feel free to contact us for more information. We eagerly look forward to working with you and contributing towards making data accessible to all.

diogenesis · February 27, 2020, 4:19pm

Hi! I’d be interested in working on this during GSoC. Can we submit more than one proposal to the same organization, by any chance?

emarx · February 27, 2020, 4:32pm

sure

diogenesis · February 27, 2020, 4:33pm

Right, thanks! For this project, should I contact the mentor and start working on a proposal now?

panchbhai1969 · February 28, 2020, 6:48am

Hi @diogenesis,

Before starting on the proposal, I would suggest that you read the papers and complete the warm-up tasks. Doing so will help you write a good proposal. Feel free to ask questions here.

baizydl · March 2, 2020, 1:11pm

Hi. I am interested in this project, but I have some issues running it because of the TensorFlow version. I see that this project is developed with Python 2 and TensorFlow 1.12, but my PC runs Windows, and the most recent version I can find for Python 2.7 on Windows is TensorFlow 1.10. Do I need a Linux environment? Or do you have any suggestions?

baizydl · March 2, 2020, 4:15pm

It’s no longer a problem. I’ve created a Linux VM and it works well.

panchbhai1969 · March 2, 2020, 6:11pm

Sounds good, keep us posted as you progress.

baizydl · March 5, 2020, 3:11pm

Hello, I’m wondering whether you have taken this project any further since last year’s GSoC, as mentioned in Anand’s blog?

panchbhai1969 · March 5, 2020, 6:06pm

Hi @baizydl,

We did have multiple discussions after the GSoC period ended. Some of the points discussed have been added to the ideas section in the topic description above.

baizydl · March 6, 2020, 8:31am

Thanks, @panchbhai1969. I will try to work on the first draft of my proposal. By the way, do we need to have a pull request merged to be a candidate?

panchbhai1969 · March 6, 2020, 12:33pm

Sure, do share the draft proposal with us (Recommended platform: Google Docs, share with us privately).

As far as pull requests and merges are concerned, it’s not compulsory. But we do encourage you to interact with the code and create pull requests for small issues, if you come across any.

nikhit · March 9, 2020, 5:40pm

Hi, are this project and the one named “Multilingual Neural QA” the same?

panchbhai1969 · March 9, 2020, 7:55pm

Hi @nikhit,

I assume you are referring to this: DBpedia Neural Multilingual QA - GSoC2020.

At first glance they may seem similar, but if you take a closer look, you will find that this project, in brief, focuses on the aspect of NSpM that deals with handling a wide range of compositional (complex) questions, currently limited to English. (Hint: check out the Basic Graph Patterns and other ideas in the topic description above.)

The project you are referring to, on the other hand, focuses on the multilingual aspect of Neural QA, extending the NSpM framework to cope with the multilingualism challenge as stated on the corresponding page. You may find more information about DBNQA here: https://github.com/AKSW/DBNQA.
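
As a rough illustration of what "compositional" means for this project, the Python 3 sketch below pairs a few of the question structures from the topic description with the SPARQL construct each one typically needs. The questions, the DBpedia properties and the queries are assumptions chosen for this sketch, not entries from the project's template files, and the queries have not been run against the endpoint.

```python
# Illustrative pairs only: (question, SPARQL query) per question structure.
EXAMPLES = {
    "genitive": (
        "Who is the mayor of the capital of France?",
        "SELECT ?p WHERE { dbr:France dbo:capital ?c . ?c dbo:mayor ?p }",
    ),
    "comparative": (
        "Which mountains are higher than 8000 metres?",
        "SELECT ?m WHERE { ?m a dbo:Mountain ; dbo:elevation ?e . "
        "FILTER(?e > 8000) }",
    ),
    "superlative": (
        "What is the highest mountain?",
        "SELECT ?m WHERE { ?m a dbo:Mountain ; dbo:elevation ?e } "
        "ORDER BY DESC(?e) LIMIT 1",
    ),
    "numeric": (
        "How many mountains are there in Nepal?",
        "SELECT (COUNT(?m) AS ?n) WHERE { ?m a dbo:Mountain ; "
        "dbo:locatedInArea dbr:Nepal }",
    ),
}

# The SPARQL feature that distinguishes each structure from a single triple.
KEY_CONSTRUCT = {
    "genitive": "two chained triple patterns",
    "comparative": "FILTER",
    "superlative": "ORDER BY ... LIMIT 1",
    "numeric": "COUNT",
}

for kind, (question, query) in EXAMPLES.items():
    print(kind + ": " + question)
    print("  needs: " + KEY_CONSTRUCT[kind])
    print("  query: " + query)
```

The genitive case shows composition most directly: the answer to the inner question (`?c`, the capital) becomes the subject of the outer one.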

jasonsychau · March 25, 2020, 12:48pm

Hi, I submitted my proposal and an email. Have you read them?

panchbhai1969 · March 25, 2020, 6:47pm

Hi Jason,

Indeed, I have gone through your proposal. Please give us comment access, so that I can answer the questions you asked in the doc as comments.

jasonsychau · March 25, 2020, 8:16pm

OK, I changed the settings.

maheshkulkarni · March 26, 2020, 9:01pm

Hello, my name is Mahesh Kulkarni. I am currently in the final year of my B.Tech degree at Vishwakarma Institute Of Technology, Pune, India. I have some prior experience with NLP and deep learning. I am interested in this project and want to contribute to it. I have gone through the warm-up tasks. Are there any further instructions that would help me understand the project better?
Thank you

panchbhai1969 · March 27, 2020, 1:50pm

Hi Mahesh,

Sounds great! The description above contains all the information you need to get started. Draft a proposal with your ideas for this project and share it with us.

maheshkulkarni · March 27, 2020, 7:11pm

Thanks for the quick reply. Your blog lists the following under
“Future aspects of this project”:
“Working on variable awareness”.
Can you elaborate on this so I can get a better idea?
Also, some SPARQL learning resources would be helpful for me.