DBpedia Neural Multilingual QA - GSoC2020

In this topic, the student will implement an approach that extends the original monolingual DBNQA dataset into a multilingual one.

Could you please provide the project details using our usual project description structure? Thank you :slight_smile:

In this topic, the student will implement an approach that extends the original monolingual DBNQA dataset [1] into a multilingual one.

Extend the NSpM framework to cope with the challenge of multilingualism.

QA systems enable lay users to access content; however, many of these systems are English-centric. Extending models to other languages is pivotal to democratizing knowledge access and to engaging minority-language communities.
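
As a rough illustration of the dataset extension described above, the sketch below assumes that each DBNQA entry pairs an English question with a SPARQL query, and that the query can stay unchanged while only the question is translated. The `translate` callable, the toy lookup table, and the function names are hypothetical stand-ins for a real machine translation system; none of them come from NSpM or DBNQA.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (natural-language question, SPARQL query)


def extend_to_languages(
    pairs: List[Pair],
    translate: Callable[[str, str], str],
    languages: List[str],
) -> Dict[str, List[Pair]]:
    """Build one question/query list per language, reusing each query as-is."""
    multilingual: Dict[str, List[Pair]] = {"en": list(pairs)}
    for lang in languages:
        multilingual[lang] = [(translate(q, lang), sparql) for q, sparql in pairs]
    return multilingual


# Hypothetical stand-in for a real MT system: a tiny lookup table.
TOY_TRANSLATIONS = {
    ("where is the Eiffel Tower located?", "de"): "wo befindet sich der Eiffelturm?",
}


def toy_translate(question: str, lang: str) -> str:
    # Fall back to the original question when no translation is known.
    return TOY_TRANSLATIONS.get((question, lang), question)


english_pairs = [
    (
        "where is the Eiffel Tower located?",
        "select ?a where { dbr:Eiffel_Tower dbo:location ?a }",
    )
]
dataset = extend_to_languages(english_pairs, toy_translate, ["de"])
print(dataset["de"][0][0])
```

Keeping the query fixed is only one possible design; entity labels and word order differ across languages, so a real pipeline would likely need more than sentence-level translation.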

Warm-up tasks
(1) Fork the NSpM project (https://github.com/AKSW/NSPM);
(2) Train a model on the Monument 300 and Monument 600 datasets (https://github.com/AKSW/NSpM/tree/master/data);
(3) Complete the TensorFlow machine translation tutorial available at https://www.tensorflow.org/tutorials/text/nmt_with_attention

Edgard Marx

#NSpM #DBpedia #Multilingual #QA #MT

Hello, I am Lahiru Hinguruduwa, a final-year CSE undergraduate at the University of Moratuwa, Sri Lanka.
I have considerable experience developing machine learning and deep learning models, and I am also interested in NLP. I have completed the warm-up tasks and have two questions:

  • Should we focus on specific languages, or try to implement a general model that works across languages?

  • Since I have completed the warm-up tasks, can you offer some advice on how to move forward with this project?

Dear Lahiruoshara,

I am glad you have completed the warm-up tasks.
Now you can let your imagination flow and focus on your GSoC application.
As the warm-up tasks suggest, we are going to use the NSpM framework, which is based on TensorFlow.
There are no requirements beyond that.
You can use the NSpM and DBNQA publications as a source of inspiration for your application.

Best of luck

Thanks. I will do so.

Do we have a template for the proposal?

Yup, https://google.github.io/gsocguides/student/proposal-example-1



I am Aadesh, a master's student working in the field of ML. I am very excited to see applications of deep learning in NLU and would love to work on this project.
I am going through the warm-up tasks and will soon start working on the proposal.


Hi Emarx,
I read the paper "SPARQL as a Foreign Language" (http://tsoru.aksw.org/neural-sparql-machines/soru-marx-semantics2017.html). I did not fully understand the method used to create the dataset. Can you elaborate?

Dear @lahiruoshara,

Unfortunately not, that’s part of your homework ;-).
Perhaps after the GSoC deadline.
Start by understanding the running example; it may help you.

Okay, I’ll try to figure it out myself, thank you.

Hi, I am Shrilakshmi from India.
My idea is to build a neural network that corrects misspelled words to their dictionary forms for Indian languages, so that the corrected words can be used in the subsequent QA process.

The issue "How to deal with out of vocabulary words? #24" could be addressed this way, since words or text would be corrected before transformation.
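
To make the correction step concrete, here is a minimal sketch. It is not a neural network: it swaps in a simple string-similarity lookup from Python's standard `difflib` module, and the vocabulary, the cutoff value, and the function names are made-up assumptions for illustration only.

```python
import difflib
from typing import List


def correct_word(word: str, vocabulary: List[str], cutoff: float = 0.7) -> str:
    """Return the closest vocabulary word, or the word itself if none is close."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_question(question: str, vocabulary: List[str]) -> str:
    # Correct token by token; whitespace tokenization is a simplification.
    return " ".join(correct_word(tok, vocabulary) for tok in question.split())


# Hypothetical vocabulary; a real one would come from the training questions.
VOCAB = ["where", "is", "the", "monument", "located"]
print(correct_question("wher is the monumnt locatd", VOCAB))
```

Words with no sufficiently similar vocabulary entry are passed through unchanged, so genuinely new words (for example entity names) are not forced onto a wrong dictionary word.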

Hi @emarx ,
Since “Mapping-Based Infobox Extraction” exists for 27 languages in DBpedia, couldn’t we annotate templates for these languages and create the datasets from them?
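
One way to picture this suggestion is the sketch below. It assumes NSpM-style templates in which a placeholder such as `<A>` stands for an entity in both the question and the query. The per-language question templates, the entity labels, and the helper name are illustrative assumptions, not taken from the NSpM repository or from DBpedia data.

```python
from typing import Dict, List, Tuple

# Hypothetical annotated templates: one question per language, one shared query.
QUESTION_TEMPLATES: Dict[str, str] = {
    "en": "where is <A> located?",
    "de": "wo befindet sich <A>?",
}
QUERY_TEMPLATE = "select ?a where { <A> dbo:location ?a }"

# Hypothetical entity list with one label per language.
ENTITIES = [
    {
        "uri": "dbr:Neuschwanstein_Castle",
        "labels": {"en": "Neuschwanstein Castle", "de": "Schloss Neuschwanstein"},
    },
]


def instantiate(lang: str) -> List[Tuple[str, str]]:
    """Fill the placeholder with each entity's label (question) and URI (query)."""
    question_template = QUESTION_TEMPLATES[lang]
    pairs = []
    for entity in ENTITIES:
        question = question_template.replace("<A>", entity["labels"][lang])
        query = QUERY_TEMPLATE.replace("<A>", entity["uri"])
        pairs.append((question, query))
    return pairs


for lang in QUESTION_TEMPLATES:
    print(lang, instantiate(lang))
```

Plain placeholder substitution ignores articles and inflection, which many languages require, so annotated templates would probably need per-language care beyond what this sketch shows.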


To everybody, @lahiruoshara @shrilakshmi

We have started collecting proposal drafts to estimate the number of project slots.
Please submit your drafts as soon as possible.

How should I submit it? Through a PM?

@shrilakshmi Via GSoC website.

Sure, and thanks.