DBpedia Neural Multilingual QA - GSoC2020

emarx · February 6, 2020, 3:12pm

In this topic, the student will implement an approach that extends the original monolingual DBNQA dataset into a multilingual one.

SandraPraetor · February 13, 2020, 10:46am

Could you please provide project details and use our usual project description structure? Thank you.

DESCRIPTION
Goal
Impact
Warm-up tasks:
Mentors
Keywords

emarx · February 17, 2020, 12:55pm

Description
In this topic, the student will implement an approach that extends the original monolingual DBNQA dataset [1] into a multilingual one.

Goal
Extend the NSpM framework to cope with the multilingualism challenge.

Impact
QA systems enable lay users to access content; however, many of these systems are English-centric. Extending models to other languages is pivotal to democratizing knowledge access and to engaging communities of minority languages.

Warm-up tasks
(1) Fork the NSpM project (https://github.com/AKSW/NSPM);
(2) Train on the Monument 300 and Monument 600 datasets (https://github.com/AKSW/NSpM/tree/master/data);
(3) Complete the TensorFlow tutorial for machine translation available at https://www.tensorflow.org/tutorials/text/nmt_with_attention

Mentors
Edgard Marx

Keywords
#NSpM #DBpedia #Multilingual #QA #MT

lahiruoshara · February 25, 2020, 8:11am

Hello, I am Lahiru Hinguruduwa, a final-year CSE undergraduate at the University of Moratuwa, Sri Lanka.
I have quite some experience in developing machine learning and deep learning models, and I am also interested in NLP. I have completed the warm-up tasks and have a couple of questions:

Do we have to consider/focus on some languages or try to implement a general model for languages?
Also since I have completed the warm-up tasks, can you offer some advice to move forward in this project?

emarx · February 25, 2020, 11:42am

Dear Lahiruoshara,

I am glad you have completed the warm-up tasks.
Now you can let your imagination flow and focus on your GSoC application.
As the warm-up tasks suggest, we are going to use the NSpM framework, which is based on TensorFlow.
There are no other requirements.
You can use the NSpM and DBNQA publications as a source of inspiration for your application.

Best of luck.

lahiruoshara · February 25, 2020, 1:06pm

Thanks. I will do so.

lahiruoshara · March 1, 2020, 5:19pm

Do we have a template for the proposal?

emarx · March 1, 2020, 6:17pm

Yup, https://google.github.io/gsocguides/student/proposal-example-1

lahiruoshara · March 2, 2020, 6:59pm

Thanks

aadesh · March 4, 2020, 11:35am

Hi,

I am Aadesh, a master’s student working in the field of ML. I am very excited to see applications of deep learning in NLU and would love to work on this project.
I am going through the warm-up tasks and will soon start working on the proposal.

Thanks!

lahiruoshara · March 8, 2020, 5:28pm

Hi Emarx,
I read the paper “SPARQL as a Foreign Language” (http://tsoru.aksw.org/neural-sparql-machines/soru-marx-semantics2017.html). I did not fully understand the method of creating the dataset. Can you elaborate?

emarx · March 9, 2020, 9:40am

Dear @lahiruoshara,

Unfortunately not; that’s part of your homework ;-).
Perhaps after the GSoC deadline.
Start by understanding the running example; it may help you.

lahiruoshara · March 10, 2020, 9:20am

Okay, I’ll try to figure it out myself, thank you.

shrilakshmi · March 14, 2020, 2:39am

Hi, I am Shrilakshmi from India.
My idea is to build a neural network that corrects misspelled words to their dictionary forms for Indian languages, so that the corrected words can be used in the subsequent QA process.

The issue “How to deal with out of vocabulary words? #24” could be addressed this way, since words or text are corrected before transformation.

lahiruoshara · March 18, 2020, 6:41am

Hi @emarx ,
Since “Mapping-Based Infobox Extraction” exists for 27 languages in DBpedia, can’t we annotate templates for these languages and create the datasets?
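
A minimal sketch of what annotating templates per language could look like, assuming each template pairs one SPARQL query with one question per language and uses an `<A>` placeholder for the entity. The `Template` class, the `instantiate` helper, and the sample entity and labels are hypothetical illustrations, not part of the NSpM code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    # Hypothetical structure: language code -> question with an <A> placeholder.
    question: dict
    # SPARQL query sharing the same <A> placeholder.
    query: str


def instantiate(template, entities):
    """Yield (lang, question, query) for every entity and annotated language."""
    for uri, labels in entities.items():
        query = template.query.replace("<A>", f"<{uri}>")
        for lang, question in template.question.items():
            label = labels.get(lang)
            if label is None:
                # No label for this language: skip instead of guessing one.
                continue
            yield lang, question.replace("<A>", label), query


# Illustrative data only; real labels would come from the DBpedia language editions.
template = Template(
    question={"en": "Where is <A> located?", "de": "Wo befindet sich <A>?"},
    query="select ?x where { <A> dbo:location ?x }",
)
entities = {
    "http://dbpedia.org/resource/Statue_of_Liberty": {
        "en": "Statue of Liberty",
        "de": "Freiheitsstatue",
    },
}

pairs = list(instantiate(template, entities))
for lang, question, query in pairs:
    print(lang, "|", question, "|", query)
```

The query side stays identical across languages here, so only the question strings would need annotation for each new language.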

emarx · March 23, 2020, 2:33pm

sure.

emarx · March 23, 2020, 2:36pm

To everybody, @lahiruoshara @shrilakshmi

We have started collecting proposal drafts to estimate the number of project slots.
Please submit your drafts ASAP.

shrilakshmi · March 23, 2020, 3:13pm

How should I submit it? Through PM?

emarx · March 23, 2020, 5:41pm

@shrilakshmi Via GSoC website.

shrilakshmi · March 24, 2020, 12:33pm

Sure, and thanks.