This project started in 2018 as ‘A Neural QA Model for DBpedia’ and is now looking to its 7th consecutive year at Google Summer of Code.
Introduction
Neural SPARQL Machines (NSpM) aim at building an end-to-end system to answer questions posed by users who are not versed in writing SPARQL queries.
Currently, billions of relationships on the Web are expressed in the RDF format. Accessing such data is difficult for lay users who do not know how to write SPARQL queries. This GSoC project builds upon the NSpM question-answering system, which tries to make this vast body of linked data accessible to a larger user base in their natural language (as of now restricted to English). The work consists of improving, extending, and amending the existing codebase, which resides at the link below.
Documentation
Related work
The first 3 papers introduce and elaborate on Neural SPARQL Machines. Work number 3 was carried out by our GSoC 2019 student and published at KGSWC 2020. The 4th paper is an almost-complete survey of related approaches.
Read through the most recent blogs and the reading list to get a good understanding of the project and its code.
Run the pipeline in the ./gsoc/mehrzad folder of the base repository using examples of your choice.
Your proposal
Now that you have a good understanding of the current state of the project, we ask you to write your own proposal. Feel free to bring your own solutions to tackle the problem that the project currently faces, i.e. training a question-answering model using the dataset we have built over the years.
Although the original paper mentions a seq2seq model, the NSpM paradigm allows us to choose any model as our Learner to translate natural-language questions into SPARQL. You may even propose your own model or one from any other community (e.g., HuggingFace).
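To make the "any model as our Learner" idea concrete, here is a minimal Python sketch. It is not the NSpM codebase: the names `Learner`, `LookupLearner`, `translate`, and `EXAMPLE_PAIRS` are hypothetical and chosen only for illustration, and the lookup table is a toy stand-in for a trained model. The example question and the DBpedia IRIs in the query are illustrative as well.

```python
from __future__ import annotations

from typing import Protocol


class Learner(Protocol):
    """Hypothetical interface: anything that maps a question to a SPARQL string."""

    def translate(self, question: str) -> str: ...


class LookupLearner:
    """Toy stand-in for a trained model: it only memorises question/query pairs."""

    def __init__(self, pairs: dict[str, str]) -> None:
        self._pairs = {self._normalise(q): s for q, s in pairs.items()}

    @staticmethod
    def _normalise(question: str) -> str:
        # Lower-case, drop surrounding spaces and question marks, collapse whitespace.
        return " ".join(question.lower().strip(" ?").split())

    def translate(self, question: str) -> str:
        key = self._normalise(question)
        if key not in self._pairs:
            raise KeyError(f"no translation learned for: {question!r}")
        return self._pairs[key]


def question_to_sparql(learner: Learner, question: str) -> str:
    """Works with any object that satisfies the Learner interface."""
    return learner.translate(question)


# Illustrative training pair (an assumption, not taken from the project dataset).
EXAMPLE_PAIRS = {
    "Where was Albert Einstein born?": (
        "SELECT ?place WHERE { "
        "<http://dbpedia.org/resource/Albert_Einstein> "
        "<http://dbpedia.org/ontology/birthPlace> ?place }"
    ),
}

learner = LookupLearner(EXAMPLE_PAIRS)
query = question_to_sparql(learner, "  where was albert  einstein born ? ")
```

A seq2seq network, a fine-tuned transformer, or a prompted LLM could be swapped in behind the same `translate` method, while the rest of the pipeline stays unchanged.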
Project size
The size of this project can be either medium or large. Please state in your proposal the number of total project hours you intend to dedicate to it (175 or 300).
I am Vedant Udan, a bachelor’s student at IIT Bhilai. During my bachelor’s, I have worked a lot on LLMs and NLP-related tasks, such as fine-tuning models, prompt engineering, and many other NLP tasks.
Due to my previous experience, I find this project particularly interesting and want to try out how LLMs, along with the awesome NLP tools available, can help to solve this problem.
Hi @panchbhai1969 @mehrzadshm, I came across this project and it seems quite interesting. I have a solid foundation in ML and I am also a Microsoft Certified Solutions Developer for Natural Language Processing. I would love to get further details and discuss my proposal. Looking forward to working with you. Also, please provide some contact details.
Hey @panchbhai1969
This is Soham, currently pursuing my Master's in Artificial Intelligence at the University of Amsterdam. I would love to work on this research project. I have professional experience as an Applied Scientist at Amazon, a Fraud Analyst at OneCard, and a Software Developer at Oracle.
Prior to this, I also completed a Master's at the Indian Statistical Institute, specializing in Data Science.
LinkedIn: Soham Chatterjee - Amsterdam, North Holland, Netherlands | Professional Profile | LinkedIn
How can I join the Slack channel? It says: “It looks like there isn’t an account on DBpedia tied to this email address.”
I would like to go over the project in more detail. Let's connect on Slack and discuss further.
Hi @panchbhai1969, @mehrzadshm
I am Alexander Osadolor. I just concluded a Data Science bootcamp at HyperionDev, and I am currently pursuing a master’s degree at Teesside University. I find this topic intriguing and would love to be a part of it.
Regards