Mapping Generation from Resource Descriptions - GSoC2020


DBpedia currently maintains mappings between Wikipedia infobox template properties and the DBpedia ontology, since several similar templates exist (within a single language as well as across multiple languages) to describe closely related types of infoboxes. The aim of the project is to enrich and possibly correct the existing mappings with a data-driven method that proposes or generates mappings automatically by analyzing instance data from distinct language-specific datasets. This will be a follow-up to a previous GSoC project, which mainly mapped classes to infobox templates.
A central goal is also to map Wikidata property identifiers.


Provide suggestions (e.g. by using statistical probabilities) for which properties from the DBpedia ontology and from Wikidata each template parameter should be mapped to.


Increase the coverage for already mapped and not yet mapped languages, which ultimately leads to better data quality.
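
As a rough illustration of the statistical suggestion idea, the sketch below estimates, for each template parameter, how often its value coincides with the value of an already mapped ontology property on the same entity. It is only a minimal sketch under stated assumptions, not the project's method: the function name `suggest_mappings`, the `min_support` threshold, the triple layout, and the sample parameters `einwohner` and `flaeche` are all hypothetical, and values are compared by exact string match, which real data would not allow without normalization.

```python
from collections import Counter, defaultdict


def suggest_mappings(infobox_facts, ontology_facts, min_support=2):
    """Rank candidate ontology properties for each template parameter.

    infobox_facts:  iterable of (entity, template_parameter, value)
    ontology_facts: iterable of (entity, ontology_property, value),
                    e.g. taken from an already mapped language edition.
    Returns {parameter: [(property, estimated P(property | parameter)), ...]}.
    All names here are hypothetical and only serve the illustration.
    """
    # Index known ontology properties by (entity, value) for fast lookup.
    props_by_pair = defaultdict(set)
    for entity, prop, value in ontology_facts:
        props_by_pair[(entity, value)].add(prop)

    param_total = Counter()              # how often each parameter occurs
    co_occurrence = defaultdict(Counter)  # parameter -> property -> matches
    for entity, param, value in infobox_facts:
        param_total[param] += 1
        for prop in props_by_pair.get((entity, value), ()):
            co_occurrence[param][prop] += 1

    suggestions = {}
    for param, counts in co_occurrence.items():
        # Keep candidates with enough supporting matches, best first.
        ranked = [
            (prop, n / param_total[param])
            for prop, n in counts.most_common()
            if n >= min_support
        ]
        if ranked:
            suggestions[param] = ranked
    return suggestions


# Toy data: parameter names and values are made up for the example.
infobox_facts = [
    ("Berlin", "einwohner", "3644826"),
    ("Hamburg", "einwohner", "1841179"),
    ("Munich", "einwohner", "1471508"),
    ("Berlin", "flaeche", "891.7"),
    ("Hamburg", "flaeche", "755.2"),
]
ontology_facts = [
    ("Berlin", "dbo:populationTotal", "3644826"),
    ("Hamburg", "dbo:populationTotal", "1841179"),
    ("Munich", "dbo:populationTotal", "1500000"),  # value differs, no match
    ("Berlin", "dbo:areaTotal", "891.7"),
    ("Hamburg", "dbo:areaTotal", "755.2"),
]

for param, ranked in suggest_mappings(infobox_facts, ontology_facts).items():
    print(param, ranked)
```

The same counting scheme could in principle rank Wikidata property identifiers instead of ontology properties by feeding Wikidata statements in as `ontology_facts`; the threshold and the matching rule would need tuning against real instance data.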

Warm-up tasks

Familiarize yourself with the code base of the previous project and evaluate its results (there is no fixed requirement to re-use it).



mappings, knowledge base completion, data quality

Hi Team,

First of all, I would sincerely like to apologize for the very late response. I missed out on DBpedia’s projects.

I am a Master’s student at Saarland University (Max Planck Institute for Informatics, MPI-Inf). I have completed relevant coursework (Information Extraction) and have worked with the Database and Information Systems group at MPI-Inf. I therefore have the knowledge and skills required to work on this project, which I find very interesting. I see a lot of benefits for the community coming out of this project.

Since I don’t see a mention of a mentor for this project, may I ask the team to guide me further?

I would appreciate a prompt response.

Thank you very much.