Extending Extraction Framework with Citations, Commons and Lexeme Extractors - GSoC2020

Description

DBpedia is a crowd-sourced community effort to extract structured content from the various Wikimedia projects and make it publicly available to everyone on the Web. This project will improve the DBpedia extraction framework (https://github.com/dbpedia/extraction-framework), which is continuously developed by the community, by adding extractors for citation, Commons, and lexeme information.

Goals

The student will develop the modules required to parse information from each specific source. These modules will be used to extract a wider range of knowledge from the Wikimedia projects, which will be made openly available to the community across different interests and language editions.
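To make the parsing and triple-generation goal more concrete, here is a minimal, illustrative sketch in Python. It is not the Extraction Framework's actual API (the framework itself is written in Scala); the function names, the simple regular expression, and the `http://example.org/property/` namespace are all assumptions made for illustration. It only handles flat `{{cite ...}}` templates without nested templates or piped wiki links.

```python
import re

# Hypothetical placeholder namespace, NOT a real DBpedia vocabulary.
EXAMPLE_PROPERTY_NS = "http://example.org/property/"

# Matches flat templates such as {{cite web |url=... |title=...}}.
# Assumption: no nested templates or piped links inside the citation.
CITE_TEMPLATE = re.compile(
    r"\{\{\s*cite\s+\w+\s*\|(.*?)\}\}", re.IGNORECASE | re.DOTALL
)


def parse_citations(wikitext):
    """Return one dict (parameter name -> value) per {{cite ...}} template."""
    citations = []
    for match in CITE_TEMPLATE.finditer(wikitext):
        params = {}
        for part in match.group(1).split("|"):
            if "=" in part:
                # Split on the first "=" only, so URLs with query strings survive.
                key, value = part.split("=", 1)
                params[key.strip().lower()] = value.strip()
        citations.append(params)
    return citations


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def citations_to_ntriples(page_uri, citations):
    """Turn parsed citations into N-Triples lines (hypothetical predicates)."""
    lines = []
    for params in citations:
        url = params.get("url")
        if not url:
            continue  # Skip citations without a URL in this sketch.
        lines.append(f"<{page_uri}> <{EXAMPLE_PROPERTY_NS}citation> <{url}> .")
        title = params.get("title")
        if title:
            lines.append(
                f'<{url}> <{EXAMPLE_PROPERTY_NS}title> "{escape_literal(title)}" .'
            )
    return lines


if __name__ == "__main__":
    sample = (
        "Berlin is a city.<ref>{{cite web |url=https://example.org/a "
        "|title=About Berlin}}</ref>"
    )
    for line in citations_to_ntriples(
        "http://dbpedia.org/resource/Berlin", parse_citations(sample)
    ):
        print(line)
```

A real extractor would more likely work on the framework's parsed page representation rather than on raw text with a regular expression, and would use the project's own vocabulary for predicates.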

Impact

The triples created for each specific type of knowledge will be published for community use.

Warm up tasks

Preliminary experience with the Extraction Framework
#8
#9

Mentors

TBA

Keywords

Extraction framework, text parsing, RDF generation

I would be willing to join as a co-mentor.