Below are some ideas and tasks for volunteers to help improve DBpedia.
E1: Easier Download GUI
We used to have a widget that worked on the DataID in section 3, Datasets. It loaded the JSON file and rendered it. A widget like this would help many people identify the datasets they need. The data for it can now be retrieved from the SPARQL endpoint of the Databus, see the query here
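As a rough starting point for the widget's data layer, the sketch below sends a query to a SPARQL endpoint and flattens the standard SPARQL JSON result into plain rows that a GUI could render. The endpoint URL, the `dataid:` namespace and the query itself are assumptions made for illustration. The query the widget should really use is the one referenced above, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public Databus SPARQL endpoint; verify the URL before use.
DATABUS_SPARQL = "https://databus.dbpedia.org/sparql"

# Placeholder query for illustration only; vocabulary terms are assumptions.
QUERY = """
PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dataid:Dataset ;
           dct:title ?title .
} LIMIT 20
"""


def build_request(endpoint, query):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(payload):
    """Flatten a SPARQL 1.1 JSON result into a list of plain dicts."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # Unbound variables are simply absent from a binding.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


def fetch_rows(endpoint=DATABUS_SPARQL, query=QUERY):
    """Run the query against the endpoint and return the flattened rows."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=30) as resp:
        return rows_from_results(json.load(resp))
```

Keeping `rows_from_results` separate from the network call means the rendering side of the widget can be developed and tested against a saved JSON response, without hitting the endpoint.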
M1: Issue tracker fixing and migrating tests to the framework
Issues in the CI-Test category should be migrated to tests on the minidump. Other issues need to be tagged properly, or reviewed and closed. The process is poorly documented at the moment, which makes this task hard for now. We need to simplify it with better documentation.
M2: Restart the DBpedia Spotlight project
What we need is this:
- Spotlight needs training data from Wikipedia. Wikipedia dumps are parsed and then a model is created.
- These models should be created at least every three months for all Wikipedias (we can provide servers) and published on the Databus.
- We can modify the existing Spotlight Docker image to autoload the models from the Databus and deploy them. This image can then be deployed at all the chapters.
H1: General debugging of the DBpedia Extraction Framework (Scala/Java)
Go to extraction-framework/dump and run
mvn test
to see all thrown exceptions, then try to fix them.
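To get an overview of which exceptions occur most often before picking one to fix, a small helper like the following can count exception class names in the captured test output. This is only a sketch: it assumes the output was saved to a text file and that exceptions appear as Java-style class names ending in "Exception" or "Error". The function name is hypothetical.

```python
import re
from collections import Counter

# Matches (optionally package-qualified) names ending in Exception or Error,
# e.g. java.lang.NullPointerException. Matching is case-sensitive, so Maven's
# "[ERROR]" log prefix is not counted.
EXCEPTION_RE = re.compile(r"\b((?:\w+\.)*\w*(?:Exception|Error))\b")


def count_exceptions(log_text):
    """Return a Counter mapping exception class names to occurrence counts."""
    return Counter(EXCEPTION_RE.findall(log_text))
```

For example, after saving the output of the test run to a file, `count_exceptions(open("test.log").read()).most_common(10)` lists the ten most frequent exception names.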