@karankharecha, this is also in reply to Recommendation System for Databus and to your chatbot proposal for GSoC.
In principle, these are the right ideas. We were unable to make progress on this before because important meta information about the data was not available. We have since finished the essentials, so now is a better time to resume work on interfaces.
Let me sum up what there is currently:
- Monthly releases are stable and the Databus seems to be doing fine. The main idea here is that people publish any data under their accounts, e.g. their own or data they extracted from somewhere else such as DBpedia. Good-quality data gets picked up and fused into global.dbpedia.org
- While the basics cover files and users, we did not finish implementing additional statistics, so-called Mods, which are essentially third-party analysis plugins for the Databus. These would also have the role of semantic tagging and content indexing
- Mods are the systematic approach; however, we finished a preliminary form of semantic indexing, which we do manually for now and will automate later. This can be seen e.g. here, where in the next version we will also include the dataset reference; see another prototype deployment here with DNB, Musicbrainz, Geonames, etc. The basis for this is called PreFusion and the data is here
- The prefusion is an aggregation of several datasets, and it is partitioned by property into files. So if you are talking about a dashboard, the user might not be interested in all or any of the Databus datasets, but they would probably configure which part of the prefusion they would like to receive in terms of:
– subjects, e.g. all persons, companies and cities
– properties for the subjects selected above
– maybe the export vocabulary, e.g. export dbo:birthdate as foaf:birthdate or wd:Pxxx
- In addition, we would also encourage users to add more datasets to the prefusion, add/fix links, or map additional vocabularies, but this can also be linked later. Getting the information out there is the most important goal now.
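To make the dashboard configuration idea a bit more concrete, here is a minimal sketch of how a selection by subjects, properties and export vocabulary could be applied. Everything in it is an assumption for illustration: the `Selection` class, the `select` function, the in-memory triple tuples and the sample data are made up and are not part of the Databus or the prefusion code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterable, Iterator

Triple = tuple[str, str, str]  # (subject, property, value), hypothetical shape


@dataclass
class Selection:
    """Hypothetical dashboard configuration, not an existing Databus type."""
    subject_types: set[str]   # e.g. persons, companies, cities
    properties: set[str]      # properties wanted for those subjects
    export_map: dict[str, str] = field(default_factory=dict)  # property renames


def select(
    triples: Iterable[Triple],
    types_by_subject: dict[str, set[str]],
    selection: Selection,
) -> Iterator[Triple]:
    """Yield triples that match the selection, renaming exported properties."""
    for subject, prop, value in triples:
        if prop not in selection.properties:
            continue
        # Keep the triple only if the subject has at least one selected type.
        if not types_by_subject.get(subject, set()) & selection.subject_types:
            continue
        # Properties without an entry in export_map are passed through unchanged.
        yield (subject, selection.export_map.get(prop, prop), value)


# Made-up sample data, only to show the shape of the input.
triples = [
    ("ex:Alice", "dbo:birthdate", "1990-01-01"),
    ("ex:Alice", "dbo:employer", "ex:Acme"),
    ("ex:Springfield", "dbo:birthdate", "n/a"),
]
types_by_subject = {
    "ex:Alice": {"dbo:Person"},
    "ex:Springfield": {"dbo:City"},
}
selection = Selection(
    subject_types={"dbo:Person"},
    properties={"dbo:birthdate"},
    export_map={"dbo:birthdate": "foaf:birthdate"},
)

result = list(select(triples, types_by_subject, selection))
print(result)
```

Since the prefusion is already partitioned by property into files, a real implementation would probably pick the relevant files first and only then filter by subject; the sketch ignores that and works on a flat list.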
Besides the interactive dashboard, we could also start with a visualisation.
Side note: the previous version of the prefusion that you are seeing is loaded here: https://github.com/dbpedia/gfs/tree/master/gfs-data-browser into a read-only MongoDB:
mongo_url: "mongodb://readonly:gfs@88.99.242.78:8989/prefusion",
The second prototype has the newer data.