Hello everyone,
TL;DR: How would you generate an up-to-date “list of smartphones” or a list of other entities?
Motivation
DBpedia is awesome. However, what I feel is particularly missing is a browsing functionality for the average Joe that offers a filterable, sortable, shareable table of instances of a given type. For example, the-site.com/smartphones should print out a list of all smartphones there are, and it has to be fast, responsive, and mobile-friendly. Remotely similar to ExConQuer or maybe the table view of SemLens. It is sad to see how almost all of the linked projects are either offline or unmaintained.
I am working on implementing such an application, but there is a lot to do. The end product shall be a free, independent, modern and OS&OD product search engine or shopping search engine; the German Wikipedia even has an article on those (Produktsuchmaschine). The fact that we all still have to use proprietary product search engines like Amazon’s integrated one or Google Shopping to compare and filter product data, when there is something like DBpedia, is unbearable.
This project should fill the gap.
Data extraction problems
“Smartphones” is just an example entity. But it is a great example because, while there is an ontology class MobilePhone, there is not a single RDF subject linked to it.
There is a lot linked to the resource Smartphone, however, through various properties. In some cases, the information about a resource being a smartphone is lost in a more recent dump: in the 2016 DBpedia dump, Huawei Honor 8 has dbp:type dbr:Smartphone, while in the live version this information is missing. In general, the 2016 dump contains about five times more links to dbr:Smartphone.
Am I right to assume that to get the most complete possible “list of smartphones”, I need to UNION a lot of different properties (hypernym, rdf:type, dbp:type, form etc.) and multiple datasets (live + 2016)? Also, the dataset should always be up to date, so I probably need to integrate downloads.dbpedia.org/live/changesets somehow.
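A minimal sketch of what such a UNION query could look like, written as a small Python helper that only builds the SPARQL string. The helper name `build_union_query`, the variable `?item`, the `LIMIT`, and the exact predicate/object pairs (in particular `gold:hypernym` and `dbp:form` pointing at `dbr:Smartphone`) are assumptions for illustration, not a verified list of how DBpedia marks smartphones.

```python
# Hypothetical helper: builds one SPARQL query that UNIONs several
# "this resource is a smartphone" patterns. The pattern list below is an
# assumption and would need to be checked against the actual data.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dbo": "http://dbpedia.org/ontology/",
    "dbp": "http://dbpedia.org/property/",
    "dbr": "http://dbpedia.org/resource/",
    "gold": "http://purl.org/linguistics/gold/",
}

SMARTPHONE_PATTERNS = [
    ("rdf:type", "dbo:MobilePhone"),
    ("dbp:type", "dbr:Smartphone"),
    ("gold:hypernym", "dbr:Smartphone"),
    ("dbp:form", "dbr:Smartphone"),
]


def build_union_query(patterns, limit=1000):
    """Return a SELECT query with one UNION branch per (predicate, object) pair."""
    header = "\n".join(f"PREFIX {p}: <{iri}>" for p, iri in PREFIXES.items())
    branches = "\n  UNION\n".join(
        f"  {{ ?item {pred} {obj} . }}" for pred, obj in patterns
    )
    return (
        f"{header}\n"
        f"SELECT DISTINCT ?item WHERE {{\n{branches}\n}}\n"
        f"LIMIT {limit}"
    )


query = build_union_query(SMARTPHONE_PATTERNS)
print(query)
```

The same query string could then be sent to each dataset's SPARQL endpoint (live and 2016) and the results merged on the client side; `SELECT DISTINCT` only removes duplicates within a single endpoint's result.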
These issues seem to be present for all ontology classes, for example dbr:Cat (#122) vs dbo:Cat (#0). I hope I am not misinterpreting here.
Also, am I right to assume that data contained in Wikipedia list pages, such as the ones in the list of lists of lists (for example this list of smartphones), is not included in the data dumps?
Off-topic vision
It would be great if the users of the above-described website could add values, products, or even categories, or edit invalid values in place, either feeding directly into Wikidata or serving as an intermediate layer for it.
I will apply with this project to the Prototype Fund funding initiative next month, but I will develop it either way.
Thank you so much for taking the time. Please do not hold back with criticism or comments.
Philip