Local vs online dbpedia versions: different results are returned

Hi,

I’ve downloaded dbpedia locally (using the pre-release) as described here, but I assume that http://dbpedia.org uses the 2016-10 dataset.

Is that the reason why I get different results when I execute the same query on the online DBpedia and on my local DBpedia? For example, the following query retrieves

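PREFIX dbo: <http://dbpedia.org/ontology/>

# Count companies located in a country whose IRI contains "United_Kingdom"
SELECT (COUNT(DISTINCT ?item) AS ?count)
WHERE
{
    ?item a dbo:Company;
          dbo:location/dbo:country ?location.

    # STR() turns the IRI into a string so that REGEX is valid in standard SPARQL
    FILTER(REGEX(STR(?location), "United_Kingdom"))
}
LIMIT 10000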

1113 results in my local version but 2997 in the online version. More concretely, this result is retrieved in the online DBpedia but not in my local version.
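To narrow down which resources differ, the same graph pattern can be run against both endpoints and the results compared. The following is only a minimal sketch using the Python standard library. The local endpoint URL (Virtuoso's default port) and all helper names are assumptions for illustration, not part of any DBpedia tooling.

```python
import json
import urllib.parse
import urllib.request

# Public endpoint from the thread; the local URL is an assumption
# (Virtuoso's default port) and should be adjusted to your setup.
ONLINE = "http://dbpedia.org/sparql"
LOCAL = "http://localhost:8890/sparql"

PREFIX = "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
PATTERN = """{
  ?item a dbo:Company ;
        dbo:location/dbo:country ?location .
  FILTER(REGEX(STR(?location), "United_Kingdom"))
}"""

COUNT_QUERY = PREFIX + "SELECT (COUNT(DISTINCT ?item) AS ?n) WHERE " + PATTERN
ITEMS_QUERY = (
    PREFIX + "SELECT DISTINCT ?item WHERE " + PATTERN + " ORDER BY ?item LIMIT 10000"
)


def run_query(endpoint, query, timeout=60):
    """Send a SPARQL query via HTTP GET and return the parsed JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def parse_count(results, var="n"):
    """Extract the single integer from a COUNT result in SPARQL JSON format."""
    return int(results["results"]["bindings"][0][var]["value"])


def parse_items(results, var="item"):
    """Collect the values bound to `var` into a set."""
    return {row[var]["value"] for row in results["results"]["bindings"]}


def missing_locally(online_results, local_results):
    """Return the sorted items present online but absent from the local results."""
    return sorted(parse_items(online_results) - parse_items(local_results))
```

With both endpoints reachable, `parse_count(run_query(ONLINE, COUNT_QUERY))` and `parse_count(run_query(LOCAL, COUNT_QUERY))` give the two counts, and `missing_locally(run_query(ONLINE, ITEMS_QUERY), run_query(LOCAL, ITEMS_QUERY))` lists the companies that only the online version returns.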

I’ve also noticed that this is the repo where all the data is stored. I’ve seen that DBpedia is updated monthly, so I’d like to ask whether these are the files that are updated each month.

Thanks.

Hi @thanasissdr, yes, you are right about that. DBpedia Online contains more data than DBpedia.org.

  • Part of the missing companies could have been caused by [1]. Since the pre-release predates the fix, I hope that this will improve in the next (pre-)release, which is coming soon.
  • I added the missing resource to our new CI tests [2]. The issue is known and has been fixed [3] on another branch, but we have not had the resources to merge it into master for two years now, so we need support from the community to get this done.
  • Q1: At the moment, 2016-10 is loaded in the official endpoint.
  • Q2: No, this is only one folder out of many. We now use Databus groups and collections for releases. The raw monthly releases are on [4] (using them will potentially cause pain and errors :wink:), and error-filtered release groups show up on the DBpedia Databus account, e.g. the mapping-based extraction at https://databus.dbpedia.org/dbpedia/mappings/ (like the pre-release you are using). Due to its complexity, the latter is not monthly yet but will be soon. Once the new release workflow runs stably, we will try to document this better and introduce useful collections that everybody can use or fork and customize.

[1] https://github.com/dbpedia/extraction-framework/issues/595
[2] https://github.com/dbpedia/extraction-framework/commit/0c57c6dd252f6d81a5d6d6c8736b03d0b911d396
[3] https://github.com/dbpedia/extraction-framework/issues/582
[4] https://databus.dbpedia.org/marvin
