Is there a unified code for countries?

What would be the best strategy to locate countries on DBpedia? I see some countries have “dbp:iso3166code” or “dbp:countryCode”, but others don’t. There is also a parsed file on Databus that has both “iso3166-1-alpha-3” and “iso3166-1-alpha-2”. Is there a goal to unify things in the core based on any of these properties? Thanks.

Hi @p.zangeneh, what do you mean by ‘locate countries’? In the Semantic Web, unique identifiers are URIs. So, if you want the list of sovereign countries, you’ll just need to retrieve their URIs.

select distinct ?x where {
  ?x a yago:WikicatCountries , dbo:Country
} order by ?x

Result on dbpedia.org/sparql, an endpoint running DBpedia 2016-10.

However, if you are specifically looking for the ISO-3166 codes, Wikidata might have what you’re looking for.

Thanks, true. The problem, however, is that not all iso3166-1 countries are a dbo:Country or a yago:WikicatCountries.

I tried to match them the other day, and one way or the other there was some manual searching that I had to do. (I would add to this based on a quick look that wdt:P298 also does not match all the iso3166-1 countries, although I should look more into it, maybe I am missing something).

The file is here, where all iso3166-1 entries are matched to their MARC_country instances with skos:exactMatch, and to their DBpedia and Wikidata instances with skos:closeMatch. If we think this is useful, I can try putting it on Databus as well.

Can you please give me a few examples of such countries?

Of course. First, counting them: there are 179 distinct instances that are both dbo:Country and yago:WikicatCountries, whereas there are 241 instances of iso3166-1 countries. Admittedly, they are not all “countries”; some are “territories”, whether dependent or independent, but regardless they have alpha3 codes and are recognized in iso3166-1:

Consider Norfolk Island, which is a dbo:Country but not a yago:WikicatCountries, although it is a yago:WikicatCountriesInOceania.

The problem with just looking for dbo:Country is that it also brings in instances of ancient countries (7108 distinct instances).

That is what I meant in the first post by “what is the best strategy” to locate them. Would it be, for example:

(dbo:Country ⋂ yago:WikicatCountries) ⋃ (dbo:Country ⋂ yago:WikicatCountriesInOceania)

Although there are some that have the same problem but not in Oceania, take for example Turks and Caicos.
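The union-of-intersections idea can be tried out on toy data before running it against an endpoint. This is a minimal sketch, assuming the class memberships have already been fetched into Python sets. The three sets are hand-written from the examples in this thread (plus a placeholder `Country_A`), not query results.

```python
# Toy class memberships, hand-written for illustration (not query results).
dbo_country = {"Norfolk_Island", "Turks_and_Caicos_Islands", "Country_A"}
wikicat_countries = {"Country_A"}
wikicat_countries_in_oceania = {"Norfolk_Island"}

# (dbo:Country ∩ yago:WikicatCountries) ∪ (dbo:Country ∩ yago:WikicatCountriesInOceania)
selected = (dbo_country & wikicat_countries) | (dbo_country & wikicat_countries_in_oceania)

# Members of dbo:Country that the expression still fails to pick up.
missed = dbo_country - selected

print(sorted(selected))
print(sorted(missed))
```

On this toy data the expression recovers Norfolk Island but still misses Turks and Caicos, which is the gap described above.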

The best thing would be a one-to-one match of dbr entities to their iso3166-1-alpha3 codes, using the current “dbp:iso3166code” or some property that denotes “current iso3166 alpha3 code”.

Yeah, I agree that the definition of country is blurred. Personally, I’ve always ignored the territories and considered only the sovereign countries, which at the moment number 201, including disputed ones such as Taiwan and Kosovo. But if your target is specifically ISO-3166-1 codes, neither definition will match that.

That said, I’m afraid there is no simple way to get the exact list from DBpedia 2016-10, as the extraction of the codes seems to have failed sometimes.

Your only option seems to be a federated query across DBpedia and Wikidata, partly derived from the query in the link I posted above. While DBpedia provides links to Wikidata, the opposite doesn’t always happen.
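If a federated query is not available on a given endpoint, the same join can be done client-side. This is a rough sketch, assuming two result sets have already been downloaded: DBpedia `owl:sameAs` links to Wikidata, and Wikidata alpha-3 codes (`wdt:P298`, mentioned earlier in the thread). The function name `join_codes` and the sample rows are illustrative only, and the entity IDs are placeholders.

```python
def join_codes(dbpedia_to_wikidata, wikidata_to_alpha3):
    """Join DBpedia resources to alpha-3 codes through their Wikidata links.

    Returns (matched, unmatched): a dict resource -> code, and a sorted list
    of resources whose Wikidata entity has no code in the second mapping.
    """
    matched, unmatched = {}, []
    for resource, entity in dbpedia_to_wikidata.items():
        code = wikidata_to_alpha3.get(entity)
        if code is None:
            unmatched.append(resource)
        else:
            matched[resource] = code
    return matched, sorted(unmatched)


# Illustrative rows only; the entity IDs are placeholders, not real Wikidata IDs.
links = {
    "dbr:Norfolk_Island": "wd:Q_A",
    "dbr:Vanuatu": "wd:Q_B",
    "dbr:Hirshabelle_State": "wd:Q_C",
}
codes = {"wd:Q_A": "NFK", "wd:Q_B": "VUT"}

matched, unmatched = join_codes(links, codes)
print(matched)
print(unmatched)
```

Keeping the unmatched list around makes it easier to see which resources would still need manual matching.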

Agreed. My problem was a local one and is solved now by just matching manually :) but that would be a great thing to sort out, I guess. Thanks for your comments.

@p.zangeneh if you find a more or less official source, you could create a DBpedia extension. Ideally you would take this from a stable Linked Data source, e.g. Geonames would be suitable.

In the future, we will merge these properties in the new kind of merged Knowledge Graphs we are producing:
https://global.dbpedia.org/?s=http://dbpedia.org/resource/Norfolk_Island
Those are multi-source fusions.

Thanks a lot!

Here is what I’m trying. For some reason, US and others aren’t included. Also on the ‘new’ endpoint it returns nothing, but returns a few on dbpedia.org/sparql

PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?resource ?codea ?coden ?code0 ?countryCode WHERE
{
    ?resource rdf:type dbo:Place .
    OPTIONAL { ?resource dbp:iso31661Alpha ?codea } .
    OPTIONAL { ?resource dbp:iso31661Num ?coden } .
    OPTIONAL { ?resource dbp:iso3166code ?code0 } .
    OPTIONAL { ?resource dbp:countryCode ?countryCode } 

    FILTER ( ?countryCode || ?codea || ?coden || ?code0 )
}
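One thing that may be worth checking in the query above is the `FILTER`: in SPARQL, taking the boolean value of an unbound variable is an error rather than `false`, so `BOUND(?codea) || BOUND(?coden) || ...` is the more explicit way to say "at least one code is present". The same check can also be done after the fact. This is a small sketch, assuming the results were fetched in the standard SPARQL JSON results format. `first_code` is a hypothetical helper and the sample binding is invented for illustration.

```python
# Variable names from the query above, in order of preference (an assumption).
CODE_VARS = ("codea", "code0", "countryCode", "coden")


def first_code(binding, variables=CODE_VARS):
    """Return (variable, value) for the first bound, non-empty code, else None."""
    for var in variables:
        cell = binding.get(var)
        if cell and cell.get("value", "").strip():
            return var, cell["value"].strip()
    return None


# Invented binding in the SPARQL JSON results shape; only ?code0 is bound.
row = {
    "resource": {"type": "uri", "value": "http://dbpedia.org/resource/Example"},
    "code0": {"type": "literal", "value": "XX "},
}
print(first_code(row))
```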

I was hoping we could get these codes from somewhere else eventually. Some linked data source.

@kurzum how would that work or what would it look like?

In principle, we can just add any dataset from the Databus to the store by adding it to https://databus.dbpedia.org/dbpedia/collections/latest-core
We are preparing the switch. So adding <dbpedia> dbo:langCode "code" would work, but it is not sustainable. We are doing this for LHD, for example.

However, there is a better system:

  1. Identify a good source. Ideally, directly from the horse’s mouth, i.e. the source, but a good proxy would work. That said, I would assume that CLDR is such a good proxy as you are an active, sustainable project that builds a curation wrapper around ISO language codes and also country codes. Also lexvo.org might work.
  2. Next year, we will start FAIR Linked Data. Anybody who has Linked Data can work with the DBpedia Platform to improve, quality-control and link it better to the existing linked data cloud.
  3. From there we can decide: a) people can query additional data from external sources, while being sure that the connection DBpedia->external is well maintained, or b) we copy (in the sense of cache) some data into http://dbpedia.org/sparql via the collection.

Nice post!

Consider Norfolk Island, which is a dbo:Country but not a yago:WikicatCountries, although it is a yago:WikicatCountriesInOceania.

Hi again. I assumed I could somehow get:

Vanuatu dbo:wikiPageRedirects dbr:ISO_3166-1:VU

… and somehow use a regex to extract the VU, but it did not quite work.

Here’s what I am trying (I don’t think this quite worked).

PREFIX  dbo:  <http://dbpedia.org/ontology/>
PREFIX  dbr:  <http://dbpedia.org/resource/>
PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select *
WHERE
{
    ?region  rdf:type             dbo:Place .
    ?region dbo:wikiPageWikiLink dbr:ISO_3166-1 .

    ?region dbo:wikiPageRedirects ?redir

    FILTER (
        regex(?redir, "^.*3166-1:[A-Z][A-Z]$")
    )
}
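Two things may be getting in the way here. As far as I can tell, DBpedia redirect triples point from the redirect page to its target (`dbr:ISO_3166-1:VU dbo:wikiPageRedirects dbr:Vanuatu`), which is the opposite direction from the pattern above. Also, `regex` expects a string, so an IRI would need `str(?redir)` first. Once redirect/target pairs are in hand, extracting the code is straightforward. This is a minimal sketch, assuming the pairs have already been fetched. The function name and the sample pairs are illustrative, and the second pair is a hypothetical non-ISO redirect.

```python
import re

# Matches resource IRIs such as http://dbpedia.org/resource/ISO_3166-1:VU
ISO_REDIRECT = re.compile(r"/ISO_3166-1:([A-Z]{2})$")


def codes_from_redirects(pairs):
    """Map each redirect target to its alpha-2 code.

    `pairs` is an iterable of (redirect_iri, target_iri). Pairs whose redirect
    is not of the ISO_3166-1:XX form are ignored.
    """
    result = {}
    for redirect, target in pairs:
        match = ISO_REDIRECT.search(redirect)
        if match:
            result[target] = match.group(1)
    return result


pairs = [
    ("http://dbpedia.org/resource/ISO_3166-1:VU",
     "http://dbpedia.org/resource/Vanuatu"),
    # Hypothetical redirect that carries no ISO code.
    ("http://dbpedia.org/resource/Vanuatu_Republic",
     "http://dbpedia.org/resource/Vanuatu"),
]
print(codes_from_redirects(pairs))
```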

I even tried using the ccTLD!

One issue I ran into is that, for example, https://dbpedia.org/resource/Hirshabelle_State is a Country, has CCTLD of .so, and has ISO 3166-1 code of SO. But it’s just one of the states, and I want to reject that entry in favor of https://dbpedia.org/resource/Somalia - made me think I might need to use a ranking system of some kind.
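A ranking along those lines could be quite simple. This is a rough sketch, assuming each candidate for a given ISO code has been annotated with a couple of signals beforehand. The signals (`iso_redirect_target`, `inbound_links`), their ordering, and the sample numbers are all made up for illustration and would need checking against real data.

```python
def pick_best(candidates):
    """Return the highest-scoring candidate for one ISO code.

    Each candidate is a dict with:
      resource            - the DBpedia resource name
      iso_redirect_target - True if an ISO_3166-1:XX redirect points at it
      inbound_links       - number of pages linking to it (rough prominence proxy)
    """
    def score(candidate):
        # Being the redirect target dominates; link count only breaks ties.
        return (1 if candidate["iso_redirect_target"] else 0,
                candidate["inbound_links"])

    return max(candidates, key=score)


# Made-up signal values for the two resources mentioned above.
candidates = [
    {"resource": "Hirshabelle_State", "iso_redirect_target": False, "inbound_links": 40},
    {"resource": "Somalia", "iso_redirect_target": True, "inbound_links": 9000},
]
best = pick_best(candidates)
print(best["resource"])
```

Using a tuple as the score keeps the priority order explicit, so a state with many inbound links would still lose to the resource that the ISO redirect points at.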