Duplicate resource names with underscore (in Mappings Extraction)

DBpedia instance_types_en.ttl contains duplicate entries that are distinguished only by a __[number] suffix. This occurs both in the 2016_04 dump and in live DBpedia.
E.g.
http://dbpedia.org/resource/Ada_Lovelace
http://dbpedia.org/resource/Ada_Lovelace__1

Both resources are connected via dbo:personFunction; for other resources, the connecting property is different.

The 2016_04 dump additionally contains
http://dbpedia.org/resource/Ada_Lovelace__2 … 12

What are the semantics of these possibly duplicate entries?

Hi @kliegr.

Entities ending with __[number] are not duplicates; they conform to the knowledge graph's modelling choices. Most of them are instances of dbo:CareerStation (see examples), which represent time periods and therefore provide statements that hold true only within a given time span.

In your case, I agree that creating a secondary entity is not necessary. However, other metadata could potentially be attached to the person function “Countess of Lovelace” rather than adding predicates directly to the entity dbr:Ada_Lovelace.
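If it helps when working with the dump, here is a minimal sketch of how the secondary entities could be related back to their base resource purely by name. It assumes the suffix is always two underscores followed by digits, as in the Ada_Lovelace__1 example; the function names `split_resource` and `group_by_base` are made up for illustration and are not part of any DBpedia tooling.

```python
import re
from collections import defaultdict

# Assumption: secondary entities end in two underscores followed by digits,
# as in the Ada_Lovelace__1 example from this thread.
SUFFIX_RE = re.compile(r"^(?P<base>.+?)__(?P<index>\d+)$")


def split_resource(iri):
    """Return (base_iri, index) for a suffixed IRI, or (iri, None) otherwise."""
    match = SUFFIX_RE.match(iri)
    if match is None:
        return iri, None
    return match.group("base"), int(match.group("index"))


def group_by_base(iris):
    """Map each base IRI to the sorted suffix indices seen for it."""
    groups = defaultdict(list)
    for iri in iris:
        base, index = split_resource(iri)
        if index is None:
            groups[base]  # register the base resource even without suffixes
        else:
            groups[base].append(index)
    return {base: sorted(indices) for base, indices in groups.items()}
```

Note that this only looks at the IRI string. Whether a suffixed entity really is a dbo:CareerStation or a person function still has to be checked against its rdf:type in the data.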

Thanks for the explanation!

Tomas

@tsoru thanks for the explanation as well. I added it to the artifact documentation, so it will be displayed on the artifact page with the next monthly release: https://databus.dbpedia.org/dbpedia/mappings/instance-types

I wasn’t even aware these existed. I wouldn’t delete them, since they seem useful, but if there is a good reason for it, we could split them into a third content variant like _function_ besides _transitive_.
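As a rough idea of what such a split could look like, here is a sketch that partitions triple lines by whether the subject carries the numeric suffix. It assumes one triple per line with the subject IRI in angle brackets at the start of the line; `partition_triples` is a hypothetical helper, not how the extraction framework actually produces content variants.

```python
import re

# Assumption: one triple per line, subject IRI first, in angle brackets.
# A subject counts as "suffixed" if it ends in two underscores plus digits.
SUFFIXED_SUBJECT_RE = re.compile(r"^<[^>]*__\d+>\s")


def partition_triples(lines):
    """Split triple lines into (base_entity_lines, suffixed_entity_lines)."""
    base, suffixed = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if SUFFIXED_SUBJECT_RE.match(stripped):
            suffixed.append(stripped)
        else:
            base.append(stripped)
    return base, suffixed
```

The second list would then be the candidate content for a separate variant, while the first stays as it is today.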

Topic was moved to ‘Data Quality’ category.