DBpedia Archivo - Call to improve the web of ontologies

Dear all,
We are proud to announce DBpedia Archivo (https://archivo.dbpedia.org), an augmented ontology archive and interface to implement FAIRer ontologies. Each ontology is rated with up to 4 stars measuring basic FAIR features. We discovered 890 ontologies, which reach an average of 1.95 out of 4 stars. Many of them have no or unclear licenses, and many have issues with retrieval and parsing.

Community action on individual ontologies

We would like to call on all ontology maintainers and consumers to help us increase the average star rating of the web of ontologies by fixing and improving its ontologies. You can easily check an ontology at https://archivo.dbpedia.org/info. If you are an ontology maintainer, just release a patched version - Archivo will automatically pick it up within 8 hours. If you are a user of an ontology and want your consumed data to become FAIRer, please inform the ontology maintainer about the issues found with Archivo.

The star rating is very basic and only requires fixing small things. However, the impact on technical and legal usability can be immense.

Community action on all ontologies (quality, FAIRness, conformity)

Archivo is extensible and allows contributions to give consumers a central place to encode their requirements. We envision fostering adherence to standards and strengthening incentives for publishers to build a better (FAIRer) web of ontologies.

  1. SHACL (https://www.w3.org/TR/shacl/, co-edited by DBpedia’s CTO D. Kontokostas) enables easy testing of ontologies. Archivo offers free SHACL continuous integration testing for ontologies. Anyone can implement their SHACL tests and add them to the SHACL library on GitHub. We believe that there are many synergies, e.g. SHACL tests for your ontology are helpful for others as well.
  2. We are looking for ontology experts to join DBpedia and discuss further validation (e.g. stars) to increase FAIRness and quality of ontologies. We are forming a steering committee and also a PC for the upcoming Vocarnival at SEMANTiCS 2021. Please message hellmann@informatik.uni-leipzig.de if you would like to join. We would like to extend the Archivo platform with relevant visualisations, tests, editing aides, mapping management tools and quality checks.
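A minimal sketch of what such a community SHACL test could look like (the shape name and the `ex:` namespace are hypothetical, not taken from the Archivo library):

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix ex:  <http://example.org/shapes#> .

# Hypothetical shape: every owl:Ontology should declare a license,
# one of the small fixes with a large impact on legal usability.
ex:LicenseShape
    a sh:NodeShape ;
    sh:targetClass owl:Ontology ;
    sh:property [
        sh:path dct:license ;
        sh:minCount 1 ;
        sh:message "Ontology header is missing a dcterms:license." ;
    ] .
```

Because such shapes target generic constructs like the ontology header, a test written for one ontology can often validate any other.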

How does Archivo work?

Each week, Archivo runs several discovery algorithms to scan for new ontologies. Once an ontology is discovered, Archivo checks it every 8 hours. When changes are detected, Archivo downloads, rates, and archives the latest snapshot persistently on the DBpedia Databus.

Archivo’s mission

Archivo’s mission is to improve the FAIRness (findability, accessibility, interoperability, and reusability) of all available ontologies on the Semantic Web. Archivo is not a guideline: it is fully automated and machine-readable, and it enforces interoperability with its star rating.

  • Ontology developers can implement against Archivo until they reach more stars. The stars and tests are designed to guarantee the interoperability and fitness of the ontology.

  • Ontology users can better find, access and re-use ontologies. Snapshots are persisted in case the original is no longer reachable, adding a layer of reliability to the decentralized web of ontologies.

Let’s all join together to make the web of ontologies more reliable and stable,

Johannes Frey, Denis Streitmatter, Fabian Götz, Sebastian Hellmann and Natanael Arndt

Paper: https://svn.aksw.org/papers/2020/semantics_archivo/public.pdf

I grabbed all 850 a couple of months ago, and it opened my horizons to what ontologies are out there. Previously I had >500; now I have double that number :slight_smile:

(Somehow it hadn’t occurred to me to download all 730 from LOV, https://lov.linkeddata.es/.)

I made a script to rename them to somewhat more descriptive names. And I’ve dispatched 150 into topical folders (leaving 700 that I may know to some extent but haven’t found the time to sort).

I guess my point is: an important task in enabling comprehension of these ontologies is to work on tags/topics, or some better way of describing them (e.g. topic maps?).

The script looks like this:

curl -s http://akswnc7.informatik.uni-leipzig.de/dstreitmatter/archivo/purl.org/ontology--cco--core/2020.06.10-204639/ontology--cco--core_type=parsed.ttl \
	-o cco.ttl
curl -s http://akswnc7.informatik.uni-leipzig.de/dstreitmatter/archivo/purl.org/ontology--cco--mappings/2020.06.10-204739/ontology--cco--mappings_type=parsed.ttl \
	-o ontology-cco-mappings.ttl

But then I looked inside and renamed them as follows: the first part is the vann:preferredNamespacePrefix, and the rest is a descriptive name that lets me know whether I’m interested and how to file it properly in the future:

  • cco-Cognitive Characteristics.ttl
  • ccom-Cognitive Characteristics-mapping-PRV.ttl
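The renaming step could be sketched like this (a rough, hypothetical version: it greps for the vann:preferredNamespacePrefix literal in the downloaded Turtle, which assumes the property appears with the `vann:` prefix rather than as a full IRI, and the descriptive part of the name is still supplied by hand):

```shell
#!/bin/sh
# Extract the vann:preferredNamespacePrefix literal from a Turtle file.
# Assumes a line such as:  vann:preferredNamespacePrefix "cco" .
extract_prefix() {
  grep -o 'preferredNamespacePrefix[^"]*"[^"]*"' "$1" \
    | head -n 1 \
    | sed 's/.*"\(.*\)"/\1/'
}

# Demo on a sample file standing in for a downloaded ontology
printf '%s\n' '<x> vann:preferredNamespacePrefix "cco" .' > sample.ttl
prefix=$(extract_prefix sample.ttl)

# Rename using the prefix plus a hand-picked descriptive label
mv sample.ttl "${prefix}-Cognitive Characteristics.ttl"
ls "${prefix}-Cognitive Characteristics.ttl"
```

A Turtle-aware tool (e.g. a SPARQL query over the parsed file) would be more robust than grep, but for quickly triaging hundreds of files this kind of one-liner is usually enough.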