Inconsistency in annotations (DBpedia Spotlight)

I have been trying to annotate text using DBpedia Spotlight. I see a couple of places where entities are either misidentified or missed entirely. Could someone help me out with this issue? I created an issue on GitHub (https://github.com/dbpedia-spotlight/dbpedia-spotlight-model/issues/42), but no one seems to reply to the issues anymore.

I would be really glad if someone could at least point me in the right direction.

Hi @nawabhussain,

  1. Why do German DBpedia IRIs not resolve? The German chapter has server issues at the moment; AFAIK there is a chance that it will be working again by the end of September. In the meantime, you can retrieve the English IRI or the DBpedia global ID with the help of the Global IRI Resolution Service (http://dev.dbpedia.org/Global_IRI_Resolution_Service) and then look up the data there (a sketch of such a lookup follows after this list).

  2. Why are the results different depending on the input text? Spotlight uses language models and is trained to be applied to running text (Wikipedia-like text, in fact), although the results for single words or a collection of concepts can also be quite good. So this is not a contradiction; it is just normal behavior for probabilistic approaches. You can try to get better results by filtering on specific DBpedia classes (e.g. Company, Person, ...) to improve the precision (see the second sketch below).
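
For reference, here is a minimal sketch of that global-ID lookup in Python. The endpoint path (`same-thing/lookup`) and the `global`/`locals` fields in the response are my assumptions based on the documentation page linked above, so please verify them there:

```python
import requests

# Assumed lookup endpoint of the DBpedia Global IRI Resolution Service;
# see http://dev.dbpedia.org/Global_IRI_Resolution_Service for the exact API.
LOOKUP = "https://global.dbpedia.org/same-thing/lookup/"

def resolve(iri: str) -> dict:
    """Fetch the resolution record for a local DBpedia IRI."""
    resp = requests.get(LOOKUP, params={"uri": iri}, timeout=10)
    resp.raise_for_status()
    return resp.json()

record = resolve("http://de.dbpedia.org/resource/Berlin")
print(record.get("global"))  # the DBpedia global ID (assumed field name)
# Pick out the English-chapter IRI among the equivalent local IRIs.
for local in record.get("locals", []):
    if local.startswith("http://dbpedia.org/resource/"):
        print(local)
```

You can then query the English endpoint with that IRI while the German chapter is down.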
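
And here is a sketch of class filtering against the hosted Spotlight REST endpoint via its `types` parameter. The example text and the chosen classes are just illustrations, and the endpoint URL assumes the public hosted service rather than a local installation:

```python
import requests

# Public hosted DBpedia Spotlight endpoint (adjust if you run your own server).
ANNOTATE = "https://api.dbpedia-spotlight.org/en/annotate"

def annotate(text: str, types: str = "", confidence: float = 0.5) -> list:
    """Return the resources Spotlight annotates in `text`."""
    resp = requests.get(
        ANNOTATE,
        params={"text": text, "confidence": confidence, "types": types},
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    # Spotlight omits "Resources" when nothing is annotated.
    return resp.json().get("Resources", [])

text = "Angela Merkel met executives from Siemens in Berlin."
# Unfiltered run: all detected entities.
for r in annotate(text):
    print(r["@URI"], r["@types"])
# Filtered run: only entities typed as Person or Company survive.
for r in annotate(text, types="DBpedia:Person,DBpedia:Company"):
    print(r["@URI"])
```

Note that filtering is a trade-off: entities outside the listed classes are dropped rather than annotated differently.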

Thank you for replying to my question. I understand that probabilistic approaches can sometimes lead to unexpected results, but what I was wondering is whether there is any way to improve the results (other than specifying DBpedia classes, which just results in some of the entities not being annotated)?