Greetings all,
We need to use the DBpedia Spotlight model to do some batch inferencing on ~5M documents. As such, the Docker image and a REST API endpoint won’t be useful here because we don’t need a prediction service. We need the underlying model code (I believe it is written in Scala) to execute the predictions, and we will likely be using a distributed computing framework like Spark to process the 5M documents.
It is true that we could stand up the REST API and send the 5M documents asynchronously through the annotation endpoint, but that would take a long time and we would be dealing with latency and network bandwidth issues.
Any pointers on how to get started with this? I see the underlying model code is in this GitHub repo: https://github.com/dbpedia-spotlight/dbpedia-spotlight-model, but there are no instructions on how to run the model outside of setting up the Docker service.
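To give a sense of the batch shape we have in mind, it would look roughly like the sketch below. This is purely illustrative: loadSpotlightModel and annotateDocument are placeholders for whatever entry points the model code actually exposes, which is exactly what I'm asking about.

```scala
import org.apache.spark.sql.SparkSession

object SpotlightBatchJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("spotlight-batch").getOrCreate()
    import spark.implicits._

    // ~5M documents, one text per line
    val docs = spark.read.textFile("/data/documents")

    val annotated = docs.mapPartitions { texts =>
      // Load the model once per partition; initialization is expensive,
      // so we don't want to pay that cost per document
      val model = loadSpotlightModel("/mnt/models/en")
      texts.map(text => annotateDocument(model, text))
    }

    annotated.write.text("/data/annotations")
  }

  // Placeholders for the pieces we hope to build on top of the
  // dbpedia-spotlight-model code instead of calling a REST endpoint.
  def loadSpotlightModel(path: String): AnyRef = ???
  def annotateDocument(model: AnyRef, text: String): String = ???
}
```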
Thanks,
Riley
Hi @rileyhun
Thanks for the post. The project you describe sounds really interesting, and yes, the underlying model code is in the repo you posted (https://github.com/dbpedia-spotlight/dbpedia-spotlight-model).
For example, to run the model with IntelliJ:
- Clone the DBpedia-spotlight-model (git clone …)
- Open the project on IntelliJ (File → Open)
- Create a configuration to run DBpedia Spotlight (Run → Edit configurations)
3.1 Create a new Application configuration
3.2 Set the fields as shown in the following image (replacing the paths with your own):
The D:\git\model-test\tr path is the path of the extracted model, and http://0.0.0.0:80/rest is the URL of the service.
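In case the screenshot does not come through, the run configuration fields are roughly the following (the main class and module names here are from memory, so please double-check them against the sources):

```
Main class:         org.dbpedia.spotlight.web.rest.Server
Program arguments:  D:\git\model-test\tr http://0.0.0.0:80/rest
Use classpath of:   the module that contains the Server class
Working directory:  the repository root
```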
I hope this information helps. Have a great day.
My best regards
Julio
Hello @JulioNoe,
Thanks for your prompt response. The example you showed above still runs the model through a web server. I basically just want to extract the model code and execute the annotation at the code level, not through a REST API endpoint. So, is there a way to just pull in the model code and call the annotation method/function directly?
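For concreteness, what I am hoping is possible is something roughly like the sketch below, based on my quick read of SpotlightModel and the REST wiring in the repo. The class and method names are my best guesses from the sources and may well be off, which is exactly what I'd like to confirm.

```scala
import java.io.File
import scala.collection.JavaConverters._

import org.dbpedia.spotlight.db.SpotlightModel
import org.dbpedia.spotlight.model.{Factory, SpotlightConfiguration, SpotterConfiguration, Text}

object DirectAnnotation {
  def main(args: Array[String]): Unit = {
    // Load the same extracted model folder that the Docker/REST setup points at
    val model = SpotlightModel.fromFolder(new File("/path/to/extracted/model"))

    // The loaded model seems to bundle a tokenizer plus maps of spotters and
    // disambiguators keyed by policy (exact keys to be checked against the sources)
    val spotter       = model.spotters.get(SpotterConfiguration.SpotterPolicy.Default)
    val disambiguator = model.disambiguators.get(SpotlightConfiguration.DisambiguationPolicy.Default)

    val text = new Text("Berlin is the capital of Germany.")
    model.tokenizer.tokenizeMaybe(text)                // tokenize in place
    val spots     = spotter.extract(text)              // candidate surface forms
    val paragraph = Factory.paragraph().fromJ(spots)   // wrap the spots for the disambiguator
    val entities  = disambiguator.disambiguate(paragraph)

    entities.asScala.foreach { occ =>
      println(s"${occ.surfaceForm.name} -> ${occ.resource.uri}")
    }
  }
}
```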