How to run underlying model code?

Greetings all,

We need to use the dbpedia spolight model to do some batch inferencing on ~5M documents. As such, the docker file and a REST API endpoint won’t be useful here because we don’t need a prediction service. We need the underlying model code (I believe it is written in scala) to execute the predictions and we will likely be using a distributed computing framework like Spark to process the 5M documents.

It is true that we could create a REST API and send the 5M documents asynchronously through the annotation endpoint, but that would take a long time and we would be dealing with latency and networking bandwidth issues.

Any help on how to get started on this? I see the underlying model code is in this github repo: But there are no instructions on how to run the model outside of setting up the docker service.



Hi @rileyhun

Thanks for the post. The project described sounds really interesting. And yes, the underlying model code is in the posted repo (

For example, to run the model with IntelliJ:

  1. Clone the DBpedia-spotlight-model (git clone …)
  2. Open the project on IntelliJ (File → Open)
  3. Create a configuration to run DBpedia Spotlight (Run → Edit configurations)
    3.1 Create a new Application configuration
    3.2 Set the fields as the following image (changing the paths with your own):

The D:\git\model-test\tr is the path of the extracted model and is the URL of the service. I hope this information helps. Have a great day.

My best regards


Hello @JulioNoe,

Thanks for your prompt response. The example you showed above is still running the model through a web server. I basically just want to extract the model code and be able to execute the annotation from the code-level, not as a REST API endpoint. So basically, is there a way to just extract the model code, and run the annotation method/function?