Bringing together LLMs and RDF Knowledge Graphs - GSoC 2024

Goal:

Improve the capabilities of Large Language Models (LLMs such as Gemini or ChatGPT) in interfacing with RDF data and RDF Knowledge Graphs. As a major step toward this, the aim is to build services that allow LLMs to look up concepts in ontologies, using DBpedia Archivo as a flexible source of ontologies.

Tasks:

  • improve an existing term search API based on DBpedia Lookup and DBpedia Archivo so that it can be integrated via LangChain/ChatGPT/Claude plugins to search for ontologies, classes, and properties
  • load the ontologies into a vector database so that LLMs can find relevant information in vector space
  • integrate popularity information as a means to rank candidates (LOD stats, VoID stats generated e.g. from a SPARQL endpoint)
  • test, measure, and compare the quality/performance (improvement) with the llm-kg-bench framework by extending it and adding evaluation test cases (converting a factsheet or CSV into a KG, writing meaningful queries against a SPARQL endpoint, mapping ontologies or datasets)
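
To make the second and third tasks more concrete, here is a minimal sketch of how vector similarity and popularity could be combined to rank candidate terms. It is only an illustration under stated assumptions: the in-memory `TERMS` index, the usage counts, the bag-of-words `embed` function, and the `alpha` weight are all hypothetical stand-ins. A real service would presumably use a proper embedding model, a vector database, and statistics taken from LOD or VoID data.

```python
import math
from collections import Counter

# Hypothetical toy index: term IRI -> (descriptive text, usage count).
# The counts are made-up placeholders, not real LOD or VoID statistics.
TERMS = {
    "http://xmlns.com/foaf/0.1/Person": ("person a human being", 1000),
    "http://example.org/onto#Persona": ("persona a fictional person profile", 10),
    "http://xmlns.com/foaf/0.1/name": ("name of a thing", 5000),
}


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, terms=TERMS, alpha=0.8, top_k=3):
    """Rank terms by alpha * similarity + (1 - alpha) * popularity.

    Popularity is the log-scaled usage count, normalised to [0, 1].
    Terms with zero similarity to the query are dropped.
    """
    q = embed(query)
    max_log = max((math.log1p(c) for _, c in terms.values()), default=0.0)
    scored = []
    for iri, (text, count) in terms.items():
        sim = cosine(q, embed(text))
        if sim == 0.0:
            continue
        pop = math.log1p(count) / max_log if max_log else 0.0
        scored.append((alpha * sim + (1 - alpha) * pop, iri))
    scored.sort(reverse=True)
    return [iri for _, iri in scored[:top_k]]


if __name__ == "__main__":
    for iri in search("person"):
        print(iri)
```

The log scaling keeps a few very popular terms from dominating the ranking, and `alpha` controls how much weight similarity gets relative to popularity. Both choices are assumptions made for this sketch and would need to be evaluated, for example with the llm-kg-bench test cases.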

Project size

  • 175 hours or 350 hours

Mentors

Johannes Frey, Dr. Mahdi Hedayat Mahmoudi, Hannes Hartmann

Hello,

I'm highly interested in the GSoC 2024 project focusing on Semantic Web and NLP for DBpedia. With recent research work in NLP and hands-on experience, I'm eager to contribute to bringing together LLMs and RDF Knowledge Graphs. I'm excited about the opportunity to collaborate and make a meaningful impact.

Best Regards
Tsirindanis Chrysovalantis
