Containerized Installers for Data-centric Services using Databus Collections

janfo · January 25, 2024, 1:04pm

Project Description:

This GSoC project aims to develop containerized installers for data-centric services utilizing Databus collections. Databus collections provide a framework for managing and sharing datasets across distributed systems, offering versioning, replication, and access control features.

One exemplary application of this project is integrating Databus collections with the Virtuoso Open-Source triple store, a widely used RDF service. This integration enables seamless deployment and loading of RDF datasets into Virtuoso instances within containerized environments.

Additionally, the project entails designing and documenting best practices for deploying other Databus-driven services and implementing more deployment-ready containers. These containers will encapsulate the components needed to pull data from Databus collections and install it with the associated services, ensuring ease of deployment and scalability.
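As a rough illustration of what such a container's entrypoint might do, here is a minimal Python sketch. It rests on several assumptions that are not stated in this post: that a collection URI returns its underlying SPARQL query when requested with an `Accept: text/sparql` header, that the Databus SPARQL endpoint lives at the URL shown, and that the query's first result variable holds the file download URLs. The function names are hypothetical. The last helper emits commands for Virtuoso's bulk loader (`ld_dir`, `rdf_loader_run`, `checkpoint`) to be run through `isql`.

```python
import json
import urllib.parse
import urllib.request

# Assumption: SPARQL endpoint of the Databus instance; adjust for your deployment.
DATABUS_SPARQL_ENDPOINT = "https://databus.dbpedia.org/sparql"


def fetch_collection_query(collection_uri: str) -> str:
    """Ask the Databus for the SPARQL query behind a collection.

    Assumption: the collection URI answers 'Accept: text/sparql' with the query text.
    """
    request = urllib.request.Request(collection_uri, headers={"Accept": "text/sparql"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def run_select(query: str, endpoint: str = DATABUS_SPARQL_ENDPOINT) -> dict:
    """Run a SELECT query via the SPARQL protocol and return the parsed JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,  # sending a body makes this a form-encoded POST
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_file_urls(results: dict) -> list[str]:
    """Collect the values bound to the first result variable of SPARQL JSON results."""
    variable = results["head"]["vars"][0]
    return [
        row[variable]["value"]
        for row in results["results"]["bindings"]
        if variable in row
    ]


def build_bulk_load_script(data_dir: str, graph_iri: str, file_mask: str = "*.ttl") -> str:
    """Build isql commands that register files with Virtuoso's bulk loader and load them."""
    return "\n".join(
        [
            f"ld_dir('{data_dir}', '{file_mask}', '{graph_iri}');",
            "rdf_loader_run();",
            "checkpoint;",
        ]
    )
```

An entrypoint could chain these as `extract_file_urls(run_select(fetch_collection_query(uri)))`, download each file into a directory that Virtuoso is allowed to read (its `DirsAllowed` setting), and then pipe the output of `build_bulk_load_script` into `isql`.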

Furthermore, the project may explore integration options with the Databus frontend or even metadata, enhancing discoverability and interoperability of the deployed services within the Databus ecosystem.

Key Objectives:

Integrate Databus collections with the Virtuoso Open-Source Triple Store as a first use case. This can be done by building upon the Virtuoso Quickstarter repository (dbpedia/virtuoso-sparql-endpoint-quickstart on GitHub, which creates a Docker image with Virtuoso preloaded with the latest DBpedia dataset).
Design and document best practices for deploying Databus-driven services.
Implement 4-5 deployment-ready containers for data-centric services utilizing Databus collections. Services could, for instance, be chosen from the list of Semantic Web applications and services in semantalytics/awesome-semantic-web on GitHub, a curated list of semantic web and linked data resources.
Explore integration possibilities with the Databus frontend or metadata systems for enhanced functionality and interoperability.

Expected Outcome:

A well-documented Databus-driven Virtuoso Quickstarter container that focuses on ease of deployment.
Documentation outlining best practices and guidelines for implementing, deploying and managing Databus-driven services.
4-5 containerized installers for deploying data-centric services leveraging Databus collections.
A design proposal for integrating these services with the Databus frontend.
[Optional] Integration with the Databus frontend or even metadata for improved discoverability and usability.

Skills Required:

A good understanding of SPARQL, RDF and other Semantic Web technologies.
Some proficiency in containerization technologies (e.g., Docker, Kubernetes).
Knowledge of the core concepts of the DBpedia Databus (see the Overview page of the Databus Gitbook).
Good documentation and communication skills.

Project Size:

Estimated at between 90 and 180 hours, depending on expertise and the number of tasks tackled.

kurzum · February 1, 2024, 8:33am

One main thing here, which is pretty cool, is RDF. So people could pick any RDF dataset from the bus or assemble their dataset and then deploy RDF applications based on the dataset via Docker.

ronitblenz · February 4, 2024, 6:37pm

Hey Janforberg,

This is Ronit Banerjee. I was part of GSoC 2023 at DBpedia as a mentee on Edgard Marx’s project, which dealt with Java Spring Boot, Maven, Docker and documentation.
Your project idea in #gsoc2024 caught my eye. I would love to volunteer as a mentor on your project, as I have a good understanding of the tech, how this organisation works and the culture of open source development.

I am open to assisting you with this.

surjendu104 · February 22, 2024, 11:40am

Hi @janfo,
I am Surjendu Pal, an open source enthusiast, currently in my final year of college. I have worked with Java-based technologies and have built projects with Java, Spring Boot and Docker. My GitHub profile is surjendu104. This project suits me well, so I would like to contribute to it. Thanks.

Sincerely
Surjendu Pal
surjendup104@gmail.com

janfo · February 28, 2024, 11:58am

@ronitblenz I will also join as a mentor on this, and it would be great if you could co-mentor the project with me!

@surjendu104 sounds great, thank you for your application. I am currently unsure how exactly the projects are going to be assigned, but I’ll get back to you.

Sorry for the long silence in here!

ronitblenz · February 29, 2024, 7:41pm

Awesome! I am in.

ronitblenz · March 15, 2024, 4:35pm

Dear @contributors for #gsoc2024

If you need to get started with the Semantic Web, you can check out the documentation I prepared last year while I was a contributor.

Hope this helps! All the best!

chiragtyagi2003 · March 16, 2024, 8:35am

Hi @janfo and @ronitblenz, I reviewed the project doc and the project sounds interesting. I have some ideas to get started and would be grateful to know the next steps in the process.

Thanks
Chirag Tyagi
tyagichirag06@gmail.com