New Contributor — GSoC 2026 Preparation

Hello DBpedia Community 👋

My name is Charitha, and I’m a third-year B.Tech student specializing in AI/ML.
I’m preparing for GSoC 2026 and recently made my first real contribution to DBpedia.

I submitted Pull Request #93 to the dbpedia/fact-extractor repository, addressing Issue #88 by removing hard-coded Italian stopwords and introducing a configurable --language option to improve multilingual support in the data labeling pipeline.
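As a rough illustration of the kind of change described above (this is not the code from the pull request), a configurable language option could look something like the sketch below. The `STOPWORDS` table, the `build_parser` and `remove_stopwords` helpers, and the default of `it` are all assumptions made for this example; the stopword lists are deliberately tiny.

```python
import argparse

# Hypothetical, deliberately tiny stopword lists for illustration only.
# A real pipeline would load complete lists from a file or a library.
STOPWORDS = {
    "it": {"il", "la", "di", "e", "che"},
    "en": {"the", "of", "and", "a", "to"},
}


def build_parser():
    """Build a CLI parser with a --language option instead of a fixed language."""
    parser = argparse.ArgumentParser(description="Filter stopwords from tokens.")
    parser.add_argument(
        "--language",
        default="it",  # assumed default, keeps the previous Italian behaviour
        choices=sorted(STOPWORDS),
        help="language code selecting which stopword list to use",
    )
    parser.add_argument("tokens", nargs="*", help="tokens to filter")
    return parser


def remove_stopwords(tokens, language):
    """Return the tokens that are not stopwords in the given language."""
    stopwords = STOPWORDS[language]
    return [t for t in tokens if t.lower() not in stopwords]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(remove_stopwords(args.tokens, args.language))
```

The point of the sketch is only that the stopword set is looked up by language code at run time, so supporting another language means adding a list rather than editing the filtering logic.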

I’m excited to continue contributing, learning from the community, and exploring potential GSoC project ideas related to DBpedia’s extraction and knowledge graph ecosystem.

Looking forward to collaborating with you all!

Hi @Charithakottu and welcome!

Please monitor this page for any project ideas or add your own: Topics tagged gsoc2026-ideas

We will post any updates about DBpedia’s participation in Google Summer of Code 2026 here or on Slack.

Hi @tsoru,
Based on my recent PRs around multilingual TF-IDF and Python 3 fixes, I proposed a project idea focused on modernizing DBpedia’s NLP pipeline and making it multilingual.

Would this align with DBpedia’s GSoC 2026 direction, or are there specific components you recommend I focus on next?

I have replied under your project proposal. Please follow the recommendations.