My name is Charitha, and I’m a third-year B.Tech student specializing in AI/ML.
I’m preparing for GSoC 2026 and recently made my first real contribution to DBpedia.
I submitted Pull Request #93 to the dbpedia/fact-extractor repository, addressing Issue #88 by removing hard-coded Italian stopwords and introducing a configurable --language option to improve multilingual support in the data labeling pipeline.
I’m excited to continue contributing, learning from the community, and exploring potential GSoC project ideas related to DBpedia’s extraction and knowledge graph ecosystem.
Hi @tsoru,
Based on my recent PRs on multilingual TF-IDF and Python 3 fixes, I’ve proposed a project idea focused on modernizing DBpedia’s NLP pipeline and extending its multilingual support.
Would this align with DBpedia’s GSoC 2026 direction, or are there specific components you recommend I focus on next?