DBpedia Amharic Chapter — GSoC 2024

Towards Amharic DBpedia

Description

DBpedia is a collaborative initiative that extracts structured information from Wikipedia and publishes it as Linked Open Data. While languages that are well resourced on the Semantic Web, such as English and German, have dedicated DBpedia chapters, low-resourced languages like Amharic remain underrepresented. Amharic, an African language, is the official language of Ethiopia and is spoken by millions globally, yet it lacks its own DBpedia chapter. This project endeavors to create an Amharic DBpedia chapter, aiming to make Amharic the first sub-Saharan African language to join DBpedia's internationalization efforts and to pave the way for other African languages to follow. The task is therefore to extract, process, and integrate information from Amharic Wikipedia into DBpedia.

Goal

The primary goal of this project is to create an Amharic DBpedia chapter, reachable at am.dbpedia.org. Specifically:

  • Create an Amharic DBpedia chapter in the DBpedia knowledge graph with data from Amharic Wikipedia.
  • Extend the DBpedia extraction framework to extract citations, disambiguation, personal data, topical concepts, anchor text, and shared resources from Amharic Wikipedia.
  • Create Amharic DBpedia mappings based on the DBpedia ontology mapping guidelines.
  • Make the knowledge graph available to end users via a web page.
  • Create a SPARQL endpoint to make it queryable.
  • Document the processes, tools, and techniques used, following FAIR principles, to support sustainable development.
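
To illustrate what "queryable" could mean in practice, here is a minimal sketch of how a client might build a request against the planned SPARQL endpoint. It rests on assumptions: the endpoint path `https://am.dbpedia.org/sparql` is hypothetical (only the host am.dbpedia.org is named above, and the endpoint does not exist yet), and the helper names `build_label_query` and `build_request_url` are invented for this example. The code only constructs the query and URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Assumption: hypothetical endpoint path; only the host comes from the proposal.
ENDPOINT = "https://am.dbpedia.org/sparql"


def build_label_query(limit: int = 10) -> str:
    """Return a SPARQL query selecting resources that have an Amharic label."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        '  FILTER (lang(?label) = "am")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as the `query` parameter of a GET request."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    print(build_request_url(build_label_query(5)))
```

The resulting URL could then be fetched with any HTTP client, with the desired result format requested through the `Accept` header.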

Impact

  • Enable users to access and use structured data in Amharic DBpedia more effectively.
  • Promote linguistic diversity and support research, education, and applications that rely on multilingual knowledge graphs.
  • NLP downstream tasks: Apply knowledge graphs from DBpedia to downstream NLP tasks such as machine translation and sentiment analysis.
  • Community Engagement: Encourage the community to contribute and collaborate in sustaining and expanding Amharic DBpedia.

Warm-up tasks

Please read the following papers:

  • Amharic Wikipedia
  • Arabic DBpedia
  • Korean DBpedia
  • German DBpedia

Skills Required

  • A good understanding of Java and Python
  • Optionally, good knowledge of SPARQL, RDF, and other Semantic Web technologies
  • Good documentation and communication skills

Project Size

350 hrs

Mentors

  • Hizkiel Alemayehu
  • Tilahun Tafa
  • Ricardo Usbeck

Keywords

Amharic DBpedia, Semantic Web, Extraction Framework


Hi @hizclick,

Are you looking for a native Amharic speaker, or can anyone who is interested contribute?
