Towards Amharic DBpedia
Description
DBpedia is a collaborative initiative that extracts structured information from Wikipedia and publishes it as Linked Open Data. While languages that are well resourced on the Semantic Web, such as English and German, have dedicated DBpedia chapters, low-resource languages such as Amharic remain underrepresented. Amharic, an African language, is the official language of Ethiopia and is spoken by millions of people worldwide, yet it lacks its own DBpedia chapter. This project aims to create an Amharic DBpedia chapter, making Amharic the first sub-Saharan African language to join the internationalization efforts of DBpedia and paving the way for other African languages to follow. The task is therefore to extract, process, and integrate information from Amharic Wikipedia into DBpedia.
Goal
The primary goal of this project is to create an Amharic DBpedia chapter, reachable at am.dbpedia.org:
- Create an Amharic DBpedia chapter in the DBpedia knowledge graph with data from Amharic Wikipedia.
- Extend the DBpedia extraction framework to extract citations, disambiguation, personal data, topical concepts, anchor text, and shared resources from Amharic Wikipedia.
- Create Amharic DBpedia mapping based on DBpedia ontology mapping guidelines.
- Make the knowledge graph available to end users via a web page.
- Create a SPARQL endpoint to make it queryable.
- Document the processes, tools, and techniques used, following FAIR principles, to support sustainable development.
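As a rough illustration of what "queryable" could mean for end users, the sketch below prepares a SPARQL request and flattens a response in the standard SPARQL JSON results format. It is only a sketch under stated assumptions: the endpoint URL `https://am.dbpedia.org/sparql` is hypothetical (it follows the path used by other DBpedia chapters, but the Amharic endpoint does not exist yet), and the helper names `build_request` and `parse_bindings` are invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: hypothetical endpoint for the planned chapter; not yet live.
ENDPOINT = "https://am.dbpedia.org/sparql"

# Ask for a few resources that carry an Amharic ("am") label.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label WHERE {
  ?resource rdfs:label ?label .
  FILTER (lang(?label) = "am")
} LIMIT 5
"""


def build_request(endpoint: str, query: str) -> Request:
    """Prepare a SPARQL 1.1 Protocol GET request that asks for JSON results.

    Nothing is sent here; the caller decides when to open the request.
    """
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into one dict per result row."""
    document = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in document["results"]["bindings"]
    ]
```

Once an endpoint is actually deployed, the prepared request could be sent with `urllib.request.urlopen(build_request(ENDPOINT, QUERY))` and the decoded response body passed to `parse_bindings`.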
Impact
- Enable users to access and use structured data in Amharic DBpedia more effectively.
- Promote linguistic diversity and support research, education, and applications that rely on multilingual knowledge graphs.
- NLP downstream tasks: Apply knowledge graphs from DBpedia to downstream NLP tasks such as machine translation and sentiment analysis.
- Community Engagement: Encourage the community to contribute and collaborate in sustaining and expanding Amharic DBpedia.
Warm-up tasks
Please read the following papers:
- Amharic Wikipedia
- Arabic DBpedia
- Korean DBpedia
- German DBpedia
Skills Required
- A good understanding of Java and Python
- Optionally, good knowledge of SPARQL, RDF, and other Semantic Web technologies
- Good documentation and communication skills
Project Size
350 hrs
Mentors
- Hizkiel Alemayehu
- Tilahun Tafa
- Ricardo Usbeck
Keywords
Amharic DBpedia, Semantic Web, Extraction Framework