Description
Suppose you are searching for information about a concept or entity (a basketball player, Albert Einstein, or anything else). A Knowledge Graph makes this easy to do, but in most cases a query returns thousands of results, which is of little help: too much information amounts to no information [1]. To address this issue, we can leverage measures from Social Network Analysis [2] to evaluate the importance of nodes in a Knowledge Graph.
Example:
Consider a student looking for facts about A. Einstein. Querying DBpedia for Einstein returns many results, from “birthPlace”, “institution”, and “knownFor” to “spouse”, and so on. The student may feel overwhelmed by all this information and give up. To show only the relevant facts about Einstein (or any concept in a KG), we can:
- Derive an ego network [3] (that is, the nodes connected to Einstein)
- Compute a centrality measure for all the nodes to estimate their importance (for a start, this could be the number of connections)
- Evaluate further patterns in the ego network (such as triads, cliques, etc.)
After this computation, we can give the student a subgraph centered on Einstein and surrounded by the concepts most important to him (for example, the top 5 or top 10 according to the measure above).
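The three steps above can be sketched with NetworkX, the library suggested in the warm-up tasks. This is only a minimal illustration under stated assumptions: the graph is a small hand-made toy (not real DBpedia data), `top_k_neighbours` is a hypothetical helper name, and ranking neighbours by their degree in the full graph is just one possible reading of "number of connections".

```python
import networkx as nx

# Toy, hand-made graph for illustration only; it is NOT real DBpedia data.
toy_graph = nx.Graph()
toy_graph.add_edges_from([
    ("Einstein", "Ulm"),
    ("Einstein", "ETH_Zurich"),
    ("Einstein", "Relativity"),
    ("Einstein", "Physics"),
    ("Physics", "Relativity"),
    ("Physics", "Newton"),
    ("Physics", "Bohr"),
    ("Relativity", "Spacetime"),
    ("ETH_Zurich", "Zurich"),
])


def top_k_neighbours(graph, ego, k=5):
    """Return the ego network and the k best-connected neighbours of `ego`.

    Hypothetical helper: the name and the scoring rule are assumptions.
    """
    # Step 1: ego network = the ego node, its direct neighbours,
    # and the edges among them.
    ego_net = nx.ego_graph(graph, ego, radius=1)
    # Step 2: score each neighbour by its number of connections
    # in the full graph (assumed importance measure).
    scores = {node: graph.degree(node) for node in ego_net if node != ego}
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ego_net, ranked[:k]


ego_net, top = top_k_neighbours(toy_graph, "Einstein", k=2)
# Step 3: one simple pattern, the number of triangles (closed triads)
# in the ego network that include the ego itself.
ego_triangles = nx.triangles(ego_net, "Einstein")
# Subgraph to show to the user: the ego plus its top-k neighbours.
result = ego_net.subgraph(["Einstein"] + [node for node, _ in top])

print(top)
print(ego_triangles)
print(sorted(result.nodes))
```

Other centrality measures (see [2]) could replace the degree-based score without changing the rest of the sketch, for example by building `scores` from `nx.degree_centrality(ego_net)`.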
Goal
Leverage the DBpedia Knowledge Graph to develop a graph-query tool that helps end users obtain relevant information with respect to their request/input/query. If there is enough time, it would also be interesting to build a simple dashboard that shows the subgraphs obtained from the queries.
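As a rough sketch of how such a tool might collect the triples around an entity, the snippet below builds a SPARQL query and converts a response in the standard SPARQL JSON results format into edges. Everything here is an assumption made for illustration: the helper names, the use of the public endpoint `https://dbpedia.org/sparql` (a local Virtuoso instance from the quickstart would work the same way), the `format` request parameter, and the choice to keep only outgoing links to other resources. The sample response is hand-written, and `fetch_edges` is defined but never called here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia endpoint; swap in a local Virtuoso URL if needed.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_neighbour_query(resource_iri, limit=1000):
    """Build a SPARQL query for the outgoing links of a resource (hypothetical helper)."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<{resource_iri}> ?p ?o . "
        "FILTER(isIRI(?o)) "
        f"}} LIMIT {int(limit)}"
    )


def bindings_to_edges(resource_iri, results):
    """Turn a SPARQL JSON result into (subject, predicate, object) triples."""
    edges = []
    for row in results["results"]["bindings"]:
        # Keep only objects that are IRIs, so each edge links two graph nodes.
        if row["o"]["type"] == "uri":
            edges.append((resource_iri, row["p"]["value"], row["o"]["value"]))
    return edges


def fetch_edges(resource_iri, endpoint=DBPEDIA_ENDPOINT, limit=1000):
    """Query the endpoint over HTTP and return the edges (requires network access)."""
    params = urllib.parse.urlencode({
        "query": build_neighbour_query(resource_iri, limit),
        "format": "application/sparql-results+json",
    })
    request = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return bindings_to_edges(resource_iri, json.load(response))


einstein = "http://dbpedia.org/resource/Albert_Einstein"
query = build_neighbour_query(einstein, limit=10)

# Hand-written sample in the SPARQL JSON results format, used instead of a live call.
sample_response = {
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://dbpedia.org/ontology/birthPlace"},
                "o": {"type": "uri", "value": "http://dbpedia.org/resource/Ulm"},
            },
            {
                "p": {"type": "uri", "value": "http://www.w3.org/2000/01/rdf-schema#label"},
                "o": {"type": "literal", "value": "Albert Einstein"},
            },
        ]
    }
}
edges = bindings_to_edges(einstein, sample_response)
print(query)
print(edges)
```

The resulting triples can be loaded into a graph library as edges, with the predicate kept as an edge attribute, before any ranking is applied.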
Impact
- Enable users to perform graph queries and obtain the most valuable elements connected to the user’s input
- Provide a visual (partial) graph representation of the Knowledge Graph of DBpedia
Warm-up tasks
- Check the DBpedia databus (https://databus.dbpedia.org/) and the latest core release (https://wiki.dbpedia.org/develop/datasets/latest-core-dataset-releases).
- Have a look at the Virtuoso SPARQL Endpoint Quickstart (https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart)
- Have a look at the NetworkX library (https://networkx.org/)
- Read the papers cited below
Mentors
@lucav48
Keywords
#KnowledgeGraph #SocialNetworkAnalysis #GraphExploration #GraphQuery
Further Resources
[1] Lissandrini, Matteo, et al. “Graph-query suggestions for knowledge graph exploration.” Proceedings of The Web Conference 2020.
[2] Landherr, Andrea, Bettina Friedl, and Julia Heidemann. “A critical review of centrality measures in social networks.” Business & Information Systems Engineering
[3] https://research.library.gsu.edu/c.php?g=916490&p=6612505