Live endpoint: how to access Wikipedia outlink pages

SELECT * WHERE { <http://dbpedia.org/resource/Spain> <http://dbpedia.org/ontology/wikiPageWikiLink> ?c . }

How can I access Wikipedia page links from the live endpoint?

Hi and welcome to the forum!
You can use markdown to make your queries more readable :slightly_smiling_face:

Can you specify your problem? Are you unsure about the syntax or the querying process, or are you looking for the correct property?

There are:
https://www.w3.org/ns/prov#wasDerivedFrom
http://xmlns.com/foaf/0.1/isPrimaryTopicOf
And also http://dbpedia.org/ontology/wikiPageID that you can then use with http://en.wikipedia.org/?curid=XXXXXX

SELECT * WHERE { 
  <http://dbpedia.org/resource/Spain> <http://xmlns.com/foaf/0.1/isPrimaryTopicOf> ?c . 
}

So this might do the trick if that’s what you are looking for.
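
If you go the wikiPageID route, a minimal sketch of building the curid link in Python could look like this. The function name is made up for this example, and 12345 is a placeholder ID, not the real page ID of any article.

```python
def wikipedia_url_from_page_id(page_id):
    """Build the curid-style Wikipedia URL for a dbo:wikiPageID value."""
    # The value may arrive as an int or as a string from a SPARQL JSON result.
    page_id = int(page_id)
    if page_id <= 0:
        raise ValueError("page ID must be a positive integer")
    return "http://en.wikipedia.org/?curid={}".format(page_id)


# 12345 is a placeholder ID used only for illustration.
print(wikipedia_url_from_page_id(12345))
```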

Hi,

Thank you for your reply and welcome message.

I am not very well versed, and I apologise for not being clear.

I am using Python’s SPARQLWrapper to query the DBpedia endpoint. I need the outlinks of a Wikipedia page; for example, the Wikipedia article on Spain links out to Archbishop, Autonomous_communities_of_Spain and so on. I am able to retrieve these from https://dbpedia.org/sparql but unfortunately not from the live endpoint.

If my query is incorrect, can you help me with the correct one? The query above works with Python’s SPARQLWrapper.

How does the markup work? Can you share a link?

Thank you and kind regards.

The live endpoint is at http://live.dbpedia.org/sparql

But the query does not return anything from the live endpoint. However, the older version works: https://dbpedia.org/sparql
I am looking to use the live version for a COVID-19 related task, where it makes sense to work with the most up-to-date information.

@matifq
Looks fine to me: http://live.dbpedia.org/page/Template:Infobox_pandemic lists all relevant pages for pandemics.

select ?corona where 
{
?corona <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:Infobox_pandemic> .
FILTER (regex (str(?corona),".*coronavirus.*"))
} 

This returns ~300 results.
There was also a thread regarding this: Tracking Corona in Wikipedia with a new mapping

Also you could try these datasets:
https://databus.dbpedia.org/dbpedia/mappings/mappingbased-objects/2020.04.01
https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2020.04.01
https://databus.dbpedia.org/dbpedia/generic/infobox-properties/2020.03.01
(the generic 2020.04.01 extraction is still running and will be available in a few days)

This is not what I am looking for. Let me explain with an example: the page https://en.wikipedia.org/wiki/Coronavirus_disease_2019 has wiki outlinks to https://en.wikipedia.org/wiki/Infectious_disease, https://en.wikipedia.org/wiki/Severe_acute_respiratory_syndrome_coronavirus_2, and so on. I need these links for the analysis.

You can open the Wikipedia page and see these links inside the content. I was able to retrieve them from the older endpoint, https://dbpedia.org/sparql, but cannot retrieve them from the live endpoint.

I still don’t understand what you need. http://live.dbpedia.org/page/Coronavirus_disease_2019 has

rdfs:seeAlso 	

    dbr:Contact_tracing
    dbr:Coronavirus_disease_2019
    dbr:Severe_acute_respiratory_syndrome_coronavirus_2
    dbr:COVID-19_related_shortages
    dbr:2019–20_coronavirus_pandemic
    dbr:Mental_health_during_the_2019–20_coronavirus_pandemic

You are describing how Wikipedia is. I can not help you.

The version of the endpoint has not changed since 2017

Let me clarify it with the code that I am using and also provide documented references from DBpedia.

The following is a SPARQL query embedded in Python. It works against the main endpoint (old index), but when I run it against the live endpoint it returns an empty result. You can try the code yourself: it works as written, and if you change it to the live endpoint it does not return anything.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
q = "SELECT * WHERE {<http://dbpedia.org/resource/Coronavirus> <http://dbpedia.org/ontology/wikiPageWikiLink> ?out .}"
sparql.setQuery(q)
a = sparql.query().convert()
print(a)

The above code pulls all Wikipedia outlinks (hyperlinks).
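
To make the comparison between the two endpoints explicit, here is a rough sketch that runs the same query against both and counts the returned links. The helper names (`build_query`, `extract_uris`, `fetch_outlinks`, `compare_endpoints`) are my own and not part of SPARQLWrapper; the parsing assumes the standard SPARQL JSON results layout.

```python
WIKILINK = "http://dbpedia.org/ontology/wikiPageWikiLink"


def build_query(resource):
    """Build the outlink query for one DBpedia resource URI."""
    return "SELECT ?out WHERE { <%s> <%s> ?out . }" % (resource, WIKILINK)


def extract_uris(results, var):
    """Pull the URI values bound to `var` out of a SPARQL JSON result dict."""
    bindings = results.get("results", {}).get("bindings", [])
    return [
        b[var]["value"]
        for b in bindings
        if var in b and b[var].get("type") == "uri"
    ]


def fetch_outlinks(endpoint, resource):
    """Query `endpoint` and return the outlink URIs of `resource`."""
    # Imported here so the helpers above can be used without the package.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(build_query(resource))
    return extract_uris(sparql.query().convert(), "out")


def compare_endpoints(resource="http://dbpedia.org/resource/Coronavirus"):
    """Print how many outlinks each endpoint returns (needs network access)."""
    for endpoint in ("http://dbpedia.org/sparql", "http://live.dbpedia.org/sparql"):
        print(endpoint, len(fetch_outlinks(endpoint, resource)))
```

Calling `compare_endpoints()` should make the difference visible: a non-zero count for the main endpoint and, as described above, an empty result for the live one.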

The page https://wiki.dbpedia.org/online-access/DBpediaLive refers to them as "Page links", and https://wiki.dbpedia.org/develop/datasets/downloads-2016-10 describes them the same way. The file preview also shows the property, i.e. <http://dbpedia.org/ontology/wikiPageWikiLink>:

http://downloads.dbpedia.org/preview.php?file=2016-10_sl_core-i18n_sl_en_sl_page_links_en.tql.bz2

From what I could see from the current live extraction configuration the relevant extractor (PageLinksExtractor) is not enabled anymore.

<configuration>
<extractors>
  <extractor name ="org.dbpedia.extraction.mappings.AbstractExtractorWikipedia" status="ACTIVE" languages= "en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.ArticleCategoriesExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.ArticleTemplatesExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.CategoryLabelExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.ContributorExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.ExternalLinksExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.GeoExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.PageIdExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.LabelExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.PageIdExtractor" status="ACTIVE" languages="en"></extractor>
  <!--extractor name ="org.dbpedia.extraction.mappings.PersondataExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.PndExtractor" status="ACTIVE" languages="en"></extractor-->
  <!--extractor name ="org.dbpedia.extraction.mappings.RedirectExtractor" status="ACTIVE" languages="en"></extractor-->
  <extractor name ="org.dbpedia.extraction.mappings.RevisionIdExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.SkosCategoriesExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.ArticlePageExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.DisambiguationExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.MappingExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.MetaInformationExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.WikiPageLengthExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.WikiPageOutDegreeExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.HomepageExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.TemplateParameterExtractor" status="ACTIVE" languages="en"></extractor>
  <!--<extractor name ="org.dbpedia.extraction.mappings.WikiPageCharactersExtractor" status="ACTIVE" languages="en"></extractor>-->
</extractors>
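
For anyone who wants to check a config like this programmatically, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `SAMPLE` string is a shortened, hand-made stand-in for the config above (with a closing `</configuration>` tag added so it parses), and `active_extractors` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the live extraction config, for illustration only.
SAMPLE = """<configuration>
<extractors>
  <extractor name ="org.dbpedia.extraction.mappings.ExternalLinksExtractor" status="ACTIVE" languages="en"></extractor>
  <extractor name ="org.dbpedia.extraction.mappings.PageIdExtractor" status="ACTIVE" languages="en"></extractor>
  <!--extractor name ="org.dbpedia.extraction.mappings.RedirectExtractor" status="ACTIVE" languages="en"></extractor-->
</extractors>
</configuration>"""


def active_extractors(xml_text):
    """Return the short class names of all extractors with status ACTIVE."""
    root = ET.fromstring(xml_text)
    # Commented-out entries are dropped by the parser, so they never show up.
    return {
        el.get("name", "").rsplit(".", 1)[-1]
        for el in root.iter("extractor")
        if el.get("status") == "ACTIVE"
    }


names = active_extractors(SAMPLE)
print("PageLinksExtractor" in names)
print(sorted(names))
```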

@kurzum do you know of any reason why it was disabled after the redeploy of live? I can see that it was set to “KEEP” in an older config file.

This seems inconsistent given that ExternalLinksExtractor is enabled, so I assume this is an error (note that rdfs:seeAlso is not the same as http://dbpedia.org/ontology/wikiPageWikiLink).

Is there a way this error could be fixed anytime soon?

Live is currently being fundamentally redesigned and will be redeployed. A Live 2.0 beta will probably arrive in 2-3 months. I will propose including this extractor in the new version so we can evaluate whether these page links cause any problems. AFAIK, at the moment only @kurzum (who is out of office, so I don't expect a decision soon) and @pvk can decide on and control the current live deployment. It can take several months until live.dbpedia.org/sparql runs on Live 2.0, but we plan to make it easier to set up your own live-fed SPARQL endpoint (mirror). You can watch the progress here: https://github.com/dbpedia/live

@matifq, @jfrey http://dbpedia.org/ontology/wikiPageWikiLink has been disabled in live and the main endpoint for quite a while now. There is also no reason to activate it. The page links take up a lot of space and are mostly useful for analytic queries, which you should not run on the main endpoint because they consume many resources.

This is the dataset, where you can download them:
https://databus.dbpedia.org/dbpedia/generic/wikilinks/
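
Once a file from that dataset is downloaded, the links for one page can be pulled out locally. The sketch below assumes the file is bz2-compressed with one N-Triples-style statement per line; that is an assumption about the dump format, and the function names and the path in the usage note are placeholders.

```python
import bz2
import re

WIKILINK = "http://dbpedia.org/ontology/wikiPageWikiLink"
# One statement made of three IRIs: <subject> <predicate> <object> .
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def iter_outlinks(lines, subject):
    """Yield the wikiPageWikiLink objects of `subject` from triple lines."""
    for line in lines:
        match = TRIPLE.match(line)
        if match is None:
            continue  # comments, blank lines, or statements with literal objects
        s, p, o = match.groups()
        if s == subject and p == WIKILINK:
            yield o


def outlinks_from_dump(path, subject):
    """Scan a local .bz2 dump line by line and collect the outlinks."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return list(iter_outlinks(handle, subject))
```

For example, `outlinks_from_dump("wikilinks_lang=en.ttl.bz2", "http://dbpedia.org/resource/Coronavirus")`, where the file name is hypothetical and should be replaced with whatever you actually downloaded. Streaming the file this way avoids loading the whole dump into memory and keeps the analytic load off the public endpoint.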

This would help the COVID-19 efforts under way in Ireland (project: RCES: Rapid Cues Exploratory Search Using Taxonomies For COVID-19).

Much appreciated.