Hi all,
The CI tests are working.
Run on Minidump
git clone https://github.com/dbpedia/extraction-framework.git
cd extraction-framework/dump
mvn test
Configure tests
https://github.com/dbpedia/extraction-framework/blob/master/dump/src/test/resources/dbpedia-specific-ci-tests.ttl
The file currently contains:
validator:dissallowed_chars
a v:IRI_Validator ;
rdfs:comment """Dissallowed in URIs, cf. https://www.ietf.org/rfc/rfc3987.txt: Systems accepting IRIs MAY also deal with the printable characters in US-ASCII that are not allowed in URIs, namely "<", ">", '"', space, "{", "}", "|", "\", "^", and "`", in step 2 above. If these characters are found but are not converted, then the conversion SHOULD fail. Please note that the number sign ("#"), the percent sign ("%"), and the square bracket characters ("[", "]") are not part of the above list and MUST NOT be converted. """ ;
v:doesNotContain "<" , ">", "\"" , " ", "{", "}", "|", "\\", "^" , "`" .
validator:dbpedia_resource_delims
a v:IRI_Validator ;
rdfs:comment """
1. gen-delims are not allowed, except ":" and "@" per rfc3987
"ipchar = iunreserved / pct-encoded / sub-delims / ":" / "@" "
2. sub-delims are allowed:
These are allowed in DBpedia Uris, so we check that they are not encoded
sub-delims = "%21", "%24", "%26", "%27", "%28", "%29", "%2A", "%2B", "%2C", "%3B", "%3D"
sub-delims = "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "="
reserved gen-delims from above """ ;
v:doesNotContain "?", "#", "[", "]" ;
v:doesNotContain "%21", "%24", "%26", "%27", "%28", "%29", "%2A", "%2B", "%2C", "%3B", "%3D" .
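To make the two validators above easier to reason about, here is a rough Python sketch of what they appear to check. It assumes that v:doesNotContain is a plain, case-sensitive substring test over the whole IRI string, which is my reading of the config and not something taken from the framework's code. The names violations and check_resource_iri are made up for this illustration.

```python
# Illustrative sketch only, NOT the extraction framework's implementation.
# Assumption: v:doesNotContain means "none of these strings occurs anywhere in the IRI".

# Patterns copied from validator:dissallowed_chars
DISALLOWED_CHARS = ["<", ">", '"', " ", "{", "}", "|", "\\", "^", "`"]

# Patterns copied from validator:dbpedia_resource_delims
RESERVED_GEN_DELIMS = ["?", "#", "[", "]"]
ENCODED_SUB_DELIMS = ["%21", "%24", "%26", "%27", "%28", "%29",
                      "%2A", "%2B", "%2C", "%3B", "%3D"]


def violations(iri, patterns):
    """Return the patterns that occur in the IRI (empty list means the check passes)."""
    return [p for p in patterns if p in iri]


def check_resource_iri(iri):
    """Run both checks and report the offending patterns per validator (hypothetical helper)."""
    return {
        "dissallowed_chars": violations(iri, DISALLOWED_CHARS),
        "dbpedia_resource_delims": violations(
            iri, RESERVED_GEN_DELIMS + ENCODED_SUB_DELIMS
        ),
    }


if __name__ == "__main__":
    for iri in [
        "http://dbpedia.org/resource/Berlin",
        "http://dbpedia.org/resource/Rock_%26_Roll",
        "http://dbpedia.org/resource/A|B",
    ]:
        print(iri, check_resource_iri(iri))
```

Under that assumption, an IRI with an encoded "&" (%26) fails the second validator, because sub-delims are expected to appear unencoded in DBpedia URIs.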
Extend the minidump
Add more Wikipedia articles to the minidump here:
Results
Cov_s: 1.0 ( 5 triggered of 5 total ), Success_rate_s: 1.0 ( 5 )
Cov_p: 1.0 ( 33 triggered of 33 total ), Success_rate_p: 1.0 ( 33 )
Cov_o: 0.9464286 ( 106 triggered of 112 total ), Success_rate_o: 1.0 ( 106 )
Cov: 0.98214287
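The numbers above are consistent with each Cov_x being triggered / total and the overall Cov being the plain average of the three. That is only inferred from the figures in this post, not from the test code, so treat this small Python check as a sketch under that assumption.

```python
# Sketch: reproduce the reported coverage figures.
# Assumption (inferred from the numbers, not confirmed): Cov = mean(Cov_s, Cov_p, Cov_o).

def coverage(triggered, total):
    """Fraction of test cases that were triggered."""
    return triggered / total


cov_s = coverage(5, 5)      # subject tests
cov_p = coverage(33, 33)    # predicate tests
cov_o = coverage(106, 112)  # object tests
cov = (cov_s + cov_p + cov_o) / 3

if __name__ == "__main__":
    print("Cov_o:", cov_o)
    print("Cov:", cov)
```

The small difference in the last digits compared with the log is most likely because the log values are single-precision floats.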
This will put an end to faulty URIs and datatypes.