sparna-git / xls2rdf

Create RDF data from Excel spreadsheets - edit SKOS vocabularies, knowledge graph instances, SHACL constraints, OWL ontologies in Excel files. Available as HTTP service, upload form, command-line, or Java API.

Home Page: https://xls2rdf.sparna.fr

How to run Xls2Rdf from the command line?

aurelberra opened this issue

Hello. For a project based on open source tools I have been using your xls2skos CLI tool (currently xls2skos-0.7.6-onejar.jar). I would like to switch to xls2rdf, but in the files downloaded from the releases as per the instructions on the wiki, I cannot find the expected xls2rdf-app-x.y.z-onejar.jar file. Have I misunderstood something? Thanks in advance for your help.

Hello

(At least one person is following!)
Yes, I just needed to package and deploy a new release. This is done now, with a new 2.0 release.
You should check out the latest features, such as URI lookup or changing the subject column.
Maybe you can share a pointer to your project? I am always curious.

Don't forget to close the issue if appropriate.

Thanks

Great! Thank you very much.

But early adopters are also facing bugs… The SkosPostProcessor raises an exception:

Exception in thread "main" java.lang.ClassCastException: class org.eclipse.rdf4j.model.impl.SimpleLiteral cannot be cast to class org.eclipse.rdf4j.model.Resource (org.eclipse.rdf4j.model.impl.SimpleLiteral and org.eclipse.rdf4j.model.Resource are in unnamed module of loader 'app')
	at fr.sparna.rdf.xls2rdf.SkosPostProcessor.lambda$afterSheet$1(SkosPostProcessor.java:36)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at fr.sparna.rdf.xls2rdf.SkosPostProcessor.afterSheet(SkosPostProcessor.java:34)
	at fr.sparna.rdf.xls2rdf.Xls2RdfConverter.processSheet(Xls2RdfConverter.java:343)
	at fr.sparna.rdf.xls2rdf.Xls2RdfConverter.processWorkbook(Xls2RdfConverter.java:174)
	at fr.sparna.rdf.xls2rdf.Xls2RdfConverter.processInputStream(Xls2RdfConverter.java:145)
	at fr.sparna.rdf.xls2rdf.app.Convert.execute(Convert.java:108)
	at fr.sparna.rdf.xls2rdf.app.Main.run(Main.java:79)
	at fr.sparna.rdf.xls2rdf.app.Main.main(Main.java:86)

There was nothing unusual in the file or in the conversion command: sudo java -jar xls2rdf-app-2.0-onejar.jar convert -i in.xlsx -o out_2.rdf -l fr. Can you see what happened?

You have a skos:broader that has a literal as its value, instead of a URI.
Try adding the option --noPostProcessings to deactivate the SKOS post-processing, then look in the output SKOS file for that bad skos:broader property.
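
One way to search the output for the offending property is sketched below in Python. It assumes the output file is RDF/XML (as the .rdf extension in the command suggests). The function name and the sample data are illustrative and not part of xls2rdf. A skos:broader element that carries text instead of pointing to a resource is reported as a literal.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def find_literal_broader(rdfxml_text):
    """Return (subject URI, literal text) pairs for literal-valued skos:broader."""
    root = ET.fromstring(rdfxml_text)
    hits = []
    for node in root.iter():
        subject = node.get(f"{{{RDF}}}about")
        if subject is None:
            continue  # not a described resource
        for prop in node.findall(f"{{{SKOS}}}broader"):
            # A URI or blank-node value uses rdf:resource, rdf:nodeID,
            # or a nested node element; anything else is a literal.
            is_reference = (
                prop.get(f"{{{RDF}}}resource") is not None
                or prop.get(f"{{{RDF}}}nodeID") is not None
                or len(prop) > 0
            )
            if not is_reference:
                hits.append((subject, (prop.text or "").strip()))
    return hits


# Hypothetical sample: concept "c" has a literal where a URI is expected.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/thes/a">
    <skos:broader rdf:resource="http://example.org/thes/b"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/thes/c">
    <skos:broader xml:lang="fr">ex:b</skos:broader>
  </skos:Concept>
</rdf:RDF>"""

for subject, value in find_literal_broader(SAMPLE):
    print(subject, "has literal skos:broader:", value)
```

To check a real conversion result, the file content can be read and passed to the same function, for example the out_2.rdf file produced with --noPostProcessings.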

I will improve this behavior: see #10.

I found a term (out of 1000+) in which the prefix was missing (though it was there a few hours ago, I will refrain from accusing colleagues who have access to the shared spreadsheet), but unfortunately I still get the same error with the corrected data. I checked all the terms several times. I also tried to remove non-ASCII characters from the URI, to no avail.

I see that xls2skos still happily converts the same spreadsheet. Is there any rule that changed between xls2skos and xls2rdf, and might explain that the newer one chokes on my data?

  • Double-check your skos:broader and skos:narrower columns.
  • You are using an old version of xls2skos; a lot of things have changed since then, in particular in the post-processing steps.
  • Have you checked the result of using --noPostProcessings as suggested previously, and searched your data for literal values of skos:broader?
  • I will package a 2.0.1 version that fixes #10 and prints a log message in your situation. Stay tuned.

Many thanks for the checklist and the update. I have analysed the content of the broader/narrower columns again and can only find URIs. With version 2.0.1, I get "Found a skos:broadeer with Literal value" warnings (you may have spotted the typo for "broader" already) for all my broader terms. I have URIs like "savoirs:histoire", "savoirs:pratiques_savantes", "savoirs:savoir-faire", "savoirs:Internet". The prefix is apparently not the problem, as I tried removing it.

Please give the complete warning message. "savoirs:histoire" is not an (HTTP) URI; it looks like a plain string with a prefix that was not correctly interpreted. Have you checked your prefix declaration in the header?
Share your spreadsheet here if it is not confidential.

The first lines and columns of the spreadsheet look as follows:

A                  B                                C
ConceptScheme URI  http://data.xxx.fr/thes/savoirs
PREFIX             savoirs                          http://data.xxx.fr/thes/savoirs/

Though the data are not highly confidential and will be open as soon as possible, I'd rather not share them online at such an early stage of the project. I'm happy to share them with you privately another way, of course.

I forgot to add the whole warning:

10:03:20.547 2090 WARN f.s.rdf.xls2rdf.SkosPostProcessor - Found a skos:broadeer with Literal value : savoirs:histoire

Indeed, the prefix is not interpreted correctly, and "savoirs:histoire" remains a plain literal.
Please double-check your prefix declaration as well as the title row of the skos:broader column.
Make sure you don't have extra whitespace before or after "savoirs:", in the prefix declaration as well as in your values.
You can send me the file at "thomas dot francart [at] sparna dot fr".
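
The checks suggested above (stray whitespace, undeclared prefix) can be illustrated with a small Python sketch. It is not how xls2rdf resolves prefixes internally. The function name diagnose_cell and the way the prefix table is represented are assumptions made for illustration only.

```python
def diagnose_cell(value, prefixes):
    """Explain whether a cell value can be read as a prefixed or full URI."""
    if value != value.strip():
        return "leading or trailing whitespace"
    if value.startswith(("http://", "https://")):
        return "ok: full URI"
    prefix, sep, local = value.partition(":")
    if not sep:
        return "no prefix separator"
    if prefix not in prefixes:
        return f"undeclared prefix '{prefix}'"
    # Known prefix: expand it to the full namespace URI.
    return "ok: " + prefixes[prefix] + local


# Prefix table as declared in the spreadsheet header of this thread.
prefixes = {"savoirs": "http://data.xxx.fr/thes/savoirs/"}

for cell in ["savoirs:histoire", " savoirs:histoire", "savoir:histoire"]:
    print(repr(cell), "->", diagnose_cell(cell, prefixes))
```

A prefix declared with a trailing space in the header would fail the same way as an undeclared one, because the key in the table would no longer match the prefix used in the cells.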

Thank you for solving this problem so quickly!

Before closing the issue, I am leaving a comment here to note that, in the header, skos:broader should not be used with a language suffix: a tag like @en forces the cells to be parsed as literal values. In my case skos:broader@fr(separator=",") had to be corrected to skos:broader(separator=",").
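
A quick way to catch this kind of header mistake before converting is sketched below in Python. The header pattern is inferred only from the two examples in this thread and does not cover the full xls2rdf header syntax. The set of URI-valued properties and the function name header_problem are assumptions for illustration.

```python
import re

# property, optional @lang, optional (parameters); inferred from the thread's examples.
HEADER = re.compile(
    r"^(?P<prop>[^@(\s]+)(?:@(?P<lang>[A-Za-z-]+))?(?:\((?P<params>.*)\))?$"
)

# Properties whose values are expected to be URIs (illustrative subset).
URI_VALUED = {"skos:broader", "skos:narrower", "skos:related"}


def header_problem(header):
    """Return a warning string for a suspicious column header, else None."""
    match = HEADER.match(header.strip())
    if match is None:
        return "unrecognised header"
    prop, lang = match.group("prop"), match.group("lang")
    if lang and prop in URI_VALUED:
        return f"{prop} has language tag @{lang}: cells will be read as literals"
    return None


for header in ['skos:broader@fr(separator=",")',
               'skos:broader(separator=",")',
               "skos:prefLabel@fr"]:
    print(header, "->", header_problem(header))
```

Only the first header is flagged: a language tag is fine on a label column such as skos:prefLabel@fr, but not on a column whose cells should be URIs.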