Integrate work of Sebastian from the parallel branch
JohannesLichtenberger opened this issue · comments
We have to integrate Sebastian's work on parallelization (probably using newer Java language features in some parts).
Good point. I think as Java has matured, we can even use Java's ForkJoinPool and stuff like that :-)
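For reference, a minimal sketch of the kind of divide-and-conquer that ForkJoinPool makes easy. This is not Sebastian's code and not anything from brackit; the class names, the threshold, and the summing task are all made up for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative only: ParallelSumSketch and SumTask are hypothetical names.
final class ParallelSumSketch {
    // Assumed cut-off below which a chunk is summed sequentially.
    private static final int THRESHOLD = 1_000;

    static final class SumTask extends RecursiveTask<Long> {
        private final long[] values;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(long[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(values, from, mid);
            SumTask right = new SumTask(values, mid, to);
            left.fork();                     // run the left half asynchronously
            long rightSum = right.compute(); // compute the right half in this thread
            return left.join() + rightSum;   // wait for the left half and combine
        }
    }

    static long parallelSum(long[] values) {
        return ForkJoinPool.commonPool().invoke(new SumTask(values, 0, values.length));
    }

    public static void main(String[] args) {
        long[] values = new long[10_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(parallelSum(values)); // prints 50005000
    }
}
```

Forking one half and computing the other in the current thread keeps the worker busy instead of blocking it right away; whether that pattern fits the ported operators is something the tests would have to show.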
Maybe first, we can copy the stuff, add tests and then rewrite some stuff to modern Java 👍
Yeah, that was my thinking. This is the PR that ports the primary functionality.
@AlvinKuruvilla I've reverted a couple of minor things you changed from Sebastian's code (namely yield() and a final threadid). yield() is a method of an abstract class and can/should be overridden in subclasses, I guess, that's why it makes a difference :-)
Oh, okay, that makes sense. I'll merge those changes and get started on porting the tests
I don't think there are any test cases to port over from the branch. I think we will need to write our own
Do we have a way of benchmarking the improvements (and potentially the cases where it might cause perf degradation)?
I was thinking we should have a suite of benchmarks for the codebase too. It would be cool, in my opinion, to do some performance profiling.
But to answer your question I don't believe there is
I'm not sure whether he has committed everything, but he has benchmarked some stuff. Have a look at his thesis :)
The real question is also whether it works in all cases, as it lived in a branch and Sebastian never merged it.
Is his thesis listed somewhere? I don’t see it as a repository on his profile or in the brackit repository
It's listed under publications in the README: Separating Key Concerns in Query Processing - Set Orientation, Physical Data Independence, and Parallelism
I read through the paper, and there are about ten more .xq files that are not already in the repository and that we could use as part of a benchmarking suite. However, I was trying to see what brackit outputs when running against one of the available .xq files, and I keep getting this error:
Error: err:XPDY0002: Dynamic context variable fs:dot is not assigned a value
This is true for every .xq file I have put through:
java -jar target/brackit-0.1.11-SNAPSHOT-jar-with-dependencies.jar -q src/test/resources/join/forNestedFor.xq
Is there some sort of linter/checker tool like there is for JSON to verify xq syntax?
You have to use the file option, which is -qf. -q is used for query strings :-)
johannes@luna:~/data/mystuff/brackit$ java -jar target/brackit-0.1.11-SNAPSHOT-jar-with-dependencies.jar -qf src/test/resources/join/forNestedFor.xq
Query result:
2 3 5
Ahh, that was stupid. Can we update the README to add the -qf flag to the examples that read from files? I think it currently omits the flag. Also, does the XML file that the .xq file reads from need to be in the same directory? As it is currently set up, the XML file sits outside the directories containing the .xq files, but the queries don't use any relative path syntax.
Yeah, I think in general the README is the most important part, which has to be improved in case we find out that it's not sufficient to get a quick overview...
It looks like we also have a few XQuery functions to implement from his new XQuery files (this list isn't exhaustive, and new issues will probably arise after we tackle these):
- rel:parse-schema
- fn:ends-with
- bit:partition
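Of the three, fn:ends-with is the one with a standard definition, so here is a rough sketch of its core semantics in plain Java. It assumes the default codepoint collation, ignores the optional collation argument, and uses null to stand in for the empty sequence; EndsWithSketch is a hypothetical name and this does not use brackit's function or atomic-value classes.

```java
// Illustrative only: a plain-Java model of fn:ends-with, not wired into brackit.
final class EndsWithSketch {
    /**
     * Returns true if arg1 ends with arg2.
     * A null argument models the empty sequence, which is treated as the
     * zero-length string, so a missing or empty arg2 always yields true.
     */
    static boolean endsWith(String arg1, String arg2) {
        String value = (arg1 == null) ? "" : arg1;
        String suffix = (arg2 == null) ? "" : arg2;
        return value.endsWith(suffix);
    }

    public static void main(String[] args) {
        System.out.println(endsWith("tattoo", "tattoo")); // true
        System.out.println(endsWith("tattoo", "atto"));   // false
        System.out.println(endsWith(null, null));         // true
    }
}
```

The real implementation would still need to be registered as a function and handle argument atomization and collations, which this sketch leaves out.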
Regarding rel:parse-schema, from what I can tell of its usage context, it parses an XML schema file. I think that would, therefore, refer to an XML Schema Definition (XSD) file. To that end, I think this should be enough to be able to parse it. If not, I found other resources: