tdbloader2 requires GNU find and sort
danmichaelo opened this issue
Dan Michael O. Heggø commented
tdbloader2 throws some errors because the BusyBox versions of find and sort are missing some options:
- find is missing the -quit flag (from https://github.com/apache/jena/blob/master/apache-jena/bin/tdbloader2data#L236)
- sort is missing a --buffer-size option
See full output below.
After I did apk update && apk add coreutils findutils, it ran without errors.
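For anyone who wants to detect this before starting a load, here is a rough sketch of a pre-flight check. It only probes whether find accepts -quit and whether sort accepts --buffer-size, the two options reported as unrecognized in the output below. The function names are hypothetical and not part of Jena, and the package hints assume an Alpine image like the one used here.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical, not part of Jena.

# Succeeds if the given find binary (default: find) accepts the -quit action.
find_supports_quit() {
    "${1:-find}" . -maxdepth 0 -quit >/dev/null 2>&1
}

# Succeeds if the given sort binary (default: sort) accepts --buffer-size.
sort_supports_buffer_size() {
    printf 'b\na\n' | "${1:-sort}" --buffer-size=50% >/dev/null 2>&1
}

# Reports which tool is lacking; returns non-zero if either check fails.
# Optional arguments: $1 = find binary to test, $2 = sort binary to test.
check_loader_prereqs() {
    status=0
    if ! find_supports_quit "${1:-find}"; then
        echo "find lacks -quit; on Alpine try: apk add findutils" >&2
        status=1
    fi
    if ! sort_supports_buffer_size "${2:-sort}"; then
        echo "sort lacks --buffer-size; on Alpine try: apk add coreutils" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_loader_prereqs in an entrypoint script before tdbloader2 should surface the problem up front instead of partway through the index building phase.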
bash-4.3# /jena/bin/tdbloader2 --loc /fuseki/databases/ds /staging/bibbi-aut.ttl
13:05:13 INFO -- TDB Bulk Loader Start
find: unrecognized: -quit
BusyBox v1.24.2 (2017-01-18 14:13:46 GMT) multi-call binary.
Usage: find [-HL] [PATH]... [OPTIONS] [ACTIONS]
Search for files and perform actions on them.
First failed action stops processing of current file.
Defaults: PATH is current directory, action is '-print'
-L,-follow Follow symlinks
-H ...on command line only
-xdev Don't descend directories on other filesystems
-maxdepth N Descend at most N levels. -maxdepth 0 applies
actions to command line arguments only
-mindepth N Don't act on first N levels
-depth Act on directory *after* traversing it
Actions:
( ACTIONS ) Group actions for -o / -a
! ACT Invert ACT's success/failure
ACT1 [-a] ACT2 If ACT1 fails, stop, else do ACT2
ACT1 -o ACT2 If ACT1 succeeds, stop, else do ACT2
Note: -a has higher priority than -o
-name PATTERN Match file name (w/o directory name) to PATTERN
-iname PATTERN Case insensitive -name
-path PATTERN Match path to PATTERN
-ipath PATTERN Case insensitive -path
-regex PATTERN Match path to regex PATTERN
-type X File type is X (one of: f,d,l,b,c,...)
-perm MASK At least one mask bit (+MASK), all bits (-MASK),
or exactly MASK bits are set in file's mode
-mtime DAYS mtime is greater than (+N), less than (-N),
or exactly N days in the past
-mmin MINS mtime is greater than (+N), less than (-N),
or exactly N minutes in the past
-newer FILE mtime is more recent than FILE's
-inum N File has inode number N
-user NAME/ID File is owned by given user
-group NAME/ID File is owned by given group
-size N[bck] File size is N (c:bytes,k:kbytes,b:512 bytes(def.))
+/-N: file size is bigger/smaller than N
-links N Number of links is greater than (+N), less than (-N),
or exactly N
-prune If current file is directory, don't descend into it
If none of the following actions is specified, -print is assumed
-print Print file name
-print0 Print file name, NUL terminated
-exec CMD ARG ; Run CMD with all instances of {} replaced by
file name. Fails if CMD exits with nonzero
-exec CMD ARG + Run CMD with {} replaced by list of file names
-delete Delete current file/directory. Turns on -depth option
13:05:13 INFO Data Load Phase
13:05:13 INFO Got 1 data files to load
13:05:13 INFO Data file 1: /staging/bibbi-aut.ttl
INFO Load: /staging/bibbi-aut.ttl -- 2019/10/31 13:05:14 GMT
INFO Add: 50,000 Data (Batch: 40,584 / Avg: 40,584)
INFO Add: 100,000 Data (Batch: 81,967 / Avg: 54,288)
INFO Add: 150,000 Data (Batch: 133,689 / Avg: 67,689)
INFO Add: 200,000 Data (Batch: 76,452 / Avg: 69,686)
INFO Add: 250,000 Data (Batch: 110,132 / Avg: 75,210)
INFO Add: 300,000 Data (Batch: 140,449 / Avg: 81,521)
INFO Total: 334,010 tuples : 4.08 seconds : 81,925.43 tuples/sec [2019/10/31 13:05:18 GMT]
13:05:18 INFO Data Load Phase Completed
13:05:18 INFO Index Building Phase
13:05:18 INFO Creating Index SPO
13:05:18 INFO Sort SPO
sort: unrecognized option: buffer-size=50%
BusyBox v1.24.2 (2017-01-18 14:13:46 GMT) multi-call binary.
Usage: sort [-nrugMcszbdfimSTokt] [-o FILE] [-k start[.offset][opts][,end[.offset][opts]] [-t CHAR] [FILE]...
Sort lines of text
-b Ignore leading blanks
-c Check whether input is sorted
-d Dictionary order (blank or alphanumeric only)
-f Ignore case
-g General numerical sort
-i Ignore unprintable characters
-M Sort month
-n Sort numbers
-o Output to file
-t CHAR Field separator
-k N[,M] Sort by Nth field
-r Reverse sort order
-s Stable (don't sort ties alphabetically)
-u Suppress duplicate lines
-z Lines are terminated by NUL, not newline
-mST Ignored for GNU compatibility
13:05:18 ERROR Failed during data phase
Stian Soiland-Reyes commented
Thanks, hopefully fixed by #27 and the latest builds trickling in now.