kpu/preprocess
Corpus preprocessing
Stargazers: 93
Watchers: 8
Issues: 21
Forks: 21
kpu/preprocess Issues
Trouble Extracting Monolingual Datasets from SeamlessAlign (Closed, 6 months ago, 1 comment)
Error when reconstructing Seamless data (Closed, a year ago, 7 comments)
Fail downloading Seamless align data (Updated, a year ago, 1 comment)
Error when compiling with cmake: (Closed, a year ago, 2 comments)
-k from `cache` doesn't work when numbers are not sorted (Closed, 2 years ago, 3 comments)
b64filter: base64-encode & output documents on the go (Closed, 2 years ago, 1 comment)
Sentence splitter uses unbounded memory in -k mode (Updated, 3 years ago)
foldfilter still expects input even when the command is invalid (Closed, 3 years ago)
error: cannot convert ‘size_t* {aka long unsigned int*}’ to ‘int32_t* {aka int*}’ for argument ‘2’ to ‘UChar32 utf8_nextCharSafeBody_60(const uint8_t*, int32_t*, int32_t, UChar32, UBool)’ (Closed, 4 years ago, 5 comments)
Warning: Compatibility with CMake < 2.8.12 will be removed from a future version of CMake. (Closed, 4 years ago)
Compilation error if zlib is not installed (Closed, 4 years ago, 1 comment)
foldfilter breaks translation from language without spaces to language with spaces (Updated, 4 years ago, 4 comments)
Cache util::EndOfFileException (Closed, 4 years ago, 1 comment)
'unicode/stringpiece.h' file not found when running (Closed, 4 years ago, 4 comments)
Undefined reference to boost unit_test while using make (Updated, 4 years ago, 1 comment)
truecaser not identical to perl script (Updated, 4 years ago, 1 comment)
Corpus Tokenization (Closed, 5 years ago, 3 comments)
Error reporting for `cache` program (Closed, 5 years ago)
Error Cmake (Closed, 6 years ago, 2 comments)
Unknown CMake command "AddExes" (Closed, 6 years ago, 1 comment)
compile with bjam (Closed, 7 years ago, 1 comment)