The ugmatcha suite consists of several subprojects.
See each project for details.
java-ugmatcha-suite/trietree is available on GitHub Packages. (Japanese version)
To use it with Maven:

- Create a personal access token with the read:packages permission at https://github.com/settings/tokens
- Put your username and token in your ~/.m2/settings.xml file with a <server> tag.

      <settings>
        <servers>
          <server>
            <id>github</id>
            <username>USERNAME</username>
            <password>YOUR_PERSONAL_ACCESS_TOKEN_WITH_READ</password>
          </server>
        </servers>
      </settings>

- Add a repository to the repositories section of your project's pom.xml file.

      <repository>
        <id>github</id>
        <url>https://maven.pkg.github.com/koron/java-ugmatcha-suite</url>
      </repository>

- Add a <dependency> tag to your <dependencies> tag.

      <dependency>
        <groupId>net.kaoriya.ugmatcha</groupId>
        <artifactId>wikidict</artifactId>
        <version>0.0.3</version>
      </dependency>

Please also read the public documentation (Japanese).

To use it with Gradle:

- Create a personal access token with the read:packages permission at https://github.com/settings/tokens
- Put your username and token in your ~/.gradle/gradle.properties file.

      gpr.user=YOUR_USERNAME
      gpr.key=YOUR_PERSONAL_ACCESS_TOKEN_WITH_READ:PACKAGES

- Add a repository to the repositories section of your build.gradle file.

      maven {
          url = uri("https://maven.pkg.github.com/koron/java-ugmatcha-suite")
          credentials {
              username = project.findProperty("gpr.user") ?: System.getenv("USERNAME")
              password = project.findProperty("gpr.key") ?: System.getenv("TOKEN")
          }
      }

- Add an implementation line to your dependencies section.

      implementation 'net.kaoriya.ugmatcha:wikidict:0.0.3'

Please also read the public documentation (Japanese).
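
The credentials block in build.gradle takes the username and token from Gradle properties (gpr.user, gpr.key) first and falls back to the USERNAME and TOKEN environment variables. The lookup order can be sketched in plain Java as follows. This is only an illustration: the class and method names are hypothetical, and plain maps stand in for Gradle's project properties and the process environment.

```java
import java.util.Map;

// Hypothetical illustration of the fallback order; Gradle itself evaluates
// the Groovy expression `findProperty(...) ?: System.getenv(...)`.
final class CredentialLookup {
    static String resolve(Map<String, String> properties, String propertyKey,
                          Map<String, String> environment, String envKey) {
        String fromProperty = properties.get(propertyKey);
        // Groovy's ?: operator falls back when the left side is null or empty.
        if (fromProperty != null && !fromProperty.isEmpty()) {
            return fromProperty;
        }
        // May still be null if the environment variable is not set either.
        return environment.get(envKey);
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of("gpr.user", "alice");
        Map<String, String> env = Map.of("USERNAME", "bob", "TOKEN", "secret");
        System.out.println(resolve(props, "gpr.user", env, "USERNAME")); // alice
        System.out.println(resolve(props, "gpr.key", env, "TOKEN"));     // secret
    }
}
```

In practice this means a CI job can supply USERNAME and TOKEN as environment variables without needing a gradle.properties file.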
Place wikiwords.stt and wikiwords.stw in tmp/. Both files are created with https://github.com/koron/wpwordtool.
Place in.txt in tmp/.
$ ./gradlew wikidict:matchDemo -Pargs='../tmp/in.txt' > tmp/out.txt
The input data consists of the abstracts of all Japanese Wikipedia pages. See https://github.com/koron/wpwordtool#abstract-sub-command for details.
$ ./gradlew wikidict-benchmark:benchmarkMatcher -Pargs=../tmp/abstract.txt
benchmark with file:../tmp/abstract.txt
control:
  total: 0.504289 seconds
  average_per_line: 435 nanoseconds
  lineCount: 1157686
matcher:
  total: 11.130144 seconds
  average_per_line: 9614 nanoseconds
  lineCount: 1157686
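
The average_per_line figures are consistent with the total time divided by lineCount, truncated to whole nanoseconds. The sketch below reproduces that arithmetic from the numbers in the output. BenchmarkMath is a hypothetical helper, not part of the suite, and reading "control" as a baseline run without matching is an assumption.

```java
// Hypothetical helper for checking the benchmark arithmetic; not part of the ugmatcha suite.
final class BenchmarkMath {
    /** Average time per line in whole nanoseconds (integer division truncates). */
    static long averagePerLineNanos(long totalNanos, long lineCount) {
        if (lineCount <= 0) {
            throw new IllegalArgumentException("lineCount must be positive");
        }
        return totalNanos / lineCount;
    }

    public static void main(String[] args) {
        long lineCount = 1_157_686L;
        long controlTotalNanos = 504_289_000L;    // 0.504289 seconds
        long matcherTotalNanos = 11_130_144_000L; // 11.130144 seconds

        long control = averagePerLineNanos(controlTotalNanos, lineCount);
        long matcher = averagePerLineNanos(matcherTotalNanos, lineCount);

        System.out.println("control: " + control + " ns/line");         // 435
        System.out.println("matcher: " + matcher + " ns/line");         // 9614
        // Assumption: control is a baseline, so the difference approximates matching cost.
        System.out.println("net: " + (matcher - control) + " ns/line"); // 9179
    }
}
```

Under that assumption, matching itself costs roughly 9179 nanoseconds per line, and the matcher run takes about 22 times as long as the control run.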