ikawaha / kagome

Self-contained Japanese Morphological Analyzer written in pure Go

tokenizer.SysDicIPASimple() causes out of memory error

ikawaha opened this issue · comments

go 1.13.0, kagome 1.11.0, Debian 9.9 (Chromebook)

https://twitter.com/shibu_jp/status/1178466763995959296?s=20

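package main

import (
	"testing"

	"github.com/ikawaha/kagome/tokenizer"
)

// Both package-level variables are initialized before any test runs,
// so the dictionary is loaded even though TestSample never tokenizes.
var doc = tokenizer.SysDicIPASimple()
var kagomeTokenizer = tokenizer.NewWithDic(doc)

func TestSample(t *testing.T) {
	t.Log("hello")
}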

In my environment, the above code did not cause an error.
I'm looking for cases where similar errors occur.
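
For anyone trying to reproduce this, it may help to report how much the heap actually grows while the dictionary loads. The following is only a minimal sketch using the standard library: `heapGrowth` and `loadStandIn` are hypothetical names, and `loadStandIn` merely allocates 16 MiB as a placeholder. In a real check, the loader passed in would wrap `tokenizer.SysDicIPASimple()` instead.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapGrowth runs load and reports roughly how many bytes of live heap
// it added. The loaded value is returned so that it stays reachable
// while the second measurement is taken.
func heapGrowth(load func() interface{}) (interface{}, uint64) {
	var before, after runtime.MemStats

	runtime.GC() // drop garbage so the baseline reflects live objects only
	runtime.ReadMemStats(&before)

	v := load()

	runtime.GC()
	runtime.ReadMemStats(&after)

	if after.HeapAlloc < before.HeapAlloc {
		return v, 0
	}
	return v, after.HeapAlloc - before.HeapAlloc
}

// loadStandIn is a hypothetical placeholder for the real dictionary load.
// Assumption: replace its body with a call to tokenizer.SysDicIPASimple().
func loadStandIn() interface{} {
	return make([]byte, 16<<20)
}

func main() {
	v, grew := heapGrowth(loadStandIn)
	_ = v // keep the loaded value referenced
	fmt.Printf("heap grew by about %d MiB\n", grew>>20)
}
```

Posting this number together with the memory available on the machine (for example a Chromebook's Linux container) would make it easier to tell whether the load itself can exhaust memory.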

I can confirm that it does not cause an error in the environment below.

  • go 1.13.0, kagome 1.11.0, Debian 9.9 (Docker over macOS)

I think the error had another cause. +1 to closing this issue until a reproducible case turns up.


  • Log
$ tree
.
├── Dockerfile
├── go.mod
├── go.sum
├── main.go
└── main_test.go

0 directories, 5 files

$ docker build -t test:local .
....

$ docker run --rm test:local
Run main
BOS(0, 0)DUMMY[-1]
私(0, 1)KNOWN[304999]
は(1, 2)KNOWN[57061]
太郎(2, 4)KNOWN[181027]
です(4, 6)KNOWN[47492]
。(6, 7)KNOWN[98]
EOS(7, 7)DUMMY[-1]
Run test
ok  	kagome/sample	1.415s
$ docker run --rm --entrypoint cat test:local /etc/debian_version
9.9

$ docker --version
Docker version 20.10.7, build f0df350

$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.15.7
BuildVersion:	19H1217

Dockerfile
# Available Images see: https://golang.org/dl/
ARG VER_GO='1.13'
ARG VER_OS='9.9'

FROM debian:${VER_OS}

ARG VER_GO
ENV \
    GO111MODULE=on \
    PATH="${PATH}:/usr/local/go/bin"

# Install Go
RUN \
    apt update && \
    apt install -y wget && \
    name_archive="go${VER_GO}.linux-amd64.tar.gz" && \
    wget "https://golang.org/dl/${name_archive}" && \
    rm -rf /usr/local/go && \
    tar -C /usr/local -xzf "./${name_archive}" && \
    go version && \
    rm -rf "./${name_archive}"

COPY . /workspace

WORKDIR /workspace

RUN \
    go mod download

ENTRYPOINT echo 'Run main' && go run . && echo 'Run test' && go test .

go.mod / go.sum
module kagome/sample

go 1.13

require github.com/ikawaha/kagome v1.11.0

github.com/ikawaha/kagome v1.11.0 h1:mJ3W/SSDaDnmx1W2PaJsdTpab/mCeRgp586jXuYoh3Y=
github.com/ikawaha/kagome v1.11.0/go.mod h1:eEV1yEy8Hm2eJRMz6nU1OlbrafRqXTECbsmm9aUMX2s=

main.go / main_test.go
package main

import (
	"fmt"

	"github.com/ikawaha/kagome/tokenizer"
)

var doc = tokenizer.SysDicIPASimple()
var kagomeTokenizer = tokenizer.NewWithDic(doc)

func main() {
	Sample()
}

func Sample() {
	text := "私は太郎です。"
	tokens := kagomeTokenizer.Tokenize(text)

	for _, token := range tokens {
		fmt.Printf("%v\n", token)
	}
}

package main

import (
	"testing"
)

func TestSample(t *testing.T) {
	t.Log("hello")
	Sample()
}
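
Both listings above load the dictionary in package-level variables, so the load happens during package initialization, before `main` or any test runs. If someone does hit the error, deferring the load to first use is one way to check whether that initialization step is involved. This is only a sketch under assumptions: `dict`, `loadDict`, `getDict` and `loadCount` are hypothetical names and are not part of kagome. In real code, `loadDict` would call `tokenizer.SysDicIPASimple()`.

```go
package main

import (
	"fmt"
	"sync"
)

// dict is a hypothetical stand-in for the value returned by
// tokenizer.SysDicIPASimple(); it is not a kagome type.
type dict struct {
	entries []string
}

var (
	dictOnce  sync.Once
	sharedDic *dict
	loadCount int // how many times the loader ran, for illustration only
)

// loadDict is a hypothetical loader standing in for the real dictionary load.
func loadDict() *dict {
	loadCount++
	return &dict{entries: []string{"私", "は", "太郎", "です", "。"}}
}

// getDict loads the dictionary on the first call and returns the same
// value on every later call.
func getDict() *dict {
	dictOnce.Do(func() {
		sharedDic = loadDict()
	})
	return sharedDic
}

func main() {
	fmt.Println("loads before first use:", loadCount)
	getDict()
	getDict()
	fmt.Println("loads after two calls:", loadCount)
}
```

With this shape, a test that only calls `t.Log` never triggers the load, which separates "the dictionary load fails" from "something else in the test run fails".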