liuwei3230 / ac

golang Aho-Corasick for byte strings

ac

Golang implementation of Aho-Corasick for rapid substring matching on either byte strings or ASCII strings.

This is based on the excellent library cloudflare/ahocorasick (BSD License). The fork and its changes were needed for specific application uses that are incompatible with the original library. Some other minor optimizations around memory and setup were also made.

Examples

  • FindAllString
m := ac.MustCompileString([]string{"Superman", "uperman", "perman", "erman"})
matches := m.FindAllString("The Man Of Steel: Superman")
fmt.Println(matches)

Output:

[Superman uperman perman erman]

  • MatchString
m := ac.MustCompileString([]string{"Superman", "uperman", "perman", "erman"})
contains := m.MatchString("The Man Of Steel: Superman")
fmt.Println(contains)

Output:

true
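
For readers unfamiliar with the algorithm behind these calls, the following is a minimal, self-contained sketch of Aho-Corasick matching over byte strings. It is not the code of this package: the names Matcher, Compile, FindAll, node and step are hypothetical, and the sketch uses per-state maps for clarity rather than the memory-conscious layout a real implementation would likely use. It also assumes each dictionary entry should be reported at most once.

```go
package main

import "fmt"

// node is one automaton state. All names in this sketch are hypothetical.
type node struct {
	next map[byte]int // goto transitions, keyed by input byte
	fail int          // state for the longest proper suffix that is also a trie prefix
	out  []int        // indexes of dictionary entries that end at this state
}

// Matcher is a toy automaton, not the ac package's matcher type.
type Matcher struct {
	dict  []string
	nodes []node
}

// step follows failure links from state s until byte c can be consumed.
func (m *Matcher) step(s int, c byte) int {
	for {
		if t, ok := m.nodes[s].next[c]; ok {
			return t
		}
		if s == 0 {
			return 0
		}
		s = m.nodes[s].fail
	}
}

// Compile builds a trie from dict, then adds failure links breadth-first.
func Compile(dict []string) *Matcher {
	m := &Matcher{dict: dict, nodes: []node{{next: map[byte]int{}}}}
	for i, word := range dict {
		if word == "" {
			continue // an empty entry would match everywhere; skip it
		}
		s := 0
		for j := 0; j < len(word); j++ {
			t, ok := m.nodes[s].next[word[j]]
			if !ok {
				t = len(m.nodes)
				m.nodes = append(m.nodes, node{next: map[byte]int{}})
				m.nodes[s].next[word[j]] = t
			}
			s = t
		}
		m.nodes[s].out = append(m.nodes[s].out, i)
	}
	// A state's failure target is always shallower, so it is already
	// complete when the state is reached in breadth-first order.
	queue := []int{0}
	for len(queue) > 0 {
		s := queue[0]
		queue = queue[1:]
		for c, t := range m.nodes[s].next {
			if s != 0 { // children of the root keep fail == 0
				f := m.step(m.nodes[s].fail, c)
				m.nodes[t].fail = f
				// Inherit matches that end at the failure target.
				m.nodes[t].out = append(m.nodes[t].out, m.nodes[f].out...)
			}
			queue = append(queue, t)
		}
	}
	return m
}

// FindAll returns each dictionary entry found in text, in order of first match.
func (m *Matcher) FindAll(text string) []string {
	var found []string
	seen := make([]bool, len(m.dict))
	s := 0
	for i := 0; i < len(text); i++ {
		s = m.step(s, text[i])
		for _, idx := range m.nodes[s].out {
			if !seen[idx] {
				seen[idx] = true
				found = append(found, m.dict[idx])
			}
		}
	}
	return found
}

func main() {
	m := Compile([]string{"Superman", "uperman", "perman", "erman"})
	fmt.Println(m.FindAll("The Man Of Steel: Superman"))
}
```

The input is scanned once, byte by byte. The failure links are what let overlapping entries such as "uperman" and "erman" be reported without rescanning the text.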

ac/acascii for pure ASCII matching

The ac/acascii package assumes the dictionary consists entirely of ASCII characters (byte values 1-127), with no NULL bytes. During setup, this results in:

  • 50% fewer memory allocations
  • 50% less memory usage
  • 50% less CPU time

as compared to the plain ac package.
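
Since ac/acascii relies on that assumption, a caller may want to check a dictionary before choosing between the two packages. The sketch below shows one way to do so. The helpers isPureASCII and firstNonASCII are hypothetical and not part of either package, and how acascii itself reacts to an invalid dictionary is not covered here.

```go
package main

import "fmt"

// isPureASCII reports whether every byte of s lies in the range 1-127,
// that is, s is ASCII and contains no NULL byte. (Hypothetical helper.)
func isPureASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if c := s[i]; c == 0 || c > 127 {
			return false
		}
	}
	return true
}

// firstNonASCII returns the index of the first dictionary entry that
// violates the assumption, or -1 if every entry is acceptable.
func firstNonASCII(dict []string) int {
	for i, word := range dict {
		if !isPureASCII(word) {
			return i
		}
	}
	return -1
}

func main() {
	dict := []string{"Superman", "uperman", "perman", "erman"}
	if i := firstNonASCII(dict); i >= 0 {
		fmt.Printf("entry %d is not pure ASCII; use the plain ac package\n", i)
		return
	}
	fmt.Println("dictionary is pure ASCII; ac/acascii is an option")
}
```

The check works on bytes rather than runes, so any multi-byte UTF-8 character is rejected because its bytes are all above 127.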

IN PROGRESS

  • Support for ASCII case-insensitive matching.

License: BSD 3-Clause "New" or "Revised" License

