a-zb / nim-aho-corasick

Aho–Corasick string matching algorithm

nim-aho-corasick

On why, see: http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm

or read about its use at Cloudflare: at nginx.conf 2014, John Graham-Cumming discussed how the Cloudflare WAF uses it together with Lua and Nginx.

http://www.scalescale.com/scaling-cloudflares-massive-waf/

On how:

Create the tree and initialize it

  var ac = AhoCorasick(rootValue: "")
  ac.initialize()

Create a dictionary of words and build it

  for w in @["a", "ab", "bc", "bca", "c", "caa"]:
    ac.add(w)
  ac.build()
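
This README does not show what `add` and `build` do internally. As a rough illustration of the technique only, here is a minimal, self-contained sketch of a textbook Aho–Corasick automaton: words go into a trie, then a breadth-first pass computes a failure link for each node (the longest proper suffix that is also a trie prefix). All names here (`Node`, `newNode`, `addWord`, `buildLinks`, `findAll`) are hypothetical and are not part of the nim-aho-corasick API. The library may be implemented differently.

```nim
import std/[tables, deques]

type
  Node = ref object              # hypothetical type, not the library's
    next: Table[char, Node]      # trie edges
    fail: Node                   # failure link (nil for the root)
    outputs: seq[string]         # words ending here, longest first

proc newNode(): Node =
  Node(next: initTable[char, Node]())

proc addWord(root: Node, word: string) =
  ## Insert one dictionary word into the trie.
  var cur = root
  for ch in word:
    if ch notin cur.next:
      cur.next[ch] = newNode()
    cur = cur.next[ch]
  cur.outputs.add(word)

proc buildLinks(root: Node) =
  ## Breadth-first pass: shallower nodes are finished before deeper ones,
  ## so a node's failure target already has its complete output list.
  var queue = initDeque[Node]()
  for child in root.next.values:
    child.fail = root
    queue.addLast(child)
  while queue.len > 0:
    let cur = queue.popFirst()
    for ch, child in cur.next.pairs:
      var f = cur.fail
      while f != nil and ch notin f.next:
        f = f.fail
      child.fail = if f == nil: root else: f.next[ch]
      # Inherit the words that end at the failure target (shorter suffixes).
      child.outputs.add(child.fail.outputs)
      queue.addLast(child)

proc findAll(root: Node, text: string): seq[string] =
  ## Scan the text once, following failure links on a mismatch.
  var cur = root
  for ch in text:
    while cur != root and ch notin cur.next:
      cur = cur.fail
    if ch in cur.next:
      cur = cur.next[ch]
    result.add(cur.outputs)

when isMainModule:
  let root = newNode()
  for w in ["a", "ab", "bc", "bca", "c", "caa"]:
    root.addWord(w)
  root.buildLinks()
  # Same dictionary and text as the search example in this README.
  doAssert root.findAll("abccab") == @["a", "ab", "bc", "c", "c", "a", "ab"]
```

Because failure links are computed once in the build step, the search reads each character of the text a single time regardless of how many words are in the dictionary.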

Search for matches

  var matches: seq[string] = ac.match("abccab")
  if matches == @["a", "ab", "bc", "c", "c", "a", "ab"]:
    echo("success")
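
The expected list above is consistent with reporting matches by the position where each word ends in the text and, for words ending at the same position, longest first (for example `"bc"` and `"c"` both end at the third character). Whether the library guarantees this order is an assumption. The brute-force sketch below reproduces the list without any automaton and can serve as a cross-check. `naiveMatches` is a hypothetical helper, not part of nim-aho-corasick.

```nim
import std/algorithm

proc naiveMatches(words: openArray[string], text: string): seq[string] =
  ## For every end position, collect the dictionary words that end there,
  ## ordered longest first. Quadratic; meant only for checking small cases.
  for stop in 1 .. text.len:
    var found: seq[string]
    for w in words:
      let start = stop - w.len
      if start >= 0 and text[start ..< stop] == w:
        found.add(w)
    found.sort(proc (a, b: string): int = cmp(b.len, a.len))
    result.add(found)

when isMainModule:
  let words = ["a", "ab", "bc", "bca", "c", "caa"]
  doAssert naiveMatches(words, "abccab") ==
    @["a", "ab", "bc", "c", "c", "a", "ab"]
```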

Simple search example

  var ac = AhoCorasick(rootValue: "")
  ac.initialize()
  # Dictionary built with the words:
  for w in @["monkey", "was", "time", "lava"]:
    ac.add(w)
  ac.build()
  # Searching with this phrase will return the matches @["time", "was", "monkey"]
  echo ac.match("In the time of chimpanzees I was a monkey.")

License: MIT License

Language: Nim 100.0%