danwdart / zepper

Page grouping tool

Zepper

Naming

It's a page grouping tool. My stupid pun-based naming scheme is thus:

Page group -> Jimmy Page group -> Led Zeppelin -> Zepper

How to use it

Call it with a website name. It will spider the website, index its pages, and group similar-looking pages together.

Where possible, it will also try to name each group.

How it works

Nothing terribly fancy: it computes the string similarity between each page's tag list and another page's, then decides whether the two page structures are similar enough to belong in the same group.
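To make the idea concrete, here is a minimal sketch of that approach. It is not Zepper's actual code: the function names (`tagList`, `editDistance`, `similarity`, `groupPages`), the regex-based tag extraction, the use of edit distance as the similarity measure, and the `0.8` threshold are all assumptions chosen for illustration.

```javascript
// Hypothetical helpers for illustration; none of these names come from Zepper.

// Reduce an HTML string to the ordered list of its opening tag names.
// A regex is a rough stand-in for a real HTML parser here.
function tagList(html) {
  const tags = [];
  const re = /<([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = re.exec(html)) !== null) {
    tags.push(match[1].toLowerCase());
  }
  return tags;
}

// Levenshtein edit distance between two tag sequences (assumed metric).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Normalise the distance to a score between 0 (different) and 1 (identical).
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - editDistance(a, b) / longest;
}

// Put each page into the first group whose representative tag list is
// similar enough, or start a new group. The threshold is an assumption.
function groupPages(pages, threshold = 0.8) {
  const groups = [];
  for (const [url, html] of Object.entries(pages)) {
    const tags = tagList(html);
    const group = groups.find((g) => similarity(g.tags, tags) >= threshold);
    if (group) group.urls.push(url);
    else groups.push({ tags, urls: [url] });
  }
  return groups;
}

// Made-up example pages, keyed by path.
const pages = {
  '/post/1': '<html><body><h1>A</h1><p>x</p><p>y</p></body></html>',
  '/post/2': '<html><body><h1>B</h1><p>x</p><p>y</p><p>z</p></body></html>',
  '/about': '<html><body><table><tr><td>1</td></tr></table></body></html>',
};
const groups = groupPages(pages);
console.log(groups.map((g) => g.urls));
```

With these example pages, the two post pages differ by a single tag and end up in one group, while `/about` starts a group of its own. Comparing each page only against a group's first member keeps the sketch short; a real tool might compare against every member or an average.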

TODO

Feedback loop for machine learning.

Languages

JavaScript 100.0%