koaning / tokenwiser

Bag of, not words, but tricks!

Home Page: https://koaning.github.io/tokenwiser/

tokenwiser

This project contains a couple of "tricks" on tokens. It's a collection of tricks for sparse data that can also be trained on a stream of data.
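
To make "sparse data trained on a stream" a bit more concrete, here is a minimal sketch in plain Python. It is not tokenwiser's API: `featurize`, `OnlineLogReg`, `N_BUCKETS` and the toy `stream` are all hypothetical names made up for this illustration. It assumes the hashing trick for features (so no vocabulary has to be known up front) and a logistic regression that gets updated one example at a time.

```python
import math
import zlib

N_BUCKETS = 2 ** 12  # size of the hashed feature space (an arbitrary choice)


def featurize(text):
    """Hashing trick: map each token to a bucket index, no vocabulary needed."""
    counts = {}
    for token in text.lower().split():
        idx = zlib.crc32(token.encode("utf-8")) % N_BUCKETS
        counts[idx] = counts.get(idx, 0.0) + 1.0
    return counts  # sparse vector stored as {index: count}


class OnlineLogReg:
    """Logistic regression updated one example at a time with plain SGD."""

    def __init__(self, lr=0.5):
        self.lr = lr
        self.weights = {}  # only buckets that were seen get an entry
        self.bias = 0.0

    def predict_proba(self, features):
        z = self.bias + sum(self.weights.get(i, 0.0) * v for i, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, features, label):
        # Gradient of the log loss with respect to the linear score z.
        error = self.predict_proba(features) - label
        self.bias -= self.lr * error
        for i, v in features.items():
            self.weights[i] = self.weights.get(i, 0.0) - self.lr * error * v


# Toy labelled stream (made up for this sketch): 1 = positive, 0 = negative.
stream = [
    ("great fun movie", 1),
    ("boring awful movie", 0),
    ("fun and great", 1),
    ("awful and boring", 0),
]

model = OnlineLogReg()
for _ in range(20):  # replay the tiny stream a few times
    for text, label in stream:
        model.partial_fit(featurize(text), label)
```

Because both the features and the weights live in dictionaries, memory only grows with the buckets that actually show up, and the model never needs to see the full dataset at once.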

While exploring these tricks was super fun, I do feel like there are plenty of better alternatives to the ideas I explore here. In the end, TF-IDF + LogReg can be "fine" for a bunch of tasks that don't require embeddings.

And for embeddings ... there's embetter.

So I archived this repo. Bit of a shame, because I really liked the name of this package.

License: Apache License 2.0

Languages

Python 99.2%, Makefile 0.8%