vgherard / sbo

Utilities for training and evaluating text predictors based on Stupid Back-off N-gram models.

Rethinking the structure of `kgram_freqs` and `sbo_preds` objects

vgherard opened this issue · comments

  1. From a user-interface point of view, components such as `n`, `L`, or `lambda` would more naturally be attributes than list elements.

  2. A simple list (rather than a matrix) might be a better fit for the actual `kgram_freqs` and prediction tables. This could help solve the first part of #10; it would also allow run-length-encoded (RLE) word sequences and regular sequences to be stored in a single object.
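
To make the two points concrete, here is a minimal base-R sketch. It is not the actual sbo implementation: the class name `kgram_freqs_sketch`, the element names, and the toy values are all hypothetical, chosen only to contrast metadata stored as list elements with metadata stored as attributes, and to show that a plain list can hold an `rle` object next to a regular vector.

```r
# Hypothetical toy data, not the real sbo internals.
toy_counts <- c(the = 5L, cat = 2L)

# Metadata as a list element: `n` sits next to the data.
freqs_as_list <- list(n = 3L, counts = toy_counts)
freqs_as_list$n

# Metadata as an attribute: the list holds only the data.
freqs_with_attr <- structure(
  list(counts = toy_counts),
  n = 3L,                         # order of the model, kept as an attribute
  class = "kgram_freqs_sketch"    # hypothetical class name
)
attr(freqs_with_attr, "n")
length(freqs_with_attr)           # 1: only the data element is counted

# A plain list can mix an RLE-encoded sequence with a regular one,
# which a matrix (single type, rectangular shape) cannot do.
seqs <- list(
  encoded = rle(c("a", "a", "a", "b")),  # base R run-length encoding
  regular = c("a", "b", "c")
)
decoded <- inverse.rle(seqs$encoded)     # back to c("a", "a", "a", "b")
```

With the attribute-based layout, iterating over the object touches only the frequency data, while `attr()` (or a dedicated accessor, if one were added) retrieves the model parameters.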

The second point requires more thought, so I am opening a separate issue (#19) for it now.