citadel-ai / langcheck

Simple, Pythonic building blocks to evaluate LLM applications.

Home Page: https://langcheck.readthedocs.io/en/latest/index.html

Eliminate unexpected spaces introduced before periods by `TreebankWordDetokenizer().detokenize()`.

Alnusjaponica opened this issue

Motivation

Resolve #43 (review).

Description

As outlined in nltk/nltk#3210, `TreebankWordDetokenizer().detokenize()` introduces an unnecessary space before a period when periods are treated as independent tokens. This issue aims to work around the problem on our side until the fix lands in NLTK.
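
One possible workaround is to post-process the detokenized string with a regular expression. The sketch below is only an illustration: the helper name `strip_space_before_period` is hypothetical and is not part of langcheck or NLTK. It assumes the unwanted whitespace is spaces or tabs directly before a period that ends a sentence, meaning the period is followed by whitespace or the end of the string.

```python
import re

# Hypothetical helper, not the actual langcheck fix.
# Matches spaces/tabs followed by a period, but only when that period is
# followed by whitespace or the end of the string. Decimals such as " .5"
# and ellipses such as " ..." are left untouched.
_SPACE_BEFORE_PERIOD = re.compile(r"[ \t]+\.(?=\s|$)")


def strip_space_before_period(text: str) -> str:
    """Remove spaces or tabs that appear directly before a sentence-final period."""
    return _SPACE_BEFORE_PERIOD.sub(".", text)
```

The helper could be applied to the string returned by `TreebankWordDetokenizer().detokenize(tokens)`. Because it only rewrites text that matches the pattern, it should be a no-op on output that is already correct, so it would not need to be removed immediately once NLTK is fixed.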