diasks2 / pragmatic_segmenter

Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.

Quotation mark at the beginning of a sentence breaks segmentation

arp opened this issue

Example:

 require 'pragmatic_segmenter'

 text = '"These should be two different sentences. Of course."'
 s = PragmaticSegmenter::Segmenter.new(text: text)
 s.segment

 # RETURNS:
 ["\"These should be two different sentences. Of course.\""]

 # SHOULD RETURN:
 ["\"These should be two different sentences.", "Of course.\""]
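
To make the expected result concrete without depending on the gem, here is a minimal sketch of a naive splitter. `naive_split` is a hypothetical helper written for this report, not part of PragmaticSegmenter, and it says nothing about how the gem's rules work. It only shows that breaking after the first period gives the two segments listed under SHOULD RETURN, which could serve as the expectation in a regression test.

```ruby
# Hypothetical helper, NOT part of PragmaticSegmenter: a naive splitter used
# only to pin down the expected output for the input in this report.
def naive_split(text)
  # Split on whitespace that directly follows a sentence terminator (. ! ?).
  text.split(/(?<=[.!?])\s+/)
end

text = '"These should be two different sentences. Of course."'
expected = ["\"These should be two different sentences.", "Of course.\""]

segments = naive_split(text)
puts segments.inspect
puts segments == expected
```

A splitter this simple would break on abbreviations and similar cases, so it is only meant to illustrate the expected segments for this one input.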