hunspell / hyphen

segfault in hnj_hyphen_hyphword

dimztimz opened this issue

Forwarding issue hunspell/hunspell#12. Closing there.

In the hnj_hyphen_hyphword function, the length of the hyphenated word is defined as the initial word length plus 5:

void hnj_hyphen_hyphword(const char * word, int l, const char * hyphens,
    char * hyphword, char *** rep, int ** pos, int ** cut)
{
    int hyphenslen = l + 5;

Then, if a compound hyphen is found, we have:

if (*rep && *pos && *cut && (*rep)[i]) {
    size_t offset = j - (*pos)[i] + 1;
    strncpy(hyphword + offset, (*rep)[i], hyphenslen - offset - 1);
    hyphword[hyphenslen-1] = '\0';
    j += strlen((*rep)[i]) - (*pos)[i];
    i += (*cut)[i] - (*pos)[i];
} else hyphword[++j] = '=';

This causes problems when a word contains more than 5 hyphens followed by a compound hyphen.
In the best case the hyphenated word is truncated. In the worst case there is a segmentation fault: the strncpy size argument (hyphenslen - offset - 1) would be negative, and because offset is a size_t the result wraps around to a very large unsigned value.
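
The arithmetic behind the crash can be shown in isolation. The following is only a sketch: copy_limit_unchecked is a hypothetical name, not a function in the hyphen source, and it simply mirrors the size expression passed to strncpy. It assumes a platform where size_t is at least as wide as int, which is the usual case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the strncpy size expression from the issue.
 * offset is a size_t, so hyphenslen is converted to size_t and the whole
 * subtraction is unsigned: it cannot become negative, it wraps around. */
static size_t copy_limit_unchecked(int hyphenslen, size_t offset)
{
    assert(hyphenslen > 0);           /* l + 5 is positive for any word */
    return hyphenslen - offset - 1;   /* same expression as in the issue */
}
```

With hyphenslen = 12 and offset = 5 the limit is 6, as intended. With offset = 12 the mathematical result is -1, which as a size_t is the largest representable value, so strncpy is told it may write far beyond the end of hyphword.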

To reproduce:
echo "mamamamamamammammamamamaasszony" | ./example /usr/share/hyphen/hyph_hu_HU.dic /dev/stdin
Segmentation fault

or a more practical example:
echo "application-to-be-used-with-not-so-many" | ./example /usr/share/hyphen/hyph_en_US.dic /dev/stdin
Segmentation fault
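
One possible mitigation, sketched here and not taken from the hyphen source, is to check the remaining space before copying the replacement string. The helper name bounded_insert and its cap parameter (the full size of the destination buffer in bytes) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libhyphen.
 * Copies src into dst starting at offset, never writes past dst[cap - 1],
 * and leaves dst NUL-terminated whenever cap > 0.
 * Returns the number of bytes copied (0 if there is no room). */
static size_t bounded_insert(char *dst, size_t cap, size_t offset,
                             const char *src)
{
    assert(dst != NULL && src != NULL);
    if (cap == 0)
        return 0;
    if (offset >= cap - 1) {          /* no room left: refuse to copy */
        dst[cap - 1] = '\0';
        return 0;
    }
    size_t room = cap - 1 - offset;   /* cannot wrap: offset < cap - 1 */
    size_t n = strlen(src);
    if (n > room)
        n = room;                     /* truncate instead of overflowing */
    memcpy(dst + offset, src, n);
    dst[offset + n] = '\0';
    return n;
}
```

This only turns the crash into truncation. Avoiding truncation as well would require sizing the buffer from the actual number of hyphens and replacement lengths rather than a fixed margin of 5.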