bramstein / hypher

A fast and small JavaScript hyphenation engine

Words that include an umlaut are not being hyphenated

deboerk opened this issue

Hi,

I have the following issue with your great hypher script: Words that contain an umlaut don't get hyphenated. Here is an example:

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <script type="text/javascript" src="jquery.js"></script>
    <script type="text/javascript" src="jquery.hypher.js"></script>
    <script type="text/javascript" src="de.js"></script>
    <style>
      /* Very narrow container so that every word has to break */
      #foo { width: 5px; }
    </style>
  </head>
  <body>
    <div id="foo">müsse musse</div>
    <script>
      $("#foo").hyphenate("de");
    </script>
  </body>
</html>

The word "musse" is hyphenated as expected ("mus-se"), whereas "müsse" is not hyphenated at all ("müsse"). I tried to add an exception (müs‧se), but that didn't help. All my files are encoded in UTF-8. Can you help me out? Thank you!

Regards
deboerk

Same issue:

 Hypher.languages.de.hyphenateText("sozioökonomisch").replace(/\u00AD/g, "|")
 "so|zioöko|no|misch"

 Hypher.languages.de.hyphenateText("Kostenschätzungen").replace(/\u00AD/g, "|")
 "Kos|tenschätzun|gen"

The hyphenations look better when removing the umlauts:

 Hypher.languages.de.hyphenateText("soziookonomisch").replace(/\u00AD/g, "|")
 "so|zio|o|ko|no|misch"

 Hypher.languages.de.hyphenateText("Kostenschatzungen").replace(/\u00AD/g, "|")
 "Kos|ten|schat|zun|gen"

In the original patterns file http://tug.org/svn/texhyphen/trunk/collaboration/repository/hyphenator/de_1996.js?view=markup there is a property specialChars : 'ßàáâäçèéêëíñóôöü' which is omitted in de.js and is not used in jquery.hypher.js. Maybe this is related.
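To illustrate how a character list like specialChars could be put to use, here is a minimal sketch that builds a word-matching regular expression from ASCII letters plus the extra characters. The helper name buildWordRegex and the sample sentence are assumptions made for this example; this is not how jquery.hypher.js or the patterns file actually use the property.

```javascript
// Hypothetical helper, not part of hypher: build a regex that matches
// whole words made of ASCII letters plus a language's extra characters.
function buildWordRegex(specialChars) {
  // Escape the few characters that are special inside a character class.
  var escaped = specialChars.replace(/[\\\]\^-]/g, "\\$&");
  // The "i" flag lets uppercase forms such as "Ä" match as well.
  return new RegExp("[A-Za-z" + escaped + "]+", "gi");
}

// Value quoted from de_1996.js in the comment above.
var germanSpecialChars = "ßàáâäçèéêëíñóôöü";
var wordRegex = buildWordRegex(germanSpecialChars);

// Words containing umlauts stay in one piece.
var words = "Die Kostenschätzungen müsse man prüfen.".match(wordRegex);
console.log(words);
```

With a pattern like this, each word with an umlaut would reach the hyphenation patterns as a single unit instead of being cut at the special character.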

Sorry for the late response. This is indeed due to the special characters. I forgot that JavaScript's RegEx implementation has very limited Unicode support: shorthand classes such as \w and the \b word boundary only recognize ASCII characters, so umlauts are not treated as part of a word. I've created a new pull request (#15) that attempts to fix this. Let me know if it solves the problem for you.
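The ASCII-only behaviour of the regex shorthands is easy to reproduce in isolation. The sketch below assumes, purely for illustration, a tokenizer that splits text on \b; the function name splitOnWordBoundaries is hypothetical and is not taken from the hypher source.

```javascript
// "\u00FC" is the precomposed character "ü".
// \w only covers [A-Za-z0-9_], so "ü" does not count as a word character.
console.log(/\w/.test("u"));      // true
console.log(/\w/.test("\u00FC")); // false

// Hypothetical tokenizer for illustration: split on word boundaries.
function splitOnWordBoundaries(text) {
  return text.split(/\b/);
}

var plain = splitOnWordBoundaries("musse");
var umlaut = splitOnWordBoundaries("m\u00FCsse"); // "müsse"

console.log(plain);  // ["musse"]
console.log(umlaut); // ["m", "ü", "sse"]
```

Because \b sees a boundary on both sides of the umlaut, the word is broken into short fragments. A hyphenator that processes such fragments separately would have little left to hyphenate, which is consistent with the outputs reported above.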

Yep, it looks better now. Thanks a lot for the fix!

Thanks for testing! This is now released as v0.2.1. Thanks again both!