nijel / enca

Extremely Naive Charset Analyser

Home Page: https://cihar.com/software/enca/

Additional languages that are already supported

pecinko opened this issue · comments

Hello,

I have found out that ENCA actually already supports additional languages, although it's missing their language tags at the moment:

  • Serbian (Latin), Bosnian, and Montenegrin (Latin) are detected properly if the "croatian" language tag is used
  • Serbian (Cyrillic), Macedonian, and Montenegrin (Cyrillic) are converted if the "bulgarian" language tag is used for detection

It would be great if you would consider adding the above languages as separate -L language tags.

Thanks a lot, Peca

Patches are welcome, as I really have no knowledge of these languages.

I probably did not explain it well enough.
Croatian = Serbian (Latin) = Bosnian = Montenegrin. Practically, we are talking about the same language. Everything is already in place; the only thing missing is a -L language flag for these names, so that this works:
enconv -L serbian srbske.srt -x utf-8

because this already works:

pedja-mac:test pecinko$ enca -L croatian srbske.srt
MS-Windows code page 1250
pedja-mac:test pecinko$ enca -L croatian bosenske.srt
MS-Windows code page 1250

In short, the diacritic characters are laid out the same way in Windows-1250. I would be happy to provide subtitles for testing if there is interest in implementing this. I have been converting Czech and Serbian subtitles with enca for more than a year without any problem (thanks a lot!); I just have to pass -L croatian for Serbian Latin and -L bulgarian for Serbian Cyrillic.
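The workaround described above can be sketched as a small shell helper. This is only an illustration, not part of enca: the function name `enca_lang_tag` and the input names such as `serbian-latin` are hypothetical. The only things taken from this thread are the two tags `croatian` and `bulgarian` and the `-L` / `-x utf-8` options.

```shell
#!/bin/sh
# Hypothetical helper (not part of enca): map a language name to a tag
# that, according to this thread, enca already detects correctly.
enca_lang_tag() {
    case "$1" in
        serbian-latin|bosnian|montenegrin-latin)
            # Latin-script variants are reported to work with "croatian".
            printf '%s\n' croatian ;;
        serbian-cyrillic|macedonian|montenegrin-cyrillic)
            # Cyrillic-script variants are reported to work with "bulgarian".
            printf '%s\n' bulgarian ;;
        *)
            # Any other name is passed through unchanged.
            printf '%s\n' "$1" ;;
    esac
}

# Usage (requires enca/enconv to be installed), mirroring the command above:
#   enconv -L "$(enca_lang_tag serbian-latin)" srbske.srt -x utf-8
```

A wrapper like this only keeps the mapping in one place; it does not change what enca detects.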

Ok, added aliases to these.

Thanks a lot!

On 11 Feb 2014, at 14:59, Michal Čihař notifications@github.com wrote:

Ok, added aliases to these.

