batterseapower / libcharsetdetect

A dependency-free C interface to the Mozilla Universal Character Set Detector

Home Page: http://mxr.mozilla.org/seamonkey/source/extensions/universalchardet/

Request to verify your library behaviour

termopro opened this issue

I am using a Python wrapper for your library, and it fails under certain conditions with a segmentation fault.

I tried to verify your module's behavior under Ubuntu 12 by compiling the module and 'example.c' with GCC exactly as described in your Readme.md. This failed, however, because the application couldn't find some *.so file.

Anyway, I am asking you to verify whether your module can correctly detect the encoding of the HTML document I've included below. I strongly suspect that it will fail.

Please be so kind as to detect the encoding of the following document:
https://mega.co.nz/#!5sd0lBxA!RQ61_jJwWiw_mwSryAvBpG8US71e_O-TIWYEu_9LQro

This is a document saved from the URL 'http://www.balbro.com'.

P.S. Here is the link to the bug I am referring to:
PyYoshi/cChardet#4

The problem was with the wrapper, not with libcharsetdetect: PyYoshi/cChardet#7