Joungkyun / python-chardet

Python Universal Character Encoding Detector C-binding module (Compatible and faster than py-chardet)

CHARDET extension v2

COPYRIGHT AND LICENCE

Copyright 2021. JoungKyun.Kim all rights reserved.

Version: MPL 1.1

The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.mozilla.org/MPL/

Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

DESCRIPTION

This is a Python extension that serves as a frontend to libchardet.

libchardet is based on the Mozilla Universal Charset Detector library and detects the character set used to encode data.

As of 2.0.0, this module has an API compatible with python-chardet. This means that it can replace the chardet package from PyPI without code changes.

Because this module is a C binding, it is much faster than python-chardet.
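As a minimal sketch of what "without code changes" means in practice, the snippet below calls `chardet.detect` the same way code written for the PyPI package would. The `guess_encoding` helper and its `detect` parameter are hypothetical names introduced for illustration only; the sole library call assumed is `chardet.detect`, which in python-chardet returns a dict with `encoding` and `confidence` keys.

```python
try:
    import chardet  # this C binding, or the chardet package from PyPI
except ImportError:
    chardet = None  # keeps the sketch importable when neither is installed


def guess_encoding(data, detect=None):
    """Return the encoding name guessed for a bytes object, or None.

    `detect` defaults to chardet.detect; passing another callable is a
    convenience of this sketch, not part of either library.
    """
    if detect is None:
        if chardet is None:
            raise RuntimeError("no chardet module is installed")
        detect = chardet.detect
    result = detect(data)  # a dict with 'encoding' and 'confidence' keys
    return result.get("encoding")
```

With either module installed, `guess_encoding(some_bytes)` returns whatever encoding name the detector reports; the exact string depends on the installed backend and on the input.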

Differences from the traditional chardet on PyPI

  • compatible with 2.x
  • differences from 3.x
    • the language key is not supported in the chardet.universaldetector.UniversalDetector.feed API
  • differences from 4.x
    • the chardet.detect_all API is provided for compatibility with the traditional chardet on PyPI; in practice, it returns the same result as the chardet.detect API.
    • the language key is not supported in the chardet.universaldetector.UniversalDetector.feed API
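Code that has to run against both this module and the PyPI chardet can read results defensively. The sketch below is one way to do that, under two assumptions that the list above does not settle: that a `detect_all`-style result may arrive either as a single dict or as a list of dicts, and that a `language` key may simply be absent from a result. Both helper names are hypothetical.

```python
def as_guess_list(result):
    """Normalize a detect/detect_all style result to a list of guess dicts."""
    if isinstance(result, dict):
        return [result]  # a single guess, as chardet.detect returns
    return list(result)  # already a sequence of guesses


def describe(guess):
    """Format one guess, tolerating a missing 'language' key."""
    encoding = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0
    language = guess.get("language")  # None when the key is not provided
    label = f"{encoding} ({confidence:.2f})"
    if language:
        label += f" [{language}]"
    return label
```

Using `dict.get` rather than indexing means the same calling code works whether or not the installed module reports a language.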

INSTALLATION

1. Requirement

This module requires the following library:

2. Build

To install this module, type the following:

  [root@host python-chardet]$ make build
  [root@host python-chardet]$ make install
  [root@host python-chardet]$ # or
  [root@host python-chardet]$ python setup.py build
  [root@host python-chardet]$ python setup.py install

For details, read the INSTALL.md document.

USAGE

See also http://chardet.readthedocs.io/en/latest/usage.html or the test*.py files in this source tree.
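The linked usage page describes incremental detection with `UniversalDetector`. The sketch below follows that pattern (`reset`, `feed`, `done`, `close`, `result`) and assumes this module exposes the same members, as its compatibility claim suggests. The `detect_stream` helper is a hypothetical name, and the detector is passed in as an argument so that the function does not depend on which implementation is installed.

```python
def detect_stream(chunks, detector):
    """Feed byte chunks to a UniversalDetector-like object and return its result.

    `detector` is expected to provide reset(), feed(), close(), and the
    `done` and `result` attributes, as chardet's UniversalDetector does.
    """
    detector.reset()
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:  # the detector is confident enough; stop early
            break
    detector.close()  # finalizes the guess
    return detector.result
```

A typical call would pass `chardet.universaldetector.UniversalDetector()` as `detector` and an open binary file as `chunks`, since iterating over a file opened with `"rb"` yields bytes lines.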
