jojolebarjos / zimscan

ZIM file iterator

ZIM Scan

Minimal ZIM file reader, designed for article streaming.

Getting Started

Install using pip:

pip install zimscan

Or install the latest version from the Git repository:

pip install -U git+https://github.com/jojolebarjos/zimscan.git

Iterate over records, which are binary file-like objects:

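from zimscan import Reader

path = "wikipedia_en_all_nopic_2019-10.zim"
# Open the ZIM file in binary mode and hand the file object to the reader.
with Reader(open(path, "rb"), skip_metadata=True) as reader:
    for record in reader:
        # Each record is a binary file-like object; read() returns its bytes.
        data = record.read()
        ...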
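
Since records are binary file-like objects, large articles do not have to be loaded in one `read()` call. The following is a minimal sketch, assuming records accept a size argument to `read()` like standard Python binary streams do. `iter_chunks` and `total_size` are hypothetical helpers for illustration and are not part of zimscan; `io.BytesIO` objects stand in for real records so the snippet runs without a ZIM file.

```python
import io
from typing import BinaryIO, Iterable, Iterator


def iter_chunks(record: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks from a binary file-like object until it is exhausted."""
    while True:
        chunk = record.read(chunk_size)
        if not chunk:
            # An empty result signals end of stream.
            break
        yield chunk


def total_size(records: Iterable[BinaryIO]) -> int:
    """Sum the payload sizes of all records, reading each one chunk by chunk."""
    return sum(len(chunk) for record in records for chunk in iter_chunks(record))


# Stand-in records (assumption: real zimscan records behave like these streams).
fake_records = [io.BytesIO(b"hello"), io.BytesIO(b"world!")]
print(total_size(fake_records))
```

With a real archive, the `reader` object from the example above would presumably be passed in place of `fake_records`.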

About

License: MIT License


Languages

Language: Python 100.0%