drnk / dcl-asozd-parser

dcl-asozd-parser

Office Open XML (docx) file parser

Requirements

You need Python 3.7 or later to run dcl-asozd-parser.

Used packages:

  • beautifulsoup4
  • lxml
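
For orientation, here is a minimal sketch of what reading a .docx file involves. A .docx file is a ZIP archive whose main text lives in `word/document.xml`, with paragraphs stored as `w:p` elements and text runs as `w:t` elements. The sketch uses only the Python standard library rather than beautifulsoup4 and lxml, and it is not taken from dcl-asozd-parser itself; the function names `extract_paragraphs` and `make_minimal_docx` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx_bytes: bytes) -> List[str]:
    """Return the text of every paragraph (w:p) found in a .docx payload."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        xml_data = archive.read("word/document.xml")
    root = ET.fromstring(xml_data)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # The text of one paragraph may be split across several w:t runs.
        runs = paragraph.iter(f"{{{W_NS}}}t")
        paragraphs.append("".join(run.text or "" for run in runs))
    return paragraphs


def make_minimal_docx(paragraphs: List[str]) -> bytes:
    """Hypothetical helper: build a bare-bones archive for the demo below.

    It contains only word/document.xml, which is enough for this sketch
    but not a complete document that a word processor would open.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document)
    return buffer.getvalue()


sample = make_minimal_docx(["First paragraph", "итоговая карточка"])
print(extract_paragraphs(sample))
```

The demo builds its input in memory so that it runs without any real document on disk.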

Installation

  1. Clone the repository:

    git clone git@github.com:drnk/dcl-asozd-parser.git
  2. Create a virtual environment and activate it:

    cd dcl-asozd-parser
    python -m venv .venv
    
    # unix
    source .venv/bin/activate
    # windows
    .venv\Scripts\activate.bat
  3. Upgrade pip and install the required libraries:

    python -m pip install -U pip
    pip install -r requirements.txt

Testing

pytest

Running

To parse all files in the "in" directory whose names end with "итоговая карточка" (Russian for "final card"), run:

python parse.py "in" --source-mask=".*,\s*итоговая карточка.docx"
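
To see which file names such a mask would select, here is a small sketch. It assumes the value of `--source-mask` is applied to each file name as a Python regular expression anchored at the start of the name; how parse.py actually applies the mask is not shown in this README. The function `select_sources` and the sample file names are made up for illustration.

```python
import re

# The mask from the command above, treated as a Python regular expression.
SOURCE_MASK = r".*,\s*итоговая карточка.docx"


def select_sources(file_names, mask=SOURCE_MASK):
    """Return the names that the mask matches, starting at the first character.

    Assumption: matching uses re.match; the real parse.py may differ.
    """
    pattern = re.compile(mask)
    return [name for name in file_names if pattern.match(name)]


# Made-up file names, for illustration only.
names = [
    "report 1, итоговая карточка.docx",  # comma, space, suffix: selected
    "report 2,итоговая карточка.docx",   # \s* also allows no space: selected
    "report 3, текст.docx",              # different suffix: skipped
    "итоговая карточка.docx",            # no comma before the suffix: skipped
]

for name in select_sources(names):
    print(name)
```

Note that the `.` before `docx` in the mask is unescaped, so it matches any single character. A stricter mask would end in `\.docx$`.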
