oscar-project / download_oscar

Downloading all files of a language from the OSCAR (Open Super-large Crawled Aggregated coRpus)

Automated downloading of language data from OSCAR

Features

  • Adds dodc and dodg as command line tools
  • dodc: command line variant provided with arguments to download data
  • dodg: GUI variant to download data; arguments are entered into input fields

Usage

dodc

To get help with the command line tool use dodc -h from a shell.

The command line tool needs to be supplied with multiple arguments; dodc -h lists them.

dodg

The GUI tool internally calls the command line tool dodc. Instead of providing arguments on the command line, you enter them into input fields, and they are passed on to the command line tool.
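The hand-off from input fields to dodc can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names and the option name used in the example are hypothetical, and the real options are the ones printed by dodc -h.

```python
import subprocess
from typing import List, Mapping


def build_dodc_command(fields: Mapping[str, str]) -> List[str]:
    """Turn input-field values into an argument list for dodc.

    The keys are used as option names as-is. This sketch does not know
    dodc's real options; check `dodc -h` for those.
    """
    command = ["dodc"]
    for name, value in fields.items():
        if value == "":
            continue  # skip fields the user left empty
        command.extend([f"--{name}", value])
    return command


def run_dodc(fields: Mapping[str, str]) -> int:
    """Run dodc with the given field values and return its exit code."""
    return subprocess.run(build_dodc_command(fields), check=False).returncode


# "example-option" is a placeholder, not a documented dodc option.
print(build_dodc_command({"example-option": "value", "unused": ""}))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a field value contains spaces.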

Installation

Simple Installation

pip install download-oscar will install the requirements and the tool with one command.

Installing from source

Requirements

Building

  • install Python
  • git clone https://github.com/xamm/download_oscar.git
  • cd download_oscar
  • (optional) create a virtual environment
  • pip install -r requirements.txt
  • pip install -e . will install the tool in development mode.
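After either installation route, a quick way to confirm that both tools ended up on the PATH is a small check like the one below. It is only a sketch: the helper name missing_tools is made up for this example, and it tests that the commands can be found, not that they work.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str] = ("dodc", "dodg")) -> List[str]:
    """Return the tool names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("dodc and dodg are both available.")
```

If a tool is reported as missing inside a virtual environment, the usual cause is that the environment was not activated before running pip install.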

Release a new version

  • All pushed git commits and pull requests on the main branch trigger an automatic build and packaging for PyPI
    • commits without a tag only trigger packaging for TestPyPI
    • commits with a tag will also push to PyPI
    • A new version number must be specified in setup.py in order for publishing to work
      • publishing is triggered on creation of a tag on the main branch
      • e.g. git tag -a v0.0.1 -m 'Release 0.0.1' and git push origin v0.0.1
      • easiest procedure:
        • work on your code
        • add & commit changes
        • push changes
        • create tag
        • push tag
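Because publishing depends on both a new version number in setup.py and a matching tag, it can help to compare the two before pushing the tag. The sketch below assumes that setup.py contains a literal version="..." argument and that tags follow the v<version> pattern from the example above. Both helper names are hypothetical and not part of this repository.

```python
import re
from typing import Optional


def version_from_setup(setup_text: str) -> Optional[str]:
    """Extract the version string from the text of a setup.py file.

    Assumes a literal such as version="0.0.1"; returns None otherwise.
    """
    match = re.search(r"""\bversion\s*=\s*["']([^"']+)["']""", setup_text)
    return match.group(1) if match else None


def tag_matches_version(tag: str, version: str) -> bool:
    """Check that a tag like v0.0.1 corresponds to version 0.0.1."""
    return tag == f"v{version}"


# Sample text standing in for the contents of setup.py.
sample = 'setup(name="download-oscar", version="0.0.1")'
version = version_from_setup(sample)
print(version, tag_matches_version("v0.0.1", version))
```

In practice the text would come from reading setup.py, and the tag from the name about to be passed to git tag.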

This tool was originally started during a student project at the Database Systems Group Dresden.

License: MIT License

