uiur / akagi

Codenize your datasources.

akagi

  • Free software: MIT license

Features

akagi provides a uniform iterate-and-save interface for data sources such as Amazon Redshift and Amazon S3 (more are planned).
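
As a rough illustration of what an "iterate and save" interface means, the sketch below defines a hypothetical `InMemoryDataSource` class. It is not part of akagi and does not reflect akagi's implementation; the class name, its constructor, and the single tab-separated output file are assumptions made for this example only.

```python
import os
import tempfile


class InMemoryDataSource:
    """Hypothetical data source used for illustration; NOT part of akagi."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Nothing to clean up here; a real source might remove temporary data.
        return False

    def __iter__(self):
        # Iterating the data source yields one record at a time.
        return iter(self._rows)

    def save(self, target_dir):
        # Assumption: all rows go into one tab-separated file.
        os.makedirs(target_dir, exist_ok=True)
        path = os.path.join(target_dir, "part-0000.tsv")
        with open(path, "w", encoding="utf-8") as f:
            for row in self._rows:
                f.write("\t".join(str(value) for value in row) + "\n")
        return path


with InMemoryDataSource([(1, "/index"), (2, "/about")]) as ds:
    saved_path = ds.save(os.path.join(tempfile.mkdtemp(), "akagi_test"))
    rows = [d for d in ds]
```

The point of the pattern is that calling code only depends on `with`, `for`, and `save`, so the backing store can change without touching the consumer.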

Installation

Install via pip:

$ pip install akagi

or from source:

$ git clone https://github.com/ayemos/akagi akagi
$ cd akagi
$ python setup.py install

Example

MySQLDataSource

with MySQLDataSource.for_query(
        'select * from (select user_id, path from logs.imp limit 10000) as t', # Your query here (MySQL requires an alias for a derived table)
        ) as ds:
    ds.save('./akagi_test') # save the results locally

    for d in ds:
        print(d) # iterate over the results

RedshiftDataSource

with RedshiftDataSource.for_query(
        'log-redshift-unload.ap-northeast-1', # S3 Bucket for intermediate storage
        'select * from (select user_id, path from logs.imp limit 10000) as t', # Your query here
        'logs', # schema
        'imp' # table (schema and table are used to generate a unique prefix for the S3 objects, e.g. logs/imp/20170312_081527)
        ) as ds:
    ds.save('./akagi_test') # save the results locally

    for d in ds:
        print(d) # iterate over the results
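
The comment above gives `logs/imp/20170312_081527` as an example of the generated S3 prefix. A minimal sketch of how a prefix of that shape could be built from a schema, a table, and a timestamp follows. The helper name `unload_prefix`, the use of UTC, and the exact format string are assumptions inferred from that one example, not akagi's actual code.

```python
from datetime import datetime, timezone


def unload_prefix(schema, table, now=None):
    """Hypothetical helper: build a prefix such as 'logs/imp/20170312_081527'."""
    # Assumption: the timestamp is taken in UTC when none is supplied.
    now = now or datetime.now(timezone.utc)
    return "{}/{}/{}".format(schema, table, now.strftime("%Y%m%d_%H%M%S"))


example = unload_prefix("logs", "imp", datetime(2017, 3, 12, 8, 15, 27))
print(example)  # logs/imp/20170312_081527
```

Including a second-resolution timestamp keeps successive unloads of the same table from overwriting each other, as long as they do not start within the same second.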

S3DataSource

with S3DataSource.for_prefix(
        'image-data.ap-northeast-1',
        'data/image_net/zebra',
        FileFormat.BINARY) as ds:
    ...

LocalDataSource

with LocalDataSource.for_path(
        './PATH/TO/YOUR/DATA/DIR',
        'csv') as ds:
    ds.save('./akagi_test') # save the results locally

    for d in ds:
        print(d) # iterate over the results

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
