okhlybov / iop

IOP - the data processing pipeline construction framework for Ruby

Synopsis

IOP is intended for constructing data processing pipelines in the manner of UNIX shell pipes.

Instead of the standard Ruby way of handling such I/O tasks with nested blocks, IOP offers a simpler flat chaining scheme.

Consider the example:

# One-liner example
(FileReader.new('input.dat') | GzipCompressor.new | DigestComputer.new(MD5.new) | FileWriter.new('output.dat.gz')).process!

The above snippet reads the input file and compresses it into a Gzip-compatible output file, while simultaneously computing the MD5 hash of the compressed data being written.
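For comparison, here is a rough sketch of the same task in the nested-block style, using only Ruby's standard library (zlib and digest). It is not part of IOP: the DigestingWriter class and the gzip_with_md5 method are hypothetical names introduced for illustration, and the 64 KiB chunk size is an arbitrary choice.

```ruby
require 'zlib'
require 'digest'

# Hypothetical helper (not an IOP class): forwards every write to the
# underlying IO while feeding the same bytes into a digest.
class DigestingWriter
  def initialize(io, digest)
    @io = io
    @digest = digest
  end

  def write(data)
    @digest.update(data)
    @io.write(data)
  end
end

# Compresses src into dst and returns the hex MD5 of the compressed bytes.
def gzip_with_md5(src, dst, chunk_size = 65_536)
  md5 = Digest::MD5.new
  File.open(src, 'rb') do |input|
    File.open(dst, 'wb') do |output|
      # GzipWriter.wrap finishes the Gzip stream when the block exits,
      # so the trailer bytes also pass through DigestingWriter.
      Zlib::GzipWriter.wrap(DigestingWriter.new(output, md5)) do |gz|
        while (chunk = input.read(chunk_size))
          gz.write(chunk)
        end
      end
    end
  end
  md5.hexdigest
end
```

Each additional processing step in this style adds either another nesting level or another wrapper class, which is the overhead the flat chaining scheme is meant to avoid.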

The following snippet shows incremental pipeline construction, a feature not easily implemented with standard Ruby I/O block nesting.

# Incremental pipeline construction example
pipe  = FileReader.new('input')
pipe |= GzipCompressor.new if need_compression?
pipe |= FileWriter.new('output')
pipe.process!

Here the Gzip compression is optional and is added to the pipeline only when an external condition holds.
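To illustrate why flat chaining makes conditional stages easy, here is a toy sketch of a | operator that links stages. This is not IOP's implementation, and ToyStage together with the source, upcaser, collector and run helpers are hypothetical names. It only shows that, because each | returns a chainable object, a stage can be appended or skipped with an ordinary if.

```ruby
# Toy model of flat chaining; NOT the IOP implementation.
class ToyStage
  attr_accessor :upstream

  # transform receives (chunk, out): the incoming chunk (nil for a source
  # stage) and a callable that passes data on to the next stage.
  def initialize(&transform)
    @transform = transform
  end

  # a | b attaches a in front of b's chain and returns b, so the result
  # of each | can be chained further or reassigned with |=.
  def |(downstream)
    head = downstream
    head = head.upstream while head.upstream
    head.upstream = self
    downstream
  end

  # Runs the whole chain, starting from the first stage.
  def process!
    each_chunk { |_chunk| }
    self
  end

  def each_chunk(&out)
    if upstream
      upstream.each_chunk { |chunk| @transform.call(chunk, out) }
    else
      @transform.call(nil, out)
    end
  end
end

# Hypothetical stage constructors used only in this sketch.
def source(items)
  ToyStage.new { |_, out| items.each { |item| out.call(item) } }
end

def upcaser
  ToyStage.new { |chunk, out| out.call(chunk.upcase) }
end

def collector(into)
  ToyStage.new { |chunk, out| into << chunk; out.call(chunk) }
end

# Builds the chain incrementally; the middle stage is optional.
def run(items, shout:)
  result = []
  pipe  = source(items)
  pipe |= upcaser if shout
  pipe |= collector(result)
  pipe.process!
  result
end
```

Calling run(%w[a b], shout: true) passes the items through the optional stage, while shout: false leaves it out of the chain entirely.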

Features

The following capabilities are currently implemented:

  • String splitting/merging
  • IO or local file reading/writing
  • FTP file reading/writing
  • Digest computing
  • Gzip/Zlib (de)compression
  • Zstd (de)compression
  • Symmetric cipher (de,en)cryption
  • Random data generation

Basic usage

  • IOP is split into a set of files which should be required separately depending on which components are needed.
require 'iop/file'
require 'iop/zlib'
require 'iop/digest'
require 'iop/string'
  • The IOP module can be included into the current namespace to save some typing.
include IOP
  • A chain of processing objects is created either in-line or incrementally.
pipe  = StringSplitter.new('Greetings from IOP', 10)
pipe |= GzipCompressor.new | (d = DigestComputer.new(MD5.new))
pipe |= FileWriter.new('output.gz')

It is convenient to assign local variables to those instances that are expected to hold useful state after processing, such as the digest computer above.

  • The actual processing is initiated with the process! method.
pipe.process!

IOP instances normally clean up after themselves, closing file handles, network connections and so on, even when an exception is raised.

  • The instances bound to variables can then be examined.
puts d.digest.hexdigest
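The file produced by the steps above can be cross-checked with the standard library alone. The sketch below is independent of IOP, and gunzip_file and file_md5 are hypothetical helper names. It assumes the digest stage saw exactly the bytes written to the file, in which case file_md5('output.gz') would be expected to match the value printed from d.

```ruby
require 'zlib'
require 'digest'

# Returns the decompressed contents of a Gzip file.
def gunzip_file(path)
  Zlib::GzipReader.open(path) { |gz| gz.read }
end

# Returns the hex MD5 of the file's raw (still compressed) bytes.
def file_md5(path)
  Digest::MD5.file(path).hexdigest
end
```

With these helpers, gunzip_file('output.gz') should give back the original string if the pipeline ran as described.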

For further information, refer to the IOP documentation.

The end

Cheers,

Oleg A. Khlybov <fougas@mail.ru>
