olmax99 / sftppush

A filesystem watcher that decompresses files and pushes them to S3

Sftppush Event Pipeline

https://github.com/olmax99/sftppush/workflows/Go/badge.svg

sftppush is a mini pipeline with three stages: file close-after-write event > decompress > S3 archive.

Initially, it was intended to replace, for example, low-compute serverless functions that simply push files from the Sftp server into an S3 bucket location. Compared with mounting an Sftp server's file system directly onto S3 through FUSE, this solution seems better suited to production use cases.

Prerequisites

Design

./images/sftppush_solution_concept.png

./images/sftppush_concurrency_design.png

Most likely you want to run this project inside an Sftp server, which receives a constant stream of data files.

The sftppush project is intended to run in a Linux (Ubuntu/Debian) VM. It captures CLOSE_WRITE events for files on the file system, watching either a single source directory or several.

The watch --source flag can read a single directory as well as a configuration file containing multiple directories. When multiple target directories are configured, a separate Go watch process is spawned for each one.
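The fan-out described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes that "a separate Go watch process" means one goroutine per directory, and `watchDir` and `watchAll` are hypothetical names, with `watchDir` standing in for a real watcher by reporting a single fake event.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// watchDir is a hypothetical stand-in for a real directory watcher.
// It reports one fake event for dir and then returns.
func watchDir(dir string, events chan<- string) {
	events <- dir + ": watching"
}

// watchAll starts one goroutine per directory (assumed design) and
// collects everything the watchers report on a shared channel.
func watchAll(dirs []string) []string {
	events := make(chan string)
	var wg sync.WaitGroup
	for _, dir := range dirs {
		wg.Add(1)
		go func(d string) {
			defer wg.Done()
			watchDir(d, events)
		}(dir)
	}
	// Close the channel once every watcher has returned,
	// so the range loop below terminates.
	go func() {
		wg.Wait()
		close(events)
	}()
	var out []string
	for e := range events {
		out = append(out, e)
	}
	sort.Strings(out) // goroutine scheduling order is not deterministic
	return out
}

func main() {
	for _, e := range watchAll([]string{"/device1/data", "/device2/data"}) {
		fmt.Println(e)
	}
}
```

A real watcher would keep running and emit one event per closed file; the shared channel is what lets later pipeline stages (decompress, upload) stay independent of how many directories are watched.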

Quick Launch

1. Install the sftppush tool locally

NOTE: ONLY TESTED ON UBUNTU AND DEBIAN (this project relies on the UNIX CLOSE_WRITE event)

Ubuntu/Debian

$ git clone https://github.com/olmax99/sftppush.git
$ cd sftppush

$ make build
$ ./bin/sftppush-<version>-linux_amd64 help

make build creates a new binary at ./bin/sftppush-<version>-linux_amd64, where <version> is the version being built (for example, 0.2.2).

2. Create a configuration file

Recommended: Create config.yaml in the project root and pass it with the --config (or -c) flag.

All source directories for fsnotify are determined by:
      <defaults.userpath> + <watch.source.name> + <watch.source.paths>

./config.yaml

defaults:
  userpath: # Set by default, can be overridden here or with an environment variable
  s3target: olmax-test-sftppush-126912
  awsprofile: ***
  awsregion: ***
  # log:
  #   level: info
  #   location: "syslog" || <abs/path/to/logfile>
  #   format: json
watch:
  source:
    - name: sftpuser1
      paths:
        - /path/to/source/directory1
        - /path/to/source/directory2
    # - name: sftpuser2
    #   paths:
    #     - path/to/source/directory1
    #   s3target: olmax-test-sftppush-126912
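To make the directory rule concrete, here is a minimal sketch of how the watched directories could be derived from the configuration above. It assumes the three parts are joined by plain string concatenation, which would explain why a custom userpath needs a trailing '/'. The `Source` type, the `watchDirs` function, and the `/home/` userpath are hypothetical and used only for illustration.

```go
package main

import "fmt"

// Source mirrors one entry under watch.source in config.yaml.
type Source struct {
	Name  string
	Paths []string
}

// watchDirs builds <userpath> + <name> + <path> for every configured path.
// Plain concatenation is an assumption, not confirmed project behavior.
func watchDirs(userpath string, sources []Source) []string {
	var dirs []string
	for _, s := range sources {
		for _, p := range s.Paths {
			dirs = append(dirs, userpath+s.Name+p)
		}
	}
	return dirs
}

func main() {
	sources := []Source{{
		Name:  "sftpuser1",
		Paths: []string{"/path/to/source/directory1", "/path/to/source/directory2"},
	}}
	// "/home/" is a hypothetical userpath; note the trailing slash.
	for _, d := range watchDirs("/home/", sources) {
		fmt.Println(d)
	}
}
```

With these inputs the first watched directory is /home/sftpuser1/path/to/source/directory1.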

By default (without a log: section), sftppush tries to write to ~/.sftppush/sftppush.log.

  • If the directory does not exist, it logs to stderr.
  • Optionally, syslog can be used, but this requires rsyslog to be active.
  • The log level is debug by default, which produces overhead.
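The fallback rule for the log destination can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: `logDestination` and `openLog` are hypothetical names, and the sketch only covers the file-or-stderr decision, not syslog or log levels.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// logDestination returns the log file path inside dir, or "stderr"
// when dir does not exist or is not a directory.
func logDestination(dir string) string {
	info, err := os.Stat(dir)
	if err != nil || !info.IsDir() {
		return "stderr"
	}
	return filepath.Join(dir, "sftppush.log")
}

// openLog opens the chosen destination for appending.
func openLog(dir string) (io.Writer, error) {
	dest := logDestination(dir)
	if dest == "stderr" {
		return os.Stderr, nil
	}
	f, err := os.OpenFile(dest, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	w, err := openLog(filepath.Join(home, ".sftppush"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Fprintln(w, "sftppush started")
}
```

Creating ~/.sftppush ahead of time is therefore what switches logging from stderr to the log file.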

3. Run the event watcher on a single local directory

If a config file has been created, there is no need to set the --source flags. Flags override config file values.

Running it should be as simple as:

$ ./bin/sftppush-0.2.0-linux_amd64 --config config.yaml watch

# EXAMPLE 1: Run without config with two sources
$ SFTPPUSH_DEFAULTS_S3TARGET=*** SFTPPUSH_DEFAULTS_AWSPROFILE=*** \
  ./bin/sftppush-0.2.0-linux_amd64 watch \
  --source="name=sftpuser1,paths=/device1/data /device2/data" \
  --source="name=sftpuser2,paths=/device1/data /device2/data"

# EXAMPLE 2: Run with a custom user directory (needs a trailing '/')
$ SFTPPUSH_DEFAULTS_USERPATH="/home/my_test_dir/" ./bin/sftppush-0.2.0-linux_amd64 -c config.yaml watch
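The --source values in EXAMPLE 1 follow a "name=<user>,paths=<path> <path>" shape. The sketch below shows one way such a value could be parsed. It is an assumption about the format based only on the example above; `Source` and `parseSource` are hypothetical names, and the real flag parser may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Source holds one parsed --source value.
type Source struct {
	Name  string
	Paths []string
}

// parseSource parses a value such as
// "name=sftpuser1,paths=/device1/data /device2/data".
// Fields are comma-separated key=value pairs; paths are space-separated.
func parseSource(value string) (Source, error) {
	var src Source
	for _, field := range strings.Split(value, ",") {
		key, val, ok := strings.Cut(field, "=")
		if !ok {
			return Source{}, fmt.Errorf("malformed field %q", field)
		}
		switch strings.TrimSpace(key) {
		case "name":
			src.Name = strings.TrimSpace(val)
		case "paths":
			src.Paths = strings.Fields(val)
		default:
			return Source{}, fmt.Errorf("unknown key %q", key)
		}
	}
	if src.Name == "" || len(src.Paths) == 0 {
		return Source{}, fmt.Errorf("source needs both name and paths: %q", value)
	}
	return src, nil
}

func main() {
	src, err := parseSource("name=sftpuser1,paths=/device1/data /device2/data")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s watches %d paths: %v\n", src.Name, len(src.Paths), src.Paths)
}
```

Because paths are separated by spaces, the whole flag value has to be quoted on the command line, as in the example.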

Testing

Some tests require the OS file system. You can choose to run the tests inside a Docker container.

$ [DOCKER=1] make test

General Instructions

1. Missing CLOSE_WRITE event

  • CLOSE_WRITE events are not cross-platform and are currently only readily accessible on Linux file systems. Because of this restriction, the fsnotify maintainers have kept the corresponding PR in a pending state.
  • Note that sftppush relies heavily on this feature.
  • One workaround is to fork the original fsnotify and apply the changes from that PR.
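For context on what the event looks like at the system level, the following Linux-only sketch reads a single IN_CLOSE_WRITE event through the inotify calls in Go's standard syscall package. It is not how sftppush or the fsnotify PR implements the feature; `nextCloseWrite`, `watchOnce`, `demo`, and the file name upload.csv.gz are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
	"unsafe"
)

// nextCloseWrite blocks until fd reports an inotify event and returns
// the file name carried by the first event in the buffer.
func nextCloseWrite(fd int) (string, error) {
	buf := make([]byte, 4096)
	n, err := syscall.Read(fd, buf)
	if err != nil {
		return "", err
	}
	if n < syscall.SizeofInotifyEvent {
		return "", fmt.Errorf("short inotify read: %d bytes", n)
	}
	ev := (*syscall.InotifyEvent)(unsafe.Pointer(&buf[0]))
	end := syscall.SizeofInotifyEvent + int(ev.Len)
	// The kernel pads the name with NUL bytes.
	return strings.TrimRight(string(buf[syscall.SizeofInotifyEvent:end]), "\x00"), nil
}

// watchOnce watches dir for IN_CLOSE_WRITE, runs write, and returns
// the name of the first file that was closed after writing.
func watchOnce(dir string, write func() error) (string, error) {
	fd, err := syscall.InotifyInit()
	if err != nil {
		return "", err
	}
	defer syscall.Close(fd)
	if _, err := syscall.InotifyAddWatch(fd, dir, syscall.IN_CLOSE_WRITE); err != nil {
		return "", err
	}
	if err := write(); err != nil {
		return "", err
	}
	return nextCloseWrite(fd)
}

// demo writes one file into a temporary directory and reports the event.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "sftppush-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	return watchOnce(dir, func() error {
		return os.WriteFile(filepath.Join(dir, "upload.csv.gz"), []byte("payload"), 0o644)
	})
}

func main() {
	name, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("closed after write:", name)
}
```

The event fires only once the writer closes the file, which is why it suits an Sftp upload directory: a file is picked up after the upload has finished rather than while it is still being written.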

Useful Commands

$ aws s3api --profile *** list-objects --bucket *** --query 'Contents[?contains(Key,``)].{Key: Key, Size: Size}' --output table | wc -l

# install golangci-lint
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.31.0

License:Apache License 2.0

