hkuffel / tap-lookml

tap-lookml

This is a Singer tap that produces JSON-formatted data following the Singer spec.

This tap:

  • Pulls LookML files from the GitHub v3 API and extracts LookML components using the lkml parser.
  • Extracts the following resources: model_files, models, view_files, views
  • Outputs the schema for each resource
  • Incrementally pulls data based on the input state (file last-modified in GitHub)
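
The incremental pull described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the tap's actual code: the `last_modified` field name and both helper functions are hypothetical, and it assumes the bookmark and `start_date` are ISO-8601 UTC timestamps like those shown in the config and state examples.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO-8601 UTC timestamp such as '2019-10-13T19:53:36.000000Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat, so use an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def files_to_sync(files, bookmark, start_date):
    """Return the files modified after the bookmark (or after start_date if there is no bookmark).

    `files` is a list of dicts with a hypothetical 'last_modified' key.
    """
    cutoff = parse_ts(bookmark or start_date)
    return [f for f in files if parse_ts(f["last_modified"]) > cutoff]


def next_bookmark(files, bookmark, start_date):
    """Return the latest last-modified timestamp seen, to be stored as the new bookmark."""
    candidates = [bookmark or start_date] + [f["last_modified"] for f in files]
    return max(candidates, key=parse_ts)
```

Only files changed since the previous run are fetched and parsed; on the first run, with no bookmark in the state, `start_date` from the config acts as the cutoff.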

Streams

model_files

models

  • Primary key fields: git_owner, git_repository, path
  • Replication strategy: FULL_TABLE (ALL for each model_file)
  • Transformations: Decode, parse model_file content and convert to JSON

view_files

views

  • Primary key fields: git_owner, git_repository, path
  • Replication strategy: FULL_TABLE (ALL for each view_file)
  • Transformations: Decode, parse view_file content and convert to JSON
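
As a rough sketch of the "decode" part of these transformations: the GitHub contents API returns file bodies base64-encoded in a `content` field, alongside an `encoding` field. The helper names and the record layout below are hypothetical (only the primary key fields come from this README), and the tap's real implementation may differ.

```python
import base64


def decode_file_content(api_item):
    """Decode the base64 'content' field of a GitHub contents API response item."""
    encoding = api_item.get("encoding")
    if encoding != "base64":
        raise ValueError("unexpected encoding: %r" % (encoding,))
    # GitHub wraps the base64 payload with newlines; b64decode skips them by default.
    return base64.b64decode(api_item["content"]).decode("utf-8")


def to_file_record(git_owner, git_repository, api_item):
    """Build a record keyed by git_owner, git_repository and path.

    The 'content' key is a hypothetical name for the decoded LookML text.
    """
    return {
        "git_owner": git_owner,
        "git_repository": git_repository,
        "path": api_item["path"],
        "content": decode_file_content(api_item),
    }
```

The decoded text is what would then be handed to the lkml parser (for example `lkml.load(text)`, which returns a Python dict) before being emitted as JSON.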

Authentication

Quick Start

  1. Install

    Clone this repository, and then install using setup.py. We recommend using a virtualenv:

    > virtualenv -p python3 venv
    > source venv/bin/activate
    > python setup.py install
    OR
    > cd .../tap-lookml
    > pip install .
  2. Dependent libraries. The following dependent libraries were installed:

    > pip install singer-python
    > pip install singer-tools
    > pip install target-stitch
    > pip install target-json
    
  3. Create your tap's config.json file. This tap connects to GitHub with a GitHub OAuth2 token, which may be a Personal Access Token or an authorization created for an App. Each tap connects to the Looker/LookML Git repositories where the LookML code for your Looker project is hosted; provide the repository names in git_repositories, delimited by commas (spaces are ignored), and the git_owner of those repositories (which can be a User or an Organization).

    {
        "api_token": "YOUR_GITHUB_API_TOKEN",
        "git_owner": "YOUR_GITHUB_ORGANIZATION_OR_USER",
        "git_repositories": "LOOKER_GIT_REPO_1, LOOKER_GIT_REPO_2, ...",
        "start_date": "2019-01-01T00:00:00Z",
        "user_agent": "tap-lookml <api_user_email@your_company.com>"
    }

    Optionally, also create a state.json file. currently_syncing is an optional attribute used for identifying the last object to be synced in case the job is interrupted mid-stream. The next run would begin where the last job left off.

    {
        "currently_syncing": "users",
        "bookmarks": {
            "model_files": "2019-10-13T19:53:36.000000Z",
            "view_files": "2019-10-13T18:50:11.000000Z"
        }
    }
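
A minimal sketch of how the config.json from step 3 might be read, including the comma-delimited git_repositories value. The function names are hypothetical, and treating all five keys as required is an assumption made for illustration, not something this README states.

```python
import json

# Assumption: every key shown in the example config is treated as required here.
EXPECTED_KEYS = ["api_token", "git_owner", "git_repositories", "start_date", "user_agent"]


def split_repositories(value):
    """Split 'repo_1, repo_2' into ['repo_1', 'repo_2'], ignoring spaces and empty entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def load_config(text):
    """Parse config JSON text and expand git_repositories into a list of names."""
    config = json.loads(text)
    missing = [key for key in EXPECTED_KEYS if key not in config]
    if missing:
        raise ValueError("config is missing keys: " + ", ".join(missing))
    config["git_repositories"] = split_repositories(config["git_repositories"])
    return config
```

For example, `load_config(open("config.json").read())["git_repositories"]` would yield one entry per repository, regardless of the spacing around the commas.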
  4. Run the Tap in Discovery Mode. This creates a catalog.json for selecting objects/fields to integrate:

    tap-lookml --config config.json --discover > catalog.json

    See the Singer docs on discovery mode for details.

  5. Run the Tap in Sync Mode (with catalog) and write out to state file

    For Sync mode:

    > tap-lookml --config tap_config.json --catalog catalog.json > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    To load to json files to verify outputs:

    > tap-lookml --config tap_config.json --catalog catalog.json | target-json > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    To pseudo-load to Stitch Import API with dry run:

    > tap-lookml --config tap_config.json --catalog catalog.json | target-stitch --config target_config.json --dry-run > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
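
The `tail -1` step in the commands of step 5 keeps only the last line written to state.json, which is the most recent state. A rough Python equivalent, written as a sketch with a hypothetical function name, looks like this. It assumes each line is a JSON object, and it unwraps a Singer STATE message (`{"type": "STATE", "value": ...}`) when the last line comes straight from the tap rather than from a target.

```python
import json


def final_state(lines):
    """Return the state held in the last non-empty line of the given output lines."""
    last = None
    for line in lines:
        if line.strip():
            last = line
    if last is None:
        raise ValueError("no state was emitted")
    parsed = json.loads(last)
    # A raw tap emits STATE messages; a target typically emits the bare state value.
    if isinstance(parsed, dict) and parsed.get("type") == "STATE":
        return parsed["value"]
    return parsed
```

With `with open("state.json") as f: state = final_state(f)`, the result is the bookmark dictionary that the next run can start from.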
  6. Test the Tap

    While developing the lookml tap, the following utilities were run in accordance with Singer.io best practices. Pylint to improve code quality:

    > pylint tap_lookml -d missing-docstring -d logging-format-interpolation -d too-many-locals -d too-many-arguments

    Pylint test resulted in the following score:

    Your code has been rated at 9.68/10

    To check the tap and verify working:

    > tap-lookml --config tap_config.json --catalog catalog.json | singer-check-tap > state.json
    > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json

    Check tap resulted in the following:

    The output is valid.
    It contained 58 messages for 4 streams.
    
        4 schema messages
        48 record messages
        6 state messages
    
    Details by stream:
    +-------------+---------+---------+
    | stream      | records | schemas |
    +-------------+---------+---------+
    | model_files | 2       | 1       |
    | models      | 2       | 1       |
    | view_files  | 17      | 1       |
    | views       | 27      | 1       |
    +-------------+---------+---------+

Copyright © 2019 Stitch

About

License: GNU Affero General Public License v3.0

