CGI-FR / RIMO

Rimo contains a series of tools that help create a masking.yaml for PIMO.

feat: rimo analyse

adrienaury opened this issue

$  rimo analyse <foldername>
successfully analysed 1 table(s) - saved rimo.yaml

$ ls
<foldername> rimo.yaml

<foldername>/table1.jsonl

{"address": "PSC 4713, Box 9649 APO AA 43433", "age": 29, "date": "2013-06-11", "phone": "001-958-985-3039"}
{"address": "095 Jennifer Turnpike Castrobury, NY 98111", "age": 35, "date": "2014-07-24", "phone": "(517)819-3454"}
{"address": "06210 David Court South Kimberly, IL 10236", "age": 61, "date": "2003-10-11", "phone": "001-866-271-0116"}
{"address": "2035 Simmons Islands Heatherchester, IN 46152", "age": 73, "date": "2005-05-10", "phone": "+1-407-997-8293x68130"}
{"address": "275 Stone Ridges Suite 885 East Aliciafurt, MH 15407", "age": 47, "date": "2022-04-23", "phone": "828-755-3826"}
{"address": "38432 Moreno Turnpike Garrettland, TN 72939", "age": 95, "date": "2001-08-23", "phone": "7795418893"}
{"address": "25545 Cole Court Newtonfurt, KY 13882", "age": 47, "date": "2004-07-04", "phone": "(330)616-7639x7810"}
{"address": "0301 Amy Grove Apt. 325 Janefort, MA 84102", "age": 65, "date": "2014-09-09", "phone": "260-587-0590"}
{"address": "536 Robinson Estates Austinside, NV 69535", "age": 45, "date": "2011-07-13", "phone": "001-845-854-2110"}
{"address": "9038 Frye Ramp South Cheryltown, CT 54262", "age": 80, "date": "2010-11-18", "phone": "001-533-758-7269"}

rimo.yaml

database: "<foldername>"
tables:
- name: "table1"
  columns:
  - name: "address"
    type: "string"
    concept: ""
    constraint: []
    confidential: null
    sample:
    - "2035 Simmons Islands Heatherchester, IN 46152"
    - "095 Jennifer Turnpike Castrobury, NY 98111"
    - "9038 Frye Ramp South Cheryltown, CT 54262"
    - "536 Robinson Estates Austinside, NV 69535"
    - "06210 David Court South Kimberly, IL 10236"
    statistics:
      count: 10
      unique: 10
      length_histogram:
        min_length: 31
        max_length: 52
        25%_length: 41
        50%_length: 42
        75%_length: 42
      most_freq_len:
        42: 0.3
        41: 0.2
        31: 0.1
        45: 0.1
        52: 0.1
      least_frequent_len:
        31: 0.1
        45: 0.1
        52: 0.1
        43: 0.1
        37: 0.1
      least_frequent_values:
      - "PSC 4713, Box 9649 APO AA 43433"
      - "2035 Simmons Islands Heatherchester, IN 46152"
      - "275 Stone Ridges Suite 885 East Aliciafurt, MH 15407"
      - "38432 Moreno Turnpike Garrettland, TN 72939"
      - "25545 Cole Court Newtonfurt, KY 13882"
  - name: "age"
    type: "integer"
    concept: ""
    constraint: []
    confidential: null
    sample:
    - 45
    - 47
    - 61
    - 29
    - 47
    statistics:
      count: 10
      unique: 9
      mean: 57.7
      value_histogram:
        min: 29.0
        25%: 45.5
        50%: 54.0
        75%: 71.0
        max: 95.0
  - name: "date"
    type: "string"
    concept: ""
    constraint: []
    confidential: null
    sample:
    - "2022-04-23 00:00:00"
    - "2003-10-11 00:00:00"
    - "2014-07-24 00:00:00"
    - "2013-06-11 00:00:00"
    - "2010-11-18 00:00:00"
    statistics:
      count: 10
      unique: 10
      length_histogram:
        ...
  - name: "phone"
    type: "string"
    concept: ""
    constraint: []
    confidential: null
    sample:
    - "001-866-271-0116"
    - "7795418893"
    - "260-587-0590"
    - "001-845-854-2110"
    - "+1-407-997-8293x68130"
    statistics:
      count: 10
      unique: 10
      length_histogram:
        min_length: 10
        max_length: 21
        25%_length: 12
        50%_length: 16
        75%_length: 16
      most_freq_len:
        16: 0.4
        12: 0.2
        13: 0.1
        21: 0.1
        10: 0.1
      least_frequent_len:
        12: 0.2
        13: 0.1
        21: 0.1
        10: 0.1
        18: 0.1
      least_frequent_values:
      - "260-587-0590"
      - "(517)819-3454"
      - "+1-407-997-8293x68130"
      - "7795418893"
      - "(330)616-7639x7810"
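
The numeric statistics shown for the "age" column are consistent with quantiles computed by linear interpolation between neighbouring sorted values. The following is a minimal Go sketch under that assumption; the NumericStats type and the quantile and describe helpers are hypothetical names used for illustration, not part of RIMO.

```go
package main

import (
	"fmt"
	"sort"
)

// NumericStats mirrors the numeric fields listed for the "age" column.
// The struct and its field names are illustrative assumptions.
type NumericStats struct {
	Count, Unique           int
	Mean                    float64
	Min, Q25, Q50, Q75, Max float64
}

// quantile returns the q-th quantile (0 <= q <= 1) of an ascending slice,
// interpolating linearly between the two nearest ranks.
func quantile(sorted []float64, q float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	pos := q * float64(len(sorted)-1)
	lower := int(pos)
	if lower >= len(sorted)-1 {
		return sorted[len(sorted)-1]
	}
	frac := pos - float64(lower)
	return sorted[lower] + frac*(sorted[lower+1]-sorted[lower])
}

// describe computes count, distinct count, mean and quantiles.
// It returns the zero value for an empty input.
func describe(values []float64) NumericStats {
	if len(values) == 0 {
		return NumericStats{}
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)

	seen := make(map[float64]struct{})
	sum := 0.0
	for _, v := range sorted {
		seen[v] = struct{}{}
		sum += v
	}
	return NumericStats{
		Count:  len(sorted),
		Unique: len(seen),
		Mean:   sum / float64(len(sorted)),
		Min:    sorted[0],
		Q25:    quantile(sorted, 0.25),
		Q50:    quantile(sorted, 0.50),
		Q75:    quantile(sorted, 0.75),
		Max:    sorted[len(sorted)-1],
	}
}

func main() {
	// Ages taken from the ten rows of table1.jsonl above.
	ages := []float64{29, 35, 61, 73, 47, 95, 47, 65, 45, 80}
	fmt.Printf("%+v\n", describe(ages))
}
```

This version keeps every value in memory to sort it. A streaming implementation would probably need an approximate quantile method, which is outside the scope of this sketch.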

Requirements

  • Write at least one venom test
  • Use Cobra library for the analyse command
  • Minimal main.go, no business logic
  • Business logic isolated in a single package under pkg/ folder
  • neon compile command passes without errors
  • neon lint command passes without errors
  • neon test command passes without errors
  • neon test-int command passes without errors
  • Read the JSON Lines input as a stream, then analyse the columns
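
One possible way to meet the streaming requirement is sketched below in Go, using only the standard library. It decodes one JSON object at a time with json.Decoder instead of loading the whole file, and records a type and a row count per column. The columnSummary type and the inferType, analyse and analyseString functions are hypothetical names. The "integer" and "string" labels come from the expected rimo.yaml; "float", "boolean" and "unknown" are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// columnSummary is a hypothetical per-column accumulator.
type columnSummary struct {
	Type  string
	Count int
}

// inferType maps a decoded JSON value to a type label.
func inferType(v interface{}) string {
	switch n := v.(type) {
	case string:
		return "string"
	case bool:
		return "boolean" // assumed label
	case json.Number:
		if _, err := n.Int64(); err == nil {
			return "integer"
		}
		return "float" // assumed label
	default:
		return "unknown" // assumed label for null, arrays and objects
	}
}

// analyse reads JSON objects from r one at a time and updates the
// summaries, so memory use does not grow with the number of rows.
func analyse(r io.Reader) (map[string]*columnSummary, error) {
	dec := json.NewDecoder(r)
	dec.UseNumber() // keep numbers as json.Number to tell integers from floats
	cols := make(map[string]*columnSummary)
	for {
		var row map[string]interface{}
		err := dec.Decode(&row)
		if err == io.EOF {
			return cols, nil
		}
		if err != nil {
			return nil, err
		}
		for name, value := range row {
			c, ok := cols[name]
			if !ok {
				// The type is taken from the first value seen for the column.
				c = &columnSummary{Type: inferType(value)}
				cols[name] = c
			}
			c.Count++
		}
	}
}

// analyseString is a convenience wrapper for in-memory examples.
func analyseString(data string) (map[string]*columnSummary, error) {
	return analyse(strings.NewReader(data))
}

func main() {
	// Two rows copied from table1.jsonl.
	data := `{"address": "PSC 4713, Box 9649 APO AA 43433", "age": 29, "date": "2013-06-11", "phone": "001-958-985-3039"}
{"address": "38432 Moreno Turnpike Garrettland, TN 72939", "age": 95, "date": "2001-08-23", "phone": "7795418893"}`
	cols, err := analyseString(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range []string{"address", "age", "date", "phone"} {
		fmt.Printf("%s: type=%s count=%d\n", name, cols[name].Type, cols[name].Count)
	}
}
```

In the real command, analyse would receive an opened <foldername>/table1.jsonl file rather than a string, and the Cobra command in main.go would only wire arguments to a function like this one in the pkg/ package.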

Venom tests that need to pass: test_venom.txt

Venom test documentation: https://github.com/ovh/venom