upsuper / csv-transformer

csv-transformer

Command line tool to rearrange CSV files in certain ways.

This was developed to help transform the Simplified Chinese results of the 2020 State of Rust survey to match the global version.

Usage

First, given a CSV file, use the following command to extract all of its columns:

csv-transformer extract original.csv > transform.yaml

This will transform a CSV file like

Question 1,Question 2,Question 3
Answer B,Answer X,
Answer A,Answer X,Random text
Answer A,Answer Y,

into

- "A: Question 1"
- "B: Question 2"
- "C: Question 3"

A string formatted as X: Header text is called a column reference, and the letters before the first colon are the column index.
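To make the column reference format concrete, here is a minimal sketch of building and parsing such strings. The function names are hypothetical and are not taken from the tool's source. Continuing past "Z" in spreadsheet style ("AA", "AB", ...) and trimming the space after the colon are assumptions.

```rust
/// Convert a zero-based column number into letters: 0 -> "A", 25 -> "Z".
/// Continuing with "AA" after "Z" is an assumption, not confirmed by the tool.
fn column_index(mut n: usize) -> String {
    let mut letters = Vec::new();
    loop {
        letters.push(b'A' + (n % 26) as u8);
        if n < 26 {
            break;
        }
        n = n / 26 - 1;
    }
    letters.reverse();
    String::from_utf8(letters).expect("ASCII letters are valid UTF-8")
}

/// Build a column reference such as "A: Question 1".
fn column_ref(n: usize, header: &str) -> String {
    format!("{}: {}", column_index(n), header)
}

/// Split a column reference at the first colon into (index, header text).
/// Later colons stay in the header, so "B: Ratio: 1:2" keeps its full header.
fn parse_column_ref(reference: &str) -> Option<(&str, &str)> {
    let (index, header) = reference.split_once(':')?;
    Some((index, header.trim_start()))
}

fn main() {
    let headers = ["Question 1", "Question 2", "Question 3"];
    for (n, header) in headers.iter().enumerate() {
        println!("- \"{}\"", column_ref(n, header));
    }
    println!("{:?}", parse_column_ref("B: Ratio: 1:2"));
}
```

Splitting only at the first colon matters because header text may itself contain colons.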

You can then edit the YAML file to describe the transformation you want. Please refer to the Transformations section for the available transformations. As an example, this is the transformation file we used for the survey: transform.yaml.

After you edit it, you can use the following command to generate the result:

csv-transformer transform original.csv transform.yaml > result.csv

Transformations

Each item in the YAML file is a rule that generates one or more columns of the result, and the generated columns appear in the order the rules are listed. An item left untouched (just a column reference) preserves that column as is. Otherwise, the item can be one of the following transformations.

Rename

A rename transformation changes only the header text.

Example:

- transform: rename
  header: "New Header"
  column: "A: Old Header"

transforms

Old Header
Value 1
Value 2

to

New Header
Value 1
Value 2

Timestamp

A timestamp transformation reformats date and time values using strftime-style format strings.

Example:

- transform: timestamp
  column: "A: Timestamp"
  from: "%d-%b-%Y %H:%M:%S"
  to: "%d/%m/%Y %H:%M:%S"

transforms

Timestamp
26-Sep-2020 01:12:42
25-Sep-2020 23:23:52

to

Timestamp
26/09/2020 01:12:42
25/09/2020 23:23:52

Optionally, you can also provide a header field to rename the column at the same time.
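As a rough illustration of what the example above does to each value, the following sketch hard-codes the one format pair from the example ("%d-%b-%Y %H:%M:%S" to "%d/%m/%Y %H:%M:%S"). It is not a general strftime engine and is not how the tool itself is implemented. The function name is hypothetical, and the time of day is passed through unvalidated because both formats share the same "%H:%M:%S" part.

```rust
/// Convert "26-Sep-2020 01:12:42" into "26/09/2020 01:12:42".
/// Returns None when the value does not look like "%d-%b-%Y <time>".
fn reformat_timestamp(value: &str) -> Option<String> {
    const MONTHS: [&str; 12] = [
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
    ];
    // Separate the date from the time of day at the first space.
    let (date, time) = value.split_once(' ')?;
    let mut parts = date.splitn(3, '-');
    let day: u32 = parts.next()?.parse().ok()?;
    let month_name = parts.next()?;
    let year: u32 = parts.next()?.parse().ok()?;
    // %b is the abbreviated month name; map it to a 1-based month number.
    let month = MONTHS
        .iter()
        .position(|m| m.eq_ignore_ascii_case(month_name))?
        + 1;
    // The time part is copied unchanged (and not validated) in this sketch.
    Some(format!("{:02}/{:02}/{:04} {}", day, month, year, time))
}

fn main() {
    for value in ["26-Sep-2020 01:12:42", "25-Sep-2020 23:23:52"] {
        println!("{:?}", reformat_timestamp(value));
    }
}
```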

Join

A join transformation concatenates values from multiple columns into a single column.

Example:

- transform: join
  header: "Question?"
  columns:
  - "A: Question? Rust"
  - "B: Question? C++"
  - "C: Question? Python"

transforms

Question? Rust | Question? C++ | Question? Python
Rust           |               | Python
               | C++           |
Rust           | C++           | Python

to

Question?
Rust, Python
C++
Rust, C++, Python

Optionally, you can provide a sep field to change the separator from the default (", ") to something else.

You can also lightly format the values of a column before joining by replacing its column reference with an object.

Example:

- transform: join
  header: "Conference anywhere?"
  columns:
  - column: "A: Conference in China?"
    format: "China - {}"
  - column: "B: Conference outside China?"
    format: "outside China - {}"

transforms

Conference in China? | Conference outside China?
Yes                  | No
Maybe                |
                     | No

to

Conference anywhere?
China - Yes, outside China - No
China - Maybe
outside China - No
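The behaviour shown in the two join examples can be sketched per row as follows. This is an illustration under assumptions, not the tool's actual code: the JoinColumn type and join_row function are hypothetical, empty cells are assumed to be skipped (as the examples suggest), and "{}" in format is assumed to be replaced by the cell value.

```rust
/// One input column of a join rule (hypothetical type for illustration).
struct JoinColumn<'a> {
    /// Zero-based position of the column in the row.
    index: usize,
    /// Optional format string; "{}" stands for the cell value.
    format: Option<&'a str>,
}

/// Join the non-empty cells of the selected columns with `sep`.
fn join_row(row: &[&str], columns: &[JoinColumn<'_>], sep: &str) -> String {
    columns
        .iter()
        .filter_map(|col| {
            let value = row.get(col.index).copied().unwrap_or("");
            if value.is_empty() {
                return None; // empty cells contribute nothing to the result
            }
            Some(match col.format {
                Some(fmt) => fmt.replace("{}", value),
                None => value.to_string(),
            })
        })
        .collect::<Vec<_>>()
        .join(sep)
}

fn main() {
    let columns = [
        JoinColumn { index: 0, format: Some("China - {}") },
        JoinColumn { index: 1, format: Some("outside China - {}") },
    ];
    for row in [["Yes", "No"], ["Maybe", ""], ["", "No"]] {
        println!("{}", join_row(&row, &columns, ", "));
    }
}
```

Passing a different `sep` argument corresponds to setting the sep field in the YAML rule.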

Transpose

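A transpose transformation swaps values and headers across several columns: each value found in the source columns becomes a target column, and the cell under it holds the label of the source column where that value appeared.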

Example:

- transform: transpose
  sources:
    "A: Question? 1st": 1st
    "B: Question? 2nd": 2nd
    "C: Question? 3rd": 3rd
  columns:
    "Question? Go": Go
    "Question? C++": C++
    "Question? Rust": Rust
    "Question? Python": Python

transforms

Question? 1st | Question? 2nd | Question? 3rd
Rust          | C++           | Go
Python        | Go            | Rust

to

Question? Go | Question? C++ | Question? Rust | Question? Python
3rd          | 2nd           | 1st            |
2nd          |               | 3rd            | 1st

If a value is present in multiple source columns, the first matching one is picked.

An error is raised if a non-empty value in the source columns cannot be mapped to a target column.
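The rules above (first match wins, unmapped non-empty values are errors) can be sketched per row like this. The transpose_row function and its parameter layout are hypothetical and only model the documented behaviour. They are not the tool's implementation.

```rust
/// Transpose one row.
/// `sources` lists (source column position, label such as "1st") in rule order.
/// `targets` lists the values that become target columns, in output order.
fn transpose_row(
    row: &[&str],
    sources: &[(usize, &str)],
    targets: &[&str],
) -> Result<Vec<String>, String> {
    let mut out = vec![String::new(); targets.len()];
    for &(pos, label) in sources {
        let value = row.get(pos).copied().unwrap_or("");
        if value.is_empty() {
            continue; // empty source cells are ignored
        }
        // A non-empty value with no target column is an error.
        let target = targets
            .iter()
            .position(|t| *t == value)
            .ok_or_else(|| format!("value {:?} has no target column", value))?;
        // Keep the first matching source if the value appears more than once.
        if out[target].is_empty() {
            out[target] = label.to_string();
        }
    }
    Ok(out)
}

fn main() {
    let sources = [(0, "1st"), (1, "2nd"), (2, "3rd")];
    let targets = ["Go", "C++", "Rust", "Python"];
    for row in [["Rust", "C++", "Go"], ["Python", "Go", "Rust"]] {
        println!("{:?}", transpose_row(&row, &sources, &targets));
    }
}
```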

Development

All transformations live in the src/transform directory, and new transformations can be added there.

License

Copyright (C) 2020 Xidorn Quan

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
