beta / csv

Golang package for reading and writing CSV-like documents.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

CSV

godoc

A Golang package for reading and writing CSV-like documents.

Working in progress.

Why another CSV package?

Golang provides an encoding/csv package out of the box for reading and writing standard CSV files (as described in RFC 4180). However, not all CSV documents follow the specs. Although the built-in csv package provides some sort of customizability, it cannot cover all variations of these CSV formats.

Also, the built-in csv package does not implement a marshaler/unmarshaler (like in json) for generating/parsing CSV documents from/into struct instances.

This package aims to offer better customizability than the built-in csv package, as well as a set of easy-to-use marshaler and unmarshaler.

Installation

$ go get -u github.com/beta/csv

Getting started

TBD.

Customization

csv uses setting functions for customization. The default setting used by csv supports a flexible variation of the standard CSV format.

  • The default encoding is UTF-8.

  • , is used as the default separator.

  • Single quotes are allowed. Escaping works exactly the same as double quotes.

    "field 1",'field 2','field 3 ''escaped''','field "4"'

    will be parsed as

    ["field 1", "field 2", "field 3 'escaped'", "field \"4\""]

  • Empty fields are allowed.

    field 1,,field 3

    will be parsed as

    ["field 1", "", "field 3"]

  • An ending line break in the last record is allowed.

  • Leading and trailing spaces in fields will be ignored.

    field 1 , field 2

    will be parsed as

    ["field 1", "field 2"]

  • Empty lines are omitted.

  • The marshaler by default outputs the header row based on the csv tag of struct fields.

Below lists all the settings that can be used to customize the behavior of csv.

Common settings

Setting Description Default
Encoding(encoding.Encoding) Sets the character encoding used while reading and writing a document. unicode.UTF8
Separator(rune) Sets the separator used to separate fields while reading and writing a document. ,
Prefix(rune) Sets the prefix of every field while reading and writing a document.
Suffix(rune) Sets the suffix of every field while reading and writing a document.

Scanner settings

Setting Description Default
AllowSingleQuote(bool) Sets whether single quotes are allowed while scanning a document. true
AllowEmptyField(bool) Sets whether empty fields are allowed while scanning a document. true
AllowEndingLineBreakInLastRecord(bool) Sets whether the last record may have an ending line break while reading a document. true
OmitLeadingSpace(bool) Sets whether the leading spaces of fields should be omitted while scanning a document. true
OmitTrailingSpace(bool) Sets whether the trailing spaces of fields should be omitted while scanning a document. true
OmitEmptyLine(bool) Sets whether empty lines should be omitted while reading a document. true
Comment(rune) Sets the leading rune of comments used while scanning a document.
IgnoreBOM(bool) Sets whether the leading BOM (byte order mark) should be ignored while reading a document. If not, the BOM will be treated as normal content.
This should not be done by a csv package, but since Golang has no built-in support for BOM, a workaround is required.
true

Unmarshaler and marshaler settings

Setting Description Default
HeaderPrefix(rune) Sets the prefix rune of header names while unmarshaling and marshaling a document.
If a header prefix is set, the Prefix setting will be ignored while reading and writing the header row, but will still be used for fields.
HeaderSuffix(rune) Sets the suffix rune of header names while unmarshaling and marshaling a document.
If a header suffix is set, the Suffix setting will be ignored while reading and writing the header row, but will still be used for fields.
FieldPrefix(rune) Sets the prefix rune of fields while unmarshaling and marshaling a document.
If a field prefix is set, the Prefix setting will be ignored while reading and writing fields, but will still be used for the header.
FieldSuffix(rune) Sets the suffix rune of fields while unmarshaling and marshaling a document.
If a field suffix is set, the Suffix setting will be ignored while reading and writing fields, but will still be used for the header.

Unmarshaler settings

Setting Description Default
Validator(string, func(interface{}) bool) Adds a new validator function for validating a CSV value while unmarshaling a document.

Marshaler settings

Setting Description Default
WriteHeader(bool) Sets whether to output the header row while writing the document. true

All scanner settings can be used in an unmarshaler. Also, all generator settings can be used in an marshaler.

Beside the settings above, there's a special setting named RFC4180 which applies the requirements as described in RFC 4180, including

  • using , as the separator,
  • no prefix and suffix,
  • not allowing single quotes,
  • not allowing empty fields,
  • allowing an ending line break in the last record,
  • not omitting leading and trailing spaces,
  • not omitting empty lines, and
  • not allowing comments.

License

MIT

About

Golang package for reading and writing CSV-like documents.

License:MIT License


Languages

Language:Go 100.0%