jitwit / jsv

mini csv parser for J

JSV

Parses CSVs into inverted tables using J's sequential machine primitive (;:), configured as a Mealy machine.

Parsing

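NB. read the file at path y and parse it
rcsv =: [: pcsv 1!:1@<@jpath
NB. parse csv bytes y into an inverted table: header row over boxed columns
pcsv =: 3 : 0
  NB. header: fields of the first line (up to the first rsep), cleaned
  hd=. cln &.> (0;mm;ma) ;: (j=. y i. rsep){.y =. stripbom y
  NB. body: (start,length) of every field, grouped by column, then cut out and cleaned
  hd,:((#fs)$i.#hd) ([:<(cln;.0)&y)/.,."1 fs=.(2;mm;ma);:y=.(1+j)}.y
)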
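
The second line of pcsv relies on key (/.) to turn a flat run of fields into columns. A minimal sketch of that grouping step on its own follows. The names fields, ncols, keys and cols are made up for illustration, and plain boxed strings stand in for the (start, length) pairs that pcsv actually groups.

```j
NB. six fields in the order they appear in the body of a 2-column csv
fields =: 'a';'b';'c';'d';'e';'f'
ncols  =: 2
NB. cyclic column index per field: 0 1 0 1 0 1
keys   =: (#fields) $ i. ncols
NB. key (/.) boxes together the fields that share a column index
cols   =: keys </. fields
```

Here cols holds two boxes, one per column, which is the shape of the second row of the inverted table.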

Mealy machine

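NB. TODO: how to deal with unicode input? It seems to give a domain error.
'qchr csep rsep'=: '"';',';LF NB. quote, field separator, record separator
ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep NB. alphabet -> char class (0 quote, 1 csep, 2 rsep, 3 other)
NB. state table: 4 states x 4 char classes x (next state, action)
NB. actions used: 0 none, 1 start word, 2 emit word and start the next, 3 emit word and reset, 6 stop
mm =: 4 4 2 $ , ". ;. _2 ] 0 : 0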
2 1  0 2  0 3  1 1 NB. limbo
0 6  0 2  0 3  1 0 NB. field
3 0  2 0  2 0  2 0 NB. quoted field, quote escapes self
2 0  0 2  0 3  2 0 NB. escaped quote or end of quoted field
)
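
As a rough illustration of what the machine emits, the sketch below runs it over a single record. It repeats the definitions above so that it can be pasted into a fresh session, with the state table written as one numeric list. The names classes and words are only for this example.

```j
'qchr csep rsep' =: '"';',';LF
ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep
NB. same state table as above, written as a flat list of 32 numbers
mm =: 4 4 2 $ 2 1 0 2 0 3 1 1  0 6 0 2 0 3 1 0  3 0 2 0 2 0 2 0  2 0 0 2 0 3 2 0
NB. char classes of a letter, the separator, the quote and LF
classes =: (a. i. 'a,"',LF) { ma
NB. function code 0 returns the emitted words as boxes
words =: (0;mm;ma) ;: 'ab,,"c,d"'
```

With these definitions words should hold three boxes: the field ab, a lone comma standing in for the empty field, and the quoted field with its quotes still attached. Cleaning those up is the job of cln.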

Removing quotes

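NB. if the field starts and ends with a quote, drop both and collapse doubled quotes
unq =: ((#~ [: -. (2#qchr)&E.)@}.@}:) ^: ((2#qchr) -: 0 _1&{)
NB. an empty field arrives as a lone csep: map it to the empty string, else unquote
cln =: unq`(0&{.)@.(-:&(,csep))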
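
A small sketch of the two verbs at work, assuming the default quote and separator. The definitions are repeated so the block runs on its own, and the names quoted, plain and blank are only for illustration.

```j
qchr =: '"'
csep =: ','
unq =: ((#~ [: -. (2#qchr)&E.)@}.@}:) ^: ((2#qchr) -: 0 _1&{)
cln =: unq`(0&{.)@.(-:&(,csep))
quoted =: unq '"he said ""hi"""'   NB. quoted field containing an escaped quote
plain  =: cln 'plain'              NB. unquoted field passes through unchanged
blank  =: cln ,','                 NB. lone separator stands for an empty field
```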

Optionally specify the field separator and quote character, in that order, as a boxed pair of single characters.

create =: 3 : 0
  if. #y do.
    assert. (1 1 -: #&>y) *. 2=#y
    'csep qchr' =: y
    NB. recalculate alphabet -> char class
    ma =: a. (e.&> i. 1:)"0 _ qchr;csep;rsep
  end.
)
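
The assert. line accepts only a boxed pair of single characters. The sketch below isolates that check in a hypothetical verb named valid, which is not part of jsv, to show which arguments pass.

```j
NB. hypothetical helper: the same test as the assert. line in create
valid =: 3 : '(1 1 -: #&> y) *. 2 = # y'
ok      =: valid ';' ; '"'        NB. separator and quote, one char each
toolong =: valid ';;' ; '"'       NB. separator has two chars
toomany =: valid ';' ; '"' ; 'x'  NB. three boxes instead of two
```

Only the first call yields 1, so only that argument would get past the assertion.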

Import a csv

Populate its locale with names for the columns, each pointing to that column's data (WIP).

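import =: 3 : 0
  table =: pcsv y
  NB. loop over the header row of the parsed table, not the raw input y
  for_c. {. table do.
    d =. > (<1,c_index) { table NB. data of column c_index
    ". (>c),' =: d'
  end.
)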
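
The idea of naming each column after its header can be tried without a csv file. The sketch below builds a small inverted table by hand and defines one global name per column. The verb define and the sample data are hypothetical, and the approach assumes every header is a valid J name.

```j
NB. hand-built inverted table: header row over boxed columns
table =: ('id';'name') ,: (1 2 3) ; < 'ann';'bob';'cy'

NB. hypothetical verb: define a global name for each column of table y
define =: 3 : 0
  for_c. {. y do.
    d =. > (<1,c_index) { y   NB. open the box in row 1, column c_index
    ". (>c),' =: d'           NB. e.g. executes: id =: d
  end.
  i. 0 0
)

define table
```

After the call, id and name are ordinary nouns holding the column data. A header containing a space or starting with a digit would make the executed sentence fail.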

Zdefs

rcsv_z_ =: rcsv_jsv_ NB. read from file
pcsv_z_ =: pcsv_jsv_ NB. read from bytes
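
Definitions in the z locale are visible from every other locale, so these two lines make rcsv and pcsv callable without a locative. A minimal sketch of the same mechanism with a throwaway class follows. The locale demo and the verb double are made up for this example.

```j
coclass 'demo'             NB. hypothetical class locale
double =: +:
cocurrent 'base'
double_z_ =: double_demo_  NB. expose the verb through the z locale
answer =: double 21        NB. found via z, even though base defines no double
```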

Byte Order Mark

All together

coclass 'jsv'
<<bom>>
<<mealy>>
<<unquot>>
<<create>>
<<db>>
<<read>>
<<zdefs>>
