BrianHicks / elm-csv

Decode CSV in the most boring way possible.

Home Page: https://package.elm-lang.org/packages/BrianHicks/elm-csv/latest/

Decode CSV as `Dict String (List String)` and/or `List (Dict String String)`

sebsheep opened this issue

Use case: I'd like to give my users a way to "explore" a CSV with "headers". So from a CSV that looks like:

name,first name,age
sheep,seb,18
hicks,brian,21

I'd like to have the data either by column:

Dict.fromList [("name", ["sheep", "hicks"]), ("first name", ["seb", "brian"]), ("age", ["18", "21"])]

or by row:

[ Dict.fromList [("name", "sheep"), ("first name", "seb"), ("age", "18")]
, Dict.fromList [("name", "hicks"), ("first name", "brian"), ("age", "21")]
]

This falls squarely in the "use Csv.Parser" bucket of things you can do with this library. The functions in there say to open an issue if you need them—I guess you already saw that! If this turns out to be a common need, we can add it, so thank you for saying something!

Yes, I saw this sentence saying to open an issue, that's why I did it :)

I would need this exact thing. We have an application that lets users type in arbitrary combinations of fields:

field1, field2, ... fieldn
value1, value2, ... valuen
value1, value2, ... valuen

These fields ultimately become input to another library. There are a lot of fields in question (>4000), so users specify a subset of them. Here's the function I wrote that does that. Perhaps it could be dropped into the library with some modifications.

import Csv.Parser as P
import Dict exposing (Dict)
import List.Extra exposing (transpose)


getDict : String -> Result P.Problem (Dict String (List String))
getDict s =
    let
        -- Treat the first row as headers and pair each header with its column.
        zipFirstRow : List (List a) -> Result P.Problem (List ( a, List a ))
        zipFirstRow lst =
            case lst of
                [] ->
                    -- TODO: P.NoHeaderRow is a placeholder. Add this case to
                    -- P.Problem, or create a new error type.
                    Err P.NoHeaderRow

                xs :: xss ->
                    -- transpose turns the remaining rows into columns.
                    Ok (List.map2 Tuple.pair xs (transpose xss))
    in
    s
        |> P.parse { fieldSeparator = ',' }
        |> Result.andThen zipFirstRow
        |> Result.map Dict.fromList
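For the by-row shape asked for at the top of the thread, a similar approach seems to work. This is only a minimal sketch: it assumes the rows have already been parsed into a `List (List String)` (for example by the `P.parse` call above), and the names `CsvRows` and `toRows` are made up for illustration, not part of elm-csv.

```elm
module CsvRows exposing (toRows)

import Dict exposing (Dict)


{-| Pair every data row with the header row (assumed to be the first row).

Returns Nothing when there is no header row at all. List.map2 stops at the
shorter list, so a row with extra cells loses them and a short row simply
produces fewer keys.

-}
toRows : List (List String) -> Maybe (List (Dict String String))
toRows rows =
    case rows of
        [] ->
            Nothing

        headers :: records ->
            records
                |> List.map (\record -> Dict.fromList (List.map2 Tuple.pair headers record))
                |> Just
```

With the example CSV from the first post, `toRows [ [ "name", "first name", "age" ], [ "sheep", "seb", "18" ], [ "hicks", "brian", "21" ] ]` gives `Just` a list of two dictionaries, one per data row. Returning `Maybe` here just sidesteps the question of which error type to use.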

Personally, I parse the values into a custom type, which complicates things because I have to wrap errors all over the place. It would be great if this library could handle that by accepting a function! I believe it would need a (P.Problem -> error) parameter.

-- Define a custom error type to hold all errors
type MyError
    = NoHeader
    | CsvParseError P.Problem
    | MyParserError (List (List MyTypeParseError))

type MyTypeParseError = ... -- Errors for turning a String into your type
type MyType = ... -- Whatever you need it to be

parseIntoMyType : List (String, List String) -> Result MyError (List (String, List MyType))
-- define this how you need it to be
-- I found it useful to write
--     (String, List String) -> Result (List MyTypeParseError) MyType
-- and then collect the errors into the variant
--     MyParserError (List (List MyTypeParseError))

getDict : String -> Result MyError (Dict String (List MyType))
getDict s =
    let
        -- split the first row of data as headers
        zipFirstRow : List (List a) -> Result MyError (List ( a, List a ))
        zipFirstRow lst =
            case lst of
                [] ->
                    Err NoHeader

                xs :: xss ->
                    Ok (List.map2 Tuple.pair xs (transpose xss))
    in
    s
        |> P.parse { fieldSeparator = ',' }
        |> Result.mapError CsvParseError
        |> Result.andThen zipFirstRow
        |> Result.andThen parseIntoMyType
        |> Result.map Dict.fromList
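
To make the (P.Problem -> error) idea concrete, here is a rough sketch of what a generalized version might look like. It is not an existing elm-csv function: `CsvColumns`, `getDictWith`, `toError`, and `noHeader` are hypothetical names, and it assumes `P.parse` has the signature used in the snippets above and that `transpose` comes from elm-community/list-extra.

```elm
module CsvColumns exposing (getDictWith)

import Csv.Parser as P
import Dict exposing (Dict)
import List.Extra exposing (transpose)


{-| Column-oriented decoding with a caller-supplied error type.

  - `toError` wraps parser problems in the caller's own error type.
  - `noHeader` is the error to return when there are no rows at all.

-}
getDictWith : (P.Problem -> error) -> error -> String -> Result error (Dict String (List String))
getDictWith toError noHeader source =
    source
        |> P.parse { fieldSeparator = ',' }
        |> Result.mapError toError
        |> Result.andThen
            (\rows ->
                case rows of
                    [] ->
                        Err noHeader

                    headers :: records ->
                        -- transpose turns the data rows into columns
                        Ok (Dict.fromList (List.map2 Tuple.pair headers (transpose records)))
            )
```

With the `MyError` type above, this would be called as `getDictWith CsvParseError NoHeader`, and the result can then be piped into further `Result.andThen` steps without any extra `Result.mapError` wrapping.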