fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming

Home Page: http://fslab.org/Deedle/

Drop columns whose elements are all missing values

hoangdungt2 opened this issue

Hi,

In pandas, dropna has a how flag that specifies how dropping works (any or all).
In Deedle.Frame there is only dropSparseCols, which behaves like pandas dropna with how=any. I wonder whether we could add the how=all case (drop a column only if all of its values are missing). I looked at the source code for dropSparseCols:

  let dropSparseCols (frame:Frame<'R, 'C>) =
    let newColKeys, newData =
      [| for KeyValue(colKey, addr) in frame.ColumnIndex.Mappings do
            match frame.Data.GetValue(addr) with
            | OptionalValue.Present(vec) when vec.ObjectSequence |> Seq.forall (fun o -> o.HasValue) ->
                yield colKey, vec
            | _ -> () |] |> Array.unzip
    let colIndex = frame.IndexBuilder.Create(ReadOnlyCollection.ofArray newColKeys, None)
    Frame(frame.RowIndex, colIndex, frame.VectorBuilder.Create(newData), frame.IndexBuilder, frame.VectorBuilder )

I noticed that we only need to change Seq.forall to Seq.exists to get the how=all case: a column is then kept as long as at least one of its values is present.
I wonder whether this could be added in a future release.

Thanks for the suggestion. I've added dropEmptyRows and dropEmptyCols. I will include them in the next release.
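
For readers comparing the two behaviours discussed in this thread, here is a minimal plain-F# sketch of the forall versus exists predicates. It does not use Deedle: a frame is modelled as a Map from column key to an array of optional cells, and the names dropSparse, dropEmpty, sample and keys are hypothetical helpers for illustration only. It is not the implementation of dropSparseCols or dropEmptyCols.

```fsharp
// Illustrative model only: column key -> array of optional cells.
// This is NOT Deedle's API; it mirrors the predicates discussed in the thread.
type Columns = Map<string, float option []>

/// Keep a column only when every cell has a value
/// (the forall predicate, analogous to pandas how="any").
let dropSparse (cols: Columns) : Columns =
    cols |> Map.filter (fun _ cells -> cells |> Array.forall Option.isSome)

/// Keep a column when at least one cell has a value
/// (the exists predicate, analogous to pandas how="all").
let dropEmpty (cols: Columns) : Columns =
    cols |> Map.filter (fun _ cells -> cells |> Array.exists Option.isSome)

// Three sample columns: fully populated, partly missing, entirely missing.
let sample : Columns =
    Map.ofList
        [ "full",    [| Some 1.0; Some 2.0; Some 3.0 |]
          "partial", [| Some 1.0; None;     Some 3.0 |]
          "empty",   [| None;     None;     None     |] ]

/// Column keys of a model frame, in key order.
let keys (cols: Columns) = cols |> Map.toList |> List.map fst

printfn "%A" (keys (dropSparse sample))  // ["full"]
printfn "%A" (keys (dropEmpty sample))   // ["full"; "partial"]
```

The "partial" column is where the two predicates differ: forall drops it because one cell is missing, while exists keeps it because at least one cell is present.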