Drop columns in which all elements are missing values
hoangdungt2 opened this issue · comments
josevut2 commented
Hi,
In pandas `dropna`, there is a `how` flag that specifies how `dropna` works (`any` or `all`). In `Deedle.Frame`, there is only `dropSparseCols`, which behaves like pandas `dropna` with `how='any'`. I wonder if we can add the `all` case (drop a column only if all of its values are missing). I looked at the source code for `dropSparseCols`:
    let dropSparseCols (frame:Frame<'R, 'C>) =
      let newColKeys, newData =
        [| for KeyValue(colKey, addr) in frame.ColumnIndex.Mappings do
             match frame.Data.GetValue(addr) with
             | OptionalValue.Present(vec) when vec.ObjectSequence |> Seq.forall (fun o -> o.HasValue) ->
                 yield colKey, vec
             | _ -> () |] |> Array.unzip
      let colIndex = frame.IndexBuilder.Create(ReadOnlyCollection.ofArray newColKeys, None)
      Frame(frame.RowIndex, colIndex, frame.VectorBuilder.Create(newData), frame.IndexBuilder, frame.VectorBuilder)
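I noticed that we just need to change `Seq.forall` to `Seq.exists` to get the `how='all'` behavior: a column is then kept as long as at least one of its values is present, so only columns whose values are all missing get dropped.
Could this be added in a future release?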
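To illustrate the difference between the two predicates, here is a minimal sketch that does not use Deedle at all. It models a column as an array of `int option` values, where `None` stands for a missing value. The names `keepIfAllPresent`, `keepIfAnyPresent`, `columns`, and `keptNames` are made up for this example and are not part of the Deedle API.

```fsharp
// how = "any": keep a column only if every value is present.
// This mirrors the Seq.forall check in dropSparseCols.
let keepIfAllPresent (values: int option []) =
    values |> Seq.forall Option.isSome

// how = "all": keep a column if at least one value is present,
// so only completely empty columns are dropped.
let keepIfAnyPresent (values: int option []) =
    values |> Seq.exists Option.isSome

// Hypothetical sample data: one full, one partially missing, one empty column.
let columns =
    [ "full",    [| Some 1; Some 2; Some 3 |]
      "partial", [| Some 1; None;   Some 3 |]
      "empty",   [| None;   None;   None   |] ]

// Return the names of the columns that survive the given predicate.
let keptNames keep =
    columns
    |> List.filter (fun (_, values) -> keep values)
    |> List.map fst

let keptAny = keptNames keepIfAllPresent   // ["full"]
let keptAll = keptNames keepIfAnyPresent   // ["full"; "partial"]

printfn "how=any keeps %A" keptAny
printfn "how=all keeps %A" keptAll
```

With `Seq.forall`, the partially missing column is dropped along with the empty one; with `Seq.exists`, only the empty column is dropped.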
Zhenyong Zhu commented
Thanks for the suggestion. I've added `dropEmptyRows` and `dropEmptyCols`. They will be included in the next release.