joshday / OnlineStats.jl

⚡ Single-pass algorithms for statistics

Home Page: https://joshday.github.io/OnlineStats.jl/latest/

Multiple statistics with multiple variables

Moelf opened this issue

I know fit!() works with an iterator, but what if I need to return multiple pairs of (value, weight) from the iterator because I want to make many weighted histograms in one pass of the data?

Can you clarify a bit? I don't think I follow.

Taking this example from the docs:

itr = (row.variety => parse(Float64, row.sepal_length) for row in rows)

o = GroupBy(String, Hist(4:0.25:8))

fit!(o, itr)

What if:

  1. Each observation from itr carries a weight (the histogram fill weight).
  2. Each histogram has different binning (say "Setosa" uses 4:0.5:8 and "Virginica" uses 6:0.25:8).

You may have to roll a few things on your own.
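For what it's worth, here is a rough sketch of what "rolling your own" could look like in plain Julia, without relying on any weighting support in OnlineStats. The `WeightedHist` type, the `addobs!` function, and the sample data are all made up for illustration and are not part of the package. It keeps one histogram per group, each with its own bin edges, and fills them all in a single pass over (group, value, weight) observations.

```julia
# Hypothetical helper type (not part of OnlineStats): a fixed-bin weighted histogram.
struct WeightedHist
    edges::Vector{Float64}    # bin edges, sorted ascending
    counts::Vector{Float64}   # sum of weights per bin
end

# Build an empty histogram from any range or vector of edges.
WeightedHist(edges::AbstractVector{<:Real}) =
    WeightedHist(collect(Float64, edges), zeros(length(edges) - 1))

# Add one observation x with weight w.
# Bins are left-closed, [e_i, e_{i+1}); values outside the edges are dropped.
function addobs!(h::WeightedHist, x::Real, w::Real)
    i = searchsortedlast(h.edges, x)
    if 1 <= i < length(h.edges)
        h.counts[i] += w
    end
    return h
end

# One histogram per group, each with its own binning.
hists = Dict(
    "Setosa"    => WeightedHist(4:0.5:8),
    "Virginica" => WeightedHist(6:0.25:8),
)

# Made-up (group, value, weight) observations, standing in for the real iterator.
data = [
    ("Setosa",    5.1, 1.0),
    ("Setosa",    4.9, 0.5),
    ("Virginica", 6.3, 2.0),
    ("Virginica", 7.1, 1.5),
    ("Setosa",    5.4, 2.0),
]

# Single pass: route each observation to its group's histogram.
for (group, x, w) in data
    addobs!(hists[group], x, w)
end
```

The same loop works with any iterator that yields a group key, a value, and a weight, so the data never has to be held in memory all at once. Note that this sketch drops a value exactly equal to the last edge; adjust `addobs!` if you want the last bin closed on the right.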

Also, I've been meaning to work on StatsBase-like weights for OnlineStats so maybe this will nudge me to do it.