Uncertainties vs. weights & Averaging over columns / coordinates.
doronbehar opened this issue
Hello. You have no idea how much I enjoy using your package. It fits exactly my usage, and I can't believe that only at this stage of my project I came to know it!
I'm working with a large xarray.Dataset with N>4 coordinates which I convert .to_dataframe() in order to plot them with seaborn.lineplot. It became confusing to me when I wanted seaborn to show my calculations' uncertainty. At first, I wasn't sure even how to save that uncertainty, until I realized that you don't call it "uncertainty", but rather the weights of the data variables for the estimation, and that they should simply be saved in a separate data variable.
If I need to perform estimation, it works pretty well, I suppose. However, I found that terminology choice a bit peculiar, because weights are only meaningful relative to each other, whereas uncertainties also have a meaning when the data is not averaged. The formulas below are the ones I'm familiar with regarding this. Note how
I also noticed that if I give seaborn.lineplot a dataset.to_dataframe() with only 1 coordinate, then the weights aren't taken into account at all. I understand that I can supply a custom function to the errorbar argument. But I think it would have been much more consistent if, instead of the weights argument, an uncertainties argument had been used, and the uncertainties had been drawn as error bars even when no estimation is required (because there is a single y per x).
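To make the single-y-per-x case concrete, here is a minimal pure-Python sketch. It assumes that a callable passed as errorbar has the shape the seaborn documentation describes (a function mapping a vector to a (min, max) interval), so it only ever sees the y values of one group. The helper name `spread_interval` and the sample numbers are hypothetical and not part of seaborn.

```python
from statistics import mean, stdev


def spread_interval(values):
    """Map a vector of y values to a (low, high) interval: mean -/+ one sample std.

    Hypothetical helper with the vector -> (min, max) shape that a callable
    `errorbar` is assumed to have. It receives only y values, so an
    uncertainty stored in a separate column is not visible to it.
    """
    values = list(values)
    center = mean(values)
    # stdev() needs at least two points; a single observation has no spread.
    half_width = stdev(values) if len(values) > 1 else 0.0
    return center - half_width, center + half_width


# Several y values at one x: the interval reflects their scatter.
several = spread_interval([9.0, 10.0, 11.0])

# A single y value at one x: the interval collapses to zero width,
# whatever uncertainty was stored alongside that value.
single = spread_interval([10.0])

print(several)
print(single)
```

Under that assumption, a custom errorbar function cannot recover a stored uncertainty when there is only one y per x. Drawing the stored values directly, for example with matplotlib's Axes.errorbar and its yerr argument, is one way to get error bars in that case.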
At first, I wasn't sure even how to save that uncertainty, until I realized that you don't call it "uncertainty", but rather the weights of the data variables for the estimation, and that they should simply be saved in a separate data variable.
Hi, I think you're thinking about this slightly wrong: the weights parameter exists so that you can compute a weighted mean, not to provide a measure of uncertainty.
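As a rough illustration of that distinction, the plain-Python sketch below shows that a weighted mean depends only on the ratios between the weights, while uncertainties set an absolute scale. It assumes the common inverse-variance convention, w_i = 1 / sigma_i^2, which is an assumption for this example and not something seaborn applies. The function names and numbers are made up.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def inverse_variance_mean(values, sigmas):
    """Weighted mean with w = 1 / sigma**2, plus the uncertainty of that mean."""
    weights = [1.0 / s**2 for s in sigmas]
    mean = weighted_mean(values, weights)
    sigma_mean = 1.0 / math.sqrt(sum(weights))
    return mean, sigma_mean


values = [10.0, 12.0]
sigmas = [1.0, 2.0]  # hypothetical per-point uncertainties

weights = [1.0 / s**2 for s in sigmas]
scaled = [100.0 * w for w in weights]

# Rescaling all weights by a constant leaves the weighted mean unchanged.
print(weighted_mean(values, weights), weighted_mean(values, scaled))

# Rescaling all uncertainties keeps the mean but changes its error.
print(inverse_variance_mean(values, sigmas))
print(inverse_variance_mean(values, [10.0 * s for s in sigmas]))
```

The first pair of means is identical, which is why weights alone cannot stand in for error bars: multiplying every uncertainty by ten leaves the mean where it was but makes its error ten times larger.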
I understood that correctly in the first place, but the way I phrased the sentence indeed implied otherwise. What I meant to say was that the closest thing related to uncertainties in seaborn is the weights parameter.
I wonder what you think about adding an uncertainties parameter that would act as I suggested? Do you think it'd be beneficial? (Please reopen 🙏)
Sorry, there's been plenty of discussion of related topics before. I'm not open to adding this.
Could you link me to those discussions? I want to know what the arguments for and against were. These search results don't show discussions about the simplest case of a seaborn.lineplot ...