hammerdirt-analyst / qualite-deau

The record of water quality monitoring in the Baye de Montreux by Hackuarium. Original manuscript was published.

Wrong implementation of coefficients

rachelaronoff opened this issue

The calculations for the older data use incorrect coefficients.
For instance, in the 2016 site-by-site weekly average graphic (under section 7), the current max for the 4th week is over 3000 bioindicator colonies per 100 ml, when it should be about 4x lower (probably 100 was used instead of the required coefficient of 25).
For 2016, every row in the csv says 100 for coef, which is incorrect.
For 2017, every row says 4 for the coefficient, but that was the 4 ml inoculated on the easygel plates, so again the coefficient needs to be corrected.
I can double-check and correct the current big .csv of data for all years and send it (maybe there are some other errors).

Correcting the 'random variable' definitions (section 1) could also be helpful to avoid future problems. I hope to soon clone this repo to test a simple edit to start off, and then be able to put things together. Thank you.

Hello,

These are the current coefficient values. Could you please give the correct interpretation in the form old-value:new-value?

100:
4:
1:
0.5:
2.:
0.1:

Thank you

Thank you.
Actually, the column 'coef' should show the amount to multiply by in order to normalise the values to what 100 ml of water sample would give (the standard for water quality reporting; again, I hope to get this into the definitions on the page about random variables very soon). However, the big problem with the recent csv file is that it has the wrong coef values for much of the 2016 data especially, while for the 2017 data it shows how many ml of water were actually put on the plates or cards, rather than the coefficient by which to normalise.
In fact, originally, all the data showed actual volumes of water sample used, so one could determine what coefficient was required to get to 100ml, i.e. if 1ml was inoculated the coef would be 100. If 4ml was inoculated then the coef would be 25. Right now, all the 2016 data seems to think only 1ml was ever used (so coef as 100).
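The rule described above (coefficient = 100 ml divided by the volume inoculated) can be sketched in a few lines of Python. This is only an illustration: the function names and the 30-colony plate are assumptions made up for the example, not code or data from the repository.

```python
def coef_for_volume(volume_ml: float) -> float:
    """Multiplier that scales a count from `volume_ml` of sample up to 100 ml."""
    if volume_ml <= 0:
        raise ValueError("inoculated volume must be positive")
    return 100.0 / volume_ml


def cfu_per_100ml(colonies: int, volume_ml: float) -> float:
    """Colony count normalised to 'per 100 ml' using the volume actually plated."""
    return colonies * coef_for_volume(volume_ml)


# Hypothetical plate: 30 colonies counted after inoculating 4 ml.
correct = cfu_per_100ml(30, 4)  # coefficient 25
wrong = 30 * 100                # what a blanket coefficient of 100 reports
```

With a 4 ml inoculation, a blanket coefficient of 100 inflates the reported value by a factor of four, which is the size of the discrepancy seen in the 2016 graphic.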
Furthermore, the amounts of water used changed in certain cases, depending on the media or plates (i.e. easygel, ECC, Levine) and even on our hope not to overload the system, for easy counts (in 2017, for the 10 July samples, we decided to inoculate only 1 ml for that very reason, although it might not have been the best choice...), so one really had to look at each week and not make assumptions. (Maybe it was a communication error on my part at one point that led you to think all the 2016 data was just 1 ml, but in fact there was even a week in 2016 when only 0.4 ml was used, requiring a coef of 250.)
So, in short, simple old/new 'translations' will not fix the problem; that is why I simply 'corrected' the last csv, already sent to you.
To reiterate, sorry, there is really no one-size-fits-all for the 'translations' as we did with colors and media, as even a typo slipped in: the '0.' in the current list of your last reply, which was probably a 2017 week where 0.5 ml was used on control plates (Levine or LB).
Can we try the last csv file I sent and see if the graphic outputs at least match the published data again for 2016/17/20, also in terms of the y-axis shown?
Thanks again.

Hello,

We will have to do this one year at a time then.

Thanks

Here is the 2016 data. https://github.com/Hackuarium/montreux-water-quality/blob/main/data/2016_Data.csv The columns Px_qty_sample show the original volumes of water plated.

For coefficients, to normalise for 'per 100ml' values:
0.4 (old volume): 250 (new coefficient to get 100ml volume)
then, for it all
0.4:250
4:25
1:100
0.5:200
2:50
0.1:1000
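As a sketch only, the list above could be applied as a lookup table that refuses any volume it does not know, so that a stray entry such as '0.' is flagged rather than silently translated. The names `VOLUME_TO_COEF` and `coef_from_volume_text` are hypothetical, and the assumption here is that volumes arrive as text read from a csv cell.

```python
from decimal import Decimal, InvalidOperation

# Volume plated (ml) -> coefficient, copied from the list in this thread.
VOLUME_TO_COEF = {
    Decimal("0.4"): 250,
    Decimal("4"): 25,
    Decimal("1"): 100,
    Decimal("0.5"): 200,
    Decimal("2"): 50,
    Decimal("0.1"): 1000,
}

# Sanity check: every pair must satisfy volume * coefficient == 100 ml.
assert all(vol * coef == 100 for vol, coef in VOLUME_TO_COEF.items())


def coef_from_volume_text(text: str) -> int:
    """Translate a volume written in a csv cell into its coefficient.

    Raises ValueError for anything not in the table, so bad rows are
    reported instead of being guessed at.
    """
    try:
        volume = Decimal(text.strip())
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}") from None
    try:
        return VOLUME_TO_COEF[volume]
    except KeyError:
        raise ValueError(f"no coefficient listed for volume {text!r}") from None
```

Parsing with `Decimal` rather than `float` means that '2.' and '2' match the same entry, while '0.' parses as zero and is rejected because no coefficient is listed for it.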

Of course, the data is probably not in the same order as the current csv.

And for the 2017 typo '0.', I should probably link the 2017 list next...

https://github.com/Hackuarium/montreux-water-quality/blob/main/data/2017_Data.csv

The translations from volumes to coefficients would be the same as above.

thx! (and again, the outputs in the .htmls are mainly incorrect right now, but the csv sent yesterday can also be tried...)

Hello,

Thank you for providing the requested information. You are only seeing a small portion of what is going on. While a file might solve YOUR problem it does not get rid of the issue with the data. Once we define a method to make the changes we will make them. This action will be part of the repository history. You can always go back.

This may seem a little procedural. However, there was a lot of good work done in the previous years. We are simply documenting changes to the work, out of respect for what was done prior. This is important. It is part of the job.

[Image: 2017_AVG_BigBlue_CFU_week]
This should show the correct y-axis for confirmation of the re-analyses... I had hoped to have it all together to show at Voltaire's old château today (Fête de la Science), but we will get there. Thanks for your help.

Hello,

This information was requested in #10. If we have this information on hand then we can compare our results.

Could you please consider your answer above with what was requested in #10? (and all your non answers)

Please take time to consider the requested information and your answers.

The updates have been made with caa2b32

Thank you

typo fixed