dsfsi / covid19za

Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa

Home Page: https://dsfsi.github.io/covid19za-dash/

gis_nicd_scraper

lrossouw opened this issue

Is your feature request related to a problem? Please describe.
The new way to share data is here:
https://sacoronavirus.co.za/live-counter/

Data is no longer shared via the NICD page, so my scraper will need to be retired and redesigned.

And it's broken...

We are unfortunately dealing with the fallout of Digital Vibes. Would NICD be better, @lrossouw?

It looked like they stopped doing them. But I see the 9th is available again. They skipped the 8th though.

I also see the NICD page layout has changed. I captured all data manually until 9 June. I'm going to wait for the dust to settle before I update my scripts, but of course this week appears to be key in terms of the stats that are coming out!

I've created something new that should (hopefully!) be more stable. It collects from various sources, so I'm no longer posting the exact URLs as the source. I'm flagging auto-scraped results as source = "gis_nicd_scraper", as most of the data is scraped from a public dashboard there. I don't have anything for vaccines yet.

I scan two or three different dashboards for the figures. I can typically get the cases and tests from one dashboard the evening they are released (but they do break the dash from time to time). I usually pick up the deaths and recoveries the next day (and the cases too, if the other dashboard is broken). I haven't solved the vaccines yet.

But these are more stable than scraping the pages of the media releases, which change all the time.

@lrossouw It seems the scraper has stopped working again? Before I do a manual update, I just want to check on its status.

Tx did not notice with all the other COVID-19 news out. Will have a look.

Sometimes the Rt calculation is still running (someone else maintains that?) and it commits back and creates a conflict with my process. So my bot keeps updating my local repo but can't push until I manually resolve the conflict.

Not sure how to fix that.

Anyway it's resolved now.

It might be this:
f8bfa83#diff-0e6e5c3c2330a562992a4157e9afb54fdea1938025dd074fec10a03e4e655aed

Can we make it pull before the push here, as my bot might have made changes while this bot was running? That way it seems less likely to get into conflicts. @vukosim do you maintain that code?

I will check late this evening. It runs after a change to the file.

My bot posts new case data, and the Rt bot runs and then updates. If new data comes in while Rt is running, my bot updates again. That creates a technical merge conflict, but the Rt bot uses --force, so it overwrites. Perhaps do a pull just before the push to bring in the latest changes, so you don't effectively reverse my or other people's changes. The Rt bot has also reversed other data I captured manually before, i.e. I capture vaccine data while it's running and then it more or less reverses it.
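To make the suggestion concrete, here is a rough sketch of "pull just before the push" without --force. It is not the Rt bot's actual code: the function names, the retry count, and the use of `git pull --rebase` are my assumptions, and the remote and branch names are guesses based on the repo URL.

```python
import subprocess
from typing import Callable, Sequence


def run_git(args: Sequence[str]) -> int:
    """Run a git command in the current directory and return its exit code."""
    return subprocess.run(["git", *args]).returncode


def sync_and_push(
    run: Callable[[Sequence[str]], int] = run_git,
    remote: str = "origin",   # assumed remote name
    branch: str = "master",   # assumed branch name
    attempts: int = 3,        # hypothetical retry limit
) -> bool:
    """Pull with rebase, then push without --force; retry if the remote moved."""
    for _ in range(attempts):
        # Replay our local commits on top of whatever another bot pushed meanwhile.
        if run(["pull", "--rebase", remote, branch]) != 0:
            # Pull failed. If a rebase is in progress (a real content conflict),
            # back out of it and leave the conflict for a human; if the failure
            # had another cause, the abort simply fails and is ignored.
            run(["rebase", "--abort"])
            return False
        # A plain push is rejected (not forced) if the remote changed again.
        if run(["push", remote, branch]) == 0:
            return True
    return False
```

Because the push is never forced, the worst case is that the bot gives up and retries on its next run, rather than overwriting someone else's commit.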

I did a manual update of Death and Recoveries today as they had not updated by mid-day.

#854

Sorry just noticed now. Will sort it out.

Fixed. @vukosim did you manage to update the bot?

Good morning. The provincial data for confirmed cases for 2021-07-05 is missing. Should I add it manually or would you prefer to have the scraper make another pass and add it instead?

Seems to be there (line 487):

https://github.com/dsfsi/covid19za/blob/master/data/covid19za_provincial_cumulative_timeline_confirmed.csv#L487

Louis, apologies if I missed something, will check my import again tonight and get back to you.

My mistake, sorry. Data for 05 July 2021 is indeed available in the confirmed cases file, but is missing from cumulative deaths and cumulative recoveries.
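A quick way to check this sort of thing across several files at once might look like the sketch below. The `date` column name and the DD-MM-YYYY format are assumptions about the CSV layout, the function name is made up, and the sample rows are placeholder figures, not real data.

```python
import csv
import io
from typing import Dict, Iterable, List


def files_missing_date(
    files: Dict[str, Iterable[str]],
    date: str,
    date_column: str = "date",  # assumed column name
) -> List[str]:
    """Return the names of the CSV sources that have no row for the given date."""
    missing = []
    for name, lines in files.items():
        rows = csv.DictReader(lines)
        if not any(row.get(date_column) == date for row in rows):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Made-up sample rows, only to show the shape of the check.
    confirmed = io.StringIO("date,total\n04-07-2021,1\n05-07-2021,2\n")
    deaths = io.StringIO("date,total\n04-07-2021,1\n")
    print(files_missing_date({"confirmed": confirmed, "deaths": deaths}, "05-07-2021"))
```

In practice you would pass open file handles for the confirmed, deaths and recoveries timelines instead of the in-memory samples.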

Ah, 5 July had an issue, I believe: https://www.nicd.ac.za/latest-confirmed-cases-of-covid-19-in-south-africa-05-july-2021/

They did not release provincial figures for deaths and recoveries on the NICD site, but I see now that they are available here:
https://sacoronavirus.co.za/2021/07/05/update-on-covid-19-05th-july-2021/

Feel free to capture.

My data source stopped providing deaths/recoveries in machine-readable form on 30 July or so. What sources are people using?

@lrossouw Does this problem apply to testing too (covid19za_timeline_testing.csv)? I don't know how I missed that it has not been updating since the end of July.

Thanks @shaze, yeah, that needs to be updated.

Also looping in @krokkie: it seems we have a few more failures, so we might need to sync between you and @lrossouw again.