CAVaccineInventory / vial

The Django application powering calltheshots.us

Home Page:https://vial.calltheshots.us

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Update location providers based on cvs: / walgreens: / etc concordance identifiers

simonw opened this issue · comments

Part of #705.

Key idea here is to ensure that all of our locations with a cvs: concordance are attached to the CVS provider, ditto for walgreens: and various other common concordance authorities.

This will allow us to pick up the appointments_url from those providers, if it is set, when we serve locations to our new API and to the Mapbox export used on www.VaccinateTheStates.com

The https://vial.calltheshots.us/api/docs#post-apiupdatelocations /api/updateLocations API can be used for this - it accepts provider_name and provider_type which do this:

vial/vaccinate/api/views.py

Lines 157 to 161 in 9cb542e

if location_data.get("provider_type"):
kwargs["provider"] = Provider.objects.update_or_create(
name=location_data["provider_name"],
defaults={"provider_type": location_data["provider_type"]},
)[0]

So I can run a script in a Jupyter notebook that retrieves the locations that need to be updated using https://vial.calltheshots.us/api/searchLocations?authority=cvs - maybe even add an extra search parameter for ?provider_null=1 to help with that.

Being able to use ?format=ids to return just the IDs of matching results would help with this (and all sorts of other things too).

This means I can know get the ID of every single location that is tagged with the cvs: authority but does not currently have a provider using:

https://vial-staging.calltheshots.us/api/searchLocations?format=ids&provider_null=1&authority=cvs&all=1

Here's a map of all locations attached to the provider called CVS: https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1 - 3832 locations

All locations with a cvs: concordance identifier: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1 - 2798 locations

All locations with a cvs: authority that are not attached to any providers: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1&provider_null=1 - 109 locations

Interestingly this suggests that, at least on staging, we have CVS locations that are NOT tagged with a cvs: concordance identifier.

https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs shows a map of every location attached to the CVS provider that does NOT have a cvs: concordance. 1322 locations.

Deploying this stuff to production to answer these questions there.

In production we only have 72 locations with provider CVS that don't have a cvs: concordance, all of which are in California: https://vial.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs

Spotted something weird: https://vial.calltheshots.us/api/searchLocations?authority=cvs returns 17707 results. But that seems like a LOT of CVSs.

https://www.google.com/search?client=firefox-b-1-d&q=how+many+branches+of+CVS suggests there are 9,967 branches in the USA.

On a hunch, I pulled and de-duped the full list of IDs using this:

curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' -H 'Authorization: Bearer 30:...' \
  | jq | sort | uniq | wc -l

Which returned 10,883 - so there's a bug in the API where multiple results can be returned.

This SQL query shows locations with multiple cvs: concordances:

select location.public_id, count(*), array_agg(authority || ':' || identifier)
from concordance_location join concordance_identifier on concordance_identifier.id = concordanceidentifier_id
join location on location_id = location.id
where authority = 'cvs' group by location.public_id
having count(*) > 1
order by count(*) desc

https://vial.calltheshots.us/dashboard/?sql=select+location.public_id%2C+count%28%2A%29%2C+array_agg%28authority+%7C%7C+%27%3A%27+%7C%7C+identifier%29%0D%0Afrom+concordance_location+join+concordance_identifier+on+concordance_identifier.id+%3D+concordanceidentifier_id%0D%0Ajoin+location+on+location_id+%3D+location.id%0D%0Awhere+authority+%3D+%27cvs%27+group+by+location.public_id%0D%0Ahaving+count%28%2A%29+%3E+1%0D%0Aorder+by+count%28%2A%29+desc%3AYNvYK31mkC1Kq8Tzr1xIQsunYgPUAqZJ2Axb-4VmNYg

There are a LOT - grouping by their duplicate count gives:

count	n
2	4193
3	358
4	259
5	189
6	64
7	9
8	2

That fixed it:

curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' \
  -H 'Authorization: Bearer 30:...' \
  | jq | sort | uniq | wc -l
   10883

https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 returns 7112 results - I'm going to assign them all to the CVS provider using the API.

Running this now:

import httpx
from tqdm.notebook import tqdm

api_key = "30:..."
url = "https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&all=1&format=ids"
ids = httpx.get(url, headers={"Authorization": "Bearer {}".format(api_key)}).json()

def make_chunks(sequence, n):
    for i in range(0, len(sequence), n): 
        yield sequence[i:i + n]

chunks = list(make_chunks(ids, 100))

for chunk in tqdm(chunks):
    httpx.post("https://vial.calltheshots.us/api/updateLocations", timeout=20, json={
      "update": {
        id: {
          "provider_type": "Pharmacy",
          "provider_name": "CVS",
        } for id in chunk
      },
      "revision_comment": "Issue #707"
    }, headers={"Authorization": "Bearer {}".format(api_key)})

Hitting https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 while it is running shows the number reducing as it works.

OK, what other authorities should I do?

Here's the top list: https://vial.calltheshots.us/dashboard/?sql=select%20%22authority%22%2C%20count%28%2A%29%20as%20n%20from%20%28select%20id%2C%20authority%2C%20identifier%2C%20created_at%20from%20concordance_identifier%29%20as%20results%20group%20by%20%22authority%22%20order%20by%20n%20desc%3AydXh_yEhn8L7UhP0pZuK5zTvFVCUmDm6ilDzvbORH7I

authority	n
placekey	89932
vaccinefinder_org	55414
us_carbon_health	46036
vaccinespotter_org	38904
getmyvax_org	32587
us_giscorps_vaccine_providers	23034
cvs	13934
walgreens	9015
google_places	8592
il_sfsites	8031
al_arcgis	7189
health_mart	4960
walmart	4679
tx_arcgis	4304
vtrcks	4284
rite_aid	4079
kroger	2713
wa_state	2526
publix	2377
fl_state	2303
sc_arcgis	2257
nc_myspot_gov	2041
md_arcgis	1731
mo_arcgis	1701
safeway	1660
wi_arcgis_map	1206
az_arcgis	1173
in_arcgis	1140
arcgis	1121
vaccinate_nj	1088
pa_arcgis	923
vaxfinder_gov	779
mn_gov	733
nyc_arcgis	707
la_tableau	703
ca_los_angeles_gov	694
ok_vaccinate_gov	639
sams	573
hyvee	552
ky_govstatus	551
costco	538
prepmod	505
ia_state	501
little_clinic	448
winn_dixie	431
tn_vaccinate_gov	428
albertson	390
sav_on	328
heb	311
weis	300
osco	288
meijer	256
stop_and_shop	253
il_juvare	240
wa_prepmod	237
vons	234
ct_gov	233
ct_covidvaccinefinder_gov	233
harris_teeter	229
giant_eagle	213
hart	207
vaccinefinder	205
smiths	184
jewel_osco	180
ri_arcgis	163
immunizenevada_org	162
fred_meyer	160
hannaford	156
ga_dph	153
giant_food	152
shop_rite	148
king_soopers	148
ak_arcgis	147
ak_clinic_list	147
ma_immunizations	138
acme	137
ct	129
price_chopper	129
duane_reade	126
frys_food_and_drug	123
tom_thumb	118
giant	115
sf_gov	110
food_city	108
kaiser_permanente	108
wegmans	103
thrifty_white	95
wyo_appt_portal	94
thrifty	87
_tag_provider	84
ingles	84
co_colorado_gov	82
cub_pharmacy	80
ralphs	77
ct_state	64
covidvaccinefinder_gov	60
pick_n_save	59
brookshire	58
dillons	54
genoa_healthcare	54
shaws	46
ny_northwell_health	43
raleys	41
marianos	40
pavilions	37
market_street	36
united_supermarket	36
city_market	33
al_jefferson	33
food_lion	32
randalls	32
qfc	32
albertsons_market	29
pharmaca	27
haggen	26
harps	24
ilvaccine_org	24
az_pinal_ph_vaccinelocations_gov	23
ca_metrolink	23
us_physicians_immediate	23
homeland	21
hartig	21
market_32	20
kaiser_health_plan	20
harmons	18
medicap	18
harveys	17
metro_market	12
carrs	11
bakers	11
star_market	10
wv_dhhr	9
dc_district	9
pay_less	9
gerbes	6
kta_super_stores	6
big_y	6
az_ph_pinal_clinics_gov	6
tx_memorialhermann	5
fresco_y_mas	5
jayc	5
amigos	3
drugco	2
_issue	2
market_bistro	1
pak_n_save	1

I'm going to start with these ones:

  • - cvs
  • - walgreens
  • - health_mart
  • - walmart
  • - rite_aid
  • - kroger
  • - safeway
  • - sams
  • - hyvee
  • - costco

I'm going to update all of the California "Rite-Aid Pharmacy" records to use "Rite-Aid" as their provider instead, then I'll rename the "Rite-Aid Pharmacy" provider in VIAL to "OBSOLETE Rite-Aid DO NOT USE"

Done that, now https://vial.calltheshots.us/api/searchLocations?provider=Rite-Aid&format=map&all=1 has 2228 locations.