Update location providers based on cvs: / walgreens: / etc concordance identifiers
simonw opened this issue · comments
Part of #705.
Key idea here is to ensure that all of our locations with a cvs:
concordance are attached to the CVS provider, ditto for walgreens:
and various other common concordance authorities.
This will allow us to pick up the appointments_url
from those providers, if it is set, when we serve locations to our new API and to the Mapbox export used on www.VaccinateTheStates.com
The https://vial.calltheshots.us/api/docs#post-apiupdatelocations /api/updateLocations
API can be used for this - it accepts provider_name
and provider_type
which do this:
Lines 157 to 161 in 9cb542e
So I can run a script in a Jupyter notebook that retrieves the locations that need to be updated using https://vial.calltheshots.us/api/searchLocations?authority=cvs - maybe even add an extra search parameter for ?provider_null=1
to help with that.
Updated documentation: https://vial-staging.calltheshots.us/api/docs#get-apisearchlocations
Demo: https://vial-staging.calltheshots.us/api/searchLocations?size=0&provider_null=1
{
"results": [],
"total": 17917
}
https://vial-staging.calltheshots.us/api/searchLocations?size=0&provider_null=1&authority=cvs
{
"results": [],
"total": 109
}
Being able to use ?format=ids
to return just the IDs of matching results would help with this (and all sorts of other things too).
This means I can know get the ID of every single location that is tagged with the cvs:
authority but does not currently have a provider using:
Here's a map of all locations attached to the provider called CVS: https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1 - 3832 locations
All locations with a cvs:
concordance identifier: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1 - 2798 locations
All locations with a cvs:
authority that are not attached to any providers: https://vial-staging.calltheshots.us/api/searchLocations?authority=cvs&format=map&all=1&provider_null=1 - 109 locations
Interestingly this suggests that, at least on staging, we have CVS locations that are NOT tagged with a cvs:
concordance identifier.
https://vial-staging.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs shows a map of every location attached to the CVS provider that does NOT have a cvs:
concordance. 1322 locations.
Deploying this stuff to production to answer these questions there.
In production we only have 72 locations with provider CVS that don't have a cvs:
concordance, all of which are in California: https://vial.calltheshots.us/api/searchLocations?provider=CVS&format=map&all=1&exclude.authority=cvs
Spotted something weird: https://vial.calltheshots.us/api/searchLocations?authority=cvs returns 17707 results. But that seems like a LOT of CVSs.
https://www.google.com/search?client=firefox-b-1-d&q=how+many+branches+of+CVS suggests there are 9,967 branches in the USA.
On a hunch, I pulled and de-duped the full list of IDs using this:
curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' -H 'Authorization: Bearer 30:...' \
| jq | sort | uniq | wc -l
Which returned 10,883 - so there's a bug in the API where multiple results can be returned.
This SQL query shows locations with multiple cvs:
concordances:
select location.public_id, count(*), array_agg(authority || ':' || identifier)
from concordance_location join concordance_identifier on concordance_identifier.id = concordanceidentifier_id
join location on location_id = location.id
where authority = 'cvs' group by location.public_id
having count(*) > 1
order by count(*) desc
There are a LOT - grouping by their duplicate count gives:
count n
2 4193
3 358
4 259
5 189
6 64
7 9
8 2
That fixed it:
curl 'https://vial.calltheshots.us/api/searchLocations?authority=cvs&format=ids&all=1' \
-H 'Authorization: Bearer 30:...' \
| jq | sort | uniq | wc -l
10883
https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 returns 7112 results - I'm going to assign them all to the CVS provider using the API.
Running this now:
import httpx
from tqdm.notebook import tqdm
api_key = "30:..."
url = "https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&all=1&format=ids"
ids = httpx.get(url, headers={"Authorization": "Bearer {}".format(api_key)}).json()
def make_chunks(sequence, n):
for i in range(0, len(sequence), n):
yield sequence[i:i + n]
chunks = list(make_chunks(ids, 100))
for chunk in tqdm(chunks):
httpx.post("https://vial.calltheshots.us/api/updateLocations", timeout=20, json={
"update": {
id: {
"provider_type": "Pharmacy",
"provider_name": "CVS",
} for id in chunk
},
"revision_comment": "Issue #707"
}, headers={"Authorization": "Bearer {}".format(api_key)})
Hitting https://vial.calltheshots.us/api/searchLocations?authority=cvs&exclude.provider=CVS&size=0 while it is running shows the number reducing as it works.
OK, what other authorities should I do?
authority n
placekey 89932
vaccinefinder_org 55414
us_carbon_health 46036
vaccinespotter_org 38904
getmyvax_org 32587
us_giscorps_vaccine_providers 23034
cvs 13934
walgreens 9015
google_places 8592
il_sfsites 8031
al_arcgis 7189
health_mart 4960
walmart 4679
tx_arcgis 4304
vtrcks 4284
rite_aid 4079
kroger 2713
wa_state 2526
publix 2377
fl_state 2303
sc_arcgis 2257
nc_myspot_gov 2041
md_arcgis 1731
mo_arcgis 1701
safeway 1660
wi_arcgis_map 1206
az_arcgis 1173
in_arcgis 1140
arcgis 1121
vaccinate_nj 1088
pa_arcgis 923
vaxfinder_gov 779
mn_gov 733
nyc_arcgis 707
la_tableau 703
ca_los_angeles_gov 694
ok_vaccinate_gov 639
sams 573
hyvee 552
ky_govstatus 551
costco 538
prepmod 505
ia_state 501
little_clinic 448
winn_dixie 431
tn_vaccinate_gov 428
albertson 390
sav_on 328
heb 311
weis 300
osco 288
meijer 256
stop_and_shop 253
il_juvare 240
wa_prepmod 237
vons 234
ct_gov 233
ct_covidvaccinefinder_gov 233
harris_teeter 229
giant_eagle 213
hart 207
vaccinefinder 205
smiths 184
jewel_osco 180
ri_arcgis 163
immunizenevada_org 162
fred_meyer 160
hannaford 156
ga_dph 153
giant_food 152
shop_rite 148
king_soopers 148
ak_arcgis 147
ak_clinic_list 147
ma_immunizations 138
acme 137
ct 129
price_chopper 129
duane_reade 126
frys_food_and_drug 123
tom_thumb 118
giant 115
sf_gov 110
food_city 108
kaiser_permanente 108
wegmans 103
thrifty_white 95
wyo_appt_portal 94
thrifty 87
_tag_provider 84
ingles 84
co_colorado_gov 82
cub_pharmacy 80
ralphs 77
ct_state 64
covidvaccinefinder_gov 60
pick_n_save 59
brookshire 58
dillons 54
genoa_healthcare 54
shaws 46
ny_northwell_health 43
raleys 41
marianos 40
pavilions 37
market_street 36
united_supermarket 36
city_market 33
al_jefferson 33
food_lion 32
randalls 32
qfc 32
albertsons_market 29
pharmaca 27
haggen 26
harps 24
ilvaccine_org 24
az_pinal_ph_vaccinelocations_gov 23
ca_metrolink 23
us_physicians_immediate 23
homeland 21
hartig 21
market_32 20
kaiser_health_plan 20
harmons 18
medicap 18
harveys 17
metro_market 12
carrs 11
bakers 11
star_market 10
wv_dhhr 9
dc_district 9
pay_less 9
gerbes 6
kta_super_stores 6
big_y 6
az_ph_pinal_clinics_gov 6
tx_memorialhermann 5
fresco_y_mas 5
jayc 5
amigos 3
drugco 2
_issue 2
market_bistro 1
pak_n_save 1
I'm going to start with these ones:
- - cvs
- - walgreens
- - health_mart
- - walmart
- - rite_aid
- - kroger
- - safeway
- - sams
- - hyvee
- - costco
Health Mart is a chain of 5,000 independent member pharmacies: https://en.wikipedia.org/wiki/Health_Mart - https://vial.calltheshots.us/api/searchLocations?authority=health_mart&format=map&all=1 looks about right to me.
It's here in VIAL: https://vial.calltheshots.us/admin/core/provider/98/change/ - I'm going to add the URLs to it.
Rite-Aid is in VIAL twice: https://vial.calltheshots.us/admin/core/provider/119/change/?_changelist_filters=q%3Drite as "Rite-Aid" and https://vial.calltheshots.us/admin/core/provider/2/change/?_changelist_filters=q%3Drite as "Rite-Aid Pharmacy"
https://vial.calltheshots.us/api/searchLocations?provider=Rite-Aid&format=map&all=1 shows 1689
https://vial.calltheshots.us/api/searchLocations?provider=Rite-Aid+Pharmacy&format=map&all=1 shows 539, all in California
I'm going to update all of the California "Rite-Aid Pharmacy" records to use "Rite-Aid" as their provider instead, then I'll rename the "Rite-Aid Pharmacy" provider in VIAL to "OBSOLETE Rite-Aid DO NOT USE"
Done that, now https://vial.calltheshots.us/api/searchLocations?provider=Rite-Aid&format=map&all=1 has 2228 locations.
Hy-Vee was missing so I added it - it's a super-market chain. https://vial.calltheshots.us/admin/core/provider/127/change/
https://vial.calltheshots.us/api/searchLocations?authority=hyvee&exclude.provider=Hy-Vee&size=0