thisisparker / ftd

:skull: Scripts for FOIA The Dead, a morbid transparency project

Home Page: https://foiathedead.org


Encourage suggestions for desired news sources

brainwane opened this issue

If there are particular news sources you'd especially like the service to check, could you add that list of desired sources to a section of the README or a TODO comment in the obit-retrieving script? That'd help others contribute. I am thinking in particular about West Coast or foreign sources.

Thanks!

Hi @brainwane! That's a good question and I'm going to provide too long of an answer. Apologies :)

The reason it started with NYT is that they have an API and can offer obituaries separately. An unintended but ultimately very helpful bonus is that they also use a very standardized headline format, which lets me extract the name automatically. New sources with that combination of features would be the highest priority for my personal implementation, but of course none of these are requirements.
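To make the headline point concrete, here is a minimal sketch of pulling a name out of a standardized obituary headline. It assumes the headline has the shape "Name, description, Dies at age" and that the name is everything before the first comma; the function name `extract_name` and the sample headlines are hypothetical, not taken from the project's actual script.

```python
import re

# Assumption: obituary headlines mention death with one of these phrases.
_DEATH_PHRASE = re.compile(r"\b(dies|is dead|dead)\b", re.IGNORECASE)


def extract_name(headline):
    """Return the text before the first comma if the headline looks
    like an obituary, otherwise None.

    Hypothetical helper: assumes "<Name>, <description>, Dies at <age>".
    """
    if "," not in headline or not _DEATH_PHRASE.search(headline):
        return None
    name = headline.split(",", 1)[0].strip()
    return name or None


if __name__ == "__main__":
    # Made-up headlines for illustration only.
    print(extract_name("Jane Example, Pioneering Cartographer, Dies at 91"))
    print(extract_name("City Council Approves New Budget"))
```

A source whose headlines don't follow a pattern like this would need its own extraction rule, which is why the format matters when prioritizing new sources.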

(There's also a minor issue with removing duplicates, which would require a little bit of fuzzy matching on names. NYT often includes titles of nobility or religious leadership in headlines, and those make it into my database, so I'd have to make sure that I only sent one request for, e.g., King Bhumibol Adulyadej.)
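One way the duplicate check could work, sketched with only the standard library: strip a leading title, lowercase the name, and compare it to known names with `difflib.SequenceMatcher`. The title list, the 0.9 threshold, and the function names are all assumptions for illustration, not what the project does.

```python
from difflib import SequenceMatcher

# Assumption: a small, hand-picked list of titles that may prefix a name.
TITLES = {"king", "queen", "prince", "princess", "sir", "dame",
          "lord", "lady", "rev", "bishop", "cardinal", "rabbi", "imam", "dr"}


def normalize(name):
    """Lowercase the name, drop periods, and strip any leading titles."""
    words = name.lower().replace(".", "").split()
    while words and words[0] in TITLES:
        words.pop(0)
    return " ".join(words)


def is_duplicate(name, known_names, threshold=0.9):
    """Return True if the name closely matches any already-known name.

    The threshold is an assumed starting point and would need tuning.
    """
    candidate = normalize(name)
    return any(
        SequenceMatcher(None, candidate, normalize(known)).ratio() >= threshold
        for known in known_names
    )


if __name__ == "__main__":
    already_requested = ["Bhumibol Adulyadej"]
    print(is_duplicate("King Bhumibol Adulyadej", already_requested))
    print(is_duplicate("Jane Example", already_requested))
```

Matching on normalized names also tolerates small transliteration differences between sources, at the cost of occasionally merging two different people with very similar names, so a borderline match might be better flagged for review than dropped silently.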

I will probably have to rewrite the obit-fetching script from scratch now that (a) I know much, much, much more about what I'm doing with Python than I did when I started this and (b) the FBI has stopped accepting emailed FOIA requests. So I'll likely be scripting against their form, which makes scraping on the input side less intimidating.

Which is all to say that yes, a list of desired sources would be good, especially after I've done some of that rewrite. It seems to me, as a relative beginner, that a good place to do that would be in separate GitHub issues (where discussion can happen and which people can reference in PRs and the like). I don't want to reinvent the wheel, though, so if there's a reason that's not a good idea, I'm all ears.

Thanks for the reply, @thisisparker, and congrats on the rewrite and relaunch!

I suggest that we note, in the README, that you're open to GitHub issues for suggestions of new sites, and that it's way easier to add a site if their API makes it easy to grab decedents' names. And I've made two concrete suggestions in #31 and #32. If you really wanted to get fancy you could add a label for obit source suggestion issues!

commented

Hey @thisisparker! Wanted to get a sense for your appetite for continuing to maintain this. I created a fork at https://github.com/brandongalbraith/foiathedead because I intend to script it against https://en.wikipedia.org/wiki/Deaths_in_2023 and similar Wikipedia endpoints. Let me know!