bmschmidt / pubmed-explorer

Scrollership through 20m pubmed abstracts.

Fix incorrectly positive search results.

bmschmidt opened this issue · comments

I'm getting a few hits for the string COVID before 2011. Investigate why.

Yes, that's strange. I checked through the PubMed web interface, searching for "covid" in the titles of papers published before 2019-12-31:

https://pubmed.ncbi.nlm.nih.gov/?term=%28covid%5BTitle%5D%29+AND+%28%28%221950%2F01%2F01%22%5BDate+-+Publication%5D+%3A+%222019%2F12%2F31%22%5BDate+-+Publication%5D%29%29&sort=

I get 51 hits, but all of them are actually displayed as published in 2020 or later, so I am not sure why the search returns them.

Yeah. This isn't shocking to me because I've switched to a whole new model here. If you've tried the search in the last 24 hours, you'll see it's instant now. That's because I moved off a server backend entirely and wrote some extremely low-level code that iterates across a Uint8 array representing the strings as UTF-8, to get around some challenges involving differences in string encoding between Arrow and JavaScript. It's fixable, but it will require some work.
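For readers unfamiliar with this kind of byte-level search, here is a minimal TypeScript sketch of the general idea. It is not the code used in pubmed-explorer. The `Utf8Column` layout (one shared byte buffer plus an offsets array, in the style of an Arrow string column) and the names `encodeColumn` and `searchColumn` are assumptions made for illustration. One thing such code has to guard against is a match that straddles the boundary between two adjacent strings, which would attach a hit to the wrong row; the thread does not say whether that was the cause of the false positives here.

```typescript
// Hypothetical layout, modeled on an Arrow-style string column:
// `data` holds every string back to back as UTF-8 bytes, and
// offsets[i]..offsets[i + 1] delimits the bytes of string i.
interface Utf8Column {
  data: Uint8Array;
  offsets: Int32Array;
}

// Helper for the example only: pack JavaScript strings into a column.
function encodeColumn(strings: string[]): Utf8Column {
  const encoder = new TextEncoder();
  const parts = strings.map((s) => encoder.encode(s));
  const total = parts.reduce((n, p) => n + p.length, 0);
  const data = new Uint8Array(total);
  const offsets = new Int32Array(strings.length + 1);
  let pos = 0;
  parts.forEach((p, i) => {
    data.set(p, pos);
    pos += p.length;
    offsets[i + 1] = pos;
  });
  return { data, offsets };
}

// Return the indices of rows whose bytes contain the query's bytes.
// The comparison is exact and case-sensitive, byte for byte.
function searchColumn(col: Utf8Column, query: string): number[] {
  const needle = new TextEncoder().encode(query);
  const hits: number[] = [];
  if (needle.length === 0) return hits;
  for (let row = 0; row + 1 < col.offsets.length; row++) {
    const start = col.offsets[row];
    const end = col.offsets[row + 1];
    // Stop early enough that a match can never run into the next row.
    const lastStart = end - needle.length;
    for (let i = start; i <= lastStart; i++) {
      let j = 0;
      while (j < needle.length && col.data[i + j] === needle[j]) j++;
      if (j === needle.length) {
        hits.push(row);
        break; // one hit per row is enough
      }
    }
  }
  return hits;
}
```

Comparing raw bytes avoids decoding every abstract into a JavaScript string (which is UTF-16 internally) before searching; the trade-off is that row boundaries and multi-byte characters have to be handled by hand.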

This no longer seems to happen with the search engine, so can this be closed?

Screenshot from 2023-04-19 13-49-41

Excellent.