Searching on a year "1938" doesn't return the expected result
laurensorensen opened this issue · comments
When I search for 1938 in the search bar with All Fields selected, only two results come up: one from 1946 and one whose identifier is H-1938.
URL to the search:
https://virtualtribunals-stage.stanford.edu/nuremberg?search_field=all_fields&q=1938
I would expect the one and only record from 1938 to show up in this search.
URL to the 1938 record (the year falls within a date range; not sure if that is relevant):
https://virtualtribunals-stage.stanford.edu/catalog/mt839rq8746aspace_ae47fbe3b5a20b92d0f0d5947ea70d67
I'm not sure I understand this. One has the resource identifier "H-5260_1938A_1" and the other has the resource identifier "H-1938". So both of those should show up. Are you suggesting that H-1594 should also show up since it has a date range that spans 1936-45?
If we wanted to satisfy this, we'd first have to recognize that the query "1938" is a date or part of one. Would we also support "1938-10" (Oct 1938) and "1938-10-23"? What about other formats (e.g. "Oct 1938", "October 1938", "The 1930s", "193?")? I think this will be very challenging.
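To make the first step concrete, here is a minimal sketch of recognizing only the strict numeric formats mentioned above. `DATE_QUERY` and `date_granularity` are hypothetical names, not anything that exists in vt-arclight, and the pattern deliberately rejects free-form inputs such as "Oct 1938" or "193?". It also does not check that the month or day is a valid calendar value.

```ruby
# Hypothetical pattern: matches "YYYY", "YYYY-MM", or "YYYY-MM-DD" only.
DATE_QUERY = /\A(?<year>\d{4})(?:-(?<month>\d{2})(?:-(?<day>\d{2}))?)?\z/

# Hypothetical helper: report how precise a date-like query is,
# or nil when the query does not look like one of the strict formats.
def date_granularity(query)
  match = DATE_QUERY.match(query.strip)
  return nil unless match
  return :day if match[:day]
  return :month if match[:month]

  :year
end
```

Under this sketch an identifier such as "H-1938" is not treated as a date, so it would still go through ordinary text search.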
It is totally fine that H-1938 and the other record containing that text as a string show up; it's just bad that searching on the date 1938 doesn't return anything even though there is a record with a date range that includes 1938. I thought dates could be indexed for searching? Otherwise I wouldn't, as a user, be searching "all fields", since date is a field.
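One way to picture how a range could match a single-year query: expand the range into every year it covers at index time, so that "1938" is literally one of the indexed values. The sketch below is only an illustration under that assumption. `years_in_range` is a hypothetical helper, and it assumes the range is written as two years separated by a hyphen, with the end year possibly abbreviated as in "1936-45".

```ruby
# Hypothetical helper: list every year covered by a year-range string.
def years_in_range(range)
  first, last = range.split("-", 2).map(&:strip)
  start_year = Integer(first, 10)
  return [start_year] if last.nil? || last.empty?

  # Expand an abbreviated end year such as "45" using the start year's leading digits.
  last = first[0, first.length - last.length] + last if last.length < first.length
  (start_year..Integer(last, 10)).to_a
end

# The range from this thread covers the year being searched for.
years_in_range("1936-45").include?(1938)
```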
If it's a lot of work and out of scope for this work cycle, that's fine. I just think that in the future we would like users to be able to search by a date in YYYY, YYYY-MM, or YYYY-MM-DD format and get back any record with that date associated.
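As a rough illustration of what "a record with that date associated" could mean, the sketch below turns a YYYY, YYYY-MM, or YYYY-MM-DD query into the span of days it covers and checks whether that span overlaps a record's date range. `query_to_date_range` and `overlaps?` are hypothetical names, the input is assumed to be well formed, and treating the 1936-45 record as running from 1 January 1936 to 31 December 1945 is an assumption for the example.

```ruby
require "date"

# Hypothetical helper: first and last calendar day covered by the query.
def query_to_date_range(query)
  year, month, day = query.split("-").map { |part| Integer(part, 10) }
  first = Date.new(year, month || 1, day || 1)
  last = if day
           first
         elsif month
           Date.new(year, month, -1) # -1 selects the last day of the month
         else
           Date.new(year, 12, 31)
         end
  first..last
end

# Hypothetical helper: true when the query's span and the record's span share a day.
def overlaps?(query, record_start, record_end)
  range = query_to_date_range(query)
  range.begin <= record_end && record_start <= range.end
end

record_start = Date.new(1936, 1, 1)   # assumed start of the 1936-45 record
record_end = Date.new(1945, 12, 31)   # assumed end of the 1936-45 record
overlaps?("1938", record_start, record_end)
```

The same comparison could in principle be pushed down into the search index rather than done in Ruby; this only shows the matching rule.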
It looks like all the date parts are indexed ( https://github.com/sul-dlss/vt-arclight/blob/main/lib/traject/vt_component_config.rb#L44-L73 ), so maybe we just need to configure them for search ( https://github.com/sul-dlss/vt-arclight/blob/main/solr/conf/solrconfig.xml#L89-L97 ).