Missing some top hits at higher maximum drift rates

texadactyl opened this issue 2 years ago · comments

Richard Elkins commented 2 years ago

Describe the bug

Some top hits are dropped by the tophitsearch function in find_doppler.py as the maximum drift rate increases.

Relevant BL files (.fil, .h5)

Voyager 1 HDF5.
http://blpd0.ssl.berkeley.edu/Voyager_data/

To Reproduce

Steps to reproduce the behavior:

turboSETI -M 54
---> 3 top hits SNR
turboSETI -M 55
---> only 2 top hits
turboSETI -M 56
---> only 1 top hit

Expected behavior

All 3 cases should produce the same 3 top hits.

Richard Elkins · Answer 1 · Wed Jan 19 2022 05:51:41 GMT+0800 (China Standard Time)

It seems that as the user-specified maximum drift rate (Hz/s) increases, the effective minimum SNR in the tophitsearch function of find_doppler.py also increases, regardless of the minimum SNR the user/operator originally specified.

E.g. Voyager 1 standard test HDF5 file from http://blpd0.ssl.berkeley.edu/Voyager_data/
Running turboSETI successively:

min_snr=25, max_drift=4 to 54
=============================
find_doppler.0  INFO     Top hit found! SNR 30.612333, Drift Rate -0.392226, index 739933
find_doppler.0  INFO     Top hit found! SNR 245.709610, Drift Rate -0.373093, index 747929
find_doppler.0  INFO     Top hit found! SNR 31.220858, Drift Rate -0.392226, index 756037

min_snr=25, max_drift=55
========================
find_doppler.0  INFO     Top hit found! SNR 245.709610, Drift Rate -0.373093, index 747929
find_doppler.0  INFO     Top hit found! SNR 31.220858, Drift Rate -0.392226, index 756037

min_snr=25, max_drift=56, 100, 200
==================================
find_doppler.0  INFO     Top hit found! SNR 245.709610, Drift Rate -0.373093, index 747929

Unfortunately, turbo_seti threw out the genuine sideband hits on Voyager 1 and kept the noise spike in the middle.

It looks to me that as the max drift rate increases, so does the minimum SNR in the tophitsearch function, regardless of what the user/operator specified originally.

@telegraphic
Any insight? Did I miss something? Defer this to hyperseti? (-:

Richard Elkins · Answer 2 · Mon Jan 24 2022 04:39:47 GMT+0800 (China Standard Time)

Copied from the SETI BL Slack:

@lacker :

When max drift gets larger, that lbound-ubound interval gets larger, and to get into the output, you have to be the best hit in that lbound-ubound interval.
If you think about it, if your goal is "don't let a single pixel cause multiple outputs", then this makes total sense. Higher drift rates mean a single pixel can show up in a larger range of frequencies. So if you want to avoid duplicates, with higher drift rates, you have to be stricter about throwing out hits near each other.
I think a different approach entirely is going to work better when the drift is more than one frequency bin per time bin.

@telegraphic : @lacker has indeed identified the issue, and it is indeed a flaw with turboseti. At very high drift rates turboseti will also lose S/N by not frequency scrunching (averaging across channels). Combined I'd say these are the two biggest issues with turboseti.
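
To make @lacker's explanation concrete, here is a rough Python sketch of a "best hit in the window" filter. This is not the actual tophitsearch code. The function name top_hits, the window half-width of obs_length * max_drift / 2 index bins, and the observation length of about 292 s are all assumptions for illustration; only the indices and SNRs come from the logs above.

```python
import numpy as np

def top_hits(indices, snrs, max_drift, obs_length, min_snr=25.0):
    """Keep a hit only if no stronger hit lies within its lbound-ubound window.

    Hypothetical model: the window half-width (in index bins) is assumed
    to grow linearly with max_drift, as obs_length * max_drift / 2.
    """
    indices = np.asarray(indices)
    snrs = np.asarray(snrs)
    half_width = obs_length * max_drift / 2.0  # assumed window form
    kept = []
    for idx, snr in zip(indices, snrs):
        if snr < min_snr:
            continue  # below the user-specified threshold
        in_window = np.abs(indices - idx) <= half_width
        if (snrs[in_window] > snr).any():
            continue  # a stronger neighbour wins this window
        kept.append(int(idx))
    return kept

# Indices and SNRs taken from the log output earlier in this thread.
INDICES = [739933, 747929, 756037]
SNRS = [30.612333, 245.709610, 31.220858]
OBS_LENGTH = 292.06  # seconds; assumed value, not stated in this thread

for max_drift in (54, 55, 56):
    print(max_drift, top_hits(INDICES, SNRS, max_drift, OBS_LENGTH))
```

Under these assumptions the sketch keeps 3, 2 and 1 hits for max drift 54, 55 and 56. The two sidebands sit 7996 and 8108 bins from the central hit, so each one is dropped as soon as the window half-width grows past its distance, even though both are well above min_snr=25.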
