Doppler search dies when using GPU (string format issue)
palumbom opened this issue
Describe the bug
It looks like an unsupported format string causes FindDoppler.search() to die when using the GPU implementation. The full error message reads:
Traceback (most recent call last):
  File "tSETI_pipeline.py", line 483, in <module>
    main()
  File "tSETI_pipeline.py", line 405, in main
    datfile = run_turbo_seti(h5_file,
  File "tSETI_pipeline.py", line 288, in run_turbo_seti
    fdop.search()
  File "/gpfs/group/jtw13/default/gpu_anaconda/lib/python3.8/site-packages/turbo_seti/find_doppler/find_doppler.py", line 204, in search
    search_coarse_channel(dl, self, dataloader=sched, filewriter=filewriter, logwriter=logwriter)
  File "/gpfs/group/jtw13/default/gpu_anaconda/lib/python3.8/site-packages/turbo_seti/find_doppler/find_doppler.py", line 485, in search_coarse_channel
    filewriter = tophitsearch(fd, tree_findoppler_original, max_val, tsteps, data_obj.header, tdwidth,
  File "/gpfs/group/jtw13/default/gpu_anaconda/lib/python3.8/site-packages/turbo_seti/find_doppler/find_doppler.py", line 671, in tophitsearch
    info_str = "Top hit found! SNR {:f}, Drift Rate {:f}, index {}" \
TypeError: unsupported format string passed to cupy.core.core.ndarray.__format__
I haven't done super deep digging yet, but it's possible that this is the result of the cupy version I'm using (see below). Unfortunately, it's not easy to simply update cupy and see if this fixes the issue, since I would need to have IT install a newer CUDA toolkit for me. Happy to provide additional details, etc. if anything else is needed.
Relevant BL files (.fil, .h5)
- Specific file: spliced_blc00010203040506o7o0111213141516o7o021222324252627_guppi_59163_09490_ON_X3_0030.rawspec.0000.fil
- Original .fil file at: http://blpd3.ssl.berkeley.edu/AGBT20B_999_34/
Steps to reproduce the behavior:
- Used blimpy.fil2h5.make_h5_file to convert .fil -> .h5
- Ran FindDoppler on the .h5 with gpu_backend=True
- Ran fdop.search()
- See error
Setup
- Python version: 3.8.5
- turbo_seti version 2.0.13
- blimpy version 2.0.2
- cupy version 8.5.0 with CUDA toolkit version 10.2
@luigifcruz Can you please investigate this? I still have no access to NVIDIA hardware.
@texadactyl Sure, I'll take a look.
@luigifcruz blc00010203040506o7o0111213141516o7o021222324252627_guppi_59163_09490_ON_X3_0030.rawspec.0000.fil is YUGE - 81 GB. Do you have access to the data centre servers with GPU hardware & software?
Did some digging. Based on cupy/cupy#2281 it looks like changing
info_str = "Top hit found! SNR {:f}, Drift Rate {:f}, index {}".format(maxsnr[i], drate, i)
at line 674 of find_doppler.py to
info_str = "Top hit found! SNR {:f}, Drift Rate {:f}, index {}".format(maxsnr[i], drate.item(), i)
may fix the issue. Testing it now.
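For anyone without GPU hardware, here is a rough sketch of why .item() should help. It does not use cupy at all: FakeDeviceScalar is a hypothetical stand-in for a 0-d array object that does not accept a format spec, and as_python_scalar and format_hit are made-up helper names, not turbo_seti functions. The assumption is that the cupy object behaves like any Python object without its own __format__, which raises this TypeError for a non-empty spec such as {:f}.

```python
class FakeDeviceScalar:
    """Hypothetical stand-in for a 0-d GPU array (NOT a cupy class).

    It defines .item() but no __format__, so formatting it with a
    non-empty spec falls back to object.__format__, which raises TypeError.
    """

    def __init__(self, value):
        self._value = value

    def item(self):
        # Return the wrapped value as a plain Python scalar.
        return self._value


def as_python_scalar(x):
    """Return x.item() if x has a callable .item(), otherwise x unchanged."""
    item = getattr(x, "item", None)
    return item() if callable(item) else x


def format_hit(snr, drate, index):
    """Build the log message, converting array-like scalars first."""
    return "Top hit found! SNR {:f}, Drift Rate {:f}, index {}".format(
        as_python_scalar(snr), as_python_scalar(drate), index
    )


drate = FakeDeviceScalar(-0.25)

# Formatting the stand-in directly fails the same way as in the traceback.
try:
    "{:f}".format(drate)
    raised = False
except TypeError:
    raised = True

# Converting to a Python scalar first makes the same format string work.
message = format_hit(30.5, drate, 7)
print(raised, message)
```

Plain Python floats pass through as_python_scalar untouched, so the same call should be safe on the CPU path as well.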
I made the same change at home and it is kosher for non-gpu cases.
Actually running it on that filterbank file now (with GPU enabled) to see how things go, but it may take a while. As you said the file is YUGE :)
You can test with smaller files in the future:
http://blpd0.ssl.berkeley.edu/Voyager_data/ (smallest)
http://blpd0.ssl.berkeley.edu/parkes_testing/
http://seti.berkeley.edu/opendata
Good to know, thanks for these! Also, FindDoppler.search() ran smoothly for both the CPU and GPU versions on my end. I'm happy to submit a pull request, or I can leave it to you to change if you would prefer.