ska-sa / katdal

Data access library for the MeerKAT radio telescope

Simple download does not work

gigjozsa opened this issue

The following script (tested on several public data sets, on different computers and with different Python versions) causes an error:

#! /usr/bin/env python
import katdal

file = '1629930087_sdp_l0.full.rdb'
d = katdal.open(file)
a = d.vis[0,0,0]

This is Python 3.6.9 on an Ubuntu box
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
katdal was installed from PyPI:

pip install katdal

Running the script results in the error shown below. I am not sure whether this is a problem on the server side, in katdal, or on my side (although I don't think so). Please help!

WARNING:katdal.dataset:Extending flux density model frequency range of 'J0408-6545' from 1410-8400 MHz to 855-8400 MHz
Traceback (most recent call last):
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 638, in get_chunk
    headers=headers, stream=True)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 594, in complete_request
    with self.request(method, url, chunk_name, **kwargs) as response:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 543, in request
    raise S3ObjectNotFound(msg)
katdal.chunkstore_s3.S3ObjectNotFound: Chunk '1629930087-sdp-l0/correlator_data/00000_00000_00000': Store responded with HTTP error 404 (Not Found) to request: GET http://archive-gw-1.kat.ac.za/1629930087-sdp-l0/correlator_data/00000_00000_00000.npy
Details of server response: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><BucketName>1629930087-sdp-l0</BucketName><RequestId>tx00000000000000df9d20a-006319b9e6-da08a2b-default</RequestId><HostId>da08a2b-default-default</HostId></Error>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./rfitest.py", line 6, in <module>
    a = d.vis[0,0,0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 558, in __getitem__
    return self.get([self], keep)[0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 591, in get
    da.store(kept, out, lock=False)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/array/core.py", line 1041, in store
    result.compute(**kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 283, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 565, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/threaded.py", line 84, in get
    **kwargs
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 487, in get_async
    raise_exception(exc, tb)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 317, in reraise
    raise exc
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 145, in __getitem__
    return self.getter(self.array_name, slices, self.dtype, **self.kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 325, in get_chunk_or_placeholder
    return self.get_chunk(array_name, slices, dtype)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 641, in get_chunk
    self._verify_bucket(url, err)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 625, in _verify_bucket
    raise StoreUnavailable(msg) from chunk_error
katdal.chunkstore.StoreUnavailable: S3 bucket http://archive-gw-1.kat.ac.za/1629930087-sdp-l0 is empty - your data is not currently accessible

Hi Josh,

The key bit of info is the very last line:

katdal.chunkstore.StoreUnavailable: S3 bucket http://archive-gw-1.kat.ac.za/1629930087-sdp-l0 is empty - your data is not currently accessible

This dataset is more than a year old. We only keep datasets on disk, where katdal can find them, for 200 days. After that, the datasets are moved to tape, and you have to ask the archive folks to put specific ones back on disk for you. This restaging process may take up to 30 days.

That is why your data is "not currently accessible".

If the data has already been restaged and it is still not accessible, please contact the archive folks to sort it out. It might be a bug on the archive side.
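
As a rough pre-check before opening an older dataset, the 200-day rule above can be sketched in a few lines of Python. This is only an estimate and not part of katdal: it assumes that the leading digits of the RDB filename (the capture block ID, here 1629930087) are a Unix timestamp of the start of the observation, and the helper names `capture_time` and `probably_on_tape` are made up for illustration. The archive itself remains the authority on whether a dataset is on disk.

```python
import os
from datetime import datetime, timedelta, timezone

# On-disk retention period quoted in the reply above.
RETENTION = timedelta(days=200)


def capture_time(rdb_path):
    """Return the capture start time encoded in an RDB filename.

    Assumption: the leading digits of the filename (the capture block ID)
    are a Unix timestamp in seconds.
    """
    capture_block_id = os.path.basename(rdb_path).split('_', 1)[0]
    return datetime.fromtimestamp(int(capture_block_id), tz=timezone.utc)


def probably_on_tape(rdb_path, now=None):
    """Guess whether the dataset has aged past the on-disk retention period."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - capture_time(rdb_path) > RETENTION


if __name__ == '__main__':
    name = '1629930087_sdp_l0.full.rdb'
    print('Captured at:', capture_time(name).isoformat())
    if probably_on_tape(name):
        print('Older than 200 days: it may need to be restaged from tape.')
```

If the check says the dataset is older than 200 days, a `StoreUnavailable` error like the one in the traceback is the expected outcome until the data has been restaged.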