IDR / ome-ngff-samples

A catalog of all public OME-NGFF representative samples derived from IDR data

Home Page: https://idr.github.io/ome-ngff-samples

How to download files

jwindhager opened this issue

Hi

I'd like to access https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0052A/5514375.zarr, but I only get a NoSuchKey error.

How can I access this data for local viewing/processing?

Thanks

Hi @jwindhager,

I think the confusion comes from how hierarchical Zarr data is represented in a file system versus object storage.
On a local file system, the top-level container of a Zarr dataset, 5514375.zarr in this case, is a folder. Object storage has no concept of hierarchy (folders or directories), so no key exists for the /zarr/v0.4/idr0052A/5514375.zarr/ path in the idr bucket. Instead, all the objects belonging to this dataset are stored under that prefix.
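
To make the key/prefix distinction concrete, here is a small local sketch (no network access involved) that mimics a flat key space with a handful of keys from this dataset. The has_key and list_prefix helpers are hypothetical names made up for this illustration; they are not part of any S3 tool.

```shell
# A few object keys as they exist in the idr bucket: flat strings, no folders.
keys='zarr/v0.4/idr0052A/5514375.zarr/.zattrs
zarr/v0.4/idr0052A/5514375.zarr/.zgroup
zarr/v0.4/idr0052A/5514375.zarr/0/.zarray
zarr/v0.4/idr0052A/5514375.zarr/0/0/0/1/0/0'

# Hypothetical helper: succeeds only if a key with exactly this name exists.
has_key() {
  printf '%s\n' "$keys" | grep -Fxq -- "$1"
}

# Hypothetical helper: print every key that starts with the given prefix.
list_prefix() {
  printf '%s\n' "$keys" | awk -v p="$1" 'index($0, p) == 1'
}

# The container path itself is not a key, so an exact lookup fails ...
has_key 'zarr/v0.4/idr0052A/5514375.zarr' || echo 'NoSuchKey'

# ... but it is a valid prefix shared by all objects of the dataset.
list_prefix 'zarr/v0.4/idr0052A/5514375.zarr/'
```

The exact lookup of the container path fails, much like the NoSuchKey error in the question, while the prefix listing returns every object of the dataset.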

To access or download the data locally, you might want to use tools/APIs that deal with object storage. For instance, using awscli, it is possible to list the keys under the path with s3 ls:

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 ls s3://idr/zarr/v0.4/idr0052A/5514375.zarr/
                           PRE 0/
                           PRE 1/
                           PRE 2/
                           PRE labels/
2022-06-21 09:59:47       3874 .zattrs
2022-06-21 09:59:47         24 .zgroup

or to download the data locally with aws s3 cp --recursive:

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 cp --recursive s3://idr/zarr/v0.4/idr0052A/5514375.zarr/ .
download: s3://idr/zarr/v0.4/idr0052A/5514375.zarr/.zgroup to ./.zgroup
download: s3://idr/zarr/v0.4/idr0052A/5514375.zarr/0/0/0/1/0/0 to 0/0/0/1/0/0
download: s3://idr/zarr/v0.4/idr0052A/5514375.zarr/.zattrs to ./.zattrs
download: s3://idr/zarr/v0.4/idr0052A/5514375.zarr/0/.zarray to 0/.zarray
...
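
One thing to watch with the cp command above: with . as the destination, the contents of 5514375.zarr land directly in the current directory. Below is a minimal sketch that keeps the container folder name instead, wrapped in a hypothetical download_zarr function so that nothing is transferred until it is called. The --no-sign-request flag (anonymous requests) and the use of s3 sync in place of cp --recursive are additions not present in the commands above, on the assumption that the bucket allows anonymous reads; drop the flag if you have credentials configured.

```shell
ENDPOINT=https://uk1s3.embassy.ebi.ac.uk
SRC=s3://idr/zarr/v0.4/idr0052A/5514375.zarr/

# Reuse the last path component as the local folder name.
DEST="./$(basename "$SRC")"

# Hypothetical wrapper; requires the AWS CLI and network access.
# --no-sign-request sends anonymous requests (assumes public read access).
download_zarr() {
  aws --endpoint-url "$ENDPOINT" --no-sign-request s3 sync "$SRC" "$DEST"
}

echo "would download $SRC to $DEST"
```

Calling download_zarr starts the transfer. Unlike cp, sync skips files that are already present and unchanged locally, which helps when resuming an interrupted download.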

Since these are OME-NGFF datasets, an alternative is to use the ome_zarr info and ome_zarr download utilities that come with the ome-zarr Python library.
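
As a rough sketch of that alternative, assuming ome-zarr has been installed from PyPI (pip install ome-zarr) and exposes the ome_zarr command: the show_info and fetch_image wrappers are hypothetical names, used only so that nothing touches the network until they are called. The --output option of the download subcommand is assumed here, so it is worth checking ome_zarr download --help first.

```shell
# HTTPS URL of the dataset, same as in the question.
URL=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0052A/5514375.zarr

# Hypothetical wrapper: print a summary of the remote dataset.
show_info() {
  ome_zarr info "$URL"
}

# Hypothetical wrapper: download the dataset into the given directory
# (defaults to the current directory when no argument is passed).
fetch_image() {
  ome_zarr download "$URL" --output "${1:-.}"
}
```

For example, show_info inspects the dataset in place, and fetch_image ./data would download it into a data folder.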

Hi @sbesson, thank you so much for the extensive answer and explanations. Indeed, the missing hierarchy concept in S3 threw me off. I managed to download the data now and look forward to playing around with OME-NGFF!