IQSS / dataverse.harvard.edu

Custom code for dataverse.harvard.edu and an issue tracker for the IQSS Dataverse team's operational work, for better tracking on https://github.com/orgs/IQSS/projects/34

Start uploading production data to tape storage at NESE

landreev opened this issue · comments

We have the first batch of data to upload there: https://fly.cs.umb.edu/omama/
We will use it to resolve and iron out any technical kinks in the workflow, and to develop service policies for future use and for offering it to clients.

2024/03/21

I put a size 10 on it for now. I am actively working on this, so it probably needs to be marked as in progress.

Got the new (prod.) NESE end point set up and the new service account and the app client given the right permissions on their end. Everything is seemingly configured identically to how the DEMO tape was set up. Tried the first upload in prod. - it bombed. Trying to figure out what's going on and what is different from that demo configuration that is known to work.
Reached out to Jim and Victoria, in case they can help me diagnose it. But at least I have something to work with now. Will try to resolve it asap.

Got the globus app to work.
Can now upload data to the production tape. Everything is working like a charm when I'm doing it from my own Dataverse instance. With our actual prod. instance, something is failing in the very last step, when Dataverse just needs to register the file in its own database.
Working on figuring this part out. We are very close. 🥲

OK, we are in the business of uploading serious prod. data.
These are the first bundles of 2D images from the Omama collection:

[Screenshot: Screen Shot 2024-04-02 at 7:59:24 PM]

One remaining task (other than uploading the second OMAMA dataset's worth of data, which is pending local delivery) is to get the globus app to work properly in prod. for redirecting _down_load requests for NESE-stored objects.

The OMAMA dataset mentioned above: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/KXJCIU&version=DRAFT
Thinking of closing the issue since the tasks outlined above have been completed. Will be opening new issues for the next steps and phases of the effort. The most important one is developing a documentation guide for future customers of this big-data-to-tape storage service.