ACTCollaboration / sync-nersc-scinet

Scripts to establish a nightly sync between NERSC and Scinet

Make localScript.sh that is called by cron

msyriac opened this issue · comments

Something weird is happening. syncScript.sh works correctly when run from Cori, but when it is run from my local computer, as in localScript.sh, the SSH authentication doesn't go through, presumably because the key forwarding isn't working correctly. I've tried to abstract out what's happening here:
http://unix.stackexchange.com/questions/338491/rsync-ssh-agent-forwarding-through-3-remote-systems-doesnt-work-with-single-com

Any idea what's going on @mhasself @amaurea ?

In that example, local is my machine, remote1 is Cori, remote2 is the Scinet login node, and remote3 is datamover1.

Are nerscUserName and nerscPath files on the local computer? What is the result if you put an echo in front of the whole ssh command? Are you running localScript.sh directly or through cron when it doesn't work?

Both nerscUserName and nerscPath are on the local computer. I've tried typing out the ssh command explicitly on the command line instead of using localScript.sh and the same error happens. I am running it locally, not yet using cron.

You can reproduce this exactly by running the example in the stackexchange question I posted, but replacing remote1 with cori.nersc.gov, remote2 with login.scinet.utoronto.ca and remote3 with datamover1.

I can't reproduce the problem, but I'm not using ssh agent forwarding to forward access to my key, and I think that's where your problem lies, based on your error message. Without ssh agent I can successfully run a script on cori that will rsync a file to scinet.

@amaurea This has now been fixed but I don't understand how.
db37a58

I noticed that not sending the rsync process to the background fixed it. So I sent it to the background and added a wait command. This fixed it even though only one process is currently submitted (syncList.txt has only one item), so I am puzzled.
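The background-then-wait pattern described above can be sketched in Python. This is only an analogue of the shell `&` and `wait` in the repo's scripts, not the code from the commit. `run_and_wait` is a hypothetical helper, and the child command is a placeholder for the real rsync invocation. What it shows is that the parent does not return until every background transfer has exited, so the session that launched it stays open for the whole transfer.

```python
import subprocess
import sys


def run_and_wait(commands):
    """Start every command in the background, then block until all exit.

    Returns the list of exit codes, in the same order as `commands`.
    """
    # Popen returns immediately, like appending `&` in a shell script.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Waiting here plays the role of the shell's `wait` builtin.
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Placeholder for the real rsync call: a child that sleeps briefly.
    placeholder = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    print(run_and_wait([placeholder]))
```

Without the wait step, the parent could exit while the children are still running, which would match the symptom of the connection closing before rsync finishes.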

Ahh that's it. I know you're right because when I run localScript.sh, the ssh connection stays open.

What I have learnt from this should have been obvious from the start: the SSH agent doesn't move anything to the remote machines. It needs to keep running on the local computer and communicating with the remote.

This means that the local computer has to maintain a connection for the duration of the rsync. So it's probably not a good idea to set this up on my laptop. And setting it up on Feynman would mean moving my private keys there, which I don't want to do.

Luckily, I have a home server set up and I'm sure it will be happy to knock on Cori to do these syncs every night or week or so. If I stop running it, anyone else who has an always-up system that they administer (and so would be happy to put their keys on) can set this repo up.

I wrote a Python daemon which is now running on a Raspberry Pi at home. I've set it to do nightly syncs at 3am EST. Closing.
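The daemon itself is not shown in this thread. As a rough sketch of how a nightly 3am trigger could work, assuming the host's local clock is set to the intended time zone, something like the following would do. `seconds_until_next_run` and `run_nightly` are hypothetical names, `job` stands in for whatever launches the sync, and daylight-saving shifts are ignored.

```python
import datetime as dt
import time


def seconds_until_next_run(now, hour=3):
    """Seconds from `now` until the next occurrence of `hour`:00 local time."""
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed (or is right now): use tomorrow's.
        target += dt.timedelta(days=1)
    return (target - now).total_seconds()


def run_nightly(job, hour=3):
    """Sleep until `hour`:00, run `job`, and repeat forever."""
    while True:
        time.sleep(seconds_until_next_run(dt.datetime.now(), hour))
        job()
```

Calling `run_nightly(start_sync)` with a hypothetical `start_sync` function would block forever, waking once a day. A cron entry on the same always-on machine would be an alternative to a long-running process.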