seqeralabs / nf-tower

Nextflow Tower system

Home Page: https://tower.nf

Feature Request: Support Remote Agent

apeltzer opened this issue · comments

Hi!

A potentially interesting feature would be the ability to use a "local executor", e.g. a plain machine with the required tools and software installed, on which Tower can run the workflow. In some cases, isolated workstations with e.g. proprietary tools are in use, and these are not easy to change or modify.

It would be great, however, to use these from Tower without having to set up a "single-machine SLURM", for example.

SSH credentials are already in Tower, and dependency handling could be done the same way as with other tools (e.g. -profile docker).

The executor could simply be a "Remote SSH Executor" that runs a job on the machine of choice. It could also be set up in Tower. It may be that this is more of a core Nextflow thing, i.e. a "Nextflow SSH Remote Executor" would be the best way forward 👍🏻

The need for a single-node executor is a recurrent request. I'm not sure we will allow this via SSH; however, we are discussing a similar capability via the new agent tool we are developing.

Thank you Paolo - would be great to see something like this available 👍🏻

Is there any news on the agent tool you were mentioning, Paolo? I have some test cases at hand and would be happy to give this a go 💯

Tower Agent is up and running, available here: https://github.com/seqeralabs/tower-agent

@apeltzer note that, even though Tower Agent may be used as a connection gateway in the future, right now on the Tower side you can only use it to submit jobs to an HPC scheduler (slurm, lsf...).

As a temporary workaround, if you want to use it on a workstation, you can fake some Slurm commands:

sinfo

#!/bin/bash
echo "slurm 16.05.3"

scancel

#!/bin/bash
# Stop the process whose PID was passed as the job ID
kill "$1"
echo "done"

squeue

#!/bin/bash
ps -Af | grep ".launcher.sh" | grep -v grep | awk '{print($2" R")}'

sbatch

#!/bin/bash
# Run the submitted script in the background and report its PID as the job ID
bash "$1" &>/dev/null &
echo "Submitted batch job $!"
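One possible way to put the four stubs in place is sketched below. It writes them to a private directory and prepends that directory to PATH. The directory name and the FAKE_SLURM_DIR variable are assumptions made for this example, not something Tower or Tower Agent requires, and the PATH change has to be visible to whatever process actually calls the Slurm commands (for example the shell that starts the agent).

```shell
#!/bin/bash
# Sketch only: FAKE_SLURM_DIR is a hypothetical location chosen for this example.
FAKE_SLURM_DIR="${FAKE_SLURM_DIR:-$HOME/fake-slurm/bin}"
mkdir -p "$FAKE_SLURM_DIR"

# sinfo: report a fixed Slurm version string
cat > "$FAKE_SLURM_DIR/sinfo" <<'EOF'
#!/bin/bash
echo "slurm 16.05.3"
EOF

# scancel: treat the job ID as a PID and kill it
cat > "$FAKE_SLURM_DIR/scancel" <<'EOF'
#!/bin/bash
kill "$1"
echo "done"
EOF

# squeue: list PIDs of running launcher scripts, each marked as running (R)
cat > "$FAKE_SLURM_DIR/squeue" <<'EOF'
#!/bin/bash
ps -Af | grep ".launcher.sh" | grep -v grep | awk '{print($2" R")}'
EOF

# sbatch: start the script in the background and print its PID as the job ID
cat > "$FAKE_SLURM_DIR/sbatch" <<'EOF'
#!/bin/bash
bash "$1" &>/dev/null &
echo "Submitted batch job $!"
EOF

chmod +x "$FAKE_SLURM_DIR/sinfo" "$FAKE_SLURM_DIR/scancel" \
         "$FAKE_SLURM_DIR/squeue" "$FAKE_SLURM_DIR/sbatch"

# Make the stubs take precedence over any real Slurm installation
export PATH="$FAKE_SLURM_DIR:$PATH"
```

After running this in the shell that launches the agent, calling sinfo should print the fake version string, which is a quick way to check that the stubs are the ones being picked up.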

Unfortunately, I cannot really set up the SLURM execution environment this way :-( I always get connection issues, even though the key is definitely there.

We could try to investigate the problem with the agent. If you have an error message or log, please post it here.

So I need to have a Tower Version that can work with the Tower Agent, the "fake Slurm" part above on the machine running the jobs and then should be able to run things, correct?

Might also mean that I need IT to update Tower first to get this running.

> So I need to have a Tower Version that can work with the Tower Agent, the "fake Slurm" part above on the machine running the jobs and then should be able to run things, correct?

Yes. In the meantime, you could test the setup using https://tower.nf.