Giters
facebookincubator/submitit
Python 3.8+ toolbox for submitting jobs to Slurm
Stargazers: 1087
Watchers: 21
Issues: 111
Forks: 110
facebookincubator/submitit Issues
Can I use torchrun with submitit?
Updated
15 days ago
Turn off Signal Handling
Updated
20 days ago
Comments count
2
Using `RsyncSnapshot` with an editable package install
Closed
24 days ago
Comments count
2
Be tolerant of sacct errors?
Closed
a year ago
Comments count
2
Submitit jobs die with no error on cluster with SLURM 19.05
Updated
2 months ago
Unexpected behavior of memory specification between `AutoExecutor` and `SlurmExecutor`
Updated
2 months ago
Too many sacct requests for batched tasks
Updated
3 months ago
Failed to launch: Invalid wckey specification
Updated
3 months ago
When 'submitit' meets 'mpirun', there is a very strange bug.
Updated
4 months ago
Improving performance with NVidia GPU affinity?
Updated
4 months ago
Consider supporting slurm rest api
Updated
4 months ago
Comments count
2
No user code logging output is shown in logs
Closed
a year ago
Comments count
2
`tasks_per_node=1` does not keep the number of tasks to 1 for the `LocalExecutor`
Updated
5 months ago
Comments count
4
Documentation of `executor.update_parameters` arguments
Updated
5 months ago
Comments count
1
SLURM Job keeps running after Successful Job Completion (Hydra Submitit Plugin)
Updated
5 months ago
Comments count
2
Enabling sbatch file re-use.
Updated
5 months ago
Comments count
2
Submitit map_array is using 2 GPUs but only requested 1
Closed
5 months ago
Comments count
1
Does setting `folder` in `AutoExecutor` interfere with sattach?
Updated
6 months ago
Comments count
1
timeout_min=0 results in pending jobs when a Slurm partition timelimit is set
Closed
6 months ago
Support Slurm Heterogeneous Job
Updated
6 months ago
Comments count
2
Submit Over SSH?
Updated
7 months ago
Comments count
4
Conda version out of date
Updated
10 months ago
Comments count
1
AttributeError, AutoExecutor attribute not recognised by submitit
Closed
10 months ago
Comments count
1
array_parallelism on local machine
Closed
a year ago
Comments count
1
Submitit with SLURM sub-scheduling
Updated
a year ago
Comments count
2
duplicate tasks when using `SlurmExecutor.map_array`
Closed
a year ago
Comments count
3
Requeueing on timeouts when launching jobs with CommandFunction
Updated
a year ago
SLURM Jobs keep running after successful job completion.
Closed
a year ago
Add custom options to sbatch command in SLURM
Updated
a year ago
Comments count
4
Submitit with sbatch
Updated
a year ago
Comments count
6
Unwanted behavior after a slurm job time limit
Closed
a year ago
Comments count
1
Printing in Signal Handlers May Be Unsafe
Updated
a year ago
Comments count
1
How to specify GPUs when executing locally?
Closed
a year ago
Comments count
5
Submitit puts all tasks on a single GPU
Closed
a year ago
Comments count
3
Should we submit job on login node?
Updated
a year ago
Comments count
1
InfoWatch might get previous jobid info after slurm restart
Updated
a year ago
Comments count
1
UnicodeDecodeError fails the job
Updated
a year ago
Comments count
1
array_parallelism for LocalExecutor
Closed
a year ago
Comments count
2
submitit.core.utils.FailedJobError: sbatch: error: Parameter --gres=gpu:1 no longer acceptable, please switch to --gpus=1
Closed
a year ago
Comments count
1
Can submitit manage chain dependencies?
Closed
a year ago
Comments count
1
Compute Canada
Closed
a year ago
Comments count
1
NodeList Declaration
Closed
a year ago
Comments count
2
Remove `#SBATCH --nodes=1`
Closed
a year ago
Comments count
3
How to load the original code point when preempted and rescheduled if the code is changed before rescheduling?
Closed
a year ago
Comments count
1
Switching from USR1 Breaks Pytorch Lightning
Updated
2 years ago
Comments count
4
Recover jobs after kernel dies
Closed
2 years ago
Comments count
1
submit job array to multiple partitions
Updated
2 years ago
Latest versions' tags not on GitHub
Closed
2 years ago
Comments count
2
Task does not wait for GPU memory resources
Closed
2 years ago
Comments count
1
filedescriptor out of range in select()
Closed
2 years ago
Comments count
1