Update htcondor instructions #459
tristan-f-r
left a comment
[I'll have to restore my HTCondor access to follow this.]
agitter
left a comment
I'm testing the Snakemake long execution mode. The first time my jobs went on hold because I put my spras-v0.6.0.sif file in the htcondor/ directory instead of the root directory. That should have been obvious based on the comment in the .yaml file.
On the second attempt my jobs went on hold with
Transfer output files failure at execution point slot1_24@e2591.chtc.wisc.edu while sending files to access point ap2001. Details: 1 total failures: first failure: reading from file /var/lib/condor/execute/slot1/dir_3699332/scratch/output: (errno 2) No such file or directory
log = logs/spras_$(Cluster)_$(Process).log
output = logs/spras_$(Cluster)_$(Process).out
error = logs/spras_$(Cluster)_$(Process).err
log = htcondor/logs/spras_$(Cluster)_$(Process).log
Do we want one per cluster or one per cluster/process pair?
I think one per cluster/process is the right way to go. In theory, one could still queue N>1.
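To illustrate the naming question above, here is a minimal Python sketch of how the two templates expand when a single submit queues several jobs. The helper names, the cluster ID `1234`, and the job count are hypothetical and chosen only for illustration; the sketch also substitutes only the exact spellings `$(Cluster)` and `$(Process)` and does not attempt to reproduce HTCondor's full macro handling.

```python
def expand_macros(template: str, cluster: int, process: int) -> str:
    """Substitute the $(Cluster) and $(Process) submit macros in a file name template."""
    return (
        template
        .replace("$(Cluster)", str(cluster))
        .replace("$(Process)", str(process))
    )


def log_names(template: str, cluster: int, n_jobs: int) -> list[str]:
    """Return the file name each of the n_jobs processes in one cluster would get."""
    return [expand_macros(template, cluster, p) for p in range(n_jobs)]


# Hypothetical cluster ID and job count, for illustration only.
per_pair = log_names("htcondor/logs/spras_$(Cluster)_$(Process).log", cluster=1234, n_jobs=3)
per_cluster = log_names("htcondor/logs/spras_$(Cluster).log", cluster=1234, n_jobs=3)

print(len(set(per_pair)))     # one distinct file name per job
print(len(set(per_cluster)))  # a single file name shared by every job
```

With the cluster/process template, each queued job gets its own file name; with a cluster-only template, all jobs from one submit share a name.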
I converted this to a draft because these docs will depend on the explicit sif transfer PR, and I haven't yet tested everything here in that paradigm.
Also, apologies for the poor git etiquette in the last commit that rolled too many things into one diff (including running an
178a93e to fe7fbbc
…gging
I was tired of hacking around wanting verbose logging in the HTCondor Snakemake executor, so I added some plumbing to pass Snakemake's '--verbose' flag through 'snakemake_long.py' to snakemake itself. Additionally, I added '--env-manager' so I could run things with my preferred mamba env instead of conda (which is too slow to rebuild).
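The flag passthrough described in this commit message might look roughly like the sketch below. This is not the actual contents of `snakemake_long.py`: the function name `build_snakemake_cmd` and the way the command is assembled are assumptions made for illustration, and only `--verbose` is shown because the message does not say how `--env-manager` is consumed.

```python
import argparse


def build_snakemake_cmd(argv: list[str]) -> list[str]:
    """Parse wrapper arguments and build the snakemake command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Hypothetical wrapper around snakemake")
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Forward --verbose to snakemake",
    )
    args = parser.parse_args(argv)

    cmd = ["snakemake"]
    if args.verbose:
        # Only append the flag when the user asked for it on the wrapper.
        cmd.append("--verbose")
    return cmd


print(build_snakemake_cmd(["--verbose"]))
print(build_snakemake_cmd([]))
```

Returning the command as a list keeps the sketch testable without launching snakemake; a real wrapper would hand the list to something like `subprocess.run`.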
The executor has matured quite a bit since these instructions were first drafted, and it's my hope that these changes remove a lot of the headache for running jobs. Now, you can edit config files in `config/` and use the `input/` directory directly. Workflows should be submitted directly from the repository root.
Co-authored-by: Tristan F.-R. <pub.tristanf@gmail.com>
fe7fbbc to ceea753
This largely reformats the directory structure needed to run SPRAS workflows with HTCondor. In particular, it moves a lot of the helper code/submit files out of `docker-wrappers/SPRAS/` into a top-level `htcondor/` directory. I can do this now that the HTCondor executor has matured significantly, and can handle all the paths as they're configured in this diff.
To run a test SPRAS workflow, try following along with the instructions in `docs/htcondor.rst`. If anything is confusing, or you get hung up on any of the steps, let's discuss what I can do to make things more clear.