The following short scripts (save each one in a .sh file) help identify all power/PBS/Torque jobs that have been running for at least (or up to) X hours, and then (1) print them to the screen and (2) delete them from the queue. For simplicity, we assume here that jobs do not exceed 10 hours. The scripts rely on qstat output, which reports CPU time (wall time would sometimes have been nicer here). I'll be happy to hear if you know of a better solution. Maybe qselect can be used here, but I have yet to figure out how. EDIT: I'm adding a quote from the PBS guide on how to select jobs based on time parameters.
qdel_long.sh:
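#!/bin/bash
# delete jobs that have been running for $1 hours to ($1 + 1) hours
# first print the matching jobs to the screen
qstat -u $USER | grep 'R 0'$1
# then extract the numeric job IDs and pass them to qdel
qstat -u $USER | grep 'R 0'$1 | awk -F '.' '{print $1}' | xargs qdel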
examples:
delete all jobs running for less than 10 hours: ./qdel_long.sh
delete all jobs running for less than 1 hour: ./qdel_long.sh 0
delete all jobs running for 1 to 2 hours: ./qdel_long.sh 1
BTW, you can easily combine it with a for-loop to delete all jobs that have been running for up to ($1 + 1) hours:
#!/bin/bash
# delete jobs that are up to ($1 + 1) hours long
for i in `seq 0 $1`
do
    qstat -u $USER | grep 'R 0'$i
    qstat -u $USER | grep 'R 0'$i | awk -F '.' '{print $1}' | xargs qdel
done
Or all jobs that have been running for at least $1 hours (and less than 10 hours):
#!/bin/bash
# delete jobs that are at least $1 hours long (and less than 10 hours)
for i in `seq $1 9`
do
    qstat -u $USER | grep 'R 0'$i
    qstat -u $USER | grep 'R 0'$i | awk -F '.' '{print $1}' | xargs qdel
done
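All the variants above grep for a literal 'R 0' followed by one digit, which is where the 10-hour limit comes from. A possible alternative is to let awk read the hours from the elapsed-time column, so that any range works in a single pass. This is only a sketch: the function name select_jobs_by_hours is made up for illustration, and it assumes that each job row of qstat -u ends with the state letter followed by the elapsed time (HH:MM or HH:MM:SS), which may differ between PBS flavors.

```shell
#!/bin/bash
# Sketch: print the IDs of running jobs whose elapsed time is in [min_h, max_h) hours.
# ASSUMPTION: qstat -u job rows end with "<state> <elapsed>", e.g. "... R 03:10".
# select_jobs_by_hours is a hypothetical helper name, not part of PBS/Torque.
select_jobs_by_hours() {
    local min_h=$1 max_h=$2
    awk -v min="$min_h" -v max="$max_h" '
        NF >= 2 && $(NF-1) == "R" {      # keep only rows in state R
            split($NF, t, ":")           # t[1] holds the hours part
            h = t[1] + 0
            if (h >= min && h < max) {
                split($1, id, ".")       # drop the ".server" suffix of the job ID
                print id[1]
            }
        }'
}

# Dry run, only list the matching job IDs:
#   qstat -u "$USER" | select_jobs_by_hours 2 12
# Delete them (with GNU xargs, -r skips qdel when nothing matches):
#   qstat -u "$USER" | select_jobs_by_hours 2 12 | xargs -r qdel
```

Since the function only reads from stdin, it can be tried on a saved copy of the qstat output before piping anything into qdel.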
From the PBS guide:
qselect -te.gt.09251200 -te.lt.09251500
qselect -x -s "MF" -ts.gt.09251200 -ts.lt.09251500
qselect -x -tc.gt.09251200 -tc.lt.09251500
qselect -x -tq09251430
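Going by the pattern in these examples, one way qselect might replace the qstat parsing is to compute a cutoff timestamp with date and pass it as a start-time filter. The sketch below only builds that filter argument. The helper name started_before_arg is hypothetical, it assumes GNU date and the MMDDhhmm stamp format seen in the quoted examples, and I have not tried the qselect calls in the comments on a live server.

```shell
#!/bin/bash
# Sketch: build a qselect start-time filter meaning "started at least N hours ago".
# ASSUMPTIONS: GNU date (for -d "@epoch") and the MMDDhhmm format from the guide examples.
# started_before_arg is a hypothetical helper name.
# The optional second argument (epoch seconds) replaces "now", which helps with testing.
started_before_arg() {
    local hours=$1
    local now=${2:-$(date +%s)}
    local cutoff=$(( now - hours * 3600 ))
    printf -- '-ts.lt.%s\n' "$(date -d "@$cutoff" +%m%d%H%M)"
}

# Possible usage (untested, adjust to your PBS version):
#   qselect -u "$USER" -s R "$(started_before_arg 3)"                  # list job IDs
#   qselect -u "$USER" -s R "$(started_before_arg 3)" | xargs -r qdel  # delete them
```

Note that this filters on the job start time, so it reflects wall time rather than the CPU time shown by qstat.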