Job Requirements
Jobs must belong to an Accounting Group, as documented in the Usage document.
Jobs must meet the following limits; otherwise they will not start or will be evicted:
- Must request the memory you need via request_memory (the default is 1.5G); usage is allowed a +20% grace window above the request
- Must run for fewer than 3 days
Features:
- We support multicore: you can ask for however many CPUs you require with request_cpus
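As a rough illustration of the requirements and features above, a submit description might look like the following sketch. The executable name, file names, accounting group value, and resource amounts are all placeholders, not site defaults; take the real accounting group from the Usage document.

```
# Hypothetical job script; replace with your own executable
executable       = my_job.sh
# Placeholder accounting group; use the one documented for your group
accounting_group = group_example.myuser
# Ask for more than the 1.5G default when the job needs it
request_memory   = 4GB
# Multicore request; omit this line for a single-core job
request_cpus     = 2
log              = my_job.log
output           = my_job.out
error            = my_job.err
queue
```

Comments are kept on their own lines because HTCondor submit files do not treat a trailing # as a comment.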
Job Eviction Policy
Regular jobs are guaranteed 3 days of runtime before being preempted
Jobs are evicted unconditionally if they use more than 30% over the RAM they requested or run longer than 3 days
Jobs that have been evicted for memory usage will be eligible to run again unless a periodic_hold or periodic_remove statement is added. We suggest the following:
periodic_hold = (NumJobStarts >= 1 && JobStatus == 1)
This holds jobs that have been put back into the Idle state (JobStatus 1) after starting at least once.
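If you would rather have such jobs leave the queue immediately than sit on Hold, the same condition can be used with periodic_remove. This is only a sketch of that alternative, not a site recommendation; a removed job cannot be inspected or released afterwards.

```
# Alternative to periodic_hold: remove a job that returned to Idle (1) after starting at least once
periodic_remove = (NumJobStarts >= 1 && JobStatus == 1)
```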
Queue Cleanup Policy
Some system-wide expressions keep the condor queues clear of old jobs.
- Jobs that have run at least once, been evicted for whatever reason, and sat Idle in the queue for over 7 days will be placed on Hold
- Jobs that use too much memory or run too long frequently cannot start again, and this keeps those old jobs from polluting the queue
- Jobs that are on Hold for over 3 weeks get removed
- Jobs held by periodic_hold, or held under the first policy above, will be cleaned up automatically in this way.
For example, if a job uses too much RAM, gets evicted, and cannot run again, it will be put on Hold after 1 week and removed after 3 additional weeks, giving the user about 1 month to clean up their jobs.