Volcano Job

Create a file named test_q.yaml that defines a Volcano queue:

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: test
spec:
  weight: 1
  reclaimable: false
  capability:
    cpu: 8
    memory: 64Gi

Apply to the cluster:

$ kubectl apply -f test_q.yaml
queue.scheduling.volcano.sh/test created

A Volcano job uses a PriorityClass to specify its priority, so create a file named high_pc.yaml that defines one:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 10000
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "Priority 10000"

Apply to the cluster:

$ kubectl apply -f high_pc.yaml
priorityclass.scheduling.k8s.io/high-priority created

Now create a file named sleep_vj.yaml that defines the job:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: sleep
spec:
  minAvailable: 3
  schedulerName: volcano
  priorityClassName: high-priority
  queue: test
  policies:
    - event: PodEvicted
      action: RestartJob
  tasks:
    - replicas: 3
      name: sleep-task
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          restartPolicy: Never
          containers:
            - image: busybox:1.37.0-glibc
              imagePullPolicy: IfNotPresent
              name: sleep-busybox
              command: ["sh", "-c", "trap exit INT TERM; sleep 1m & wait"]
              resources:
                requests:
                  cpu: "1"
                  memory: 100Mi
                limits:
                  cpu: "1"
                  memory: 100Mi

Apply to the cluster:

$ kubectl apply -f sleep_vj.yaml
job.batch.volcano.sh/sleep created
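
As a quick sanity check, the arithmetic below compares the job's total request with the queue's capability, using only the values from test_q.yaml and sleep_vj.yaml. It is a minimal sketch: the variable names are made up for illustration, and it does not account for other jobs in the queue or for free capacity on the nodes.

```shell
# Values copied from sleep_vj.yaml (variable names are illustrative).
replicas=3
cpu_per_pod=1          # requests.cpu: "1"
mem_per_pod_mi=100     # requests.memory: 100Mi

# Values copied from test_q.yaml.
queue_cpu=8                    # capability.cpu: 8
queue_mem_mi=$((64 * 1024))    # capability.memory: 64Gi, expressed in Mi

# Total request of the job when all replicas are scheduled.
total_cpu=$((replicas * cpu_per_pod))
total_mem_mi=$((replicas * mem_per_pod_mi))

echo "cpu:    ${total_cpu} of ${queue_cpu}"
echo "memory: ${total_mem_mi}Mi of ${queue_mem_mi}Mi"

# Compare the totals with the queue capability.
if [ "$total_cpu" -le "$queue_cpu" ] && [ "$total_mem_mi" -le "$queue_mem_mi" ]; then
  fits=yes
else
  fits=no
fi
echo "fits in queue capability: ${fits}"
```

The job asks for 3 CPUs and 300Mi in total, well below the queue's 8 CPUs and 64Gi. Because minAvailable equals the number of replicas, all three pods have to be schedulable together before the job starts.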

Note

The job remains in the Pending state as long as its underlying pods are not running successfully, and the cause is not necessarily a lack of resources.
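
To find the actual cause of a Pending job, it usually helps to look at the job, its PodGroup, and its pods together. The helper below is a minimal sketch: diagnose_vj is a hypothetical name, and it assumes that Volcano labels task pods with volcano.sh/job-name=<job name>, which may differ between versions.

```shell
# Hypothetical helper: print the objects that usually explain why a
# Volcano job is stuck in Pending. Takes the job name as its argument.
diagnose_vj() {
  # Job status, conditions and recent events.
  kubectl describe vj "$1"
  # PodGroup created for the job; its conditions often carry the
  # scheduler's reason when the pods cannot be placed together.
  kubectl get pg -o wide
  # Task pods of the job (the label key is an assumption, see above).
  kubectl get pods -l "volcano.sh/job-name=$1" -o wide
}
```

Calling diagnose_vj sleep prints all three views for the job created above. Running kubectl describe pod on an individual pod then shows pod-level problems such as image pull errors.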

Watch the job status:

$ kubectl get vj sleep -owide -w
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE   QUEUE
sleep   Pending   3                         0s    test
sleep   Pending   3                         1s    test
sleep   Pending   3              1          2s    test
sleep   Pending   3              2          3s    test
sleep   Running   3              3          3s    test
sleep   Running   3              2          64s   test
sleep   Running   3              1          64s   test
sleep   Completing   3                         64s   test
sleep   Completed    3                         64s   test
sleep   Completed    3                         64s   test

If we list the resources while the job is running:

$ kubectl get vj,pg,po -owide
NAME                         STATUS    MINAVAILABLE   RUNNINGS   AGE   QUEUE
job.batch.volcano.sh/sleep   Running   3              3          13s   test

NAME                                                                        STATUS    MINMEMBER   RUNNINGS   AGE   QUEUE
podgroup.scheduling.volcano.sh/sleep-d6693086-b400-43c7-a0d9-baf606ff3479   Running   3           3          13s   test

NAME                     READY   STATUS    RESTARTS   AGE   IP               NODE     NOMINATED NODE   READINESS GATES
pod/sleep-sleep-task-0   1/1     Running   0          13s   192.168.5.203    las2     <none>           <none>
pod/sleep-sleep-task-1   1/1     Running   0          13s   192.168.135.17   las1     <none>           <none>
pod/sleep-sleep-task-2   1/1     Running   0          13s   192.168.135.16   las1     <none>           <none>

Delete the job:

$ kubectl delete vj sleep
job.batch.volcano.sh "sleep" deleted

Note

Deleting the job also deletes all of its task pods.
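
One way to confirm this is to list the pods that belonged to the job after deleting it. The sketch below uses a hypothetical helper, vj_pods, and assumes that Volcano labels task pods with volcano.sh/job-name=<job name>.

```shell
# Hypothetical helper: list the task pods of a Volcano job by label.
# The label key volcano.sh/job-name is an assumption about how
# Volcano labels the pods it creates.
vj_pods() {
  kubectl get pods -l "volcano.sh/job-name=$1" -o name
}
```

After kubectl delete vj sleep, running vj_pods sleep should print no pod names once the pods have terminated.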