Post-Deploy Jobs Against Live Services
When an onPostDeploy job (a database migration runner, a data seeder, an integration smoke test) needs to connect to services that the same asset just deployed, timing matters. Codiac fires onPostDeploy as soon as the deployment manifests are applied, not after the new service instances are up and accepting traffic, so the job can start before its target is reachable.
The solution is to build the wait into the job itself using a startup gate: a short-lived container that runs before your main job container and exits only when the target service is ready. The main container never starts until the gate passes, and Codiac's deployment pipeline moves on without blocking.
How it works
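Add an initContainers entry to your job spec. Init containers run in order: each must exit 0 before the next one runs, and all of them must succeed before the main container starts. What happens when an init container fails depends on the pod's restartPolicy. With OnFailure, it is restarted in place; with Never, the pod is marked failed and the job creates a replacement pod, up to its backoffLimit. No changes to your Codiac configuration are needed, because the gate lives entirely in the job manifest.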
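To illustrate the ordering, here is a minimal sketch of a complete Kubernetes Job with two gates chained in front of the main container. The job name, service names, and ports are placeholders, and how this manifest maps onto your Codiac job definition is an assumption rather than something this page specifies.

```yaml
# Sketch only: names, images, and ports are placeholders, not Codiac defaults.
apiVersion: batch/v1
kind: Job
metadata:
  name: post-deploy-migration      # hypothetical job name
spec:
  template:
    spec:
      restartPolicy: OnFailure     # Job pods accept only Never or OnFailure
      initContainers:
        # Gate 1 runs first and must exit 0 before gate 2 starts.
        - name: wait-for-db
          image: busybox
          command: ["sh", "-c", "until nc -z my-database 5432; do sleep 2; done"]
        # Gate 2 starts only after gate 1 has succeeded.
        - name: wait-for-api
          image: busybox
          command: ["sh", "-c", "until wget -qO- http://my-service:8080/health; do sleep 2; done"]
      containers:
        # The main container starts only after every gate has exited 0.
        - name: migration-runner
          image: my-migration-image
```

Chaining gates this way keeps each check small and makes it clear from the pod status which dependency is still being waited on.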
Option 1: TCP port check
Best for any service that accepts TCP connections: databases, message queues, APIs.

    spec:
      template:
        spec:
          initContainers:
            - name: wait-for-service
              image: busybox
              command:
                - sh
                - -c
                - until nc -z my-service 5432; do echo "waiting..."; sleep 2; done
          containers:
            - name: migration-runner
              image: my-migration-image

Replace my-service with the service name and 5432 with the port. The loop retries every two seconds until the TCP port responds, then exits 0 and lets the main container start.
Limitation: Confirms the port is open, not that the application is ready to serve requests.
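The loop above has no upper bound, so a service that never comes up leaves the gate waiting indefinitely. One possible variation, sketched here with an arbitrary limit of 60 attempts (an assumption, not a Codiac recommendation), gives up after about two minutes and exits non-zero:

```yaml
initContainers:
  - name: wait-for-service
    image: busybox
    command:
      - sh
      - -c
      - |
        # Try up to 60 times, two seconds apart, then fail the gate.
        tries=0
        until nc -z my-service 5432; do
          tries=$((tries + 1))
          if [ "$tries" -ge 60 ]; then
            echo "my-service:5432 not reachable after $tries attempts" >&2
            exit 1
          fi
          echo "waiting..."
          sleep 2
        done
```

When the gate exits 1, the init container counts as failed, so the pod's restartPolicy and the job's backoffLimit decide whether it is tried again.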
Option 2: HTTP health check
Best for services with a /health or /ready endpoint.

    spec:
      template:
        spec:
          initContainers:
            - name: wait-for-api
              image: busybox
              command:
                - sh
                - -c
                - until wget -qO- http://my-service:8080/health; do echo "waiting..."; sleep 2; done
          containers:
            - name: migration-runner
              image: my-migration-image

wget exits non-zero on a connection failure or a non-2xx response, so the loop retries until the application confirms it is ready at the HTTP layer.
Limitation: Requires the service to expose a health endpoint.
Option 3: Full rollout check
Best when you need to confirm every instance of the service is healthy — not just the first one that answered.
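
    spec:
      template:
        spec:
          serviceAccountName: post-deploy-runner
          initContainers:
            - name: wait-for-rollout
              image: bitnami/kubectl
              command:
                - kubectl
                - rollout
                - status
                - deployment/my-service
                - --timeout=300s
          containers:
            - name: migration-runner
              image: my-migration-image

The init container waits up to 5 minutes for the my-service deployment to finish rolling out, meaning every replica is updated and available, and then exits 0. If the rollout does not complete in time, kubectl exits non-zero and the gate fails. kubectl rollout status is used here because kubectl wait --for=condition=available returns as soon as the deployment reaches its minimum availability, which can happen before every instance is ready.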
Setup required: The job's service account (post-deploy-runner in the example) needs get, list, and watch permissions on deployments (API group apps) in the same cabinet, because kubectl lists and watches the deployment while it waits. Create a Role granting those permissions and bind it to the service account.
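The setup described above could look roughly like the following. This is a sketch under assumptions: the namespace my-namespace stands in for whatever Kubernetes namespace backs your cabinet, and the Role and RoleBinding names are hypothetical. The list verb is included alongside get and watch because kubectl typically lists the resource before watching it.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: post-deploy-runner
  namespace: my-namespace            # assumption: namespace backing the cabinet
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-reader            # hypothetical Role name
  namespace: my-namespace
rules:
  # Read-only access to deployment state; no write verbs are granted.
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: post-deploy-runner-deployment-reader   # hypothetical binding name
  namespace: my-namespace
subjects:
  - kind: ServiceAccount
    name: post-deploy-runner
    namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-reader
```

A namespaced Role, rather than a ClusterRole, keeps the job's access limited to deployments in that one namespace.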
Limitation: More setup overhead than the other options; requires a service account with read access to deployment state.
Option 4: Retry on failure
Best when the service starts within seconds and an initial failure is acceptable.
Instead of a startup gate, configure the job to retry on failure:

    spec:
      backoffLimit: 5
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: migration-runner
              image: my-migration-image

If the job fails because the service is not yet ready, it retries with exponential back-off up to backoffLimit times.
Limitation: The main container will usually fail at least once before the job succeeds. If you alert on container restarts or job failures, that alert can fire on every deploy. Prefer one of the startup gate approaches if clean deploy logs matter to you.
Which to use
| Scenario | Approach |
|---|---|
| Any TCP service (database, broker, API) | TCP check |
| Service exposes a health endpoint | HTTP health check |
| Need all instances confirmed healthy before proceeding | Full rollout check |
| Service starts fast; initial failure is acceptable | Retry |
For most migration runners and seed loaders, Option 2 (HTTP health check) is the right default — straightforward to configure and gives a meaningful application-layer signal.