Cluster Workflows Guide

This guide describes the common workflows for creating, restoring, and managing clusters using the Codiac CLI: cluster define, cluster restore, and cabinet cluster attach. If you are brand new to Codiac clusters, start with the Cluster Lifecycle Guide first.


The Big Picture: Blueprint vs. Infrastructure

Every cluster in Codiac has two distinct layers:

Layer | What it is | Created by
------|------------|-----------
ClusterDef | The blueprint: cloud provider, region, node spec, version, etc. | cluster define, cluster provision, cluster bootstrap, cluster capture
Physical instance | The actual cloud infrastructure (AKS, EKS, etc.) | cluster restore, cluster provision, cluster bootstrap

These two layers are intentionally separate. Codiac stores your cluster configuration as a durable record independent of whether the physical cluster exists. This means:

  • You can define a cluster today and provision it later.
  • You can destroy the infrastructure and re-create it from the same blueprint.
  • You can review, audit, and version-control your cluster definitions before any cloud resources are created.

cod cluster define — Write the Blueprint

cluster define creates or updates a ClusterDef record without touching any cloud infrastructure. It is a metadata-only step.

# Interactive — walks you through every parameter
cod cluster define

# Update an existing definition (pre-populated with current values)
cod cluster define my-cluster

# Fully scripted
cod cluster define my-cluster \
--provider azure \
--providerSubscription <sub-id> \
--resourceGroup my-rg \
--location eastus \
--nodeSpec Standard_D4s_v3 \
--nodeQty 3 \
--k8sVersion 1.29.0 \
--silent

What it does:

  • Creates the ClusterDef record if it does not exist.
  • Updates the record if it already exists but has no live instance.
  • Rejects the call if the cluster already has a running instance — destroy it first, then redefine.
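
The three outcomes above can be read as a small decision rule. The sketch below models that rule in plain shell; define_outcome is a hypothetical helper written for illustration only. It does not call the Codiac CLI and is not how Codiac implements the check.

```shell
#!/bin/sh
# Hypothetical model of the `cluster define` outcomes listed above.
# $1 = "yes"/"no": does a ClusterDef record already exist?
# $2 = "yes"/"no": does the cluster have a live instance?
define_outcome() (
  def_exists=$1
  has_instance=$2
  if [ "$has_instance" = "yes" ]; then
    echo "reject"   # live instance present: destroy it first, then redefine
  elif [ "$def_exists" = "yes" ]; then
    echo "update"   # record exists, no live instance
  else
    echo "create"   # no record yet
  fi
)

define_outcome no no    # prints: create
define_outcome yes no   # prints: update
define_outcome yes yes  # prints: reject
```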

When to use it:

  • You want to register a cluster definition before you are ready to provision the hardware.
  • You want to update cluster metadata (e.g., node spec, k8s version) ahead of a rebuild.
  • You want to script out your cluster definitions for review or source control before creating any cloud resources.

cod cluster restore — Realize the Blueprint

cluster restore takes an existing ClusterDef and brings the cluster to a fully operational state. By default it runs all four phases in order, but each phase is individually skippable.

# Full restore — provision + agent + infrx + cabinets
cod cluster restore my-cluster --silent

# Skip provisioning (cluster already has a live instance)
cod cluster restore my-cluster --no-provision --silent

# Provision only — no agent, no infrx, no cabinets
cod cluster restore my-cluster --no-agent --no-infrx --no-cabinets --silent

# Restore cabinets for one enterprise only
cod cluster restore my-cluster --enterprise acme --silent

The four phases, in order:

Phase | Flag to skip | What it does
------|--------------|-------------
Provision | --no-provision | Creates the physical cluster in the cloud
Agent | --no-agent | Installs the Codiac in-cluster agent
Infrx | --no-infrx | Installs the default cluster stack (ingress, cert-manager, etc.)
Cabinets | --no-cabinets | Restores all SDLC cabinets attached to this cluster

Each phase depends on the ones before it. Skipping agent also skips infrx and cabinets. Skipping infrx also skips cabinets.
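
To check which phases a given flag combination leaves enabled, the skip cascade can be modeled in a few lines of shell. This is a sketch of the rule as stated above, not Codiac's implementation; phases_to_run is a hypothetical helper, and it assumes --no-provision does not cascade, since the examples above use it on clusters that already have a live instance.

```shell
#!/bin/sh
# Hypothetical model of the skip cascade: prints the phases that would still run.
phases_to_run() (
  provision=1; agent=1; infrx=1; cabinets=1
  for flag in "$@"; do
    case "$flag" in
      --no-provision) provision=0 ;;
      --no-agent)     agent=0 ;;
      --no-infrx)     infrx=0 ;;
      --no-cabinets)  cabinets=0 ;;
    esac
  done
  # Skipping agent also skips infrx; skipping infrx also skips cabinets.
  if [ "$agent" -eq 0 ]; then infrx=0; fi
  if [ "$infrx" -eq 0 ]; then cabinets=0; fi
  out=""
  if [ "$provision" -eq 1 ]; then out="$out provision"; fi
  if [ "$agent" -eq 1 ]; then out="$out agent"; fi
  if [ "$infrx" -eq 1 ]; then out="$out infrx"; fi
  if [ "$cabinets" -eq 1 ]; then out="$out cabinets"; fi
  echo "${out# }"   # strip the leading space
)

phases_to_run                  # prints: provision agent infrx cabinets
phases_to_run --no-provision   # prints: agent infrx cabinets
phases_to_run --no-agent       # prints: provision
phases_to_run --no-infrx       # prints: provision agent
```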

When to use it:

  • Bringing a freshly defined cluster all the way to production-ready in one command.
  • Rebuilding a destroyed cluster (destroy → restore loop, described below).
  • Re-running bootstrap on an already-provisioned cluster (--no-provision).
  • Restoring a subset of phases after a partial failure.

cod cabinet cluster attach — Wire Cabinets to a Cluster

A cabinet is an environment slot that holds deployed workloads. Before cluster restore can restore a cabinet onto a cluster, Codiac needs to know which cluster that cabinet belongs to. cabinet cluster attach records that association in metadata — no infrastructure is created.

# Interactive
cod cabinet cluster attach

# Scripted
cod cabinet cluster attach \
--enterprise acme \
--environment prod \
--cabinet api-gateway \
--cluster prod-cluster-1 \
--silent

When to use it:

  • After creating a new cabinet or a new cluster, before running cluster restore.
  • When moving a cabinet from one cluster to another (e.g., during a cluster migration).

Once attached, cluster restore will automatically include that cabinet in its cabinet restore phase.
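
When several cabinets in the same enterprise and environment belong on one cluster, the scripted form can be wrapped in a loop. The sketch below uses only the flags shown above; attach_all, run, and the DRY_RUN variable are hypothetical helpers, not part of the Codiac CLI. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# attach_all ENTERPRISE ENVIRONMENT CLUSTER CABINET...
# Attaches each named cabinet to CLUSTER, stopping at the first failure.
attach_all() (
  enterprise=$1; environment=$2; cluster=$3
  shift 3
  for cabinet in "$@"; do
    run cod cabinet cluster attach \
      --enterprise "$enterprise" \
      --environment "$environment" \
      --cabinet "$cabinet" \
      --cluster "$cluster" \
      --silent || exit 1
  done
)

attach_all acme prod prod-cluster api-gateway worker-service
```

Setting DRY_RUN=0 in the environment would execute the commands instead of printing them.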


Common Workflows

New Cluster, Start to Finish

Define the blueprint, wire up your cabinets, then restore everything in one go.

# 1. Define the cluster (no cloud resources created yet)
cod cluster define prod-cluster \
--provider azure --providerSubscription <sub-id> \
--resourceGroup prod-rg --location eastus \
--nodeSpec Standard_D4s_v3 --nodeQty 3 \
--k8sVersion 1.29.0 --silent

# 2. Declare which cabinets should run on this cluster
cod cabinet cluster attach \
--enterprise acme --environment prod \
--cabinet api-gateway --cluster prod-cluster --silent

cod cabinet cluster attach \
--enterprise acme --environment prod \
--cabinet worker-service --cluster prod-cluster --silent

# 3. Provision and bring everything up
cod cluster restore prod-cluster --silent

Rebuild a Destroyed Cluster

Codiac keeps the ClusterDef intact after cluster destroy, so you can re-create the exact same cluster at any time.

# Tear down the cloud infrastructure (definition is preserved)
cod cluster destroy prod-cluster --provider azure --silent

# Re-provision from the stored definition, re-install agent, infrx, and cabinets
cod cluster restore prod-cluster --silent

This is also useful for troubleshooting: if a cluster gets into a bad state, destroy and restore gives you a clean slate backed by the exact same configuration record.


Re-Run Bootstrap on a Live Cluster

If you need to reinstall the agent or infrx components on a cluster that is already provisioned, skip the provision phase.

# Re-install agent, infrx, and cabinets — leave the cloud infrastructure alone
cod cluster restore prod-cluster --no-provision --silent

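# Intended to re-install infrx and cabinets only (agent already healthy).
# Note: under the phase rules above, --no-agent also skips infrx and cabinets, so confirm which phases actually run.
cod cluster restore prod-cluster --no-provision --no-agent --silent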

Update Cluster Metadata Before a Rebuild

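When you want to change cluster parameters (e.g., upgrade k8s version or change node spec), destroy the infrastructure first, update the definition, then restore. The order matters because cluster define rejects updates while a live instance exists.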

# Destroy infrastructure first (required: define rejects updates while a live instance exists)
cod cluster destroy prod-cluster --provider azure --silent

# Update the stored definition with the new version
cod cluster define prod-cluster --k8sVersion 1.30.0 --silent

# Bring the cluster back up with the new spec
cod cluster restore prod-cluster --silent
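
Because the three steps must run in a fixed order, it can help to wrap them so that a failure in one step prevents the next from running. The following is a minimal sketch using only the commands and flags shown above; rebuild_with_version, run, and DRY_RUN are hypothetical names, and by default the script only prints the commands.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# rebuild_with_version CLUSTER PROVIDER K8S_VERSION
# destroy -> define -> restore, chained with && so a failed step stops the sequence.
rebuild_with_version() (
  cluster=$1; provider=$2; version=$3
  run cod cluster destroy "$cluster" --provider "$provider" --silent &&
  run cod cluster define "$cluster" --k8sVersion "$version" --silent &&
  run cod cluster restore "$cluster" --silent
)

rebuild_with_version prod-cluster azure 1.30.0
```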

How the Composite Commands Map to These Steps

The higher-level cluster commands are shortcuts that combine these building blocks:

Command | Equivalent steps
--------|-----------------
cod cluster provision | cluster define → restore infrastructure only
cod cluster bootstrap | cluster define → full restore (provision + agent + infrx + cabinets)

Understanding this helps you choose the right command for your situation. If you need control over individual steps — or want to run only a subset of them — use cluster define and cluster restore directly.
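
As a rough illustration of the mapping, the sketch below prints the building-block commands that each composite command corresponds to, according to the table above. expand_composite is a hypothetical helper; the real composite commands may pass additional parameters, and "restore infrastructure only" is assumed here to mean the provision-only form of cluster restore shown earlier.

```shell
#!/bin/sh
# expand_composite COMMAND CLUSTER
# Prints the define/restore steps that a composite command stands for.
expand_composite() (
  command=$1; cluster=$2
  case "$command" in
    provision)
      echo "cod cluster define $cluster"
      echo "cod cluster restore $cluster --no-agent --no-infrx --no-cabinets"
      ;;
    bootstrap)
      echo "cod cluster define $cluster"
      echo "cod cluster restore $cluster"
      ;;
    *)
      echo "unknown composite command: $command" >&2
      return 1
      ;;
  esac
)

expand_composite provision my-cluster
expand_composite bootstrap my-cluster
```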


Reference

Goal | Command
-----|--------
Create or update a cluster definition (no infrastructure) | cod cluster define
Restore a cluster from its stored definition | cod cluster restore
Wire a cabinet to a cluster | cod cabinet cluster attach
Provision + agent + infrx + cabinets (all-in-one) | cod cluster bootstrap
Provision cloud infrastructure only | cod cluster provision
Destroy cloud infrastructure (keep definition) | cod cluster destroy
Permanently remove cluster definition | cod cluster forget