Kubernetes deployment.yml Explained

Deep, field-by-field explanation of a Spring Boot UI+API Deployment and a ClusterIP Service, including selectors, ports, probes, and resource requests/limits.


deployment.yml

2 documents: Deployment + Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-ui-api
  labels:
    app: spring-ui-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-ui-api
  template:
    metadata:
      labels:
        app: spring-ui-api
    spec:
      containers:
        - name: app
          image: REPLACEME_ACR_LOGIN/spring-ui-api:1.0
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 9087

          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 9087
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 6

          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 9087
            initialDelaySeconds: 15
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 3

          startupProbe:
            httpGet:
              path: /actuator/health
              port: 9087
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 30

          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: spring-ui-api-svc
  labels:
    app: spring-ui-api
spec:
  selector:
    app: spring-ui-api
  ports:
    - name: http
      port: 80
      targetPort: 9087
  type: ClusterIP

What Kubernetes creates

  • Deployment
    Ensures the desired number of Pods exist and supports rolling updates.
  • ReplicaSet
    Created and managed by the Deployment to maintain the replica count.
  • Pods
    The running container instances (2 replicas → 2 Pods, under normal conditions).
  • Service (ClusterIP)
    A stable virtual IP + DNS name that load-balances traffic to matching Pods.

Traffic flow (inside cluster)

Service receives traffic on port 80:
spring-ui-api-svc:80 → Pod IP → container:9087
Because the Service is ClusterIP, it is reachable only from inside the cluster (or via an Ingress / Gateway that you add later).
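As a sketch of how another workload in the same cluster might address the Service by its DNS name, the fragment below sets an environment variable on a hypothetical client container. The variable name, and the assumption that client and Service share a namespace, are illustrative and not part of the original file.

```yaml
# Fragment of a hypothetical client container spec (not part of deployment.yml).
env:
  - name: SPRING_UI_API_URL            # hypothetical variable name
    value: "http://spring-ui-api-svc"  # same namespace; port 80 is the HTTP default
```

From a different namespace the caller would need the qualified form spring-ui-api-svc.<namespace>.svc instead of the short name.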

Health model

startupProbe
“Is the app started yet?” Only once it succeeds do liveness/readiness matter.
readinessProbe
“Is the app ready to receive traffic?” Controls Service endpoints membership.
livenessProbe
“Is the app alive?” Failure triggers container restart.

Deployment deep dive

  • apiVersion: apps/v1 selects the stable API group for workloads like Deployments.
  • kind: Deployment declares a higher-level controller for rolling updates and replicas.

name: spring-ui-api
The Deployment’s resource name (what you reference with kubectl get deploy spring-ui-api).
labels: app: spring-ui-api
A key/value tag used for selection, grouping, and policy. You reuse the same label across: Deployment → Pod template → Service selector.

replicas: 2 requests two Pod replicas. The Deployment controller continuously reconciles reality to match this desired state.

What you get
2 Pods → basic redundancy. If one Pod is terminated/restarted, the other can still serve traffic (if ready).
Scaling later
You can increase replicas manually or use an HPA (Horizontal Pod Autoscaler) based on CPU/memory/custom metrics.
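A minimal sketch of what such an HPA could look like for this Deployment. The HPA name, the replica bounds, and the 70% target are illustrative assumptions, and the cluster needs a metrics source (such as metrics-server) for it to work.

```yaml
# Hypothetical HPA; not part of the original deployment.yml.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: spring-ui-api-hpa        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spring-ui-api          # the Deployment defined above
  minReplicas: 2                 # illustrative bounds
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # percent of the CPU request (100m), not of the limit
```

Utilization is measured against the container's CPU request, so with a 100m request the HPA would start adding replicas at an average of roughly 70m per Pod.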
Replicas alone do not guarantee “multi-node” placement. To enforce spreading across nodes/zones, you’d add affinity/anti-affinity or topology spread constraints (not present in this file).
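One way to express that spreading is a topology spread constraint in the Pod template. The fragment below is a sketch that would sit under spec.template.spec; the choice of a soft (ScheduleAnyway) rule is an assumption.

```yaml
# Fragment for spec.template.spec of the Deployment (not in the original file).
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname  # spread across nodes; a zone label would spread across zones
    whenUnsatisfiable: ScheduleAnyway    # soft preference; DoNotSchedule makes it a hard rule
    labelSelector:
      matchLabels:
        app: spring-ui-api               # counts the Pods of this Deployment
```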

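The link between the selector and the Pod template labels is one of the most important concepts in Kubernetes: the Deployment’s selector decides which Pods it owns and manages.

Your file sets both selector.matchLabels and template.metadata.labels to: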
app: spring-ui-api
selector.matchLabels
“Find Pods with these labels; those are my replicas.”
template.metadata.labels
“When I create Pods, stamp them with these labels.”
Common pitfall
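If selector labels and template labels don’t match, the API server rejects the Deployment: apps/v1 validation reports that the selector does not match the template labels, so nothing is created. The selector is also immutable after creation in apps/v1, so changing it later means deleting and recreating the Deployment.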

image
REPLACEME_ACR_LOGIN/spring-ui-api:1.0 is a placeholder for your ACR login server, for example: myregistry.azurecr.io/spring-ui-api:1.0.

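The tag :1.0 identifies the version you’re deploying. A registry lets a tag be re-pushed unless it is locked, so treat each tag as immutable by convention. Changing the tag in the manifest changes what AKS pulls and triggers a rolling update.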
imagePullPolicy: IfNotPresent
Kubernetes pulls the image only if it’s not already present on the node. This reduces pulls during restarts but may surprise you if you reuse tags.

Best practice: avoid reusing the same tag for different builds (or use Always for dev).
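For a development setup where a tag does get re-pushed, the alternative mentioned above could look like the following sketch. This is a variant of the container spec, not the setting used in the original file.

```yaml
# Fragment of spec.template.spec; dev-oriented variant of the original container.
containers:
  - name: app
    image: REPLACEME_ACR_LOGIN/spring-ui-api:1.0
    imagePullPolicy: Always   # the kubelet checks the registry on every container start
```

With Always, the kubelet resolves the tag against the registry each time and downloads layers only if the node's cached copy is out of date.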
ports
containerPort: 9087 documents the port the process listens on inside the container. This is the same port your Dockerfile exposes and your Spring Boot app runs on.
containerPort: 9087 ← Service targetPort: 9087 ← Service port: 80
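Because the container port is named http, the Service could also reference it by name rather than by number. The fragment below is a sketch of that alternative for the Service's spec.ports; the original file uses the numeric form.

```yaml
# Fragment of the Service spec; behaves the same as targetPort: 9087.
ports:
  - name: http
    port: 80
    targetPort: http   # resolves to the containerPort named "http" (9087)
```

Referencing the name means a later change of the container port only has to be made in the Deployment.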

Probes explained (startup, readiness, liveness)

How to read probe timing
Roughly: Kubernetes calls the endpoint every periodSeconds. Each call must respond within timeoutSeconds. After failureThreshold consecutive failures, Kubernetes takes action (ready=false, restart, or “still starting”).
startupProbe
Protects slow-starting apps from being killed by liveness checks too early.

Endpoint: GET /actuator/health on port 9087
  • initialDelaySeconds: 5 → wait 5 seconds before first check
  • periodSeconds: 5 → check every 5 seconds
  • timeoutSeconds: 2 → each call must complete within 2 seconds
  • failureThreshold: 30 → allow 30 failures before declaring startup failed
What that means in time
Kubernetes allows up to ~30 × 5s = 150 seconds of failed startup checks, plus the 5-second initial delay (roughly 155 seconds in total), before it kills the container and restarts it according to the Pod’s restartPolicy.
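The same arithmetic can be used to size the budget for a slower application. The fragment below is a sketch for a hypothetical app that may need up to about five minutes to start; the value 60 is an illustrative assumption, not a setting from the original file.

```yaml
# Variant of the startupProbe with a longer startup budget.
startupProbe:
  httpGet:
    path: /actuator/health
    port: 9087
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 60   # 60 × 5s = 300s of failed checks tolerated, plus the 5s initial delay
```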
readinessProbe
Controls whether a Pod is considered a valid backend for the Service.

Endpoint: GET /actuator/health/readiness on port 9087
  • initialDelaySeconds: 5 → begin checks shortly after container starts
  • periodSeconds: 5 → check frequently (fast cutover)
  • timeoutSeconds: 2 → keep probe calls tight
  • failureThreshold: 6 → after 6 consecutive failures → Pod becomes NotReady
Operational impact
When readiness fails, Kubernetes removes the Pod from the Service’s endpoints and stops routing new traffic to it, but does not restart the container; only a failing liveness or startup probe does that.
livenessProbe
Detects “stuck” apps and triggers restarts to recover automatically.

Endpoint: GET /actuator/health/liveness on port 9087
  • initialDelaySeconds: 15 → allow more time before first liveness check
  • periodSeconds: 10 → check every 10 seconds
  • timeoutSeconds: 2 → each call must complete within 2 seconds
  • failureThreshold: 3 → after 3 consecutive failures → container restart
What that means in time
Once the startup probe has succeeded and the initial delay has passed, ~3 × 10s = 30 seconds of consecutive failures (plus timeouts) can cause a restart.
Actuator note
These endpoints assume Spring Boot’s health groups for readiness/liveness are enabled and exposed. If only /actuator/health is exposed, readiness/liveness URLs will 404 and probes will fail.
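A minimal sketch of the Spring Boot configuration this note refers to, assuming Spring Boot 2.3 or later with spring-boot-starter-actuator on the classpath and configuration kept in application.yml (the original does not show the application's config).

```yaml
# application.yml (assumed location of the app's configuration)
management:
  endpoint:
    health:
      probes:
        enabled: true     # exposes /actuator/health/liveness and /actuator/health/readiness
  endpoints:
    web:
      exposure:
        include: health   # make the health endpoint reachable over HTTP
```

Spring Boot enables these health groups automatically when it detects a Kubernetes environment; setting the property explicitly also makes the URLs available when running locally.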

Resources explained (requests vs limits)

requests

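The amount of resources the scheduler reserves for the container when placing the Pod. A node must have at least this much unreserved allocatable capacity for the Pod to be scheduled there; actual usage can be higher, up to the limits.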

  • cpu: 100m → 0.1 CPU core requested
  • memory: 256Mi → 256 MiB requested

limits

The upper bounds enforced at runtime. A container that exceeds its memory limit is typically OOM-killed and restarted; a container that reaches its CPU limit is throttled, not killed.

  • cpu: 500m → up to 0.5 CPU core
  • memory: 512Mi → up to 512 MiB
Practical sizing insight
For Spring Boot, memory sizing should be validated under load. If the JVM heap + metaspace + native memory exceed 512Mi, the Pod may be OOM-killed. If this happens, increase the limit or set JVM flags (not shown here) to cap heap appropriately.
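One common way to cap the heap relative to the container limit is to pass a JVM flag through an environment variable. The fragment below is a sketch for the container spec; the 75% figure is an illustrative assumption that should be validated under load, and it relies on a container-aware JVM (JDK 10+ or 8u191+).

```yaml
# Fragment of the container spec (not in the original file).
env:
  - name: JAVA_TOOL_OPTIONS                 # read by the JVM at startup
    value: "-XX:MaxRAMPercentage=75.0"      # heap capped at ~75% of the 512Mi limit (~384Mi)
```

The remaining memory is left for metaspace, thread stacks, and other native allocations.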

Service explained (ClusterIP)

Selector
The Service selects Pods using labels. In your file: selector.app = spring-ui-api.
If Pods don’t have this label, the Service will have zero endpoints (no backends).
Ports mapping
  • port: 80 → Service listens on 80 (cluster-internal)
  • targetPort: 9087 → forwards to container’s 9087
Using port 80 is convenient for callers inside the cluster, while the container can keep its native 9087.
Type: ClusterIP
ClusterIP exposes the Service only inside the cluster. To make it reachable from the internet, you’d add an Ingress (NGINX) or Azure Application Gateway Ingress Controller / Gateway API resources (not included in this YAML).
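A minimal sketch of such an Ingress, assuming an NGINX ingress controller is already installed with the class name nginx. The resource name and host are placeholders, not values from the original YAML.

```yaml
# Hypothetical Ingress; not part of the original deployment.yml.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: spring-ui-api-ingress      # hypothetical name
spec:
  ingressClassName: nginx          # assumes an NGINX ingress controller with this class
  rules:
    - host: app.example.com        # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: spring-ui-api-svc   # the ClusterIP Service defined above
                port:
                  number: 80
```

The Service itself stays ClusterIP; the ingress controller is what receives external traffic and forwards it to port 80 of the Service.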