Kubernetes deployment.yml Explained

Deep, field-by-field explanation of a Spring Boot UI+API Deployment and a ClusterIP Service, including selectors, ports, probes, and resource requests/limits.

Deployment Service AKS Probes Resources

deployment.yml

2 documents: Deployment + Service

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-ui-api
  labels:
    app: spring-ui-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-ui-api
  template:
    metadata:
      labels:
        app: spring-ui-api
    spec:
      containers:
        - name: app
          image: REPLACEME_ACR_LOGIN/spring-ui-api:1.0
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 9087

          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 9087
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 6

          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 9087
            initialDelaySeconds: 15
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 3

          startupProbe:
            httpGet:
              path: /actuator/health
              port: 9087
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 30

          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: spring-ui-api-svc
  labels:
    app: spring-ui-api
spec:
  selector:
    app: spring-ui-api
  ports:
    - name: http
      port: 80
      targetPort: 9087
  type: ClusterIP

What Kubernetes creates

Deployment
Ensures the desired number of Pods exist and supports rolling updates.
ReplicaSet
Created and managed by the Deployment to maintain the replica count.
Pods
The running container instances (2 replicas → 2 Pods, under normal conditions).
Service (ClusterIP)
A stable virtual IP + DNS name that load-balances traffic to matching Pods.

Traffic flow (inside cluster)

The Service receives traffic on port 80 and forwards it to port 9087 on one of the ready Pods:

spring-ui-api-svc:80 → Pod IP → container:9087

Because the Service is ClusterIP, it is reachable only from inside the cluster (or via an Ingress / Gateway that you add later).
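One way to check in-cluster reachability is a throwaway Pod that calls the Service by its DNS name. This is only a sketch under assumptions that are not in the original file: the Pod name svc-smoke-test is made up, the Pod must run in the same namespace as the Service, and it uses the public curlimages/curl image.

```yaml
# Hypothetical one-off Pod; read its logs, then delete it.
apiVersion: v1
kind: Pod
metadata:
  name: svc-smoke-test            # assumed name, not part of deployment.yml
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl      # assumed image; any image that ships curl works
      # The short Service name resolves within the same namespace.
      # http:// defaults to port 80, which is the Service port.
      command: ["curl", "-sS", "http://spring-ui-api-svc/actuator/health"]
```

If the Pod's logs show the health JSON, the Service name resolved and at least one ready Pod answered on 9087.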

Health model

startupProbe

“Is the app started yet?” Only once it succeeds do liveness/readiness matter.

readinessProbe

“Is the app ready to receive traffic?” Controls Service endpoints membership.

livenessProbe

“Is the app alive?” Failure triggers container restart.

Deployment deep dive

apiVersion: apps/v1 selects the stable API group for workloads like Deployments.
kind: Deployment declares a higher-level controller for rolling updates and replicas.
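Rolling-update behavior is tuned under spec.strategy. The file does not set it, so Kubernetes applies its defaults (RollingUpdate with maxSurge and maxUnavailable at 25% each). The values below are an illustrative choice, not taken from the original file:

```yaml
# Fragment of the Deployment spec (assumed values that favor availability during rollouts)
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # at most one extra Pod above replicas during a rollout
      maxUnavailable: 0   # never drop below the desired number of available Pods
```

With two replicas, this setting brings up one new Pod, waits for it to become Ready, and only then removes an old one.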

name: spring-ui-api

The Deployment’s resource name (what you reference with kubectl get deploy spring-ui-api).

labels: app: spring-ui-api

A key/value tag used for selection, grouping, and policy. You reuse the same label across: Deployment → Pod template → Service selector.

replicas: 2

Requests two Pod replicas. The Deployment controller continuously reconciles the actual state of the cluster to match this desired state.

What you get

2 Pods → basic redundancy. If one Pod is terminated/restarted, the other can still serve traffic (if ready).

Scaling later

You can increase replicas manually or use an HPA (Horizontal Pod Autoscaler) based on CPU/memory/custom metrics.
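A minimal HPA sketch for this Deployment might look like the following. The object name, the replica ceiling, and the 70% target are assumptions chosen for illustration, and CPU-based scaling only works if a metrics source such as metrics-server is running in the cluster.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: spring-ui-api-hpa          # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spring-ui-api            # the Deployment from deployment.yml
  minReplicas: 2
  maxReplicas: 5                   # assumed ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request (100m in this file)
```

Utilization is measured against the CPU request, not the limit, so the 100m request in this file directly shapes when scaling kicks in.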

Replicas alone do not guarantee “multi-node” placement. To enforce spreading across nodes/zones, you’d add affinity/anti-affinity or topology spread constraints (not present in this file).
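As a sketch of what such a constraint could look like (it is not part of the original file, and the soft-preference setting is an assumption), a topology spread constraint on the Pod template asks the scheduler to balance replicas across nodes:

```yaml
# Fragment of the Deployment: goes under spec.template.spec
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                             # allow at most a 1-Pod difference between nodes
          topologyKey: kubernetes.io/hostname    # spread across nodes
          whenUnsatisfiable: ScheduleAnyway      # soft preference; DoNotSchedule makes it a hard rule
          labelSelector:
            matchLabels:
              app: spring-ui-api
```

Using topology.kubernetes.io/zone as the topologyKey spreads across availability zones instead of individual nodes.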

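selector.matchLabels and template.metadata.labels

This is one of the most important concepts in Kubernetes: the Deployment’s selector decides which Pods it owns and manages, so it must match the labels on the Pod template.

Your file sets both selector.matchLabels and template.metadata.labels to: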

app: spring-ui-api

selector.matchLabels

“Find Pods with these labels; those are my replicas.”

template.metadata.labels

“When I create Pods, stamp them with these labels.”

Common pitfall

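If the selector does not match the Pod template labels, the API server rejects the Deployment outright: apps/v1 validates that the selector matches the template labels. The selector is also immutable once the Deployment exists, so changing these labels later means deleting and recreating the Deployment.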

image

REPLACEME_ACR_LOGIN/spring-ui-api:1.0 is a placeholder for your ACR login server, for example: myregistry.azurecr.io/spring-ui-api:1.0.

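The tag :1.0 identifies the version you’re deploying. Tags are mutable in a registry unless the registry is configured to lock them, so treat each tag as immutable by convention. Changing the tag in the manifest triggers a rollout, and the AKS nodes pull the new image.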

imagePullPolicy: IfNotPresent

Kubernetes pulls the image only if it’s not already present on the node. This reduces pulls during restarts but may surprise you if you reuse tags.

Best practice: avoid reusing the same tag for different builds (or use Always for dev).

ports

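containerPort: 9087 documents the port the process listens on inside the container. The field is primarily informational: it does not open or publish the port by itself. It should match the port your Spring Boot app actually listens on (typically set with server.port) and, if your Dockerfile has an EXPOSE line, that port as well.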

Service port: 80 → Service targetPort: 9087 → containerPort: 9087
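Because the container port is named http, the Service can refer to it by name instead of repeating the number. The fragment below is an optional variation on the original Service, not what the file currently does:

```yaml
# Fragment of the Service spec
spec:
  ports:
    - name: http
      port: 80
      targetPort: http   # resolves to the containerPort named "http" (9087)
```

With a named targetPort, changing the container port later only requires editing the Deployment.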

Probes explained (startup, readiness, liveness)

How to read probe timing

Roughly: the kubelet calls the endpoint every periodSeconds. Each call must respond within timeoutSeconds. After failureThreshold consecutive failures, Kubernetes acts: a failing readiness probe marks the Pod not ready, while a failing liveness or startup probe restarts the container.

startupProbe

Protects slow-starting apps from being killed by liveness checks too early.

Endpoint: GET /actuator/health on port 9087

initialDelaySeconds: 5 → wait 5 seconds before first check
periodSeconds: 5 → check every 5 seconds
timeoutSeconds: 2 → each call must complete within 2 seconds
failureThreshold: 30 → allow up to 30 consecutive failures before the container is killed and restarted

What that means in time

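After the initial delay, Kubernetes allows up to ~30 × 5s = 150 seconds of failed startup checks before it kills the container and restarts it according to the Pod’s restartPolicy. Counting the 5-second initial delay, the app has roughly 155 seconds to start.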
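The same arithmetic can size the budget for a slower app. As a sketch, assuming a hypothetical app that may need about five minutes to start (a figure not in the original), only failureThreshold has to change:

```yaml
# Fragment of the container spec (assumed tuning for a slow-starting app)
startupProbe:
  httpGet:
    path: /actuator/health
    port: 9087
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 60   # 60 x 5s = 300s of failed checks, plus the 5s initial delay
```

Raising failureThreshold rather than periodSeconds keeps checks frequent, so a fast start is still detected quickly.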

readinessProbe

Controls whether a Pod is considered a valid backend for the Service.

Endpoint: GET /actuator/health/readiness on port 9087

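initialDelaySeconds: 5 → wait at least 5 seconds after the container starts (with a startupProbe defined, readiness checks also wait until startup succeeds)
periodSeconds: 5 → check frequently (fast cutover)
timeoutSeconds: 2 → keep probe calls tight
failureThreshold: 6 → after 6 consecutive failures (about 30 seconds) → Pod becomes NotReady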

Operational impact

When readiness fails, Kubernetes removes the Pod from the Service endpoints and stops routing new Service traffic to it. It does not restart the container; restarts are the liveness probe’s job.

livenessProbe

Detects “stuck” apps and triggers restarts to recover automatically.

Endpoint: GET /actuator/health/liveness on port 9087

initialDelaySeconds: 15 → wait at least 15 seconds after the container starts before the first liveness check (liveness also waits until the startupProbe succeeds)
periodSeconds: 10 → check every 10 seconds
timeoutSeconds: 2 → each call must complete within 2 seconds
failureThreshold: 3 → after 3 consecutive failures → container restart

What that means in time

After initial delay, ~3 × 10s = 30 seconds of continuous failures (plus timeouts) can cause a restart.

Actuator note

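These endpoints assume Spring Boot Actuator is on the classpath and that its liveness/readiness health groups (Spring Boot 2.3 and later) are enabled and exposed. If only /actuator/health is available, the readiness and liveness URLs return 404 and those probes fail: the Pod never becomes Ready, and the liveness probe restarts the container repeatedly.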
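A minimal sketch of the Spring Boot side, assuming the app is configured through an application.yml (the original does not show the app’s configuration). Spring Boot normally enables the probe groups on its own when it detects a Kubernetes environment; setting the property explicitly makes the behavior the same elsewhere, for example in local runs:

```yaml
# application.yml (Spring Boot 2.3+); the file itself is an assumption
management:
  endpoint:
    health:
      probes:
        enabled: true   # adds /actuator/health/liveness and /actuator/health/readiness
server:
  port: 9087            # must match containerPort in deployment.yml
```

If Actuator is moved to a separate management port, the probe port values in the Deployment would need to point there instead.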

Resources explained (requests vs limits)

requests

The amount of CPU and memory the scheduler reserves for the container when making placement decisions. A node must have at least this much unreserved allocatable capacity for the Pod to be placed on it.

cpu: 100m → 0.1 CPU core requested
memory: 256Mi → 256 MiB requested

limits

The upper bounds enforced at runtime. Exceeding the memory limit typically gets the container OOM-killed; exceeding the CPU limit only throttles it.

cpu: 500m → up to 0.5 CPU core
memory: 512Mi → up to 512 MiB

Practical sizing insight

For Spring Boot, memory sizing should be validated under load. If the JVM heap + metaspace + native memory exceed 512Mi, the Pod may be OOM-killed. If this happens, increase the limit or set JVM flags (not shown here) to cap heap appropriately.
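One common way to cap the heap relative to the container limit is a JVM flag passed through an environment variable. The sketch below assumes the image runs a container-aware JVM (JDK 10+ or 8u191+) and does not already set heap flags itself; the 75% ratio is an illustrative starting point, not a value from the original:

```yaml
# Fragment of the container spec
containers:
  - name: app
    env:
      - name: JAVA_TOOL_OPTIONS              # read by the JVM at startup
        value: "-XX:MaxRAMPercentage=75.0"   # assumed ratio: heap capped at about 75% of the 512Mi limit
```

That leaves roughly 128Mi for metaspace, thread stacks, and other native memory, which still needs to be confirmed under load.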

Service explained (ClusterIP)

Selector

The Service selects Pods using labels. In your file: selector.app = spring-ui-api.

If Pods don’t have this label, the Service will have zero endpoints (no backends).

Ports mapping

port: 80 → Service listens on 80 (cluster-internal)
targetPort: 9087 → forwards to container’s 9087

Using port 80 is convenient for callers inside the cluster, while the container can keep its native 9087.

Type: ClusterIP

ClusterIP exposes the Service only inside the cluster. To make it reachable from the internet, you’d add an Ingress (NGINX) or Azure Application Gateway Ingress Controller / Gateway API resources (not included in this YAML).
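As a sketch of the Ingress option, assuming an NGINX ingress controller is already installed and registered under the class name nginx (the object name and host below are placeholders, not values from the original):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: spring-ui-api-ingress     # assumed name
spec:
  ingressClassName: nginx         # assumes an NGINX ingress controller provides this class
  rules:
    - host: app.example.com       # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: spring-ui-api-svc
                port:
                  number: 80      # the Service port, not the container port
```

The Service itself can stay ClusterIP in this setup, because the ingress controller reaches it from inside the cluster.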
