Understanding Ephemeral Storage in Kubernetes


Ephemeral storage in Kubernetes refers to storage associated with a pod that exists only for the lifetime of that pod. It is typically used for temporary data that does not need to persist beyond the pod's lifecycle. Ephemeral storage is usually backed by the node's local disk and is not designed for long-term data retention: it is wiped once the pod is deleted, and a container's writable layer is lost even on a container restart.

Usage of Ephemeral Storage

Ephemeral storage in Kubernetes can be used in various scenarios, such as:

  1. Scratch space for applications: temporary space for intermediate data processing.
  2. Caching: temporary storage for cached data that can be regenerated if lost.
  3. Logs: storing log data that can be offloaded to persistent storage or log aggregation systems.
  4. Temporary storage for build artifacts: during CI/CD processes, ephemeral storage can hold build artifacts that are needed only temporarily.

Types of Ephemeral Storage in Kubernetes

  1. emptyDir: a volume created when a pod is assigned to a node; it exists for as long as the pod runs on that node. It is empty at pod startup, with storage coming locally from the kubelet base directory (usually on the root disk, /var/lib/kubelet) or from RAM.
  2. configMap and secret: used for storing configuration data and sensitive information that can be consumed by pods.
  3. downwardAPI: allows pods to consume metadata about themselves or their environment.
  4. CSI ephemeral volumes: similar to the previous volume kinds, but provided by special CSI drivers that specifically support this feature.
  5. Generic ephemeral volumes: can be provided by any storage driver that also supports persistent volumes (see the example after this list).
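For example, a generic ephemeral volume is declared inline in the pod spec and provisioned through a PVC template. The following is a minimal sketch; the pod name and the "standard" storage class are placeholders for whatever your cluster provides:

apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volume-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - mountPath: /scratch
      name: scratch-volume
  volumes:
  - name: scratch-volume
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: "standard" # assumption: use a class available in your cluster
          resources:
            requests:
              storage: 1Gi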

emptyDir, configMap, downwardAPI, and secret volumes are provided as local ephemeral storage. They are managed by the kubelet on each node.

Default Ephemeral Storage Type

When a pod is started in Kubernetes, the default type of ephemeral storage is the node's local storage, tied to the pod's lifecycle. Specifically, the primary form of ephemeral storage used by default, without any explicit configuration, is the container's writable layer, described below.

Container Writable Layer

Each container in a pod has a writable layer provided by the container runtime (e.g., Docker, containerd). This writable layer is ephemeral and used by default for any file operations within the container that are not mapped to other volumes.

Automatic Ephemeral Storage Features

  1. Container writable layer: each container has its own writable layer where it can write files. This layer is ephemeral and lasts only for the duration of the container; once the container is terminated or restarted, the data in this writable layer is lost.
  2. emptyDir (if specified): if an emptyDir volume is explicitly specified in the pod's configuration, it provides dedicated ephemeral storage shared among all containers in the pod. This storage is tied to the pod's lifecycle: it is created when the pod is scheduled and deleted when the pod is terminated.
💡
To get the list of all pods or deployments that use emptyDir volumes, see the command in the following gist: https://gist.github.com/Brain2life/173dbad51c112a2ff1b3f3add9312fb1
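The gist itself is not reproduced here, but a minimal equivalent sketch using kubectl and jq (assuming jq is installed) could look like this:

# List namespace/name of every pod that declares an emptyDir volume
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[] | select(any(.spec.volumes[]?; has("emptyDir"))) | .metadata.namespace + "/" + .metadata.name'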

Here's an example of a pod definition with an emptyDir volume specified:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: busybox
    command: ["sh", "-c", "echo Hello Kubernetes! > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - mountPath: /data
      name: temp-storage
  volumes:
  - name: temp-storage
    emptyDir: {}
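As noted earlier, an emptyDir can also be backed by RAM instead of the node's disk. Setting medium: Memory mounts a tmpfs; it is fast, but the data counts against the container's memory usage. A minimal sketch (the 64Mi limit is just an illustrative value):

volumes:
- name: temp-storage
  emptyDir:
    medium: Memory
    sizeLimit: 64Mi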

Example

If you start a pod without specifying any volumes, the writable layer of each container acts as the ephemeral storage by default. Here is an example of a simple pod configuration without any explicit ephemeral storage:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: busybox
    command: ["sh", "-c", "echo Hello Kubernetes! > /tmp/hello.txt && sleep 3600"]

In this example:

  • The container uses its writable layer to write the file /tmp/hello.txt.

  • This storage is ephemeral and will be lost if the container is terminated or restarted.

To see the writable layer of a container in Kubernetes, you typically need to access the container's filesystem. The writable layer is where all changes made to the filesystem of the container (such as creating or modifying files) are stored. Here's how you can inspect the writable layer:

Exec into the container:

kubectl exec -it example-pod -c example-container -- /bin/sh

Inspect the filesystem; you should see the hello.txt file:

cd /tmp
ls -l

By default, when a pod is started in Kubernetes, the container’s writable layer is used as the ephemeral storage. If you need shared ephemeral storage among containers in the same pod, you can explicitly define an emptyDir volume. The writable layer is tied to the container lifecycle, while emptyDir (if used) is tied to the pod lifecycle.

Best Practices

Resource Requests and Limits

Set requests and limits for ephemeral storage to ensure that no single pod can consume all the node's storage. This helps in managing resource allocation and preventing resource exhaustion.

In the following example, the Pod has two containers. Each container has a request of 2GiB of local ephemeral storage. Each container has a limit of 4GiB of local ephemeral storage. Therefore, the Pod has a request of 4GiB of local ephemeral storage, and a limit of 8GiB of local ephemeral storage. 500Mi of that limit could be consumed by the emptyDir volume.

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  volumes:
    - name: ephemeral
      emptyDir:
        sizeLimit: 500Mi

The sizeLimit field under emptyDir specifies the maximum size for the entire emptyDir volume. An emptyDir volume is a temporary directory that initially starts empty and is deleted when the Pod is removed.

In the given example, the sizeLimit of 500Mi restricts the total size of the ephemeral volume that is mounted at /tmp in both containers.

Each container (app and log-aggregator) is allowed to request 2Gi and use up to 4Gi of ephemeral storage individually.

However, the shared ephemeral volume (emptyDir) has a sizeLimit of 500Mi, meaning the combined storage usage of both containers in /tmp cannot exceed 500Mi.
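You can verify this behavior by deliberately writing past the limit. A hedged sketch, assuming the app container image ships dd (the frontend Pod is the one defined above); the kubelet checks usage periodically, so eviction may take a minute or two:

# Write ~600Mi into the shared emptyDir, exceeding the 500Mi sizeLimit
kubectl exec frontend -c app -- dd if=/dev/zero of=/tmp/fill bs=1M count=600

# Watch the Pod get evicted
kubectl get pod frontend -w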

Log Management

Use log aggregation solutions (e.g., ELK Stack, Fluentd) to collect and store logs from pods to avoid losing critical log data when pods are deleted.

Monitoring and Alerts

Implement monitoring for ephemeral storage usage. Tools like Prometheus and Grafana can be used to set up alerts for storage thresholds.
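As a sketch, assuming cAdvisor metrics are scraped by Prometheus (metric names and availability vary by container runtime and version), an alerting rule on per-pod local filesystem usage might look like this:

groups:
- name: ephemeral-storage
  rules:
  - alert: HighPodEphemeralStorageUsage
    # container_fs_usage_bytes reports writable-layer usage per container
    expr: sum(container_fs_usage_bytes{container!=""}) by (namespace, pod) > 3 * 1024 * 1024 * 1024
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} uses over 3GiB of local storage"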

If the kubelet is managing local ephemeral storage as a resource, then the kubelet measures storage use in:

  • emptyDir volumes, except tmpfs emptyDir volumes

  • directories holding node-level logs

  • writable container layers

If a Pod is using more ephemeral storage than you allow it to, the kubelet sets an eviction signal that triggers Pod eviction.

For container-level isolation, if a container's writable layer and log usage exceeds its storage limit, the kubelet marks the Pod for eviction.

For pod-level isolation the kubelet works out an overall Pod storage limit by summing the limits for the containers in that Pod. In this case, if the sum of the local ephemeral storage usage from all containers and also the Pod's emptyDir volumes exceeds the overall Pod storage limit, then the kubelet also marks the Pod for eviction.
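To see per-pod ephemeral storage usage as the kubelet measures it, you can query the node Summary API. This assumes your credentials allow kubectl get --raw against the node proxy; replace <node-name> with a real node name:

kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" \
  | jq '.pods[] | {pod: .podRef.name, usedBytes: .["ephemeral-storage"].usedBytes}'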

By default, ephemeral container data is located on the Kubernetes node at:

  • /var/lib/kubelet

  • /var/lib/containers

To show the ephemeral storage usage on the node, use:

df -h /var/lib

To show the ephemeral storage usage inside the container, use:

du -h .
du -h [directory]

Data Backup

For critical temporary data, implement mechanisms to periodically back up data to persistent storage solutions.
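One common pattern is a sidecar container that periodically copies the ephemeral data to a persistent volume. The sketch below is illustrative only: it assumes a pre-existing PVC named backup-pvc, and the paths and interval are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: backup-example
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "while true; do date >> /scratch/app.log; sleep 60; done"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  - name: backup
    image: busybox
    # Copy the scratch data to the persistent volume every 5 minutes
    command: ["sh", "-c", "while true; do cp -r /scratch/. /backup/; sleep 300; done"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
    - name: backup
      mountPath: /backup
  volumes:
  - name: scratch
    emptyDir: {}
  - name: backup
    persistentVolumeClaim:
      claimName: backup-pvc # assumption: an existing PVC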

Clean Up

Ensure that ephemeral storage is properly cleaned up when pods are terminated to avoid orphaned data and wasted storage space. Even terminated and failed pods take up space, so it is important to clean lingering pods out of the cluster.

You can use the following Bash script in your CI/CD pipelines or clean your cluster manually by starting the script from your local machine: https://github.com/Brain2life/bash-cookbook/tree/k8s-cleanup-pods
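The script itself is not reproduced here, but as a minimal manual sketch, pods stuck in a terminal phase can be removed with field selectors:

# Delete failed pods across all namespaces
kubectl delete pods --field-selector=status.phase=Failed --all-namespaces

# Delete completed (succeeded) pods across all namespaces
kubectl delete pods --field-selector=status.phase=Succeeded --all-namespaces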

Types of Issues with Ephemeral Storage

  1. Data Loss: since ephemeral storage is tied to the pod lifecycle, any data stored there will be lost if the pod is deleted or crashes.
  2. Resource Contention: without proper resource limits, pods might consume more storage than expected, leading to node resource contention and potential disruptions.
  3. Node Disk Pressure: high usage of ephemeral storage can cause disk pressure on nodes, triggering pod evictions or other resource management actions by the kubelet.
  4. Limited Capacity: nodes have finite storage capacity, and excessive usage of ephemeral storage can exhaust available space, affecting overall cluster performance.
  5. No Persistence: ephemeral storage is not suitable for storing data that needs to be preserved across pod restarts or crashes; applications requiring persistent storage should use Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).

Understanding and effectively managing ephemeral storage in Kubernetes is crucial for ensuring the stability and performance of applications running in the cluster.