Kubernetes 101: Introduction to Volumes

Kubernetes volumes address two fundamental needs of containers running in pods: persisting data and sharing data between containers. Containers are ephemeral by nature: they can be created, destroyed, and recreated on the fly, which is great for application scalability and flexibility but poses challenges for data storage. Files written to a container's own filesystem are lost when the container is removed or restarted, which makes it difficult to run stateful applications that require persistent storage.

Another problem occurs when multiple containers run in the same pod and need to share files. Without a volume, it is difficult to set up and access a shared filesystem across all of the containers.

This is where Kubernetes volumes come in. A volume is a directory containing data that can be accessed by containers within a pod. It provides a way to persist data, share data between containers within a pod, and even use external storage resources.
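
As a rough illustration of the sharing case described above, the following sketch declares one volume at the pod level and mounts it into two containers. The pod name, container names, image tag, and paths are all assumptions chosen for the example, not values from any particular deployment.

```yaml
# A minimal sketch: two containers in one pod sharing a single volume.
# All names, the image tag, and the paths are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
spec:
  volumes:
    # The volume is declared once, at the pod level.
    - name: shared-data
      emptyDir: {}
  containers:
    # Appends a timestamp to a file on the shared volume every 5 seconds.
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    # Waits for the file to appear, then streams it from its own mount path.
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "until [ -f /input/out.log ]; do sleep 1; done; tail -f /input/out.log"]
      volumeMounts:
        - name: shared-data
          mountPath: /input
          readOnly: true
```

Each container chooses its own mountPath, so the same volume can appear at different locations in different containers.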

Why We Need Kubernetes Volumes

  • Data persistence: Volumes provide a way to persist data beyond the lifecycle of individual containers, ensuring that important data is not lost when containers are restarted or replaced.

  • Data sharing: Volumes can be mounted by multiple containers within the same pod, allowing those containers to share data. This is useful for applications that involve multiple containers that need to work on the same set of data.

  • Storage abstraction: Kubernetes volumes abstract the details of how storage is provided and how it is consumed, allowing users to focus on the application rather than the underlying storage infrastructure. This abstraction also makes applications more portable across different environments.

  • Using external storage: Volumes offer seamless integration with various external storage options like cloud storage, network file systems, and physical disks.

Types of Kubernetes Volumes

Kubernetes supports several types of volumes, each designed for different use cases and underlying storage backends. Here are some of the most commonly used types:

Ephemeral Volumes:

💡
An ephemeral volume in Kubernetes is a type of storage that is tightly coupled to the lifecycle of a Pod. It is created automatically when a Pod is assigned to a node and is destroyed when the Pod is removed from that node. Ephemeral volumes are designed for temporary data that does not need to persist beyond the life of the Pod, such as cached data and working files for the application running within the Pod. For more information, see the Ephemeral Volumes page in the Kubernetes documentation.
  1. emptyDir: A simple empty directory used for storing transient data that persists as long as the pod is running on a node. It is ideal for temporary data that a container needs to process but is not essential to keep long-term.

    💡
    Note: A container crashing does not remove a Pod from a node. The data in an emptyDir volume is safe across container crashes.
  2. configMap and secret: Special types of volumes that inject data into pods as files. A configMap volume carries non-confidential configuration data, while a secret volume carries sensitive data such as passwords or tokens. Both allow you to keep container images generic and configure them externally at runtime.

  3. downwardAPI: Makes downward API data available to applications. Within the volume, the exposed data appears as read-only files in plain text format.

    💡
    The downward API is a mechanism for exposing Pod and container field values to code running in a container.
  4. CSI ephemeral volumes: The Container Storage Interface (CSI) is a standard that lets storage providers expose their storage systems to Kubernetes through drivers, without having to change the core Kubernetes code. CSI ephemeral volumes are declared inline in the Pod spec and are provided by CSI drivers that specifically support this mode.

  5. generic ephemeral volumes: Similar to emptyDir volumes in that they provide a per-pod directory for scratch data that is usually empty after provisioning. However, they may also offer additional features:

    • Storage can be local or network-attached.

    • Volumes can have a fixed size that Pods are not able to exceed.

    • Volumes may have some initial data, depending on the driver and parameters.

    • Typical operations on volumes are supported assuming that the driver supports them, including snapshotting, cloning, resizing, and storage capacity tracking.
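
To make the ephemeral types listed above more concrete, here is a hedged sketch of a single Pod that mounts an emptyDir, a configMap, a downwardAPI volume, and a generic ephemeral volume. The object names, mount paths, image tag, sizes, and especially the storage class name `standard` are assumptions for illustration; the generic ephemeral volume only works if the cluster has a matching StorageClass with a provisioner.

```yaml
# Illustrative only: names, sizes, and the storage class are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  # Each key becomes a file inside the mounted configMap volume.
  app.properties: |
    log.level=info
---
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volumes-demo
  labels:
    app: demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: cache
          mountPath: /cache
        - name: config
          mountPath: /etc/app
          readOnly: true
        - name: podinfo
          mountPath: /etc/podinfo
          readOnly: true
        - name: scratch
          mountPath: /scratch
  volumes:
    # emptyDir: scratch space that lives as long as the Pod stays on the node.
    - name: cache
      emptyDir:
        sizeLimit: 500Mi
    # configMap: exposes app.properties as /etc/app/app.properties.
    - name: config
      configMap:
        name: app-config
    # downwardAPI: writes the Pod's labels to the file /etc/podinfo/labels.
    - name: podinfo
      downwardAPI:
        items:
          - path: labels
            fieldRef:
              fieldPath: metadata.labels
    # Generic ephemeral volume: a per-pod claim created and deleted with the Pod.
    - name: scratch
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: standard  # assumption: replace with a class in your cluster
            resources:
              requests:
                storage: 1Gi
```

A secret volume is declared in the same way as the configMap volume, using `secret` with a `secretName` field in place of `configMap` with `name`.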

Persistent Volumes (PVs):

  1. persistentVolumeClaim (PVC): Mounts a PersistentVolume into a pod. The claim lets users request persistent storage of a specific size and access mode without knowing the details of the underlying storage.

    💡
    Claims can request a specific size and specific access modes (for example, a volume can be mounted as ReadWriteOnce, ReadOnlyMany, ReadWriteMany, or ReadWriteOncePod; see AccessModes).

    A PVC binds to a PersistentVolume (PV) resource, which represents a piece of storage in the cluster and abstracts the details of how that storage is provided.

  2. nfs: Mounts an NFS (Network File System) share into the pod, allowing multiple pods to read and write to the same files at the same time. Unlike emptyDir, which is erased when a Pod is removed, the contents of an nfs volume are preserved and the volume is merely unmounted. This means that an NFS volume can be pre-populated with data, and that data can be shared between pods.

  3. cloud provider-specific storage: Integrations with cloud providers' storage solutions, such as AWS Elastic Block Store (EBS), Google Persistent Disk, or Azure Disk Storage, provide seamless and scalable storage options that are managed by the cloud provider.

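    As of Kubernetes 1.29, some in-tree cloud provider volume types are no longer available, and the corresponding CSI drivers should be used instead. For details, refer to the Types of Volumes section of the Kubernetes documentation.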
  4. gitRepo (deprecated): A volume initialized with the content of a Git repository. This type has been deprecated since Kubernetes 1.11, and other mechanisms are preferred for initializing containers with Git repository content.

    💡
    To provision a container with a Git repository, mount an emptyDir volume into an init container that clones the repository using git, then mount the same emptyDir volume into the Pod's application container.
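
The init container approach recommended above as a replacement for gitRepo could look roughly like the following. The image `alpine/git` is assumed here only as an example of an image that has git on its PATH, and the repository URL is a placeholder, not a real repository.

```yaml
# A sketch of cloning a repository into an emptyDir with an init container.
# The images, names, and repository URL are placeholders (assumptions).
apiVersion: v1
kind: Pod
metadata:
  name: git-clone-demo
spec:
  volumes:
    - name: repo
      emptyDir: {}
  initContainers:
    # Runs to completion before the application container starts.
    - name: clone
      image: alpine/git:latest  # assumption: any image that ships git will do
      command: ["git", "clone", "--depth=1", "https://example.com/your/repo.git", "/repo"]
      volumeMounts:
        - name: repo
          mountPath: /repo
  containers:
    # Sees the cloned files because it mounts the same emptyDir volume.
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "ls /repo && sleep 3600"]
      volumeMounts:
        - name: repo
          mountPath: /repo
          readOnly: true
```

Because the clone lands in an emptyDir, the repository content is fetched again whenever the Pod is recreated.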

Each volume type has its own set of parameters and configuration options, which makes volumes flexible enough to suit a wide range of storage needs in a Kubernetes environment. Selecting the right type depends on the specific requirements of the application, such as the need for persistence, sharing across containers, or integration with external storage systems.
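
For the persistent case, a minimal sketch of a claim and a Pod that mounts it might look like this. The names, size, and mount path are assumptions, and the example presumes the cluster has either a default StorageClass or a pre-created PersistentVolume that can satisfy the claim.

```yaml
# Illustrative only: names, size, and mount path are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce  # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "date >> /var/data/started.log && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /var/data
  volumes:
    # The Pod refers to the claim by name, not to the underlying storage.
    - name: data
      persistentVolumeClaim:
        claimName: data-claim
```

Deleting and recreating the Pod keeps the data in /var/data, because the claim and the storage behind it outlive the Pod.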