
Longhorn Distributed File System For Kubernetes In Action
Sep 19
4 min read
Real Life Problem
After installing a Kubernetes cluster in your own private cloud, you come to realize that your application cannot store data. By default, Kubernetes does not ship with a storage solution. At most, you can configure local storage, but that does not work in production, because data stored on a node's local disk is neither available to nor persisted for pods running on other nodes. You might be tempted to mount NFS, but it has its limitations: it works for applications that do not require heavy data processing, such as WordPress, and falls short beyond that.
A viable solution is a "Distributed File System". In layman's terms, this is a storage technology that is distributed across the nodes forming a cluster. The result is storage that is fault tolerant and shared across the Kubernetes nodes. A technology like Longhorn.
What is Longhorn?
Longhorn is cloud native distributed block storage for Kubernetes that provides persistent storage for stateful applications. In simple terms, Longhorn lets you create volumes whose replicas are spread across many nodes, so pods on different nodes can keep their data close and available.
I will assume that you have already installed Longhorn so you can follow the demo. If not, you may visit my article Install Longhorn on Kubernetes.
Real-Life Use Case of Longhorn Distributed File System
Consider a scenario where your company needs you to deploy a database in Kubernetes. The setup should include one pod dedicated to writing and two additional pods for reading (a StatefulSet application). A key requirement is that the database must perform well for both writing and reading. Deploying the database as a StatefulSet running multiple pod replicas across nodes, backed by a distributed file system, achieves this requirement.
Solution and Demonstration of Longhorn Distributed File System in Action
To keep things straightforward, we use a log file to simulate database operations. One pod handles writing, while two pods are dedicated to reading. A Longhorn volume will be set up and linked to a persistent volume and a persistent volume claim. This claim will be mounted in all three pods at a specific directory, which is where the log file is stored.
As entries are written to the log file, the other two pods, running on different nodes, read it simultaneously, reproducing the scenario above. A monitoring dashboard shows the performance of the pods and the read-write throughput of the disks. Refer to the image below for our configuration.
Image 1: Pods Using Persistent Volume Claim mounted in Longhorn Volume

Create a Longhorn Volume and Persistent Volume Claim
Step 1: Create Volume
First, we create a volume in Longhorn called test-volume-01 using the Longhorn Dashboard. Ensure the access mode is configured to ReadWriteMany, allowing multiple pods to read from and write to the storage at the same time. It is important to note that we have set the replica count to 3, meaning the volume's data is replicated across 3 Kubernetes nodes.
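As an aside, if you prefer to provision Longhorn volumes declaratively rather than through the dashboard, the usual route is dynamic provisioning via a StorageClass. The sketch below only illustrates that alternative; the class name longhorn-3-replicas is made up for this demo, while the provisioner and parameters come from Longhorn's standard installation. It is not the exact volume created above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-3-replicas        # hypothetical name for this demo
provisioner: driver.longhorn.io    # Longhorn's CSI provisioner
parameters:
  numberOfReplicas: "3"            # keep 3 replicas per volume, as in the dashboard setup
  staleReplicaTimeout: "2880"      # minutes before a stale replica is cleaned up
A PVC that references this class would then have Longhorn create a three-replica volume for it automatically.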


Step 2: Attach the Volume to a Persistent Volume and Persistent Volume Claim
After creating the volume, attach it to a Persistent Volume and a Persistent Volume Claim. We will use this claim later to mount the storage into the pods of a Deployment.
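For reference, the PV and PVC generated for an existing Longhorn volume look roughly like the sketch below. Treat it as an assumption about the defaults: driver.longhorn.io is Longhorn's CSI driver, longhorn-static is the storage class Longhorn typically assigns to pre-created volumes, and the size is a placeholder that must match your own volume. The claim name must match the claimName referenced by the Deployment in the next section.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-volume-01
spec:
  capacity:
    storage: 10Gi                        # placeholder: match the Longhorn volume size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: longhorn-static      # assumption: class used for pre-created volumes
  csi:
    driver: driver.longhorn.io           # Longhorn CSI driver
    volumeHandle: test-volume-01         # the volume created in the dashboard
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-volume-02-pvc               # must match the claimName used by the Deployment
  namespace: development
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn-static
  volumeName: test-volume-01             # bind explicitly to the PV above
  resources:
    requests:
      storage: 10Gi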

Mount the volume into Deployment through PVC
We then create a Deployment with 3 pod replicas and mount the PVC created earlier in Longhorn. These pods will run the write and read log scripts. See the Deployment manifest below. Keep in mind that the PVC used in the Deployment is the one created earlier in the Longhorn Dashboard, and the Deployment and PVC must be in the same namespace.
Create deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: frontend
  name: frontend
  namespace: development
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - image: httpd
        name: httpd
        ports:
        - containerPort: 80
        resources: {}
        volumeMounts:
        - mountPath: /data
          name: test-volume-02-pvc
      volumes:
      - name: test-volume-02-pvc
        persistentVolumeClaim:
          claimName: test-volume-02-pvc
Provision the Deployment
kubectl apply -f deployment.yaml
Check the Pods created by the Deployment
kubectl get pods
Deploy Writer and Reader to the Pods
Deploy a Writer Script to Pod
Login to the first pod
kubectl exec frontend-84845856dd-66t8h -it -- bash
Add a vi editor
apt-get update && apt-get install -y vim
Create the writer script in the first pod
cd /data
vi writer
#!/bin/bash
LOG_FILE="/data/logfile.txt"
MESSAGES=("DATABASE WRITE SUCCESSFUL" "DATABASE DELETE SUCCESSFUL" "DATABASE UPDATE SUCCESSFUL" "DATABASE READ SUCCESSFUL")
while true; do
  RANDOM_MSG=${MESSAGES[$RANDOM % ${#MESSAGES[@]}]}
  echo "$(date +'%Y-%m-%d %H:%M:%S') $RANDOM_MSG" >> "$LOG_FILE"
  echo "Write successful"
  sleep 1 # Adjust for desired log generation frequency
done
Run the writer script
chmod +x writer
./writer
Deploy a Reader Script to Pod
Log in to the other two remaining pods and read the log using the tail command.
Login to the second pod
kubectl exec frontend-84845856dd-86kd8 -it -- bash
Read the Log
$ cd /data
$ tail -f logfile.txt
Login to the third pod
kubectl exec frontend-84845856dd-r7ws5 -it -- bash
Read the Log
$ cd /data
$ tail -f logfile.txt
Actual Demonstration and Simulation
In this video we log in to three (3) different pods on different nodes in the cluster. The first pod simulates writes to the database, while the second and third pods mimic reads from it. Without a distributed file system like Longhorn, data locality and persistence are not possible for pods running on different nodes. Furthermore, since the persistent volume claim is backed by a distributed file system, replicas of the data are kept across three nodes.
Conclusion
Longhorn is a strong solution for Kubernetes persistent storage when you want the following:
A distributed storage system that runs well with your cloud native application (without relying on external providers).
A storage solution that is tightly integrated with Kubernetes.
Storage that is highly available and durable.
A storage system that requires no specialized hardware and nothing external to the cluster.
A storage system that is easy to install and manage.