Kubernetes Observability: EFK Stack Deployment Guide

Kishor Chukka
5 min read · Nov 4, 2023


Tech Stack

  • Python
  • Elasticsearch
  • Fluentd
  • Kibana

Overview

In today’s cloud-native world, monitoring services is crucial. Kubernetes (k8s) has become the go-to orchestration platform, and with it comes the need for robust logging solutions. Traditional logging solutions often fall short when applied to the dynamic nature of containers and microservices. This is where the EFK stack comes into play.

What is EFK?

EFK stands for Elasticsearch, Fluentd, and Kibana:

  • Elasticsearch: A real-time distributed search and analytics engine. It’s where your logs are stored and can be queried.
  • Fluentd: An open-source data collector that unifies how log data is collected and consumed. In our context, it’s responsible for collecting logs from Kubernetes nodes and forwarding them to Elasticsearch.
  • Kibana: A visualization layer that works on top of Elasticsearch, providing a UI to visualize and query the data.

ELK vs. EFK: Embracing Cloud-Native Logging

Traditional logging solutions, like centralized logging servers or logging agents that write to a file on disk, were designed for static infrastructure. They assume that servers are long-lived and that logs can be written to disk or sent to a centralized server without much transformation.

However, in a Kubernetes environment:

  1. Containers are ephemeral: They can be killed and started dynamically, which means logs can be lost if not handled correctly.
  2. High volume and velocity: With potentially thousands of containers running, the volume of logs can be overwhelming.
  3. Diverse log formats: Different microservices might log in different formats, requiring normalization.

The EFK stack, being cloud-native, addresses these challenges by providing a scalable, flexible, and unified logging solution that’s designed for the dynamic nature of containerized applications.

With this foundation, let’s dive into setting up the EFK stack on a local Kubernetes cluster and monitoring logs from two sample microservices.

Prerequisites:

  • Docker installed
  • kind (Kubernetes in Docker) installed
  • kubectl installed
  • Helm v3 installed

The complete code for this project, including the Makefile and Helm charts, is available on GitHub.
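
This guide also assumes a local kind cluster is already running. If you don’t have one yet, a single-node cluster is enough; the cluster name below is only an example:

kind create cluster --name efk-demo
kubectl cluster-info --context kind-efk-demo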

Setting Up the EFK Stack

Step 1: Deploying Elasticsearch

Elasticsearch is the heart of the EFK stack, responsible for storing and indexing log data.

To deploy Elasticsearch, we first add the Elastic Helm chart repository:

helm repo add elastic https://helm.elastic.co

Then, we install Elasticsearch using Helm:

helm install elasticsearch elastic/elasticsearch \
  --set replicas=1 \
  --set resources.requests.memory="512Mi" \
  --set resources.requests.cpu="500m" \
  --set persistence.enabled=false \
  --set service.type=NodePort

This command sets up a single-node Elasticsearch cluster with specific resource requests and makes it accessible via a NodePort service.

After running the command, monitor the Elasticsearch pods until they are running:

kubectl get pods --namespace=default -l app=elasticsearch-master -w

Once Elasticsearch is running, you can forward the service port to your local machine:

kubectl port-forward service/elasticsearch-master 9200:9200

Now, you can verify that Elasticsearch is up and responding by listing its indices.

curl -X GET "localhost:9200/_cat/indices?v"
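
Note that recent versions of the Elasticsearch chart enable TLS and authentication by default, so the plain request above may be rejected. In that case, pull the elastic user’s password from the chart’s secret (see the note in Step 3) and query over HTTPS, skipping verification of the self-signed certificate:

ELASTIC_PASSWORD=$(kubectl get secrets --namespace=default elasticsearch-master-credentials -o jsonpath='{.data.password}' | base64 -d)

curl -k -u "elastic:$ELASTIC_PASSWORD" "https://localhost:9200/_cat/indices?v"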

Step 2: Deploying Fluentd

Fluentd collects logs from each Kubernetes node and forwards them to Elasticsearch.

We create a ConfigMap for Fluentd with our custom configuration:

# custom-fluentd.conf
<source>
  # Tail the container log files written on every node
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  <parse>
    # Extract the timestamp, log level, and message from each line
    @type regexp
    expression /^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)\s+(?<level>\w+)\s+(?<message>.*)$/
  </parse>
</source>

<match **>
  # Print parsed records to Fluentd’s own log for easy debugging
  @type stdout
</match>

Then create the ConfigMap from this file:

kubectl create configmap custom-fluentd-config --from-file=custom-fluentd.conf
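
The catch-all stdout match above only prints parsed records to Fluentd’s own log, which is handy for debugging. If you want this custom pipeline to ship records to Elasticsearch itself, a match block along these lines could replace it; the host, credentials, and TLS settings are assumptions based on the Elasticsearch deployment from Step 1:

<match kubernetes.**>
  # Send parsed records to the Elasticsearch service deployed in Step 1
  @type elasticsearch
  host elasticsearch-master
  port 9200
  scheme https
  ssl_verify false
  user elastic
  password <elastic-password>
  logstash_format true
</match>

The Fluentd chart used in the next step comes from the Bitnami repository; if it isn’t already added, add it first:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update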

Then, we deploy Fluentd using Helm:

helm install fluentd bitnami/fluentd \
  --set elasticsearch.host=elasticsearch-master \
  --set resources.requests.memory="200Mi" \
  --set resources.requests.cpu="100m" \
  --set replicas=1 \
  --set configMap=custom-fluentd-config

This command deploys Fluentd and configures it to connect to the Elasticsearch service we deployed earlier.

Check the status of Fluentd to ensure it’s running properly:

kubectl get all -l "app.kubernetes.io/name=fluentd,app.kubernetes.io/instance=fluentd"

If a Fluentd pod is in CrashLoopBackOff, delete the pod to allow Kubernetes to recreate it:

kubectl delete pod <pod-name>
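
If it keeps crashing, the logs of the previous container instance usually point at the cause (most often a configuration problem):

kubectl logs <pod-name> --previous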

Step 3: Deploying Kibana

Kibana provides a web UI to visualize logs stored in Elasticsearch.

We deploy Kibana using Helm:

helm install kibana elastic/kibana \
  --set replicas=1 \
  --set resources.requests.memory="500Mi" \
  --set resources.requests.cpu="500m" \
  --set service.type=NodePort

This command deploys Kibana and sets it up with the necessary resources, making it accessible via a NodePort service.

Monitor the Kibana pods until they are running:

kubectl get pods --namespace=default -l release=kibana -w

Once Kibana is running, you can forward the Kibana service port to your local machine:

kubectl port-forward svc/kibana-kibana 5601:5601

Now, you can access Kibana by navigating to http://localhost:5601/ in your web browser.

Note: You can get the login credentials for Elasticsearch by running the following commands:

kubectl get secrets --namespace=default elasticsearch-master-credentials -ojsonpath='{.data.username}' | base64 -d

kubectl get secrets --namespace=default elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d

Deploying Your Services

Deploying Service A

For deploying our Flask-based Service A, we use a Helm chart:

helm upgrade --install servicea-release ./efk-helm-chart -f ./efk-helm-chart/valuesServiceA.yaml

This command deploys Service A using the Helm chart located in ./efk-helm-chart with the values specified in valuesServiceA.yaml.

After deployment, ensure the service is running:

kubectl get pods -l app=service-a
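
You can also peek at the pod’s own output to confirm the app is emitting log lines before they ever reach Fluentd:

kubectl logs -l app=service-a --tail=20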

Deploying Service B

Similarly, we deploy Service B:

helm upgrade --install serviceb-release ./efk-helm-chart -f ./efk-helm-chart/valuesServiceB.yaml

Check the status of Service B:

kubectl get pods -l app=service-b

Accessing the Services

To access Service A, we forward a local port to the service running in the cluster:

kubectl port-forward svc/service-a 8080:80

Now, you can access Service A by visiting http://localhost:8080 in your browser.
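
Hitting the service a few times generates log entries that we can trace through the stack in the next section; the root path here is an assumption about the sample Flask app:

for i in $(seq 1 5); do curl -s http://localhost:8080/; echo; done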

Verifying Log Collection

After setting up the EFK stack, it’s crucial to verify that logs from Service A and Service B are being collected and stored in Elasticsearch, and that they can be visualized in Kibana.

You can verify that the services’ logs are being picked up by checking the Fluentd pod logs with the command below.

kubectl logs <fluentd pod name> | grep 'Service A was called!'

You can verify that logs are being received in Elasticsearch by checking the indices.

curl -X GET "localhost:9200/_cat/indices?v"
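
To confirm that the service logs made it all the way into Elasticsearch (and not just into Fluentd’s output), you can also search the indices for the log message itself; the logstash-* index pattern below assumes Fluentd’s default logstash_format naming:

curl -k -u "elastic:$ELASTIC_PASSWORD" "https://localhost:9200/logstash-*/_search?q=message:%22Service%20A%20was%20called%22&pretty"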

You can visualize the data in Kibana by following the official Kibana guide.

Conclusion:

The EFK stack provides a powerful solution for monitoring logs from services running on Kubernetes. With this setup, you can ensure that you’re always aware of what’s happening within your applications, allowing for quicker debugging and better overall system health.

Further Reading:

  • Kubernetes Documentation
  • Container Logging
  • Monitoring and Observability
