Hi Folks!!
In this article, we will set up logging for LitmusChaos experiments using the EFK stack (Elasticsearch + Fluentd + Kibana).
First, we will do a standard EFK setup on our Kubernetes cluster, and then we will see which filters we can use to fetch the logs of a particular experiment run.
If your cluster already has an EFK setup that collects logs for all pods (or at least for pods in the agent namespace), you can skip ahead to the part about filtering logs through the Kibana UI.
Let us look at the components in this stack:
Elasticsearch is a real-time, distributed, and scalable search engine that allows for full-text and structured search, as well as for analytics. We will be using Elasticsearch for storing, filtering, and indexing our logs.
Kibana is a powerful data visualization frontend and dashboard for Elasticsearch. It lets us explore our Elasticsearch log data through a web interface and build dashboards and queries to quickly answer questions and gain insight into our Kubernetes applications.
Fluentd is a popular open-source data collector that can tail container log files, filter and transform the log data, and deliver it to the Elasticsearch cluster, where it is indexed and stored.

Okay Okay, no more theory, without any further waiting let’s jump into it!!
EFK Stack configuration steps
The required manifests for Fluentd and the values.yaml files updated for version 7.17.1 of Elasticsearch and Kibana are available here for reference.
To make sure we use the same versions as this blog while following along, we can go ahead and clone the provided repository.
git clone git@github.com:Jonsy13/EFK-Setup.git
cd EFK-Setup
Creating separate namespaces for EFK setup
It is generally good practice to deploy applications that have different concerns and behavior in separate namespaces.
In our case, we create separate namespaces for EFK to keep the setup apart from the Litmus components, so that we reduce the risk of disaster to our EFK setup. We will install Kibana and Elasticsearch in the elastic-kibana namespace and Fluentd in the fluentd namespace.
kubectl create ns elastic-kibana 
kubectl create ns fluentd
Installing the Elasticsearch cluster for indexing and storing logs
We will be using the official Helm chart for installing Elasticsearch. First, we have to add the Elastic Helm repo using the below command
helm repo add elastic https://helm.elastic.co

Next, we can install Elasticsearch from the Helm repository we just added, as below
cd ElasticSearch
helm install --values values.yaml elasticsearch elastic/elasticsearch -n elastic-kibana

To verify the installation, we can check the pods of our Elasticsearch cluster:
kubectl get pods -n elastic-kibana --selector="app=elasticsearch-master"
Just after the installation, the output of the above command might look like the below:

It can take some time for all pods to become ready, because every pod of the Elasticsearch cluster has a readiness probe that only passes once the cluster has been configured.
Once the readiness probe succeeds, all the pods will move to the Ready state.
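Instead of re-running the get pods command until everything is ready, we can let kubectl do the waiting. The below is a minimal sketch that reuses the namespace and label from the install step; the function name and the 300s default timeout are my own choices for illustration and not part of the original setup.

```shell
# Block until every Elasticsearch pod reports Ready, or give up after a timeout.
# The namespace and label are the ones used in the install step;
# the default timeout of 300s is an arbitrary value for this sketch.
wait_for_elasticsearch() {
  timeout="${1:-300s}"
  kubectl wait pod \
    --namespace elastic-kibana \
    --selector app=elasticsearch-master \
    --for=condition=Ready \
    --timeout="$timeout"
}
```

Calling wait_for_elasticsearch (or wait_for_elasticsearch 600s for a slower cluster) returns as soon as all matching pods are Ready and fails if the timeout is reached first.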

Installing Kibana for checking/filtering logs using a web interface
The official Elastic Helm repository that we added earlier also provides a chart for Kibana. So, we will use the official chart to install Kibana using the below command
cd ../Kibana
helm install --values values.yaml kibana elastic/kibana -n elastic-kibana

To verify the installation, we can go ahead and check that all Kibana resources were created successfully.
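One possible way to do this check from the command line is sketched below. It assumes the release name kibana and the elastic-kibana namespace from the install command; the function name is only illustrative. Since Elasticsearch lives in the same namespace, its resources will be listed too.

```shell
# Show the Helm release status, then the Kubernetes objects in the namespace.
# Assumption: the release is named "kibana", as in the install command.
check_kibana() {
  helm status kibana --namespace elastic-kibana &&
    kubectl get deployment,service,pod --namespace elastic-kibana
}
```

Running check_kibana should report the release as deployed and list the Kibana Deployment, Service, and Pod next to the Elasticsearch ones.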

Installing Fluentd for fetching logs and sending them for indexing and storage
We will install Fluentd as a DaemonSet so that we can collect logs for the pods scheduled on every node from the /var/log/ directory mounted from each node.
cd ../FluentD
kubectl apply -f ./
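After applying the manifests, it can be worth confirming that one Fluentd pod is running per node. The below is a small sketch that only assumes the fluentd namespace created earlier; the function name is made up for this example.

```shell
# List the DaemonSet(s) in the fluentd namespace, then the pods with their nodes.
# For a healthy DaemonSet the DESIRED and READY columns show the same number.
check_fluentd() {
  kubectl get daemonset --namespace fluentd &&
    kubectl get pods --namespace fluentd --output wide
}
```

The NODE column in the second listing shows which node each Fluentd pod is collecting logs from.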

There are 3 files present in the FluentD directory.
manifest.yml: Manifest for deploying Fluentd as a DaemonSet, fetching logs from nodes, and exporting them to Elasticsearch.
rbac.yml: Manifest for applying the ClusterRole, ClusterRoleBinding, and ServiceAccount for Fluentd.
config-map.yml: This ConfigMap contains the Fluentd configuration file as its data, which is mounted in all pods of the Fluentd DaemonSet.
If we look at the data, we can see that the configuration sets index_name to fluentd-k8s for the logs it processes. This index_name identifies our logs as a source when we filter in the Kibana UI.
  index_name fluentd-k8s
The complete ConfigMap is available here
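Once Fluentd has shipped some logs, we can ask Elasticsearch directly whether an index with this name exists. The below is a hedged sketch: it assumes Elasticsearch is reachable without authentication at http://localhost:9200 (for example through a port-forward), and the function name is hypothetical.

```shell
# Ask Elasticsearch which indices match the Fluentd index name.
# Assumptions: Elasticsearch answers at the given URL (default
# http://localhost:9200) and does not require authentication.
list_fluentd_indices() {
  es_url="${1:-http://localhost:9200}"
  curl --silent --show-error "${es_url}/_cat/indices/fluentd-k8s*?v"
}
```

If the chart created its default service name, something like kubectl port-forward svc/elasticsearch-master 9200:9200 -n elastic-kibana in another terminal makes the URL available; please check the actual service name with kubectl get svc -n elastic-kibana first.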
Accessing Logs on Kibana Web Interface
With the current setup, Fluentd fetches logs from all nodes and sends them to Elasticsearch. To visualize the logs stored in Elasticsearch, we will use the Kibana UI. So, let’s open the Kibana UI and check what we are getting there!!
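How you reach the Kibana UI depends on how the service is exposed in your cluster. If nothing else is set up, a port-forward is one simple option. The sketch below assumes the service is called kibana-kibana (release name plus chart name) and listens on port 5601; both are assumptions, so confirm them with kubectl get svc -n elastic-kibana.

```shell
# Forward local port 5601 to the Kibana service.
# Assumption: the service is named "kibana-kibana"; pass another name as the
# first argument if your cluster shows a different one.
forward_kibana() {
  service="${1:-kibana-kibana}"
  kubectl port-forward "service/${service}" 5601:5601 --namespace elastic-kibana
}
```

While forward_kibana is running, the UI should be reachable at http://localhost:5601 in the browser.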

When we open Kibana for the first time, we are presented with a Welcome screen as below.

Since we already have our Elasticsearch setup, we don’t have to add any new integrations. We can click on the Explore on my own button to move forward to the homepage.

As we want to explore the logs stored in Elasticsearch, we have to create an index pattern that identifies the logs we want to explore. (Depending on the use case, you may want to explore only some of the logs.)
Let’s click on the Discover card and create an index pattern. After clicking on the Discover card, you will get a screen as below.

It can take some time for the logs to arrive in Elasticsearch. Once some logs are available, we can proceed and click on the Create index pattern button.

In the above screen, we can add a pattern for identifying our logs. I have added the pattern fluentd-k8s* because fluentd-k8s is the index_name we set in the Fluentd configuration. If the pattern matches a source in Elasticsearch, the right side of the screen shows Your index pattern matches 1 source, which means that Kibana can match the pattern with our logs.
After adding the name, we can click on the Create index pattern button to save it. Once the pattern is saved, Kibana shows a table of the fields that are available in the log entries.

Now that we have added our index pattern, we can click on the Discover button in the sidebar to see the logs we are getting from Elasticsearch based on our index pattern.

As we can see, we are getting logs for all the available pods. To fetch logs for specific pods only, we will use label filters.
Agent Plane Components pod logs
We have 5 components that are part of the agent plane. Let’s look at the required filters for each of them one by one.
- chaos-operator
For chaos-operator, we can use the below label as a filter
kubernetes.labels.app.kubernetes.io/component=operator

- chaos-exporter
For chaos-exporter, we can use the below label as a filter
kubernetes.labels.app=chaos-exporter

- subscriber
For subscriber, we can use the below label as a filter
kubernetes.labels.app=subscriber

- event-tracker
For event-tracker, we can use the below label as a filter
kubernetes.labels.app=event-tracker

- workflow-controller
For workflow-controller, we can use the below label as a filter
kubernetes.labels.app=workflow-controller
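If one of these filters returns nothing in Kibana, a quick cross-check is to see whether the pods in the cluster really carry that label. The below is a minimal sketch: the function name is made up, and the default namespace litmus is an assumption about where the agent is installed, so pass your own namespace as the second argument if it differs.

```shell
# List agent-plane pods that carry a given label and print all their labels.
# Assumption: the agent runs in the "litmus" namespace unless $2 says otherwise.
agent_pods() {
  selector="$1"
  namespace="${2:-litmus}"
  kubectl get pods --namespace "$namespace" --selector "$selector" --show-labels
}
```

For example, agent_pods app=subscriber or agent_pods app.kubernetes.io/component=operator uses the Kubernetes label itself, that is, the filter from above without the kubernetes.labels. prefix.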

Experiment pod logs
For an experiment run, all the experiment-related pods of that particular run are labeled with a unique chaosUID, which shows up in the logs as kubernetes.labels.chaosUID.
So, how will you get the value of this label?
Well, once an experiment is completed, we can get the chaosUID of that experiment run from the chaos results tab, as can be seen in the image below.
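As an alternative to the UI, the label can also be read from the chaos pods themselves while they still exist in the cluster (depending on your cleanup settings they may be removed after the run). The below sketch assumes the pods run in the litmus namespace unless another one is passed; the function name is only illustrative.

```shell
# List pods that have a chaosUID label, showing chaosUID and name as columns.
# Assumption: chaos pods live in the "litmus" namespace unless $1 says otherwise.
chaos_pods() {
  namespace="${1:-litmus}"
  kubectl get pods --namespace "$namespace" --selector chaosUID --label-columns chaosUID,name
}
```

The CHAOSUID column of the output then holds the value to use for kubernetes.labels.chaosUID.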

So, using the label kubernetes.labels.chaosUID, we can get the logs for that experiment run. With this label alone, however, we get the combined logs of the experiment, runner, and helper pods, because all chaos pods related to one experiment run carry it. But we want to distinguish between experiment, runner, and helper pod logs as well, right?
For that, we have to add one more label that is unique to the experiment pod, the runner pod, or the helper pod. Let’s see how to achieve this.
For experiment pod logs, we can use the below labels:
    1. kubernetes.labels.chaosUID: <chaosUID>
    2. kubernetes.labels.name: <experiment-name>
For runner pod logs, we can use the below labels:
    1. kubernetes.labels.chaosUID: <chaosUID>
    2. kubernetes.labels.app_kubernetes_io/component: "chaos-runner"
For helper pod logs, we can use the below labels:
    1. kubernetes.labels.chaosUID: <chaosUID>
    2. kubernetes.labels.name: <experiment-name>-helper-*
Let’s take the pod-network-loss experiment as an example.
In my case, the chaosUID is 839d4a33-8e3a-4df5-b05a-2d136160c90d
Now, to filter logs of only the experiment pod, we will use these 2 labels:
1. kubernetes.labels.chaosUID: 839d4a33-8e3a-4df5-b05a-2d136160c90d
2. kubernetes.labels.name: pod-network-loss

To filter logs of only the helper pod, we will use these 2 labels:
1. kubernetes.labels.chaosUID: 839d4a33-8e3a-4df5-b05a-2d136160c90d
2. kubernetes.labels.name: pod-network-loss-helper-*

To filter logs of only the runner pod, we will use these 2 labels:
1. kubernetes.labels.chaosUID: 839d4a33-8e3a-4df5-b05a-2d136160c90d
2. kubernetes.labels.app_kubernetes_io/component: "chaos-runner"
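The same label pairs can also be typed into the query bar of Discover as a KQL (Kibana Query Language) expression instead of adding two filters by hand. The helper below is only a sketch for assembling such a string; the function names are made up, and it assumes the label fields are indexed so that exact and wildcard matches behave as expected. Values containing * are left unquoted, because KQL treats a quoted value as a literal phrase.

```shell
# Build one KQL clause of the form: field : value
# Values are wrapped in double quotes unless they contain a wildcard (*).
kql_clause() {
  case "$2" in
    *\**) printf '%s : %s' "$1" "$2" ;;
    *)    printf '%s : "%s"' "$1" "$2" ;;
  esac
}

# Combine the chaosUID clause with one pod-specific clause.
# Arguments: <chaosUID> <second field> <second value>
kql_for_run() {
  printf '%s and %s\n' \
    "$(kql_clause kubernetes.labels.chaosUID "$1")" \
    "$(kql_clause "$2" "$3")"
}
```

For the helper pod of the example above, kql_for_run 839d4a33-8e3a-4df5-b05a-2d136160c90d kubernetes.labels.name 'pod-network-loss-helper-*' prints kubernetes.labels.chaosUID : "839d4a33-8e3a-4df5-b05a-2d136160c90d" and kubernetes.labels.name : pod-network-loss-helper-*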

So, as we can see, we can now index our logs under an index name, store them in Elasticsearch, and view and filter them in Kibana.
Well, this was my first try with EFK, and that’s all I have learned about how it works so far. I will be writing more as I get to work more on this. Thanks for checking this blog out!!
Conclusion
Feel free to check out our ongoing project — Chaos Center and do let us know if you have any suggestions or feedback regarding the same. You can always submit a PR if you find any required changes.
Make sure to reach out to us if you have any feedback or queries. Hope you found the blog informative!
If chaos engineering is something that excites you or if you want to know more about cloud-native chaos engineering, don’t forget to check out our LitmusChaos website, ChaosHub, and the LitmusChaos repo. Do leave a star if you find it insightful. 😊
I would love to invite you to our community to stay connected with us and get your Chaos Engineering doubts cleared. To join our Slack, please follow the steps below!
Step 1: Join the Kubernetes Slack using the following link: https://slack.k8s.io/
Step 2: Join the #litmus channel on the Kubernetes Slack or use this link after joining the Kubernetes Slack: https://slack.litmuschaos.io/
Cheers!


