Ship AKS logs using a Fluentd DaemonSet

Fluentd is an open source data collector and a great option because of its flexibility. This implementation uses a Fluentd DaemonSet to collect Kubernetes logs and send them to Logz.io. The Kubernetes DaemonSet ensures that some or all nodes run a copy of a pod.

The image used in this integration comes pre-configured for Fluentd to gather all logs from the Kubernetes node environment and append the proper metadata to the logs. If you prefer to customize your Fluentd configuration, you can edit it before it’s deployed.

The latest version pulls the image from logzio/logzio-fluentd. Previous versions pulled the image from logzio/logzio-k8s.

Fluentd will fetch all existing logs, as it is not able to ignore older logs.

For troubleshooting this solution, see our user guide.

Sending logs from nodes with taints

If you want to ship logs from nodes that have a taint, make sure that the taint's key and value are listed in your DaemonSet or Deployment configuration as follows:

tolerations:
- key: 
  operator: 
  value: 
  effect: 

To check whether a node has taints and to display the taint keys, run:

kubectl get nodes -o json | jq ".items[]|{name:.metadata.name, taints:.spec.taints}"

You need to use Helm client with version v3.9.0 or above.

K8S version compatibility

Your Kubernetes version may affect your options, as follows:

K8S 1.19.3 or later - If you’re running K8S 1.19.3 or later, be sure to use the DaemonSet that supports the containerd runtime. It can be downloaded and customized from logzio-daemonset-containerd.yaml.
K8S 1.16 or earlier - If you’re running K8S 1.16 or earlier, you may need to manually change the API version in your DaemonSet to apiVersion: rbac.authorization.k8s.io/v1beta1.

The API versions of ClusterRole and ClusterRoleBinding are found in logzio-daemonset-rbac.yaml and logzio-daemonset-containerd.yaml.

If you are running K8S 1.17 or later, the DaemonSet is set to use apiVersion: rbac.authorization.k8s.io/v1 by default. No change is needed.
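The version rule above can be sketched as a small shell helper. This is only an illustration: the function name rbac_api_version is hypothetical, and in practice you would feed it the server version reported by your own cluster.

```shell
# Hypothetical helper (not part of the Logz.io tooling): map a Kubernetes
# version string to the RBAC apiVersion described above.
# Usage: rbac_api_version <major.minor[.patch]>
rbac_api_version() {
  version="${1#v}"            # tolerate a leading "v", e.g. v1.19.3
  major="${version%%.*}"      # text before the first dot
  rest="${version#*.}"        # text after the first dot
  minor="${rest%%.*}"         # text before the next dot
  minor="${minor%%[!0-9]*}"   # drop non-numeric suffixes such as "19+"
  if [ "$major" -gt 1 ] || [ "$minor" -ge 17 ]; then
    echo "rbac.authorization.k8s.io/v1"
  else
    echo "rbac.authorization.k8s.io/v1beta1"
  fi
}

rbac_api_version 1.16.15   # K8S 1.16 or earlier
rbac_api_version v1.19.3   # K8S 1.17 or later
```

If the helper reports v1beta1, edit the apiVersion of ClusterRole and ClusterRoleBinding in the DaemonSet file before applying it.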

Multiline logs

Fluentd’s basic configuration may split longer, multiline logs into multiple logs, one log per line. You can use the Fluentd multiline parser plugin to control this behavior.

See the section “Configuring Fluentd to concatenate multiline logs using a plugin” below for details about using the plugin.

Deploy logzio-k8s with default configuration

For most environments, we recommend using the default configuration. However, you can deploy a custom configuration if your environment needs it.

Deploy Fluentd as a DaemonSet on Kubernetes

Create a monitoring namespace

Your DaemonSet will be deployed under the namespace monitoring.

kubectl create namespace monitoring

Store your Logz.io credentials

Save your Logz.io shipping credentials as a Kubernetes secret.

kubectl create secret generic logzio-logs-secret \
  --from-literal=logzio-log-shipping-token='<<LOG-SHIPPING-TOKEN>>' \
  --from-literal=logzio-log-listener='https://<<LISTENER-HOST>>:8071' \
  -n monitoring

Replace the placeholders to match your specifics. (They are indicated by the double angle brackets << >>):

Replace <<LOG-SHIPPING-TOKEN>> with the token of the account you want to ship to.
Replace <<LISTENER-HOST>> with the host for your region. For example, listener.logz.io if your account is hosted on AWS US East, or listener-nl.logz.io if hosted on Azure West Europe. The required port depends on whether HTTP or HTTPS is used: HTTP = 8070, HTTPS = 8071.
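As a minimal sketch of the port rule above, the following shell helper builds the listener URL from a scheme and a listener host. The function name listener_url is hypothetical and only serves the illustration.

```shell
# Hypothetical helper: build the listener URL for the secret.
# Usage: listener_url <http|https> <listener-host>
listener_url() {
  scheme="$1"
  host="$2"
  case "$scheme" in
    http)  port=8070 ;;   # HTTP listener port
    https) port=8071 ;;   # HTTPS listener port
    *) echo "unsupported scheme: $scheme" >&2; return 1 ;;
  esac
  echo "${scheme}://${host}:${port}"
}

listener_url https listener.logz.io
```

The printed value is what goes after logzio-log-listener= in the kubectl create secret command.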

Deploy the DaemonSet

For an RBAC cluster:

kubectl apply -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/logzio-daemonset-rbac.yaml -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/configmap.yaml

For a non-RBAC cluster:

kubectl apply -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/logzio-daemonset.yaml -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/configmap.yaml

For container runtime Containerd:

kubectl apply -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/logzio-daemonset-containerd.yaml -f https://raw.githubusercontent.com/logzio/logzio-k8s/master/configmap.yaml

Check Logz.io for your logs

Give your logs some time to get from your system to ours, and then open OpenSearch Dashboards.

If you still don’t see your logs, see Kubernetes log shipping troubleshooting.

Deploy logzio-k8s with custom configuration

You can customize the configuration of your Fluentd container by editing either your DaemonSet or your ConfigMap.

Create a monitoring namespace

Your DaemonSet will be deployed under the namespace monitoring.

kubectl create namespace monitoring

Store your Logz.io credentials

Save your Logz.io shipping credentials as a Kubernetes secret.

kubectl create secret generic logzio-logs-secret \
  --from-literal=logzio-log-shipping-token='<<LOG-SHIPPING-TOKEN>>' \
  --from-literal=logzio-log-listener='https://<<LISTENER-HOST>>:8071' \
  -n monitoring

Replace the placeholders to match your specifics. (They are indicated by the double angle brackets << >>):

Replace <<LOG-SHIPPING-TOKEN>> with the token of the account you want to ship to.
Replace <<LISTENER-HOST>> with the host for your region. For example, listener.logz.io if your account is hosted on AWS US East, or listener-nl.logz.io if hosted on Azure West Europe. The required port depends on whether HTTP or HTTPS is used: HTTP = 8070, HTTPS = 8071.

Configure Fluentd

There are 3 DaemonSet options: RBAC, non-RBAC, and containerd. Download the relevant DaemonSet and open it in your text editor to edit it.

If you wish to make advanced changes to your Fluentd configuration, you can download and edit the ConfigMap YAML file.

Environment variables

The following environment variables can be edited directly in the DaemonSet without editing the ConfigMap.

Parameter	Description	Default
output_include_time	To append a timestamp to your logs when they’re processed, `true`. Otherwise, `false`.	`true`
buffer_type	Specifies which plugin to use as the backend	`file`
buffer_path	Path of the buffer	`/var/log/Fluentd-buffers/stackdriver.buffer`
buffer_queue_full_action	Controls the behavior when the queue becomes full	`block`
buffer_chunk_limit	Maximum size of a chunk allowed.	`2M`
buffer_queue_limit	Maximum length of the output queue.	`6`
flush_interval	Interval, in seconds, to wait before invoking the next buffer flush.	`5s`
max_retry_wait	Maximum interval, in seconds, to wait between retries.	`30s`
num_threads	Number of threads to flush the buffer.	`2`
INCLUDE_NAMESPACE	Sends logs from all namespaces by default. To send logs from specific k8s namespaces, specify them in the following format, space delimited: `kubernetes.var.log.containers.**_<<NAMESPACE-TO-INCLUDE>>_** kubernetes.var.log.containers.**_<<ANOTHER-NAMESPACE>>_**`.	`""`
KUBERNETES_VERIFY_SSL	Enable to validate SSL certificates.	`true`
FLUENT_FILTER_KUBERNETES_URL	URL to the API server. This parameter isn’t part of the default DaemonSet. You can set it to retrieve additional Kubernetes metadata for logs from the Kubernetes API server.	`null`
AUDIT_LOG_FORMAT	The format of your audit logs. If your audit logs are in json format, set to `audit-json`.	`audit`
CRI	The container runtime interface (CRI) of the cluster. Set to `docker` in `logzio-daemonset` and `logzio-daemonset-rbac`, and to `containerd` in `logzio-daemonset-containerd`. The ConfigMap uses this variable to determine which includes the fluent.conf file needs when the configuration must be adjusted for the CRI.

Good to know

If FLUENT_FILTER_KUBERNETES_URL is not specified, the environment variables KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT will be used, as long as both of them are present. Typically, they are present when running Fluentd in a pod.
Note that FLUENT_FILTER_KUBERNETES_URL does not appear in the default environment variable list in the DaemonSet. If you wish to use this variable, you’ll have to add it manually to the DaemonSet’s environment variables.

Deploy the DaemonSet

For the RBAC DaemonSet:

kubectl apply -f /path/to/logzio-daemonset-rbac.yaml -f /path/to/configmap.yaml

For the non-RBAC DaemonSet:

kubectl apply -f /path/to/logzio-daemonset.yaml -f /path/to/configmap.yaml

For container runtime Containerd:

kubectl apply -f /path/to/logzio-daemonset-containerd.yaml -f /path/to/configmap.yaml

Check Logz.io for your logs

Give your logs some time to get from your system to ours, and then open OpenSearch Dashboards.

If you still don’t see your logs, see Kubernetes log shipping troubleshooting.

Disabling systemd input

To suppress Fluentd system messages, set the FLUENTD_SYSTEMD_CONF environment variable to disable in your Kubernetes environment.

Exclude logs from certain namespaces

If you wish to exclude logs from certain namespaces, add the following to your Fluentd configuration:

<match kubernetes.var.log.containers.**_NAMESPACE_**>
  @type null
</match>

Replace NAMESPACE with the name of the namespace you need to exclude logs from.

If you need to specify multiple namespaces, add another kubernetes.var.log.containers.**_NAMESPACE_** pattern to the match directive above, separated by a space, as follows:

<match kubernetes.var.log.containers.**_NAMESPACE1_** kubernetes.var.log.containers.**_NAMESPACE2_**>
  @type null
</match>
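For longer namespace lists, the match block can be generated rather than typed by hand. The following is a minimal sketch that reuses the tag pattern shown above. The function name exclude_namespaces_block is hypothetical, and the namespaces in the usage line are only examples.

```shell
# Hypothetical helper: print a Fluentd match block that discards logs
# from every namespace passed as an argument.
exclude_namespaces_block() {
  if [ "$#" -eq 0 ]; then
    echo "usage: exclude_namespaces_block <namespace>..." >&2
    return 1
  fi
  patterns=""
  for ns in "$@"; do
    # One tag pattern per namespace, separated by single spaces.
    patterns="${patterns:+$patterns }kubernetes.var.log.containers.**_${ns}_**"
  done
  printf '<match %s>\n  @type null\n</match>\n' "$patterns"
}

exclude_namespaces_block kube-system monitoring
```

Paste the printed block into your Fluentd configuration.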

Configuring Fluentd to concatenate multiline logs using a plugin

Fluentd splits multiline logs by default. If your original logs span multiple lines, you may find that they arrive in your Logz.io account split into several partial logs.

The Logz.io Docker image comes with a pre-built Fluentd filter plugin that can be used to concatenate multiline logs. The plugin is named fluent-plugin-concat and you can view the full list of configuration options in the GitHub project.

Example

The following is an example of a multiline log sent from a deployment on a k8s cluster:

2021-02-08 09:37:51,031 - errorLogger - ERROR - Traceback (most recent call last):
File "./code.py", line 25, in my_func
1/0
ZeroDivisionError: division by zero

Fluentd’s default configuration will split the above log into 4 logs, 1 for each line of the original log. In other words, each line break (\n) causes a split.

To avoid this, you can use fluent-plugin-concat and customize the configuration to meet your needs. The additional configuration is added to the values.yml file.

For the above example, we could use the following regular expression to mark the start of each log:

<filter **>
  @type concat
  # The key that holds the text of the multiline log.
  key message
  # This regular expression identifies the first line of each log.
  multiline_start_regexp /^[0-9]{4}-[0-9]{2}-[0-9]{2}/
</filter>
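To see what a start-line rule does to the example log, the grouping can be simulated locally. The sketch below is not the plugin's implementation: it uses awk as a stand-in, the function names are hypothetical, and the second log line in the sample was made up for this illustration. It joins lines with a space purely for readability.

```shell
# Sample input: the traceback from the example above, followed by a
# made-up single-line log so that two records are present.
sample_logs() {
  cat <<'EOF'
2021-02-08 09:37:51,031 - errorLogger - ERROR - Traceback (most recent call last):
File "./code.py", line 25, in my_func
1/0
ZeroDivisionError: division by zero
2021-02-08 09:37:52,100 - errorLogger - INFO - retrying
EOF
}

# A line that starts with a date opens a new record; any other line is
# appended to the record currently being built. The date pattern is
# spelled out digit by digit because some awk versions lack {n} support.
concat_multiline() {
  awk '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
      if (record != "") print record
      record = $0
      next
    }
    { record = (record == "" ? $0 : record " " $0) }
    END { if (record != "") print record }
  '
}

sample_logs | concat_multiline
```

The five input lines come out as two records: the full traceback on one line, and the second log on its own line.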