Giulia Di Pietro
Feb 01, 2022
In this series of blog posts on observing the NGINX controller, we have already looked at collecting metrics with Prometheus and turning logs into metrics with LogQL. Today, we will look at building a log stream pipeline that exposes metrics with Fluentd.
In short, we'll first quickly go over which metrics we extracted in part 2 of this series with LogQL. And then, we will look at which plugins we need to use in our log stream pipeline, specifically the Prometheus plugin. In the end, we will jump into the tutorial.
Extracting metrics with LogQL
As we have learned in the previous blog post and video about observing NGINX with Loki, logs are a data source that provides deeper insights into our ingress controller.
The advantage is that we could easily adjust the logging format produced by our ingress controller by adding extra metadata related to our context:
1. The name of the ingress
2. The namespace of the ingress resource
3. The target service
4. ...
In the case of NGINX, you can adjust the logging format by configuring the NGINX config file (in Kubernetes, it’s the config map).
NGINX offers several variables exposing specific data:
1. $bytes_sent to report the number of bytes sent to the client
2. $request_time to report the request processing time in seconds
3. $status to report the HTTP status code
And specific variables for Kubernetes:
1. $resource_name
2. $resource_type
3. $resource_namespace
4. $service
Adding an extended logging format exposes the metadata required to observe our ingress controller properly. With this metadata in place, we could utilize LogQL to parse it and create metrics with dimensions that let us split statistics by service, ingress, and so on. That was precisely what we were looking for.
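For reference, here's a minimal sketch of what such an extended logging format could look like in the controller's ConfigMap, assuming the NGINX Inc. ingress controller and its log-format key. The release name, namespace, and exact field list are illustrative and must match the pattern you parse later:
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-nginx-ingress   # hypothetical Helm release name
  namespace: nginx            # hypothetical namespace
data:
  log-format: "$remote_addr [$time_local] $request $status $bytes_sent $request_time $proxy_host $upstream_response_time $upstream_status $resource_name $resource_type $resource_namespace $service"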
Let’s look at the metrics we were able to report in our previous episode.
We built three LogQL queries to extract three KPIs. The first one reports the number of HTTP requests per ingress:
count by (service) ( rate(
{app="nginx-nginx-ingress",pod="nginx-nginx-ingress-7b95f794c4-kv5rb"}
|="ingress" !="stderr"
| pattern `<remote_addr> [<time_local>] <method> <request> <_> <status> <body_bytes_sent> <request_time> <_> <upstream_response_time> <_> <status_ingress> <resource_name> <_> <namespace_patter> <service>`
[1m]))
Then we have the following to report the number of bytes per ingress:
sum by (service) ( sum_over_time( {app="nginx-nginx-ingress",pod="nginx-nginx-ingress-7b95f794c4-kv5rb"}
|="ingress" !="stderr"
| pattern `<remote_addr> [<time_local>] <method> <request> <_> <status> <body_bytes_sent> <request_time> <_> <upstream_response_time> <_> <status_ingress> <resource_name> <_> <namespace_patter> <service>`
| unwrap body_bytes_sent[1m])
)
And this to report the p99 response time of the request per service:
quantile_over_time(0.99,
{app="ngninx-NGINX-ingress",pod="ngninx-NGINX-ingress-7b95f794c4-kv5rb"}
|="ingress" !="stderr"
| pattern `<remote_addr> [<time_local>] <method> <request> <_> <status> <body_bytes_sent> <request_time> <_> <upstream_response_time> <_> <status_ingress> <resource_name> <_> <namespace_patter> <service>`
| status<400
| unwrap request_time[1m]) by (service)
In short, we count the log lines to report the number of requests, unwrap body_bytes_sent to report the bytes exchanged, and unwrap request_time for the response time.
We can also go further by counting requests per status code.
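Along the same lines, a query like this sketch (not from the previous episode; it reuses the same label selector and pattern) could count requests per status code:
sum by (status) ( count_over_time(
{app="nginx-nginx-ingress",pod="nginx-nginx-ingress-7b95f794c4-kv5rb"}
|="ingress" !="stderr"
| pattern `<remote_addr> [<time_local>] <method> <request> <_> <status> <body_bytes_sent> <request_time> <_> <upstream_response_time> <_> <status_ingress> <resource_name> <_> <namespace_patter> <service>`
[1m]))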
What are the various plugins required for our log stream pipeline?
As described in our episode dedicated to Fluentd, there are many plugins available to collect (input plugins), parse, filter, and forward (output plugins) our logs.
In our case, we need to collect logs produced by our NGINX ingress controller.
We utilize the input plugin called “tail,” which reads logs from log files (container logs). Since we're only interested in logs produced by our ingress controller, we can match only its container log files by using a wildcard in the path of the log file:
path /var/log/containers/*nginx*.log
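Put together, a minimal tail source could look like the sketch below; the tag and pos_file values are illustrative and should be adapted to your setup:
<source>
  @type tail
  tag nginx.ingress
  path /var/log/containers/*nginx*.log
  pos_file /var/log/fluentd-nginx-containers.log.pos
  read_from_head true
  <parse>
    # Placeholder that keeps the raw line; replaced by the regexp parser detailed below
    @type none
  </parse>
</source>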
Then, we need to parse our log stream to extract the available metadata.
Fluentd has several parsers available:
1. Regexp
2. Apache2
3. NGINX
4. Syslog
5. CSV
6. JSON
The NGINX parser supports only the default logging format of NGINX. Since we extended that format, we need the regexp parser and an “expression” that specifies the structure of our log:
expression /^(?<logtime>\S+)\s+(?<logtype>\S+)\s+(?<type>\w+)\s+(?<ip>\S+)\s+\[(?<time_local>[^\]]*)\]\s+(?<method>\S+)\s+(?<request>\S+)\s+(?<httpversion>\S*)\s+(?<status>\S*)\s+(?<bytes_sent>\S*)\s+(?<responsetime>\S*)\s+(?<proxy>\S*)\s+(?<upstream_responsetime>\S*)\s+(?<ressourcename>\S*)\s+(?<upstream_status>\S*)\s+(?<ingress_name>\S*)\s+(?<ressource_type>\S*)\s+(?<ressource_namesapce>\S*)\s+(?<service>\w*)/
Each named capture group (?<name>...) extracts the corresponding value and stores it in a new log stream key.
Since the objective is to expose metrics, we also need to specify the type of each key using “types”:
types ip:string,time_local:string,method:string,request:string,httpversion:string,status:integer,bytes_sent:integer,responsetime:float,request_time:float,proxy:string,upstream_responsetime:float,ressourcename:string,ressource_type:string,ressource_namesapce:string,service:string
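Within the tail source, the expression and types settings live in the <parse> section, replacing the placeholder from the earlier sketch (the expression is shortened here for readability):
<parse>
  @type regexp
  expression /^(?<logtime>\S+)\s+(?<logtype>\S+)\s+(?<type>\w+)\s+(?<ip>\S+)\s+.../
  types status:integer,bytes_sent:integer,responsetime:float,upstream_responsetime:float
</parse>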
Once the metadata is parsed and stored in new log stream keys, we can either format our new stream to send it to any observability backend or expose the metrics collected in a Prometheus format directly by using the Prometheus plugin.
The purpose of the Prometheus Fluentd plugin is to expose a Prometheus exporter on the Fluentd agent. The source object specifies how to expose the Prometheus data:
1. bind
2. port
3. metrics_path
Here's an example:
<source>
@type prometheus
bind 0.0.0.0
port 9914
metrics_path /metrics
</source>
The Prometheus plugin also provides prometheus_monitor, which automatically generates metrics related to your log stream pipeline. Fluentd will automatically expose:
Metrics for output:
1. fluentd_output_status_retry_count
2. fluentd_output_status_num_errors
3. fluentd_output_status_emit_count
4. fluentd_output_status_retry_wait: current retry_wait computed from the last retry time and the next retry time
5. fluentd_output_status_emit_records
6. fluentd_output_status_write_count
7. fluentd_output_status_rollback_count
8. fluentd_output_status_flush_time_count: in milliseconds, from Fluentd v1.6.0
9. fluentd_output_status_slow_flush_count: from Fluentd v1.6.0
Metrics for buffer:
1. fluentd_output_status_buffer_total_bytes
2. fluentd_output_status_buffer_stage_length: from Fluentd v1.6.0
3. fluentd_output_status_buffer_stage_byte_size: from Fluentd v1.6.0
4. fluentd_output_status_buffer_queue_length
5. fluentd_output_status_buffer_queue_byte_size: from Fluentd v1.6.0
6. fluentd_output_status_buffer_newest_timekey: from Fluentd v1.6.0
7. fluentd_output_status_buffer_oldest_timekey: from Fluentd v1.6.0
8. fluentd_output_status_buffer_available_space_ratio: from Fluentd v1.6.0
This plugin automatically exposes metrics related to the output and buffer plugins.
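Enabling it only requires an additional source; a minimal sketch:
<source>
  @type prometheus_monitor
  <labels>
    host ${hostname}
  </labels>
</source>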
If you want to report metrics related to the data collected by the tail plugin, you can also use prometheus_tail_monitor. This plugin reports:
1. fluentd_tail_file_position: current bytes which the plugin has read from the file
2. fluentd_tail_file_inode: inode of the file
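It's enabled the same way, with its own source block:
<source>
  @type prometheus_tail_monitor
</source>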
All those plugins are relevant if you're interested in reporting observability related to your log stream pipeline.
In our case, we're looking to expose data from our log stream.
For that, we need the Prometheus filter plugin, which can instrument metrics from our records. Here's the basic structure:
<filter **>
  @type prometheus
  <metric>
    name message_foo_counter
    type counter
    desc The total number of foo in message.
    key foo
  </metric>
</filter>
The <metric> section describes the metric:
1. name: the name of the metric
2. type: counter, gauge, histogram, or summary (the four Prometheus data types)
3. desc: a description of the metric
4. key: the record key that you want to expose as a metric
Then you can specify your labels:
<labels>
  label_name value
</labels>
For example:
<labels>
  host ${hostname}
</labels>
Here, ${hostname} is a placeholder that the plugin expands to the hostname of the node where Fluentd runs; keys extracted from our log stream can also be used as label values.
The labels section can be positioned either inside the metric section or next to the Prometheus plugin declaration. In the first case, it describes labels specific to that metric:
<metric>
  <labels>
  </labels>
</metric>
In the second case, it describes the common labels shared by all the metrics that you're exposing:
@type prometheus
<labels>
</labels>
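Putting this together for our NGINX use case, a filter along these lines could turn the parsed records into Prometheus metrics. The match tag and metric names are illustrative; the record keys come from the regexp parser above, and depending on the plugin version record keys are referenced as ${key} or with record-accessor syntax ($.key):
<filter nginx.ingress>
  @type prometheus
  # Counts every parsed access-log line (no key: each record increments the counter by 1)
  <metric>
    name nginx_ingress_requests_total
    type counter
    desc The total number of requests handled by the ingress controller.
  </metric>
  # Sums the bytes_sent key extracted by the regexp parser
  <metric>
    name nginx_ingress_bytes_sent_total
    type counter
    desc The total number of bytes sent to clients.
    key bytes_sent
  </metric>
  # Common label for both metrics, taken from the parsed record
  <labels>
    service ${service}
  </labels>
</filter>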
To follow this tutorial, you'll need:
1. A Kubernetes cluster
2. The NGINX ingress controller deployed
3. Prometheus deployed
In this tutorial, we will:
1. Customize the logging format
2. Create a Docker container with the Prometheus plugin installed
3. Deploy Fluentd as a daemonset using our customized Docker image
4. Deploy our log stream pipeline with a ConfigMap
5. Build a Prometheus dashboard
6. Expose those new metrics in Dynatrace by simply adding the Dynatrace annotations on the Fluentd daemonset (see the sketch after this list)
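For step 6, here is a minimal sketch of those annotations on the daemonset's pod template; the port and path are assumed to match the Prometheus source configured earlier:
spec:
  template:
    metadata:
      annotations:
        metrics.dynatrace.com/scrape: "true"
        metrics.dynatrace.com/port: "9914"
        metrics.dynatrace.com/path: "/metrics"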
Here are the all-important links to the step-by-step tutorials:
1. YouTube video: How to produce Prometheus metrics out of logs using Fluentd
2. GitHub page: How to observe an NGINX ingress controller