What is the Keptn Lifecycle Toolkit?

Keptn is an event-driven framework that utilizes CloudEvents to communicate with different tools for specific events, for example, testing.

Giulia Di Pietro

Feb 20, 2023

12 minute read

This episode and blog post of Is It Observable is dedicated to a CNCF project you may already know about: Keptn. Mainly, I'll talk about the Keptn Lifecycle Toolkit and introduce the following topics:

1. The challenges we face when deploying applications in Kubernetes.
2. What the Keptn Lifecycle Toolkit is.
3. Which CRDs it adds to your cluster.
4. How to create pre and post-checks.
5. What observability data Keptn provides out of the box.

As usual, I’ll wrap up the episode with a tutorial that shows you how to get started with Keptn.

# Deploying applications in Kubernetes

Do you remember how we used to manage deployments a few years back?

Back then, we handled only one or two releases per year, with everyone working in the same Git or SVN repository. Code was built and deployed to the infrastructure, where we ran tests to tune the environment and adjust its size.

Nowadays, we're more Agile. Every team is in charge of a component or a microservice, so each team has its own repository with its own set of validations. Deployments are separate from component to component, but we still need to run validation when integrating them and figure out the right autoscaling policy and observability.

When deploying an application in Kubernetes, you must create one deployment file per microservice. This deployment file includes the following:

1. One or several containers.
2. A liveness (health) probe that lets Kubernetes determine whether your container is still running.
3. A readiness probe that Kubernetes uses to decide whether to route traffic to the pod.
4. The number of required replicas.
5. A Kubernetes service for the network layer.

In the end, Kubernetes will automatically check if the required objects used by your workload are already deployed. Once the container is created and started, Kubernetes will interact with the readiness probe and the health probe to update your service's state and network route.

Since Kubernetes has no notion of application or dependency, a pod that requires a specific database will crash if that database hasn't been deployed yet. After repeated crashes, the pod ends up in the CrashLoopBackOff state.
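To make the back-off concrete: the kubelet restarts a crashing container with an exponentially growing delay (10s, 20s, 40s, and so on), capped at five minutes. The sketch below simply reproduces that schedule. The function name and its parameters are illustrative and not part of any Kubernetes API.

```typescript
// Illustrative helper (not a Kubernetes API): delay in seconds before each restart attempt.
function crashLoopBackoffDelays(restarts: number, baseSeconds = 10, capSeconds = 300): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < restarts; attempt++) {
    // Double the delay on every attempt until the cap is reached.
    delays.push(Math.min(baseSeconds * Math.pow(2, attempt), capSeconds));
  }
  return delays;
}

console.log(crashLoopBackoffDelays(7)); // [10, 20, 40, 80, 160, 300, 300]
```

So a pod that keeps crashing because its database is missing quickly ends up waiting minutes between restarts, which is why getting the deployment order right matters.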

The correct deployment sequence is crucial to prevent this. You can also run tests within your CI/CD process to validate the deployment with tools like:

1. Jenkins
2. GitHub Actions
3. GitLab
4. Azure DevOps
5. Spinnaker
6. Argo CD
7. And more.

Most CI/CD solutions allow you to build tasks that interact with one of your tools for deployment, security scans, tests, and more. Integrations may already exist within the tooling, but sometimes you need to build your own.

Consequently, once you have created a reliable pipeline, you usually copy pieces of it to reuse them in the pipelines of other projects. After some time, you realize that this copy-pasted piece is in most of your pipelines.

It becomes an issue when you change tooling because you need to update all your pipelines for your organization.

And thus, Keptn was created.

Keptn is an event-driven framework that utilizes CloudEvents to communicate with different tools for specific events, for example, testing. If you need to change a tool, you can just subscribe the new one to Keptn. You can also use Keptn for its quality gates feature, which evaluates the success of a task based on SRE methodology. It also covers production use cases, such as managing your remediation or canary releases.
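As a rough illustration of this event-driven model, the sketch below builds a CloudEvents 1.0 envelope of the kind a control plane could send to a testing tool. The required attributes (specversion, id, source, type) come from the CloudEvents specification. The event type naming scheme, the source value, and the payload fields are assumptions made for this example.

```typescript
// Minimal CloudEvents 1.0 envelope; only the attributes used in this sketch are modeled.
interface SketchCloudEvent<T> {
  specversion: "1.0";
  id: string;
  source: string;
  type: string;
  datacontenttype: string;
  data: T;
}

// Hypothetical helper: wraps a payload in an event that asks for a task to be run.
function makeTriggeredEvent<T>(task: string, id: string, data: T): SketchCloudEvent<T> {
  return {
    specversion: "1.0",
    id,
    source: "example/keptn-control-plane", // assumed source identifier
    type: `sh.keptn.event.${task}.triggered`, // assumed naming scheme
    datacontenttype: "application/json",
    data,
  };
}

console.log(JSON.stringify(makeTriggeredEvent("test", "evt-001", { service: "checkoutservice" }), null, 2));
```

Swapping the testing tool then means subscribing a different consumer to the same event type, rather than editing every pipeline.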

Several months ago, Keptn released an LTS version, v1.0. Currently, the official release is 1.1.0.

The Keptn team tries to make onboarding easier and faster for cloud-native applications. So they decided to create the Keptn Lifecycle Toolkit, designed for applications running in a Kubernetes cluster. Let’s have a look at this tool in more detail.

# What is the Keptn Lifecycle Toolkit?

The Keptn Lifecycle Toolkit (KLT) is an operator designed to help teams manage their deployments in Kubernetes by adding the notion of “applications.”

Keptn defines an application by attaching workloads to it. This helps you:

1. Add pre-evaluation before deploying any workload or application in your cluster.
2. Find out when an application (not workload) is ready and running.
3. Check the application health in a declarative way.
4. Standardize pre/post-deployment tasks.
5. Provide out-of-the-box observability related to your deployment cycle.

The beauty of Keptn is that it does not require any specific configuration. All you need to do is define your application, add particular annotations to the workload, and annotate the namespace where the Keptn Lifecycle Toolkit will observe your deployments and applications.

Keptn only checks applications deployed in namespaces that carry a specific annotation:

            keptn.sh/lifecycle-toolkit: "enabled"

Then you simply need to add the right annotation to your workload:

            keptn.sh/app: myAwesomeAppName
            keptn.sh/workload: myAwesomeWorkload
            keptn.sh/version: myAwesomeWorkloadVersion

Or you can also use the Kubernetes-recommended labels to annotate your workload:

            app.kubernetes.io/part-of: myAwesomeAppName
            app.kubernetes.io/name: myAwesomeWorkload
            app.kubernetes.io/version: myAwesomeWorkloadVersion

Keptn will first look for the Keptn annotations/labels; if it does not find them, it will look at the Kubernetes labels.

As you can see, Keptn also keeps track of the version of your workload. You can define, for example, that application 1.0 is made of workload A, version 0.20, and workload B with version 0.40, etc. That is why the version number needs to be annotated.

If you don’t define any version number in your annotations, Keptn will use the version number of your container only if you have only one container in your workload.
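The lookup order described above can be sketched as a small function. This is my reading of the rules in this post, not the operator's actual code: annotations and labels are merged into a single map for simplicity, and the way the image tag is extracted is an assumption.

```typescript
type WorkloadMeta = Record<string, string | undefined>;

interface WorkloadIdentity {
  app: string;
  workload: string;
  version: string;
}

// Assumption: the tag is whatever follows the last ":" in the last path segment of the image.
function imageTag(image: string): string | undefined {
  const lastSegment = image.slice(image.lastIndexOf("/") + 1);
  const colon = lastSegment.lastIndexOf(":");
  return colon >= 0 ? lastSegment.slice(colon + 1) : undefined;
}

// Keptn keys win; the Kubernetes recommended labels are the fallback.
function resolveWorkloadIdentity(meta: WorkloadMeta, images: string[]): WorkloadIdentity | undefined {
  const app = meta["keptn.sh/app"] ?? meta["app.kubernetes.io/part-of"];
  const workload = meta["keptn.sh/workload"] ?? meta["app.kubernetes.io/name"];
  const [onlyImage] = images;
  // The container version is used only when the workload has exactly one container.
  const fallbackVersion = images.length === 1 && onlyImage !== undefined ? imageTag(onlyImage) : undefined;
  const version = meta["keptn.sh/version"] ?? meta["app.kubernetes.io/version"] ?? fallbackVersion;
  if (!app || !workload || !version) return undefined;
  return { app, workload, version };
}

console.log(resolveWorkloadIdentity({ "keptn.sh/app": "shop", "app.kubernetes.io/name": "cart" }, ["example/cart:1.4.0"]));
```

A workload with two containers and no version annotation returns undefined here, which mirrors the reason the version needs to be annotated explicitly.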

KLT has a great feature that allows you to create pre-deployment and post-deployment checks for your workload or application. To do this, you need to add the following annotations to your workload:

            keptn.sh/pre-deployment-tasks: verify-infrastructure-problems
            keptn.sh/post-deployment-tasks: slack-notification,performance-test

Here, verify-infrastructure-problems, slack-notification, and performance-test are the names of KeptnTaskDefinition resources, one of the new CRDs introduced by KLT. For pre- or post-deployment tasks at the application level, you define them in your KeptnApp instead.

If the pre-deployment check is unsuccessful, the workload will be stuck in a pending state.

Another great feature is creating pre and post-deployment evaluations on your workload and the application.

For example, let’s say you have ten services within your application, and they require two CPU cores from your cluster. You could imagine creating a pre-deployment evaluation that retrieves metrics from a metric provider like Prometheus or Dynatrace. In the case of Prometheus, you would need to build the right PromQL to retrieve the current cores available in your cluster and add the right objective.

For example:

            - name: available cores
              query: "sum(node:node_num_cpu:sum ) - sum(kube_pod_container_resource_requests{resource='cpu'})"
              evaluationTarget: ">2"

Keptn will check the required number of cores before deployment. If the conditions aren’t met, your workload will be stuck in a pending state.
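To show what such a gate boils down to, here is a minimal sketch that compares a measured value with a target string like the one above. It only handles the operators >, >=, <, and <= followed by a number, which is an assumption for this example and not a statement about everything Keptn accepts.

```typescript
// Compare a measured value with a target such as ">2" or "<=0.5".
// Assumption: only >, >=, <, <= followed by a plain number are handled in this sketch.
function meetsTarget(value: number, target: string): boolean {
  const match = /^\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$/.exec(target);
  if (!match) throw new Error(`Unsupported evaluation target: ${target}`);
  const operator = match[1];
  const threshold = Number(match[2]);
  switch (operator) {
    case ">":
      return value > threshold;
    case ">=":
      return value >= threshold;
    case "<":
      return value < threshold;
    default:
      return value <= threshold; // the only remaining operator is "<="
  }
}

console.log(meetsTarget(3, ">2")); // true
console.log(meetsTarget(2, ">2")); // false
```

With the objective above, a cluster reporting three free cores passes the gate, while exactly two free cores leaves the workload pending.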

You can also use this notion of evaluation in your workload by adding the correct annotations:

            keptn.sh/pre-deployment-evaluations: my-evaluation-definition
            keptn.sh/post-deployment-evaluations: my-eval-definition

Keptn includes:

1. An operator composed of a scheduler and a control manager. The scheduler is in charge of scheduling your workload in your cluster and adding the various checks. The control manager interacts with the admission controller to ensure that the Keptn scheduler is involved in the deployment process of your workload.
2. A ConfigMap for each component.
3. A service exposing Prometheus metrics about your deployments.
4. Several CRDs.

When a deployment is triggered, the operator receives a request from the admission controller. At that moment, the operator creates or updates the Keptn workload object and the deployment pod to ensure the Keptn scheduler will manage the deployment sequence.

The operator will then create a KeptnWorkloadInstance related to this deployment. It also creates the KeptnTask objects based on the information in your pre and post-deployment checks. All tasks run through a Kubernetes job.

After running the checks, Keptn will schedule the evaluations you have defined. Those evaluations retrieve the metrics from the KeptnEvaluationProvider that you have configured. So, in the end, we deploy a workload that is attached to a Keptn app.

If the app has a pre-deployment check or evaluation, it is triggered before any other deployment. If the pre-checks and evaluations are successful, the workload is scheduled. Entering the creating state requires an available node for your workload and successful pre-deployment checks. The workload then moves through the creating state to the running state.

Once the workload is in a running state, Keptn triggers the post-deployment checks and evaluations. If your checks fail, then your workload ends.

Once the workloads of the application are running, Keptn triggers the post-deployment checks and evaluations defined at the application level.
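The sequence described above can be summarized as an ordered list of gates, where the first failing gate blocks everything after it. The phase names below are labels I made up for the steps in this post, not Keptn API identifiers, and the runner is a simplified sketch for a single workload.

```typescript
// Illustrative phase labels for the steps described above (not Keptn identifiers).
const deploymentPhases = [
  "app pre-deployment tasks",
  "app pre-deployment evaluations",
  "workload pre-deployment tasks",
  "workload pre-deployment evaluations",
  "workload deployment",
  "workload post-deployment tasks",
  "workload post-deployment evaluations",
  "app post-deployment tasks",
  "app post-deployment evaluations",
] as const;

type DeploymentPhase = (typeof deploymentPhases)[number];

interface SequenceResult {
  completed: DeploymentPhase[];
  failedAt?: DeploymentPhase;
}

// Run the phases in order and stop at the first one whose check fails.
function runDeploymentSequence(check: (phase: DeploymentPhase) => boolean): SequenceResult {
  const completed: DeploymentPhase[] = [];
  for (const phase of deploymentPhases) {
    if (!check(phase)) return { completed, failedAt: phase };
    completed.push(phase);
  }
  return { completed };
}

// A failing workload pre-deployment evaluation means the deployment itself never starts.
console.log(runDeploymentSequence((phase) => phase !== "workload pre-deployment evaluations"));
```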

# Keptn CRDs

Keptn introduced several CRDs, but we will only configure a few to manage our deployment sequence, specifically:

1. KeptnApp
2. KeptnTaskDefinition
3. KeptnEvaluationDefinition
4. KeptnEvaluationProvider
5. KeptnMetric

The other CRDs (KeptnWorkload, KeptnWorkloadInstance, KeptnTask, and KeptnAppVersion) are automatically created by the operator.

But you should know that:

1. The KeptnWorkload is created from the annotations added to the deployments. It also maps to the currently active workload instance.
2. The KeptnWorkloadInstance keeps track of the status of the various checks defined in the workload. Therefore, you'll have to look at the KeptnWorkloadInstance to understand the status of your various checks.
3. The KeptnTask is created when your validation tasks are running. It creates a Kubernetes job to run your check and keeps track of the status of that job.
4. The KeptnEvaluation is created to collect the metrics from your metric provider. Therefore, you'll also need to look at the KeptnEvaluation to understand the status of your various evaluations.

So let's start with the main object, the KeptnApp.

# KeptnApp

The KeptnApp maps the various workloads that make up our application. The CRD takes the name of the application (application name). For example:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnApp
            metadata:
              name: otel-demo-applicaton
              namespace: otel-demo
            spec:
              version: "1.2.1"
              workloads:
                - name: accountingservice
                  version: 1.2.1
                - name: adservice
                  version: 1.2.1
                - name: cartservice
                  version: 1.2.1
                - name: checkoutservice
                  version: 1.2.1
                - name: currencyservice
                  version: 1.2.1
                - name: shippingservice
                  version: 1.2.1
                - name: emailservice
                  version: 1.2.1
                - name: redis
                  version: 1.2.1
                - name: recommendationservice
                  version: 1.2.1

The application name you add in your workload annotations must match the name of the KeptnApp, here otel-demo-applicaton:

            app.kubernetes.io/part-of: otel-demo-applicaton
            app.kubernetes.io/name: checkoutservice
            app.kubernetes.io/version: "1.2.1"

In the spec of the KeptnApp, you add the pre-deployment tasks and evaluations and the post-deployment tasks and evaluations with the following:

            preDeploymentEvaluations:
              - app-pre-deploy-eval-1
            preDeploymentTasks:
              - app-check-network
            postDeploymentEvaluations:
              - response-time-check
            postDeploymentTasks:
              - load-test

In this example, app-check-network is the first task in your deployment process, followed by the evaluation app-pre-deploy-eval-1.

Once all the workloads are deployed, Keptn runs the task load-test and ends with the evaluation response-time-check.

app-check-network and load-test are the names of KeptnTaskDefinition resources. response-time-check and app-pre-deploy-eval-1 are the names of KeptnEvaluationDefinition resources.
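If you generate your manifests from code rather than writing them by hand, the same structure can be assembled as a plain object. The helper below is hypothetical and only mirrors the fields shown in this post. It pins every workload to the application version, which is a simplification.

```typescript
interface WorkloadRef {
  name: string;
  version: string;
}

// Hypothetical helper: builds a KeptnApp manifest as a plain object (kubectl also accepts JSON).
function buildKeptnApp(appName: string, namespace: string, version: string, workloadNames: string[]) {
  return {
    apiVersion: "lifecycle.keptn.sh/v1alpha2",
    kind: "KeptnApp",
    metadata: { name: appName, namespace },
    spec: {
      version,
      // Simplification: every workload uses the application version.
      workloads: workloadNames.map((workloadName): WorkloadRef => ({ name: workloadName, version })),
      preDeploymentTasks: ["app-check-network"],
      preDeploymentEvaluations: ["app-pre-deploy-eval-1"],
      postDeploymentTasks: ["load-test"],
      postDeploymentEvaluations: ["response-time-check"],
    },
  };
}

console.log(JSON.stringify(buildKeptnApp("otel-demo-applicaton", "otel-demo", "1.2.1", ["cartservice", "redis"]), null, 2));
```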

KeptnApp also has an optional property: revision. If you don't define it, the default value is 1.

The revision field lets you unblock a deployment that is pending because a task failed due to an issue in its task definition. Once the definition is fixed, you can increment the revision number and reapply the KeptnApp. Keptn will then restart the task that was in error.
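A tiny sketch of that bump, assuming the KeptnApp is handled as a plain object before being reapplied. The interface only models the two fields needed here, and a missing revision is treated as the default of 1.

```typescript
interface RevisableApp {
  spec: { version: string; revision?: number };
}

// Return a copy with the revision incremented; a missing revision counts as the default of 1.
function bumpRevision(app: RevisableApp): RevisableApp {
  return { ...app, spec: { ...app.spec, revision: (app.spec.revision ?? 1) + 1 } };
}

console.log(bumpRevision({ spec: { version: "1.2.1" } })); // spec.revision becomes 2, version stays "1.2.1"
```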

# Keptn Task Definition

KeptnTaskDefinition is a CRD used to define tasks that the Keptn Lifecycle Toolkit can run as part of a pre and post-deployment check of a deployment.

Currently, tasks are written as Deno scripts. So you can code your check in the task definition like this:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnTaskDefinition
            metadata:
              name: deployment-hello
            spec:
              function:
                inline:
                  code: |
                    console.log("Deployment Task has been executed");

Or refer to a script stored in an HTTP endpoint:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnTaskDefinition
            metadata:
              name: kafka-check
            spec:
              function:
                httpRef:
                  url: https://raw.githubusercontent....
                parameters:
                  map:
                    host: example-kafka

In the future, we should be able to have other runtimes, especially running a container image directly.

A task definition can get its code in three ways: inline in the definition, from a script served over HTTP, or by referencing another KeptnTaskDefinition.
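These three options map naturally onto a discriminated union. The sketch below is a hypothetical helper that produces the function section of a task definition from one of the three sources. The field names follow the YAML examples in this post, and the helper itself is not part of Keptn.

```typescript
// The three ways a task definition can point at its code.
type TaskSource =
  | { kind: "inline"; code: string }
  | { kind: "httpRef"; url: string }
  | { kind: "functionRef"; name: string };

// Build the "function" section of a KeptnTaskDefinition spec, with optional flat parameters.
function toFunctionSpec(source: TaskSource, parameters: Record<string, string> = {}) {
  const extras = Object.keys(parameters).length > 0 ? { parameters: { map: parameters } } : {};
  switch (source.kind) {
    case "inline":
      return { inline: { code: source.code }, ...extras };
    case "httpRef":
      return { httpRef: { url: source.url }, ...extras };
    case "functionRef":
      return { functionRef: { name: source.name }, ...extras };
  }
}

// The URL below is a placeholder, not a real script location.
console.log(JSON.stringify(toFunctionSpec({ kind: "httpRef", url: "https://example.com/check.ts" }, { host: "example-kafka" })));
```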

# Keptn Evaluation Definition

A KeptnEvaluationDefinition is a CRD used to define evaluation tasks that the Keptn Lifecycle Toolkit can run as part of a workload or application's pre and post-evaluation phases.

A Keptn evaluation definition looks like the following:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnEvaluationDefinition
            metadata:
              name: pre-deploy-eval-ressources
            spec:
              source: prometheus-provider
              objectives:
                - name: available cores
                  query: "sum(node:node_num_cpu:sum ) - sum(kube_pod_container_resource_requests{resource='cpu'})"
                  evaluationTarget: ">=2"

You need to define a source that links to an existing KeptnEvaluationProvider, and then the various objectives. The query in this example is the PromQL that Keptn will send to the KeptnEvaluationProvider named prometheus-provider.

To keep track of the results of your evaluation, you should look at the KeptnEvaluation objects, using:

            kubectl get KeptnEvaluation -n yournamespace

# Keptn Evaluation Provider

A KeptnEvaluationProvider is a CRD used to define your evaluation provider, which supplies the data for the pre and post-evaluation phases of a workload or application.

A Keptn evaluation provider looks like the following:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnEvaluationProvider
            metadata:
              name: prometheus-provider
              namespace: otel-demo
            spec:
              targetServer: "http://prometheus-kube-prometh..."

You can also add a secret name if it is required to log in to your Prometheus instance.

# Keptn Metric

A KeptnMetric is a CRD used to define a metric in Keptn. You specify the query, and Keptn retrieves this metric at a regular interval.

KeptnMetric is designed to help you reuse data across multiple KeptnEvaluationDefinition resources.

So we first define our metric:

            apiVersion: metrics.keptn.sh/v1alpha1
            kind: KeptnMetric
            metadata:
              name: keptnmetric-sample
              namespace: otel-demo
            spec:
              provider:
                name: "prometheus-provider"
              query: "sum(node:node_num_cpu:sum ) - sum(kube_pod_container_resource_requests{resource='cpu'})"
              fetchIntervalSeconds: 5

Then if you want to use this Keptn metric in your Keptn evaluation definition, you'll need to specify keptn-metric as the source. The name of the metric will be the reference of our Keptn metric

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnEvaluationDefinition
            metadata:
              name: pre-deploy-eval-resources
            spec:
              source: keptn-metric
              objectives:
                - name: keptnmetric-sample
                  evaluationTarget: ">=2"
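The benefit of this indirection is that the provider is queried once per interval, however many evaluations read the value. The sketch below illustrates that idea with a lazily refreshed cache. It is not how the operator is implemented (which fetches in the background), and the class and its members are names I chose for the example.

```typescript
// Sketch of the idea behind KeptnMetric: fetch on an interval, let many evaluations reuse the value.
class CachedMetric {
  private value: number | undefined = undefined;
  private fetchedAtMs = Number.NEGATIVE_INFINITY;
  private readonly fetchValue: () => number; // stands in for the provider query
  private readonly fetchIntervalSeconds: number;

  constructor(fetchValue: () => number, fetchIntervalSeconds: number) {
    this.fetchValue = fetchValue;
    this.fetchIntervalSeconds = fetchIntervalSeconds;
  }

  // Return the cached value, refreshing it only when the interval has elapsed.
  read(nowMs: number): number {
    if (this.value === undefined || nowMs - this.fetchedAtMs >= this.fetchIntervalSeconds * 1000) {
      this.value = this.fetchValue();
      this.fetchedAtMs = nowMs;
    }
    return this.value;
  }
}

let queryCount = 0;
const availableCores = new CachedMetric(() => { queryCount++; return 4; }, 5);
availableCores.read(0);    // first evaluation triggers a query
availableCores.read(1000); // second evaluation reuses the cached value
console.log(queryCount);   // 1
```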

# Creating custom Keptn tasks

Creating a check or task requires building a Deno script. You can code your script directly in your KeptnTaskDefinition, but you'll soon realize that you often have the same type of checks for many deployments. Therefore, it makes more sense to build a Deno script that is stored in GitHub or at a custom HTTP endpoint.

When a task is triggered, Keptn creates a KeptnTask and a Kubernetes job to execute your script.

To make it reusable, your script will probably have input parameters, which can be plain text or sensitive.

Plain text parameters will be stored in a JSON object called DATA. Therefore you'll need first to collect your DATA object and then access the various properties for your script.

For example:

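            // Keptn passes the plain-text parameters as a JSON string in DATA.
            const text = Deno.env.get("DATA");
            try {
                const data = JSON.parse(text ?? "{}");
                // Fail the task if the host from the parameters cannot be resolved.
                await Deno.resolveDns(data.host, "A");
            } catch (error) {
                console.error("Could not resolve hostname");
                Deno.exit(1);
            }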

In this example, the script needs a host to run a DNS resolution, so it reads the data.host property from the DATA object.

To use this script in your KeptnTaskDefinition, you'll configure it as follows:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnTaskDefinition
            metadata:
              name: payment-check
            spec:
              function:
                httpRef:
                  url: https://raw.githubusercontent....
                parameters:
                  map:
                    host: example-paymentservice

The Keptn Lifecycle Toolkit passes the values defined inside the map field as a JSON object (DATA). For the moment, Keptn doesn’t support multi-level maps.
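Because only flat maps are passed, a reusable script can validate its input up front instead of failing halfway through. The helper below is a sketch that parses the DATA payload and rejects anything that isn't a flat object of strings. Treating every value as a string is an assumption on my side, and the function name is made up.

```typescript
// Parse the DATA payload and reject nested or non-string values.
// Assumption: all parameter values arrive as strings.
function parseTaskParameters(raw: string | undefined): Record<string, string> {
  const parsed: unknown = raw === undefined ? {} : JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("DATA must be a JSON object");
  }
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(parsed)) {
    if (typeof value !== "string") throw new Error(`Parameter "${key}" must be a string`);
    result[key] = value;
  }
  return result;
}

console.log(parseTaskParameters('{"host":"example-paymentservice"}').host); // example-paymentservice
```

In a Deno task, you would call it with the environment variable, for example parseTaskParameters(Deno.env.get("DATA")).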

If you need to use sensitive parameters, like tokens or passwords, you can use Kubernetes secrets. To do this, you pass the secret to the function using the secureParameters field.

For example:

            apiVersion: lifecycle.keptn.sh/v1alpha2
            kind: KeptnTaskDefinition
            metadata:
              name: slack-notification-dev
            spec:
              function:
                functionRef:
                  name: slack-notification
                parameters:
                  map:
                    textMessage: "This is my configuration"
                secureParameters:
                  secret: slack-token

In this example, the Slack token is stored in a secret called "slack-token."

In your script, you'll need to retrieve an object named SECURE_DATA (a JSON object with all the properties defined in your secret).

For example:

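            // SECURE_DATA holds the content of the Kubernetes secret as a JSON string.
            const text = Deno.env.get("SECURE_DATA");
            let data: Record<string, string> = {};
            if (text != undefined) {
                data = JSON.parse(text);
            }
            // Assumption: the original snippet does not show how body is built; a minimal payload is used here.
            const body = JSON.stringify({ text: "Deployment task executed" });
            console.log(body);
            const resp = await fetch("https://hooks.slack.com/servic..." + data.slack_hook, {
                method: "POST",
                body,
            });
            console.log(resp);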

Keptn also adds a context variable to your scripts that contains information related to the application and the workload: context. It provides all details about where your task has been triggered.

# Extracting observability data with Keptn

The Keptn Lifecycle Toolkit was built to manage the deployments of your application, but also to give you the right visibility into your deployment process by supplying observability data.

All the metrics produced by Keptn will be exposed on the Keptn controller manager on the default metric port 2222.

Here are the types of metrics produced by Keptn:

1. Application-related metrics
2. Workload-related metrics
3. Task-related metrics
4. Evaluation-related metrics
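Since these metrics are exposed in the Prometheus text format, you can also inspect them without a full Prometheus setup. The parser below is a deliberately small sketch that handles only simple sample lines. The metric name used in the example is made up for illustration and is not taken from the Keptn metric list.

```typescript
interface MetricSample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Parse the subset of the Prometheus text format needed here: name{label="v",...} value
// Assumption: label values contain no commas or escaped quotes; comments and blank lines are skipped.
function parseMetrics(text: string): MetricSample[] {
  const samples: MetricSample[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.charAt(0) === "#") continue;
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(trimmed);
    if (!match) continue;
    const [, metricName = "", rawLabels = "", rawValue = ""] = match;
    const labels: Record<string, string> = {};
    for (const pair of rawLabels.split(",")) {
      const labelMatch = /^\s*([a-zA-Z_][a-zA-Z0-9_]*)="(.*)"\s*$/.exec(pair);
      if (!labelMatch) continue;
      const [, labelName = "", labelValue = ""] = labelMatch;
      labels[labelName] = labelValue;
    }
    samples.push({ name: metricName, labels, value: Number(rawValue) });
  }
  return samples;
}

// Hypothetical sample output, as it could look on the metrics endpoint.
console.log(parseMetrics('# HELP demo\nkeptn_demo_deployments_total{namespace="otel-demo"} 3\n'));
```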

Keptn provides Grafana dashboards by default, giving you an overview of Keptn, of the Keptn application, and the Keptn workload.

A dashboard can be deployed to Grafana using the sidecar feature provided by the Prometheus operator. You need to create a ConfigMap containing your dashboard as a JSON file, with the label “grafana_dashboard:1”.

On top of the Prometheus metrics, Keptn natively generates traces of your deployments. The application dashboard shows you these traces, but it requires a Jaeger data source, so you'll need to add Jaeger to your cluster and adjust your OpenTelemetry collector pipeline.

The scheduler and the control manager require you to define the right OTEL_COLLECTOR_URL.

In an upcoming release, the Keptn Lifecycle Toolkit will have a KeptnConfig CRD allowing you to define the OpenTelemetry collector URL to which the produced spans are sent.

In the end, you can easily build your DORA metrics with the traces and metrics.
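As a simple illustration, two DORA-style indicators can be derived from a list of deployment outcomes. The record shape below is an assumption for this sketch. In practice, you would fill it from the Keptn metrics or traces rather than by hand.

```typescript
interface DeploymentRecord {
  finishedAt: Date;
  succeeded: boolean;
}

// Deployment frequency and change failure rate over a time window (field names are assumptions).
function doraSummary(records: DeploymentRecord[], windowDays: number) {
  const failures = records.filter((record) => !record.succeeded).length;
  return {
    deploymentsPerDay: records.length / windowDays,
    changeFailureRate: records.length === 0 ? 0 : failures / records.length,
  };
}

const sampleRecords: DeploymentRecord[] = [
  { finishedAt: new Date("2023-02-01T10:00:00Z"), succeeded: true },
  { finishedAt: new Date("2023-02-01T15:00:00Z"), succeeded: false },
  { finishedAt: new Date("2023-02-02T09:00:00Z"), succeeded: true },
  { finishedAt: new Date("2023-02-02T17:00:00Z"), succeeded: true },
];
console.log(doraSummary(sampleRecords, 2)); // { deploymentsPerDay: 2, changeFailureRate: 0.25 }
```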

# Tutorial

This tutorial will use the Keptn Lifecycle Toolkit with the OpenTelemetry Demo application.

We will create a first release of the application with a couple of pre and post-deployment checks.

Then we will create a second release of the application, adding an application-level post-deployment check in the form of a load test, plus pre and post-deployment evaluations.

This is the perfect moment to learn how to use KeptnTaskDefinition and KeptnEvaluationDefinition.

We will also look at the metrics and traces produced by Keptn.

For this tutorial, we will require the following:

1. A Kubernetes cluster
2. The Nginx ingress controller to expose Grafana
3. The Prometheus operator
4. The cert-manager
5. The OpenTelemetry operator
6. The Keptn Lifecycle Toolkit demo
7. The OpenTelemetry demo customized for the Keptn Lifecycle Demo by adding an annotation to the deployment files.