How to Monitor App Uptime with Kubernetes and Prometheus



Scotty Parlor

July 4, 2020

Read Time 17 min

                

I wrote an article back in April titled SFS DevOps 2: Kubernetes, Docker, Selenium, Jenkins, Python, and Prometheus. It's a long article with a basic project setup. Although I definitely recommend combing through it to help your understanding and portfolio, I decided to write a smaller spin-off tutorial to serve as an intro to monitoring with Prometheus.



If you haven't heard it by now, monitoring is critical for any software development position, especially DevOps. Employers LOVE someone who can catch issues before their clients do.

I hope this intro serves as a brief and helpful walkthrough to get Prometheus up and running on your local machine.

What do we need?


An Exporter


We are going to need an application that has an exporter. An exporter is what we tie into our application to present application metrics so that Prometheus can scoop them up and interpret them. In our case, we will have a standard Falcon HTTP app with a Falcon exporter library (falcon-prometheus) included.
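As a quick preview (assuming you end up running the app locally on port 8000, which is the port we expose later), hitting the exporter endpoint looks like this:

# Request the metrics endpoint the exporter adds to the app
curl http://localhost:8000/metrics
# The response is plain-text Prometheus metrics that the Prometheus server scrapes on a schedule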

A Kube Cluster


Since we are practicing and really don't need anything special, we are going to use minikube on our local machine. We will be applying a permissions YAML file for a cluster role, a cluster role binding, and a service account. Don't worry if that sounds confusing...this is going to be a copy/paste job almost directly from the Kubernetes docs.

ConfigMaps with Prometheus/Alertmanager Configs


We will take the Prometheus configs and tie them to configmaps in our cluster. Although you may not see why just yet, this will pay off for rapid deployments, since we won't have to rebuild an image every time a config changes.

AND THAT'S IT.
Now, before we begin, here is what my working directory looks like at the end. If you'd like to show a repo off to employers, try this setup:

├── configs
│   ├── alertmanager.yml
│   ├── prometheus.yml
│   └── rules.yml
├── deployment.yml
└── permissions
    └── permissions.yml

The Falcon App with Exporter:


There are really two ways for you to go about doing this section.

A.) The easy way. You can pull scottyfullstack/hello from my Docker Hub. It's a Falcon app that, when you hit the /metrics route, shows the response from the Falcon exporter.

B.) Make your own app below and create your own docker image.

  1. First, you will need to pip install the following (for Docker, put these in a requirements.txt file):

falcon==2.0.0
gunicorn==20.0.4
falcon-prometheus==0.1.0

  2. Create hello.py
import falcon
from falcon_prometheus import PrometheusMiddleware


class HelloResource(object):
    def on_get(self, req, resp):
        resp.status = falcon.HTTP_200
        resp.body = "Hello, World!"


class Page2Resource(object):
    def on_get(self, req, resp):
        resp.status = falcon.HTTP_200
        resp.body = "This is the second page!"


# PrometheusMiddleware collects request metrics; routing it at /metrics below exposes them
prometheus = PrometheusMiddleware()
app = falcon.API(middleware=prometheus)

hello = HelloResource()
page2 = Page2Resource()

app.add_route('/', hello)
app.add_route('/page2', page2)
app.add_route('/metrics', prometheus)

And that's all there is to the Falcon app. Notice the PrometheusMiddleware instance is also registered as the handler for the /metrics route.
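Whichever option you went with, here is a rough sketch of getting an image in place. The option A tag below matches the kubectl command used later in the post; for option B, the Dockerfile and registry name are yours to choose, and the run command assumes a gunicorn entrypoint bound to port 8000:

# Option A: pull the prebuilt image (same tag used in the kubectl create deployment step below)
docker pull sparlor/scottyfullstack:hello

# Option B: install dependencies, sanity-check /metrics locally, then build and push your own image
pip install -r requirements.txt
gunicorn hello:app --bind 0.0.0.0:8000
docker build -t <your-dockerhub-user>/hello:latest .
docker push <your-dockerhub-user>/hello:latest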



The Minikube Cluster & Permissions:

If you are new to Kubernetes and have not downloaded minikube or kubectl yet, head to the official installation docs for each and follow the quick installation instructions.

Fire up minikube with

minikube start
This will take a few minutes. Once your cluster starts, check it out with

kubectl get nodes

Cool, your kube cluster is up and running on your laptop. Let's now deploy the Falcon app there and expose it (then use minikube to open it in the browser), as well as apply our permissions.

kubectl create deployment hello --image=sparlor/scottyfullstack:hello
kubectl expose deployment hello --type=NodePort --port=8000
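If you want a quick sanity check that the NodePort service exists, you can run:

kubectl get svc hello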

Check the status of the pod with:

kubectl get po -w

Once it's up, run the following to open it in your browser:

minikube service hello

Next, add the permissions file, which includes three YAML documents in one (ClusterRole, ClusterRoleBinding, ServiceAccount). For more information, check out the official Kubernetes RBAC docs.
#permissions.yml

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus-kube
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-kube
subjects:
- kind: ServiceAccount
  name: prometheus-sa
  namespace: default

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-sa
  namespace: default
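Apply the permissions now so the service account exists before we reference it in the Prometheus deployment later. Assuming the file lives in the permissions folder from the directory layout above:

kubectl apply -f permissions/permissions.yml

# Optional: verify the objects were created
kubectl get clusterrole prometheus-kube
kubectl get serviceaccount prometheus-sa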

Lastly for this section, let's create a basic Prometheus deployment using the official image.

kubectl create deployment prometheus --image=prom/prometheus
kubectl expose deployment prometheus --type=NodePort --port=9090
minikube service prometheus

Awesome, you should be able to see the basic Prometheus Dashboard now.

In the menu, navigate to Status > Targets to see your Prometheus instance monitoring itself.

Let's wrap this up and start reporting.



Prometheus/Alertmanager Configs and Enhancing the Deployment Yaml


Now that we have Prometheus standing up, we need to create a configmap for Prometheus that our deployment will pick up each time. If you haven't seen the Prometheus docs, check them out. There is a standard prometheus.yml file that tells Prometheus what to monitor, how to alert, and at what interval; the basic template is in the Prometheus configuration docs.

This is what our prometheus.yml (inside my configs folder) will look like:

Note: In the 'Hello-app' targets section at the bottom, you will need to include the output of the following command (WITHOUT the http:// prefix):

minikube service hello --url
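If you want to grab that value with the http:// already stripped off, a one-liner like this works (a small sketch using sed):

minikube service hello --url | sed 's~^http://~~'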

!!! IMPORTANT !!! Make sure you keep the YAML indentation (spaces, not tabs)... otherwise your deployment will fail (VS Code gave me issues with this; if all else fails, use vim).

#prometheus.yml

# my global config
global:
  scrape_interval: 1s # Set the scrape interval to every 1 second. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /etc/rules/rules.yml

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'Hello-app'

    static_configs:
    - targets: ['Hello service URL']

Next, let's create an Alertmanager config in our configs folder. I've configured mine to post to the #alerts channel in my Slack workspace. To set this up, add the Incoming Webhooks app to Slack (see Slack's documentation). Once you have your webhook URL, add it to slack_api_url below:

#alertmanager.yml

# global config
global:
  resolve_timeout: 30s
  slack_api_url: ''

route:
  receiver: 'slack-notifications'

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#alerts'
    send_resolved: true
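Before wiring the webhook into Alertmanager, it's worth sanity-checking it directly with curl (substitute your own webhook URL):

# Post a test message to the Slack webhook -- a message should show up in #alerts
curl -X POST -H 'Content-type: application/json' \
  --data '{"text": "Test message from the Prometheus tutorial"}' \
  https://hooks.slack.com/services/YOUR/WEBHOOK/URL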

Finally, create the rules.yml file for our rules volume:

#rules.yml

groups:
- name: Rules
  rules:
  - alert: InstanceDown
    # Condition for alerting
    expr: up == 0
    for: 1m
    # Annotations - additional informational labels to store more information
    annotations:
      title: 'Instance {{ $labels.instance }} down'
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
    # Labels - additional labels to be attached to the alert
    labels:
      severity: 'critical'
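If you have promtool installed locally (it ships with the Prometheus release tarball), you can validate the rules file before creating the configmaps:

# Check the alerting rules for syntax errors
promtool check rules configs/rules.yml
# promtool check config configs/prometheus.yml also works, but it will try to resolve the
# in-cluster rule_files path (/etc/rules/rules.yml), so it may complain when run locally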

Now create the configmaps from those files (run these from the repo root, or drop the configs/ prefix if you're inside the configs folder):

kubectl create configmap prometheus --from-file=configs/prometheus.yml
kubectl create configmap alertmanager --from-file=configs/alertmanager.yml
kubectl create configmap rules --from-file=configs/rules.yml
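You can confirm the configmaps landed (and that the YAML survived intact) with:

kubectl get configmaps
kubectl describe configmap prometheus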

Now, let's grab the Prometheus deployment YAML and add to it. This is a great way to SHOW what you know, by the way.

kubectl get deployment prometheus -o yaml > deployment.yml

Compare your dumped file to the version below and add what's missing.

There is a lot going on here, so note the volumes and volume mounts, the new alertmanager container, and the service account that have been added. (If kubectl create later complains about server-generated fields such as resourceVersion, uid, or status, you can safely delete those from the file.)

apiVersion: v1
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    annotations:
      deployment.kubernetes.io/revision: "1"
    creationTimestamp: "2020-03-12T23:25:18Z"
    generation: 1
    labels:
      app: prometheus
    name: prometheus
    namespace: default
    resourceVersion: "61653"
    selfLink: /apis/apps/v1/namespaces/default/deployments/prometheus
    uid: 63d11ba2-4caf-45c5-953f-d404812aec9a
  spec:
    progressDeadlineSeconds: 600
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: prometheus
    strategy:
      rollingUpdate:
        maxSurge: 25%
        maxUnavailable: 25%
      type: RollingUpdate
    template:
      metadata:
        creationTimestamp: null
        labels:
          app: prometheus
      spec:
        serviceAccountName: prometheus-sa
        containers:
        - image: prom/prometheus
          imagePullPolicy: IfNotPresent
          name: prometheus
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - name: prometheus
            mountPath: /etc/prometheus/
          - name: rules
            mountPath: /etc/rules/
        - image: prom/alertmanager
          imagePullPolicy: IfNotPresent
          name: alertmanager
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - name: alertmanager
            mountPath: /etc/alertmanager/
        volumes:
        - name: rules
          configMap:
            defaultMode: 420
            name: rules
        - name: prometheus
          configMap:
            defaultMode: 420
            name: prometheus
        - name: alertmanager
          configMap:
            defaultMode: 420
            name: alertmanager
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        terminationGracePeriodSeconds: 30
  status:
    availableReplicas: 1
    conditions:
    - lastTransitionTime: "2020-03-12T23:25:20Z"
      lastUpdateTime: "2020-03-12T23:25:20Z"
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: "True"
      type: Available
    - lastTransitionTime: "2020-03-12T23:25:18Z"
      lastUpdateTime: "2020-03-12T23:25:20Z"
      message: ReplicaSet "prometheus-79d4cb85d5" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: "True"
      type: Progressing
    observedGeneration: 1
    readyReplicas: 1
    replicas: 1
    updatedReplicas: 1
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

When done, run:

kubectl delete deployment prometheus
kubectl create -f deployment.yml
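Give the new pod a moment, then confirm that both containers (prometheus and alertmanager) come up cleanly:

kubectl rollout status deployment prometheus
kubectl get po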

You should now have all of your services, pods, and configs in your minikube cluster. Let's try it out by bringing down our hello app.

kubectl scale deployment hello --replicas=0

After the service has been down for 1 minute, you should receive a critical alert for the hello deployment in your Slack channel. Also, open the Prometheus dashboard again with

minikube service prometheus

and check Status > Targets. You should see the failed target in your Prometheus dashboard.
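Since the Alertmanager config sets send_resolved: true, you can also watch the alert clear by scaling the app back up:

kubectl scale deployment hello --replicas=1
# After the next successful scrape, Alertmanager should post a resolved notification to #alerts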

And that's how to get started with Prometheus. As you can imagine, there is so much more to do from here! Check back for more!

See you next time.