Pods

There are four top level properties to a Kubernetes config yaml:

  1. apiVersion: Each apiVersion supports different Kubernetes objects. For an exhaustive list, see Kubernetes' docs, but don’t let it bog you down. See our /cheatsheets/ file.
  2. kind: The type of Kubernetes object.
  3. metadata: Data that describes the K8s object, such as its name and labels.
  4. spec: The desired state of the K8s object. Kubernetes will work to make sure that the current state of the objects matches the desired state; in other words, that the spec matches the status of the object.
# 1. Version of K8s API to use.
apiVersion: v1

# 2. Kind of object being created.
kind: Pod

# 3. Describes the K8s object; "name" and "labels" are a few of many valid keys.
metadata:
   name: myapp-pod
   labels:
      app: myapp
      type: front-end
      nonsensical-bologna: boop 
       # You can put any key/value pairs you want in labels.
       # "app" and "type" aren't special to Kubernetes; they are simply labels
       # we will use to organize our objects.
      
# 4. The desired state of the object which K8s will work to maintain.
spec:
   containers:
      - name: nginx-container
        image: nginx

The kind: Pod specifies that we are creating a Pod, which is a collection of one or more containers that share resources, have a single IP address, and can share volumes. Pods run on worker nodes and are the smallest unit that Kubernetes can understand. You will rarely manage Pods yourself; instead, you will rely on one of Kubernetes’ control structures to manage them.
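
For example, assuming the definition above is saved as pod-def.yaml (the same filename used later in these notes), we can create and inspect the Pod with:

# Create the Pod from the yaml file.
kubectl create -f pod-def.yaml

# List Pods, then view details for the one we just created.
kubectl get pods
kubectl describe pod myapp-pod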


Labels, Selectors and Annotations

apiVersion: v1
kind: Pod
metadata:
   name: simple-webapp

   # 1
   labels:
      app: App1
      function: front-end

   # 2
   annotations:
      buildVersion: "1.2.3"
      forAssistance.call: "555-555-5555"
      forAssistance.email: "[email protected]"
      forAssistance.allowsTexts: "true"
      ownedBy.name: "Alex Yu"
      ownedBy.dept: "Psychotronics"
spec:
   containers:
      - name: nginx
        image: nginx
  1. labels are used to help filter K8s objects in conjunction with a selector. Labels are visible to Kubernetes.
    
    kubectl get pods --selector app=App1
    # or
    kubectl get pods --selector function=front-end
    
  2. annotations, on the other hand, are used to record information for other engineers and tools. In our annotations section, we list a person to contact in case an engineer needs assistance, and who owns the application. Kubernetes stores this information but does not use it to select or manage objects.
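
Since Kubernetes stores annotations on the object, you can read them back to verify them; a couple of ways to do that (output will vary):

kubectl describe pod simple-webapp
# or print only the annotations:
kubectl get pod simple-webapp -o jsonpath='{.metadata.annotations}'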

Replica Sets

A ReplicaSet will make sure that a certain number of Pods matching a particular query exist at any given time. A ReplicaSet has 3 very important fields used to define its functionality.

  1. template: the Pod to be created. Does the data in template look familiar? It should! It is copied directly from the above pod definition. NOTE: We don’t specify an apiVersion or kind in template. This is because we always assume we are using apiVersion: v1, and kind: Pod.
  2. selector: provides us with a tool to query for pods that we care about.
  3. replicas: specifies the number of pods we want to exist at a given time.
# Remember, every K8s object has apiVersion, kind, metadata, and a spec.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-replicaset
  labels:
    app: myapp
    type: front-end
    guess_what: chicken_butt
spec:

  # 1. Provides a template of a pod we wish to create when needed.
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
        boop: blarp
    spec:
      containers:
        - name: nginx-container
          image: nginx

  # 2. This ReplicaSet will watch for all Pods whose labels include type: front-end.
  selector: 
    matchLabels:
      type: front-end

  # 3. Ensure 3 Pods matching the above selector exist at all times.
  replicas: 3

We can control ReplicaSets from the command line using the following commands.

  1. We can create a ReplicaSet using our config with:

    
    kubectl create -f replicaset-def.yaml
    
  2. We can list existing ReplicaSets using:

    
    kubectl get replicaset
    
  3. We can describe a ReplicaSet using:

    
    kubectl describe replicaset myapp-replicaset
    
  4. We can update a ReplicaSet using:

    
    kubectl replace -f replicaset-def.yaml
    
  5. We can delete a ReplicaSet using:

    
    kubectl delete replicaset myapp-replicaset
    

    BEWARE! Deleting a ReplicaSet also deletes all underlying pods.

Let’s say we end up needing more replicas than we currently have. We can scale our ReplicaSet in one of 3 ways.

  1. This is the preferred way if you want to update the yaml file. Simply change the number of replicas in the config from 3 to 6… or however many you want.

    
    kubectl replace -f replicaset-def.yaml
    
  2. This is the preferred way if you don’t want to change the yaml file, but want to make changes to the deployed object.

    
    kubectl scale --replicas=6 -f replicaset-def.yaml
    
  3. This is the preferred way if you don’t want to specify the file. Note that this requires the object type, followed by the object name.

    
    kubectl scale --replicas=6 replicaset myapp-replicaset
    

Pro Tip: You can use replicaset, rs, and replicasets interchangeably in the command line.

NOTE: You may see ReplicationController being used, which is an older K8s object that is being phased out. The main difference is that a ReplicaSet has a selector, which gives it the ability to manage Pods that were created before the ReplicaSet itself.


Deployments

ReplicaSets solve the problem of having enough Pods running to provide high availability for our application, but they don’t make it particularly easy to upgrade the application over time. This is where Deployments come in. Deployments make sure that upgrades happen gradually (also known as a rolling upgrade, aka rollout). Upgrading Pods gradually helps to ensure that users do not notice any downtime. There is another update strategy known as Recreate, which you can read more about in the Kubernetes docs, but it’s rare that you will use it in production. Deployments also make it easy to roll back to a previous version of the software if need be. The config for a Deployment is nearly identical to a ReplicaSet’s config, except we update the kind to Deployment.
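
If you ever need to control the rollout behavior explicitly, the strategy can be declared in the Deployment’s spec. This is a minimal sketch using the standard spec.strategy fields (the values shown are illustrative; RollingUpdate is the default if the field is omitted):

spec:
  strategy:
    type: RollingUpdate          # or "Recreate"
    rollingUpdate:
      maxSurge: 1                # extra Pods allowed above the desired count during a rollout
      maxUnavailable: 1          # Pods allowed to be unavailable during a rollout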

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  labels:
    app: myapp
    type: front-end
    super: mario
spec:
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
        boop: blarp
    spec:
      containers:
        - name: nginx-container
          image: nginx
  selector: 
    matchLabels:
      type: front-end
  replicas: 3

We have the following commands at our disposal to manipulate the Deployment:

# Create a Deployment from a yaml file.
kubectl create -f deployment-def.yaml

# Get a list of deployments.
kubectl get deployments

# Apply changes made to the original Deployment yaml file.
kubectl apply -f deployment-def.yaml

# Change the image used by an existing Deployment ("nginx" here is the container's name).
kubectl set image deployment/mydeployment-name nginx=nginx:1.9.1

# Get status of a rollout for an existing Deployment.
kubectl rollout status deployment/mydeployment-name

# View the rollout history for an existing Deployment
kubectl rollout history deployment/mydeployment-name

# Undo the latest rollout applied to a Deployment.
kubectl rollout undo deployment/mydeployment-name
  • The Deployment automatically creates a ReplicaSet:

    
    kubectl get replicaset
    #> NAME                       DESIRED    CURRENT    READY    AGE
    #> myapp-deployment-abc123    3          3          3        2m
    
  • The ReplicaSet created by the Deployment will subsequently create Pods:

    
    kubectl get pods
    #> NAME                             READY STATUS   RESTARTS  AGE
    #> myapp-deployment-abc123-xyz001   1/1   Running  0         2m
    #> myapp-deployment-abc123-xyz002   1/1   Running  0         2m
    #> myapp-deployment-abc123-xyz003   1/1   Running  0         2m
    
  • Now that we are working with a bunch of different K8s objects, we can use the following command to list them all at once.

    
    kubectl get all
    
  • You can format the output from kubectl using one of the following:

    • -o json: Output a JSON formatted API object.
    • -o name: Print only the resource name and nothing else.
    • -o wide: Output in the plain-text format with any additional information.
    • -o yaml: Output a YAML formatted API object.
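
For example, assuming the Deployment defined above exists:

kubectl get deployments -o wide
kubectl get deployment myapp-deployment -o yaml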

Pro Tip: You can add the flag --dry-run=client to a kubectl command to validate the syntax of the command and the config without actually creating the resource.
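
For example, the following validates the config client-side and prints the object that would be created, without creating anything:

kubectl create -f deployment-def.yaml --dry-run=client -o yaml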


Namespaces

Namespaces are a way to organize K8s objects while enforcing policy, increasing security, and preventing accidental manipulation of vital objects. Kubernetes creates 3 namespaces when the cluster is created:

  1. default - this is where all Kubernetes objects are placed by default
  2. kube-system - where K8s places all Pods and Services used by K8s to run, such as those required by the networking solution, the DNS service, etc.
  3. kube-public - where K8s objects that should be available to all users are placed.
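
You can list the Namespaces in your cluster (the three above, plus any you create) with:

kubectl get namespaces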

You can create your own Namespaces, for example, a namespace for prod and another for dev. This way, while working in the development environment, you don’t accidentally interact with a K8s object that belongs to the production namespace.

We can access K8s objects outside of the current namespace by referencing their fully qualified domain names as configured by Kubernetes. For example:

# Access a service from within the namespace.
mysql.connect("db-service")                       

# Access a service from outside of namespace using that resource's full name.
mysql.connect("db-service.dev.svc.cluster.local")
# Object named db-service,
#      from the namespace dev,
#        a K8s object of type svc 
#             with the domain name cluster.local


# Playing with mnemonics to remember this for the CKAD exam:
# name.space.type.domain
# Nina Simone is a Taurus from Detroit

You can run any kubectl command against a namespace of your choosing by specifying the namespace:

kubectl create -f pod-def.yaml --namespace=dev

We can also declare the namespace that a Kubernetes object should be placed in via a metadata tag:

apiVersion: v1
kind: Pod

metadata:
   name: myapp-pod
   namespace: dev
   labels:
      app: myapp
      type: front-end
      asdf: abc123
spec:
   containers:
      - name: nginx-container
        image: nginx

We can define a custom Namespace in one of two ways:

  1. Via yaml

    
    apiVersion: v1
    kind: Namespace
    metadata:
       name: dev
    

    followed by the kubectl command:

    kubectl create -f namespace-dev.yaml
    
  2. kubectl create namespace dev
    

Pro Tip: You can permanently change your default namespace with the following: kubectl config set-context $(kubectl config current-context) --namespace=dev

Pro Tip: We can list all Pods, Deployments, ReplicaSets, etc, across all Namespaces via kubectl get pods --all-namespaces


Jobs

You may require a batch process to run on your K8s cluster, such as performing a calculation. For example:

apiVersion: v1
kind: Pod
metadata: 
   name: math-pod
spec:
   containers:
   -  name: math-add
      image: ubuntu
      command: ['expr', '3', '+', '2']
   restartPolicy: OnFailure

Take note of restartPolicy in the spec section. This is a policy that tells Kubernetes what to do with the Pod when it dies. There are three possible values for restartPolicy:

  • Always: the default for Pod objects, meaning that a Pod will restart forever (in theory);
  • OnFailure: causes the Pod to restart when it fails;
  • Never: the containers in the Pod are never restarted, even if they fail.

This is a great option for one-off batches, but it fails to take advantage of the scalability of Kubernetes. This is where Jobs come into play:

apiVersion: batch/v1
kind: Job
metadata:
   name: math-add-job
spec:

   # 1. The template of the Pod to create.
   template:
      spec:
         containers:
         -  name: math-add
            image: ubuntu
            command: ['expr', '3', '+', '2']
         restartPolicy: OnFailure

   # 2. The number of Pod Completions (no failures or terminations) required.
   completions: 3

   # 3. The number of Pods to be running at once.
   parallelism: 3
  1. Does spec.template look familiar? That’s because it’s identical to the spec of the Pod we created above.
  2. Remember the different restartPolicy values you might encounter? Always, OnFailure, and Never? Well, using completions, we can tell Kubernetes to make sure a certain number of Pods run to completion.
  3. Using the parallelism field, we tell Kubernetes how many Pods to have running at any given time.

We can use the following commands to manipulate Jobs:

# 1. Create the Job.
kubectl create -f job-def.yaml

# 2. View a list of running Jobs.
kubectl get jobs

# 3. View the output of the running job (if you printed it to stdout)
kubectl logs math-add-job-<your-pod-number>

# 4. Delete a running Job, and all the Pods it created.
kubectl delete job math-add-job

CronJobs

Now that you know how to run Jobs, let’s learn to schedule them to run periodically using cron:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
   name: reporting-cron-job
spec:
   # 1. Cron schedule string
   schedule: "*/1 * * * *"

   # 2. Describe the job
   jobTemplate:
      spec:
         template:
            spec:
               containers:
               -  name: reporting-tool
                  image: reporting-tool
               restartPolicy: OnFailure
         completions: 3
         parallelism: 3
  1. schedule takes the cron schedule string which you can build using the guide below, coupled with this tool. This particular CronJob runs every minute.

    # ┌───────────── minute (0 - 59)
    # │ ┌───────────── hour (0 - 23)
    # │ │ ┌───────────── day of the month (1 - 31)
    # │ │ │ ┌───────────── month (1 - 12)
    # │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
    # │ │ │ │ │                                   7 is also Sunday on some systems)
    # │ │ │ │ │
    # │ │ │ │ │
    # * * * * * <command to execute>
    
  2. We describe the Job we are going to run, which is very reminiscent of the spec section of the Job yaml we wrote above.


Multi-Container Pod Design Patterns

So far, we have been working with singleton Pods, which encapsulate a single container. Now, we will be working with Multi-Container Pods, which are something you are more likely to encounter in the wild.

There are three common Multi-Container Pod patterns we will be discussing:

  1. The Sidecar Pattern: deploying a container alongside the application to handle some minor task. For example, we may deploy a log-agent alongside a web server to collect logs and forward them to a central log server (a sketch of this pattern appears after this list).

    Sidecar Pattern

    What if we have a central logging server, but a bunch of different applications logging to it in different formats? This is where the next pattern comes in…

  2. The Adapter Pattern: deploy an additional container to allow services to cooperate that otherwise wouldn’t be able to. This allows you to simply deploy a different sidecar container to adapt to new situations as they arise (e.g. change of backend).

    For example, you may have three applications with logging agents that generate logs in completely different formats. An adapter container is deployed alongside those logging-agents to normalize the log data before sending it off to the central logging server.

    Adapter Pattern

  3. The Ambassador Pattern: an “ambassador container” is deployed, which essentially acts like a proxy that allows other containers to connect to a port on localhost while the ambassador container proxies the connection to the appropriate server. For example, say you have different logging servers for development, testing, and production deployments. Using an ambassador, we can send all of our logging data to a given port over localhost, which wouldn’t require changing the source code of the application.

    Ambassador Pattern
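
To make the sidecar idea concrete, here is a minimal sketch of a multi-container Pod. The image names (my-webapp and log-agent) and the mount path are hypothetical; the point is simply that both containers live in one Pod and share a volume:

apiVersion: v1
kind: Pod
metadata:
   name: webapp-with-sidecar
spec:
   containers:
      # The main application container writes its logs to the shared volume.
      - name: webapp
        image: my-webapp
        volumeMounts:
           - name: log-volume
             mountPath: /var/log/webapp

      # The sidecar container reads the same logs and forwards them to the log server.
      - name: log-agent
        image: log-agent
        volumeMounts:
           - name: log-volume
             mountPath: /var/log/webapp

   # Both containers mount this shared volume.
   volumes:
      - name: log-volume
        emptyDir: {}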