A Hands-On Tour of Kubernetes: Part 3 - Communication and Services
Pod Communication
Our "applications" haven't been too exciting so far. We've created some nginx pods and sent a few HTTP requests, but these pods aren't talking to each other. Kubernetes complements a microservice architecture, but even if you follow a monolithic application design approach, we can anticipate there will be at least some communication across pods within our cluster.
To better understand how pods are able to communicate with each other, let's start by creating a new namespace for ourselves.
$ kubectl create namespace telephone
namespace/telephone created
Next, we need a way to issue arbitrary HTTP requests from inside the cluster. We'll create a "helper" pod which we'll use to send our HTTP requests.
$ kubectl run caller --image=alpine:3.19 --namespace=telephone --command -- sleep infinite
pod/caller created
Note that we're using Alpine for our container image. We're also supplying the --command option, which we haven't seen before. Without this option, the container will run using the ENTRYPOINT specified by the image. For our nginx pods, the ENTRYPOINT provides the desired behavior (that is, run nginx), but the ENTRYPOINT for Alpine runs a shell. Since there is no standard input connected to the shell, the process will exit immediately. By using --command, we can specify a new entrypoint, which we set to a command that will run forever.
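If you're curious, you can see this behavior for yourself by running an Alpine pod without --command (using quitter as a throwaway name we just made up). The shell exits immediately, Kubernetes restarts it, and the pod soon reports CrashLoopBackOff. Your exact status, restart count, and age will differ:
$ kubectl run quitter --image=alpine:3.19 --namespace=telephone
pod/quitter created
$ kubectl get pods quitter --namespace=telephone
NAME      READY   STATUS             RESTARTS      AGE
quitter   0/1     CrashLoopBackOff   2 (18s ago)   52s
$ kubectl delete pod quitter --namespace=telephone
pod "quitter" deleted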
We'll see momentarily why this is useful. First, let's verify the pod is running.
$ kubectl get pods --namespace=telephone
NAME READY STATUS RESTARTS AGE
caller 1/1 Running 0 30s
Looks good. Next, we will use kubectl exec to send HTTP requests from within the container in this pod. This command is similar to docker exec -- we specify a new process to run in the container, and the output will be shown in the terminal. Note that just like docker exec, we can only run commands that are available within the container image.
Here is how we can make a request to the Source Allies home page.
$ kubectl exec pod/caller --namespace=telephone -- wget -q -S https://www.sourceallies.com -O /dev/null
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 23195
Connection: close
x-amz-id-2: JbK9j2rVTyi6hcupIfeOkojTTifXPz0SGHdk88cnXkqZ6cr/DC0xInAW4iwD3esv866NLlsnrO0=
x-amz-request-id: AY4NMYGY3KKVRSTT
Date: Thu, 25 Jan 2024 19:40:53 GMT
Last-Modified: Wed, 24 Jan 2024 13:51:07 GMT
ETag: "7e835e07e20658bc5febfd483401fcae"
x-amz-server-side-encryption: AES256
Accept-Ranges: bytes
Server: AmazonS3
X-Cache: Miss from cloudfront
Via: 1.1 ee0949c654b72e5ceb330e8b3e825e32.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: ORD53-C2
X-Amz-Cf-Id: Bcs22oqFKYhHLStEuye7JiSzXGYmpXFrOJbMaZRRx6SJLF0sx3QGFg==
Let's break down this command a bit:
- kubectl exec pod/caller --namespace=telephone - We're specifying which pod we want to use to run our command.
- -- - This separates our kubectl exec options from the command to run in the container.
- wget -q -S https://www.sourceallies.com -O /dev/null - This is the command to run in the container.
And here is the meaning of the options provided to our wget command:
- -q silences progress meters and other extraneous output
- -S displays the response headers
- -O /dev/null sends the body of the response to /dev/null (effectively discarding the response body)
We can use the --stdin (-i) and --tty (-t) options of kubectl exec to run interactive programs from within a container. For example, we can run and connect to a shell running inside the container.
$ kubectl exec pod/caller --stdin --tty --namespace=telephone -- sh
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19.0
PRETTY_NAME="Alpine Linux v3.19"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
/ # uname -a
Linux caller 6.1.64-0-virt #1-Alpine SMP Wed, 29 Nov 2023 18:56:40 +0000 aarch64 Linux
/ # whoami
root
/ # exit
Being able to run commands interactively from within your application container is extremely handy for debugging.
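One handy trick inside that shell: because this is a stock Alpine image, we can install extra debugging tools on the fly with Alpine's package manager (assuming the pod has internet access; anything installed this way is lost when the container restarts). For example, to add curl:
/ # apk add --no-cache curl
After that, curl is available for the lifetime of the container.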
The previous request was to an external resource, but how do we reach things inside the cluster? To see that in action, we need to create another pod.
$ kubectl run receiver --image=nginx:1.24 --namespace=telephone
pod/receiver created
As always, let's verify the new pod is running.
$ kubectl get pods --namespace=telephone
NAME READY STATUS RESTARTS AGE
caller 1/1 Running 0 74s
receiver 1/1 Running 0 8s
In Kubernetes, every pod receives its own IP address. We can ask kubectl get to show pod IP addresses by specifying the output format with --output (-o). In our case, we'll use the wide output format.
$ kubectl get pods --output=wide --namespace=telephone
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
caller 1/1 Running 0 90s 10.42.0.120 lima-rancher-desktop <none> <none>
receiver 1/1 Running 0 24s 10.42.0.121 lima-rancher-desktop <none> <none>
In addition to the IP addresses, the wide output format also shows us which node each pod is running on. Assuming you're running Rancher Desktop or Docker Desktop as shown in the introductory blog post, you'll see the same node for all your pods since we're running a single-node cluster.
Let's try using the IP address of the receiver pod as the target for our wget command. Note that your IP addresses will likely be different, so update this command with the IP address that you see.
$ kubectl exec pod/caller --namespace=telephone -- wget -q -S 10.42.0.121 -O /dev/null
HTTP/1.1 200 OK
Server: nginx/1.24.0
Date: Thu, 25 Jan 2024 19:57:56 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 11 Apr 2023 01:45:34 GMT
Connection: close
ETag: "6434bbbe-267"
Accept-Ranges: bytes
Woah, it worked! The Server response header indicates that it was nginx that sent the response, but let's check our receiver logs to be sure. We'll use --tail in our kubectl logs command to grab the last five lines of output.
$ kubectl logs pod/receiver --tail=5 --namespace=telephone
2024/01/25 19:57:25 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/01/25 19:57:25 [notice] 1#1: start worker processes
2024/01/25 19:57:25 [notice] 1#1: start worker process 29
2024/01/25 19:57:25 [notice] 1#1: start worker process 30
10.42.0.120 - - [25/Jan/2024:19:57:56 +0000] "GET / HTTP/1.1" 200 615 "-" "Wget" "-"
Sure enough, the last line shows that nginx received a request from 10.42.0.120, which is the IP address of our caller pod. (Again, your pod IP addresses will likely be different.)
Before you start hard-coding pod IP addresses into your application, let's see what happens if we delete and recreate our receiver pod.
$ kubectl delete pod/receiver --namespace=telephone
pod "receiver" deleted
$ kubectl run receiver --image=nginx:1.24 --namespace=telephone
pod/receiver created
Alright, now let's list our pod IP addresses again.
$ kubectl get pods --output=wide --namespace=telephone
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
caller 1/1 Running 0 4m41s 10.42.0.120 lima-rancher-desktop <none> <none>
receiver 1/1 Running 0 41s 10.42.0.122 lima-rancher-desktop <none> <none>
Before, the IP address for receiver was 10.42.0.121, but now it is 10.42.0.122. This brings us to a key aspect of the Kubernetes networking model: pod IP addresses are ephemeral.
- When a pod is created, an IP address will be selected from a pool of unused IP addresses.
- A pod will retain its IP address as long as it's running.
- When a pod is deleted, its IP address is put back into the pool of unused pod IP addresses.
So, hard-coding pod IP addresses in your application is a pathway to madness. You have no guarantee about which IP addresses will be assigned to your pods. But if that's the case, what hope do we have of building applications that rely on other pods if we don't know their IP addresses?
In the next section, we'll start looking at the DNS service provided by the cluster. This DNS service is what allows us to tame these ephemeral IPs.
Before moving on, let's clean up the pods and namespace we've created.
$ kubectl delete namespace/telephone
namespace "telephone" deleted
kubectl supports several output options. We used the wide format earlier in this post to view pod IP addresses, but this format also includes other information such as pod age and number of container restarts. If we only want the pod names and IPs, we can use custom-columns to show just those columns. (The examples below were captured before we deleted the telephone namespace.)
$ kubectl get pods --output=custom-columns=NAME:.metadata.name,IP:.status.podIP --namespace=telephone
NAME IP
caller 10.42.0.120
receiver 10.42.0.121
Using custom-columns requires knowledge of the underlying API resource format, but it can be handy for generating automated reports.
If you want to perform additional transformations or filtering on the output of kubectl get (e.g. as part of a script), you may want to use the json or yaml output formats, which return the underlying API resource as JSON or YAML, respectively.
Services
As we saw at the end of the previous section, pod IP addresses are ephemeral. To avoid the toil of updating IP addresses in our applications as pods are created and destroyed, Kubernetes relies on a faithful protocol that helps power the Internet: DNS.
When we run a pod, Kubernetes adjusts the container's DNS resolution configuration file (/etc/resolv.conf) to include the DNS server running inside the cluster.
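We can peek at this file with kubectl exec. Here is a sketch using a pod like our earlier caller pod (which we have since deleted; any running pod will do). The exact nameserver address and search list depend on your cluster (this output is representative of Rancher Desktop's k3s). Note the search entries; they are what will let us use short service names in a moment:
$ kubectl exec pod/caller --namespace=telephone -- cat /etc/resolv.conf
search telephone.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5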
This DNS server automatically creates an A/AAAA record for every pod running in the cluster. The domain name uses the following format:
pod-ip-address.my-namespace.pod.cluster-domain.example
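For example, on a cluster whose domain is cluster.local (a common default), the receiver pod from earlier, with IP 10.42.0.121, would get a record like this, with the dots in the IP address replaced by dashes:
10-42-0-121.telephone.pod.cluster.local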
Sadly, as you can see, the pod IP address is part of the domain. Despite the existence of the DNS record, we'd still need to know the pod IP address if we want to reach it from another application. Drat!
Fortunately, Kubernetes provides a separate resource to facilitate service discovery: the aptly-named service. Here is how a service works:
- When we create a service, we include a label selector in the spec.
- The service will look for pods in the same namespace as itself. Any pod whose labels match the label selector will be considered part of the service.
- A service has its own IP address. Whenever a request is sent to the service IP address, the request will be routed to one of the pods in the service.
Essentially, a service functions as a cluster-internal load balancer for pods. As with pods, the cluster DNS server creates an A/AAAA record for every service. Here is the domain format:
my-svc.my-namespace.svc.cluster-domain.example
No IP address in this name! In most cases, we can shorten the domain to the following:
my-svc.my-namespace.svc
Despite the fact that service IP addresses are ephemeral, the domain name of a service is static. If we know the name and namespace of a service, we can connect to the corresponding application without worrying about the underlying IP addresses.
Let's put together an example scenario so that we can see this behavior in action. To start, we'll create a namespace for ourselves:
$ kubectl create namespace lake
namespace/lake created
Next, let's look at an example service manifest:
apiVersion: v1
kind: Service
metadata:
  name: fish
  namespace: lake
spec:
  selector:
    role: fish
  ports:
    - name: http
      port: 80
      targetPort: 8080
This manifest specifies that any pods with the label role: fish in the lake namespace will be considered part of the fish service. The ports section specifies that requests received by the service on port 80 (port) will be forwarded to port 8080 on the pod (targetPort). Services only handle traffic on the specified ports, so there must be at least one entry in the ports list.
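A service can also expose multiple ports. As a hypothetical sketch (we won't use this variation in the walkthrough), here is what the ports list could look like if our pods also served HTTPS traffic on container port 8443. Note that when a service exposes more than one port, every entry must have a name:
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443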
Let's create this service using the manifest directly. As a reminder, here is how to create a resource with a manifest:
- Save the manifest to a file.
- Run kubectl apply -f <filename>.
An example with bash:
$ cat <<EOF >service.yaml
apiVersion: v1
kind: Service
metadata:
  name: fish
  namespace: lake
spec:
  selector:
    role: fish
  ports:
    - name: http
      port: 80
      targetPort: 8080
EOF
$ kubectl apply -f service.yaml
service/fish created
Let's verify the service exists:
$ kubectl get services --namespace=lake
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
fish ClusterIP 10.43.108.184 <none> 80/TCP 30s
So far so good! The ClusterIP type shown is the default for a service: the service gets a cluster-internal IP address that is only reachable from inside the cluster. We haven't created any pods in this namespace yet (let alone pods with a matching label), so there are zero pods included in this service. We can list the pod endpoints of the service to verify:
$ kubectl get endpoints --namespace=lake
NAME ENDPOINTS AGE
fish <none> 60s
As expected, the endpoints list is empty. Let's start adding pods to our namespace, using the --labels (-l) option to specify a label on the pods. We'll set the value to match the service label selector.
$ kubectl run fish-1 --image=jmalloc/echo-server:0.3.6 --labels=role=fish --namespace=lake
pod/fish-1 created
$ kubectl run fish-2 --image=jmalloc/echo-server:0.3.6 --labels=role=fish --namespace=lake
pod/fish-2 created
$ kubectl run fish-3 --image=jmalloc/echo-server:0.3.6 --labels=role=fish --namespace=lake
pod/fish-3 created
Let's list our pods along with their IP addresses:
$ kubectl get pods --namespace=lake -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
fish-1 1/1 Running 0 36s 10.42.0.195 lima-rancher-desktop <none> <none>
fish-2 1/1 Running 0 31s 10.42.0.196 lima-rancher-desktop <none> <none>
fish-3 1/1 Running 0 26s 10.42.0.197 lima-rancher-desktop <none> <none>
And now, we'll list the service endpoints again:
$ kubectl get endpoints --namespace=lake
NAME ENDPOINTS AGE
fish 10.42.0.195:8080,10.42.0.196:8080,10.42.0.197:8080 11m
Our service has picked up our pods! Note that the IP addresses listed match the pod IP addresses. Let's create another pod that we can use to send HTTP requests inside the cluster.
$ kubectl run angler --image=alpine:3.19 --labels=role=angler --namespace=lake --command -- sleep infinite
pod/angler created
The label we specified does not match the label selector of the service, so this pod is not included in the service. Listing the service endpoints should show the same values as before:
$ kubectl get endpoints fish --namespace=lake
NAME ENDPOINTS AGE
fish 10.42.0.195:8080,10.42.0.196:8080,10.42.0.197:8080 11m
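Before sending any traffic, we can sanity-check the DNS record from inside the angler pod using the nslookup utility that ships with Alpine's BusyBox. Your nameserver and addresses will differ, but the name should resolve to the cluster IP we saw when the service was created:
$ kubectl exec pod/angler --namespace=lake -- nslookup fish.lake.svc
Server:    10.43.0.10
Address:   10.43.0.10:53

Name:      fish.lake.svc.cluster.local
Address:   10.43.108.184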
It's time to make our first request:
$ kubectl exec pod/angler --namespace=lake -- wget -qO- fish.lake.svc
Request served by fish-3
HTTP/1.1 GET /
Host: fish.lake.svc
User-Agent: Wget
Connection: close
The fish-* pods are running an application that returns the details of the request along with the hostname of the pod. The hostname of a pod matches the name of the pod, and in this example, it was the fish-3 pod that received the request. Because the service does load balancing, you may see a different pod selected. In fact, if we keep sending requests, we'll likely see different pods chosen:
$ kubectl exec pod/angler --namespace=lake -- wget -qO- fish.lake.svc
Request served by fish-2
HTTP/1.1 GET /
Host: fish.lake.svc
User-Agent: Wget
Connection: close
This time, it was fish-2 that received the request. Services are a critical resource in Kubernetes because they facilitate horizontal scaling of workloads. Pods can be added and removed, and the service will adjust its endpoints accordingly. For example, let's delete two of our pods and then inspect the endpoints.
$ kubectl delete pod fish-2 fish-3 --namespace=lake
pod "fish-2" deleted
pod "fish-3" deleted
$ kubectl get endpoints --namespace=lake
NAME ENDPOINTS AGE
fish 10.42.0.195:8080 27m
There is just the one endpoint. If we pretend that the fish-* pods represent replicas of our application, we can start to see how we could scale our application in or out depending on load.
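To scale back out, we could simply create pods with a matching label again and let the service pick them up. A quick sketch (your IP addresses and ages will differ):
$ kubectl run fish-2 --image=jmalloc/echo-server:0.3.6 --labels=role=fish --namespace=lake
pod/fish-2 created
$ kubectl get endpoints --namespace=lake
NAME   ENDPOINTS                           AGE
fish   10.42.0.195:8080,10.42.0.198:8080   29m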
Manually creating and deleting the pod replicas is a bit tedious though. In the next blog post, we'll look at another resource that will make it easier to manage pod replicas.
Let's clean up before moving on:
$ kubectl delete namespace/lake
namespace "lake" deleted
In this post, we looked at how pods communicate with each other. We saw that pod IP addresses are ephemeral, but we can use services to provide a stable domain name for our pods. In the next post, we'll look at how to use Deployments to manage pod replicas and how to deploy our own applications.