Extracting useful information from your Kubernetes cluster with custom-columns and jq
How to build custom queries for your Kubernetes cluster objects and how to create your own query collection.
> Image by: max_duz at: unsplash.com/photos/qAjJk-un3BI
When working with Kubernetes, we often query cluster objects such as nodes, deployments, builds, and pods, and the default output of kubectl get does not always expose the information we need. In those cases we end up dumping the entire object and wading through far more data than we actually wanted.
Using kubectl's custom-columns output option and the jq tool, we can create queries that deliver specifically what we want. In this article, we'll explore both and learn how to create your own query collection.
Problem
Let's consider two common scenarios where we need to fetch information from a Kubernetes cluster:
- Retrieve cluster node health information, such as memory, CPU, and disk pressure.
- Retrieve information about environment variables (env) and resource requests and limits (resources) from the deployments in the cluster.
To retrieve information from cluster objects, in general, we can use the command:
kubectl get <OBJECT>
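For example, substituting <OBJECT> with resource types that appear later in this article (the openshift-console namespace is taken from the sample cluster output below):
# A minimal sketch of common substitutions for <OBJECT>
~ kubectl get nodes
~ kubectl get pods --all-namespaces
~ kubectl get deployments -n openshift-console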
To query the nodes we can run the command:
~ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-135-204.xyz.compute.internal Ready master 11d v1.21.1+6438632 10.0.135.204 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
ip-10-0-142-176.xyz.compute.internal Ready worker 11d v1.21.1+6438632 10.0.142.176 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
ip-10-0-160-187.xyz.compute.internal Ready master 11d v1.21.1+6438632 10.0.160.187 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
ip-10-0-176-188.xyz.compute.internal Ready worker 11d v1.21.1+6438632 10.0.176.188 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
ip-10-0-214-226.xyz.compute.internal Ready master 11d v1.21.1+6438632 10.0.214.226 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
ip-10-0-219-74.xyz.compute.internal Ready worker 11d v1.21.1+6438632 10.0.219.74 <none> RHEL CoreOS 48.84.202110270303-0 4.18.0-305.19.1.el8_4.x86_64 cri-o://1.21.3-8.rhaos4.8.git7415a53.el8
To query deployments we can use, for instance, the following command:
# You can also use -o wide to retrieve more information
~ kubectl get deployments --all-namespaces
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
openshift-apiserver-operator openshift-apiserver-operator 1/1 1 1 11d
openshift-apiserver apiserver 3/3 3 3 11d
openshift-cluster-storage-operator cluster-storage-operator 1/1 1 1 11d
openshift-cluster-storage-operator csi-snapshot-controller 2/2 2 2 11d
openshift-cluster-version cluster-version-operator 1/1 1 1 11d
openshift-console-operator console-operator 1/1 1 1 11d
openshift-console console 2/2 2 2 11d
Both commands, despite returning a lot of information, do not contain the information we are looking for. To retrieve it, we can fetch the complete objects, in yaml or json format, using the command: kubectl get deployments --all-namespaces -o json
The command output is as follows:
{
"apiVersion": "v1",
"items": [
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"name": "openshift-apiserver-operator",
"namespace": "openshift-apiserver-operator"
},
"spec": {
"template": {
"spec": {
"containers": [
{
"env": [
{
"name": "IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f532d4e20932e1e6664b1b7003691d44a511bb626bc339fd883a624f020ff399"
},
{
"name": "OPERATOR_IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a24bdc7bae31584af5a7e0cb0629dda9bb2b1d613a40e92e227e0d13cb326ef4"
},
{
"name": "OPERATOR_IMAGE_VERSION",
"value": "4.8.19"
},
{
"name": "OPERAND_IMAGE_VERSION",
"value": "4.8.19"
},
{
"name": "KUBE_APISERVER_OPERATOR_IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0e56e34f980552a7ce3d55429a9a265307dc89da11c29f6366b34369cc2a9ba0"
}
],
"resources": {
"requests": {
"cpu": "10m",
"memory": "50Mi"
}
}
}
]
}
}
}
},
// other items omitted...
],
"kind": "List",
"metadata": {
"resourceVersion": "",
"selfLink": ""
}
}
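When the full listing is too large to read comfortably, narrowing the query to a single object keeps the JSON manageable. A sketch, using the deployment and namespace shown in the output above:
~ kubectl get deployment openshift-apiserver-operator -n openshift-apiserver-operator -o json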
Using custom-columns to query nodes
Let's explore the custom-columns output option of the kubectl get command to retrieve just the information we need. The custom-columns option allows us to define which data will be extracted by mapping a column heading to the desired field.
We'll use the node JSON as the base for building our query.
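You can inspect a node's full object yourself with a command such as the following (the node name is one from the sample cluster above); the JSON below is trimmed to the fields relevant to our query:
~ kubectl get node ip-10-0-219-74.xyz.compute.internal -o json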
{
"apiVersion": "v1",
"kind": "Node",
"metadata": {
"name": "ip-10-0-219-74.xyz.compute.internal"
},
"status": {
"addresses": [
{
"address": "10.0.219.74",
"type": "InternalIP"
},
{
"address": "ip-10-0-219-74.xyz.compute.internal",
"type": "Hostname"
},
{
"address": "ip-10-0-219-74.xyz.compute.internal",
"type": "InternalDNS"
}
],
"conditions": [
{
"message": "kubelet has sufficient memory available",
"reason": "KubeletHasSufficientMemory",
"status": "False",
"type": "MemoryPressure"
},
{
"message": "kubelet has no disk pressure",
"reason": "KubeletHasNoDiskPressure",
"status": "False",
"type": "DiskPressure"
},
{
"message": "kubelet has sufficient PID available",
"reason": "KubeletHasSufficientPID",
"status": "False",
"type": "PIDPressure"
},
{
"message": "kubelet is posting ready status",
"reason": "KubeletReady",
"status": "True",
"type": "Ready"
}
],
"nodeInfo": {
"architecture": "amd64",
"bootID": "327671fc-3d6f-4bc4-ab5f-fa012687e839",
"containerRuntimeVersion": "cri-o://1.21.3-8.rhaos4.8.git7415a53.el8",
"kernelVersion": "4.18.0-305.19.1.el8_4.x86_64",
"kubeProxyVersion": "v1.21.1+6438632",
"kubeletVersion": "v1.21.1+6438632",
"machineID": "ec2e23b2f3d554c78f67dc2e30ba230a",
"operatingSystem": "linux",
"osImage": "Red Hat Enterprise Linux CoreOS 48.84.202110270303-0 (Ootpa)",
"systemUUID": "ec2e23b2-f3d5-54c7-8f67-dc2e30ba230a"
}
}
}
A simple query using custom-columns to return the names of the cluster nodes:
~ kubectl get nodes -o custom-columns="Name:.metadata.name"
Name
ip-10-0-135-204.xyz.compute.internal
ip-10-0-142-176.xyz.compute.internal
ip-10-0-160-187.xyz.compute.internal
ip-10-0-176-188.xyz.compute.internal
ip-10-0-214-226.xyz.compute.internal
ip-10-0-219-74.xyz.compute.internal
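Columns can be combined freely and the header text is arbitrary. As a quick sketch, adding the kubelet version (the .status.nodeInfo.kubeletVersion field shown in the node JSON above):
~ kubectl get nodes -o custom-columns="Name:.metadata.name,Kubelet:.status.nodeInfo.kubeletVersion"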
To query values from a group, such as the node addresses (InternalIP, Hostname, InternalDNS), we can use the notation .status.addresses[*].address:
~ kubectl get nodes -o custom-columns="Name:.metadata.name,Addresses:.status.addresses[*].address"
Name Addresses
ip-10-0-135-204.xyz.compute.internal 10.0.135.204,ip-10-0-135-204.xyz.compute.internal,ip-10-0-135-204.xyz.compute.internal
ip-10-0-142-176.xyz.compute.internal 10.0.142.176,ip-10-0-142-176.xyz.compute.internal,ip-10-0-142-176.xyz.compute.internal
ip-10-0-160-187.xyz.compute.internal 10.0.160.187,ip-10-0-160-187.xyz.compute.internal,ip-10-0-160-187.xyz.compute.internal
ip-10-0-176-188.xyz.compute.internal 10.0.176.188,ip-10-0-176-188.xyz.compute.internal,ip-10-0-176-188.xyz.compute.internal
ip-10-0-214-226.xyz.compute.internal 10.0.214.226,ip-10-0-214-226.xyz.compute.internal,ip-10-0-214-226.xyz.compute.internal
ip-10-0-219-74.xyz.compute.internal 10.0.219.74,ip-10-0-219-74.xyz.compute.internal,ip-10-0-219-74.xyz.compute.internal
If we want a specific value from a group, we can reference it by its index. Putting that together, our node health query becomes:
~ kubectl get nodes -o custom-columns="Name:.metadata.name,InternalIP:.status.addresses[0].address,Kernel:.status.nodeInfo.kernelVersion,MemoryPressure:.status.conditions[0].status,DiskPressure:.status.conditions[1].status,PIDPressure:.status.conditions[2].status,Ready:.status.conditions[3].status"
Name Kernel InternalIP MemoryPressure DiskPressure PIDPressure Ready
ip-10-0-135-204.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.135.204 False False False True
ip-10-0-142-176.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.142.176 False False False True
ip-10-0-160-187.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.160.187 False False False True
ip-10-0-176-188.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.176.188 False False False True
ip-10-0-214-226.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.214.226 False False False True
ip-10-0-219-74.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.219.74 False False False True
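Note that addressing conditions by index assumes they always appear in the same order. If you prefer to match conditions by type, kubectl's jsonpath output supports filter expressions; a minimal sketch for the Ready condition:
~ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'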
Creating a query collection
With our custom query ready, we can save the field mapping to a file for easy reuse. The file format is a line of headers followed by a line of field expressions:
HEADER1 HEADER2 HEADER3
.field.value1 .field.value2 .field.value3
For our query, the file, which we'll call cluster-nodes-health.txt, would be:
Name Kernel InternalIP MemoryPressure DiskPressure PIDPressure Ready
.metadata.name .status.nodeInfo.kernelVersion .status.addresses[0].address .status.conditions[0].status .status.conditions[1].status .status.conditions[2].status .status.conditions[3].status
And we can perform the query using the custom-columns-file option:
~ kubectl get nodes -o custom-columns-file=cluster-nodes-health.txt
Name Kernel InternalIP MemoryPressure DiskPressure PIDPressure Ready
ip-10-0-135-204.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.135.204 False False False True
ip-10-0-142-176.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.142.176 False False False True
ip-10-0-160-187.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.160.187 False False False True
ip-10-0-176-188.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.176.188 False False False True
ip-10-0-214-226.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.214.226 False False False True
ip-10-0-219-74.xyz.compute.internal 4.18.0-305.19.1.el8_4.x86_64 10.0.219.74 False False False True
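Since the mapping lives in a plain file, a small shell alias can turn a saved query into a one-word command. A sketch, assuming the file is kept in a hypothetical ~/kube-queries directory:
# Hypothetical helper; adjust the path to wherever you store your query files
~ alias knode-health='kubectl get nodes -o custom-columns-file="$HOME/kube-queries/cluster-nodes-health.txt"'
~ knode-health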
Using jq to query env from deployments
To query the env entries we will use the jq utility: with it we fetch the objects as JSON and filter them to show only the information we want.
About jq
jq is a lightweight and flexible command-line JSON processor.
As the jq page itself describes:
> "jq is like sed for JSON data - you can use it to split, filter, map, and transform structured data as easily as sed, awk, grep."
It can be found at: stedolan.github.io/jq
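jq ships in most package managers; for example (assuming Homebrew on macOS or apt on Debian/Ubuntu):
# macOS (Homebrew)
~ brew install jq
# Debian/Ubuntu
~ sudo apt-get install jq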
Structuring the jq command
Let's start with a basic jq query that iterates over the deployments in .items[] and extracts just their names from .metadata.name:
~ kubectl get deployments --all-namespaces -o json | jq -r '.items[] | .metadata.name '
openshift-apiserver-operator
apiserver
authentication-operator
# other deployments...
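As a quick sanity check before filtering, jq can also tell us how many deployments the query returned:
~ kubectl get deployments --all-namespaces -o json | jq '.items | length'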
Let's evolve our query to build a JSON object containing the namespace, name, and env of each deployment:
~ kubectl get deployments --all-namespaces -o json | jq -r '.items[] | { namespace: .metadata.namespace, name: .metadata.name, env: .spec.template.spec.containers[].env}'
{
"namespace": "openshift-apiserver-operator",
"name": "openshift-apiserver-operator",
"env": [
{
"name": "IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f532d4e20932e1e6664b1b7003691d44a511bb626bc339fd883a624f020ff399"
},
{
"name": "OPERATOR_IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a24bdc7bae31584af5a7e0cb0629dda9bb2b1d613a40e92e227e0d13cb326ef4"
},
{
"name": "OPERATOR_IMAGE_VERSION",
"value": "4.8.19"
},
{
"name": "OPERAND_IMAGE_VERSION",
"value": "4.8.19"
},
{
"name": "KUBE_APISERVER_OPERATOR_IMAGE",
"value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0e56e34f980552a7ce3d55429a9a265307dc89da11c29f6366b34369cc2a9ba0"
}
]
}
{
"namespace": "openshift-apiserver",
"name": "apiserver",
"env": [
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
},
{
"name": "POD_NAMESPACE",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.namespace"
}
}
}
]
}
{
"namespace": "openshift-apiserver",
"name": "apiserver",
"env": [
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
},
{
"name": "POD_NAMESPACE",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.namespace"
}
}
}
]
}
// other deployments omitted for readability...
To get our output as valid JSON, let's wrap the results in an array [] and use jq's map function.
~ kubectl get deployments --all-namespaces -o json | jq -r '.items | [ map(.) | .[] | { namespace: .metadata.namespace, name: .metadata.name, env: .spec.template.spec.containers[].env }]'
// output truncated for readability...
[
{
"namespace": "openshift-operator-lifecycle-manager",
"name": "catalog-operator",
"env": [
{
"name": "RELEASE_VERSION",
"value": "4.8.19"
}
]
},
{
"namespace": "openshift-operator-lifecycle-manager",
"name": "olm-operator",
"env": [
{
"name": "RELEASE_VERSION",
"value": "4.8.19"
},
{
"name": "OPERATOR_NAMESPACE",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.namespace"
}
}
},
{
"name": "OPERATOR_NAME",
"value": "olm-operator"
}
]
},
{
"namespace": "openshift-operator-lifecycle-manager",
"name": "packageserver",
"env": [
{
"name": "OPERATOR_CONDITION_NAME",
"value": "packageserver"
}
]
}
]
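As a side note, because map(.) leaves the .items array unchanged, a shorter equivalent filter produces the same result:
~ kubectl get deployments --all-namespaces -o json | jq -r '[ .items[] | { namespace: .metadata.namespace, name: .metadata.name, env: .spec.template.spec.containers[].env } ]'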
Query with a jq file
Just like with custom-columns, jq gives us the option of passing a file containing our filter instead of writing it inline. Let's create a file called jq-deployments-envs.txt with the contents:
.items | [ map(.) | .[] | { namespace: .metadata.namespace, name: .metadata.name, env: .spec.template.spec.containers[].env }]
And our query can be executed with the command:
~ kubectl get deployments --all-namespaces -o json | jq -f jq-deployments-envs.txt
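The second scenario at the beginning also asked for resource requests and limits. As a sketch of how the filter file could be extended, iterating the containers once so each container produces a single entry (the container and resources keys are names chosen here for illustration):
[ .items[] | .metadata as $m | .spec.template.spec.containers[] | { namespace: $m.namespace, name: $m.name, container: .name, env: .env, resources: .resources } ]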
Conclusion
With kubectl's native custom-columns option and the jq utility, it is possible to extract custom information from a Kubernetes cluster. Furthermore, since both accept files for assembling queries, we can create several useful views of the cluster and keep them in source control to share with other team members or with the community.
References
kubernetes.io/docs/reference/kubectl/overvi..
kubernetes.io/pt-br/docs/reference/kubectl/..
stedolan.github.io/jq/tutorial
kubernetes.io/docs/tasks/access-application..
kubernetes.io/docs/reference/kubectl/jsonpath
gist.github.com/so0k/42313dbb3b547a0f51a547..
starkandwayne.com/blog/silly-kubectl-trick-..
michalwojcik.com.pl/2021/07/04/yaml-jsonpat..
laury.dev/snippets/combine-kubectl-jsonpath..