Kubernetes metrics-server Installation

Metrics Server is a cluster-wide aggregator of resource usage data. It collects metrics such as CPU and memory consumption for containers and nodes from the Summary API, which the kubelet exposes on each node.

Installation:

The latest Metrics Server release can be installed by running:

root@master:~# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

After installation, you may see the following error:

root@master:~# kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
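The ServiceUnavailable error suggests that the API server cannot yet reach a healthy backend for the metrics API. As a rough sketch, the availability of that API can be checked directly. The function name metrics_api_status is only illustrative, and v1beta1.metrics.k8s.io is assumed to be the APIService name registered by components.yaml:

```shell
# Print the "Available" condition of the metrics APIService ("True" or "False").
# Assumption: the APIService is registered as v1beta1.metrics.k8s.io.
metrics_api_status() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Usage: metrics_api_status
```

A value of False here points at the metrics-server pod itself, which is what the next steps look at.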

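To find the cause, first list the cluster resources and check the state of the metrics-server pod. The output below comes from the kube-system namespace, so if that is not your current namespace, add -n kube-system to the command: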

root@master:~# kubectl get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/calico-kube-controllers-744cfdf676-w6l4v   1/1     Running   2          4d1h
pod/calico-node-7db46                          1/1     Running   3          4d1h
pod/calico-node-8fckx                          1/1     Running   3          4d1h
pod/calico-node-r6shp                          0/1     Running   2          4d1h
pod/coredns-74ff55c5b-2bps6                    1/1     Running   3          4d2h
pod/coredns-74ff55c5b-8v4n4                    1/1     Running   3          4d2h
pod/etcd-master                                1/1     Running   4          4d2h
pod/kube-apiserver-master                      1/1     Running   4          4d2h
pod/kube-controller-manager-master             1/1     Running   4          4d2h
pod/kube-proxy-8cqc8                           1/1     Running   4          4d2h
pod/kube-proxy-8grm9                           1/1     Running   4          4d2h
pod/kube-proxy-z89hf                           1/1     Running   3          4d2h
pod/kube-scheduler-master                      1/1     Running   4          4d2h
pod/metrics-server-5d5c49f488-vs2qs            0/1     Running   1          38s

NAME                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns         ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP,9153/TCP   4d2h
service/metrics-server   ClusterIP   10.97.59.212   <none>        443/TCP                  38s

NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/calico-node   3         3         2       3            2           kubernetes.io/os=linux   4d1h
daemonset.apps/kube-proxy    3         3         3       3            3           kubernetes.io/os=linux   4d2h

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/calico-kube-controllers   1/1     1            1           4d1h
deployment.apps/coredns                   2/2     2            2           4d2h
deployment.apps/metrics-server            0/1     1            0           38s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/calico-kube-controllers-744cfdf676   1         1         1       4d1h
replicaset.apps/coredns-74ff55c5b                    2         2         2       4d2h
replicaset.apps/metrics-server-5d5c49f488            1         1         0       38s

To see the logs of the metrics-server pod, use the following command (substitute the pod name from your own output):

root@master:~# kubectl logs -f metrics-server-5d5c49f488-rqhq8
I0202 05:21:04.376628 1 secure_serving.go:197] Serving securely on [::]:4443
I0202 05:21:04.376731 1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0202 05:21:04.376752 1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0202 05:21:04.376883 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0202 05:21:04.376904 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0202 05:21:04.376965 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0202 05:21:04.377194 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0202 05:21:04.377489 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0202 05:21:04.377536 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0202 05:21:04.383340 1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node worker1: unable to fetch metrics from node worker1: Get "https://192.168.0.21:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.0.21 because it doesn't contain any IP SANs, unable to fully scrape metrics from node master: unable to fetch metrics from node master: Get "https://192.168.0.20:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.0.20 because it doesn't contain any IP SANs, unable to fully scrape metrics from node worker2: unable to fetch metrics from node worker2: Get "https://192.168.0.22:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.0.22 because it doesn't contain any IP SANs]
I0202 05:21:04.477151 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::
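The x509 error is easy to miss in a long log. As a small sketch, the affected nodes can be pulled out of the log text with standard tools. The function name failed_nodes is hypothetical, and the pattern assumes the error wording shown in the log above:

```shell
# Read metrics-server log text on stdin and print, once each, the nodes
# whose scrape failed on a line that mentions an x509 error.
failed_nodes() {
  grep 'x509' \
    | grep -o 'unable to fetch metrics from node [^:]*' \
    | awk '{ print $NF }' \
    | sort -u
}

# Usage (assumes metrics-server runs in kube-system):
#   kubectl -n kube-system logs deployment/metrics-server | failed_nodes
```

Reading the logs through deployment/metrics-server avoids copying the generated pod name, which changes whenever the pod is recreated.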

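To fix this, edit the metrics-server Deployment (for example in the components.yaml file, then apply it again) and add the following lines to the container args, below --kubelet-use-node-status-port. The first flag makes Metrics Server skip verification of the kubelet serving certificates, so it is best kept to test clusters. The second sets the order of node address types it tries when connecting to a kubelet: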

    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP

The args section of the file then looks like this:

spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-use-node-status-port
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
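If you would rather not edit the file by hand, the same arguments can probably be added with kubectl patch. This is a sketch under assumptions: the function name add_metrics_server_arg is illustrative, the Deployment is assumed to live in kube-system, and metrics-server is assumed to be the first container (index 0) in the pod template:

```shell
# Append one argument to the args list of the first container in the
# metrics-server Deployment using a JSON patch.
# Assumptions: namespace kube-system, metrics-server is container index 0.
add_metrics_server_arg() {
  kubectl -n kube-system patch deployment metrics-server --type=json \
    -p "[{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/args/-\",\"value\":\"$1\"}]"
}

# Usage:
#   add_metrics_server_arg --kubelet-insecure-tls
#   add_metrics_server_arg --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
```

The patch appends to the end of the args list rather than placing the flags directly below --kubelet-use-node-status-port; the order of these flags should not matter. Each patch triggers a new rollout of the Deployment.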

Verify that the metrics-server deployment is running the desired number of pods with the following command:

root@master:~# kubectl get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/calico-kube-controllers-744cfdf676-w6l4v   1/1     Running   2          4d2h
pod/calico-node-7db46                          1/1     Running   3          4d2h
pod/calico-node-8fckx                          1/1     Running   3          4d2h
pod/calico-node-r6shp                          0/1     Running   2          4d2h
pod/coredns-74ff55c5b-2bps6                    1/1     Running   3          4d3h
pod/coredns-74ff55c5b-8v4n4                    1/1     Running   3          4d3h
pod/etcd-master                                1/1     Running   4          4d3h
pod/kube-apiserver-master                      1/1     Running   4          4d3h
pod/kube-controller-manager-master             1/1     Running   4          4d3h
pod/kube-proxy-8cqc8                           1/1     Running   4          4d3h
pod/kube-proxy-8grm9                           1/1     Running   4          4d3h
pod/kube-proxy-z89hf                           1/1     Running   3          4d3h
pod/kube-scheduler-master                      1/1     Running   4          4d3h
pod/metrics-server-7945c98585-hljzc            1/1     Running   0          2m11s

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns         ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP   4d3h
service/metrics-server   ClusterIP   10.102.116.151   <none>        443/TCP                  17m

NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/calico-node   3         3         2       3            2           kubernetes.io/os=linux   4d2h
daemonset.apps/kube-proxy    3         3         3       3            3           kubernetes.io/os=linux   4d3h

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/calico-kube-controllers   1/1     1            1           4d2h
deployment.apps/coredns                   2/2     2            2           4d3h
deployment.apps/metrics-server            1/1     1            1           17m

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/calico-kube-controllers-744cfdf676   1         1         1       4d2h
replicaset.apps/coredns-74ff55c5b                    2         2         2       4d3h
replicaset.apps/metrics-server-5d5c49f488            0         0         0       17m
replicaset.apps/metrics-server-6489d6f8c6            0         0         0       12m
replicaset.apps/metrics-server-7945c98585            1         1         1       2m11s
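Instead of scanning the full listing, the same check can be scripted. This is a minimal sketch; the function name metrics_server_ready is hypothetical and the Deployment is assumed to be in kube-system unless another namespace is passed:

```shell
# Succeed only when the metrics-server Deployment reports as many ready
# replicas as it desires. Assumption: it runs in kube-system unless a
# namespace is passed as the first argument.
metrics_server_ready() {
  ns="${1:-kube-system}"
  desired=$(kubectl -n "$ns" get deployment metrics-server -o jsonpath='{.spec.replicas}')
  ready=$(kubectl -n "$ns" get deployment metrics-server -o jsonpath='{.status.readyReplicas}')
  # readyReplicas is omitted from the status while no pod is ready.
  [ -n "$desired" ] && [ "${ready:-0}" = "$desired" ]
}

# Usage: metrics_server_ready && echo "metrics-server is ready"
```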

Finally, confirm that resource metrics are now being served:

root@master:~# kubectl top nodes
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
master    228m         5%     1311Mi          15%
worker1   99m          4%     841Mi           10%
worker2   105m         5%     1043Mi          12%
root@master:~# kubectl top pods
NAME                                       CPU(cores)   MEMORY(bytes)
calico-kube-controllers-744cfdf676-w6l4v   2m           12Mi
calico-node-7db46                          30m          96Mi
calico-node-8fckx                          34m          99Mi
calico-node-r6shp                          27m          95Mi
coredns-74ff55c5b-2bps6                    3m           9Mi
coredns-74ff55c5b-8v4n4                    3m           8Mi
etcd-master                                15m          55Mi
kube-apiserver-master                      70m          281Mi
kube-controller-manager-master             20m          48Mi
kube-proxy-8cqc8                           1m           19Mi
kube-proxy-8grm9                           1m           16Mi
kube-proxy-z89hf                           1m           16Mi
kube-scheduler-master                      4m           16Mi
metrics-server-7945c98585-hljzc            3m           12Mi
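Once kubectl top works, its output can be filtered with ordinary text tools. As an illustration only, the sketch below lists nodes whose memory usage is above a chosen percentage. The function name nodes_above_memory is hypothetical, and it assumes the column layout of the kubectl top nodes output shown above:

```shell
# Read `kubectl top nodes` output on stdin and print the nodes whose
# MEMORY% (5th column) is above the threshold given as the first argument.
nodes_above_memory() {
  awk -v limit="$1" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                  # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1  # compare numerically
  }'
}

# Usage: kubectl top nodes | nodes_above_memory 80
```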