kubelet.service: Main process exited, code=exited, status=255/n/a

Asked: 2019-03-06 10:27:22

Tags: kubernetes fedora kubelet

I am following these instructions to set up a test cluster:

https://kubernetes.io/docs/getting-started-guides/fedora/fedora_manual_config/
https://kubernetes.io/docs/getting-started-guides/fedora/flannel_multi_node_cluster/

Unfortunately, when I check my nodes, I get the following:

kubectl get no
NAME                        STATUS     ROLES     AGE       VERSION
pccshost2.lan.proficom.de   NotReady   <none>    19h       v1.10.3
pccshost3.lan.proficom.de   NotReady   <none>    19h       v1.10.3

As far as I can tell, the problem is that kubelet.service is not running properly on the master node.

systemctl status kubelet.service

kubelet.service - Kubernetes Kubelet Server
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2019-03-06 10:38:30 CET; 32min ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
  Process: 14057 ExecStart=/usr/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_API_SERVER $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBE_ALLOW_PRIV $KU>
 Main PID: 14057 (code=exited, status=255)
      CPU: 271ms

Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Consumed 271ms CPU time
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Service RestartSec=100ms expired, scheduling restart.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 5.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: Stopped Kubernetes Kubelet Server.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Consumed 271ms CPU time
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Start request repeated too quickly.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: Failed to start Kubernetes Kubelet Server.
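The status output above reports only the exit code (255); the kubelet's own error message goes to the journal. As a minimal sketch, assuming the kubelet logs through systemd-journald in the usual klog format (a severity letter followed by the month and day, such as `F0306`), the error and fatal lines can be filtered out like this. The function name `kubelet_errors` is hypothetical.

```shell
# Hypothetical helper: keep only klog error (E) and fatal (F) lines from stdin.
# klog lines start with a severity letter followed by a 4-digit date (mmdd).
kubelet_errors() {
    grep -E '^[EF][0-9]{4} ' || true
}

# Usage on a systemd host (not executed here):
#   journalctl -u kubelet.service --no-pager -o cat | kubelet_errors
```

The `-o cat` option prints only the message text of each journal entry, so the klog prefix sits at the start of the line and the pattern can be anchored there.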

kubectl describe node

  Normal  Starting                 9s    kubelet, pccshost2.lan.proficom.de  Starting kubelet.
  Normal  NodeHasSufficientDisk    9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientPID

Can anyone advise how I can fix this? Thanks.

4 answers:

Answer 0 (score: 3)

I solved the kubelet problem by adding --fail-swap-on=false to KUBELET_ARGS= in the kubelet config file. However, the problem with the nodes remains the same: their status is still NotReady.
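By default the kubelet refuses to start when swap is enabled, and `--fail-swap-on=false` overrides that check. A small sketch for checking whether swap is active at all, assuming a Linux host where `/proc/swaps` lists active swap areas under a one-line header; `swap_active` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) if the given file lists any swap area.
# Defaults to /proc/swaps, whose first line is a column header.
swap_active() {
    awk 'NR > 1 && NF > 0 { found = 1 } END { exit !found }' "${1:-/proc/swaps}"
}

# Usage (not executed here):
#   if swap_active; then echo "swap is on"; else echo "swap is off"; fi
```

If the helper reports that swap is off, the flag is unlikely to be what the kubelet is failing on.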

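Answer 1 (score: 2)

When you install a k8s cluster with kubeadm and install kubelet on the master (Ubuntu), it creates the file "10-kubeadm.conf" in /etc/systemd/system/kubelet.service.d.

File contents:

ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

The variable $KUBELET_KUBECONFIG_ARGS points to /etc/kubernetes/kubelet.conf, which contains a certificate signed by the CA. You now need to check that this certificate is still valid. If it has expired, create a new certificate with openssl and sign it with your CA.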

Steps to verify the certificate

  1. Copy the value of client-certificate-data
  2. Decode the certificate (echo -n "copied_certificate_value" | base64 --decode)
  3. Save the output to a file (vi kubelet.crt)
  4. Check the validity period (openssl x509 -in kubelet.crt -text -noout)
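The four steps above can be collapsed into one pipeline. This is only a sketch, assuming the certificate is embedded in the kubeconfig as a single-line `client-certificate-data:` entry (some setups reference a certificate file by path instead); `extract_client_cert` is a hypothetical helper name.

```shell
# Hypothetical helper: print the PEM certificate embedded in a kubeconfig file.
# Assumes a single "client-certificate-data: <base64>" entry on one line.
extract_client_cert() {
    awk '$1 == "client-certificate-data:" { print $2 }' "$1" | base64 --decode
}

# Usage (not executed here):
#   extract_client_cert /etc/kubernetes/kubelet.conf | openssl x509 -noout -enddate
```

`openssl x509 -noout -enddate` prints only the expiry date, which is usually all that is needed to decide whether the certificate has to be regenerated.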

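If the validity period has expired, create a new certificate.

Note: it is always safer to take a backup before making any changes: cp -a /etc/kubernetes/ /root/

Steps to generate a new certificate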

openssl genrsa -out kubelet.key 2048
openssl req -new -key kubelet.key -subj "/CN=kubelet" -out kubelet.csr
openssl x509 -req -in kubelet.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -out kubelet.crt -days 300

Encode the certificate files

cat kubelet.crt | base64
cat kubelet.key | base64

Copy the encoded content and update it in /etc/kubernetes/kubelet.conf.
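GNU `base64` wraps its output at 76 characters by default, while the certificate and key values in a kubeconfig are single-line strings. A hedged sketch that strips the line breaks in a portable way; `encode_one_line` is a hypothetical helper name.

```shell
# Hypothetical helper: base64-encode a file as one unbroken line.
encode_one_line() {
    base64 < "$1" | tr -d '\n'
}

# Usage (not executed here):
#   encode_one_line kubelet.crt
#   encode_one_line kubelet.key
```

Piping through `tr` rather than relying on a wrap option keeps the sketch independent of which `base64` implementation is installed.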

Now check the status of kubelet on the master node
systemctl status kubelet
systemctl restart kubelet #restart kubelet

Answer 2 (score: 0)

I ran into the same problem and found a solution here.

Basically, I had to run the following commands:

swapoff -a
kubeadm reset
kubeadm init
systemctl status kubelet

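After that, I just had to follow the on-screen instructions. My setup uses Weave Net as the pod network, so I also had to run kubectl apply -f weave-net.yaml

Answer 3 (score: 0)

I had the same problem: I could not start the kubelet service on the master node.

Running the following commands solved it for me: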

$ sudo swapoff -a

$ sudo systemctl restart kubelet.service

$ systemctl status kubelet
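Note that `swapoff -a` lasts only until the next reboot. To keep swap off, the swap entries in `/etc/fstab` are usually commented out as well. A minimal sketch, assuming swap is configured through a conventional fstab in which the filesystem type is the third field; `disable_swap_entries` is a hypothetical helper that prints the modified file to stdout rather than editing it in place.

```shell
# Hypothetical helper: print an fstab with active swap entries commented out.
# Lines that are already comments are passed through unchanged.
disable_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Usage (not executed here; review the output before replacing /etc/fstab):
#   disable_swap_entries /etc/fstab > /tmp/fstab.new
```

Writing to a separate file first leaves the original fstab untouched until the result has been checked.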