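Kubernetes cluster master node not ready

Time: 2017-09-12 09:27:42

Tags: kubernetes

I don't know why my master node is in NotReady status while all pods on the cluster run normally. I am using Kubernetes v1.7.5 with Calico as the network plugin, and the OS version is "centos7.2.1511".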

# kubectl get nodes
NAME        STATUS     AGE       VERSION
k8s-node1   Ready      1h        v1.7.5
k8s-node2   NotReady   1h        v1.7.5




# kubectl get all --all-namespaces
NAMESPACE     NAME                                           READY     STATUS    RESTARTS   AGE
kube-system   po/calico-node-11kvm                           2/2       Running   0          33m
kube-system   po/calico-policy-controller-1906845835-1nqjj   1/1       Running   0          33m
kube-system   po/calicoctl                                   1/1       Running   0          33m
kube-system   po/etcd-k8s-node2                              1/1       Running   1          15m
kube-system   po/kube-apiserver-k8s-node2                    1/1       Running   1          15m
kube-system   po/kube-controller-manager-k8s-node2           1/1       Running   2          15m
kube-system   po/kube-dns-2425271678-2mh46                   3/3       Running   0          1h
kube-system   po/kube-proxy-qlmbx                            1/1       Running   1          1h
kube-system   po/kube-proxy-vwh6l                            1/1       Running   0          1h
kube-system   po/kube-scheduler-k8s-node2                    1/1       Running   2          15m

NAMESPACE     NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       svc/kubernetes   10.96.0.1    <none>        443/TCP         1h
kube-system   svc/kube-dns     10.96.0.10   <none>        53/UDP,53/TCP   1h

NAMESPACE     NAME                              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deploy/calico-policy-controller   1         1         1            1           33m
kube-system   deploy/kube-dns                   1         1         1            1           1h

NAMESPACE     NAME                                     DESIRED   CURRENT   READY     AGE
kube-system   rs/calico-policy-controller-1906845835   1         1         1         33m
kube-system   rs/kube-dns-2425271678                   1         1         1         1h

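Update

It seems the master node cannot recognize the Calico network plugin. I used kubeadm to install the k8s cluster. Because kubeadm starts etcd on 127.0.0.1:2379 on the master node, Calico on the other nodes could not talk to etcd, so I modified etcd.yaml as follows, and now all Calico pods run fine. I am not familiar with Calico. How can I fix this?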

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --listen-client-urls=http://127.0.0.1:2379,http://10.161.233.80:2379
    - --advertise-client-urls=http://10.161.233.80:2379
    - --data-dir=/var/lib/etcd
    image: gcr.io/google_containers/etcd-amd64:3.0.17
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2379
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: certs
    - mountPath: /var/lib/etcd
      name: etcd
    - mountPath: /etc/kubernetes
      name: k8s
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/ssl/certs
    name: certs
  - hostPath:
      path: /var/lib/etcd
    name: etcd
  - hostPath:
      path: /etc/kubernetes
    name: k8s
status: {}

[root@k8s-node2 calico]# kubectl describe node k8s-node2
Name:                   k8s-node2
Role:
Labels:                 beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/os=linux
                        kubernetes.io/hostname=k8s-node2
                        node-role.kubernetes.io/master=
Annotations:            node.alpha.kubernetes.io/ttl=0
                        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:                 node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:      Tue, 12 Sep 2017 15:20:57 +0800
Conditions:
  Type                  Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----                  ------  -----------------                       ------------------                      ------                          -------
  OutOfDisk             False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  MemoryPressure        False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure          False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasNoDiskPressure        kubelet has no disk pressure
  Ready                 False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletNotReady                 runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:   10.161.233.80
  Hostname:     k8s-node2
Capacity:
 cpu:           2
 memory:        3618520Ki
 pods:          110
Allocatable:
 cpu:           2
 memory:        3516120Ki
 pods:          110
System Info:
 Machine ID:                    3c6ff97c6fbe4598b53fd04e08937468
 System UUID:                   C6238BF8-8E60-4331-AEEA-6D0BA9106344
 Boot ID:                       84397607-908f-4ff8-8bdc-ff86c364dd32
 Kernel Version:                3.10.0-514.6.2.el7.x86_64
 OS Image:                      CentOS Linux 7 (Core)
 Operating System:              linux
 Architecture:                  amd64
 Container Runtime Version:     docker://1.12.6
 Kubelet Version:               v1.7.5
 Kube-Proxy Version:            v1.7.5
PodCIDR:                        10.68.0.0/24
ExternalID:                     k8s-node2
Non-terminated Pods:            (5 in total)
  Namespace                     Name                                            CPU Requests    CPU Limits      Memory Requests Memory Limits
  ---------                     ----                                            ------------    ----------      --------------- -------------
  kube-system                   etcd-k8s-node2                                  0 (0%)          0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-apiserver-k8s-node2                        250m (12%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-controller-manager-k8s-node2               200m (10%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-proxy-qlmbx                                0 (0%)          0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-scheduler-k8s-node2                        100m (5%)       0 (0%)          0 (0%)          0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  550m (27%)    0 (0%)          0 (0%)          0 (0%)
Events:         <none>

3 Answers:

Answer 0 (score: 8)

It is good practice to run the describe command to see what is wrong with the node:

kubectl describe nodes <NODE_NAME>

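For example: kubectl describe nodes k8s-node2. From there you should be able to start investigating and, if needed, add more information to this question.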
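
To pull just the Ready row out of that output, a small filter can help. This is a minimal sketch rather than part of the original answer: `ready_condition` is a hypothetical helper name, and it assumes the describe output keeps the Conditions table layout shown in the question (Type in the first column, Status in the second).

```shell
# ready_condition (hypothetical helper): read `kubectl describe node` output
# on stdin and print only the Conditions row whose Type is "Ready".
ready_condition() {
  awk '$1 == "Ready" && ($2 == "True" || $2 == "False" || $2 == "Unknown")'
}

# Usage, on a machine with cluster access:
#   kubectl describe nodes k8s-node2 | ready_condition
```

For the node in the question, this row carries the message "network plugin is not ready: cni config uninitialized", which points at the CNI setup on that node.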

Answer 1 (score: 4)

You need to install a network policy provider. One of the supported providers is Weave Net for NetworkPolicy. To install it from the command line:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

After a few seconds, a Weave Net pod should be running on each node, and any further pods you create will be connected to the Weave network automatically.

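Answer 2 (score: 0)

I think you may need to add tolerations and update the annotations for calico-node in the manifest you are using, so that it can run on a master created by kubeadm. Kubeadm taints the master so that pods cannot run on it unless they have a toleration for that taint.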

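I believe you are using the https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/calico.yaml manifest, which has the annotations (including tolerations) for K8s v1.5. You should check https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml, which has the toleration syntax for K8s v1.6+.

Here is a snippet from that manifest with the annotations and tolerations: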

    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
        # reserves resources for critical add-on pods so that they can be rescheduled after
        # a failure.  This annotation works in tandem with the toleration below.
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
      # This, along with the annotation above marks this pod as a critical add-on.
      - key: CriticalAddonsOnly
        operator: Exists
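
One way to confirm this on the master is to check whether a CNI config file was ever written there, since the kubelet reports "cni config uninitialized" when it finds none. The sketch below is an illustration under assumptions: `has_cni_config` is a hypothetical helper, `/etc/cni/net.d` is assumed to be the kubelet's CNI config directory (your kubelet may be started with a different `--cni-conf-dir`), and the file extensions checked are the ones I believe the kubelet loads. As far as I know, the hosted Calico manifests write this config from a container in the calico-node pod, so if that pod is never scheduled on the master the directory stays empty.

```shell
# has_cni_config (hypothetical helper): succeed if DIR contains at least one
# CNI config file. DIR defaults to /etc/cni/net.d (assumed kubelet default).
has_cni_config() {
  dir="${1:-/etc/cni/net.d}"
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays a literal pattern, so test for a real file.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Usage, run on the master node itself:
#   has_cni_config && echo "CNI config present" || echo "no CNI config found"
```

If this reports no config and `kubectl get pods -n kube-system -o wide` shows no calico-node pod on the master, the missing toleration is a likely cause.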