Pod stuck in Pending state

Date: 2017-09-26 10:49:28

Tags: kubernetes google-kubernetes-engine

I have a Kubernetes deployment in which I'm trying to run 5 Docker containers in a single pod on a single node. The pod is stuck in "Pending" status and never gets scheduled. I don't mind running more than 1 pod, but I'd like to keep the node count down. I assumed that 1 node with 1 CPU and 1.7 GB of RAM would be enough for 5 containers, and I'm trying to spread the workload.

Initially I concluded that I was short on resources, so I enabled node autoscaling, which produced the following (see the kubectl describe pod output):

    pod didn't trigger scale-up (it wouldn't fit if a new node is added)

In any case, each Docker container runs a single simple command for a fairly simple application. Ideally I'd rather not deal with setting CPU and RAM resource allocations at all, but even with per-container CPU/memory requests and limits set within bounds, so that they don't add up to more than 1 CPU, I still get the following (see kubectl describe po/test-529945953-gh6cl):

    No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1).
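
For reference, the per-container resources block in the Deployment manifest presumably looks something like the sketch below. The container name, image, command, and values are taken from the describe output further down; the exact manifest is an assumption:

    # one of the five containers in the pod spec (sketch, not the actual manifest)
    containers:
      - name: container-test2-tickers
        image: gcr.io/testing-11111/testology:latest
        command: ["process_cmd", "arg1", "test2"]
        resources:
          requests:        # what the scheduler reserves on a node
            cpu: 100m
            memory: 375Mi
          limits:          # hard caps enforced at runtime
            cpu: 150m
            memory: 375Mi
      # ...plus four more containers with the same requests/limits

Note that scheduling is driven by the requests, not the limits: for this pod the scheduler needs a single node with 5 x 100m of CPU and 5 x 375Mi of memory unreserved.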

Below are the outputs of various commands showing the current state. Any help with what I'm doing wrong would be much appreciated.


kubectl get all

user_s@testing-11111:~/gce$ kubectl get all
NAME                          READY     STATUS    RESTARTS   AGE
po/test-529945953-gh6cl   0/5       Pending   0          34m

NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
svc/kubernetes   10.7.240.1   <none>        443/TCP   19d

NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/test   1         1         1            0           34m

NAME                    DESIRED   CURRENT   READY     AGE
rs/test-529945953   1         1         0         34m
user_s@testing-11111:~/gce$

kubectl describe po/test-529945953-gh6cl

user_s@testing-11111:~/gce$ kubectl describe po/test-529945953-gh6cl
Name:           test-529945953-gh6cl
Namespace:      default
Node:           <none>
Labels:         app=test
                pod-template-hash=529945953
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-529945953","uid":"c6e889cb-a2a0-11e7-ac18-42010a9a001a"...
Status:         Pending
IP:
Created By:     ReplicaSet/test-529945953
Controlled By:  ReplicaSet/test-529945953
Containers:
  container-test2-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      test2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-kraken-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-gdax-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-bittrex-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  cloudsql-proxy:
    Image:      gcr.io/cloudsql-docker/gce-proxy:1.09
    Port:       <none>
    Command:
      /cloud_sql_proxy
      --dir=/cloudsql
      -instances=testing-11111:europe-west2:testology=tcp:5432
      -credential_file=/secrets/cloudsql/credentials.json
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:              100m
      memory:           375Mi
    Environment:        <none>
    Mounts:
      /cloudsql from cloudsql (rw)
      /etc/ssl/certs from ssl-certs (rw)
      /secrets/cloudsql from cloudsql-instance-credentials (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
Conditions:
  Type          Status
  PodScheduled  False
Volumes:
  cloudsql-instance-credentials:
    Type:       Secret (a volume populated by a Secret)
    SecretName: cloudsql-instance-credentials
    Optional:   false
  ssl-certs:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/ssl/certs
  cloudsql:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-b2mxc:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-b2mxc
    Optional:   false
QoS Class:      Burstable
Node-Selectors: <none>
Tolerations:    node.alpha.kubernetes.io/notReady:NoExecute for 300s
                node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                    -------------   --------        ------                  -------
  27m           17m             44      default-scheduler                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (2).
  26m           8s              150     cluster-autoscaler                      Normal          NotTriggerScaleUp       pod didn't trigger scale-up (it wouldn't fit if a new node is added)
  16m           2s              63      default-scheduler                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1).
user_s@testing-11111:~/gce$


kubectl get nodes

user_s@testing-11111:~/gce$ kubectl get nodes
NAME                                      STATUS    AGE       VERSION
gke-test-default-pool-abdf83f7-p4zw   Ready     9h        v1.6.7

kubectl get pods

user_s@testing-11111:~/gce$ kubectl get pods
NAME                       READY     STATUS    RESTARTS   AGE
test-529945953-gh6cl   0/5       Pending   0          38m

kubectl describe nodes

user_s@testing-11111:~/gce$ kubectl describe nodes
Name:                   gke-test-default-pool-abdf83f7-p4zw
Role:
Labels:                 beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/fluentd-ds-ready=true
                        beta.kubernetes.io/instance-type=g1-small
                        beta.kubernetes.io/os=linux
                        cloud.google.com/gke-nodepool=default-pool
                        failure-domain.beta.kubernetes.io/region=europe-west2
                        failure-domain.beta.kubernetes.io/zone=europe-west2-c
                        kubernetes.io/hostname=gke-test-default-pool-abdf83f7-p4zw
Annotations:            node.alpha.kubernetes.io/ttl=0
                        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:                 <none>
CreationTimestamp:      Tue, 26 Sep 2017 02:05:45 +0100
Conditions:
  Type                  Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----                  ------  -----------------                       ------------------                      ------                          -------
  NetworkUnavailable    False   Tue, 26 Sep 2017 02:06:05 +0100         Tue, 26 Sep 2017 02:06:05 +0100         RouteCreated                    RouteController created a route
  OutOfDisk             False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  MemoryPressure        False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure          False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasNoDiskPressure        kubelet has no disk pressure
  Ready                 True    Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:06:05 +0100         KubeletReady                    kubelet is posting ready status. AppArmor enabled
  KernelDeadlock        False   Tue, 26 Sep 2017 11:33:12 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KernelHasNoDeadlock             kernel has no deadlock
Addresses:
  InternalIP:   10.154.0.2
  ExternalIP:   35.197.217.1
  Hostname:     gke-test-default-pool-abdf83f7-p4zw
Capacity:
 cpu:           1
 memory:        1742968Ki
 pods:          110
Allocatable:
 cpu:           1
 memory:        1742968Ki
 pods:          110
System Info:
 Machine ID:                    e6119abf844c564193495c64fd9bd341
 System UUID:                   E6119ABF-844C-5641-9349-5C64FD9BD341
 Boot ID:                       1c2f2ea0-1f5b-4c90-9e14-d1d9d7b75221
 Kernel Version:                4.4.52+
 OS Image:                      Container-Optimized OS from Google
 Operating System:              linux
 Architecture:                  amd64
 Container Runtime Version:     docker://1.11.2
 Kubelet Version:               v1.6.7
 Kube-Proxy Version:            v1.6.7
PodCIDR:                        10.4.1.0/24
ExternalID:                     6073438913956157854
Non-terminated Pods:            (7 in total)
  Namespace                     Name                                                            CPU Requests    CPU Limits      Memory Requests Memory Limits
  ---------                     ----                                                            ------------    ----------      --------------- -------------
  kube-system                   fluentd-gcp-v2.0-k565g                                          100m (10%)      0 (0%)          200Mi (11%)     300Mi (17%)
  kube-system                   heapster-v1.3.0-3440173064-1ztvw                                138m (13%)      138m (13%)      301456Ki (17%)  301456Ki (17%)
  kube-system                   kube-dns-1829567597-gdz52                                       260m (26%)      0 (0%)          110Mi (6%)      170Mi (9%)
  kube-system                   kube-dns-autoscaler-2501648610-7q9dd                            20m (2%)        0 (0%)          10Mi (0%)       0 (0%)
  kube-system                   kube-proxy-gke-test-default-pool-abdf83f7-p4zw              100m (10%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kubernetes-dashboard-490794276-25hmn                            100m (10%)      100m (10%)      50Mi (2%)       50Mi (2%)
  kube-system                   l7-default-backend-3574702981-flqck                             10m (1%)        10m (1%)        20Mi (1%)       20Mi (1%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  728m (72%)    248m (24%)      700816Ki (40%)  854416Ki (49%)
Events:         <none>

1 Answer:

Answer 0 (score: 1)

As you can see under Allocated resources: in the output of your kubectl describe nodes command, 728m (72%) of the node's CPU and 700816Ki (40%) of its memory are already requested, by the pods in the kube-system namespace. Your test pod's resource requests, summed across its five containers, exceed the CPU and memory that remain on the node, which is why the scheduler reports Insufficient cpu and Insufficient memory under Events in the output of your kubectl describe po/test-529945953-gh6cl command.
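
To make the arithmetic concrete (all figures from the describe outputs above):

    Node allocatable:        1000m CPU    1742968Ki (~1702Mi) memory
    kube-system requests:     728m CPU     700816Ki  (~684Mi) memory
    Remaining:                272m CPU              (~1018Mi) memory

    Test pod requests:   5 x 100m = 500m CPU    5 x 375Mi = 1875Mi memory

    500m   > 272m    -> Insufficient cpu
    1875Mi > 1018Mi  -> Insufficient memory

In fact, 1875Mi of memory requests would not fit even on an empty node of this size (1702Mi allocatable), which is exactly why the cluster autoscaler reports "pod didn't trigger scale-up (it wouldn't fit if a new node is added)".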

If you want to keep all of the containers in a single pod, you will need to reduce their resource requests or run them on a node type with more CPU and memory. A better solution, though, is to split your application into multiple pods, which lets the work be distributed across several nodes; a sketch of that approach follows.
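
For example, you could run each ticker process as its own single-container Deployment, so the scheduler (and the cluster autoscaler) can place each one wherever it fits. A minimal sketch, with the name, image, and command assumed from the pod description above; on a v1.6 cluster the apiVersion would be apps/v1beta1 rather than apps/v1:

    apiVersion: apps/v1        # use apps/v1beta1 on Kubernetes 1.6
    kind: Deployment
    metadata:
      name: test2-tickers
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test2-tickers
      template:
        metadata:
          labels:
            app: test2-tickers
        spec:
          containers:
            - name: container-test2-tickers
              image: gcr.io/testing-11111/testology:latest
              command: ["process_cmd", "arg1", "test2"]
              resources:
                requests:
                  cpu: 100m
                  memory: 375Mi
                limits:
                  cpu: 150m
                  memory: 375Mi

Keep in mind that each split-out pod still needs a path to the database: either add the cloudsql-proxy container as a sidecar to every Deployment, or run the proxy once behind a Service that the other pods connect to.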