Mounting PVC volumes to a pod times out

Time: 2018-04-04 22:20:35

Tags: amazon-web-services kubernetes

I am trying to deploy a StatefulSet that mounts a Persistent Volume.

I installed Kubernetes on AWS via kops.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T11:55:20Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}

According to this issue I need to create the PVCs first:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: zk-data-claim
spec:
  storageClassName: default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: zk-logs-claim
spec:
  storageClassName: default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

The default storage class exists, and the PVCs successfully bind PVs:

$ kubectl get sc
NAME            PROVISIONER             AGE
default         kubernetes.io/aws-ebs   20d
gp2 (default)   kubernetes.io/aws-ebs   20d
ssd (default)   kubernetes.io/aws-ebs   20d

$ kubectl get pvc
NAME            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
zk-data-claim   Bound     pvc-5584fdf7-3853-11e8-a73b-02bb35448afe   2Gi        RWO            default        11m
zk-logs-claim   Bound     pvc-5593e249-3853-11e8-a73b-02bb35448afe   2Gi        RWO            default        11m
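
For the record, the EBS volume id behind each claim can be read off the bound PV (PV name taken from the kubectl get pvc output above; the field applies to the in-tree aws-ebs provisioner):

$ kubectl get pv pvc-5584fdf7-3853-11e8-a73b-02bb35448afe \
    -o jsonpath='{.spec.awsElasticBlockStore.volumeID}'
aws://eu-west-1a/vol-...   # illustrative output shape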

I can see both volumes in the EC2 EBS volume list, initially as "available" and later as "in-use".

And then reference them in my StatefulSet:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-cluster
  replicas: 3
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      volumes:
        - name: zk-data
          persistentVolumeClaim:
            claimName: zk-data-claim
        - name: zk-logs
          persistentVolumeClaim:
            claimName: zk-logs-claim

      containers:
      ....
        volumeMounts:
        - name: zk-data
          mountPath: /opt/zookeeper/data
        - name: zk-logs
          mountPath: /opt/zookeeper/logs
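
(Aside: with replicas: 3, all three pods share the same two ReadWriteOnce claims, and an EBS-backed RWO volume can only be attached to one node at a time, so replicas scheduled onto other nodes cannot mount it. The per-replica alternative would be volumeClaimTemplates; a minimal sketch under the same apps/v1beta1 spec:)

  volumeClaimTemplates:
  - metadata:
      name: zk-data
    spec:
      storageClassName: default
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi

Each replica then gets its own claim, named e.g. zk-data-zk-0.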

It fails with:
Unable to mount volumes for pod "zk-0_default(83b8dc93-3850-11e8-a73b-02bb35448afe)": timeout expired waiting for volumes to attach/mount for pod "default"/"zk-0". list of unattached/unmounted volumes=[zk-data zk-logs]

I am working in the default namespace.
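
For reference, the attach/mount progress can be watched through the pod events, e.g.:

$ kubectl describe pod zk-0
$ kubectl get events --sort-by=.metadata.creationTimestamp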

Any ideas what could be causing this failure?

3 answers:

Answer 0 (score: 1)

The problem was that my cluster had been built with C5 nodes. C5 and M5 instances follow a different (NVMe) device-naming convention, and the device names were not recognized.

Recreating the cluster with t2-type nodes fixed it.
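
(For context, on C5/M5 instances the attached EBS volumes appear as NVMe block devices rather than /dev/xvdX, which the 1.9-era in-tree AWS volume code does not expect. An illustrative check on the node, with sizes and minor numbers made up:)

$ lsblk
NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0  64G  0 disk
`-nvme0n1p1 259:1    0  64G  0 part /
nvme1n1     259:2    0   2G  0 disk   <- the attached EBS volume, no /dev/xvdf alias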

Answer 1 (score: 0)

Yes, this is a very, very well-known issue with AWS and Kubernetes. Most of the time it is caused by a stale directory on another node, which leaves the EBS volume still "in use" from that other Node's point of view, so the Linux machine will not let go of the device when the AWS API asks it to. You will see a lot of discussion about this in the kubelet.service journal, both on the machine that has the EBS volume and on the machine that wants it.

In my experience, only ssh-ing into the Node that currently has the EBS volume attached, finding the mounts, umount-ing them, and then waiting for the exponential backoff timer to expire will fix it :-(

The hand-wavy version is:

## cleaning up stale docker containers might not be a terrible idea
docker rm $(docker ps -aq -f status=exited)

## identify any (and there could very well be multiple) mounts
## of the EBS device-name
mount | awk '/dev\/xvdf/ {print $2}' | xargs umount

## or sometimes kubernetes will actually name the on-disk directory ebs~ so:
mount | awk '/ebs~something/{print $2}' | xargs umount

You may experience some success along the way involving systemctl restart kubelet.service, but hopefully(!) cleaning up the exited containers will remove the need for such a thing.
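
To find out which instance still holds the attachment before ssh-ing in, the AWS CLI helps; the volume id below is a placeholder, the real one can be read off the bound PV:

$ aws ec2 describe-volumes --volume-ids vol-0123456789abcdef0 \
    --query 'Volumes[0].Attachments[].[InstanceId,State,Device]' --output table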

Answer 2 (score: 0)

Solved the problem this way:

systemctl daemon-reload
systemctl restart kubelet

kubeadm reset
# remove swap
iptables -F
swapoff -a
free -m

kubeadm reset
kubeadm init --pod-network-cidr=10.244.0.0/16

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

kubectl get pods --all-namespaces

NAMESPACE     NAME                            READY     STATUS    RESTARTS   AGE
kube-system   etcd-tonny                      1/1       Running   0          1m
kube-system   kube-apiserver-tonny            1/1       Running   0          1m
kube-system   kube-controller-manager-tonny   1/1       Running   0          1m
kube-system   kube-dns-86f4d74b45-cq94w       0/3       Pending   0          1m
kube-system   kube-proxy-hfvbl                1/1       Running   0          1m
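
(kube-dns stays Pending until a pod-network add-on is installed; given the --pod-network-cidr=10.244.0.0/16 above that is presumably flannel, applied at the time with something like:)

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
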
#62229