Mesos slaves reject all Marathon jobs with persistent volumes, claiming no space is available

Time: 2016-03-27 00:11:45

Tags: mesos marathon

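I'm trying to use Mesos's persistent volume support and am having a hard time getting it to work.

I have configured each slave as follows and confirmed that each one restarted successfully with the new configuration: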

/etc/mesos-slave/resources

[
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "PATH",
        "path" : { "root" : "/mnt/disk1" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "PATH",
        "path" : { "root" : "/mnt/disk2" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "PATH",
        "path" : { "root" : "/mnt/disk3" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "PATH",
        "path" : { "root" : "/mnt/disk4" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "PATH",
        "path" : { "root" : "/mnt/disk5" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "MOUNT",
        "mount" : { "root" : "/mnt/disk6" }
      }
    }
  },
  {
    "name" : "disk",
    "type" : "SCALAR",
    "scalar" : { "value" : 4194304 },
    "disk" : {
      "source" : {
        "type" : "MOUNT",
        "mount" : { "root" : "/mnt/disk7" }
      }
    }
  }
]

The master reports that I have unreserved resources. Specifically (full response here):

{
  ...
  "slaves": [{
    "id": "c5e59876-5157-463f-b31e-16b34d6ffc72-S8",
    "pid": "slave(1)@172.30.31.55:5051",
    "hostname": "redacted47.redacted.com",
    "registered_time": 1458810586.61153,
    "resources": {
      "cpus": 32,
      "disk": 29360128,
      "mem": 256651,
      "ports": "[31000-32000]"
    },
    "used_resources": {
      "cpus": 1,
      "disk": 0,
      "mem": 128,
      "ports": "[31282-31282]"
    },
    "offered_resources": {
      "cpus": 0,
      "disk": 0,
      "mem": 0
    },
    "reserved_resources": {},
    "unreserved_resources": {
      "cpus": 32,
      "disk": 29360128,
      "mem": 256651,
      "ports": "[31000-32000]"
    },

Whenever I submit a job that requests a persistent volume, all of the slaves reject it, claiming that no disk resources are available:

Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9375]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9376]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Finished processing 2220b6bf-aac2-402b-82e6-8d625284d1a4-O9375. Matched 0 ops after 1 passes. disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; cpus(*) 28.0; mem(*) 226955.0; ports(*) 31000->31085,31087->31364,31366->31940,31942->32000 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-11)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9379]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)

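If I issue a request to create the volume directly against the Mesos master, the request is rejected with "Insufficient disk resources", as shown here: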

# curl -v -i \
    -u "marathon:$(cat /etc/marathon/.secret)" \
    -d slaveId=c5e59876-5157-463f-b31e-16b34d6ffc72-S8 \
    -d volumes='[
      {
        "name": "disk",
        "type": "SCALAR",
        "scalar": { "value": 512 },
        "role": "foo",
        "reservation": {
          "principal": "marathon"
        },
        "disk": {
          "persistence": {
            "id" : "very-persist"
          },
          "volume": {
            "mode": "RW",
            "container_path": "such-path"
          }
        }
      }
    ]' \
    -X POST http://localhost:5050/master/create-volumes; echo
* About to connect() to localhost port 5050 (#0)
*   Trying ::1...
* Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5050 (#0)
* Server auth using Basic with user 'marathon'
> POST /master/create-volumes HTTP/1.1
> Authorization: Basic redacted
> User-Agent: curl/7.29.0
> Host: localhost:5050
> Accept: */*
> Content-Length: 481
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 481 out of 481 bytes
< HTTP/1.1 409 Conflict
HTTP/1.1 409 Conflict
< Date: Thu, 24 Mar 2016 09:50:36 GMT
Date: Thu, 24 Mar 2016 09:50:36 GMT
< Content-Length: 53
Content-Length: 53
<
* Connection #0 to host localhost left intact
Invalid CREATE Operation: Insufficient disk resources

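I'm at my wits' end. I don't know what I'm doing, and I'm trying my best to follow the documentation. Any hints about what I might be doing wrong would be greatly appreciated.

I'm running: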

  • Mesos 0.28.0
  • Marathon 1.0.0RC1

I have followed the instructions in the following resources as closely as I could:

Thanks for reading!

2 answers:

Answer 0 (score: 2)

First of all, thank you for such a well-documented question!

Your problems seem to be the following:

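a) No root disk resource is available. Once you specify disk resources manually, Mesos stops auto-detecting the root disk. You can fix this by simply adding a root disk resource as described here.

b) Your "create-volumes" HTTP request above only considers root disk resources (which, for the reason above, you don't have). If you want to use a non-root disk, you should specify the source field, as very briefly mentioned here.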
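To make both points concrete, here is a minimal Python sketch; treat it as an illustration under stated assumptions, not a verified fix. The sizes, paths, role, and principal are copied from the question. The helper names and the root-disk size of 10240 are hypothetical. The two resource shapes it builds (a root disk as a disk entry with no source, and a volume that repeats the source of the disk it lives on) are my reading of the Mesos multiple-disk documentation, so please check them against the docs for your version.

```python
import json

# Numbers and paths copied from the question.
CUSTOM_DISK_SIZE = 4194304      # size of each configured PATH/MOUNT disk
REPORTED_TOTAL_DISK = 29360128  # "disk" under "resources" in the master's response
PATH_ROOTS = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4", "/mnt/disk5"]
MOUNT_ROOTS = ["/mnt/disk6", "/mnt/disk7"]

# ASSUMPTION: hypothetical root-disk size; choose one that fits your agent.
ROOT_DISK_SIZE = 10240


def sourced_disk(kind, root, size):
    """Disk resource with a source, in the same shape as the question's config."""
    return {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": size},
        "disk": {"source": {"type": kind, kind.lower(): {"root": root}}},
    }


def root_disk(size):
    """ASSUMPTION: a root disk is a disk resource without a "disk" source."""
    return {"name": "disk", "type": "SCALAR", "scalar": {"value": size}}


def build_resources(root_size):
    """Root disk entry followed by the seven disks from the question."""
    resources = [root_disk(root_size)]
    resources += [sourced_disk("PATH", r, CUSTOM_DISK_SIZE) for r in PATH_ROOTS]
    resources += [sourced_disk("MOUNT", r, CUSTOM_DISK_SIZE) for r in MOUNT_ROOTS]
    return resources


def volume_on(kind, root, size, role, principal, persistence_id, container_path):
    """ASSUMPTION: a volume on a non-root disk repeats that disk's source."""
    volume = sourced_disk(kind, root, size)
    volume["role"] = role
    volume["reservation"] = {"principal": principal}
    volume["disk"]["persistence"] = {"id": persistence_id}
    volume["disk"]["volume"] = {"mode": "RW", "container_path": container_path}
    return volume


# (a) The reported total equals the seven custom disks, so no root disk remains.
custom_total = CUSTOM_DISK_SIZE * (len(PATH_ROOTS) + len(MOUNT_ROOTS))
print("total is only the custom disks:", custom_total == REPORTED_TOTAL_DISK)

# Candidate contents for the resources file, now with a root disk entry first.
resources = build_resources(ROOT_DISK_SIZE)
print(json.dumps(resources[0]))

# (b) Candidate "volumes" payload that targets /mnt/disk1 instead of the root disk.
volumes = [volume_on("PATH", "/mnt/disk1", 512, "foo", "marathon",
                     "very-persist", "such-path")]
print(json.dumps(volumes, indent=2))
```

The first check is simple arithmetic on the numbers in the question: 7 x 4194304 = 29360128, which is exactly the total the master reports, so nothing in that total comes from a root disk. That is consistent with Marathon seeing 0.0 available for "disk including volumes".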

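BTW, any feedback on how to improve the documentation is welcome (I will add a short note about this issue, but any feedback from users is very helpful)! Feel free to contribute here!

Hope this helps!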

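Answer 1 (score: 0)

Sorry, I can't add comments.

I find the documentation a bit daunting. It is detailed and there is a lot of it, but I'm trying to learn Mesos, Marathon, etc. in my own time, and without examples it is really hard for me. What I would prefer is a page showing a small cluster, with IP addresses, disks, and CPUs, along with the configuration files needed to set up the masters, agents, and ZooKeeper ensemble, plus some sample JSON files showing how to use Marathon for specific use cases.

My goal is to keep some notes for myself in my public GitHub account, showing my test cluster and explaining how everything is configured once I have persistent volumes working, with Jenkins and a private Docker registry all running in Mesos, but I'm a long way from that.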