Google Managed VM module stuck in a reboot loop

Date: 2015-12-09 01:09:15

Tags: python google-app-engine google-cloud-platform managed-vm google-managed-vm

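I have been trying to add a new App Engine module that uses Managed VMs instead of the default GAE sandbox. The goal is a module in which I can run newer versions of SciPy and NumPy, and which my user-facing module can call. I have successfully built and run my Docker image/container locally, but I have run into many problems when trying to deploy a custom version to Google's servers.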

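The following is the serial console output from an instance of the Managed VM module, which keeps rebooting because of problems that appear to be outside my control.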

Has anyone else run into this? Am I missing something in the configuration/deployment process?

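FWIW: I have been using GAE for several years and even contributed to it during my time at Google. I also have experience with modules and Docker. The documentation and tooling around Managed VMs seem immature at the moment, and I have lost the motivation to keep fighting with them. I need help.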

Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
[   24.288054] docker0: port 1(veth8d67b7c) entered forwarding state
Dec  9 00:52:56 gae-mvm-vmv7-tsia kernel: [   24.288054] docker0: port 1(veth8d67b7c) entered forwarding state
Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
gcm-Heartbeat:1449622390000
Dec 09 00:53:13 ERROR: Timed out while trying to pull appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7 from registry!
===== Unexpected error during VM startup =====
=== Dump of VM runtime system logs follows ===
WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_instances. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_proxy_image_name. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_extra_nginx_confs. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_redirect_appengine_googleapis_com. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_http_loadbalancer_enabled. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_loadbalancer. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_loadbalancer_ip. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_memcache_proxy. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_monitoring_image_name. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_use_cloud_monitoring. Will treat it as an empty string.
vm_runtime_init: Dec 09 00:52:40 Invoking all VM runtime components. /dev/fd/63
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'allow_ssh'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: Done start 'allow_ssh'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'unlocker'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: Done start 'unlocker'.
vm_runtime_init: Dec 09 00:52:40 vm_runtime_init: start 'fluentd_logger'.
vm_runtime_init: Dec 09 00:52:41 vm_runtime_init: Done start 'fluentd_logger'.
vm_runtime_init: Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
vm_runtime_init: Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Dec 09 00:53:13 ERROR: Timed out while trying to pull appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7 from registry!
Dec 09 00:52:40 Invoking all VM runtime components. /dev/fd/63
Dec 09 00:52:40 vm_runtime_init: start 'allow_ssh'.
Dec 09 00:52:40 vm_runtime_init: Done start 'allow_ssh'.
Dec 09 00:52:40 vm_runtime_init: start 'unlocker'.
Dec 09 00:52:40 vm_runtime_init: Done start 'unlocker'.
Dec 09 00:52:40 vm_runtime_init: start 'fluentd_logger'.
8aa8a33b8daa451d5595b951aeecad772a23d65b6592ac07cae6265cc74b6312
Dec 09 00:52:41 vm_runtime_init: Done start 'fluentd_logger'.
Dec 09 00:52:41 vm_runtime_init: start 'pull_app'.
Dec 09 00:52:57 Pulling GAE_FULL_APP_CONTAINER: appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Pulling metadata
16d49c9e1091: Pulling fs layer
16d49c9e1091: Download complete
54f405e77b26: Pulling metadata
54f405e77b26: Pulling fs layer
54f405e77b26: Download complete
36e2f6c710be: Pulling metadata
36e2f6c710be: Pulling fs layer
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
Using default tag: latest
Pulling repository appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7
b626d012b369: Pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/
b626d012b369: Pulling dependent layers
643a001c5ee0: Download complete
559718b5f880: Download complete
8f8068a6a6b4: Download complete
16d49c9e1091: Download complete
54f405e77b26: Download complete
36e2f6c710be: Download complete
e8aed8091139: Pulling metadata
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, endpoint: https://appengine.gcr.io/v1/, Untar re-exec error: exit status 1: output: unexpected EOF
b626d012b369: Error pulling image (latest) from appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7, Untar re-exec error: exit status 1: output: unexpected EOF
Retrying docker pull.
CONTAINER ID        IMAGE                                    COMMAND                  CREATED             STATUS              PORTS               NAMES
8aa8a33b8daa        gcr.io/google_appengine/fluentd-logger   "/opt/google-fluentd/"   33 seconds ago      Up 32 seconds                           insane_panini
Container: 8aa8a33b8daa
========= rebooting. ========================

INIT: 
INIT: Sending processes the TERM signal


INIT: Sending processes the KILL signal

Dec  9 00:53:14 gae-mvm-vmv7-tsia init: Switching to runlevel: 1
gcm-StatusUpdate:TIME=1449622394000;STATUS=COMMAND_FAILED;INVOCATION_ID=0
[info] Using makefile-style concurrent boot in runlevel 1.
Dec  9 00:53:15 gae-mvm-vmv7-tsia rpc.statd[1758]: Caught signal 15, un-registering and exiting
Dec  9 00:53:15 gae-mvm-vmv7-tsia google: shutdown script found in metadata.
[ ok ] Stopping NFS common utilities: idmapd statd.
Dec  9 00:53:15 gae-mvm-vmv7-tsia shutdownscript: Running shutdown script /var/run/google.shutdown.script
Dec  9 00:53:15 gae-mvm-vmv7-tsia rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
[ ok ] Stopping rpcbind daemon....
Stopping supervisor: supervisord.
udhcpd: Disabled. Edit /etc/default/udhcpd to enable it.
[ ok ] Unmounting iscsi-backed filesystems: Unmounting all devices marked _netdev.
Dec  9 00:53:16 gae-mvm-vmv7-tsia iscsid: iscsid shutting down.
[ ok ] Unmounting iscsi-backed filesystems: Unmounting all devices marked _netdev.
[ ok ] Disconnecting iSCSI targets:iscsiadm: No matching sessions found.
[ ok ] Stopping iSCSI initiator service:.
Dec  9 00:53:16 gae-mvm-vmv7-tsia shutdownscript: Finished running shutdown script /var/run/google.shutdown.script
[ ok ] Stopping Docker: docker.
[ ok ] Stopping The Kubernetes container manager: kubelet.
[ ok ] Stopping enhanced syslogd: rsyslogd.
[   54.500923] docker0: port 1(veth8d67b7c) entered disabled state
[   54.512521] docker0: port 1(veth8d67b7c) entered disabled state
[   54.522249] device veth8d67b7c left promiscuous mode
[   54.527554] docker0: port 1(veth8d67b7c) entered disabled state
Terminating on signal number 15
Traceback (most recent call last):
  File "/usr/share/google/google_daemon/manage_accounts.py", line 94, in <module>
    options.daemon, options.force, options.debug)
  File "/usr/share/google/google_daemon/manage_accounts.py", line 65, in Main
    manager_daemon.StartDaemon()
  File "/usr/share/google/google_daemon/accounts_manager_daemon.py", line 73, in StartDaemon
    self.accounts_manager.Main()
  File "/usr/share/google/google_daemon/accounts_manager.py", line 87, in Main
    writer.close()
IOError: [Errno 32] Broken pipe
[....] Asking all remaining processes to terminate...acpid: exiting

[ ok ] done.
[ ok ] All processes ended within 1 seconds....done.
[[36minfo[3
INIT: Sending processes the TERM signal


INIT: Sending processes the KILL signal

sulogin: root account is locked, starting shell
root@gae-mvm-vmv7-tsia:~# 

Edit: additional information from shutdown.log follows. The docker logs command is not run anywhere in my code or Dockerfiles; I believe this is a bug in how Google ends up invoking it.

2015-12-08 17:08:22.194 Sending SIGUSR1 to fluentd to trigger a log flush.
2015-12-08 17:08:22.194 605e9f1ad747e63560fdc28a8c7f3c276d77255edd0a65381f7ad2f9f8eafd2a
2015-12-08 17:08:22.194 ---------------------------------------------------------------------
2015-12-08 17:08:22.194 ---------------App was unhealthy, grabbing debug logs----------------
2015-12-08 17:08:22.194 --------------------------App stdout/stderr--------------------------
2015-12-08 17:08:22.194 /usr/share/vm_runtime/vm_shutdown.sh: line 22: /var/run/app.cid: No such file or directory
2015-12-08 17:08:22.194 docker: "logs" requires 1 argument.
2015-12-08 17:08:22.194 See 'docker logs --help'.
2015-12-08 17:08:22.194 
2015-12-08 17:08:22.194 Usage:  docker logs [OPTIONS] CONTAINER
2015-12-08 17:08:22.194 
2015-12-08 17:08:22.194 Fetch the logs of a container
2015-12-08 17:08:22.194 ---------------------------------------------------------------------
2015-12-08 17:08:22.194 --------------------------Tail of app logs---------------------------
2015-12-08 17:08:22.194 tail: cannot open `/var/log/app_engine/app/app.0.log.json' for reading: No such file or directory
2015-12-08 17:08:22.194 ---------------------------------------------------------------------

2 answers:

Answer 0 (score: 2)

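Edit

This answer is mostly correct, but its inference, based on the curl output, that the image does not exist is flawed. python-compat should work for you, but python is also a valid image, as running docker pull gcr.io/google_appengine/python shows: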

$ docker pull gcr.io/google_appengine/python
Pulling repository gcr.io/google_appengine/python
ac7db0912786: Download complete 
643a001c5ee0: Download complete 
559718b5f880: Download complete 
8f8068a6a6b4: Download complete 
16d49c9e1091: Download complete 
54f405e77b26: Download complete 
36e2f6c710be: Download complete 
e8aed8091139: Download complete 
8f0415d8e4e9: Download complete 
15ed20635873: Download complete 
6d70c8850a43: Download complete 
93ae290c32a1: Download complete 
7f766358fa71: Download complete 
7f4a74c30dc4: Download complete 
b51802c69e61: Download complete 
Status: Downloaded newer image for gcr.io/google_appengine/python:latest

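In a discussion with contributor jonparrott on the GitHub repo for the gcr.io/google_appengine/python Docker image, the relationship between the various Python Managed VM / custom runtime Docker images was clarified in a comment.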

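So I think the problem you are seeing here is that the VM cannot become healthy, which could be related to the image you are building from, your Dockerfile, your application code, or the infrastructure. The first three are more likely than the last, but an infrastructure problem is not inconceivable. The "Untar re-exec error: exit status 1: output: unexpected EOF" error appears to be the first visible sign of the problem in the serial console output.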
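To narrow down where the pull breaks, it can help to count which layer the "Error downloading dependent layers" lines point at. The following is a minimal sketch, not an official tool: the function name `failing_layers` and the sample text are illustrative, and the regular expression assumes the 12-character hexadecimal layer IDs seen in the console output from the question.

```python
import re
from collections import Counter

# Matches docker pull progress lines such as
# "e8aed8091139: Error downloading dependent layers".
# Assumption: layer IDs are 12 lowercase hex characters, as in the log above.
LAYER_LINE = re.compile(r"^(?P<layer>[0-9a-f]{12}): (?P<status>.+)$")


def failing_layers(log_text):
    """Count, per layer ID, how often the pull reported a download error."""
    failures = Counter()
    for line in log_text.splitlines():
        match = LAYER_LINE.match(line.strip())
        if match and match.group("status") == "Error downloading dependent layers":
            failures[match.group("layer")] += 1
    return failures


# A short excerpt in the same shape as the serial console output.
sample = """\
36e2f6c710be: Download complete
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
Retrying docker pull.
e8aed8091139: Pulling fs layer
e8aed8091139: Error downloading dependent layers
"""

print(failing_layers(sample))
```

In the question's log every retry fails on layer e8aed8091139, and the same layer ID appears in the docker pull output for gcr.io/google_appengine/python shown earlier in this answer, where it downloads without error.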

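It may be worth filing an issue on the Google Cloud Platform Public Issue Tracker, including your Dockerfile, your application code (if necessary), the time frame (for example, whether the failure is only intermittent), and similar details.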

Original answer

If you check the container registry, the image you are trying to use in FROM does not exist:

curl -v -X HEAD https://gcr.io/google_appengine/python
* Hostname was NOT found in DNS cache
*   Trying 74.125.193.82...
* Connected to gcr.io (74.125.193.82) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server key exchange (12):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*        subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.googlecode.com
*        start date: 2015-12-02 14:40:36 GMT
*        expire date: 2016-03-01 00:00:00 GMT
*        subjectAltName: gcr.io matched
*        issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*        SSL certificate verify ok.
> HEAD /google_appengine/python HTTP/1.1
> User-Agent: curl/7.35.0
> Host: gcr.io
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Date: Thu, 10 Dec 2015 19:14:52 GMT
< Content-Type: text/html; charset=UTF-8
* Server Docker Registry is not blacklisted
< Server: Docker Registry
< Content-Length: 1584
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< Alternate-Protocol: 443:quic,p=0
< Alt-Svc: clear
< 

The container registry image that actually exists for the python runtime is:

gcr.io/google_appengine/python-compat

This may be the key to resolving the deployment failure.

Answer 1 (score: 1)

I don't know the answer, but I have some ideas.

Missing Image?

Dec 09 00:53:13 ERROR: Timed out while trying to pull appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7 from registry!

This indicates a problem getting the Docker image from the registry. It doesn't look like a 404 Not Found error. Nevertheless, I'd try the following:

Make sure you can pull the image from your development machine using gcloud docker pull. If not, push it to the registry.
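As a rough sketch of that check, the snippet below wraps the gcloud docker pull command mentioned above so it can be scripted. The helper names `build_pull_command` and `can_pull` are hypothetical, the image name is copied from the console log in the question, and the exact gcloud docker syntax may differ between Cloud SDK versions, so treat the argument list as an assumption.

```python
import subprocess

# Image name as it appears in the serial console output of the question.
IMAGE = "appengine.gcr.io/389129677035831115/jt-calc.mvm.vmv7"


def build_pull_command(image):
    """Return the argument list for pulling `image` through gcloud."""
    # Assumption: the "gcloud docker pull <image>" form from the answer.
    return ["gcloud", "docker", "pull", image]


def can_pull(image, runner=subprocess.call):
    """Return True if the pull command exits with status 0.

    `runner` is injectable so the check can be exercised without
    gcloud or Docker installed.
    """
    return runner(build_pull_command(image)) == 0


# Show the command that would be run, without executing it.
print(" ".join(build_pull_command(IMAGE)))
```

Calling `can_pull(IMAGE)` on a development machine with the Cloud SDK and Docker installed would run the pull; a False result would suggest the image is missing from the registry or cannot be downloaded completely.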

Using App Engine Base Image?

WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_instances. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_cloud_sql_proxy_image_name. Will treat it as an empty string.
WARNING: HTTP 404 error while fetching metadata key gae_extra_nginx_confs. Will treat it as an empty string.

This sounds like the local metadata server isn't running or properly configured. My guess is this means your custom docker image isn't using one of the standard base images, in particular the Python base image. Try updating your Dockerfile to use the standard python base image.
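One thing to keep in mind when reading those warnings: a 404 is still an HTTP response, so something answered the request, and the log says the missing key is simply treated as an empty string. The sketch below mirrors that behavior for a single key. It assumes the gae_* keys are instance attributes served from the standard Compute Engine metadata path, which the log does not confirm, and `fetch_attribute` is a hypothetical helper name.

```python
import urllib.error
import urllib.request

# Standard Compute Engine metadata path for instance attributes.
# Assumption: the gae_* keys from the log are stored as instance attributes.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/"
)


def fetch_attribute(key, opener=urllib.request.urlopen):
    """Return the value of a metadata attribute, or "" if it is not set.

    A 404 means the server responded but has no such key; any other
    HTTP error or a connection failure is re-raised to the caller.
    """
    request = urllib.request.Request(
        METADATA_URL + key, headers={"Metadata-Flavor": "Google"}
    )
    try:
        with opener(request, timeout=2) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return ""
        raise
```

Run from inside the VM, `fetch_attribute("gae_cloud_sql_instances")` would distinguish an unset key (empty string) from an unreachable metadata server (an exception other than a 404).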
