Theano and Celery: Worker exited prematurely: signal 11 (SIGSEGV)

Asked: 2015-12-14 19:27:45

Tags: segmentation-fault celery theano

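I am building a web application in which the client sends an ajax request to start training a neural network, implemented with theano, on the server side. Obviously, I don't want to wait for the server to finish training the network before it sends a response back to my client, because that would take far too long.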

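So I turned to Celery, which lets me run code asynchronously on the server side. I start the Celery worker with the command celery -A CBIR worker -l info. Unfortunately, every time the worker runs my task (which trains my network with theano), I get the following message: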

[2015-12-14 19:15:06,790: ERROR/MainProcess] Process 'Worker-3' pid:1610 exited with 'signal 11 (SIGSEGV)'
[2015-12-14 19:15:07,001: ERROR/MainProcess] Task fit[ac40d4d4-5b56-4278-b270-647ef76f3a49] raised unexpected: WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)
Traceback (most recent call last):
  File "/Users/leo/anaconda/envs/ImgRet/lib/python3.5/site-packages/billiard/pool.py", line 1175, in mark_as_worker_lost
    human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).

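I have been looking into the possible causes of this error, and as far as I understand, the code I am running suffers from a memory leak. What I don't understand is why my code runs without any problem when I don't use Celery, yet fails with this error when I do.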

Above all, I don't know how to fix it. I used lldb to inspect the generated dump file, and this is the backtrace I got:

thread #1: tid = 0x0000, 0x00007fff93b4a9b3 libdispatch.dylib`dispatch_group_async + 533, stop reason = signal SIGSTOP
* frame #0: 0x00007fff93b4a9b3 libdispatch.dylib`dispatch_group_async + 533
frame #1: 0x00007fff7c5b8d40 libdispatch.dylib`_dispatch_root_queues + 1280
frame #2: 0x00007fff9519b228 libBLAS.dylib`APL_dgemm + 1100
frame #3: 0x00007fff951d27aa libBLAS.dylib`cblas_dgemm + 1420
frame #4: 0x0000000104beeb18 multiarray.cpython-35m-darwin.so`gemm + 200
frame #5: 0x0000000104bee3b9 multiarray.cpython-35m-darwin.so`cblas_matrixproduct + 3097
frame #6: 0x0000000104bc01af multiarray.cpython-35m-darwin.so`PyArray_MatrixProduct2 + 207
frame #7: 0x0000000104bc4808 multiarray.cpython-35m-darwin.so`array_matrixproduct + 264
frame #8: 0x00000001000671a9 libpython3.5m.dylib`PyCFunction_Call + 281
frame #9: 0x00000001000f2fbd libpython3.5m.dylib`PyEval_EvalFrameEx + 32029
frame #10: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #11: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #12: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #13: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #14: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #15: 0x00000001000e95a7 libpython3.5m.dylib`PyEval_CallObjectWithKeywords + 87
frame #16: 0x00000001042fae3a lazylinker_ext.so`pycall(self=0x0000000108fad3d8, node_idx=13, verbose=0) + 442 at mod.cpp:510
frame #17: 0x00000001042fa869 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=24, one=0x000000010026cf60, zero=0x000000010026cf40) + 2089 at mod.cpp:704
frame #18: 0x00000001042fa789 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=28, one=0x000000010026cf60, zero=0x000000010026cf40) + 1865 at mod.cpp:690
frame #19: 0x00000001042fa16d lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=30, one=0x000000010026cf60, zero=0x000000010026cf40) + 301 at mod.cpp:576
frame #20: 0x00000001042fa789 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=33, one=0x000000010026cf60, zero=0x000000010026cf40) + 1865 at mod.cpp:690
frame #21: 0x00000001042fa789 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=36, one=0x000000010026cf60, zero=0x000000010026cf40) + 1865 at mod.cpp:690
frame #22: 0x00000001042fa789 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=41, one=0x000000010026cf60, zero=0x000000010026cf40) + 1865 at mod.cpp:690
frame #23: 0x00000001042fa789 lazylinker_ext.so`lazy_rec_eval(self=0x0000000108fad3d8, var_idx=42, one=0x000000010026cf60, zero=0x000000010026cf40) + 1865 at mod.cpp:690
frame #24: 0x00000001042f83db lazylinker_ext.so`CLazyLinker_call(_self=0x0000000108fad3d8, args=0x0000000100382048, kwds=0x0000000000000000) + 811 at mod.cpp:838
frame #25: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #26: 0x00000001000ed08c libpython3.5m.dylib`PyEval_EvalFrameEx + 7660
frame #27: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #28: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #29: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #30: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #31: 0x000000010002a79c libpython3.5m.dylib`method_call + 140
frame #32: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #33: 0x0000000100080743 libpython3.5m.dylib`slot_tp_call + 67
frame #34: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #35: 0x00000001000ed08c libpython3.5m.dylib`PyEval_EvalFrameEx + 7660
frame #36: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #37: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #38: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #39: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #40: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #41: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #42: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #43: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #44: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #45: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #46: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #47: 0x000000010002a79c libpython3.5m.dylib`method_call + 140
frame #48: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #49: 0x0000000100080743 libpython3.5m.dylib`slot_tp_call + 67
frame #50: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #51: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #52: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #53: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #54: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #55: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #56: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #57: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #58: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #59: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #60: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #61: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #62: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #63: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #64: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #65: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #66: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #67: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #68: 0x000000010002a79c libpython3.5m.dylib`method_call + 140
frame #69: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #70: 0x0000000100080471 libpython3.5m.dylib`slot_tp_init + 81
frame #71: 0x000000010007b114 libpython3.5m.dylib`type_call + 212
frame #72: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #73: 0x00000001000ed08c libpython3.5m.dylib`PyEval_EvalFrameEx + 7660
frame #74: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #75: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #76: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #77: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #78: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #79: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #80: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #81: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #82: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #83: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #84: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #85: 0x000000010002a79c libpython3.5m.dylib`method_call + 140
frame #86: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #87: 0x0000000100080471 libpython3.5m.dylib`slot_tp_init + 81
frame #88: 0x000000010007b114 libpython3.5m.dylib`type_call + 212
frame #89: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #90: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #91: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #92: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #93: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #94: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #95: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #96: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #97: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #98: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #99: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #100: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #101: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #102: 0x00000001000f4ef7 libpython3.5m.dylib`PyEval_EvalCodeEx + 71
frame #103: 0x0000000100041d2a libpython3.5m.dylib`function_call + 186
frame #104: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #105: 0x000000010002a79c libpython3.5m.dylib`method_call + 140
frame #106: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #107: 0x0000000100080743 libpython3.5m.dylib`slot_tp_call + 67
frame #108: 0x000000010000d783 libpython3.5m.dylib`PyObject_Call + 99
frame #109: 0x00000001000eff0b libpython3.5m.dylib`PyEval_EvalFrameEx + 19563
frame #110: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #111: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #112: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #113: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #114: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #115: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #116: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #117: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #118: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #119: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #120: 0x00000001000f3d26 libpython3.5m.dylib`PyEval_EvalFrameEx + 35462
frame #121: 0x00000001000f4053 libpython3.5m.dylib`PyEval_EvalFrameEx + 36275
frame #122: 0x00000001000f4df0 libpython3.5m.dylib`_PyEval_EvalCodeWithName + 2400
frame #123: 0x00000001000f4f51 libpython3.5m.dylib`PyEval_EvalCode + 81
frame #124: 0x0000000100123d4e libpython3.5m.dylib`PyRun_FileExFlags + 206
frame #125: 0x0000000100123fef libpython3.5m.dylib`PyRun_SimpleFileExFlags + 447
frame #126: 0x000000010013c7d7 libpython3.5m.dylib`Py_Main + 3479
frame #127: 0x0000000100000e92 python3`main + 418
frame #128: 0x0000000100000cc4 python3`start + 52

I really don't know how to interpret this backtrace. Thanks in advance for your help!

3 answers:

Answer 0 (score: 3)

In case anyone runs into the same problem, the fix is to import the theano library inline, inside the task, rather than at module level.

Like this:

import baz
import bar

@app.task
def foo():
    import theano

    # do something with theano

See here for a more detailed explanation.

Answer 1 (score: 1)

FWIW, this also happens with sklearn.cluster.KMeans. If I create the thread myself with threading.Thread, it works fine. If I try to call fit under a Celery worker, I get sig11.

I don't have the same problem with sklearn.linear_model.LogisticRegression, Ridge, or LinearRegression.
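
For readers who want to try the threading.Thread route mentioned above, here is a minimal sketch using only the standard library. The helper name run_in_thread and the placeholder fit function are assumptions for illustration; the real call would be something like KMeans(...).fit(samples). Note that a plain thread does not give you Celery's queuing or retries.

```python
import threading


def run_in_thread(func, *args):
    """Start func(*args) in a background thread.

    Returns the thread and a dict that will hold the result under "value".
    """
    result = {}

    def target():
        result["value"] = func(*args)

    thread = threading.Thread(target=target)
    thread.start()
    return thread, result


def fit(samples):
    # Hypothetical placeholder for the real training call.
    return sum(samples) / len(samples)


thread, result = run_in_thread(fit, [1.0, 2.0, 3.0])
thread.join()  # a web handler would return right away instead of joining
print(result["value"])
```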

Answer 2 (score: 1)

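Apart from importing the app from .celery (from <project>.celery import app), I removed every package import from the top of my tasks.py file and imported the packages inside the individual task functions instead. It worked.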
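
A rough sketch of what such a tasks.py layout might look like. So that it runs on its own, _StubApp is a hypothetical stand-in for the real Celery app (in a real project that line would be from <project>.celery import app), and the standard-library modules statistics and json stand in for heavy libraries such as theano.

```python
import functools


class _StubApp:
    """Hypothetical stand-in for the Celery app, so this sketch runs without Celery."""

    def task(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper


# In a real tasks.py this would be: from <project>.celery import app
app = _StubApp()


@app.task
def fit(samples):
    # Imported when the task body runs, i.e. inside the worker process,
    # not when tasks.py is first loaded.
    import statistics  # stand-in for `import theano`
    return statistics.mean(samples)


@app.task
def describe(samples):
    import json  # each task imports only what it needs
    return json.dumps({"n": len(samples)})
```

The only module-level dependency left is the app itself; everything else is loaded by the process that actually executes the task.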