我正在尝试编写一个使用OpenCL执行蒙特卡罗模拟的程序。我遇到了一个涉及指数的问题。当变量步骤的值变大(大约20000)时,指数的计算意外失败,程序退出“Abort Trap:6”。考虑到步骤不应影响内存分配,这似乎是一个奇怪的错误。我已经尝试将正常, alpha 和 beta 设置为0,但这并不能解决问题,但会将指针注释掉并将其替换为常数1似乎解决了这个问题。我在AWS GPU实例上运行了我的代码,它没有遇到任何问题。有没有人想知道为什么这可能是集成显卡的问题?
解
在较小的范围内多次执行内核,以使内核执行时间低于5秒
代码段
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
static uint MWC64X(uint2 *state) {
enum { A = 4294883355U };
uint x = (*state).x, c = (*state).y;
uint res = x ^ c;
uint hi = mul_hi(x, A);
x = x * A + c;
c = hi + (x < c);
*state = (uint2)(x, c);
return res;
}
__kernel void discreteMonteCarloKernel(...) {
float cumulativeWalk = stockPrice;
float currentValue = stockPrice;
...
uint n = get_global_id(0);
uint2 seed2 = (uint2)(n, seed);
uint random1 = MWC64X(&seed2);
uint2 seed3 = (uint2)(random1, seed);
uint random2 = MWC64X(&seed3);
float alpha = (interestRate - 0.5 * sigma * sigma) * dt;
float beta = sigma * sqrt(dt);
float u1;
float u2;
float a;
float b;
float normal;
for (int j = 0; j < steps; j++) {
random1 = MWC64X(&seed2);
if (random1 == 0) {
random1 = MWC64X(&seed2);
}
random2 = MWC64X(&seed3);
u1 = (float)random1 / (float)0xffffffff;
u2 = (float)random2 / (float)0xffffffff;
a = sqrt(-2 * log(u1));
b = 2 * M_PI * u2;
normal = a * sin(b);
exponent = exp(alpha + beta * normal);
currentValue = currentValue * exponent;
cumulativeWalk += currentValue;
...
}
问题报告
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Application Specific Information:
abort() called
Application Specific Signatures:
Graphics hardware encountered an error and was reset: 0x00000813
Thread 0 Crashed:: Dispatch queue: opencl_runtime
0 libsystem_kernel.dylib 0x00007fffb14bad42 __pthread_kill + 10
1 libsystem_pthread.dylib 0x00007fffb15a85bf pthread_kill + 90
2 libsystem_c.dylib 0x00007fffb1420420 abort + 129
3 libGPUSupportMercury.dylib 0x00007fffa98e6fbf gpusGenerateCrashLog + 158
4 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x000000010915f13b gpusKillClientExt + 9
5 libGPUSupportMercury.dylib 0x00007fffa98e7983 gpusQueueSubmitDataBuffers + 168
6 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091aa031 IntelCLCommandBuffer::getNew(GLDQueueRec*) + 31
7 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091a9f99 intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
8 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091b00a1 CHAL_INTEL::ChalContext::ChalFlush() + 83
9 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001091aa2c3 gldFinishQueue + 43
10 com.apple.opencl 0x00007fff9ffeeb37 0x7fff9ffed000 + 6967
11 com.apple.opencl 0x00007fff9ffef000 0x7fff9ffed000 + 8192
12 com.apple.opencl 0x00007fffa000ccca 0x7fff9ffed000 + 130250
13 com.apple.opencl 0x00007fffa001029d 0x7fff9ffed000 + 144029
14 libdispatch.dylib 0x00007fffb13568fc _dispatch_client_callout + 8
15 libdispatch.dylib 0x00007fffb1357536 _dispatch_barrier_sync_f_invoke + 83
16 com.apple.opencl 0x00007fffa001011d 0x7fff9ffed000 + 143645
17 com.apple.opencl 0x00007fffa000bda6 0x7fff9ffed000 + 126374
18 com.apple.opencl 0x00007fffa00011df clEnqueueReadBuffer + 813
19 simplisticComparison 0x0000000107b953cf BinomialMultiplication::execute(int) + 1791
20 simplisticComparison 0x0000000107b9ec7f main + 767
21 libdyld.dylib 0x00007fffb138c235 start + 1
Thread 1:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
1 ??? 0x000070000eed6b30 0 + 123145552751408
Thread 2:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
Thread 3:
0 libsystem_pthread.dylib 0x00007fffb15a50e4 start_wqthread + 0
1 ??? 0x007865646e496d65 0 + 33888479226719589
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x0000000000000006 rcx: 0x00007fff58074078 rdx: 0x0000000000000000
rdi: 0x0000000000000307 rsi: 0x0000000000000006 rbp: 0x00007fff580740a0 rsp: 0x00007fff58074078
r8: 0x0000000000000000 r9: 0x00007fffb140ba50 r10: 0x0000000008000000 r11: 0x0000000000000206
r12: 0x00007f92de80a7e0 r13: 0x00007f92e0008c00 r14: 0x00007fffba29e3c0 r15: 0x00007f92de801a00
rip: 0x00007fffb14bad42 rfl: 0x0000000000000206 cr2: 0x00007fffba280128
Logical CPU: 0
Error Code: 0x02000148
Trap Number: 133
答案 0 :(得分:1)
我有一个猜测。驱动程序可能以两种方式崩溃:
我的钱在#2上。如果较大的值(步长)使GPU运行时间太长,系统将终止事情。
我不熟悉Apple的Intel驱动程序的内核,但通常有一种方法可以在极端情况下禁用TDR。例如。请参阅TDR上的Windows Documenation以获取要点。 (Linux驱动程序也有办法禁用它。)
通常我们希望避免运行超长的事情,并且以某种方式分解工作负载可能是个好主意,这样你自然不会遇到这个kill开关。例如。也许将“步骤”分成更小的块(传入并保存你无法重新计算的部分的状态)。