Nvidia CUDA C segmentation fault

Time: 2015-12-05 17:55:30

Tags: cuda nvidia

Edit: gdb results

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#include <cuda_runtime.h>

float *h_A, *h_B, *h_C, *d_A, *d_B, *d_C;
float **d_Many, **h_Many;
cudaError_t err = cudaSuccess;
long numElements = 10000000;
double startHostAllocate, endHostAllocate, startDeviceAllocate,
       endDeviceAllocate, startCopy, endCopy, startExecute, endExecute;

double cpuSecond() {
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return ((double) tp.tv_sec + (double) tp.tv_usec * 1.e-6);
}

void** allocateManyHostMemory(void **manyHostMemory, int length, size_t size,
        int numElements) {
    manyHostMemory = (void **) malloc(sizeof(void*) * length);
    printf("Host array memory allocated");
    for (int i = 0; i < length; i++) {
        manyHostMemory[i] = malloc(size * numElements);
    }
    return manyHostMemory;
}

void allocateMemory(int numElements) {
    bool memcpyThisArray[numElements];

    startHostAllocate = cpuSecond();
    {
        allocateManyHostMemory((void **) h_Many, 3, sizeof(float), numElements);
    }
    endHostAllocate = cpuSecond();
    printf("Host memory allocated");
}

int main(void) {
    startDeviceAllocate = cpuSecond();
    allocateMemory(numElements);
    endDeviceAllocate = cpuSecond();
}

What am I missing here?

Edit again (MCVE): I added the code so that it can be copied and compiled.

1 answer:

Answer 0: (score: 2)

The problem is the same as this (and it has nothing to do with CUDA):

bool memcpyThisArray[numElements];

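With numElements = 10000000, this variable-length array is placed on the stack, so the program runs out of stack space (not heap space) and fails with a stack overflow / segmentation fault. Change the code to: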

void allocateMemory(int numElements) {
    /* bool memcpyThisArray[numElements]; */

    startHostAllocate = cpuSecond();
    {
        allocateManyHostMemory((void **) h_Many, 3, sizeof(float), numElements);
    }
    endHostAllocate = cpuSecond();
    printf("Host memory allocated");
}

and the problem goes away.
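
For a rough sense of why the array does not fit: 10,000,000 bool elements take about 10 MB if a bool is one byte, while the default stack limit on many Linux systems is 8 MiB. That limit is an assumption about the asker's environment; `ulimit -s` shows the real value. Here is a minimal sketch that compares the two with POSIX getrlimit. The helper name fitsOnStack is hypothetical and not part of the original code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: reports whether `count` bools are smaller than the
 * current soft stack limit. Returns 1 if they are, 0 if they are not, and
 * -1 if the limit cannot be queried. It ignores the stack space already in
 * use, so a result of 1 is only a necessary condition, not a guarantee. */
int fitsOnStack(size_t count) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return 1;  /* no soft limit configured */
    }
    return count * sizeof(bool) < (size_t) rl.rlim_cur ? 1 : 0;
}
```

Calling fitsOnStack(10000000) on a system with an 8 MiB limit would return 0, which matches the crash described above.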

If you really do need memcpyThisArray in your real application, allocate it dynamically.
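
One possible way to do that is sketched below, assuming the array only needs to hold one flag per element. The helper names allocateFlags and useFlags are hypothetical and only illustrate the pattern; calloc is used here so the flags start out as false:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates the flag array on the heap instead of the
 * stack. Returns NULL if the allocation fails. */
bool *allocateFlags(size_t numElements) {
    return (bool *) calloc(numElements, sizeof(bool));
}

/* Hypothetical usage: allocate, touch the last element, and release.
 * Assumes numElements > 0. Returns 0 on success, -1 on failure. */
int useFlags(size_t numElements) {
    bool *memcpyThisArray = allocateFlags(numElements);
    if (memcpyThisArray == NULL) {
        fprintf(stderr, "Could not allocate %zu flags\n", numElements);
        return -1;
    }
    /* Writing to the far end of the array is fine for a heap allocation. */
    memcpyThisArray[numElements - 1] = true;
    int ok = memcpyThisArray[numElements - 1] ? 0 : -1;
    free(memcpyThisArray);
    return ok;
}
```

With this approach the size is limited by available heap memory rather than by the stack limit, and a failed allocation can be detected through the NULL check instead of crashing.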