Why am I getting a segmentation fault here?

Time: 2013-04-25 08:36:34

Tags: c

The code below splits data into training/test subsets. Note that data_points is one long vector of size items * attr, and data_labels is a vector of size items.

int split_data(int items, int attr, double *data_points, int *data_labels, double **split_train_points, int **split_train_labels, double **split_test_points, int **split_test_labels)
{
  srand(time(NULL));
  int i, j;
  double temp0, temp1;
  double sorter[items][2];

  *split_train_points = malloc(floor(SPLIT_PROP*items * attr) * sizeof(double));
  *split_train_labels = malloc(floor(SPLIT_PROP*items       ) * sizeof(int));

  *split_test_points  = malloc(ceil((1-SPLIT_PROP)*items * attr) * sizeof(double));
  *split_test_labels  = malloc(ceil((1-SPLIT_PROP)*items       ) * sizeof(int));

  // create a 2d array with element number in one column and a random number in the other
  for (i = 0; i < items; i++) {
      sorter[i][0] = i;
      sorter[i][1] = rand() / (double)RAND_MAX;
  }

  // sort by the random number column
  for (i = items-1; i > 0; i--) {
    for (j = 1; j <= i; j++) {
      if (sorter[j-1][1] > sorter[j][1]) {
        temp0 = sorter[j-1][0];
        temp1 = sorter[j-1][1];

        sorter[j-1][0] = sorter[j][0];
        sorter[j-1][1] = sorter[j][1];

        sorter[j][0] = temp0;
        sorter[j][1] = temp1;
      }
    }
  }

  int cutoff = floor(SPLIT_PROP*items);
  int element = 0;
  // now we have a bunch of indices in a random order.  we select the first 70% to store into our split_train datasets
  for (i = 0; i < cutoff; i++) {
    element = (int)sorter[i][0];
    *split_train_labels[i] = data_labels[element];
    printf("success!\n");
    for (j = 0; j < attr; j++) {
      printf("j: %d, data_points_element: %d\n",j,attr*element+j);

      //SEGFAULT OCCURS HERE WHEN J=4 EVERY TIME EVEN AS ELEMENT VALUE CHANGES DUE TO RANDOMNESS
      *split_train_points[attr*i+j] = data_points[attr*element+j];
      printf("j out! %d\n",j);
    }
  }


  for (i = cutoff; i < items; i++) {
    *split_train_labels[i - cutoff] = data_labels[(int)sorter[i][0]];

    for (j = 0; j < attr; j++) {
      *split_train_points[attr*(i-cutoff)+j] = data_points[attr*(int)sorter[i][0]+j];
    }
  }  

  return 0;
}

As noted in the code, the segfault always occurs on the same line, at j = 4, even though "element" is a random number.

1 answer:

Answer 0 (score: 2)

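My guess is that the expression *split_train_labels[i] does not mean what you think it means. To the compiler it is the same as *(split_train_labels[i]), but you probably meant (*split_train_labels)[i]. You have this problem in multiple places.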

Array indexing has higher precedence than pointer dereferencing.
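
To make the precedence point concrete, here is a minimal sketch that uses the same out-parameter pattern as split_data. The function fill_labels and its parameter out_labels are hypothetical names invented for this illustration and are not part of the original code. It assumes the caller passes the address of a single pointer, as split_data's callers presumably do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): allocates an array of
 * n ints through an out-parameter and fills it with 0..n-1.
 * Returns 0 on success, -1 if the allocation fails. */
int fill_labels(int n, int **out_labels)
{
    assert(out_labels != NULL);
    assert(n > 0);

    *out_labels = malloc((size_t)n * sizeof **out_labels);
    if (*out_labels == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        /* Wrong: *out_labels[i] = i;
         * That parses as *(out_labels[i]). For i > 0 it treats out_labels
         * as an array of int* and reads past the one pointer the caller
         * passed in, which is undefined behavior. */

        /* Right: dereference the out-parameter first, then index. */
        (*out_labels)[i] = i;
    }
    return 0;
}
```

A caller would declare int *labels = NULL; call fill_labels(5, &labels); and later free(labels). Because the wrong form is undefined behavior, the index at which it crashes depends on whatever happens to sit in memory next to the caller's pointer. An alternative that avoids the parentheses altogether is to copy the pointer into a local variable first (for example int *labels = *out_labels;) and index that.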