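What changes should I make when calling cblas_cgemm()?

Time: 2018-01-12 16:40:54

Tags: c cblas

I am trying to do a matrix-matrix multiplication with cblas_cgemm(), but the result is wrong compared with a hand calculation. I simplified the code so that the inputs have no imaginary parts, but the problem persists. What should I change to get the correct output? Here is my code.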

#include<stdio.h>
#include<math.h>
#include<complex.h>
#include "cblas.h"

void main()
{
 int i,j;
 double complex A[2][2]={1,2,
                         3,4};
 double complex B[2][2]={4,5,
                         6,7};
 double complex W[2][2]={0,0,
                         0,0};

 const int m1=2;
 const int n1=2;
 const int k1=2;

 const int lda1=2;
 const int ldb1=2;
 const int ldc1=2;
 const double alpha=1.0;
 const double beta=0.0;

 cblas_cgemm(CblasRowMajor,CblasNoTrans,CblasNoTrans,m1,n1,k1,&alpha,A,lda1,B, ldb1 ,&beta,W, ldc1);

 for(i=0;i<m1;++i)
  {
  for(j=0;j<n1;++j)
    printf("%lf %lf\n" ,creal(W[i][j]),cimag(W[i][j]));
  printf("\n");
   }
 }

My output is

-119296.000000 0.000000
-188416.000000 0.000000

0.000000 0.000000
0.000000 0.000000

I referred to this site: lapack: cblas_cgemm. Please help.

My code using cblas_dgemm(), which works, is shown below.

//Y := alpha*A*X + beta*Y, or   y := alpha*A**T*x + beta*y,
#include<stdio.h>
#include "cblas.h"
const double A[3][1]={
                      1,
                      2,
                      3
                     };
const double X[1][4]={
                      1,2,3,4
                     };
double Y[3][4]={
                0,0,0,0,
                0,0,0,0,
                0,0,0,0
               };

int main()
{
 const int m=3;
 const int k=1;
 const int n=4;
 const int lda=1;
 const int ldb=4;
 const int ldc=4;
 int incX,incY;
 const double alpha=1.0;
 const double beta=0.0;
 incX=1;incY=1;
 int i,j;

 /* print A */
 for(i=0;i<m;++i)
  {
   for(j=0;j<k;++j)
     printf("%lf \t" ,A[i][j]);
   putchar('\n');
  }

 /* Y = alpha*A*X + beta*Y */
 cblas_dgemm(CblasRowMajor,CblasNoTrans,CblasNoTrans,m,n,k,alpha,A, lda,X, ldb ,beta,Y, ldc);

 /* print Y */
 for(i=0;i<m;++i)
  {
   for(j=0;j<n;++j)
     printf("%lf\t" ,Y[i][j]);
   printf("\n");
  }
 return 0;
}

My output is

hp@hp-HP-Notebook:~/beamforming/programs/studentprojectdetails$ ./dgemm_trial
1.000000
2.000000
3.000000
1.000000 2.000000 3.000000 4.000000
2.000000 4.000000 6.000000 8.000000
3.000000 6.000000 9.000000 12.000000

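2 answers:

Answer 0: (score: 0)

First problem: complex numbers should be declared and used according to the complex data layout specified in cblas.h. Your code indicates that you expect 2x2 matrices, and a 2x2 complex matrix must be specified with 8 values in total (4 real parts and 4 imaginary parts).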

*
 * A note on complex data layouts:
 *
 * In order to allow straightforward interoperation with other libraries and
 * complex types in C and C++, complex data in BLAS is passed through an opaque
 * pointer (void *).  The layout requirements on this complex data are that
 * the real and imaginary parts are stored consecutively in memory, and have
 * the alignment of the corresponding real type (float or double).  The BLAS
 * complex interfaces are compatible with the following types:
 *
 *     - The C complex types, defined in <complex.h>.
 *     - The C++ std::complex types, defined in <complex>.
 *     - The LAPACK complex types, defined in <Accelerate/vecLib/clapack.h>.
 *     - The vDSP types DSPComplex and DSPDoubleComplex, defined in <Accelerate/vecLib/vDSP.h>.
 *     - An array of size two of the corresponding real type.
 *     - A structure containing two elements, each of the corresponding real type.
 * 
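
The layout described in that note can be checked directly. The sketch below copies the raw bytes of a double complex array into a plain double array and shows that real and imaginary parts sit next to each other. It also hints at why the question's output is garbage: cblas_cgemm() expects single-precision elements (two floats, typically 8 bytes each), while double complex elements are two doubles. The helper name unpack_interleaved is hypothetical and only serves this illustration.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Copies the raw memory of z[0..count-1] into out[0..2*count-1].
   Because a C complex value is stored as {real, imag}, out ends up
   holding re0, im0, re1, im1, ...
   unpack_interleaved is a hypothetical helper, not part of CBLAS. */
void unpack_interleaved(const double complex *z, int count, double *out)
{
    memcpy(out, z, (size_t)count * sizeof(double complex));
}
```

For example, unpacking {1 + 2i, 3 - 4i} yields the four doubles 1, 2, 3, -4, so a 2x2 complex matrix occupies 8 real values in memory.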

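Second problem: BLAS routines do not work on two-dimensional arrays. Instead, you should declare one long one-dimensional array; that is the purpose of the LDA parameter. Passing a two-dimensional array relies on the assumption that the compiler lays the array out in a particular order, which may or may not hold, and leads to undefined behavior.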
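
To make the role of the leading dimension concrete, here is a minimal reference multiplication over flat arrays, written in plain C without BLAS. It assumes row-major storage, where element (i, j) of a matrix with leading dimension ld is found at index i * ld + j. The name naive_zgemm is hypothetical; the routine is only meant as a way to check a BLAS result against a hand calculation, not as a replacement for the library.

```c
#include <assert.h>
#include <complex.h>

/* Reference product C = A * B for row-major matrices in flat arrays.
   A is m x k, B is k x n, C is m x n; lda, ldb, ldc are the number of
   elements between the starts of consecutive rows.
   naive_zgemm is a hypothetical name used for illustration only. */
void naive_zgemm(int m, int n, int k,
                 const double complex *A, int lda,
                 const double complex *B, int ldb,
                 double complex *C, int ldc)
{
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            double complex sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += A[i * lda + p] * B[p * ldb + j];
            C[i * ldc + j] = sum;
        }
    }
}
```

With the matrices from the question stored as A = {1, 2, 3, 4} and B = {4, 5, 6, 7} and all leading dimensions equal to 2, the result is 16, 19, 36, 43.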

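Answer 1: (score: 0)

See the naming conventions of BLAS and LAPACK. Since the matrices are of type double complex, cblas_zgemm() should be used instead of cblas_cgemm(). Indeed, z stands for double-precision complex and c for single-precision complex.

In addition, the scalars alpha and beta must also be of type double complex. See the source of the Fortran routine zgemm() to check this: COMPLEX*16 corresponds to double complex.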