Question

Consider the following example:

template<int X> class MyClass
{
    public:
        MyClass(int x) {_ncx = x;}
        void test() 
        {
            for (unsigned int i = 0; i < 1000000; ++i) {
                if ((X < 0) ? (_cx > 5) : (_ncx > 5)) {
                    /* SOMETHING */
                } else {
                    /* SOMETHING */
                }
            }
        }
    protected:
        static const int _cx = (X < 0) ? (-X) : (X);
        int _ncx;
};

My question is: will MyClass<-6>::test() and MyClass<6>::test() run at different speeds?

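I expect so, because when the template parameter is negative, the if in the test function can be evaluated at compile time. However, I am not sure how the compiler behaves when a ternary operator mixes a compile-time operand with a non-compile-time one, which is the case here.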

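Note: this is a purely "theoretical" question. If there is a non-null probability of "yes", I will implement some classes for my code with this kind of compile-time template parameter; if not, I will provide only runtime versions.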

Answer 1

Move the condition outside the loop:

        ...
        if ((X < 0) ? (_cx > 5) : (_ncx > 5)) {
            for (unsigned int i = 0; i < 1000000; ++i) {
                /* SOMETHING */
            }
        } else {
            for (unsigned int i = 0; i < 1000000; ++i) {
                /* SOMETHING */
            }
        }
        ...

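That way you do not rely on compiler optimization to remove the unused code. If the compiler does not remove the unused part of the condition, you pay for the conditional branch only once rather than on every loop iteration. This assumes the loop bodies do not modify _ncx, so the condition has the same value on every iteration.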
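For reference, here is a minimal, complete sketch of the hoisted form. The class name MyClassHoisted, the two counters, and hoisted_demo are hypothetical stand-ins added only so the snippet compiles; the counters replace the two /* SOMETHING */ bodies from the question.

```cpp
#include <cassert>

// Sketch only: MyClassHoisted and its counters are illustrative names,
// not part of the original question or answer.
template<int X> class MyClassHoisted
{
    public:
        explicit MyClassHoisted(int x) : _ncx(x), _taken(0), _notTaken(0) {}

        void test()
        {
            // Evaluated once. This is valid only because neither loop
            // body changes _ncx.
            const bool cond = (X < 0) ? (_cx > 5) : (_ncx > 5);
            if (cond) {
                for (unsigned int i = 0; i < 1000000; ++i) {
                    ++_taken;      // stand-in for the first /* SOMETHING */
                }
            } else {
                for (unsigned int i = 0; i < 1000000; ++i) {
                    ++_notTaken;   // stand-in for the second /* SOMETHING */
                }
            }
        }

        unsigned int taken() const { return _taken; }
        unsigned int notTaken() const { return _notTaken; }

    protected:
        static const int _cx = (X < 0) ? (-X) : (X);
        int _ncx;
        unsigned int _taken;
        unsigned int _notTaken;
};

// Hypothetical driver: with X = -6, _cx is 6, so the first loop runs.
inline void hoisted_demo()
{
    MyClassHoisted<-6> obj(0);
    obj.test();
    assert(obj.taken() == 1000000u);
    assert(obj.notTaken() == 0u);
}
```

With a positive X the choice between the two loops still happens at run time, but only once per call to test().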

Answer 2

With my compiler (clang++ v2.9 on OS X), I compiled this similar but not identical code:

void foo();
void bar();

template<int N>
void do_something( int arg ) {
  if ( N<0 && arg<0 ) { foo(); }
  else { bar(); }
}

// Some functions to instantiate the templates.
void one_fn(int arg) {
  do_something<1>(arg);
}

void neg_one_fn(int arg) {
  do_something<-1>(arg);
}

Compiling this with clang++ -S -O3 produces the following assembly.

one_fn = do_something<1>

The assembly for the first function clearly only calls bar.

    .globl  __Z6one_fni
    .align  4, 0x90
__Z6one_fni:                            ## @_Z6one_fni
Leh_func_begin0:
    pushl   %ebp
    movl    %esp, %ebp
    popl    %ebp
    jmp __Z3barv                ## TAILCALL
Leh_func_end0:

neg_one_fn = do_something<-1>

The second function has been reduced to a simple if that calls either bar or foo.

    .globl  __Z10neg_one_fni
    .align  4, 0x90
__Z10neg_one_fni:                       ## @_Z10neg_one_fni
Leh_func_begin1:
    pushl   %ebp
    movl    %esp, %ebp
    cmpl    $0, 8(%ebp)
    jns LBB1_2                  ## %if.else.i
    popl    %ebp
    jmp __Z3foov                ## TAILCALL
LBB1_2:                                 ## %if.else.i
    popl    %ebp
    jmp __Z3barv                ## TAILCALL
Leh_func_end1:

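Summary

So you can see that the compiler inlines the template and then optimizes away the branch where possible. The kind of transformation you are hoping for does therefore happen in current compilers. I got similar results from an older g++ 4.0.1 compiler (but the assembly was less clear).

Addendum:

I decided this example was not similar enough to your initial case (since it doesn't involve the ternary operator), so I changed it to the following, which gives the same kind of results: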

template<int X>
void do_something_else( int _ncx ) {
  static const int _cx = (X<0) ? (-X) : (X);
  if ( (X < 0) ? (_cx > 5) : (_ncx > 5)) {
    foo();
  } else {
    bar();
  }
}

void a(int arg) {
  do_something_else<1>(arg);
}

void b(int arg) {
  do_something_else<-1>(arg);
}

This generates the following assembly.

a() = do_something_else<1>

This still contains a branch.

__Z1ai:                                 ## @_Z1ai
Leh_func_begin2:
    pushl   %ebp
    movl    %esp, %ebp
    cmpl    $6, 8(%ebp)
    jl  LBB2_2                  ## %if.then.i
    popl    %ebp
    jmp __Z3foov                ## TAILCALL
LBB2_2:                                 ## %if.else.i
    popl    %ebp
    jmp __Z3barv                ## TAILCALL
Leh_func_end2:

b() = do_something_else<-1>

The branch has been optimized away.

__Z1bi:                                 ## @_Z1bi
Leh_func_begin3:
    pushl   %ebp
    movl    %esp, %ebp
    popl    %ebp
    jmp __Z3barv                ## TAILCALL
Leh_func_end3:
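The reason the branch can disappear for negative X is that the condition then depends only on X. A small sketch that checks this at compile time follows. It assumes a C++11 compiler for static_assert, and AbsX, condition, and check_positive_case are hypothetical names introduced here for illustration.

```cpp
#include <cassert>

// Hypothetical helper mirroring _cx from the question.
template<int X>
struct AbsX {
    static const int value = (X < 0) ? (-X) : (X);
};

// Same condition as in do_something_else; ncx is the runtime operand.
template<int X>
bool condition(int ncx)
{
    return (X < 0) ? (AbsX<X>::value > 5) : (ncx > 5);
}

// For negative X the outcome is fixed by X alone, so it can be verified
// without any runtime value.
static_assert(!(AbsX<-1>::value > 5), "X = -1 always selects the else branch");
static_assert(AbsX<-6>::value > 5, "X = -6 always selects the if branch");

// For positive X the outcome still depends on ncx, so it can only be
// checked at run time.
inline void check_positive_case()
{
    assert(condition<1>(6));
    assert(!condition<1>(5));
}
```

The static_assert lines only show that the value is known at compile time; whether a given compiler actually drops the branch is what the assembly above demonstrates for clang++ at -O3.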

Answer 3

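It probably depends on how smart your compiler is. I suggest writing a small benchmark program and testing it yourself in your environment to be sure.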
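A rough sketch of such a benchmark is shown below. It assumes a C++11 compiler for std::chrono. BenchClass and time_test_us are hypothetical names, and the volatile counter is an assumption made here so the optimizer cannot delete the loop; it stands in for the /* SOMETHING */ bodies.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical benchmark variant of MyClass from the question.
template<int X> class BenchClass
{
    public:
        explicit BenchClass(int x) : _ncx(x), _sink(0) {}

        void test()
        {
            for (unsigned int i = 0; i < 1000000; ++i) {
                if ((X < 0) ? (_cx > 5) : (_ncx > 5)) {
                    _sink = _sink + 1;   // stand-in for the first body
                } else {
                    _sink = _sink + 2;   // stand-in for the second body
                }
            }
        }

        unsigned int sink() const { return _sink; }

    protected:
        static const int _cx = (X < 0) ? (-X) : (X);
        int _ncx;
        volatile unsigned int _sink;     // volatile keeps the loop alive
};

// Returns the wall-clock time of one call to test(), in microseconds.
template<int X>
long long time_test_us(int x)
{
    BenchClass<X> obj(x);
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    obj.test();
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    // Sanity check: every iteration adds at least 1 to the counter.
    assert(obj.sink() >= 1000000u);
    return std::chrono::duration_cast<std::chrono::microseconds>(
        stop - start).count();
}
```

To compare the two cases, call time_test_us<-6>(x) and time_test_us<6>(x) several times and look at the spread rather than a single run. Take x from a runtime source such as a command-line argument; otherwise the compiler may propagate the constant and fold the positive case as well.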
