Question

When I try to use a pipeline to combine several transformers, the second transformer (Log) does not seem to be applied.

I tried simplifying the Log transformer so that it performs a simple addition, but the same problem persists.

When I use the transformers individually, it shows:

   credit_score    n_cats   premium    fix_date
0      6.381816  1.386294  1.558145  2018-01-01
1      0.000000  1.098612  1.386294  2018-01-01
2      6.381816  1.098612  1.609438  2019-01-01
3      6.231465  1.203973  1.609438  2018-01-01
4      6.746412  1.203973  1.609438  2018-01-01

When I try to use the pipeline, it shows:

   credit_score    n_cats  premium    fix_date
0         590.0  3.000000     3.75  2018-01-01
1           0.0  2.000000     3.00  2018-01-01
2         590.0  2.000000     4.00  2019-01-01
3         507.5  2.333333     4.00  2018-01-01
4         850.0  2.333333     4.00  2018-01-01


Answer 1

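The problem is that the transform method is implemented differently in the Impute and Log classes. In Impute, you modify X in place (without making a copy) and then return it. In Log, however, you first copy X, modify that copy, and then return the copy. As a result, the original DataFrame only reflects the changes made by Impute, while the Log result exists only in the value that transform returns.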
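The difference can be reproduced with a minimal sketch. The question's original classes are not shown here, so the `Impute` and `Log` bodies below are assumptions (a fill with 0 and `np.log1p`), written only to mirror the in-place versus copy behavior described above.

```python
import numpy as np
import pandas as pd


class Impute:
    """Hypothetical stand-in: modifies X in place, then returns it."""

    def transform(self, X):
        X.fillna(0, inplace=True)  # mutates the caller's DataFrame
        return X


class Log:
    """Hypothetical stand-in: works on a copy and returns the copy."""

    def transform(self, X):
        X_new = X.copy()  # the caller's DataFrame is left alone
        for col in X_new.columns:
            X_new[col] = np.log1p(X_new[col])
        return X_new


temp = pd.DataFrame({"credit_score": [590.0, np.nan], "n_cats": [3.0, 2.0]})

# Chain the two steps the way a pipeline would.
result = Log().transform(Impute().transform(temp))

print(temp)    # imputed but not logged: only the in-place step is visible
print(result)  # imputed and logged: the value the chain actually returns
```

Inspecting `temp` after the chain shows only the imputation, which matches the symptom in the question; the logged values are in `result`.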

A quick fix is to look at the return value, which holds the correct result:

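from sklearn.pipeline import Pipeline

# steps and temp are the step list and DataFrame from the question
pipe = Pipeline(steps)

pipe.fit(temp)
new_df = pipe.transform(temp)  # inspect new_df, not temp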

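In general, it is better practice not to modify the original DataFrame X at all, and to apply the changes only to a copy of it. That way, the transform method always returns a brand-new DataFrame, and your original DataFrame stays untouched.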
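As a rough sketch of that practice, the two transformers could both return new DataFrames. The class bodies, column names, and the fill value of 0 are assumptions for illustration, not the code from the question.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class Impute(BaseEstimator, TransformerMixin):
    """Hypothetical imputer: fills missing values with 0 without mutating X."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X.fillna(0)  # without inplace=True, fillna returns a new DataFrame


class Log(BaseEstimator, TransformerMixin):
    """Hypothetical log step: returns log1p of X as a new DataFrame."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.log1p(X)  # builds a new DataFrame; X is untouched


temp = pd.DataFrame({"credit_score": [590.0, np.nan], "premium": [3.75, 3.0]})
original = temp.copy()

pipe = Pipeline([("impute", Impute()), ("log", Log())])
new_df = pipe.fit_transform(temp)

print(temp.equals(original))  # True: the input is unchanged
print(new_df)                 # both steps applied
```

With this design there is only one place to look for the result, the returned `new_df`, so running the steps separately and running them through the pipeline cannot disagree about what happened to the input.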

Scikit-learn transformer pipeline produces different results than running the transformers separately
