Question

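When working with floating-point values in Java, calling toString() gives a printed value with the correct number of significant digits. In C++, however, printing a float through a stringstream rounds the value after 5 or fewer digits. Is there a way to "pretty print" a float in C++ to the (assumed) correct number of significant digits?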

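EDIT: I think I am being misunderstood. I want the output to have a dynamic length, not a fixed precision. I am familiar with setprecision. If you look at the Java source for Double, it somehow computes the number of significant digits, and I would really like to understand how it works and/or how feasible it is to replicate this easily in C++.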

/*
 * FIRST IMPORTANT CONSTRUCTOR: DOUBLE
 */
public FloatingDecimal( double d )
{
    long    dBits = Double.doubleToLongBits( d );
    long    fractBits;
    int     binExp;
    int     nSignificantBits;

    // discover and delete sign
    if ( (dBits&signMask) != 0 ){
        isNegative = true;
        dBits ^= signMask;
    } else {
        isNegative = false;
    }
    // Begin to unpack
    // Discover obvious special cases of NaN and Infinity.
    binExp = (int)( (dBits&expMask) >> expShift );
    fractBits = dBits&fractMask;
    if ( binExp == (int)(expMask>>expShift) ) {
        isExceptional = true;
        if ( fractBits == 0L ){
            digits =  infinity;
        } else {
            digits = notANumber;
            isNegative = false; // NaN has no sign!
        }
        nDigits = digits.length;
        return;
    }
    isExceptional = false;
    // Finish unpacking
    // Normalize denormalized numbers.
    // Insert assumed high-order bit for normalized numbers.
    // Subtract exponent bias.
    if ( binExp == 0 ){
        if ( fractBits == 0L ){
            // not a denorm, just a 0!
            decExponent = 0;
            digits = zero;
            nDigits = 1;
            return;
        }
        while ( (fractBits&fractHOB) == 0L ){
            fractBits <<= 1;
            binExp -= 1;
        }
        nSignificantBits = expShift + binExp +1; // recall binExp is  - shift count.
        binExp += 1;
    } else {
        fractBits |= fractHOB;
        nSignificantBits = expShift+1;
    }
    binExp -= expBias;
    // call the routine that actually does all the hard work.
    dtoa( binExp, fractBits, nSignificantBits );
}

After this function, it calls dtoa( binExp, fractBits, nSignificantBits ); which handles a bunch of cases. This is from OpenJDK6.

For more clarity, an example. Java:

double test1 = 1.2593;
double test2 = 0.004963;
double test3 = 1.55558742563;

System.out.println(test1);
System.out.println(test2);
System.out.println(test3);

Output:

1.2593
0.004963
1.55558742563

C++:

std::cout << test1 << "\n";
std::cout << test2 << "\n";
std::cout << test3 << "\n";

Output:

1.2593
0.004963
1.55559

Answer 1

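I think you are talking about how to print the minimum number of digits that lets you read back exactly the same floating-point number. This paper is a good introduction to this tricky problem: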

http://grouper.ieee.org/groups/754/email/pdfq3pavhBfih.pdf

The dtoa function looks like David Gay's work; you can find the source here: http://www.netlib.org/fp/dtoa.c (though it is C, not Java).

Gay also wrote a paper about his method. I don't have a link, but it is cited in the paper above, so you can search for it.
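To make the "minimum number of digits that reads back exactly" idea concrete, here is a minimal brute-force sketch. It is not Gay's algorithm and not what Java does internally; it simply tries increasing precisions with printf-style %g until the text parses back to the same double. The function name shortest_round_trip is made up for illustration, and the sketch assumes IEEE 754 binary64 doubles, for which 17 significant digits are always enough.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the shortest "%.{p}g" rendering of x that
// parses back to exactly x. Brute force, not the dtoa algorithm.
std::string shortest_round_trip(double x) {
    char buf[64];
    for (int precision = 1; precision <= 17; ++precision) {
        const int written = std::snprintf(buf, sizeof buf, "%.*g", precision, x);
        assert(written > 0 && written < static_cast<int>(sizeof buf));
        if (std::strtod(buf, nullptr) == x) {
            return buf;  // first precision that survives the round trip
        }
    }
    return buf;  // reached only for NaN, which never compares equal to itself
}
```

For the values in the question this yields 1.2593, 0.004963 and 1.55558742563. The digit count matches what a shortest-representation algorithm would produce, but the layout follows %g (for example, exponent notation for very large or small values), so it will not always look like Java's output. It is also far slower than dtoa-style algorithms, since it formats and parses up to 17 times.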

Answer 2

"Is there a way to 'pretty print' a float in C++ to the (assumed) correct number of significant digits?"

Yes, you can do this with C++20 std::format, for example:

double test1 = 1.2593;
double test2 = 0.004963;
double test3 = 1.55558742563;
std::cout << std::format("{}", test1) << "\n";
std::cout << std::format("{}", test2) << "\n";
std::cout << std::format("{}", test3) << "\n";

prints:

1.2593
0.004963
1.55558742563

The default format gives you the shortest decimal representation with a round-trip guarantee, like Java.

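Since this is a new feature, some standard libraries may not support it yet, so you can use the {fmt} library, which std::format is based on. {fmt} also provides a print function that makes this even simpler and more efficient: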

fmt::print("{}", 1.2593);

Disclaimer: I'm the author of {fmt} and C++20 std::format.
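As a side note for readers who have neither std::format nor {fmt} available: C++17's std::to_chars, when called on a double without a precision argument, is also specified to produce the shortest text that round-trips. The sketch below wraps it in a small helper; the name shortest_repr and the 32-byte buffer are assumptions made for illustration, and it needs a standard library that implements the floating-point overloads of to_chars (older releases may not).

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: shortest round-trip text for x via std::to_chars.
std::string shortest_repr(double x) {
    char buf[32];  // assumed large enough for any double in shortest form
    const std::to_chars_result result = std::to_chars(buf, buf + sizeof buf, x);
    assert(result.ec == std::errc{});  // fails only if the buffer is too small
    return std::string(buf, result.ptr);
}
```

Calling shortest_repr(1.55558742563) returns "1.55558742563". Unlike stream output, to_chars ignores the locale and the stream's precision setting, so the result does not depend on global state.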

Answer 3

You can use ios_base::precision to specify the number of digits you want.

For example:

#include <iostream>
using namespace std;

int main () {
    double f = 3.14159;
    cout.unsetf(ios::floatfield);            // floatfield not set
    cout.precision(5);
    cout << f << endl;
    cout.precision(10);
    cout << f << endl;
    cout.setf(ios::fixed,ios::floatfield);   // floatfield set to fixed
    cout << f << endl;
    return 0;
}

The code above produces the output:
3.1416
3.14159
3.1415900000

Answer 4

There is a utility called numeric_limits:

#include <limits>

    ...
    int num10 = std::numeric_limits<double>::digits10;
    int max_num10 = std::numeric_limits<double>::max_digits10;

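Note that IEEE numbers are not represented exactly in decimal digits; they are binary quantities. A more accurate figure is the number of binary bits: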

    int bits = std::numeric_limits<double>::digits;

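To print the digits that are guaranteed to be meaningful, use setprecision with digits10 (use max_digits10 instead if the printed value must read back exactly):

out << std::setprecision(std::numeric_limits<double>::digits10);  // needs <iomanip>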
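The difference between digits10 and max_digits10 is easy to check. The following sketch formats a double at a chosen precision through a stream and tests whether the text parses back to the same value. The helper names format_with_precision and round_trips are made up for this example, and it assumes IEEE 754 doubles (digits10 is 15 and max_digits10 is 17).

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: formats x with the given number of significant
// digits, using the stream's default floatfield (neither fixed nor scientific).
std::string format_with_precision(double x, int precision) {
    assert(precision > 0);
    std::ostringstream out;
    out << std::setprecision(precision) << x;
    return out.str();
}

// Hypothetical helper: true if the text produced at this precision
// parses back to exactly the same double.
bool round_trips(double x, int precision) {
    return std::stod(format_with_precision(x, precision)) == x;
}
```

With x = 0.1 + 0.2, round_trips(x, std::numeric_limits<double>::digits10) is false, because 15 digits print the value as 0.3, while round_trips(x, std::numeric_limits<double>::max_digits10) is true. The price of max_digits10 is that many values print with more digits than the shortest representation would need, which is exactly the fixed-precision behaviour the question wants to avoid.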
