Question

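My DataFrame contains columns of several data types. I want to determine the precision of the float columns. I can select only the float64 columns with the following code: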

df_float64 = df.loc[:, df.dtypes == np.float64]

(I'm not sure why this also selects columns that contain only 'NaN' values, but that's just a side note.)

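To determine the precision, I'm thinking of an approach like this:

precision = len(cell.split(".")[1])

where cell is a string.

I then want to output the highest precision of each column as a CSV.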

So given a DataFrame like this:

|     A|     B|     C|     D|
|  0.01|0.0923|   1.0|   1.2|
| 100.1| 203.3| 1.093|   1.9|
|   0.0|  0.23|  1.03|   1.0|

I want this:

|     A|     B|     C|     D|
|     2|     4|     3|     1|

Is this possible with pandas?

Thanks

Answer 1

I think you are looking for applymap, i.e.:

If you have the DataFrame df

        A         B      C    D
0    0.01    0.0923  1.000  1.2
1  100.10  203.3000  1.093  1.9
2    0.00    0.2300  1.030  1.0

ndf = pd.DataFrame(df.astype(str).applymap(lambda x: len(x.split(".")[-1])).max()).T

If you have NaN values, you can use an if/else, i.e.:

ndf = pd.DataFrame(df.astype(str).applymap(lambda x:  len(x.split(".")[-1]) if x != 'nan' else 0 ).max()).T

Output:

   A  B  C  D
0  2  4  3  1
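
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name decimal_places and the NaN placed in column D are my own additions for illustration, not part of the answer above. DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, so the sketch picks whichever one the installed version provides.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the NaN in column D is an added assumption
# to exercise the missing-value branch.
df = pd.DataFrame({
    "A": [0.01, 100.1, 0.0],
    "B": [0.0923, 203.3, 0.23],
    "C": [1.0, 1.093, 1.03],
    "D": [1.2, 1.9, np.nan],
})

def decimal_places(text):
    # Hypothetical helper: count characters after the last '.'.
    # Missing values arrive either as the string 'nan' or as a non-string
    # missing marker, depending on the pandas version; count both as 0.
    if not isinstance(text, str) or text == "nan":
        return 0
    return len(text.split(".")[-1])

as_text = df.astype(str)
# Prefer DataFrame.map (pandas >= 2.1), fall back to applymap on older versions.
if hasattr(pd.DataFrame, "map"):
    counts = as_text.map(decimal_places)
else:
    counts = as_text.applymap(decimal_places)

# Per-column maximum as a one-row DataFrame: A=2, B=4, C=3, D=1
ndf = counts.max().to_frame().T
print(ndf)
```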

Answer 2

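You can use:

fillna to replace the NaNs first
astype to cast to str
apply or a list comprehension with a lambda function
for each column, split, take the second value of each list with str[1], and get its len
take the max value; the output is a Series
convert the Series to a one-row DataFrame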

a = df.fillna(0).astype(str).apply(lambda x: x.str.split('.').str[1].str.len()).max()
print (a)
A    2
B    4
C    3
D    1
dtype: int64

df = a.to_frame().T
print (df)
   A  B  C  D
0  2  4  3  1

Another solution:

df = df.fillna(0).astype(str)
a = [df[x].str.split('.').str[1].str.len().max() for x in df]

df = pd.DataFrame([a], columns=df.columns)
print (df)
   A  B  C  D
0  2  4  3  1
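
One caveat with the string-splitting approaches above: very small floats are rendered in scientific notation (for example, str(0.00001) is '1e-05'), so counting characters after the '.' gives a misleading number for them. The following is only a sketch of one possible workaround, not what either answer proposes: it reads the exponent from decimal.Decimal instead of splitting the string, and writes the result as CSV as the question asks. The helper decimal_places, the extra column E, the non-float column label, and the in-memory buffer are all assumptions made for illustration.

```python
import io
import math
from decimal import Decimal

import numpy as np
import pandas as pd

def decimal_places(value):
    # Hypothetical helper: digits after the decimal point in the shortest
    # repr of a float. NaN and infinities are counted as 0.
    if not math.isfinite(value):
        return 0
    # Decimal parses both '0.0923' and '1e-05'; a negative exponent is the
    # number of decimal places.
    exponent = Decimal(repr(float(value))).as_tuple().exponent
    return max(0, -exponent)

df = pd.DataFrame({
    "A": [0.01, 100.1, 0.0],
    "B": [0.0923, 203.3, 0.23],
    "C": [1.0, 1.093, 1.03],
    "D": [1.2, 1.9, 1.0],
    "E": [0.00001, np.nan, 2.5],   # added: scientific notation and a NaN
    "label": ["x", "y", "z"],      # added: non-float column, ignored below
})

# Keep only the float64 columns, then take the per-column maximum.
floats = df.select_dtypes(include="float64")
precision = floats.apply(lambda col: col.map(decimal_places).max()).to_frame().T

# Write to an in-memory buffer here; pass a file path to to_csv instead
# to produce an actual CSV file.
buffer = io.StringIO()
precision.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Note that this counts the digits of the shortest float representation, so 1.0 still counts as one decimal place, matching the outputs shown above.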

Cell-wise operation on a DataFrame to determine precision

2 answers: