Question

I want to change one column of a pandas dataset from "object" to "int64". My DataFrame is named bsblandings.

My bsblandings.info() output looks like this:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 810 entries, 0 to 809
Data columns (total 9 columns):
Year           810 non-null int64
Coast          810 non-null object
Subregion      810 non-null object
State          810 non-null object
Common Name    810 non-null object
Pounds         810 non-null object
Live Pounds    810 non-null object
Dollars        810 non-null object
% Display      810 non-null object
dtypes: int64(1), object(8)
memory usage: 57.0+ KB

I need to work with the Pounds column, and I have successfully changed all of its non-int64 values from "*" to "0". I also tried using numpy and NaN.

I used:

bsblandings = bsblandings.replace('*', '0')

This did not change the dtype from "object" to "int64" (even though every "*" was in fact replaced with "0").

I then tried to sort on the Pounds column with:

bsblandings.sort_values("Pounds")

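All I really need is to sort the Pounds column from smallest to largest (or largest to smallest). When I try this with .sort_values, the column is not sorted correctly. Instead, the output comes out in the order 103800, 10400, 104400, 10600: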

90  1951    US Atlantic Coast   North Atlantic  MASSACHUSETTS   BASS, BLACK SEA 103800  103800      100%
223 1964    US Atlantic Coast   North Atlantic  MASSACHUSETTS   BASS, BLACK SEA 10400   10400   1687    100%
380 1977    US Atlantic Coast   North Atlantic  MASSACHUSETTS   BASS, BLACK SEA 104400  104400  67172   100%
269 1965    US Atlantic Coast   North Atlantic  MASSACHUSETTS   BASS, BLACK SEA 10600   10600   1379    100%

I'm a newbie and have searched a lot, but I keep hitting a wall. Any help would be greatly appreciated.

Answer 1

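This is not a bug: the sort is correct. Your Pounds column holds strings, so string ordering is what gets applied. Strings are sorted in collation order (character by character), not by their apparent numeric value. So anything starting with "103" sorts before anything starting with "104", regardless of how many digits follow.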
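To see the effect outside pandas, here is a minimal sketch in plain Python. The four values are copied from the output in the question; the variable names are only for illustration.

```python
# Sample values taken from the output shown in the question.
pounds_as_text = ["103800", "10400", "104400", "10600"]

# Strings compare character by character, so "103800" < "10400"
# because "3" < "4" at the third character.
text_order = sorted(pounds_as_text)

# Casting each value to int in the key compares by numeric value instead.
numeric_order = sorted(pounds_as_text, key=int)

print(text_order)
print(numeric_order)
```

The first list reproduces the order you are seeing; the second is the order you want.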

If you want a numeric sort, either convert the column to int, or specify a sort key that casts the values to int.
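Both options might look roughly like the sketch below. The small frame is a made-up stand-in for bsblandings (only Year and Pounds, with one "*" row), and it assumes pandas 1.1 or later, since the key argument of sort_values was added in that release.

```python
import pandas as pd

# Hypothetical miniature of the asker's frame; "*" marks a missing value.
df = pd.DataFrame({
    "Year": [1951, 1964, 1977, 1965, 1950],
    "Pounds": ["103800", "10400", "104400", "10600", "*"],
})

# Option 1: convert the column itself.
# errors="coerce" turns text that cannot be parsed (such as "*") into NaN.
converted = df.copy()
converted["Pounds"] = pd.to_numeric(converted["Pounds"], errors="coerce")
sorted_converted = converted.sort_values("Pounds")

# Option 2: keep the column as text and convert only while sorting.
sorted_by_key = df.sort_values(
    "Pounds", key=lambda s: pd.to_numeric(s, errors="coerce")
)

print(sorted_converted)
print(sorted_by_key)
```

Note that with NaN present the converted column ends up as float64 rather than int64, and sort_values places the NaN rows last by default. If you need int64 specifically, you would have to fill or drop the missing values first, for example with .fillna(0).astype("int64").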

Answer 2

This did the trick!

bsblandings["Pounds"] = pd.to_numeric(bsblandings["Pounds"])

Thanks!
