Pandas: why does pd.cut() produce a negative value?

Time: 2017-09-03 08:09:51

Tags: python pandas

Code:

import pandas as pd

test = pd.DataFrame({'counts':[0,1,2,3,4,5,6,11,12,14,15]})
test['range'] = pd.cut(test.counts, [0,5,10,15], include_lowest=True)
test

Output:

    counts  range
0   0   (-0.001, 5.0]
1   1   (-0.001, 5.0]
2   2   (-0.001, 5.0]
3   3   (-0.001, 5.0]
4   4   (-0.001, 5.0]
5   5   (-0.001, 5.0]
6   6   (5.0, 10.0]
7   11  (10.0, 15.0]
8   12  (10.0, 15.0]
9   14  (10.0, 15.0]
10  15  (10.0, 15.0]

Can I get (0, 5.0] instead of (-0.001, 5.0]? And why does -0.001 appear even though I never specified it?

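1 answer:

Answer 0 (score: 3)

This is the result of the internal logic of include_lowest=True: for the lowest edge (0) to fall inside the first bin, the left edge shown in that bin's label is moved slightly below it.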
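The effect of include_lowest can be seen in a small comparison. This is only a sketch: the variable names are illustrative, and the comment about how far the edge moves reflects my reading of the pandas internals (the shift appears to be tied to the precision argument, which defaults to 3) rather than documented behavior.

```python
import pandas as pd

counts = pd.Series([0, 1, 5, 6])
bins = [0, 5, 10, 15]

# Default: intervals are open on the left, so a value equal to the
# lowest edge (0) falls outside every bin and becomes NaN.
without_lowest = pd.cut(counts, bins)

# include_lowest=True: the value 0 has to land in the first bin, so the
# left edge of that bin's label is moved slightly below 0.
# Assumption: the size of the shift follows the `precision` argument.
with_lowest = pd.cut(counts, bins, include_lowest=True)

print(without_lowest.tolist())
print(with_lowest.cat.categories[0])
```

The first print shows a missing value for the count of 0, while the second shows a first interval whose left edge is negative, matching the output in the question.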

You can generate the labels yourself, the same way pd.cut() does with include_lowest=False (bins is presumably the same list of edges, [0, 5, 10, 15]). Note that _format_labels is a private pandas helper, so its location and signature may differ between versions.

In [50]: import pandas.core.algorithms as algos

In [51]: labels = pd.Categorical(pd.core.reshape.tile._format_labels(algos.unique(bins), precision=0), ordered=True)

In [52]: labels
Out[52]:
[(0, 5], (5, 10], (10, 15]]
Categories (3, interval[int64]): [(0, 5] < (5, 10] < (10, 15]]

In [53]: test['range'] = pd.cut(test.counts, [0,5,10,15], labels=labels, include_lowest=True)

In [54]: test
Out[54]:
    counts  range
0   0   (0, 5]
1   1   (0, 5]
2   2   (0, 5]
3   3   (0, 5]
4   4   (0, 5]
5   5   (0, 5]
6   6   (5, 10]
7   11  (10, 15]
8   12  (10, 15]
9   14  (10, 15]
10  15  (10, 15]
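If relying on a private helper is a concern, a similar result can be sketched with the public API alone by passing plain string labels to pd.cut(). This is an alternative to the answer's approach, not the same technique: the label format below is hand-built for illustration, and the resulting categories are strings rather than Interval objects.

```python
import pandas as pd

test = pd.DataFrame({'counts': [0, 1, 2, 3, 4, 5, 6, 11, 12, 14, 15]})
bins = [0, 5, 10, 15]

# Build one text label per bin from consecutive pairs of edges,
# e.g. "(0, 5]", "(5, 10]", "(10, 15]".
labels = [f"({lo}, {hi}]" for lo, hi in zip(bins[:-1], bins[1:])]

# include_lowest=True still places the value 0 in the first bin;
# only the displayed label changes.
test['range'] = pd.cut(test.counts, bins, labels=labels, include_lowest=True)
print(test)
```

One trade-off worth keeping in mind: the label "(0, 5]" suggests that 0 is excluded, even though include_lowest=True puts it in that bin, so a label such as "[0, 5]" may describe the first bin more accurately.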
