Code:
import pandas as pd
test = pd.DataFrame({'counts':[0,1,2,3,4,5,6,11,12,14,15]})
test['range'] = pd.cut(test.counts, [0,5,10,15], include_lowest=True)
test
Output:
    counts          range
0        0  (-0.001, 5.0]
1        1  (-0.001, 5.0]
2        2  (-0.001, 5.0]
3        3  (-0.001, 5.0]
4        4  (-0.001, 5.0]
5        5  (-0.001, 5.0]
6        6    (5.0, 10.0]
7       11   (10.0, 15.0]
8       12   (10.0, 15.0]
9       14   (10.0, 15.0]
10      15   (10.0, 15.0]
Can I get (0, 5.0] instead of (-0.001, 5.0]? Why does -0.001 appear even though I never specified it?
Answer 0 (score: 3)
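This is a result of the internal logic behind include_lowest=True. The intervals produced by pd.cut() are closed on the right only, so to show that the lowest edge is included, the left edge of the first label is lowered by 10**(-precision), which is 0.001 with the default precision.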
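The effect can be checked with a small sketch. It assumes a reasonably recent pandas version, and the names counts, without_lowest and with_lowest are illustrative only:

```python
import pandas as pd

counts = pd.Series([0, 1, 5, 6, 15])
bins = [0, 5, 10, 15]

# Default behaviour (include_lowest=False): the first interval is open on the left.
without_lowest = pd.cut(counts, bins)
# include_lowest=True: the lowest edge is meant to belong to the first interval.
with_lowest = pd.cut(counts, bins, include_lowest=True)

# Without include_lowest the value 0 falls outside every bin and becomes NaN.
print(without_lowest.cat.categories[0])
print(without_lowest.isna().tolist())

# With include_lowest the first label's left edge is pushed below 0,
# so the right-closed interval really does contain 0.
first = with_lowest.cat.categories[0]
print(first, 0 in first)
```

So the -0.001 is only a labelling detail: it makes the printed interval consistent with the fact that 0 is binned.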
You can generate the labels yourself, the way pd.cut() does for include_lowest=False:
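In [50]: import pandas.core.algorithms as algos
    ...: bins = [0, 5, 10, 15]  # same edges that are passed to pd.cut below
In [51]: # _format_labels is a private pandas helper and may change between versions
    ...: labels = pd.Categorical(pd.core.reshape.tile._format_labels(algos.unique(bins), precision=0),
    ...:                         ordered=True)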
In [52]: labels
Out[52]:
[(0, 5], (5, 10], (10, 15]]
Categories (3, interval[int64]): [(0, 5] < (5, 10] < (10, 15]]
In [53]: test['range'] = pd.cut(test.counts, [0,5,10,15],
labels=labels,
include_lowest=True)
In [54]: test
Out[54]:
    counts     range
0        0    (0, 5]
1        1    (0, 5]
2        2    (0, 5]
3        3    (0, 5]
4        4    (0, 5]
5        5    (0, 5]
6        6   (5, 10]
7       11  (10, 15]
8       12  (10, 15]
9       14  (10, 15]
10      15  (10, 15]
Generating the labels yourself this way changes only what is displayed; 0 is still assigned to the first bin.
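If relying on the private helper _format_labels is a concern, a minimal sketch of an alternative that stays within the public API is to build plain string labels from consecutive bin edges. This assumes string labels are acceptable in place of interval objects; the names bins and labels are illustrative:

```python
import pandas as pd

test = pd.DataFrame({'counts': [0, 1, 2, 3, 4, 5, 6, 11, 12, 14, 15]})
bins = [0, 5, 10, 15]

# One string label per bin, built from each pair of neighbouring edges.
labels = [f"({lo}, {hi}]" for lo, hi in zip(bins[:-1], bins[1:])]

# include_lowest=True keeps 0 in the first bin; labels controls only the display.
test['range'] = pd.cut(test.counts, bins, labels=labels, include_lowest=True)
print(test)
```

The resulting column is an ordered categorical of strings, so sorting and grouping by range still follow the bin order.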