Question

I have a nested list of values:

list = [
...
['Country1', 142.8576737907048, 207.69725105029553, 21.613192419863577, 15.129178465784218],
['Country2', 109.33326343550823, 155.6847323746669, 15.450489646386226, 14.131554442715336],
['Country3', 99.23033109735835, 115.37122637190915, 5.380298424850267, 5.422030104456135],
...]

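I want to categorise the values in the second index/column by order of magnitude, starting with the lowest order of magnitude and ending with the largest, e.g.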

99.23033109735835 = 10 <= x < 100
142.8576737907048 = 100 <= x < 1000
             9432 = 1000 <= x < 10000

The aim is to output a simple char (#) count showing how many index values fall into each category, e.g.

  10 <= x < 100: ###
100 <= x < 1000: #########

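I've started by grabbing the max() and min() values of the index so that the largest and smallest magnitude categories can be calculated automatically, but I'm not sure how to associate each value in the column with an order of magnitude. If anyone could point me in the right direction or give me some ideas, I would be very grateful.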

Answer 1

This function will turn your double into an integer order of magnitude:

>>> import math
>>> def magnitude(x):
...     return int(math.log10(x))
... 
>>> magnitude(99.23)
1
>>> magnitude(9432)
3

(So 10 ** magnitude(x) <= x < 10 ** (1 + magnitude(x)) for all x >= 1.)

Just use the magnitude as a key and count the occurrences per key. A defaultdict may be helpful here.
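A minimal sketch of that counting step, written for Python 3. The name `rows` is hypothetical (the question calls its list `list`), and the rows are shortened to the first two columns for brevity.

```python
import math
from collections import defaultdict

def magnitude(x):
    # Truncating version shown above; assumes x >= 1.
    return int(math.log10(x))

# Hypothetical name; sample data taken from the question, shortened to two columns.
rows = [
    ['Country1', 142.8576737907048],
    ['Country2', 109.33326343550823],
    ['Country3', 99.23033109735835],
]

# Missing keys start at 0, so each magnitude can be incremented directly.
counts = defaultdict(int)
for row in rows:
    counts[magnitude(row[1])] += 1

print(dict(counts))  # {2: 2, 1: 1}
```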

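Note that this magnitude only works for non-negative powers of 10, i.e. for x >= 1 (because int(double) truncates, rounding towards zero).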

Use

def magnitude(x):
    return int(math.floor(math.log10(x)))

instead if this matters for your use case. (Thanks to larsmans for pointing this out.)
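The difference between truncating and flooring only shows up for values below 1. A small Python 3 illustration, using 0.05 as an arbitrary example value:

```python
import math

x = 0.05  # log10(0.05) is roughly -1.30

truncated = int(math.log10(x))              # int() rounds toward zero
floored = int(math.floor(math.log10(x)))    # floor() rounds toward negative infinity

print(truncated, floored)  # -1 -2
```

Since 0.05 lies between 10 ** -2 and 10 ** -1, the floored result (-2) is the one that satisfies 10 ** m <= x < 10 ** (m + 1).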

Answer 2

To categorise by order of magnitude:

from math import floor, log10
from collections import Counter
counter =  Counter(int(floor(log10(x[1]))) for x in list)

1 covers 10 up to (but not including) 100, and 2 covers 100 up to (but not including) 1000.

print counter
Counter({2: 2, 1: 1})

Then just print it out:

for x in sorted(counter.keys()):
    print "%d <= x < %d: %d" % (10**x, 10**(x+1), counter[x])
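The question asked for a row of # characters per bucket rather than a number. A possible variation on the loop above, sketched for Python 3; the name `values` is hypothetical and holds the second column of the sample data.

```python
from collections import Counter
from math import floor, log10

# Hypothetical name: the second column of the question's sample rows.
values = [142.8576737907048, 109.33326343550823, 99.23033109735835]

counter = Counter(int(floor(log10(v))) for v in values)

lines = []
for m in sorted(counter):
    label = "%d <= x < %d" % (10 ** m, 10 ** (m + 1))
    # Repeat '#' once per value that fell into this magnitude.
    lines.append("%s: %s" % (label, "#" * counter[m]))

print("\n".join(lines))
# 10 <= x < 100: #
# 100 <= x < 1000: ##
```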

Answer 3

If x is one of your numbers, what about len(str(int(x)))?

Or, if you have positive numbers less than 1, what about int(math.log10(x))?

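(See also the documentation for log10. Also note that the rounding done by int() here may not be what you want: see ceil and floor, and note that you may need int(floor(...)) or int(ceil(...)) to get an integer answer.)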
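To compare the two suggestions side by side, here is a short Python 3 sketch. The helper names `digit_count` and `magnitude` are made up for this illustration.

```python
import math

def digit_count(x):
    # Number of digits in the integer part of x.
    return len(str(int(x)))

def magnitude(x):
    # Floored base-10 logarithm; requires x > 0.
    return int(math.floor(math.log10(x)))

for x in (99.23033109735835, 142.8576737907048, 9432):
    # For x >= 1 the digit count is one more than the magnitude.
    print(x, digit_count(x), magnitude(x))

# Below 1 the digit count is stuck at 1, while log10 keeps distinguishing values.
print(digit_count(0.02), magnitude(0.02))  # 1 -2
```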

Answer 4

import bisect
from collections import defaultdict
lis1 = [['Country1', 142.8576737907048, 207.69725105029553, 21.613192419863577, 15.129178465784218],
['Country2', 109.33326343550823, 155.6847323746669, 15.450489646386226, 14.131554442715336],
['Country3', 99.23033109735835, 115.37122637190915, 5.380298424850267, 5.422030104456135],
]
lis2 = [0, 100, 1000, 1000]

dic = defaultdict(int)

for x in lis1:
    x = x[1]
    ind = bisect.bisect(lis2, x)
    if not (x >= lis2[-1] or x <= lis2[0]):
        sm, bi = lis2[ind-1], lis2[ind]
        dic["{} <= {} <= {}".format(sm, x, bi)] += 1
for k, v in dic.items():
    print k, '-->', v

Output:

0 <= 99.2303310974 <= 100 --> 1
100 <= 142.857673791 <= 1000 --> 1
100 <= 109.333263436 <= 1000 --> 1
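The output above has one line per value because the value itself is part of the dictionary key. If one count per bucket is wanted instead, a possible variation (a Python 3 sketch, not the answer's own code) keys on the bucket bounds only. The names `edges` and `values` are hypothetical, and the boundaries 0, 100, 1000 are assumed from the output shown.

```python
import bisect
from collections import Counter

edges = [0, 100, 1000]  # assumed bucket boundaries
values = [142.8576737907048, 109.33326343550823, 99.23033109735835]

counts = Counter()
for v in values:
    i = bisect.bisect_right(edges, v)   # number of edges that are <= v
    if 0 < i < len(edges):              # skip values outside [edges[0], edges[-1])
        counts[(edges[i - 1], edges[i])] += 1

for (lo, hi), n in sorted(counts.items()):
    print("%d <= x < %d --> %d" % (lo, hi, n))
# 0 <= x < 100 --> 1
# 100 <= x < 1000 --> 2
```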

Answer 5

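If you want overlapping ranges or ranges with arbitrary bounds (not sticking to orders of magnitude, powers of 2, or any other predictable series):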

from collections import defaultdict
lst = [
    ['Country1', 142.8576737907048, 207.69725105029553, 21.613192419863577, 15.129178465784218],
    ['Country2', 109.33326343550823, 155.6847323746669, 15.450489646386226, 14.131554442715336],
    ['Country3', 99.23033109735835, 115.37122637190915, 5.380298424850267, 5.422030104456135],
]

buckets = {
    '10<=x<100': lambda x: 10<=x<100,
    '100<=x<1000': lambda x: 100<=x<1000,
}

result = defaultdict(int)
for item in lst:
    second_column = item[1]
    for label, range_check in buckets.items():
        if range_check(second_column):
            result[label] +=1

print (result)
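To see what overlapping ranges look like with this approach, here is a small Python 3 sketch with two made-up buckets that deliberately overlap between 100 and 150. The bucket labels and the name `values` are hypothetical.

```python
from collections import defaultdict

# Hypothetical buckets that overlap on 100 <= x < 150.
buckets = {
    'x<150': lambda x: x < 150,
    '100<=x<1000': lambda x: 100 <= x < 1000,
}

values = [142.8576737907048, 109.33326343550823, 99.23033109735835]

result = defaultdict(int)
for v in values:
    # Every bucket is checked, so one value may be counted more than once.
    for label, check in buckets.items():
        if check(v):
            result[label] += 1

print(dict(result))  # {'x<150': 3, '100<=x<1000': 2}
```

The counts add up to 5 for 3 values because 142.86 and 109.33 match both predicates.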

Answer 6

Another option, using bisect:

import bisect
from collections import Counter
list0 = [
['Country1', 142.8576737907048, 207.69725105029553, 21.613192419863577, 15.129178465784218],
['Country2', 109.33326343550823, 155.6847323746669, 15.450489646386226, 14.131554442715336],
['Country3', 99.23033109735835, 115.37122637190915, 5.380298424850267, 5.422030104456135]
]

magnitudes = [10**x for x in xrange(5)]
c = Counter(bisect.bisect(magnitudes, x[1]) for x in list0)
for x in c:
  print x, '#'*c[x]
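The keys produced here are insertion indexes into the list of powers of ten, not magnitudes; subtracting 1 gives the exponent. A Python 3 sketch of that relationship (using `range` instead of `xrange`, and a hypothetical `values` list holding the second column):

```python
import bisect
from collections import Counter

magnitudes = [10 ** k for k in range(5)]  # [1, 10, 100, 1000, 10000]
values = [142.8576737907048, 109.33326343550823, 99.23033109735835]

# bisect returns how many powers of ten are <= v, which is the exponent plus one.
c = Counter(bisect.bisect(magnitudes, v) - 1 for v in values)

for m in sorted(c):
    print(m, '#' * c[m])
# 1 #
# 2 ##
```

With only five boundaries, anything below 1 maps to -1 and anything at or above 10000 maps to 4, so the list of powers would need to cover the full range of the data.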

Answer 7

Extending Useless' answer to all real numbers, you can use:

import math

def magnitude (value):
    if (value == 0): return 0
    return int(math.floor(math.log10(abs(value))))

Test cases:

In [123]: magnitude(0)
Out[123]: 0

In [124]: magnitude(0.1)
Out[124]: -1

In [125]: magnitude(0.02)
Out[125]: -2

In [126]: magnitude(150)
Out[126]: 2

In [127]: magnitude(-5280)
Out[127]: 3

Python: categorising a list by order of magnitude

7 answers: