Triple nested dictionary comprehension?

Time: 2016-08-18 14:12:43

Tags: python dictionary list-comprehension

Suppose I have a pandas Series like this:

import pandas as pd
s = pd.Series(["hello go home bye bye", "you can't always get", "what you waaaaaaant", "apple banana carrot munch 123"])

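I want to create a dictionary with the individual characters as keys and their frequencies as values. With the help of collections.Counter, creating such a dictionary for whole words was easy:
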
from collections import Counter
c = Counter(word for row in s for word in row.lower().split())

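However, I am now trying to count individual characters instead, and I am running into trouble with the triple-nested comprehension. This is what I have: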

c = Counter((letter for letter in word) for word for row in s for word in row.lower().split())

This gives me a syntax error. How do I write the equivalent of the following for loop in one line?

d = {}
for row in s:
    for word in row.lower().split():
        for letter in word:
            # d.get avoids a KeyError the first time a letter is seen
            d[letter] = d.get(letter, 0) + 1

4 answers:

Answer 0 (score: 2)

I think you can use

Counter([j for i in s for j in i])
Counter({'a': 16, ' ': 13, 'e': 6, 'o': 6, 'n': 5, 't': 5, 'y': 5, 'h': 4, 'l': 4, 'c': 3, 'b': 3, 'u': 3, 'w': 3, 'g': 2, 'm': 2, 'p': 2, 'r': 2, "'": 1, '1': 1, '3': 1, '2': 1, 's': 1})

to get the count of each individual character.
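The output above also counts the space character, because each row is iterated as a whole string. A minimal sketch that mirrors the for loop in the question more closely (splitting on whitespace first) is shown below. The for clauses of a nested comprehension are written in the same order as the nested for loops, with the produced expression first. The name `rows` is an assumption: a plain list standing in for the Series, since iterating over the Series yields the same strings.

```python
from collections import Counter

# Stand-in for the pandas Series from the question (assumption: iterating
# over the Series yields these same strings).
rows = [
    "hello go home bye bye",
    "you can't always get",
    "what you waaaaaaant",
    "apple banana carrot munch 123",
]

# Same clause order as the nested for loops: row, then word, then letter.
c = Counter(
    letter
    for row in rows
    for word in row.lower().split()
    for letter in word
)

print(c.most_common(3))
```

Because the words are produced by split(), whitespace never reaches the Counter.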

Answer 1 (score: 2)

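Just iterate over every character of every row, calling .lower() on each one; this flattens the strings into a single stream of characters: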

import pandas as pd
s = pd.Series(["hello go home bye bye", "you can't always get", "what you waaaaaaant", "apple banana carrot munch 123"])
from collections import Counter


print(Counter(word.lower() for row in s for word in row))

Or use chain with map:

from collections import Counter
from itertools import chain

print(Counter(chain.from_iterable(map(str.lower, s))))

Both will give you:

Counter({'a': 16, ' ': 13, 'e': 6, 'o': 6, 'n': 5, 't': 5, 'y': 5, 'h': 4, 'l': 4, 'c': 3, 'b': 3, 'u': 3, 'w': 3, 'g': 2, 'm': 2, 'p': 2, 'r': 2, "'": 1, '1': 1, '3': 1, '2': 1, 's': 1})

You can also use apply or s.str.lower():

print(Counter(chain.from_iterable(s.apply(str.lower))))
print(Counter(chain.from_iterable(s.str.lower())))
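All of the chain variants above count spaces as characters. If the words should be split first, as in the loop from the question, one possible sketch is to apply chain.from_iterable twice: once to flatten rows into words, and once to flatten words into letters. The name `rows` is an assumption, a plain list standing in for the Series.

```python
from collections import Counter
from itertools import chain

# Stand-in for the pandas Series from the question (assumption).
rows = [
    "hello go home bye bye",
    "you can't always get",
    "what you waaaaaaant",
    "apple banana carrot munch 123",
]

# First flattening: rows -> words (split() discards the whitespace).
words = chain.from_iterable(row.lower().split() for row in rows)

# Second flattening: words -> letters, which Counter then tallies.
c = Counter(chain.from_iterable(words))

print(c["a"], c["e"])
```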

Answer 2 (score: 2)

Using pandas:

In [6]: pd.Series(list(''.join(s))).value_counts()
Out[6]: 
a    16
     13
e     6
o     6
n     5
t     5
y     5
h     4
l     4
u     3
b     3
c     3
w     3
p     2
m     2
r     2
g     2
1     1
s     1
'     1
2     1
3     1
dtype: int64

In [7]: dict(pd.Series(list(''.join(s))).value_counts())
Out[7]: 
{' ': 13,
 "'": 1,
 '1': 1,
 '2': 1,
 '3': 1,
 'a': 16,
 'b': 3,
 'c': 3,
 'e': 6,
 'g': 2,
 'h': 4,
 'l': 4,
 'm': 2,
 'n': 5,
 'o': 6,
 'p': 2,
 'r': 2,
 's': 1,
 't': 5,
 'u': 3,
 'w': 3,
 'y': 5}
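The counts above include the space character. A hedged sketch of one way to leave it out and lowercase the text, assuming words are separated only by ordinary spaces as in the sample data, is to drop that label from the value_counts result:

```python
import pandas as pd

s = pd.Series([
    "hello go home bye bye",
    "you can't always get",
    "what you waaaaaaant",
    "apple banana carrot munch 123",
])

# One character per element, lowercased, then counted.
counts = pd.Series(list("".join(s).lower())).value_counts()

# Remove the entry for the space character (assumes " " is the only separator).
counts = counts.drop(" ")

print(counts.to_dict())
```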

Answer 3 (score: 1)

You want this:

dict(zip([letter for row in s for word in row.lower().split() for letter in word], range(len([letter for row in s for word in row.lower().split() for letter in word]))))
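Note that zip pairs each letter with its position in the flattened list, so the dict above maps every letter to the index of its last occurrence rather than to its frequency. A minimal sketch of a variant that does produce frequencies with a plain dict comprehension follows. The names `rows` and `letters` are assumptions (a plain list stands in for the Series), and the repeated list.count calls make this quadratic, so Counter is likely the better choice for anything large.

```python
# Stand-in for the pandas Series from the question (assumption).
rows = [
    "hello go home bye bye",
    "you can't always get",
    "what you waaaaaaant",
    "apple banana carrot munch 123",
]

# Flatten once: rows -> words -> letters.
letters = [letter for row in rows for word in row.lower().split() for letter in word]

# Map each distinct letter to the number of times it occurs.
d = {letter: letters.count(letter) for letter in set(letters)}

print(d["a"])
```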