Question

I have a list of noun phrase chunks:

NP = [["The dog"], ["it"], ["black car"], ["one cow"], ["the gift in the box"]]

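I need to count the number of tokens in each inner list. For example, NP[0] is ["The dog"], and "The dog" is two tokens. How do I compute this for every element of the nested list?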

Answer 1

Iterate over the list and use len(). This assumes each chunk is already split into one token per element:

>>> NP = [["The", "dog"], ["it"], ["black", "car"], ["one", "cow"], ["the", "gift", "in", "the", "box"]]

>>> [len(x) for x in NP]
[2, 1, 2, 2, 5]

Answer 2

Based on the example you provided, str.split inside a list comprehension should work:

>>> NP = [["The dog"], ["it"], ["black car"], ["one cow"], ["the gift in the box"]]
>>> token_counts = [len(nested[0].split()) for nested in NP]
>>> token_counts
[2, 1, 2, 2, 5]
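
Note that nested[0] only looks at the first string, so it would miss tokens if an inner list ever held more than one string, and it would raise IndexError on an empty inner list. A minimal sketch that sums over every string in each inner list instead (the function name count_tokens is made up for illustration, it is not from the question):

```python
def count_tokens(chunks):
    """Return the whitespace-token count for each inner list of strings."""
    # Sum over every string in the inner list, so an empty list gives 0
    # and a list with several strings is counted in full.
    return [sum(len(s.split()) for s in chunk) for chunk in chunks]


NP = [["The dog"], ["it"], ["black car"], ["one cow"], ["the gift in the box"]]
token_counts = count_tokens(NP)
print(token_counts)  # [2, 1, 2, 2, 5]
```

Because single words split into one token each, this should also give the same counts for the pre-tokenized form such as [["The", "dog"], ["it"]].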

Answer 3

I assume you mean [['The dog'], ['it'], ['black car'], ['one cow'], ['the gift in the box']]

num_of_tokens = [len(np[0].split()) for np in NP]

Answer 4

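Please post actual code examples so people can help you more accurately.

It is not clear whether your elements are strings or lists containing a single string.

Map a lambda over the list that calls split() on each element's string to get its tokens, then takes len() to get the number of tokens. This returns a list of integers giving the token count for each string: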

NP = [["The dog"], ["it"], ["black car"], ["one cow"], ["the gift in the box"]]
lengthlist = list(map(lambda text:len(text[0].split()), NP))

where lengthlist = [2, 1, 2, 2, 5]
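
Since the shape of the elements is ambiguous, here is a rough sketch of a helper that accepts either a plain string or a list of strings. The name token_count and the two sample lists are assumptions for illustration only:

```python
def token_count(item):
    """Count whitespace-separated tokens in a string or a list of strings."""
    if isinstance(item, str):
        # Plain string element, e.g. "The dog"
        return len(item.split())
    # Otherwise treat the element as a list of strings, e.g. ["The dog"]
    return sum(len(s.split()) for s in item)


flat = ["The dog", "it", "black car"]
nested = [["The dog"], ["it"], ["black car"]]

flat_counts = list(map(token_count, flat))
nested_counts = list(map(token_count, nested))
print(flat_counts, nested_counts)
```

Both calls should produce the same counts, so the helper can be mapped over the list without knowing in advance which form the data takes.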

Count the elements of a nested list of lists in Python
