Concatenating characters in a nested list

Time: 2018-12-05 00:39:52

Tags: python string parsing recursion

I am currently working with a data structure like this:

['t','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','1','t','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','2', ['t','h','i','s',' ','i','s',' ','a',' ','s','u','b','q','u','e','r','y'], 't','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','3']

I got this data by parsing a query string with the following answer from SO: https://stackoverflow.com/a/17141441

The query I parsed is:

(this is query item 1 this is query item 2(this is a subquery)this is query item 3)

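The problem is that the parser works on single characters, appending them to the list one at a time. I need to get back to a structure like this: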

['this is query item 1 this is query item 2', ['this is a subquery'], 'this is query item 3']

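I am trying either to build this into the parser function or to run a post-processing step that joins the characters back together. Does anyone know a solution?

1 Answer:

Answer 0 (score: 2)

As a post-processing step, you can use itertools.groupby inside a recursive function: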

from itertools import groupby

data = ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '1', 't', 'h',
        'i', 's',
        ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '2',
        ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 'u', 'b', 'q', 'u', 'e', 'r', 'y'], 't', 'h', 'i', 's',
        ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '3']


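def join(lst):
    # groupby splits lst into consecutive runs of lists and runs of non-lists
    for is_list, group in groupby(lst, key=lambda x: isinstance(x, list)):
        if is_list:
            # each sublist in the run is processed recursively and kept as a list
            yield from (list(join(value)) for value in group)
        else:
            # a run of single characters collapses into one string
            yield ''.join(group)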


result = list(join(data))
print(result)

Output

['this is query item 1this is query item 2', ['this is a subquery'], 'this is query item 3']

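This groups consecutive elements into runs of lists and runs of characters. A run of characters is concatenated with the built-in str.join; for a run of lists, join is called recursively on each list. Note that the first string reads 'item 1this' because data contains no space element between '1' and 't'.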
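For comparison, the same grouping can be written as an explicit loop without itertools. This is only a minimal sketch, assuming (as in the data above) that every element is either a single-character string or a list. The name join_nested and the sample input are made up for illustration and are not part of the original answer.

```python
def join_nested(items):
    """Collapse runs of characters into strings, recursing into sublists."""
    result = []
    buffer = []  # characters seen since the last sublist (or since the start)
    for item in items:
        if isinstance(item, list):
            # a sublist ends the current run of characters, so flush it first
            if buffer:
                result.append(''.join(buffer))
                buffer = []
            result.append(join_nested(item))
        else:
            buffer.append(item)
    # flush whatever characters remain after the last sublist
    if buffer:
        result.append(''.join(buffer))
    return result


sample = ['a', 'b', ['c', 'd', ['e']], 'f']  # hypothetical input
print(join_nested(sample))  # ['ab', ['cd', ['e']], 'f']
```

Unlike the generator version, this returns a list directly, so no list(...) wrapper is needed at the call site.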