Converting data into a data structure

Time: 2017-04-03 17:36:47

Tags: python python-3.x data-structures spreadsheet

I have a text file that I need to convert into a list. This is the data from the text file:

'ID=s4k5jk\nDate=8 December 1970\nTitle=crossing the atlantic on a tricycle\nID=f983\nDate=22 December 1970\nTitle=Royal Episode 13'

I need the output as a list that looks like this:

l = [
 #ID               Date               Title        
["s4k5jk", "8 December 1970", "crossing the atlantic on a tricycle"],
["f983",   "22 December 1970",   "Royal Episode 13"]]

Can someone let me know how to convert this? Many thanks!

2 answers:

Answer 0 (score: 2)

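Since each item starts with "ID=", I use that term to split() the initial string.

Then it is just a matter of splitting each piece on "\n", manipulating some strings, and appending to a list called results.

Code:

s = 'ID=s4k5jk\nDate=8 December 1970\nTitle=crossing the atlantic on a tricycle\nID=f983\nDate=22 December 1970\nTitle=Royal Episode 13'
# Each record begins with "ID=", so split the text into one chunk per record.
data = s.split("\nID=")
results = []
for d in data:
    # Each chunk holds three lines: ID, Date, Title.
    res = d.split("\n")
    # Only the first chunk still carries its "ID=" prefix; replace() is a no-op for the rest.
    _id = res[0].replace("ID=", "")
    _date = res[1].replace("Date=", "")
    _title = res[2].replace("Title=", "")
    results.append([_id, _date, _title])
for r in results:
    print(r)

Output:

['s4k5jk', '8 December 1970', 'crossing the atlantic on a tricycle']
['f983', '22 December 1970', 'Royal Episode 13']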
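A variation on this answer, offered only as a sketch: instead of splitting on "\nID=" and stripping known prefixes, each line can be cut at its first "=" with str.partition, starting a new record whenever the key is "ID". The function name parse_records and its first_key parameter are hypothetical names chosen for illustration, and the sketch assumes every record begins with an ID line, as in the sample data.

```python
def parse_records(text, first_key="ID"):
    """Group KEY=value lines into lists, one list per record.

    Hypothetical helper: assumes each record starts with a `first_key` line.
    """
    records = []
    for line in text.splitlines():
        # partition() cuts at the first "=" only, so values may contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and lines without "="
        if key == first_key:
            records.append([])  # a new record begins here
        if records:
            records[-1].append(value)
    return records


s = ('ID=s4k5jk\nDate=8 December 1970\n'
     'Title=crossing the atlantic on a tricycle\n'
     'ID=f983\nDate=22 December 1970\nTitle=Royal Episode 13')

for record in parse_records(s):
    print(record)
```

Because the values are collected in the order they appear, this version does not depend on the field names Date and Title, only on the key that opens each record.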

Answer 1 (score: 1)

You can also try a regular-expression approach:

>>> import re
>>> print(s)
ID=s4k5jk
Date=8 December 1970
Title=crossing the atlantic on a tricycle
ID=f983
Date=22 December 1970
Title=Royal Episode 13
>>> fields = re.findall(r'ID=([\s\S]+?)\sDate=([\s\S]+?)\sTitle=([\s\S]+?)$', s, re.MULTILINE)
>>> fields
[('s4k5jk', '8 December 1970', 'crossing the atlantic on a tricycle'), ('f983', '22 December 1970', 'Royal Episode 13')]
>>>

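Note that with capture groups, re.findall works exactly the way one would hope: it returns a list of tuples, one per match, with one element per group.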
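Since the question asks for a list of lists while re.findall yields tuples, a small conversion step may be wanted. The following is a minimal sketch under the assumption that the three fields always appear on consecutive lines in the order ID, Date, Title; the names RECORD and parse_with_regex are made up for this example, and the pattern is a simplified line-anchored variant rather than the one used in the answer above.

```python
import re

# Without re.DOTALL, "." does not match a newline, so each group
# captures the remainder of exactly one line.
RECORD = re.compile(r"^ID=(.*)\nDate=(.*)\nTitle=(.*)$", re.MULTILINE)


def parse_with_regex(text):
    """Return one [id, date, title] list per record found in text."""
    # findall() returns a tuple per match when the pattern has several
    # groups; list() converts each tuple to the requested list form.
    return [list(match) for match in RECORD.findall(text)]


s = ('ID=s4k5jk\nDate=8 December 1970\n'
     'Title=crossing the atlantic on a tricycle\n'
     'ID=f983\nDate=22 December 1970\nTitle=Royal Episode 13')

l = parse_with_regex(s)
for row in l:
    print(row)
```

Records that do not follow the assumed three-line layout are silently skipped by this pattern, which may or may not be the desired behavior for real input files.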