Question

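I have output from an HTTP request that is of type string, but the data is CSV-like. That is because the output type in my request header is CSV ('Accept': "application/csv"), since that is the format the source supports. The response content, however, is a string: with res = request.content, type(res) gives me a string.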

Here is sample output of the object (res):

QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined

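As you can see, the data is in CSV form and contains multiple tables, such as "AData" and "BData". I can't work out which approach to take to read this. I tried the csv module, but it didn't help. I also tried a dict/csv conversion, but again did not get the desired output. I may be doing something wrong, as I'm new to Python. I need to read each table from the output object.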

with open('file.csv', 'wb') as csvfile:
  spamwriter = csv.writer(csvfile, delimiter=',',quoting=csv.QUOTE_NONE)
  spamwriter.writerow(rec)

with open('file.csv') as csvfile:
  reader = csv.DictReader(csvfile)
  for row in reader:
    print row

Experts, please advise :-)

Answer 1

You can pre-parse the output with a regular expression to split it into its sections, then use StringIO to feed each section to csv.reader, as follows:

import csv
import re
import StringIO
from collections import OrderedDict

output = """
QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined
23,-1,144488,undefined"""

sections = ['QueryTime', 'Data - AData', 'Data - BData', 'Data']
re_sections = '|'.join([re.escape(s) for s in sections])
tables = re.split(r'(' + re_sections + ')', output)
tables = [t.strip() for t in tables[1:]]

d_tables = OrderedDict()

for section, table in zip(*[iter(tables)]*2):
    if len(table):
        csv_input = csv.reader(StringIO.StringIO(table))
        d_tables[section] = list(csv_input)

for section, entries in d_tables.items():
    print section
    print entries
    print

This gives you the following output:

QueryTime
[['start', 'end'], ['144488', '144490']]

Data - AData
[['id', 'G_id', 'name', 'type', 'time', 'sid', 'channel'], ['23', '-1', 'B1', 'type1', '144488', '11', 'CH23'], ['23', '-1', 'B1', 'type1', '144488', '11', 'CH23']]

Data - BData
[['id', 'G_id', 'time', 'se'], ['23', '-1', '144488', 'undefined'], ['23', '-1', '144488', 'undefined']]
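
The code above is Python 2 (print statements, the StringIO module). In case it helps, here is a rough Python 3 sketch of the same idea. The function name split_tables and the variable sample are made up for illustration, and anchoring the pattern to whole lines with re.MULTILINE is my own addition so that a section name cannot match in the middle of a data row. Treat it as a sketch under those assumptions rather than a drop-in replacement.

```python
import csv
import io
import re

# Shortened copy of the sample response from the question.
sample = """QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined"""


def split_tables(output, sections):
    """Split `output` into {section name: list of parsed CSV rows}."""
    alternatives = '|'.join(re.escape(s) for s in sections)
    # Match a section name only when it fills a whole line.
    pattern = r'^(' + alternatives + r')[ \t]*$'
    # The capturing group makes re.split keep the section names.
    parts = re.split(pattern, output, flags=re.MULTILINE)
    parts = [p.strip() for p in parts[1:]]

    tables = {}
    # parts now alternates: name, body, name, body, ...
    for name, body in zip(parts[0::2], parts[1::2]):
        if body:  # skip headings with no rows, such as the bare "Data"
            tables[name] = list(csv.reader(io.StringIO(body)))
    return tables


tables = split_tables(
    sample, ['QueryTime', 'Data - AData', 'Data - BData', 'Data'])
for name, rows in tables.items():
    print(name)
    print(rows)
    print()
```

As in the original, the section names have to be known up front, and csv.reader takes care of the quotes around "B1".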

Answer 2

I came up with this function to parse the data:

def parse_data(data):
    parsed = {}
    current_section = None

    for line in data.split('\n'):
        line = line.strip()
        if line:
            if ',' in line:
                # A line containing commas is a row of the current section.
                current_section.append(line.split(','))
            else:
                # Any other non-empty line starts a new section.
                parsed[line] = []
                current_section = parsed[line]
    return parsed

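It returns a dictionary in which each key refers to one section of the input. Each value is a list whose members each represent one row of input. Each row is in turn a list of the individual values as strings. It does not treat the first row of a section specially.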

Running it on the input produces this (reformatted for readability):

{
 'Data - AData': [
  ['id', 'G_id', 'name', 'type', 'time', 'sid', 'channel'],
  ['23', '-1', '"B1"', 'type1', '144488', '11', 'CH23'],
  ['23', '-1', '"B1"', 'type1', '144488', '11', 'CH23']
 ],
 'Data - BData': [
  ['id', 'G_id', 'time', 'se'],
  ['23', '-1', '144488', 'undefined'],
  ['23', '-1', '144488', 'undefined']
 ],
 'Data': [
 ],
 'QueryTime': [
  ['start', 'end'],
  ['144488', '144490']
 ]
}
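
Two things are visible in that output: the quotes around "B1" are kept, because the rows are split on commas by hand, and the header row is mixed in with the data rows. If that matters, one possible variation is to collect the lines per section in the same way and then hand them to csv.DictReader. This is a Python 3 sketch; parse_sections and sample are hypothetical names, and it still assumes that any non-empty line without a comma is a section heading.

```python
import csv
import io

# Shortened copy of the sample response from the question.
sample = """QueryTime
start,end
144488,144490

Data

Data - AData
id,G_id,name,type,time,sid,channel
23,-1,"B1",type1,144488,11,CH23
Data - BData
id,G_id,time,se
23,-1,144488,undefined"""


def parse_sections(data):
    """Return {section name: list of row dicts keyed by the header row}."""
    raw = {}
    current = None
    for line in data.splitlines():
        line = line.strip()
        if not line:
            continue
        if ',' not in line:
            # Heuristic: a non-empty line without commas is a heading.
            current = raw.setdefault(line, [])
        elif current is None:
            raise ValueError('row found before any section heading')
        else:
            current.append(line)

    parsed = {}
    for name, lines in raw.items():
        # DictReader uses the first line as field names and the csv
        # module strips the quoting from values such as "B1".
        reader = csv.DictReader(io.StringIO('\n'.join(lines)))
        parsed[name] = [dict(row) for row in reader]
    return parsed


result = parse_sections(sample)
print(result['Data - AData'][0]['name'])
```

A section with a heading but no rows, like the bare "Data" line, ends up as an empty list.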

Reading comma-separated values from a string object in Python