Read and process a text file and save it to CSV

Time: 2017-05-09 17:26:30

Tags: python string python-3.x csv byte

My file appears to be in "dict" format...

The file header is: time,open,high,low,close,volume

The next line is: `{"t":[1494257340],"o":[206.7],"h":[209.3],"l":[204.50002],"c":[204.90001],"v":[49700650]}`

    import csv

    with open('test_data.txt', 'rb') as f:
        for line in f:
            dict_file = eval(f.read())
            time = (dict_file['t'])    # print (time) result [1494257340]
            open_price = (dict_file['o'])    # print (open_price) result [206.7]
            high = (dict_file['h'])    # print (high) result [209.3]
            low = (dict_file['l'])    # print (low) result [204.50002]
            close = (dict_file['c'])    # print (close) result [204.90001]
            volume = (dict_file['v'])    # print (volume) result [49700650]

            print(time, open_price, high, low, close, volume)

    # print result [1494257340] [206.7] [209.3] [204.50002] [204.90001] [49700650]

    # I need to remove the [] from the output.

    # expected result

    # 1494257340, 206.7, 209.3, 204.50002, 204.90001, 49700650

The result I need (with the time converted from epoch format to dd,mm,yy):

5/8/17, 206.7, 209.3, 204.50002, 204.90001, 49700650

So I know I need the csv.writer function.

1 Answer:

Answer 0 (score: 0)

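I see a number of problems in the code you posted. I suggest you break the task into small pieces and see whether you can get each one working on its own. So what you want to do is:

  1. Open the file
  2. Read the file line by line
  3. eval each line to get a dict object
  4. Get the values from that object
  5. Write those values to a (separate?) csv file

Right?

Now do each one, one small step at a time.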

1. Open the file.

   You're pretty much there:

       with open('test_data.txt', 'rb') as f:
           print(f.read())

       # b'{"t":[1494257340],"o":[207.75],"h":[209.8],"l":[205.75],"c":[206.35],"v":[61035956]}\n'

   You can also open the file in r mode, which gives you strings instead of bytes objects:

       with open('test_data.txt', 'r') as f:
           print(f.read())

       # {"t":[1494257340],"o":[207.75],"h":[209.8],"l":[205.75],"c":[206.35],"v":[61035956]}

   Bytes might cause some problems later, but it should work, since eval handles them just fine (at least in Python 3).
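To make the difference between the two modes concrete, here is a minimal sketch. It uses in-memory streams (`io.BytesIO` and `io.TextIOWrapper`) as stand-ins for the real file, and it assumes the file is UTF-8 encoded; neither detail comes from the original post.

```python
import io

# One line of sample data, as it would sit on disk.
raw = b'{"t":[1494257340],"o":[207.75]}\n'

# Binary mode ('rb') yields bytes; text mode ('r') yields str.
binary_line = io.BytesIO(raw).readline()
text_line = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8").readline()

print(type(binary_line).__name__)  # bytes
print(type(text_line).__name__)    # str

# Decoding the bytes gives the same text that text mode returns.
print(binary_line.decode("utf-8") == text_line)  # True
```

If later steps expect strings, decoding once at the point of reading keeps the rest of the code simpler.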

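2. Read the file line by line.

       with open('test_data.txt', 'rb') as f:
           for line in f:
               print(line)

       # b'{"t":[1494257340],"o":[207.75],"h":[209.8],"l":[205.75],"c":[206.35],"v":[61035956]}\n'

   Here is another problem in your code: you never use the line variable and call f.read() instead. That just reads the rest of the file (starting from the second line, because the first line has already been consumed). Try swapping one for the other and see what happens.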
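For readers who want to see the effect without a file on disk, the sketch below reproduces the pattern with `io.StringIO`, an in-memory stand-in chosen for illustration. The three sample lines are made up.

```python
import io

f = io.StringIO("line 1\nline 2\nline 3\n")

seen = []
for line in f:
    # The loop has already consumed one line; read() swallows everything else.
    rest = f.read()
    seen.append((line, rest))

# The loop body runs only once, because read() left nothing to iterate over.
print(seen)  # [('line 1\n', 'line 2\nline 3\n')]
```

This is why the question's code appears to work on a two-line file: the header is consumed by the loop and the single data line is what `f.read()` returns.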

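3. eval each line to get a dict object.

   Again, this works fine. But I would add some protection here. What if you hit an empty or malformed line in the file? Also, if the file comes from an untrusted source, you could fall victim to code injection, for example if a line in the file were changed to:

       print("You've been hacked") or {"t":[1494257340],"o":[207.75],"h":[209.8],"l":[205.75],"c":[206.35],"v":[61035956]}

   Running the loop then executes that line:

       with open('test_data.txt', 'rb') as f:
           for line in f:
               dict_file = eval(line)
               print(dict_file)

       # You've been hacked
       # {'t': [1494257340], 'o': [207.75], 'h': [209.8], 'l': [205.75], 'c': [206.35], 'v': [61035956]}

   I don't know your exact specs, but you would be safer using json.loads.

   ...

   Can you continue from there?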
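One possible shape for the json.loads suggestion, including the blank-line and malformed-line protection mentioned in step 3. The helper name `parse_line` and the choice to return `None` for bad lines are assumptions made for this sketch, not part of the original answer.

```python
import json


def parse_line(line):
    """Return the dict encoded on one line, or None if it is blank or malformed."""
    line = line.strip()
    if not line:
        return None
    try:
        data = json.loads(line)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return data if isinstance(data, dict) else None


good = '{"t":[1494257340],"o":[207.75]}\n'
injected = 'print("You\'ve been hacked") or {"t":[1494257340]}\n'

print(parse_line(good))      # {'t': [1494257340], 'o': [207.75]}
print(parse_line(injected))  # None, and nothing is executed
print(parse_line("\n"))      # None
```

Unlike eval, json.loads only parses data, so the injected line is rejected as invalid JSON instead of being run.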

4. Get the values from the object.

   I don't think dict_file['t'] gives you the value you expect.

   What does it give you?

   Why?

   How can you fix it?
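As a hint toward the questions in step 4, the sketch below parses the sample line from the question and shows what indexing returns. Using json.loads here rather than eval is this sketch's choice; the names `data` and `row` are illustrative.

```python
import json

sample = '{"t":[1494257340],"o":[206.7],"h":[209.3],"l":[204.50002],"c":[204.90001],"v":[49700650]}'
data = json.loads(sample)

# Each value is a one-element list, which is where the [] in the output comes from.
print(data['t'])     # [1494257340]
print(data['t'][0])  # 1494257340

# Take element 0 of every list, in time/open/high/low/close/volume order.
row = [data[key][0] for key in 'tohlcv']
print(', '.join(str(value) for value in row))
# 1494257340, 206.7, 209.3, 204.50002, 204.90001, 49700650
```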

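5. Write those values to a csv file.

   Can you write some arbitrary string to a file?

   What does the csv format look like? Can you format your values to match it?

   Check the documentation for the csv module. Does it help you?

And so on...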
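Since the question mentions csv.writer, here is a minimal sketch of how it might be used for step 5. `io.StringIO` stands in for a real output file so the example runs anywhere; with a file you would typically use `open('output.csv', 'w', newline='')`, where the file name is only an assumption. The row values are the sample from the question.

```python
import csv
import io

rows = [
    [1494257340, 206.7, 209.3, 204.50002, 204.90001, 49700650],
]

buffer = io.StringIO()  # stand-in for an opened output file
writer = csv.writer(buffer)
writer.writerow(['time', 'open', 'high', 'low', 'close', 'volume'])  # header
writer.writerows(rows)  # one csv line per inner list

print(buffer.getvalue())
```

csv.writer takes care of separators and quoting, which matters once a value can itself contain a comma.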

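Edit: solution

    # you can save the print output in a file by running:
    # $ python convert_to_csv.py > output.csv
    import datetime, decimal, json, os


    CSV_HEADER = 'time,open,high,low,close,volume'


    # json.loads accepts bytes on Python 3.6+, so 'rb' mode works here
    with open('test_data.txt', 'rb') as f:

        print(CSV_HEADER)

        for line in f:
            # parse_float=decimal.Decimal keeps prices exactly as written in the file
            data = json.loads(line, parse_float=decimal.Decimal)
            # '%#d' (Windows) and '%-d' (elsewhere) drop the leading zero
            data['t'][0] = datetime.datetime.fromtimestamp(data['t'][0]) \
                .strftime('%#d/%#m/%y' if os.name == 'nt' else '%-d/%-m/%y')
            # every value is a one-element list, so take element 0 of each
            print(','.join(str(data[k][0]) for k in 'tohlcv'))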
              

Running it:

    $ cat test_data.txt
    {"t":[1494257340],"o":[207.75],"h":[209.8],"l":[205.75],"c":[206.35],"v":[61035956]}
    {"t":[1490123123],"o":[107.75],"h":[109.8],"l":[105.75],"c":[106.35],"v":[11035956]}
    {"t":[1491234234],"o":[307.75],"h":[309.8],"l":[305.75],"c":[306.35],"v":[31035956]}

    $ python convert_to_csv.py
    time,open,high,low,close,volume
    8/5/17,207.75,209.8,205.75,206.35,61035956
    21/3/17,107.75,109.8,105.75,106.35,11035956
    3/4/17,307.75,309.8,305.75,306.35,31035956
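One caveat about the date conversion: fromtimestamp without a time zone uses the machine's local zone, so the date can differ between machines, and the `%-d` / `%#d` flags are platform-specific. A possible portable variant is sketched below. It assumes UTC is the zone you want, which the post does not state, and the helper name `format_epoch` is hypothetical.

```python
import datetime


def format_epoch(seconds):
    """Format epoch seconds as d/m/yy without zero padding, interpreted in UTC."""
    dt = datetime.datetime.fromtimestamp(seconds, tz=datetime.timezone.utc)
    # dt.day and dt.month are plain ints, so no platform-specific flag is needed.
    return f"{dt.day}/{dt.month}/{dt:%y}"


print(format_epoch(1494257340))  # 8/5/17
print(format_epoch(1490123123))  # 21/3/17
print(format_epoch(1491234234))  # 3/4/17
```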