Question

I have the following text file containing the demand in a network.

Origin  1
    1 :      0.0;     2 :    100.0;     3 :    100.0;     4 :    500.0;     5 :    200.0;
    6 :    300.0;     7 :    500.0;     8 :    800.0;     9 :    500.0;    10 :   1300.0;
   11 :    500.0;    12 :    200.0;    13 :    500.0;    14 :    300.0;    15 :    500.0;
   16 :    500.0;    17 :    400.0;    18 :    100.0;    19 :    300.0;    20 :    300.0;
   21 :    100.0;    22 :    400.0;    23 :    300.0;    24 :    100.0;

Origin  2
    1 :    100.0;     2 :      0.0;     3 :    100.0;     4 :    200.0;     5 :    100.0;
    6 :    400.0;     7 :    200.0;     8 :    400.0;     9 :    200.0;    10 :    600.0;
   11 :    200.0;    12 :    100.0;    13 :    300.0;    14 :    100.0;    15 :    100.0;
   16 :    400.0;    17 :    200.0;    18 :      0.0;    19 :    100.0;    20 :    100.0;
   21 :      0.0;    22 :    100.0;    23 :      0.0;    24 :      0.0;

Origin  3
    1 :    100.0;     2 :    100.0;     3 :      0.0;     4 :    200.0;     5 :    100.0;
    6 :    300.0;     7 :    100.0;     8 :    200.0;     9 :    100.0;    10 :    300.0;
   11 :    300.0;    12 :    200.0;    13 :    100.0;    14 :    100.0;    15 :    100.0;
   16 :    200.0;    17 :    100.0;    18 :      0.0;    19 :      0.0;    20 :      0.0;
   21 :      0.0;    22 :    100.0;    23 :    100.0;    24 :      0.0;

... records 4-23 elided ...

Origin  24
    1 :    100.0;     2 :      0.0;     3 :      0.0;     4 :    200.0;     5 :      0.0;
    6 :    100.0;     7 :    100.0;     8 :    200.0;     9 :    200.0;    10 :    800.0;
   11 :    600.0;    12 :    500.0;    13 :    700.0;    14 :    400.0;    15 :    400.0;
   16 :    300.0;    17 :    300.0;    18 :      0.0;    19 :    100.0;    20 :    400.0;
   21 :    500.0;    22 :   1100.0;    23 :    700.0;    24 :      0.0;

Now I need to create a dictionary that should look like this:

{(1,1):0.0, (1,2):100.0, (1, 3):100.0, .......
 (2, 1):100.0, (2,2):0, ......}

The elements of each tuple, e.g. (1, 2), are the origin and the destination, and the value is the demand (100.0 for the key (1, 2)).

I tried the following:

with open("trips.txt", "r") as f:
     line = f.readline()
     line = f.readline()
     ind = 0
     while len(line):
         line = line.strip(';')
         l = line.split()
         print l

         ind = ind + 1
         if(ind == 5):
             line = f.readline()
             line = f.readline()
             line = f.readline()
             ind = 0
             node = node + 1
         else:
             line = f.readline()

But I don't think I'm getting anywhere...

Answer 1

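You're definitely not getting anywhere, because your code never refers to a dictionary at all.

I'll outline a process for you here; can you fill in the details?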

my_dict = {}

while not EOF:    # pseudocode: loop until the file is exhausted
    # read the "Origin" line
    line = f.readline()

    # extract the number on the right
    origin_num = int( line.split()[-1] )

    # Read the data lines
    for _ in range(5):    # each data chunk has 5 lines
        data_line = f.readline()
        entries = data_line.split(';')    # split at semicolons

        for field in entries:
            y_key, value = field.split(':')
            # Now, you need to convert y_key to an integer and value to a float,
            #    combine y_key with the origin_num,
            #    and insert that value into my_dict.

Does that get you moving? Note that you will also need to handle the blank lines, detect the end of the file, and so on.
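One way the outline above could be filled in is sketched below. It is only a sketch under assumptions: the function name `parse_trips` and the `sample` text are made up for illustration, blank lines are simply skipped instead of counting five data lines per chunk, and the end of the file is detected by the line iterator running out rather than by an explicit EOF test.

```python
def parse_trips(lines):
    """Build {(origin, destination): demand} from an iterable of text lines."""
    my_dict = {}
    origin_num = None
    for line in lines:
        line = line.strip()
        if not line:                      # blank separator line
            continue
        if line.startswith("Origin"):     # header such as "Origin  1"
            origin_num = int(line.split()[-1])
            continue
        for field in line.split(";"):     # data line: "1 : 0.0; 2 : 100.0;"
            if not field.strip():         # empty piece after the last ';'
                continue
            y_key, value = field.split(":")
            my_dict[(origin_num, int(y_key))] = float(value)
    return my_dict


# Hypothetical two-record sample in the same layout as the question's file.
sample = """Origin  1
    1 :      0.0;     2 :    100.0;

Origin  2
    1 :    100.0;     2 :      0.0;
"""

demand = parse_trips(sample.splitlines())
print(demand[(1, 2)])  # 100.0
```

Because the function accepts any iterable of lines, an open file object can be passed directly, e.g. `with open("trips.txt") as f: demand = parse_trips(f)`.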

Answer 2

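Well, if you want to extract the data you need to parse the file line by line. The algorithm should be roughly:

Scan the file line by line:
- if the line is empty, skip it
- if the line starts with 'Origin', capture the number that follows it (origin_no)
- else split the line on semicolons and, for each element:
  - split it on the colon
  - the first part is the second dict key number (element_no)
  - the second part is the value (value_no)
  - store it in your result dictionary as (origin_no, element_no): value_no

It is quite simple to implement: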

result = {}  # we'll store our result in this dict
origin_no = 0  # our starting Origin number in case the file doesn't begin with one
with open("trips.txt", "r") as f:
    for line in f:
        line = line.rstrip()  # we're not interested in the newline at the end
        if not line:  # empty line, skip
            continue
        if line.startswith("Origin"):
            origin_no = int(line[7:].strip())  # grab the integer following Origin
        else:
            elements = line.split(";")  # get our elements by splitting by semi-colon
            for element in elements:  # loop through each of them:
                if not element:  # we're not interested in the last element
                    continue
                element_no, element_value = element.split(":")  # get our pair
                # beware, these two are now most likely padded strings!
                # that's why we'll strip them from whitespace and convert to integer/float
                result[(origin_no, int(element_no.strip()))] = float(element_value.strip())
# Done!
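Once the dictionary is built, a quick sanity check can catch lines that were skipped or mis-parsed. The sketch below is an illustration under assumptions, not part of the answer above: the helper name `missing_pairs` is hypothetical, and it assumes the nodes are numbered 1 to `n_nodes` with one entry expected for every (origin, destination) pair, as in the 24-node file from the question.

```python
def missing_pairs(result, n_nodes):
    """Return the (origin, destination) keys that are absent from result."""
    expected = {(o, d)
                for o in range(1, n_nodes + 1)
                for d in range(1, n_nodes + 1)}
    return sorted(expected - set(result))


# Hypothetical 2-node result with one entry deliberately left out.
partial = {(1, 1): 0.0, (1, 2): 100.0, (2, 1): 100.0}
print(missing_pairs(partial, 2))  # [(2, 2)]
```

For the full file, `missing_pairs(result, 24)` should come back empty if all 576 entries were read.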

Answer 3

You could try:

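with open('trips.txt', 'r') as f:
    dic = {}
    try:
        while True:
            # header line such as "Origin  1": the number is the second token
            num = int(next(f).split()[1])
            lst = []
            for _ in range(5):  # each record has 5 data lines
                lst.append(next(f).strip().split(';'))
            # store the record before skipping the separator, so the last
            # record is kept even if the file does not end with a blank line
            for n in lst:
                for l in n:
                    if l != '':
                        tmp = l.strip().split(':')
                        dic[(num, int(tmp[0]))] = float(tmp[1])
            next(f)  # skip the blank line between records
    except StopIteration:
        # raised by next(f) once the file is exhausted
        print(dic)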

Output:

{(1, 21): 100.0, (1, 3): 100.0, (2, 18): 0.0, (2, 8): 400.0, (1, 17): 400.0, (2, 1): 100.0, (1, 15): 500.0, (2, 22): 100.0....etc}
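The keys in the output above appear in no particular order (older Python versions do not keep insertion order in a plain `dict`). If a readable listing is wanted, sorting the items is one option. The snippet below is a small sketch that reuses four entries from that output as hypothetical input; the names `ordered`, `origin` and `dest` are chosen here for illustration.

```python
# Four entries copied from the output above, used as sample input.
dic = {(1, 21): 100.0, (1, 3): 100.0, (2, 18): 0.0, (2, 8): 400.0}

# Tuples sort element by element, so this orders by origin, then destination.
ordered = sorted(dic.items())

for (origin, dest), value in ordered:
    print(origin, dest, value)
```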

Answer 4

Another approach -

nw.usage is the file that holds the usage data.

As I commented in the code below, use collections.OrderedDict() if you want to maintain insertion order.

Hope it helps!

#!/usr/bin/env python

import re
#import collections

with open('nw.usage', 'r') as f:
  usage_dict = {}
  #Use collections.OrderedDict() if you want to maintain insertion order
  origin_val = ''
  for line in f:
    if re.search('Origin', line):
      # take the whole last token, so multi-digit origins such as 24 stay intact
      origin_val = line.split()[-1]
    else:
      hr_demand = line.strip().split(';')
      for hr in hr_demand:
        if not hr:
          continue
        hour = hr.split(':')[0].strip()
        usage = hr.split(':')[1].strip()
        # keys and values stay strings here; wrap them in int()/float() for numbers
        usage_dict[(origin_val, hour)] = usage

  print(usage_dict)

The output is -

{('1', '17'): '400.0', ('2', '2'): '0.0', ('2', '17'): '200.0', ('1', '20'): '300.0', ('1', '18'): '100.0', ('2', '20'): '100.0', ('1', '13'): '500.0', ('1', '6'): '300.0', ('2', '13'): '300.0', ('1', '24'): '100.0', ('2', '7'): '200.0', ('2', '24'): '0.0', ('1', '2'): '100.0', ('1', '16'): '500.0', ('2', '3'): '100.0', ('2', '18'): '0.0', ('1', '21'): '100.0', ('2', '23'): '0.0', ('1', '12'): '200.0', ('2', '14'): '100.0', ('2', '8'): '400.0', ('1', '5'): '200.0', ('2', '10'): '600.0', ('2', '4'): '200.0', ('2', '19'): '100.0', ('1', '22'): '400.0', ('1', '1'): '0.0', ('2', '22'): '100.0', ('1', '15'): '500.0', ('2', '15'): '100.0', ('2', '9'): '200.0', ('1', '11'): '500.0', ('1', '4'): '500.0', ('2', '11'): '200.0', ('1', '9'): '500.0', ('2', '5'): '100.0', ('1', '23'): '300.0', ('1', '14'): '300.0', ('2', '1'): '100.0', ('2', '16'): '400.0', ('1', '19'): '300.0', ('2', '21'): '0.0', ('1', '10'): '1300.0', ('1', '7'): '500.0', ('2', '12'): '100.0', ('1', '8'): '800.0', ('2', '6'): '400.0', ('1', '3'): '100.0'}
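The output above holds strings, whereas the question asks for integer keys and float values. One possible way to convert such a dictionary afterwards is sketched below; the three entries are copied from the output above as sample input, and the name `typed` is an assumption made for this example.

```python
# Three entries copied from the output above, used as sample input.
usage_dict = {('1', '17'): '400.0', ('2', '2'): '0.0', ('1', '10'): '1300.0'}

# Convert each key part to int and each value to float.
typed = {(int(origin), int(hour)): float(usage)
         for (origin, hour), usage in usage_dict.items()}

print(typed[(1, 10)])  # 1300.0
```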

Converting a text file to a dictionary
