Question

I have a large file in the following format:

--
0058 name_1 0BA7 VOL 512.0 2018-04-07/12:00
0058 name_1 0BAF VOL 64.0 2018-04-07/12:00
0058 name_2 0C93 VOL 808.0 2018-04-07/12:00
0058 name_2 0CFF VOL 307.1 2018-04-07/12:00
0058 name_3 0F4F VOL 16.2 2018-04-07/12:00
0058 name_3 0F51 VOL 16.0 2018-04-07/12:00
0058 name_3 0F53 VOL 16.2 2018-04-07/12:00
--

What is the best way to build a dictionary keyed on the second column, so that each name maps to its third-column and fifth-column values as a pair of lists, or even as a nested dictionary from third-column value to fifth-column value?

Answer 1

Here is a simple approach that covers both of your possible output formats:

def build_dict(path, nested=False):
    with open(path, 'r') as infile:  # open the file
        # split every line into fields; [1:-1] drops the first and last '--' lines
        lines = [i.split() for i in infile.readlines()[1:-1]]
    # each line now looks like
    # ['0058', 'name_1', '0BA7', 'VOL', '512.0', '2018-04-07/12:00']
    #   0       1         2       3      4        5
    my_dict = {}
    for line in lines:
        if not nested:
            # version 1 of your intended output
            if line[1] not in my_dict:
                my_dict[line[1]] = ([line[2]], [line[4]])  # initialize as a new tuple of lists
            else:
                my_dict[line[1]][0].append(line[2])  # already initialized, so we
                my_dict[line[1]][1].append(line[4])  #    add on to the end of what's there
        else:
            # version 2 of your intended output
            if line[1] not in my_dict:
                my_dict[line[1]] = {line[2]: line[4]}  # initialize as a new dict
            else:
                my_dict[line[1]][line[2]] = line[4]  # add a key to existing dict
    return my_dict

# pick one version per call; running both branches on the same dict would fail
my_dict = build_dict('my_file.txt')

I don't think this can be done with a dict comprehension, because the keys are built up dynamically.
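
For comparison, the explicit membership checks in the loop above can be folded away with `collections.defaultdict` from the standard library. This is only a sketch, assuming every data line has the six whitespace-separated columns shown in the question; `group_by_name` is a hypothetical helper name, and the sample lines are copied from the question.

```python
from collections import defaultdict


def group_by_name(lines):
    """Group column 3 -> column 5 under the name in column 2 (hypothetical helper)."""
    grouped = defaultdict(dict)  # a missing name gets an empty dict automatically
    for raw in lines:
        fields = raw.split()
        if len(fields) != 6:  # assumption: data lines have exactly six columns
            continue  # this skips the '--' delimiter lines
        _, name, code, _, value, _ = fields
        grouped[name][code] = value
    return dict(grouped)  # convert back to a plain dict for printing


sample = [
    "--",
    "0058 name_1 0BA7 VOL 512.0 2018-04-07/12:00",
    "0058 name_1 0BAF VOL 64.0 2018-04-07/12:00",
    "0058 name_2 0C93 VOL 808.0 2018-04-07/12:00",
    "--",
]
result = group_by_name(sample)
print(result)
```

The values stay as strings here, as in the answer above; convert them with `float` if you need numbers.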

Answer 2

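If every line follows your sample, the simplest approach is to split each line on whitespace. As for reading the file, I would use the readline method, since for a large file it is better to read one line at a time.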

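d = {}
with open(filepath) as fp:  # filepath is the path to your data file
    while True:
        line = fp.readline()  # read a single line per iteration
        if not line:  # an empty string means end of file
            break
        fields = line.split()
        if len(fields) != 6:  # skip the '--' delimiter lines and blank lines
            continue
        _, key, val1, _, val2, _ = fields
        if key not in d:
            d[key] = {}
        d[key][val1] = val2
print(d)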
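
A file object is itself a lazy line iterator, so `for line in fp` also reads one line at a time without calling readline by hand. The sketch below uses that to build the other output shape (a pair of lists per name) with `dict.setdefault`. It rests on a few assumptions: `io.StringIO` stands in for the real file, `collect_pairs` is a hypothetical name, and the fifth column is treated as numeric.

```python
import io


def collect_pairs(fp):
    """Map each name to (codes, values); hypothetical helper for illustration."""
    d = {}
    for line in fp:  # iterating a file object yields one line at a time
        fields = line.split()
        if len(fields) != 6:  # assumption: skip '--' delimiters and blank lines
            continue
        _, key, code, _, value, _ = fields
        # setdefault returns the existing pair, or stores and returns a new one
        codes, values = d.setdefault(key, ([], []))
        codes.append(code)
        values.append(float(value))  # assumption: the fifth column is numeric
    return d


fake_file = io.StringIO(
    "--\n"
    "0058 name_3 0F4F VOL 16.2 2018-04-07/12:00\n"
    "0058 name_3 0F51 VOL 16.0 2018-04-07/12:00\n"
    "--\n"
)
pairs = collect_pairs(fake_file)
print(pairs)
```

With a real file, pass the object returned by `open(filepath)` in place of `fake_file`.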

How do you form a dictionary from the following file?

2 answers: