Read a file line by line, split its contents, skip empty lines

Time: 2018-02-13 19:00:44

Tags: python python-3.x

I have a file like this:

a;a_desc
b;b_desc  

c;
d
;
e;e_desc

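What I want is to:

  • Read it line by line
  • Remove the newline characters
  • Skip empty lines
  • If there is a string before the semicolon but none after it, use the string before it twice
  • If there is a string but no semicolon, use the string twice
  • Return a list

I expect to get: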

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]

What I have so far:

filename = 'data.txt'

with open(filename, 'r') as f:

    x = [line.rstrip('\n') for line in f.readlines() if not line.isspace()]

    xx = [line.split(';') for line in x]

    content = [line for line in xx if line[0]]

print(content)  

That gives me:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', ''], ['d'], ['e', 'e_desc']]

I could probably add more loops to handle the c and d lines correctly.
But is there a shorter way than all those loops?

Thanks!

3 Answers:

Answer 0 (score: 2)

You can do it in a single loop: check the values at each step, and double the list if it has only one element.

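filename = 'data.txt'

with open(filename, 'r') as f:
    data = []
    for line in f:
        # Strip the newline plus any stray surrounding whitespace
        line = line.strip()
        if not line:
            continue  # skip empty lines
        # Keep only the non-empty pieces around the semicolon
        line_list = [s for s in line.split(';') if s]
        if not line_list:
            continue  # the line held nothing but ';'
        if len(line_list) == 1:
            line_list *= 2  # a single value is used twice
        data.append(line_list)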
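
To see why the filtered split plus doubling covers each case, here is a small sketch that applies the same two steps to sample lines from the question. The helper name `normalize` and the `samples` list are made up for illustration, and the sketch assumes it is acceptable to strip all surrounding whitespace rather than only the newline.

```python
def normalize(line):
    """Return [key, desc] for one line, or None if the line has no usable text."""
    # Drop empty pieces, so 'c;' -> ['c'] and ';' -> []
    parts = [s for s in line.strip().split(';') if s]
    if not parts:
        return None
    if len(parts) == 1:
        parts *= 2  # ['c'] becomes ['c', 'c']
    return parts

# Hypothetical sample lines mirroring the file in the question
samples = ['a;a_desc', 'c;', 'd', ';', '']
for line in samples:
    print(repr(line), '->', normalize(line))
```

One side effect worth knowing: because empty pieces are filtered out, a line such as `;x` would also come back as `['x', 'x']`, which may or may not be what you want.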

Answer 1 (score: 1)

Another, possibly simpler, solution:

data = []
with open('data.txt') as f:
    # Loop through lines (without \n)
    for line in f.read().splitlines():
        # Clean surrounding spaces before testing the line
        line = line.strip()
        # Make sure line isn't empty or a lone semicolon (compare by value, not with `is`)
        if line not in (';', ''):
            # Split the cleaned line
            cells = line.split(';')
            # Use first cell twice if second is empty or not there
            if len(cells) < 2 or not cells[1]:
                cells = [cells[0]] * 2
            data.append(cells)

Answer 2 (score: 1)

Use the following approach, with a single for loop and a couple of if conditions:

with open(filename) as f:
    result = []
    for r in f.read().splitlines():
        r = r.strip()
        if r and r[0] != ';':
            pair = r.split(';')
            result.append([pair[0]] * 2 if len(pair) == 1 or not pair[1] else pair)

print(result)

Output:

[['a', 'a_desc'], ['b', 'b_desc'], ['c', 'c'], ['d', 'd'], ['e', 'e_desc']]
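
A further variant, offered only as a sketch and not taken from any of the answers: `str.partition` always returns three parts, so the "no semicolon" and "empty description" cases collapse into a single expression. The function name `parse_pairs` and the in-memory `io.StringIO` sample are illustrative assumptions.

```python
import io

def parse_pairs(lines):
    """Build [key, description] pairs from an iterable of 'key;description' lines."""
    pairs = []
    for line in lines:
        # partition() always yields (before, separator, after), even without ';'
        key, _, desc = line.strip().partition(';')
        key, desc = key.strip(), desc.strip()
        if not key:
            continue  # skips blank lines and lines that start with ';'
        pairs.append([key, desc or key])  # reuse the key when no description is given
    return pairs

# In-memory stand-in for the file shown in the question
sample = io.StringIO("a;a_desc\nb;b_desc  \n\nc;\nd\n;\ne;e_desc\n")
print(parse_pairs(sample))
```

Since the function accepts any iterable of lines, an open file object can be passed in directly, for example inside `with open('data.txt') as f:`.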