Question

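My code reads a file, and I have collected the year field from its rows into a list. Now I need to go through the whole file for each year to find out how many rows each year has.

I worked through this in Excel, and I expect the following output:

My code: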

input_f = open("C:\\Users\\P928260\\Downloads\\ssa-pop3-eng.csv","r")
next(input_f)


years_unique = []
controler = False

while(controler != True):
        counter_rows = 0
        #Get a list with the read years
        for line in input_f:
            item = line.split(',')
            year_f = item[0][:4]
            if (year_f not in years_unique):
                years_unique.append(year_f)


        input_f.close()

        input_f = open("C:\\Users\\P928260\\Downloads\\ssa-pop3-eng.csv","r")
        next(input_f)


        for year in years_unique:
            for line in input_f:
                item = line.split(',')
                year_f = item[0][:4]
                if (year == year_f):
                    counter_rows +=1

            print(year,counter_rows)


        controler = True

My current output only prints the same row count, which is right for 2012 but not for the other years. I know I'm close. Thanks for your help.

Answer 1

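There are a few things you will want to change in your code.

Use a context manager where one is needed. It closes the file for you implicitly, whether or not an exception occurs.

You can also use defaultdict from the collections library. It lets you set a default factory that provides the initial value for any key the first time it is accessed. In this case we use the built-in int function so that the default value is 0.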

from collections import defaultdict

year_count = defaultdict(int)

with open("C:\\Users\\P928260\\Downloads\\ssa-pop3-eng.csv","r") as file:
    next(file)  # skip the header row, as in the question
    for line in file:
        year, *rest = line.split(',')
        year = year.strip()[:4]  # keep only the four-character year prefix
        year_count[year] += 1

for year, count in year_count.items():
    print(year, count)
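
To see what the default factory does without touching the CSV file, here is a minimal sketch on a few in-memory rows. The sample rows and the helper name count_rows_per_year are made up for illustration; the only assumption carried over from the question is that the first four characters of the first field are the year.

```python
from collections import defaultdict


def count_rows_per_year(lines):
    """Count rows per year; the year is assumed to be the first
    four characters of the first comma-separated field."""
    year_count = defaultdict(int)  # a missing key starts at int() == 0
    for line in lines:
        first_field = line.split(',')[0]
        year_count[first_field.strip()[:4]] += 1
    return dict(year_count)


# Hypothetical sample rows (not taken from the real file).
sample_rows = [
    "2012-01,a,1\n",
    "2012-02,b,2\n",
    "2013-01,c,3\n",
]
print(count_rows_per_year(sample_rows))  # {'2012': 2, '2013': 1}
```

No key has to be initialised by hand: the first `+= 1` on an unseen year starts from 0.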

Answer 2

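The problem is in the second loop, specifically in the inner loop, which effectively runs only once: you can iterate over a file object only once (unless you seek back to the start of the file on each iteration).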

for year in years_unique:
    for line in input_f:
        item = line.split(',')
        year_f = item[0][:4]
        if (year == year_f):
            counter_rows +=1

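'2012' is the first item in the years_unique list, so the inner loop runs and every occurrence in the file increases counter_rows by 1. On the next iteration, however, input_f is already exhausted, so to speak, and no further increments happen.

Also, note that counter_rows is not reset on each iteration.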
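
The exhaustion can be reproduced in a few lines. This sketch uses io.StringIO as a stand-in for the real file (the sample text is made up); a file opened with open() behaves the same way when iterated.

```python
import io

# In-memory stand-in for the CSV file; the rows are hypothetical.
input_f = io.StringIO("header\n2012-01,a\n2012-02,b\n2013-01,c\n")
next(input_f)  # skip the header row

first_pass = sum(1 for _ in input_f)   # consumes every remaining line
second_pass = sum(1 for _ in input_f)  # nothing is left to read
print(first_pass, second_pass)  # 3 0

# Rewinding makes the lines available again, and the counter is
# reset for each year so the counts do not accumulate.
counts = {}
for year in ["2012", "2013"]:
    input_f.seek(0)
    next(input_f)  # skip the header again after rewinding
    counter_rows = 0
    for line in input_f:
        if line.split(",")[0][:4] == year:
            counter_rows += 1
    counts[year] = counter_rows
print(counts)  # {'2012': 2, '2013': 1}
```

Rewinding works, but it costs one full pass over the file for every distinct year.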

A simpler approach is to use a dict in a single loop. Here is an example:

input_f = open("YOUR_FILE")
next(input_f)

years = {}

for line in input_f:
    items = line.split(",")
    year = items[0][:4]
    years.setdefault(year, 0)
    years[year] += 1

input_f.close()
print(years)
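
As a variation on the same single-pass idea, the standard library's collections.Counter can do the bookkeeping. This swaps Counter in for the plain dict used above; the function name count_years and the sample rows are hypothetical, and the year is again assumed to be the first four characters of the first field.

```python
from collections import Counter
import io


def count_years(lines):
    """Return a Counter mapping year -> number of rows, skipping the header."""
    rows = iter(lines)
    next(rows, None)  # skip the header row; tolerate an empty input
    return Counter(line.split(",")[0][:4] for line in rows)


# Hypothetical data standing in for the real CSV file.
sample = io.StringIO("date,value\n2012-01,1\n2012-02,2\n2013-01,3\n")
years = count_years(sample)
for year, count in sorted(years.items()):
    print(year, count)
```

With a real file, the same function could be called as count_years(f) inside a with open(...) as f: block.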

Answer 3

I figured it out this way!

input_f = open("C:\\Users\\P928260\\Downloads\\ssa-pop3-eng.csv","r")
output_f = open("C:\\Users\\P928260\\Downloads\\output.txt","w")

next(input_f)


years_unique = []


for line in input_f:
    item = line.split(',')
    year_f = item[0][:4]

    if (year_f not in years_unique):
        years_unique.append(year_f)


input_f.close()



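for year in years_unique:
    counter_rows = 0  # reset the counter for each year
    # Reopen the file so every year gets a full pass over the rows
    input_f = open("C:\\Users\\P928260\\Downloads\\ssa-pop3-eng.csv","r")
    next(input_f)
    for line in input_f:
        item = line.split(',')
        year_f = item[0][:4]
        if (year_f == year):  # exact match; "in" would also accept substrings
            counter_rows += 1
    input_f.close()
    print(year,counter_rows)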

Getting row counts from a file based on a list field (Python 3)
