Question

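What I am trying to do is take the data read from filteredData.csv and work out the average snowfall for each location in 2016. I then want to write that data to a new CSV file called average2016.csv.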

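I am currently trying to open average2016.csv while filteredData.csv is still open, and then loop over the rows to write out each location and its average snowfall.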

data2 = open('average2016.csv','w')
for row in csv1:
    print (location + "," + average_snow)
data2.close()

My entire code is below:

import csv
data = open('filteredData.csv','r')
# Create Dictionaries to store location values
# Snow_Fall is the number of inches for that location
# Number_Days is the number of days that there is Snowfall data for
Snow_Fall = {}
Number_Days = {}

# Create CSV reader
csv1 = csv.DictReader(data,delimiter=',')
# read each row of the CSV file and process it
for row in csv1:
    # Check the date column and see if it is in 2016
    if "2016" in row["DATE"]:
        # Check to see if the value in the snow column is null/none if so then skip processing that row
        if (row["SNOW"] is None) or (row["SNOW"] == ""):
            pass
        else:
            # Check to see if the location has been added to the dict if it has then add the data to itself
            # If it has not then just assign the data to the location.
            if row["NAME"] in Snow_Fall:
                Snow_Fall[row["NAME"]] = Snow_Fall[row["NAME"]] + float(row["SNOW"])
                Number_Days[row["NAME"]] = Number_Days[row["NAME"]] + 1
            else:
                Snow_Fall[row["NAME"]] = float(row["SNOW"])
                Number_Days[row["NAME"]] = 1

# For each location we want to print the data for that location
for location in Snow_Fall:
   print ("The number of inches for location " + location + " is " + str(Snow_Fall[location]))            
   print ("The number of days of snowfall for location " + location + " is " + str(Number_Days[location]))
   print ("The average Number of Inches for location " + location + " is " + str(Snow_Fall[location] / Number_Days[location]))
data2 = open('average2016.csv','w')
for row in csv1:
    print (location + "," + average_snow)
data2.close()
data.close()


Answer 1

pandas is definitely your friend here. Consider the following:

import pandas as pd

df = pd.read_csv('filteredData.csv')

# Assuming that you are trying to find the average snow in each STATION
by_location = df.groupby('STATION')
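# Get the mean of SNOW for each location (select the column first,
# so that non-numeric columns such as names and dates are not averaged)
snow_means = by_location['SNOW'].mean()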

# The following 2 lines are just to make everything more readable in your new csv (you could skip them if wanted):
snow_means = snow_means.to_frame()
# Rename your column to 'AVG_SNOW'
snow_means.columns = ['AVG_SNOW']

# Finally, write your new dataframe to CSV
snow_means.to_csv('average2016.csv', header=True)

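Note that this is untested (but should work). If you post a minimal example containing a few rows of your dataframe (rather than a screenshot), I can test and debug it to make sure everything works. If you are planning to replace Excel with Python, I strongly recommend working through the pandas tutorial.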
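The snippet above averages over every row in the file and groups by STATION. To match the question more closely (2016 rows only, grouped by NAME, blank SNOW values skipped), something along these lines may work. This is only a sketch: the column names DATE, NAME and SNOW are taken from the question's code, the sample rows are invented so the example runs on its own, and the substring match on the year simply mirrors the question's `"2016" in row["DATE"]` check.

```python
import io

import pandas as pd

# Invented sample rows standing in for filteredData.csv (assumption)
sample = io.StringIO(
    "NAME,DATE,SNOW\n"
    "STATION A,2016-01-01,2.0\n"
    "STATION A,2016-01-02,4.0\n"
    "STATION A,2015-12-31,9.0\n"
    "STATION B,2016-02-01,\n"
    "STATION B,2016-02-02,1.0\n"
)

# Read DATE as text so the year can be matched as a substring
df = pd.read_csv(sample, dtype={"DATE": str})

# Keep only rows whose DATE mentions 2016; rows with a missing DATE are dropped
in_2016 = df[df["DATE"].str.contains("2016", na=False)]

# Average SNOW per location; mean() ignores missing (blank) values by default
avg = in_2016.groupby("NAME")["SNOW"].mean().rename("AVG_SNOW")

# Write one row per location, with a NAME,AVG_SNOW header
avg.to_csv("average2016.csv", header=True)
```

With the real data, `sample` would be replaced by the path 'filteredData.csv'.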

Answer 2

for location in Snow_Fall:
    print (location + "," + str(Snow_Fall[location] / Number_Days[location]),file=data2)
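For context, these two lines are meant to replace the `for row in csv1` loop in the question. By that point the reader `csv1` has already been read to the end, so that loop body never runs, and `print` without `file=` writes to the console rather than to the file. A minimal sketch of the whole write step using the `csv` module instead is shown here; the dictionary values and the header names NAME and AVG_SNOW are made up for illustration, and `write_averages` is a hypothetical helper, not something from the original code.

```python
import csv

# Hypothetical totals standing in for the dictionaries built in the question
Snow_Fall = {"STATION A": 12.0, "STATION B": 3.0}
Number_Days = {"STATION A": 4, "STATION B": 2}


def write_averages(path, totals, counts):
    """Write one CSV row per location: its name and total / number of days."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["NAME", "AVG_SNOW"])
        for location in totals:
            writer.writerow([location, totals[location] / counts[location]])


write_averages("average2016.csv", Snow_Fall, Number_Days)
```

Using `csv.writer` rather than string concatenation also means location names that contain commas are quoted correctly.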
