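Python: Reading and analyzing a CSV file

Date: 2018-12-06 02:00:58

Tags: python loops csv

I have a CSV file containing student names and their averages in 8 subjects. I have to work out which students made the honour roll (an overall average of 80 or above) and which students earned a subject award (the highest mark in each subject). I have finished the honour roll part and it works, but I can't get the subject award part to work. How would I do this? I can't figure it out!

Here is my code: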

import csv

with open('C:/Users/rohan/Desktop/Google Drive/honourCSVreader/honour.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")

    # Honour Roll
    print('The honour roll students are:')
    for col in csv_reader:
        if not col[0] or col[1]:
            for row in csv_reader:
                if (int(row[2]) + int(row[3]) + int(row[4]) + int(row[5]) +
                        int(row[6]) + int(row[7]) + int(row[8]) + int(row[9])) / 8 >= 80:
                    print(row[1] + " " + row[0])
    # Subject Awards
    print('The subject award winners are:')
    for col in csv_reader:
        if not col[0] and not col[1]:
            name = []
            maximum_grade = 0
            subject = []
            for col[2:] in csv_reader:
                if col > maximum_grade:
                    subject = row
                    maximum_grade = col
                    name = [col[1], col[0]]
                    print(str(name) + ' - ' + str(subject))

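Here is the "honour" file (the student list): https://1drv.ms/x/s!AhndVfox8v67iggaLRaK7LTpxBQt

Thanks!

3 answers:

Answer 0: (score: 1)

[EDIT] Working with @edilio, I put together a more efficient version that also keeps track of ties. There are a lot of ties, so this is a very important distinction. The code is long, so I am hosting it in a gist.

https://gist.github.com/SamyBencherif/fde7c3bca702545dd22739dd8caf796a


There is no need for a for loop. In fact, the syntax in your second for loop is completely broken.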

import csv

with open('C:/Users/rohan/Desktop/Google Drive/honourCSVreader/honour.csv') as csv_file:
    # Read every row into a list and drop the header row
    csv_list = list(csv.reader(csv_file, delimiter=","))[1:]

    # Subject Awards
    # Compare grades as integers; as strings, '99' would rank above '100'
    print('The subject award winners are:')
    print('English', max(csv_list, key=lambda row: int(row[2])))
    print('Math', max(csv_list, key=lambda row: int(row[3])))
    print('Geography', max(csv_list, key=lambda row: int(row[4])))

And so on for the remaining subjects.
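
The repeated print lines can be folded into a loop over the header row. What follows is only a sketch under stated assumptions, not the answerer's code: it assumes the column order Last, First, then the subjects (as implied by the indexing in the question), uses a made-up in-memory sample in place of honour.csv, and the helper name subject_winners is hypothetical. Grades are converted with int() before comparing, and the rows are stored in a list because a csv.reader can only be iterated once.

```python
import csv
import io

# Made-up sample standing in for honour.csv (hypothetical names and grades).
SAMPLE = """Last,First,English,Math,Geography
Ames,Ann,100,70,85
Bell,Bob,99,95,85
Cole,Cy,60,95,40
"""

def subject_winners(rows, header):
    """Map each subject to (top grade, [names of everyone with that grade])."""
    winners = {}
    for index, subject in enumerate(header[2:], start=2):
        top = max(int(row[index]) for row in rows)
        names = [row[1] + " " + row[0] for row in rows if int(row[index]) == top]
        winners[subject] = (top, names)
    return winners

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)   # first line holds the column names
rows = list(reader)     # a reader is exhausted after one pass, so keep the rows

winners = subject_winners(rows, header)
for subject, (top, names) in winners.items():
    print(subject, top, ", ".join(names))
```

Unlike a single max() call, this keeps every student who shares the top grade, at the cost of a second pass over the rows for each subject.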

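Answer 1: (score: 1)

Go about it in a better way, so that the code is clear, modular, and easy to understand.

https://paiza.io/projects/e/35So9iUPfMdIORGzJTb2NQ

First, read the student data in as dictionaries.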

import csv

with open('data.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=",")
    for line in csv_reader:
        print(line)

Output:

{'History': '39', 'Last': 'Agalawatte', 'Science': '68', 'Gym': '88', 'Music': '84', 'English': '97', 'Art': '89', 'First': 'Matthew', 'Math': '79', 'Geography': '73'}
{'History': '95', 'Last': 'Agorhom', 'Science': '95', 'Gym': '80', 'Music': '93', 'English': '95', 'Art': '72', 'First': 'Devin', 'Math': '60', 'Geography': '80'}
{'History': '84', 'Last': 'Ahn', 'Science': '98', 'Gym': '71', 'Music': '95', 'English': '91', 'Art': '56', 'First': 'Jevon', 'Math': '95', 'Geography': '83'}
{'History': '97', 'Last': 'Ajagu', 'Science': '69', 'Gym': '82', 'Music': '87', 'English': '60', 'Art': '74', 'First': 'Darion', 'Math': '72', 'Geography': '99'}
{'History': '74', 'Last': 'Akahira', 'Science': '90', 'Gym': '71', 'Music': '79', 'English': '94', 'Art': '86', 'First': 'Chandler', 'Math': '89', 'Geography': '77'}

Much nicer to work with, isn't it?
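
One detail visible in the output above is that every grade is quoted: DictReader hands back strings, so they need converting before any comparison or averaging. A minimal sketch of that, assuming a made-up two-row sample held in memory rather than the real data.csv:

```python
import csv
import io

# Hypothetical two-row sample; the real file has more columns and rows.
SAMPLE = """Last,First,English,Math
Ames,Ann,97,79
Bell,Bob,95,60
"""

students = list(csv.DictReader(io.StringIO(SAMPLE)))

first = students[0]
print(first["First"], first["Last"])     # fields are looked up by header name
print(type(first["English"]).__name__)   # grades arrive as text, not numbers

# Convert before comparing or averaging.
english = [int(s["English"]) for s in students]
print(max(english))
```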

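Now treat each row as a student, and write two functions that evaluate whether that student qualifies for either list.

Figure out how to keep track of the results. Here I use some nested dictionaries: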

import csv
import json

roles = {}
roles['honor role'] = []
subjects = ['History', 'Science','Gym', 'Music', 'English', 'Art', 'Math', 'Geography']
for subject in subjects:
    roles[subject] = {'highest grade':0, 'students':[]}


def isHonorRole(student):
    ''' Test to see if this student has earned the honor roll'''
    return False

def isSubjectAward(subject, student):
    ''' Test to see if this student has earned the subject award'''
    return False

with open('data.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=",")
    for student in csv_reader:

        if isHonorRole(student):
            pass  # add to the honor roll

        for subject in subjects:
            if isSubjectAward(subject, student):
                pass  # record the subject award

OK, now we need to implement the logic that works out who has won each subject award.

def isSubjectAward(subject, student):
    ''' Test to see if this student has earned the subject award'''
    grade    = float(student[subject])
    highest  = roles[subject]['highest grade']
    students = roles[subject]['students']

    student = (student['First'], student['Last'])

    # is this grade higher than the current highest?
    if grade > highest:
        # we have a new highest!
        # clear the list
        students = []
        students.append(student)

        # set new highest
        highest = grade
    elif grade == highest:
        # add to list of students
        students.append(student)
    else:
        return

    # There were changes to the list
    roles[subject]['highest grade'] = grade
    roles[subject]['students'] = students

print(json.dumps(roles, sort_keys=True, indent=4))

Now we have the subject award winners:

{
    "Art": {
        "highest grade": 100.0, 
        "students": [
            [
                "Nathan", 
                "Bryson"
            ], 
            [
                "Chase", 
                "Putnam"
            ]
        ]
    }, 
    "English": {
        "highest grade": 99.0, 
        "students": [
            [
                "Josiah", 
                "Gower"
            ]
        ]
    }, 
    "Geography": {
        "highest grade": 100.0, 
        "students": [
            [
                "Ismaila", 
                "LeBlanc"
            ]
        ]
    }, 
    "Gym": {
        "highest grade": 100.0, 
        "students": [
            [
                "Woo Taek (James)", 
                "Irvine"
            ]
        ]
    }, 
    "History": {
        "highest grade": 100.0, 
        "students": [
            [
                "Tami", 
                "Easterbrook"
            ]
        ]
    }, 
    "Math": {
        "highest grade": 99.0, 
        "students": [
            [
                "Carson", 
                "Whicher"
            ]
        ]
    }, 
    "Music": {
        "highest grade": 100.0, 
        "students": [
            [
                "Jamie", 
                "Bates"
            ], 
            [
                "Michael", 
                "Giroux"
            ]
        ]
    }, 
    "Science": {
        "highest grade": 100.0, 
        "students": [
            [
                "Jonathan", 
                "Emes"
            ], 
            [
                "Jack", 
                "Hudspeth"
            ]
        ]
    }, 
    "honor role": []
}
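
The tie-keeping update used for the subject awards can also be written without touching a global dictionary. The sketch below is an illustration rather than the answerer's code: update_award is a hypothetical name, the grades are made up, and the record has the same shape as the entries in roles.

```python
def update_award(award, name, grade):
    """Return the award record after considering one student's grade.

    `award` is a dict shaped like the entries in `roles`:
    {'highest grade': number, 'students': [names]}.
    """
    if grade > award['highest grade']:
        # New top grade: earlier holders are dropped.
        return {'highest grade': grade, 'students': [name]}
    if grade == award['highest grade']:
        # Tie: keep earlier holders and add this one.
        return {'highest grade': grade, 'students': award['students'] + [name]}
    return award  # lower grade: nothing changes


# Hypothetical grades for one subject.
grades = [('Ann Ames', 88.0), ('Bob Bell', 100.0), ('Cy Cole', 72.0), ('Di Dunn', 100.0)]

award = {'highest grade': 0, 'students': []}
for name, grade in grades:
    award = update_award(award, name, grade)

print(award)
```

Returning a new record each time makes the function easy to test on its own, which is harder when the result is written straight into shared state.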

Finding the honor roll students should be trivial, especially if we have a few helper functions:

def getOverallAverage(student):
    ''' Returns the average of all the student's subject grades '''
    total = sum([float(student[subject]) for subject in subjects])
    return total/len(subjects)

def getName(student):
    '''Returns the student's first and last name as a single string'''
    return ' '.join((student['First'], student['Last']))

def isHonorRole(student):
    ''' Add this student to the honor roll if their overall average meets the cutoff'''
    cutoff = 80

    if getOverallAverage(student) >= cutoff:
        roles['honor role'].append(getName(student))

    return False

The honor roll is:

"honor role": [
        "Devin Agorhom", 
        "Jevon Ahn", 
        "Darion Ajagu", 
        "Chandler Akahira", 
        "Stas Al-Turki", 
        "Bryce Allison", 
        "Tucker Allison", 
        "Eric Andrews", 
        "Henry Angeletti", 
        "Harry Apps", 
        "Jesse Arnold", 
        "Benjamin Aucoin", 
        "Matthew Bainbridge", 
        "Geordie Ball", 
        "Sean Barbe", 
        "Dwayne Barida", 
        "Jamie Bates", 
        "Bradley Baverstock", 
        "Adam Beckman", 
        "Michael Becq", 
        "Joshua Berezny", 
        "Aaron Best", 
        "Doug Bolsonello", 
        "Richard Bolton", 
        "Trevor Bolton", 
        "Travis Bonellos", 
        "Daniel Boulet", 
        "Nicholas Bowman", 
        "Connor Brent", 
        "Michael Britnell", 
        "Shu Brooks", 
        "Cody Brown", 
        "Dylan Brown", 
        "Mark Brown", 
        "Xinkai (Kevin) Brown", 
        "Daniel Bryce", 
        "Nathan Bryson", 
        "Greg Bull", 
        "Eric Burnham", 
        "Kevin Burns", 
        "Rhys Caldwell", 
        "Evan Campbell", 
        "Jeremiah Carroll", 
        "Ian Cass", 
        "Robert Cassidy", 
        "Matt Catleugh", 
        "Garin Chalmers", 
        "Matthew Chan", 
        "Ryan Cheeseman", 
        "Jack Chen", 
        "Phillipe Chester", 
        "Cameron Choi", 
        "Jason Clare", 
        "Brandon Clarke", 
        "Justin Clarke", 
        "Reid Clarke", 
        "Brendan Cleland", 
        "Andrew Clemens", 
        "Matthew Clemens", 
        "Pete Conly", 
        "Marc Coombs", 
        "Leif Coughlin", 
        "Michael Cox", 
        "Michael Creighton", 
        "Raymond Croke", 
        "Andrew Cummins", 
        "William Cupillari", 
        "James Davidson", 
        "Maxim Davis", 
        "Peter Davis", 
        "Daniel Dearham", 
        "Michael Deaville", 
        "Andrew Decker", 
        "Alex Del Peral", 
        "Kobe Dick", 
        "Alec Dion", 
        "Gaelan Domej", 
        "Harrison Dudas", 
        "Ted Duncan", 
        "Andrew Dunkin", 
        "Micah Dupuy", 
        "Cameron Dziedzic", 
        "Tami Easterbrook", 
        "Ethan Ellis", 
        "Jonathan Emes", 
        "Kevin Ernst", 
        "Taylor Evans", 
        "Jack Everett", 
        "Andrew Fabbri", 
        "Les Fawns", 
        "Cameron Faya", 
        "Patrick Feaver", 
        "Josh Ferrando", 
        "Aidan Flett", 
        "Tommy Flowers", 
        "Gregory Friberg", 
        "Craig Friesen", 
        "Keegan Friesen", 
        "Ryan Fullerton", 
        "Jason Gainer", 
        "Adam Gall", 
        "Ryan Gallant", 
        "Michael Gasparotto", 
        "Scott Gerald", 
        "Michael Giroux", 
        "Ramanand Gleeson", 
        "Jack Goldblatt", 
        "Daniel Gonzalez-Stewart", 
        "Christopher Got", 
        "Josiah Gower", 
        "Zachary Grannum", 
        "Stuart Gray", 
        "Gonzalo Grift-White", 
        "Aris Grosvenor", 
        "Eric Hager", 
        "I\u00c3\u00b1igo Hamel", 
        "Davin Hamilton", 
        "Matthew Hanafy", 
        "Christopher Harpur", 
        "Tomas Hart", 
        "Gage Haslam", 
        "Ross Hayward", 
        "Sean Heath", 
        "Ryan Hess", 
        "Matthew Hessey", 
        "Stephen Hewis", 
        "Michael Hill", 
        "Edward Holbrook", 
        "Gavin Holenski", 
        "Brendan Holmes", 
        "Gregory Houston", 
        "Douglas Howarth", 
        "Conor Hoyle", 
        "Agustin Huang", 
        "Jack Hudspeth", 
        "James Humfries", 
        "David Hunchak", 
        "Jesse Im", 
        "Steve Inglis", 
        "Woo Taek (James) Irvine", 
        "Kenny James", 
        "Eric Jang", 
        "Erik Jeong", 
        "Michael Jervis", 
        "Brett Johnson", 
        "Adam Johnston", 
        "Ben Johnstone", 
        "Taylor Jones", 
        "Braedon Journeay", 
        "Neil Karakatsanis", 
        "David Karrys", 
        "Ryan Keane", 
        "Josh Kear", 
        "Alexander Kee", 
        "Joshua Khan", 
        "Matthew Kim", 
        "David Kimbell Boddy", 
        "Daniel King", 
        "Tristan Knappett", 
        "Timothy Koornneef", 
        "Michael Krikorian", 
        "George Kronberg", 
        "Danny Kwiatkowski", 
        "Chris Lackey", 
        "Spenser LaMarre", 
        "Matthew Lampi", 
        "Craig Landerville", 
        "Dallas Lane", 
        "Matthew Lanselle", 
        "Allen Lapko", 
        "Cory Latimer", 
        "Ben Lawrence", 
        "Matthew Lebel", 
        "Ismaila LeBlanc", 
        "Christopher Lee", 
        "Bailey Legiehn", 
        "Andy Lennox", 
        "Samuel Leonard", 
        "Sam Lockner", 
        "Jeffrey MacPherson", 
        "Simon Mahoney", 
        "Lucas Maier", 
        "Trent Manley", 
        "Jeremy Manoukas", 
        "Nathanial Marsh", 
        "Alastair Marshall", 
        "Connor Mattucci", 
        "Samuel McCormick", 
        "Cameron McCuaig", 
        "Ronan Mcewan", 
        "John McGuire", 
        "Brian McNaughton", 
        "Christopher McPherson", 
        "Alistair McRae", 
        "Andrew Medlock", 
        "Trevor Meipoom", 
        "Justin Metcalfe", 
        "Chieh (Jack) Miller", 
        "Graham Miller", 
        "Josh Miller", 
        "Salvador Miller", 
        "Max Missiuna", 
        "Jack Mitchell", 
        "Michael Morris", 
        "Paul Morrison", 
        "Morgan Moszczynski", 
        "Curtis Muir", 
        "Christopher Murphy", 
        "Mark Murphy", 
        "Hiroki Nakajima", 
        "Michael Neary", 
        "James Nelson", 
        "John Nicholson", 
        "Stephen Nishida", 
        "Michael Nowlan", 
        "Jason O'Brien", 
        "Manny O'Brien", 
        "James O'Donnell", 
        "Spencer Olubala Paynter", 
        "Daniel Ortiz", 
        "Jihwan Ottenhof", 
        "Joel Ottenhof", 
        "Roger Owen", 
        "Jason Ozark", 
        "Brent Pardhan", 
        "Bernard Park", 
        "Jason Parker", 
        "Alistair Pasechnyk", 
        "James Patrick", 
        "Hunter Pellow", 
        "Jason Pennings", 
        "Brant Perras", 
        "Michael Petersen", 
        "Jordan Petrov", 
        "Don Philp", 
        "Adam Piil", 
        "Ryan Pirhonen", 
        "Alex Pollard", 
        "Daniel Postlethwaite", 
        "John-Michael Potter", 
        "Tim Powell", 
        "Chad Power", 
        "Jack Pratt", 
        "Alexander Price", 
        "Tyler Purdie", 
        "Andrew Purvis", 
        "Colin Purvis", 
        "Chase Putnam", 
        "Kael Radonicich", 
        "Curtis Ravensdale", 
        "Brett Ray", 
        "Forrest Reid", 
        "Aiden Ren", 
        "Tyler Rennicks", 
        "Alden Revell", 
        "Joshua Robinson", 
        "Richard Roffey", 
        "Michael Rose", 
        "Nicholas Roy", 
        "Christopher Samuel", 
        "Chris Sandilands", 
        "Christopher Sarbutt", 
        "David Saun", 
        "David Scharman", 
        "Adam Schoenmaker", 
        "Derek Schultz", 
        "Rocky Scuralli", 
        "Turner Seale", 
        "Bryan Senn", 
        "Alexander Serena", 
        "Seth Shaubel", 
        "Alex Shaw", 
        "Denroy Shaw", 
        "William Sibbald", 
        "Curtis Simao", 
        "Greg Simm", 
        "Nicholas Simon", 
        "Stuart Simons", 
        "Michael Skarsten", 
        "Matthew Skorbinski", 
        "Greg Slogan", 
        "Lucas Smith", 
        "Andrew South", 
        "Benjamin Sprowl", 
        "Jackson Staley", 
        "Reid Stencill-Hohn", 
        "Matthew Stevens", 
        "Jason Sula", 
        "Edward Sunderland", 
        "James Suppa", 
        "Jason Talbot", 
        "Tony Tan", 
        "Stuart Tang", 
        "Alex Temple", 
        "Leonard Theaker", 
        "Parker Thomas", 
        "Matthew Tisi", 
        "Scott Toda", 
        "Michael Toth", 
        "Zachary Trotter", 
        "Matthew Underwood", 
        "David Ure", 
        "Michael Utts", 
        "Joey Van Dyk", 
        "Jonathan Van Gaal", 
        "Chris Vandervies", 
        "Ryan Vickery", 
        "Dustin Wain", 
        "Brian Walker", 
        "Young-Jun Walsh", 
        "Brad Walton", 
        "Zachary Waugh", 
        "Matthew Webster", 
        "Samuel Welsh", 
        "Coleman West", 
        "Alexander Westendorp", 
        "Carson Whicher", 
        "David Whitney", 
        "Samuel Wilkinson", 
        "Kevin Williams", 
        "Aedan Williamson", 
        "Jason Wilson", 
        "William Wilson", 
        "David Wilton", 
        "Isaac Windeler", 
        "Liam Winter", 
        "Timothy Wong", 
        "Vladimir Wong", 
        "Robert Workman", 
        "Brian Yang", 
        "Owen Yates", 
        "Devin Young", 
        "Paul Young", 
        "Joshua Zhao"
    ]
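
The cutoff check behind the list above can be tried on a handful of rows. This is a sketch under assumptions, not the answerer's code: the subject list is shortened to two subjects, the students are made up, and honor_roll returns a list instead of appending to the global roles dictionary.

```python
SUBJECTS = ['English', 'Math']  # shortened, hypothetical subject list
CUTOFF = 80

def overall_average(student):
    """Average of the student's grades across SUBJECTS."""
    return sum(float(student[s]) for s in SUBJECTS) / len(SUBJECTS)

def honor_roll(students):
    """Names of students whose overall average is at or above CUTOFF."""
    return [
        student['First'] + ' ' + student['Last']
        for student in students
        if overall_average(student) >= CUTOFF
    ]

# Made-up rows shaped like csv.DictReader output (all values are strings).
students = [
    {'First': 'Ann', 'Last': 'Ames', 'English': '90', 'Math': '70'},  # average 80.0
    {'First': 'Bob', 'Last': 'Bell', 'English': '79', 'Math': '80'},  # average 79.5
]

print(honor_roll(students))
```

A student sitting exactly on 80 is included, which matches the "80 or above" rule stated in the question.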

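Done.

Answer 2: (score: 1)

My two cents:

Do both calculations in one loop. Even though using max with a lambda looks cool and is very readable, and is still O(n), it is also 9 times slower than the following implementation, which uses a single loop for both calculations (Honour Roll and Subject Awards):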

#!/usr/bin/env python
import csv

with open('/Users/edil3508/Downloads/honours.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")
    next(csv_reader, None)  # skip the headers

    subjects = ['English', 'Math', 'Geography', 'Science', 'Gym', 'History', 'Art', 'Music']
    award_winners = [['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0]]
    # Honour Roll
    print('The honour roll students are:')
    print("-" * 80)
    for row in csv_reader:
        subtotal = 0
        for i in range(2, 8 + 2):
            subtotal += int(row[i])
            if int(row[i]) > award_winners[i-2][1]:
                award_winners[i - 2][0] = row[1] + " " + row[0]
                award_winners[i - 2][1] = int(row[i])
        avg = subtotal / 8.0  # float division under Python 2 as well
        if avg >= 80:  # the honour roll is 80 or above
            print(row[1] + " " + row[0], avg)
    # Subject Awards
    print("-" * 80)
    print('The subject award winners are:')
    print("-" * 80)
    for ix, student_grade in enumerate(award_winners):
        print('{}: {} with {}'.format(subjects[ix], student_grade[0], student_grade[1]))
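
How large the speed difference is will depend on the machine, the Python version and the data, so it may be worth measuring locally. The harness below is a rough sketch, not the answerer's benchmark: the rows are synthetic random grades, the function names with_max and single_loop are hypothetical, and no particular ratio is claimed.

```python
import random
import timeit

random.seed(0)
N_SUBJECTS = 8
# Synthetic rows: [last, first, grade1..grade8], grades kept as strings
# the way csv.reader would return them.
rows = [
    ['Last%d' % i, 'First%d' % i] + [str(random.randint(40, 100)) for _ in range(N_SUBJECTS)]
    for i in range(2000)
]

def with_max(rows):
    """One max() pass per subject."""
    return [int(max(rows, key=lambda r: int(r[i]))[i]) for i in range(2, 2 + N_SUBJECTS)]

def single_loop(rows):
    """One pass over the rows, updating every subject's best grade."""
    best = [0] * N_SUBJECTS
    for row in rows:
        for i in range(N_SUBJECTS):
            grade = int(row[i + 2])
            if grade > best[i]:
                best[i] = grade
    return best

print('max/lambda :', timeit.timeit(lambda: with_max(rows), number=20))
print('single loop:', timeit.timeit(lambda: single_loop(rows), number=20))
```

Both functions return only the top grade per subject, so the comparison isolates the looping strategy and leaves out the honour roll average and the printing.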