How to parse a csv file and compute stats based on that data: answers

【Question title】: How to parse csv file and compute stats based on that data
【Posted】: 2016-03-24 03:55:42
【Question】:

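I have an assignment that requires me to write a Python program that reads a text file containing information about people (name, weight and height).

The program then has to ask the user for a name, look that name up in the text file, and print the line that contains the name together with that person's height and weight.

After that, the program has to work out the average weight and the average height of the people.

The text file is: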

James,73,1.82,M
Peter,78,1.80,M
Jay,90,1.90,M
Beth,65,1.53,F
Mags,66,1.50,F
Joy,62,1.34,F

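So far I have this code, which prints the line matching the name the user typed in, but I don't know how to pull the height and weight out of it: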

search = input("Who's information would you like to find?")
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        if search in line:
            print(line)

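【Comments on the question】:

Have a look at str.split.
You should try splitting the task into small sub-problems. For example, have you tried computing the average of a simple list of numbers? You should read up on reading csv files in Python and on generator expressions/list comprehensions. Or have a look at the pandas library, which probably offers everything you need. Finally (or better: first), get familiar with the language, perhaps by working through a basic tutorial.
I have already written a program that calculates the average of a list of numbers; I am only having trouble splitting the lines and taking just the numbers I need.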
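
The two suggestions above (str.split, then averaging a plain list of numbers) can be sketched on a single line of the file. This is only an illustration: the helper name parse_line is made up here and does not come from the question's code.

```python
def parse_line(line):
    """Split one 'name,weight,height,sex' line into typed fields."""
    name, weight, height, sex = line.strip().split(",")
    # Weight and height arrive as text; convert them before doing arithmetic.
    return name, float(weight), float(height), sex


name, weight, height, sex = parse_line("James,73,1.82,M\n")
print(name, weight, height, sex)

# Averaging a plain list of numbers is the sum divided by the count.
weights = [73, 78, 90]
average_weight = sum(weights) / len(weights)
print(average_weight)
```

The strip() call removes the trailing newline, which would otherwise stay attached to the last field.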

Tags: python python-3.x csv pandas

【Solution 1】:

As suggested, using the pandas library you can do the following:

import pandas as pd
df = pd.read_csv('people.txt', header=None, index_col=0)
df.columns = ['weight', 'height', 'sex']
print(df)

       weight  height sex
0                        
James      73    1.82   M
Peter      78    1.80   M
Jay        90    1.90   M
Beth       65    1.53   F
Mags       66    1.50   F
Joy        62    1.34   F

print(df.mean(numeric_only=True))

weight    72.333333
height     1.648333
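
The pandas answer covers the averages but not the name lookup the question also asks for. Because the names are used as the index, one possible way to do it is with df.loc. This is a sketch under assumptions: io.StringIO stands in for the file, and the search value is hard-coded where the real program would call input().

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the file.
data = io.StringIO(
    "James,73,1.82,M\nPeter,78,1.80,M\nJay,90,1.90,M\n"
    "Beth,65,1.53,F\nMags,66,1.50,F\nJoy,62,1.34,F\n"
)
df = pd.read_csv(data, header=None, index_col=0)
df.columns = ["weight", "height", "sex"]

search = "Beth"  # would come from input() in the real program
if search in df.index:
    person = df.loc[search]  # one row, selected by its index label
    print(search, person["weight"], person["height"])
else:
    print("No entry for", search)
```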

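【Comments】:

Since this is an assignment, I don't think something like pandas would be accepted ;)
Since this site is not only for handing students solutions to their homework but also for helping others with similar problems, I think this answer is perfectly fine.

【Solution 2】:

You can use Python's built-in csv module to split each line of the file into a list of columns, as follows: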

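import csv

# newline='' is what the csv module expects for text-mode files in Python 3
with open('HeightAndWeight.txt', newline='') as f_input:
    csv_input = csv.reader(f_input)
    total_weight = 0
    total_height = 0

    # counting from 1 leaves index equal to the number of rows read
    for index, row in enumerate(csv_input, start=1):
        total_weight += float(row[1])   # second column: weight
        total_height += float(row[2])   # third column: height

    print("Average weight: {:.2f}".format(total_weight / index))
    print("Average height: {:.2f}".format(total_height / index))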

This displays the following output:

Average weight: 72.33
Average height: 1.65
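
The same csv.reader approach can also handle the name lookup from the question. The following is a minimal sketch, not the answerer's code: the function name summarize is hypothetical, and it takes any file-like object so it can be tried on an in-memory sample.

```python
import csv
import io


def summarize(f_input, search):
    """Return (matching rows, average weight, average height)."""
    matches = []
    weights = []
    heights = []
    for row in csv.reader(f_input):
        if not row:
            continue  # skip blank lines
        weights.append(float(row[1]))
        heights.append(float(row[2]))
        if row[0] == search:  # exact match on the name column
            matches.append(row)
    # Note: an input with no data rows would divide by zero here.
    return matches, sum(weights) / len(weights), sum(heights) / len(heights)


sample = io.StringIO("James,73,1.82,M\nPeter,78,1.80,M\nJay,90,1.90,M\n")
matches, avg_weight, avg_height = summarize(sample, "Peter")
print(matches)
print("Average weight: {:.2f}".format(avg_weight))
print("Average height: {:.2f}".format(avg_height))
```

With a real file, the call would look like summarize(open('HeightAndWeight.txt', newline=''), name), ideally inside a with block.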

【Comments】:

【Solution 3】:

The answer is actually in your question's title: use the standard lib's csv module to parse your file.
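
As one possible illustration of that suggestion, csv.DictReader lets each column be addressed by name instead of by position. The file has no header row, so the column names below are supplied by hand; they are assumptions based on the question's description (name, weight, height, plus a fourth column that looks like sex).

```python
import csv
import io

# The file has no header row, so the column names are supplied here.
FIELDS = ["name", "weight", "height", "sex"]

sample = io.StringIO("Mags,66,1.50,F\nJoy,62,1.34,F\n")
rows = list(csv.DictReader(sample, fieldnames=FIELDS))

for row in rows:
    # Every value is still a string; convert the numeric columns as needed.
    print(row["name"], float(row["weight"]), float(row["height"]))
```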

【Comments】:

I didn't write that title. I'm new here, so I think it may have been edited by someone else.

【Solution 4】:

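Use:

splitted_line = line.split(',', 4)

This splits the line you just found into four parts, using the comma , as the delimiter. You can then get the first part (the name) with splitted_line[0], the second part (the weight) with splitted_line[1], and so on. So, to print the person's name, weight and height:

print('The person %s weighs %s and is %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))

To get the average weight and height, you need to know how many entries the file contains, add up the weights and the heights, and then divide each sum by the number of entries/people. The whole thing looks like:

search = input("Who's information would you like to find?")
total = 0
weight = 0
height = 0
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        total += 1
        # strip() drops the trailing newline before splitting
        splitted_line = line.strip().split(',', 4)
        # float, not int: heights such as 1.82 are not whole numbers
        weight += float(splitted_line[1])
        height += float(splitted_line[2])
        if search in line:
            print('The person %s weighs %s and is %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))
average_weight = weight / total
average_height = height / total

This is a simple approach and hopefully an easy one to understand.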
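
One thing to be aware of with `search in line` is that it is a substring test on the whole line, so a short input such as "J" matches several people. If exact matching is wanted, one option is a dictionary keyed by name. The sketch below uses a few in-memory lines from the question; the variable names are chosen for this example only.

```python
lines = ["James,73,1.82,M\n", "Jay,90,1.90,M\n", "Joy,62,1.34,F\n"]

# Substring test on the whole line: "J" matches every line here.
substring_hits = [line for line in lines if "J" in line]

# Build a name -> (weight, height) lookup for exact matching instead.
people = {}
for line in lines:
    name, weight, height, sex = line.strip().split(",")
    people[name] = (float(weight), float(height))

print(len(substring_hits))
print(people.get("Jay"))
print(people.get("J"))  # no person is called exactly "J"
```

dict.get returns None for a missing name, which makes it easy to print a "not found" message.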

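【Comments】:

At the time of writing this answer, the title did not mention csv; perhaps you should use the csv lib in this case.
Thank you very much for your help, this does work and it is easy to understand.
You're welcome. If you'd like to accept this as the answer, I wouldn't mind.

【Solution 5】: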

search = input("Who's information would you like to find?")
heights = []
weights = []
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        fields = line.strip().split(',')
        # collect every person's values so the averages cover the whole file
        heights.append(float(fields[2]))  # float: heights look like 1.82
        weights.append(float(fields[1]))
        if search in line:
            print(line)
# your calculation stuff
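
The "calculation stuff" left open above could be filled in with statistics.mean from the standard library. This is a sketch, assuming the two lists have already been filled with the six rows from the question; the values are typed in directly so the snippet runs on its own.

```python
from statistics import mean

# Values as they would be collected from the six rows in the question.
weights = [73.0, 78.0, 90.0, 65.0, 66.0, 62.0]
heights = [1.82, 1.80, 1.90, 1.53, 1.50, 1.34]

# statistics.mean raises StatisticsError on an empty list, so guard first.
if weights and heights:
    average_weight = mean(weights)
    average_height = mean(heights)
    print("Average weight: {:.2f}".format(average_weight))
    print("Average height: {:.2f}".format(average_height))
```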

【Comments】: