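【Posted】: 2022-01-30 19:28:31
【Question】:
I'm currently working on a program that analyzes data from a TSV file. I've built the basic functionality, but I need to filter the DataFrame further. The data has carrier, origin, and date columns that I need to filter on. This is my approach so far: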
import argparse
import pandas as pd
# Parsing arguments. You must not modify these lines!
parser = argparse.ArgumentParser()
parser.add_argument("statistic", choices=["avg", "max"], help="Which statistic should be run?")
parser.add_argument("variable", choices=["distance", "delay"], help="What variable should be used for the calculation?")
parser.add_argument("tsvfile", help="Name of data file to be analyzed")
parser.add_argument("--carrier", dest="carrier", help="Comma-separated list of airline codes for those airlines whose flights should be included")
#parser.add_argument("--date", dest="date", help="Departure dates for flights to be included")
#parser.add_argument("--origin", dest="origin", help="Departure dates for flights to be included")
args = parser.parse_args()
# Start here with the rest of the program....
#accessing the values
stats = args.statistic
var = args.variable
car = args.carrier
#the_date = args.date
#origin = args.origin
#opening the file
file = pd.read_csv(args.tsvfile, sep='\t')
#printing the max distance
if stats == "max" and var == "distance":
    print(max(file["DISTANCE"]))
#printing the max delay
if stats == "max" and var == "delay":
    print(max(file["DEPARTURE_DELAY"]))
#printing the avg delay (early or on-time departures count as a delay of 0)
if stats == "avg" and var == "delay":
    no_of_planes_delay = 0
    sum_delay = 0
    for number in file["DEPARTURE_DELAY"]:
        if number > 0:
            no_of_planes_delay += 1
            sum_delay = sum_delay + number
        if number <= 0:
            no_of_planes_delay += 1
            sum_delay = sum_delay + 0
    average_delay = sum_delay / no_of_planes_delay
    print(round(average_delay, 1))
#printing the avg distance
if stats == "avg" and var == "distance":
    sum_distance = 0
    no_of_planes = 0
    for number in file["DISTANCE"]:
        no_of_planes += 1
        sum_distance = sum_distance + number
    average_distance = sum_distance / no_of_planes
    print(round(average_distance, 1))
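So I need to apply these filters from the command line, for example python flight.py --carrier AA,DL --origin JFK avg delay flight.tsv. Does anyone know how I can keep what I've written and filter the DataFrame further before the statistics are calculated?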
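One possible way to do this, as a rough sketch rather than a definitive answer: build a boolean mask with `Series.isin` for each option that was actually passed, then run the statistics on the filtered frame. The column names `CARRIER`, `ORIGIN` and `DATE`, the helper names `split_codes` and `filter_flights`, and the sample rows are all assumptions made for illustration; the real TSV headers are not shown in the post and may differ.

```python
import pandas as pd


def split_codes(raw):
    """Turn "AA,DL" into ["AA", "DL"]; return None when the option was not given."""
    if raw is None:
        return None
    return [code.strip() for code in raw.split(",") if code.strip()]


def filter_flights(df, carrier=None, origin=None, date=None,
                   carrier_col="CARRIER", origin_col="ORIGIN", date_col="DATE"):
    """Return only the rows matching every filter that was supplied.

    The default column names are assumptions; pass the real header names
    from the TSV file if they differ.
    """
    mask = pd.Series(True, index=df.index)
    for raw, col in ((carrier, carrier_col), (origin, origin_col), (date, date_col)):
        codes = split_codes(raw)
        if codes is not None:
            # Compare as strings so "AA,DL" style input matches the column values.
            mask &= df[col].astype(str).isin(codes)
    return df[mask]


# Tiny made-up table standing in for the real TSV file.
flights = pd.DataFrame({
    "CARRIER": ["AA", "DL", "UA", "AA"],
    "ORIGIN": ["JFK", "JFK", "SFO", "LAX"],
    "DATE": ["2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02"],
    "DEPARTURE_DELAY": [10.0, -5.0, 30.0, 0.0],
})

# Equivalent of: --carrier AA,DL --origin JFK avg delay
filtered = filter_flights(flights, carrier="AA,DL", origin="JFK")
# clip(lower=0) treats early departures as a delay of 0, like the loop in the question.
avg_delay = filtered["DEPARTURE_DELAY"].clip(lower=0).mean()
print(round(avg_delay, 1))
```

In the script from the question, this would presumably be called right after `pd.read_csv`, for example `file = filter_flights(file, args.carrier, args.origin, args.date)`, once the commented-out `--origin` and `--date` arguments are enabled. An option that was not passed arrives as `None` and is simply skipped, so the existing statistics code can keep running on `file` unchanged.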
【Discussion】:
Tags: python pandas dataframe command-line-arguments argparse