【Posted】: 2016-12-29 18:36:26
【Question】:
I have this function:
def cal_score(research, citations, teaching, international, income):
    # Weighted sum of the five indicators; the weights add up to 1.0
    return (0.3 * research + 0.3 * citations + 0.3 * teaching
            + 0.075 * international + 0.025 * income)
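where research, citations, teaching, international and income are columns of the dataset. I want to add a new column to the dataset whose values are computed with the function above. I have tried several approaches, but none of them worked.
Example: suppose we have a row like this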
university_name Indian Institute of Technology Bombay
teaching 43.8
international 14.3
research 24.2
citations 8,327
income 14.9
Total Score Ranking
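then the total score should be calculated as
Total Score = 0.3 * research + 0.3 * citations + 0.3 * teaching + 0.075 * international + 0.025 * income
This should be applied to every row in the dataset.
Can anyone help me implement this? I have been stuck on it for a long time. :-(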
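For reference, here is a minimal sketch of what the requirement amounts to once every input column is numeric. It is not taken from the post: the tiny DataFrame `df` below is a hypothetical stand-in built from two rows of the data shown further down, with `international` and `income` typed in as floats by hand. Because the function only uses `*` and `+`, it can be called on whole columns (pandas Series) at once, so no explicit loop over rows is needed.

```python
import pandas as pd


def cal_score(research, citations, teaching, international, income):
    # Weighted sum of the five indicators; works on scalars and on Series alike
    return (0.3 * research + 0.3 * citations + 0.3 * teaching
            + 0.075 * international + 0.025 * income)


# Hypothetical two-row sample; all five columns are already floats here
df = pd.DataFrame({
    "research": [15.7, 45.3],
    "citations": [38.8, 39.0],
    "teaching": [43.8, 44.2],
    "international": [14.3, 16.1],
    "income": [24.2, 72.4],
})

# Passing whole columns computes the score for every row in one step
df["total_score"] = cal_score(df["research"], df["citations"], df["teaching"],
                              df["international"], df["income"])
print(df)
```

This only works when all five columns have a numeric dtype; string columns would need converting first.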
Indian_univ.head(10).to_dict()
{'citations': {510: 38.799999999999997,
832: 39.0,
856: 45.600000000000001,
959: 45.799999999999997,
1232: 84.700000000000003,
1360: 38.5,
1361: 41.799999999999997,
1362: 35.299999999999997,
1363: 53.600000000000001,
1679: 51.600000000000001},
'country': {510: 'India',
832: 'India',
856: 'India',
959: 'India',
1232: 'India',
1360: 'India',
1361: 'India',
1362: 'India',
1363: 'India',
1679: 'India'},
'female_male_ratio': {510: '16 : 84',
832: '15 : 85',
856: '16 : 84',
959: '17 : 83',
1232: '46 : 54',
1360: '18 : 82',
1361: '13 : 87',
1362: '15 : 85',
1363: '17 : 83',
1679: '19 : 81'},
'income': {510: '24.2',
832: '72.4',
856: '52.7',
959: '70.4',
1232: '28.4',
1360: '-',
1361: '42.4',
1362: '-',
1363: '64.8',
1679: '37.9'},
'international': {510: '14.3',
832: '16.1',
856: '19.9',
959: '15.6',
1232: '29.3',
1360: '15.3',
1361: '17.3',
1362: '14.7',
1363: '15.6',
1679: '18.2'},
'international_students': {510: '1%',
832: '0%',
856: '1%',
959: '1%',
1232: '1%',
1360: '1%',
1361: '0%',
1362: '0%',
1363: '1%',
1679: '1%'},
'num_students': {510: '8,327',
832: '9,928',
856: '8,327',
959: '8,061',
1232: '16,691',
1360: '8,371',
1361: '6,167',
1362: '9,928',
1363: '8,061',
1679: '3,318'},
'research': {510: 15.699999999999999,
832: 45.299999999999997,
856: 33.100000000000001,
959: 13.699999999999999,
1232: 14.0,
1360: 23.0,
1361: 25.199999999999999,
1362: 30.0,
1363: 12.300000000000001,
1679: 39.5},
'student_staff_ratio': {510: 14.9,
832: 17.5,
856: 14.9,
959: 18.699999999999999,
1232: 23.899999999999999,
1360: 17.300000000000001,
1361: 12.199999999999999,
1362: 17.5,
1363: 18.699999999999999,
1679: 8.1999999999999993},
'teaching': {510: 43.799999999999997,
832: 44.200000000000003,
856: 47.299999999999997,
959: 30.399999999999999,
1232: 25.800000000000001,
1360: 33.799999999999997,
1361: 31.300000000000001,
1362: 39.299999999999997,
1363: 25.100000000000001,
1679: 32.600000000000001},
'total_score': {510: 29.489999999999995,
832: 38.549999999999997,
856: 37.799999999999997,
959: 26.969999999999999,
1232: 37.350000000000001,
1360: 28.589999999999996,
1361: 29.489999999999998,
1362: 31.379999999999995,
1363: 27.299999999999997,
1679: 37.109999999999999},
'university_name': {510: 'Indian Institute of Technology Bombay',
832: 'Indian Institute of Technology Kharagpur',
856: 'Indian Institute of Technology Bombay',
959: 'Indian Institute of Technology Roorkee',
1232: 'Panjab University',
1360: 'Indian Institute of Technology Delhi',
1361: 'Indian Institute of Technology Kanpur',
1362: 'Indian Institute of Technology Kharagpur',
1363: 'Indian Institute of Technology Roorkee',
1679: 'Indian Institute of Science'},
'world_rank': {510: '301-350',
832: '226-250',
856: '251-275',
959: '351-400',
1232: '226-250',
1360: '351-400',
1361: '351-400',
1362: '351-400',
1363: '351-400',
1679: '276-300'},
'year': {510: 2012,
832: 2013,
856: 2013,
959: 2013,
1232: 2014,
1360: 2014,
1361: 2014,
1362: 2014,
1363: 2014,
1679: 2015}}
【Comments】:
-
Please post your actual dataframe. Posting df.head(20).to_dict() is usually helpful too, so that people can play around with your data.
-
Hi. I added a screenshot of the data. Please take a look.
-
Please don't post images. Post the output of df.head().to_dict()
-
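Edited the post, but it looks clumsy. :( Update: teaching, research and citations are float64, while total_score, international and income are object. I can only compute the score from the fields with dtype float64. That means I need to convert the remaining required fields from object to float64, which should solve the problem.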
-
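I made it look nicer :). Yes, the columns need to have a numeric dtype, otherwise you won't be able to use them in calculations! That should be straightforward.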
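The conversion discussed in these comments could look roughly like the sketch below. It is an illustration, not the asker's final code: the two-row `df` is a hypothetical sample modelled on the posted data, where `international` and `income` are strings and a missing income is written as '-'. `pd.to_numeric` with `errors="coerce"` turns anything unparseable into NaN rather than raising.

```python
import pandas as pd

# Hypothetical sample: two of the five score inputs are stored as strings
df = pd.DataFrame({
    "research": [15.7, 23.0],
    "citations": [38.8, 38.5],
    "teaching": [43.8, 33.8],
    "international": ["14.3", "15.3"],
    "income": ["24.2", "-"],
})

# Convert the object columns to float64; '-' cannot be parsed and becomes NaN
for col in ["international", "income"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# With numeric dtypes in place, the weighted sum is plain column arithmetic
df["total_score"] = (0.3 * df["research"] + 0.3 * df["citations"]
                     + 0.3 * df["teaching"] + 0.075 * df["international"]
                     + 0.025 * df["income"])
print(df.dtypes)
print(df)
```

A row whose income was '-' ends up with a NaN total score, because NaN propagates through the arithmetic. Whether to drop such rows or fill the gap (for example with `fillna(0)`) depends on how missing income should be treated, which the post does not say.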
Tags: python pandas dataframe dataset data-science