【Posted】: 2021-03-18 08:05:40
【Question】:
I'm following a YouTube tutorial, and I wrote this code based on it
import numpy as np
import pandas as pd
from scipy.stats import percentileofscore as score
my_columns = [
    'Ticker',
    'Price',
    'Number of Shares to Buy',
    'One-Year Price Return',
    'One-Year Percentile Return',
    'Six-Month Price Return',
    'Six-Month Percentile Return',
    'Three-Month Price Return',
    'Three-Month Percentile Return',
    'One-Month Price Return',
    'One-Month Percentile Return'
]
final_df = pd.DataFrame(columns = my_columns)
# populate final_df here....
pd.set_option('display.max_columns', None)
print(final_df[:1])
time_periods = ['One-Year', 'Six-Month', 'Three-Month', 'One-Month']
for row in final_df.index:
    for time_period in time_periods:
        change_col = f'{time_period} Price Return'
        print(type(final_df[change_col]))
        percentile_col = f'{time_period} Percentile Return'
        print(final_df.loc[row, change_col])
        final_df.loc[row, percentile_col] = score(final_df[change_col], final_df.loc[row, change_col])
print(final_df)
It prints my DataFrame as
| Ticker | Price | Number of Shares to Buy | One-Year Price Return | One-Year Percentile Return | Six-Month Price Return | Six-Month Percentile Return | Three-Month Price Return | Three-Month Percentile Return | One-Month Price Return | One-Month Percentile Return |
|--------|---------|-------------------------|------------------------|----------------------------|------------------------|-----------------------------|--------------------------|-------------------------------|-------------------------|------------------------------|
| A | 120.38 | N/A | 0.437579 | N/A | 0.280969 | N/A | 0.198355 | N/A | 0.0455988 | N/A |
But when I call the score function, I get this error
<class 'pandas.core.series.Series'>
0.4320217937551543
Traceback (most recent call last):
  File "program.py", line 72, in <module>
    final_df.loc[row, percentile_col] = score(final_df[change_col], final_df.loc[row, change_col])
  File "/Users/abhisheksrivastava/Library/Python/3.7/lib/python/site-packages/scipy/stats/stats.py", line 2017, in percentileofscore
    left = np.count_nonzero(a < score)
TypeError: '<' not supported between instances of 'NoneType' and 'float'
What went wrong? I saw the same code in the YouTube video. I have almost no experience with Python.
Edit:
I also tried
print(type(final_df['One-Year Price Return']))
print(type(final_df['Six-Month Price Return']))
print(type(final_df['Three-Month Price Return']))
print(type(final_df['One-Month Price Return']))
for row in final_df.index:
    final_df.loc[row, 'One-Year Percentile Return'] = score(final_df['One-Year Price Return'], final_df.loc[row, 'One-Year Price Return'])
    final_df.loc[row, 'Six-Month Percentile Return'] = score(final_df['Six-Month Price Return'], final_df.loc[row, 'Six-Month Price Return'])
    final_df.loc[row, 'Three-Month Percentile Return'] = score(final_df['Three-Month Price Return'], final_df.loc[row, 'Three-Month Price Return'])
    final_df.loc[row, 'One-Month Percentile Return'] = score(final_df['One-Month Price Return'], final_df.loc[row, 'One-Month Price Return'])
print(final_df)
But I still get the same error
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
Traceback (most recent call last):
  File "program.py", line 71, in <module>
    final_df.loc[row, 'One-Year Percentile Return'] = score(final_df['One-Year Price Return'], final_df.loc[row, 'OneYear Price Return'])
  File "/Users/abhisheksrivastava/Library/Python/3.7/lib/python/site-packages/scipy/stats/stats.py", line 2017, in percentileofscore
    left = np.count_nonzero(a < score)
TypeError: '<' not supported between instances of 'NoneType' and 'float'
【Comments】:
-
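One of the column names in final_df must have an extra space at the end. So when you try to access the column with final_df[change_col], your DataFrame can't find it, and it returns None instead of a pandas Series. Can you add print(type(final_df[change_col])) and paste the result here? That will make things clearer.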
-
I changed the code and added the print statements. The type is <class 'pandas.core.series.Series'>
-
Note that your last error message refers to "OneYear" rather than "One-Year". My guess is that the code you posted isn't the code you're actually running.
-
It's One-Year everywhere. Sorry, I was trying to remove the -
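
For readers who hit the same traceback: the message says a None was compared with a float inside `a < score`, which suggests that at least one entry of the price-return column is None, not that the column itself is missing. The sketch below is not from the tutorial or the thread. It assumes hypothetical sample data in which one ticker has no one-year return, reproduces the failing comparison with NumPy directly, and then scores only the rows that have a value. Leaving the percentile as NaN for missing rows is one possible choice; filling missing returns with a default value first would be another.

```python
import numpy as np
import pandas as pd
from scipy.stats import percentileofscore as score

col = 'One-Year Price Return'
pct_col = 'One-Year Percentile Return'

# Hypothetical sample data: ticker B has no one-year return (None),
# which forces the column to object dtype.
final_df = pd.DataFrame({
    'Ticker': ['A', 'B', 'C', 'D'],
    col: pd.Series([0.43, None, 0.10, 0.25], dtype=object),
})

# A misspelled label raises KeyError; it does not return None.
try:
    final_df['OneYear Price Return']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# The comparison from the traceback (a < score) fails on the None entry.
try:
    np.asarray(final_df[col]) < 0.43
    comparison_failed = False
except TypeError as exc:
    comparison_failed = True
    print(f'Reproduced: {exc}')

# Count the missing entries, then coerce to float64 so None becomes NaN.
n_missing = int(final_df[col].isna().sum())
returns = pd.to_numeric(final_df[col], errors='coerce')
valid = returns.dropna()

# Score each available return against the available returns only;
# rows without a return keep NaN as their percentile.
final_df[pct_col] = returns.map(
    lambda r: score(valid, r) if pd.notna(r) else np.nan
)
print(final_df)
```

Running `final_df[col].isna().sum()` on each of the four price-return columns before the loop would show whether missing values are in fact the cause in the original data.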