Value in pandas dataframe is 13 but not always recognized

【Question title】: Value in pandas dataframe is 13 but not always recognized
【Posted】: 2018-07-03 15:12:49
【Question】:

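I'm working on an assignment for the Coursera Introduction to Data Science course. I have a dataframe with 'Country' as the index and 'Rank' as one of the columns. When I try to reduce the dataframe to only the rows for countries ranked 1-15, the following works but leaves out Iran, which is ranked 13.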

df.set_index('Country', inplace=True)
df.loc['Iran', 'Rank'] = 13  # I did this in case there was some sort of corruption in the original data
df_top15 = df.where(df.Rank < 16).dropna().copy()   
return df_top15

When I try

df_top15 = df.where(df.Rank == 12).dropna().copy()

I get the row for Spain.

But when I try

df_top15 = df.where(df.Rank == 13).dropna().copy()

I get only the column headers, with no row for Iran.

I also tried

df.Rank == 13

and got a Series that is False for every country except Iran, which is True.

Any idea what is causing this?

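【Comments】:

Could you post a link to your dataframe?
Instead of df_top15 = df.where(df.Rank
Thank you, Charles R. I tried your suggestion but still get the same problem. Unfortunately, I don't know how to post a link to the dataframe. It is built by reading Excel and CSV files stored on the Coursera system. I have downloaded the files, but I don't know how to put them somewhere publicly accessible.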

Tags: python python-3.x pandas dataframe

【Solution 1】:

Your code works fine:

import pandas as pd

df = pd.DataFrame([['Italy', 5],
                   ['Iran', 13],
                   ['Tinbuktu', 20]],
                  columns=['Country', 'Rank'])

res = df.where(df.Rank < 16).dropna()

print(res)

  Country  Rank
0   Italy   5.0
1    Iran  13.0

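However, I don't like this approach: because of the masking, the dtype of your Rank series becomes float, since some values are first converted to NaN.

A better idea, in my opinion, is to use query or loc. With either method there is no need for dropna: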

res = df.query('Rank < 16')
res = df.loc[df['Rank'] < 16]

print(res)

  Country  Rank
0   Italy     5
1    Iran    13
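
The dtype difference between the two approaches can be checked directly. The following is a minimal sketch that reuses the toy data from this answer; the variable names masked and filtered are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['Italy', 'Iran', 'Tinbuktu'],
                   'Rank': [5, 13, 20]})

# where() replaces non-matching rows with NaN, which forces Rank to float
masked = df.where(df['Rank'] < 16).dropna()

# Boolean indexing with loc selects rows without introducing NaN
filtered = df.loc[df['Rank'] < 16]

print(masked['Rank'].dtype)    # float64
print(filtered['Rank'].dtype)  # int64
```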

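【Comments】:

Thank you so much, jpp! Both of your examples solved the problem for me, and yes, integers turning into floats is ugly.
I'd still like to know why the old code didn't work for me, though I appreciate learning these new approaches.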
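
One possible explanation, which this thread does not confirm: dropna() with default arguments drops any row that has a NaN in any column, not only the rows that where() masked out. If Iran's row had a missing value in some other column, where(...).dropna() would remove it even though df.Rank == 13 is True, while query and loc would keep it. The sketch below only illustrates that behaviour; the 'Energy' column, its missing value, and the other rows are made-up assumptions, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the 'Energy' column and its NaN are assumptions
# for illustration, not taken from the asker's files.
df = pd.DataFrame({'Country': ['Spain', 'Iran', 'Chad'],
                   'Rank': [12, 13, 40],
                   'Energy': [4.9, np.nan, 0.3]}).set_index('Country')

# where() keeps Iran's row, but dropna() then removes it
# because of the NaN in the unrelated 'Energy' column
via_where = df.where(df['Rank'] < 16).dropna()

# Boolean indexing only looks at the Rank condition
via_loc = df.loc[df['Rank'] < 16]

print(via_where.index.tolist())  # ['Spain']
print(via_loc.index.tolist())    # ['Spain', 'Iran']
```

Running df.loc['Iran'].isna() on the real dataframe would show whether any column is actually missing for that row.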