Selecting the last element of a list inside a pandas dataframe: answers

【Question title】: Selecting the last element of a list inside a pandas dataframe
【Posted】: 2021-07-16 00:02:21
【Question】:

I have a pandas DataFrame in which one column holds lists. Sample data:

datetime.              column1
2021-04-10 00:03 00.   [20.0, 21.6, 30.7]
2021-04-10 00:06 00.   [10.0, 20.6, 20.7]
2021-04-10 00:09 00.   [20.0, 21.5, 10.7]

I want to select the last element of each list in column1. The expected output is:

datetime.              column1
2021-04-10 00:03 00.   30.7
2021-04-10 00:06 00.   20.7
2021-04-10 00:09 00.   10.7

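【Question comments】:

You access the last element of any list the same way: access the list, then take the last element of that expression. Where are you stuck?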
df.column1.str[-1] ?

Tags: python pandas numpy data-science

【Solution 1】:

df.column1 = df.column1.apply(lambda x: x[-1])    
print(df)

Prints:

              datetime.  column1
0  2021-04-10 00:03 00.     30.7
1  2021-04-10 00:06 00.     20.7
2  2021-04-10 00:09 00.     10.7

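【Comments】:

【Solution 2】:

pandas has no dedicated built-in method for pulling items out of lists stored in a column, but you can use apply() to index each list yourself.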

df.column1 = df.column1.apply(lambda x: x[-1])
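A minimal sketch of the apply() approach, assuming some rows might hold an empty list. The helper name `last_or_nan` is hypothetical, and the empty-list guard is an assumption about how such rows should be treated; the answer's plain `x[-1]` would raise IndexError on an empty list.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "column1": [[20.0, 21.6, 30.7], [10.0, 20.6, 20.7], []],
})

def last_or_nan(values):
    # Return the last item, or NaN when the list is empty
    return values[-1] if len(values) > 0 else np.nan

last = df["column1"].apply(last_or_nan)
print(last.tolist())
```

If every list is guaranteed to be non-empty, the one-line lambda from the answer is enough.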

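【Comments】:

【Solution 3】:

One way to avoid apply, which amounts to iterating over the DataFrame row by row, is to pass the column to the standard constructor to build a new DataFrame.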

df.assign(new_column1=pd.DataFrame(df.column1.tolist()).iloc[:, -1])

              column1  new_column1
0  [20.0, 21.6, 30.7]         30.7
1  [10.0, 20.6, 20.7]         20.7
2  [20.0, 21.5, 10.7]         10.7
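Two caveats seem worth noting with this approach. First, assign() aligns on the index, and the DataFrame built from tolist() gets a fresh 0..n-1 index, so a frame with any other index needs `index=df.index`. Second, lists of unequal length are padded with NaN, so `iloc[:, -1]` returns NaN for the shorter rows rather than their true last element. A sketch of the first point, using a made-up string index:

```python
import pandas as pd

df = pd.DataFrame(
    {"column1": [[20.0, 21.6, 30.7], [10.0, 20.6, 20.7], [20.0, 21.5, 10.7]]},
    index=["a", "b", "c"],  # hypothetical non-default index
)

# Reuse df's index so assign() lines the new column up with the right rows
expanded = pd.DataFrame(df["column1"].tolist(), index=df.index)
result = df.assign(new_column1=expanded.iloc[:, -1])
print(result)
```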

【Comments】:

【Solution 4】:

It may look strange, but you can use .str to get elements from a list

 df.column1 = df.column1.str[-1]

It also works when the column holds dictionaries

 df.other = df.other.str[key]

Minimal working code

import pandas as pd

df = pd.DataFrame({
    'datetime.': [
        '2021-04-10 00:03 00.', 
        '2021-04-10 00:06 00.', 
        '2021-04-10 00:09 00.'
    ],
    'column1':  [
        [20.0, 21.6, 30.7], 
        [10.0, 20.6, 20.7], 
        [20.0, 21.5, 10.7]
    ],
    'other':  [
        {'a': 20.0, 'b': 21.6, 'c': 30.7}, 
        {'a': 10.0, 'b': 20.6, 'c': 20.7}, 
        {'a': 20.0, 'b': 21.5, 'c': 10.7}
    ],
})    

print(df)

df.column1 = df.column1.str[-1]
df.other = df.other.str['c']

print(df)

Result:

              datetime.             column1                              other
0  2021-04-10 00:03 00.  [20.0, 21.6, 30.7]  {'a': 20.0, 'b': 21.6, 'c': 30.7}
1  2021-04-10 00:06 00.  [10.0, 20.6, 20.7]  {'a': 10.0, 'b': 20.6, 'c': 20.7}
2  2021-04-10 00:09 00.  [20.0, 21.5, 10.7]  {'a': 20.0, 'b': 21.5, 'c': 10.7}


              datetime.  column1  other
0  2021-04-10 00:03 00.     30.7   30.7
1  2021-04-10 00:06 00.     20.7   20.7
2  2021-04-10 00:09 00.     10.7   10.7
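As far as I can tell, .str indexing also tolerates rows that do not fit: an empty list or a dictionary without the requested key produces a missing value instead of raising. A small sketch with made-up data (the Series names are hypothetical), which may behave differently across pandas versions:

```python
import pandas as pd

lists = pd.Series([[20.0, 21.6, 30.7], [], [10.7]])
dicts = pd.Series([{"a": 1.0, "c": 3.0}, {"a": 2.0}])

last = lists.str[-1]     # the empty list yields a missing value, not IndexError
c_vals = dicts.str["c"]  # the dict without key 'c' yields a missing value, not KeyError

print(last.tolist())
print(c_vals.tolist())
```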

To do the same on several columns at once you also need .apply()

df[['column1', 'column2']] = df[['column1', 'column2']].apply(lambda column: column.str[-1]) # axis=0

or row by row

df[['column1', 'column2']] = df[['column1', 'column2']].apply(lambda row: row.str[-1], axis=1)
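The minimal working code above has no column2, so here is a self-contained sketch of the column-wise variant. The contents of column2 are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "column1": [[20.0, 21.6, 30.7], [10.0, 20.6, 20.7]],
    "column2": [[1, 2], [3, 4]],  # hypothetical second list column
})

cols = ["column1", "column2"]
# apply() hands each column over as a Series, so .str[-1] runs once per column
df[cols] = df[cols].apply(lambda column: column.str[-1])
print(df)
```

The column-wise form calls .str once per column, while axis=1 calls it once per row, so the former is probably the better default on tall frames.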

By the way:

If you want to expand all the elements into separate columns, you can use .apply(pd.Series)

df[ ["1", "2", "3"] ] = df.column1.apply(pd.Series)
df[ ["a", "b", "c"] ] = df.other.apply(pd.Series)
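.apply(pd.Series) creates one Series per row, which tends to be slow on large frames. A possible alternative, sketched here under the assumption that every list has the same length, is to build the new columns with a single constructor call. The column names "1", "2", "3" follow the snippet above.

```python
import pandas as pd

df = pd.DataFrame({
    "column1": [[20.0, 21.6, 30.7], [10.0, 20.6, 20.7], [20.0, 21.5, 10.7]],
})

# One constructor call instead of building a Series for every row
df[["1", "2", "3"]] = pd.DataFrame(df["column1"].tolist(), index=df.index)
print(df)
```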

【Comments】: