Pandas filtering/pooling and keeping the old indices - Answers

[Question title]: Pandas filtering/pooling and keeping the old indices
[Posted]: 2018-01-15 07:37:09
[Question description]:

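Here is my problem:

I have two columns in the same data set. One holds IDs (several IDs are repeated) and the other holds ages (many ages are repeated). I want to create new columns that regroup the values and show the indices they had in the OLD table. An example: age = [12, 14, 10, 12, 10] (with indices 1, 2, 3, 4, 5). I would like to obtain: Age2 = [10, 10, 12, 12, 14], Indexe = [3, 5, 1, 4, 2], so that when I look up age 10, I can see that 10 was originally at indices (3, 5).

My code sample: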

    for ind in ind_list:
        # Rows whose age equals the current value
        temp = data[data['age'] == ind].copy()
        # inds lists the index labels these rows have in the old data
        inds = temp.index.tolist()

Another, longer approach:

    final = []
    special_index = 0
    for i in range(len(CTs2) - 1, -1, -1):
        # Student IDs at the positions found above, filtered to the current ID
        temp = data['student_ID'][inds]
        temp = temp[temp == CTs2[i]]
        inds2 = temp.index.tolist()

        if len(inds2) > 0:
            CTs2.pop(i)
            final.extend(inds2)
            special_index += 1

I hope this helps... thank you all.

[Question discussion]:

Can you create a sample input table and the expected output?

Tags: python python-2.7 pandas indexing

[Solution 1]:

If you want to create a column that stores the indices of the repeated ages, you can use

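    import numpy as np
    import pandas as pd
    frame = pd.DataFrame(np.random.randint(1, 5, (10, 2)), columns=['ID', 'Age'])
    # For each row, collect the index labels of all rows with the same age
    frame['Age2'] = [[dex for y, dex in zip(frame.Age, frame.index) if x == y] for x in frame.Age]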
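The list comprehension above compares every row with every other row, which can get slow on large frames. As an alternative sketch (not part of the original answer), the same information can probably be obtained with a stable sort plus `groupby`. The frame below is rebuilt from the example in the question (`age` values with a 1-based index); the names `sorted_data`, `age2`, `indexe` and `age_to_index` are made up for illustration.

```python
import pandas as pd

# Example data from the question: ages with the original 1-based index.
data = pd.DataFrame({"age": [12, 14, 10, 12, 10]}, index=[1, 2, 3, 4, 5])

# A stable sort keeps rows with equal ages in their original relative order.
sorted_data = data.sort_values("age", kind="mergesort")
age2 = sorted_data["age"].tolist()     # ages in sorted order
indexe = sorted_data.index.tolist()    # original index labels, in sorted order

# Map each distinct age to the list of original index labels where it occurs.
age_to_index = {
    age: labels.tolist() for age, labels in data.groupby("age").groups.items()
}

# Per-row column holding the same lists the list comprehension would build.
data["Age2"] = [age_to_index[a] for a in data["age"]]
```

With this, `age_to_index[10]` looks up where age 10 originally sat, while `age2` and `indexe` correspond to the `Age2` and `Indexe` lists asked for in the question.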

[Discussion]: