Answers: maximum value at a specific list index in a dictionary

【Question title】: Maximum value in a specific index of lists within a dictionary
【Posted】: 2023-03-29 21:52:01
【Question description】:

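I have a dictionary that looks like the one below, where every value is a list of the same length. It is used to build a pandas DataFrame. For each index of these lists, I want the name of the key that holds the maximum value (for example, the maximum at the first index is 0.00023478 and the maximum at the fourth index is 0.23849287). I tried converting the dictionary to a pandas DataFrame and then finding the index of the maximum, but that takes too long because I am processing a lot of data. I need to find the maximum at a specific index and return its key before the dictionary is converted to a DataFrame.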

{'DT': [0, 0, 0, 0, 0, 0, 0, 0], 'NN': [0.00023478, 0, 0, 0, 0, 0, 0, 0], 
'POS': [0, 0, 0, 0.000192837, 0, 0, 0, 0], 'MD': [0, 0, 0, 0, 0, 0, 0, 0], 
'VB': [0, 0, 0, 0, 0, 0, 0, 0], 'VBN': [0, 0, 0, 0, 0, 0, 0, 0], 
'IN': [0.0000028945, 0, 0, 0, 0, 0, 0, 0], 'JJ': [0, 0, 0, 0, 0, 0, 0, 0], 
'NNS': [0, 0, 0, 0, 0, 0, 0, 0], 'CC': [0, 0, 0, 0.23849287, 0, 0, 0, 0], 
'RBS': [0, 0, 0, 0, 0, 0, 0, 0], 'NNP': [0, 0, 0, 0, 0, 0, 0, 0], 
'VBZ': [0, 0, 0, 0, 0, 0, 0, 0], 'TO': [0, 0, 0, 0, 0, 0, 0, 0]}

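for i in range(len(test)):  # one iteration per sentence
    # one zero-filled list per entry in pos_list, each as long as test[i]
    list1 = [[0 for x in range(len(test[i]))] for y in range(len(pos_list))]
    q = dict(zip(pos_list, list1))
    for j in range(len(test[i])):
        ...  # loop body not shown in the original post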

【Question comments】:

Have you tried pd.DataFrame(data=data).idxmax(1)?

Tags: python arrays pandas dataframe dictionary

【Solution 1】:

Use max with dict.get as the key function:

max(data, key=data.get)
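
One caveat: data.get returns the whole list, so max compares the lists lexicographically, which in practice ranks the keys by their first element. To target a specific index, the key function can look at that position only. The following is a minimal sketch, not the answerer's code; the helper name key_of_max_at and the shortened sample dictionary are made up for illustration.

```python
# Shortened sample shaped like the question's dictionary (assumption: three keys only).
data = {
    'NN': [0.00023478, 0, 0, 0],
    'POS': [0, 0, 0, 0.000192837],
    'CC': [0, 0, 0, 0.23849287],
}


def key_of_max_at(data, i):
    """Return the key whose list holds the largest value at position i."""
    return max(data, key=lambda k: data[k][i])


print(key_of_max_at(data, 0))  # NN
print(key_of_max_at(data, 3))  # CC

# One key per index, without building a DataFrame.
n = len(next(iter(data.values())))
keys_per_index = [key_of_max_at(data, i) for i in range(n)]
print(keys_per_index)  # ['NN', 'NN', 'NN', 'CC']
```

When several keys tie (for example, an index where every value is 0), max returns the first key in dictionary order, so a tie result does not mean that key is meaningfully the largest.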

Or use DataFrame.idxmax:

df.idxmax(1)

【Comments】:

【Solution 2】:

Convert your dict to a DataFrame:

import pandas as pd

df = pd.DataFrame(d)  # d is the dictionary from the question
print(df)

# Output:
   DT        NN       POS  MD  VB  VBN        IN  JJ  NNS        CC  RBS  NNP  VBZ  TO
0   0  0.000235  0.000000   0   0    0  0.000003   0    0  0.000000    0    0    0   0
1   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0
2   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0
3   0  0.000000  0.000193   0   0    0  0.000000   0    0  0.238493    0    0    0   0
4   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0
5   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0
6   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0
7   0  0.000000  0.000000   0   0    0  0.000000   0    0  0.000000    0    0    0   0

Then use max along the columns axis to get the largest value in each row:

>>> df.max(axis='columns')
0    0.000235
1    0.000000
2    0.000000
3    0.238493
4    0.000000
5    0.000000
6    0.000000
7    0.000000
dtype: float64

idxmax works the same way and returns the key (column name) of each maximum:

>>> df.idxmax(axis='columns')
0    NN
1    DT
2    DT
3    CC
4    DT
5    DT
6    DT
7    DT
dtype: object
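
Since the question asks for the keys before a DataFrame is built, the same per-index lookup can also be done on a plain NumPy array. This is a rough sketch under the assumption that all lists have equal length, as the question states; the shortened sample dictionary and the names keys, values, best_rows, best_keys and best_vals are illustrative only, and whether it is actually faster for a given data set would need to be measured.

```python
import numpy as np

# Shortened sample shaped like the question's dictionary (assumption: four keys only).
d = {
    'DT': [0, 0, 0, 0],
    'NN': [0.00023478, 0, 0, 0],
    'POS': [0, 0, 0, 0.000192837],
    'CC': [0, 0, 0, 0.23849287],
}

keys = list(d)  # key order matches the row order of the array below
values = np.array(list(d.values()), dtype=float)  # shape: (n_keys, list_length)

best_rows = values.argmax(axis=0)         # row number of the maximum at each list index
best_keys = [keys[r] for r in best_rows]  # map row numbers back to dictionary keys
best_vals = values.max(axis=0)            # the maximum value at each list index

print(best_keys)  # ['NN', 'DT', 'DT', 'CC']
print(best_vals)
```

As with idxmax above, an index where every value is 0 reports the first key ('DT' here), because argmax returns the first occurrence of the maximum.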

【Comments】: