【Question title】:Iteration issue to create a nested dictionary
【Posted】:2019-05-29 21:57:30
【Question】:

My data looks like this:

                 Application                       WorkflowStep
0                WF:ACAA-CR (auto)                      Manager
1                WF:ACAA-CR (auto)           Access Responsible
2                WF:ACAA-CR (auto)                    Automatic
3                WF:ACAA-CR-AccResp (auto)              Manager
4                WF:ACAA-CR-AccResp (auto)   Access Responsible
5                WF:ACAA-CR-AccResp (auto)            Automatic
6                WF:ACAA-CR-IT-AccResp[AUTO]              Group
7                WF:ACAA-CR-IT-AccResp[AUTO] Access Responsible
8                WF:ACAA-CR-IT-AccResp[AUTO]          Automatic

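In addition to these two columns, I would like to add a third value that shows the total number of WorkflowSteps per Application. The dictionary should look like this (or similar):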

{'WF:ACAA-CR (auto)': 
             [{'Workflow': ['Manager', 'Access Responsible','Automatic'], 'Summary': 3}], 
 'WF:ACAA-CR-AccResp (auto)': 
             [{'Workflow': ['Manager','Access Responsible','Automatic'], 'Summary': 3}], 
 'WF:ACAA-CR-IT-AccResp[AUTO]': 
             [{'Workflow': ['Group','Access Responsible','Automatic'], 'Summary': 3}]
}

My code for creating a dictionary from the two columns above works fine.

for i in range(len(df)):
    currentid = df.iloc[i,0]
    currentvalue = df.iloc[i,1]
    dict.setdefault(currentid, [])
    dict[currentid].append(currentvalue)

The code that computes the WorkflowStep total is as follows, and it also works:

for key, values in dict.items():
    val = values
    match = ["Manager", "Access Responsible", "Automatic", "Group"]
    c = Counter(val)
    sumofvalues = 0
    for m in match:
        if c[m] == 1:
            sumofvalues += 1

My initial idea was to adapt my first snippet so that Application is the top-level key, with WorkflowStep and Summary forming the sub-dictionary.

for i in range(len(df)):
    currentid = df.iloc[i,0]
    currentvalue = df.iloc[i,1]
    dict.setdefault(currentid, [])
    dict[currentid].append({"Workflow": [currentvalue], "Summary": []})

However, the result is not satisfactory: instead of adding currentvalue to the Workflow key that already exists, it creates a new sub-dictionary on every iteration.

Example:

 {'WF:ACAA-CR (auto)': [{'Workflow': ['Manager'], 'Summary': []},
                        {'Workflow': ['Access Responsible'], 'Summary': []}, 
                        {'Workflow': ['Automatic'], 'Summary': []}]
 }

How can I create a dictionary like the one I described above?

【Comments】:

    Tags: python python-3.x dictionary nested


    【Solution 1】:

    IIUC, here is something that may help:

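    # Collect the distinct workflow steps for each application.
    val = df.groupby('Application')['WorkflowStep'].unique()
    # Iterate over (application, steps) pairs rather than indexing the Series by position.
    {app: [{'WorkflowStep': list(steps), 'Summary': len(steps)}] for app, steps in val.items()}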
    

    which results in:

    {'WF:ACAA-CR (auto)': [{'WorkflowStep': ['Manager', 'Access Responsible', 'Automatic'], 'Summary': 3}],
     'WF:ACAA-CR-AccResp (auto)': [{'WorkflowStep': ['Manager', 'Access Responsible', 'Automatic'], 'Summary': 3}],
     'WF:ACAA-CR-IT-AccResp[AUTO]': [{'WorkflowStep': ['Group', 'Access Responsible', 'Automatic'], 'Summary': 3}]}
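
One point worth checking against your real data: `unique()` keeps each step only once per application, so `Summary` here is the number of distinct steps. The following is a minimal sketch of the difference, assuming pandas is installed; the two-application frame is made-up illustration data, not the asker's, and `agg(list)` is shown only as one possible way to keep repeated steps.

```python
import pandas as pd

# Hypothetical data: "Manager" appears twice for the same application.
df = pd.DataFrame({
    "Application": ["App A", "App A", "App A", "App B"],
    "WorkflowStep": ["Manager", "Manager", "Automatic", "Group"],
})

grouped = df.groupby("Application")["WorkflowStep"]

# unique() keeps each step once per application, in order of first appearance.
distinct = {app: list(steps) for app, steps in grouped.unique().items()}

# agg(list) keeps every row, including repeated steps.
every_row = {app: list(steps) for app, steps in grouped.agg(list).items()}

print(distinct)
print(every_row)
```

If repeated steps should count toward `Summary`, taking `len()` of the `agg(list)` result would be the variant to use; for the data in the question both give 3.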
    

    【Comments】:

    • Thank you very much for the great answer!
    【Solution 2】:

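    I think meW's answer is the better way to do this, since it takes advantage of the data frame's built-in functionality. For reference, though, if you want to do it the way you were attempting, I think this will work: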

    import pandas as pd
    
    # Create the data for testing.
    d = {'Application': ["WF:ACAA-CR (auto)", "WF:ACAA-CR (auto)", "WF:ACAA-CR (auto)",
                         "WF:ACAA-CR-AccResp (auto)", "WF:ACAA-CR-AccResp (auto)", "WF:ACAA-CR-AccResp (auto)"],
         'WorkflowStep': ["Manager", "Access Responsible","Automatic","Manager","Access Responsible", "Automatic"]}
    df = pd.DataFrame(d)
    
    new_dict = dict()
    # Iterate through the rows of the data frame. 
    for index, row in df.iterrows():
        # Get the values for the current row.
        current_application_id = row['Application']
        current_workflowstep = row['WorkflowStep']
    
        # Set the default values if not already set.
        new_dict.setdefault(current_application_id, {'Workflow': [], 'Summary' : 0})
    
        # Add the new values.
        new_dict[current_application_id]['Workflow'].append(current_workflowstep)
        new_dict[current_application_id]['Summary'] += 1
    
    print(new_dict)
    

    The output is:

    {'WF:ACAA-CR (auto)': {'Workflow': ['Manager', 'Access Responsible', 'Automatic'], 'Summary': 3}, 
    'WF:ACAA-CR-AccResp (auto)': {'Workflow': ['Manager', 'Access Responsible', 'Automatic'], 'Summary': 3}}
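
Note that this output maps each application directly to a dictionary, whereas the question wraps that dictionary in a one-element list. If the wrapped shape matters, the loop can be adjusted slightly. The following is a sketch only, written in plain Python so it runs without pandas; the `rows` list stands in for the data frame and is an assumption of this example.

```python
# Rows as (Application, WorkflowStep) pairs; with a DataFrame, the same pairs
# could come from zip(df["Application"], df["WorkflowStep"]).
rows = [
    ("WF:ACAA-CR (auto)", "Manager"),
    ("WF:ACAA-CR (auto)", "Access Responsible"),
    ("WF:ACAA-CR (auto)", "Automatic"),
    ("WF:ACAA-CR-IT-AccResp[AUTO]", "Group"),
    ("WF:ACAA-CR-IT-AccResp[AUTO]", "Access Responsible"),
    ("WF:ACAA-CR-IT-AccResp[AUTO]", "Automatic"),
]

nested = {}
for application, step in rows:
    # setdefault returns the list already stored for this application, or
    # stores and returns the new one-element list the first time the key is seen.
    entry = nested.setdefault(application, [{"Workflow": [], "Summary": 0}])[0]
    entry["Workflow"].append(step)
    # Keep the count in sync with the list of steps collected so far.
    entry["Summary"] = len(entry["Workflow"])

print(nested)
```

Because `setdefault` hands back the same list on every later row, each step is appended to the one existing sub-dictionary instead of creating a new one per row, which is the behaviour the question was missing.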
    

    【Comments】:
