【Question Title】: How to convert a dataframe to multi-layer JSON in Python?
【Posted】: 2019-01-03 23:34:39
【Question】:

I have a dataframe that looks like this.

Supervisor-L3   Supervisor-L2   Supervisor-L1   Employee
    O               M                J            A
    O               M                J            B
    O               M                J            C
    O               M                K            D
    O               N                K            E
    O               N                K            F
    O               N                L            G
    O               N                L            H
    O               N                L            I

I want to convert the dataframe to a JSON file in order to build an organization chart. However, when I use the pandas to_json function, the output is:

 {"Supervisor-L3":{"0":"O","1":"O","2":"O","3":"O","4":"O","5":"O","6":"O","7":"O","8":"O"},"Supervisor-L2":{"0":"M","1":"M","2":"M","3":"M","4":"N","5":"N","6":"N","7":"N","8":"N"},"Supervisor-L1":{"0":"J","1":"J","2":"J","3":"K","4":"K","5":"K","6":"L","7":"L","8":"L"},"Name":{"0":"A","1":"B","2":"C","3":"D","4":"E","5":"F","6":"G","7":"H","8":"I"}}

I need a JSON file that describes the hierarchical relationships among the people in the dataset. Can anyone help me? Thanks!

[Image: relationships]

【Comments】:

    Tags: python json python-3.x pandas dataframe


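    【Solution 1】:

    You can use networkx: first reshape the dataframe into a long edge list of parent/child pairs, then build a graph from that edge list.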

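    import pandas as pd
    import networkx as nx
    from networkx.readwrite import json_graph

    # Stack each pair of adjacent columns (parent, child) into one long edge list
    data = pd.concat([pd.DataFrame(df.iloc[:,i:i+2].values, columns=['P','C']) for i in range(3)], ignore_index=True)

    # Build an undirected graph from the parent/child pairs
    G = nx.from_pandas_edgelist(data, 'P','C')

    # Convert the graph to a JSON-serializable node-link dict
    txtgraph = json_graph.node_link_data(G)

    txtgraph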
    

    Output:

    {'directed': False,
     'graph': {},
     'links': [{'source': 'O', 'target': 'M'},
      {'source': 'O', 'target': 'N'},
      {'source': 'M', 'target': 'J'},
      {'source': 'M', 'target': 'K'},
      {'source': 'N', 'target': 'K'},
      {'source': 'N', 'target': 'L'},
      {'source': 'J', 'target': 'A'},
      {'source': 'J', 'target': 'B'},
      {'source': 'J', 'target': 'C'},
      {'source': 'K', 'target': 'D'},
      {'source': 'K', 'target': 'E'},
      {'source': 'K', 'target': 'F'},
      {'source': 'L', 'target': 'G'},
      {'source': 'L', 'target': 'H'},
      {'source': 'L', 'target': 'I'}],
     'multigraph': False,
     'nodes': [{'id': 'O'},
      {'id': 'M'},
      {'id': 'N'},
      {'id': 'J'},
      {'id': 'K'},
      {'id': 'L'},
      {'id': 'A'},
      {'id': 'B'},
      {'id': 'C'},
      {'id': 'D'},
      {'id': 'E'},
      {'id': 'F'},
      {'id': 'G'},
      {'id': 'H'},
      {'id': 'I'}]}
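The node-link result is a plain Python dict, so the standard json module can serialize it. The following is a minimal sketch, assuming a shortened excerpt of the output above rather than the full graph:

```python
import json

# Shortened excerpt of the node-link dict shown above (assumption: not the full graph).
txtgraph = {
    "directed": False,
    "multigraph": False,
    "graph": {},
    "nodes": [{"id": "O"}, {"id": "M"}, {"id": "N"}],
    "links": [{"source": "O", "target": "M"}, {"source": "O", "target": "N"}],
}

# Serialize to a JSON string, then parse it back to confirm nothing is lost.
json_text = json.dumps(txtgraph, indent=2)
restored = json.loads(json_text)
```

To write a file instead of a string, json.dump(txtgraph, f, indent=2) inside a with open("org_chart.json", "w") as f: block should work; the file name org_chart.json is only a placeholder.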
    

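    【Comments】:

    • @Scott Boston Thanks. However, in the dataframe one employee may have more than one supervisor. For example, K has two bosses: some of K's employees work under M and the others under N. If I convert the dataframe to an edge list, I lose this information. Could you tell me how to solve this? Thanks!
    【Solution 2】: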
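One possible way to keep that information in an edge list is to label each node with its full path from the root, so that K under M and K under N become two different nodes. This is only a sketch: the rows list is copied from the question's table, and path_edges and the "/" separator are names made up for illustration.

```python
# Rows copied from the question's table: (L3, L2, L1, employee).
rows = [
    ("O", "M", "J", "A"), ("O", "M", "J", "B"), ("O", "M", "J", "C"),
    ("O", "M", "K", "D"), ("O", "N", "K", "E"), ("O", "N", "K", "F"),
    ("O", "N", "L", "G"), ("O", "N", "L", "H"), ("O", "N", "L", "I"),
]

def path_edges(rows, sep="/"):
    """Return sorted (parent, child) edges whose node labels are full paths."""
    edges = set()
    for row in rows:
        for depth in range(1, len(row)):
            parent = sep.join(row[:depth])       # e.g. "O/M"
            child = sep.join(row[:depth + 1])    # e.g. "O/M/K"
            edges.add((parent, child))
    return sorted(edges)

edges = path_edges(rows)
```

With path labels, ("O/M", "O/M/K") and ("O/N", "O/N/K") are separate edges, so the employees of each K stay attached to the right branch. The separator assumes that no name contains "/".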

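    I renamed "Supervisor-L3" to "Supervisor", "Supervisor-L2" to "Team Leader", and "Supervisor-L1" to "Company". Because a company may belong to more than one team leader, I wrote three nested loops to build a JSON structure that describes the relationships.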

    a = {'name':'O',
     'Subordinate':[]}
    
    ##merge these columns to have a one-to-one mapping
    df['merge'] = df['Team Leader']+','+df['Company']
    df['merge2'] =  df['Team Leader']+','+df['Company'] +','+df['Name']
    
    
    ##get the list of unique elements
    set1 = list(set(df['Supervisor']))
    set2 = list(set(df['Team Leader']))
    set3 = list(set(df['merge']))
    set4 = list(set(df['merge2']))
    
    ## build the nested dict level by level: team leader -> company -> employee
    for i in range(len(set2)):
        temp_dict1 = {'name':set2[i],
                 'Subordinate':[]}
        a['Subordinate'].append(temp_dict1)
        m = -1
        for j in range(len(set3)):
            list1 = set3[j].split(",")
            if set2[i] == list1[0]:
                temp_dict2 = {'name':list1[1],
                     'Subordinate':[]}
                a['Subordinate'][i]['Subordinate'].append(temp_dict2)
                m += 1
                for k in range(len(set4)):
                    list2 = set4[k].split(",")
                    if (list1[0] == list2[0]) and (list1[1] == list2[1]):
                        temp_dict3 = {'name':list2[2]}
                        a['Subordinate'][i]['Subordinate'][m]['Subordinate'].append(temp_dict3)
    

    Output:

    Out[86]: 
    {'Subordinate': [{'Subordinate': [{'Subordinate': [{'name': 'F'},
          {'name': 'E'}],
         'name': 'K'},
        {'Subordinate': [{'name': 'I'}, {'name': 'H'}, {'name': 'G'}],
         'name': 'L'}],
       'name': 'N'},
      {'Subordinate': [{'Subordinate': [{'name': 'D'}], 'name': 'K'},
        {'Subordinate': [{'name': 'B'}, {'name': 'A'}, {'name': 'C'}],
         'name': 'J'}],
       'name': 'M'}],
     'name': 'O'}     
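The three loops above can also be written as one recursive grouping step that works for any number of levels. The sketch below is not the answer author's code: build_tree is a hypothetical helper, and rows is assumed to hold the dataframe's rows as tuples (for example from df.itertuples(index=False)), copied here from the question's table.

```python
import json

# Rows copied from the question's table: (supervisor, team leader, company, name).
rows = [
    ("O", "M", "J", "A"), ("O", "M", "J", "B"), ("O", "M", "J", "C"),
    ("O", "M", "K", "D"), ("O", "N", "K", "E"), ("O", "N", "K", "F"),
    ("O", "N", "L", "G"), ("O", "N", "L", "H"), ("O", "N", "L", "I"),
]

def build_tree(rows, depth=0):
    """Group rows by the value at `depth` and recurse into each group."""
    groups = {}
    for row in rows:
        # dicts preserve insertion order, so siblings keep their row order
        groups.setdefault(row[depth], []).append(row)
    nodes = []
    for name, members in groups.items():
        node = {"name": name}
        if depth + 1 < len(members[0]):  # leaves get no 'Subordinate' key
            node["Subordinate"] = build_tree(members, depth + 1)
        nodes.append(node)
    return nodes

tree = build_tree(rows)[0]  # the data has a single root, 'O'
tree_json = json.dumps(tree, indent=2)
```

Because each group is built only from the rows beneath its own parent, K appears once under M (with D) and once under N (with E and F), matching the nested output above, and sibling order follows the row order instead of depending on set ordering.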
    

    【Comments】:
