【Question title】: Build a structure using a pandas DataFrame
【Posted】: 2023-02-09 15:27:02
【Description】:

Input data:

import pandas as pd
import numpy as np

a1=["data.country", "data.studentinfo.city","data.studentinfo.name.id.grant"]
a2=["StringType()","StringType()","StringType()"]
d1=pd.DataFrame(list(zip(a1,a2)),columns=['action','type'])

Using a for loop over the DataFrame, we need to build the following structure:

StructType([
    StructField("data", StructType([
        StructField("country", StringType(), True),
        StructField("studentinfo", StructType([
            StructField("city", StringType(), True),
            StructField("name", StructType([
                StructField("id", StructType([
                    StructField("grant", StringType(), True)
                ]))
            ]))
        ]))
    ]))
])

【Question comments】:

    Tags: python python-3.x pandas dataframe


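    【Solution 1】:

    The first stage builds a nested dictionary from the dotted paths in the 'action' column. A recursive function then converts that dictionary into the required StructType text: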

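    # Stage 1: build a nested dict from the dotted paths in d1['action'].
    s = dict()
    for _, r in d1.iterrows():
      d = s
      fields = r['action'].split('.')
      # Walk (and create as needed) every parent level of the path.
      for name in fields[:-1]:
        if name not in d:
          d[name] = dict()
        d = d[name]
      # The last path component is the leaf; store its type string.
      d[fields[-1]] = r['type']
    
    # Stage 2: recursively render the nested dict as StructType source text.
    def sprint(n):
      children = list()
      for k, v in n.items():
        entry = f'StructField("{k}",'
        if isinstance(v, dict):
          # Nested struct: recurse, then close the enclosing StructField.
          entry += sprint(v) + ')'
        else:
          entry += f'{v},True)'
        children.append(entry)
      return f'StructType([{",".join(children)}])'
    
    print(sprint(s))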
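
    To make the intermediate dictionary easier to inspect, the same two stages can be sketched without pandas, using the question's two lists directly. The helper names build_tree and render are hypothetical (they do not appear in the original answer). The sketch assumes that no path is both a leaf and a parent of another path.

```python
# Input taken from the question, as plain lists so the sketch runs without pandas.
paths = ["data.country", "data.studentinfo.city", "data.studentinfo.name.id.grant"]
types = ["StringType()", "StringType()", "StringType()"]


def build_tree(paths, types):
    """Turn dotted paths into a nested dict whose leaves are type strings.

    Hypothetical helper; assumes no path is a prefix of another path.
    """
    tree = {}
    for path, type_name in zip(paths, types):
        *parents, leaf = path.split(".")
        node = tree
        for name in parents:
            # Reuse the existing level or create an empty one.
            node = node.setdefault(name, {})
        node[leaf] = type_name
    return tree


def render(node):
    """Recursively render the nested dict as StructType source text."""
    fields = []
    for name, value in node.items():
        if isinstance(value, dict):
            # Nested struct: nullable is left at its default.
            fields.append(f'StructField("{name}",{render(value)})')
        else:
            fields.append(f'StructField("{name}",{value},True)')
    return f'StructType([{",".join(fields)}])'


tree = build_tree(paths, types)
rendered = render(tree)
print(tree)
print(rendered)
```

    The first print shows the nested dictionary, with 'data' at the top and 'grant' four levels down. The second prints a single-line StructType expression. That string is only text. Turning it into a real schema object would require evaluating it with StructType, StructField and StringType from pyspark.sql.types in scope, a step this sketch leaves out.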
    

    【Comments】:
