【Question title】:Parsing SQL query joins with Python
【Posted】:2021-06-17 11:32:13
【Question】:

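I am trying to parse SQL queries. I am using [moz-sql-parser][1] to identify the SQL parts of a query, and then writing functions to extract the table names and the columns used in the joins.

Here is a sample query: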

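import json
from moz_sql_parser import parse

# Round-trip through JSON to turn the parser's result into plain dicts and lists
join_query2 = json.dumps(parse('''select * from tbl1 d
inner join jointbl1 c
on d.visit_id = c.session_id
inner join jointbl2 b
on b.sv_id = c.sv_id'''))

join_query2 = json.loads(join_query2)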

When run through moz-sql-parser, this produces:

    {'select': '*',
     'from': [{'value': 'tbl1', 'name': 'd'},
              {'inner join': {'name': 'c', 'value': 'jointbl1'},
               'on': {'eq': ['d.visit_id', 'c.session_id']}},
              {'inner join': {'name': 'b', 'value': 'jointbl2'},
               'on': {'eq': ['b.sv_id', 'c.sv_id']}}]}
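Each entry of `from` is either the main table or a dict holding one join key plus an `on` condition. The sketch below walks a dict shaped like the output above and yields one tuple per entry. The helper name `iter_from_entries` is hypothetical, and two things are assumptions rather than documented guarantees: that every join entry has exactly one key ending in `join`, and that a table without an alias appears as a bare string.

```python
def iter_from_entries(result):
    """Yield (join_type, table, alias, on) for each entry of result['from'].

    join_type and on are None for the main table. Assumption: a join entry
    has exactly one key ending in 'join' ('join', 'inner join', 'left join').
    """
    entries = result['from']
    if not isinstance(entries, list):  # assumed: a single table is not wrapped in a list
        entries = [entries]
    for entry in entries:
        if isinstance(entry, str):  # assumed: a table without an alias is a bare string
            yield None, entry, entry, None
            continue
        join_type = next((key for key in entry if key.endswith('join')), None)
        spec = entry[join_type] if join_type else entry
        if isinstance(spec, str):  # joined table without an alias
            table, alias = spec, spec
        else:
            table = spec['value']
            alias = spec.get('name', table)  # fall back to the table name
        yield join_type, table, alias, entry.get('on')


parsed = {
    'select': '*',
    'from': [
        {'value': 'tbl1', 'name': 'd'},
        {'inner join': {'name': 'c', 'value': 'jointbl1'},
         'on': {'eq': ['d.visit_id', 'c.session_id']}},
        {'inner join': {'name': 'b', 'value': 'jointbl2'},
         'on': {'eq': ['b.sv_id', 'c.sv_id']}},
    ],
}

entries = list(iter_from_entries(parsed))
for join_type, table, alias, on in entries:
    print(join_type, table, alias, on)
```

Matching on the key suffix avoids adding a separate branch for every join flavour, at the cost of relying on the parser's key naming.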

So far I have written functions that extract the table names and the column names:

def parse_table_names_v2(result):
    """Return a flat list [alias, table, alias, table, ...] for the FROM clause."""
    the_list = []
    for x in result['from']:
        if not isinstance(x, dict):
            continue
        # The main table is the entry itself; joined tables sit under a join key.
        if 'value' in x:
            table = x
        elif 'join' in x:
            table = x['join']
        elif 'inner join' in x:
            table = x['inner join']
        else:
            continue
        # Only append when a table name is present, so no None ends up in the list.
        if isinstance(table, dict) and 'value' in table:
            if 'name' in table:
                the_list.append(table['name'])
            the_list.append(table['value'])
    return the_list
    

def parse_column_names(result):
    """Return the [left, right] column pair of every equality in the ON clauses."""
    columns = []
    for x in result['from']:
        if not isinstance(x, dict) or 'on' not in x:
            continue
        on = x['on']
        # A compound condition is {'and': [cond, ...]}; a single one is just cond.
        conditions = on['and'] if 'and' in on else [on]
        for cond in conditions:
            if 'eq' in cond:
                columns.append(cond['eq'])
    return columns

They produce two lists, as shown below:

['d',
 'tbl1',
 'c',
 'jointbl1',
 'b',
 'jointbl2']

[['d.visit_id', 'c.session_id'], ['b.sv_id', 'c.sv_id']]

The tricky part is that the desired output looks like this:

Row1 -> tbl1 visit_id jointbl1 session_id
Row2 -> jointbl1 sv_id jointbl2 sv_id

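My goal is to parse similar queries and build the output into a DataFrame or list, but I am struggling to get the parsed result into this particular shape. Any pointers would be appreciated.

【Question comments】:

    Tags: python dataframe parsing sql-parser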


    【Solution 1】:

    Does this do what you want?

    tables = ['d',
     'tbl1',
     'c',
     'jointbl1',
     'b',
     'jointbl2']
    
    
    columns = [['d.visit_id', 'c.session_id'], ['b.sv_id', 'c.sv_id']]
    
    # Convert table list to a lookup table
    lookup_table = {}
    alias = ""
    tablename = ""
    for idx, item in enumerate(tables):
        if idx % 2 != 1:
            alias = item
        else:
            tablename = item
            lookup_table[alias] = tablename
    
    # Use the lookup table to build the new row format
    new_rows = []
    for row in columns:
        new_row = [] 
        for elem in row:
            item = elem.split('.')
            col_table = item[0]
            column = item[1]
            new_row.append(lookup_table[col_table])
            new_row.append(column)
        new_rows.append(new_row)
    
    for row in new_rows:
        print(" ".join(row))
    

    Output:

    tbl1 visit_id jointbl1 session_id
    jointbl2 sv_id jointbl1 sv_id
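Note that the second row comes out in the order the `on` clause was written (`b.sv_id = c.sv_id`), whereas the question asks for `jointbl1 sv_id jointbl2 sv_id`. A possible variant is sketched below, assuming the intended rule is that the table introduced earlier in the query comes first. The function name `build_join_rows` is hypothetical, and the sketch also assumes every table has an alias and every column is written as `alias.column`.

```python
tables = ['d', 'tbl1', 'c', 'jointbl1', 'b', 'jointbl2']
columns = [['d.visit_id', 'c.session_id'], ['b.sv_id', 'c.sv_id']]


def build_join_rows(tables, columns):
    """Return [left_table, left_col, right_table, right_col] per join condition.

    Assumes `tables` alternates alias, table name and that every column
    reference is qualified as 'alias.column'.
    """
    aliases = tables[::2]
    lookup = dict(zip(aliases, tables[1::2]))            # alias -> table name
    order = {alias: pos for pos, alias in enumerate(aliases)}
    rows = []
    for pair in columns:
        # Split each 'alias.column' once, then put the earlier-declared alias first.
        parts = sorted((ref.split('.', 1) for ref in pair),
                       key=lambda part: order[part[0]])
        row = []
        for alias, column in parts:
            row += [lookup[alias], column]
        rows.append(row)
    return rows


rows = build_join_rows(tables, columns)
for row in rows:
    print(' '.join(row))
```

Because `rows` is a list of equal-length lists, it can be passed to `pandas.DataFrame(rows, columns=[...])` with four column names of your choosing if a DataFrame is preferred.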
    

    【Comments】:
