【Question title】: Is there any function to parse a complete SQL query in Python?
【Posted】: 2019-11-02 09:39:34
【Question】:

I am working with PostgreSQL queries. I want to extract all the information from a SQL query, for example:

sql = " select d_year, s_nation, p_category, sum(lo_revenue - lo_supplycost) as profit from DATES, CUSTOMER, SUPPLIER, PART, LINEORDER where lo_custkey  =  c_custkey and lo_suppkey  =  s_suppkey and lo_partkey  =  p_partkey and lo_orderdate  =  d_datekey and c_region  =  'AFRICA' and s_region  =  'AFRICA' and (d_year  =  1996 or d_year  =  1997) and (p_mfgr  =  'MFGR#2' or p_mfgr  =  'MFGR#4') group by d_year, s_nation, p_category order by d_year, s_nation, p_category "

I want to get all the tables involved, all selection predicates, all join predicates, the GROUP BY part, and the ORDER BY part.

I used sqlparse and found a way to get only the tables involved. Is there an example of how to extract the rest of this information?

【Comments】:

  • You can use ANTLR to parse the SQL statement and extract the AST.

Tags: python sql postgresql text-extraction sql-parser


【Solution 1】:

This algorithm gives the exact elements between each pair of keywords. I use sqlparse.

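import sqlparse
from sqlparse.sql import Comparison, Identifier, IdentifierList, Parenthesis, Where
from sqlparse.tokens import DML, Keyword

# `sql` is the query string from the question.
parsed = sqlparse.parse(sql)
stmt = parsed[0]
# Each flag records which clause the loop is currently inside.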
from_seen = False
select_seen = False
where_seen = False
groupby_seen = False
orderby_seen = False

for token in stmt.tokens:
    if select_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("Attr = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("Attr = ", token))
    if from_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("TAB = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("TAB = ", token))
    if orderby_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("ORDERBY att = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("ORDERBY att = ", token))
    if groupby_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("GROUPBY att = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("GROUPBY att = ", token))

    if isinstance(token, Where):
        select_seen = False
        from_seen = False
        where_seen = True
        groupby_seen = False
        orderby_seen = False
        for where_tokens in token:
            if isinstance(where_tokens, Comparison):
                print("{} {}\n".format("Comparison = ", where_tokens))
            elif isinstance(where_tokens, Parenthesis):
                print("{} {}\n".format("Parenthesis = ", where_tokens))
    if token.ttype is Keyword and token.value.upper() == "GROUP BY":
        select_seen = False
        from_seen = False
        where_seen = False
        groupby_seen = True
        orderby_seen = False
    if token.ttype is Keyword and token.value.upper() == "ORDER BY":
        select_seen = False
        from_seen = False
        where_seen = False
        groupby_seen = False
        orderby_seen = True
    if token.ttype is Keyword and token.value.upper() == "FROM":
        select_seen = False
        from_seen = True
        where_seen = False
        groupby_seen = False
        orderby_seen = False
    if token.ttype is DML and token.value.upper() == "SELECT":
        select_seen = True
        from_seen = False
        where_seen = False
        groupby_seen = False
        orderby_seen = False
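
The loop above prints every comparison from the WHERE clause but does not separate join predicates from selection predicates, which the question also asks for. Below is a minimal sketch of one way to do that on the printed text (that is, on str(where_tokens)). It is not part of the original answer. The helper names classify_predicate and split_group are hypothetical, and the sketch assumes simple comparisons of the form `a <op> b`, where a literal is a single-quoted string or a plain number and no literal contains " and " or " or ".

```python
import re

# Comparison operators, longest first so "<=" is not read as "<".
_OPERATOR = re.compile(r"\s*(<=|>=|<>|!=|=|<|>)\s*")
# Assumption: a literal is a single-quoted string or a plain number.
_LITERAL = re.compile(r"'.*'|-?\d+(?:\.\d+)?")


def classify_predicate(predicate):
    """Return 'join' if both sides look like column names, else 'selection'."""
    parts = _OPERATOR.split(predicate.strip(), maxsplit=1)
    if len(parts) != 3:
        raise ValueError("not a simple comparison: {!r}".format(predicate))
    left, _op, right = parts
    if _LITERAL.fullmatch(left) or _LITERAL.fullmatch(right):
        return "selection"
    return "join"


def split_group(group):
    """Split a parenthesized group such as '(a = 1 or a = 2)' into comparisons."""
    inner = group.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    pieces = re.split(r"\s+(?:and|or)\s+", inner, flags=re.IGNORECASE)
    return [piece.strip() for piece in pieces]


# Example with predicates taken from the query in the question.
for text in ["lo_custkey  =  c_custkey", "c_region  =  'AFRICA'"]:
    print(text, "->", classify_predicate(text))
for text in split_group("(d_year  =  1996 or d_year  =  1997)"):
    print(text, "->", classify_predicate(text))
```

Working on strings keeps the sketch independent of the parser, but it is fragile for nested parentheses or expressions on either side of the operator. With sqlparse it would probably be more robust to inspect the two sides of each Comparison token directly.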

【Comments】:

  • In `if isinstance(token, Where):`, what is Where here?
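  • @RakeshV It is a class, used here in an isinstance check. In an IDE, "Where" is shown in a different color. The same applies to isinstance(token, Identifier) and isinstance(token, IdentifierList) in this program.
  • @RakeshV It comes from sqlparse.sql.Where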