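【Posted】: 2020-07-17 17:56:02
【Question】:
I have been trying to extract a SQL query from a multi-line text, but I keep getting the wrong output.
How can I get the text between single or triple quotes?
Note: anything may appear before and after the first complete pair of quotes ('', "", """""", ''''''); I am only interested in the first piece of text found between quotes.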
import re
cell_text = """\
#%%sql
q = \"\"\"
select
name, breed, sum(weight) over (partition by breed order by name) as running_total_weight
from cats
order by breed, name
\"\"\"
f(q)
"""
print(cell_text)
My attempt:
pat = """.*select(.*)['"].*"""
out = re.findall(pat,cell_text,flags=re.M)[0]
sql = 'select ' + out
print(sql)
# I am getting empty outputs for re.findall instead of text there.
Required output:
input
----
#%%sql
q = """
select
name, breed, sum(weight) over (partition by breed order by name) as running_total_weight
from cats
order by breed, name
"""
f(q)
output
------
select
name, breed, sum(weight) over (partition by breed order by name) as running_total_weight
from cats
order by breed, name
input
-----
#%%sql
q = "select * from cats;"
f(q)
output
-------
select * from cats;
input
-----
q = 'select * from cats limit 2'
output
------
select * from cats limit 2
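【Comments】:
- I think the problem is that `.*` ends up consuming the quote, so the rest of the pattern cannot match; using `[^"]` might be relevant?
- 1) You need the DOTALL flag. 2) You forgot to match the quotes. 3) You don't need `select` or those three `.*`.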
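Putting the suggestions from the comments together (use the DOTALL flag, match the quotes themselves, drop the literal `select` and the surrounding `.*`), a minimal sketch might look like the following. The names `QUOTED` and `first_quoted` are made up for illustration, and the code assumes that the first quote character in the text really opens the string of interest and that the quoted text does not contain its own delimiter.

```python
import re

# Opening delimiter: a triple quote is tried before a single one, so that
# """ is not read as an empty "" followed by a stray ".
# (.*?) is non-greedy, so it stops at the first matching closing delimiter.
# \1 requires the closing delimiter to equal the opening one.
# re.DOTALL lets "." match newlines, which multi-line queries need.
QUOTED = re.compile(r"""('{3}|"{3}|'|")(.*?)\1""", re.DOTALL)


def first_quoted(text):
    """Return the first quoted text in `text` (stripped), or None."""
    match = QUOTED.search(text)
    return match.group(2).strip() if match else None


cell_text = '''#%%sql
q = """
select
name, breed, sum(weight) over (partition by breed order by name) as running_total_weight
from cats
order by breed, name
"""
f(q)
'''

print(first_quoted(cell_text))
print(first_quoted('#%%sql\nq = "select * from cats;"\nf(q)'))
print(first_quoted("q = 'select * from cats limit 2'"))
```

`re.search` is used rather than `re.findall` because only the first match is wanted, and it returns `None` instead of raising when nothing is quoted. This approach would be misled by an apostrophe or quote that appears before the real string, for example inside a comment, so it is only a starting point.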