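【Posted】: 2020-04-05 00:30:26
【Question】:
I want to extract all the words between specific keywords in a .txt file. There is one start keyword, PROC SQL; (which I need to be case-insensitive), and the end keyword can be RUN;, quit; or QUIT;. Here is my sample .txt file.
Here is my code so far: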
import re

with open('lan sample text file1.txt') as file:
    text = file.read()

regex = re.compile(r'(PROC SQL;|proc sql;(.*?)RUN;|quit;|QUIT;)')
k = regex.findall(text)
print(k)
Output:
[('quit;', ''), ('quit;', ''), ('PROC SQL;', '')]
However, my expected output is the text between the keywords:
proc sql; ("TRUuuuth");
hhhjhfjs as fdsjfsj:
select * from djfkjd to jfkjs
(
SELECT abc AS abc1, abc_2_ AS efg, abc_fg, fkdkfj_vv, jjsflkl_ff, fjkdsf_jfkj
FROM &xxx..xxx_xxx_xxE
where ((xxx(xx_ix as format 'xxxx-xx') gff &jfjfsj_jfjfj.) and
(xxx(xx_ix as format 'xxxx-xx') lec &jgjsd_vnv.))
);
1)
jjjjjj;
select xx("xE'", PUT(xx.xxxx.),"'") jdfjhf:jhfjj from xxxx_x_xx_L ;
quit;
PROC SQL; ("CUuuiiiiuth");
hhhjhfjs as fdsjfsj:
select * from djfkjd to jfkjs
(SELECT abc AS abc1, abc_2_ AS efg, abc_fg, fkdkfj_vv, jjsflkl_ff, fjkdsf_jfkj
FROM &xxx..xxx_xxx_xxE
where ((xxx(xx_ix as format 'xxxx-xx') gff &jfjfsj_jfjfj.) and
(xxx(xx_ix as format 'xxxx-xx') lec &jgjsd_vnv.))(( ))
);
2)(
RUN;
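Any suggestions or different approaches to this problem would be greatly appreciated!
However, is there a way to separate the lines so that it looks like this?:
【Comments】: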
-
Consider using a regular expression
-
You forgot the re.DOTALL flag
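To expand on the re.DOTALL comment: besides the missing flag, the `|` in the question's pattern splits the whole group into four separate alternatives (`PROC SQL;`, `proc sql;(.*?)RUN;`, `quit;`, `QUIT;`), so `(.*?)` only ever belongs to the second one. Below is a minimal sketch of one possible fix, not necessarily what the commenters had in mind. The `text` string is a made-up stand-in for the file contents and `extract_blocks` is a hypothetical helper name. It assumes the end keywords are exactly `run;` or `quit;` in any letter case.

```python
import re

# Hypothetical sample standing in for the contents of the .txt file.
text = """data a; set b; run;
proc sql; ("first");
select * from t1;
quit;
PROC SQL; ("second");
select * from t2;
RUN;
"""

# Start keyword: "proc sql;" in any case. End keyword: "run;" or "quit;".
# re.DOTALL lets "." match newlines, so one match can span several lines.
# re.IGNORECASE covers PROC/proc, RUN/run and QUIT/quit.
# The end keywords sit in their own non-capturing group, so the "|" no
# longer splits the whole pattern into unrelated alternatives.
pattern = re.compile(r'proc sql;.*?(?:run;|quit;)', re.DOTALL | re.IGNORECASE)


def extract_blocks(source):
    """Return every 'proc sql; ... run;/quit;' block found in source."""
    return pattern.findall(source)


blocks = extract_blocks(text)
for block in blocks:
    print(block)
    print('---')  # visual separator between blocks
```

Because the pattern has no capturing groups, `findall` returns each complete block as a string, keywords included. The non-greedy `.*?` stops at the first end keyword after each `proc sql;`, so a `quit;` or `run;` that appears inside a block's body would end the match early.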