【Posted】:2021-09-21 05:34:33
【Question】:
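I am currently trying to split a string that contains an entire text document into sentences so that I can convert it to a csv. Naturally, I would use the period as the delimiter and call str.split('.'). However, the document contains the abbreviations "i.e." and "e.g.", and in those cases I want the period to be ignored.
For example,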
Original text: During this time, it became apparent that vanilla shortest-path routing would be insufficient to handle the myriad operational, economic, and political factors involved in routing. ISPs began to modify routing configurations to support routing policies, i.e. goals held by the router’s owner that controlled which routes were chosen and which routes were propagated to neighbors.
Resulting list: ["During this time, it became apparent that vanilla shortest-path routing would be insufficient to handle the myriad operational, economic, and political factors involved in routing", "ISPs began to modify routing configurations to support routing policies, i.e. goals held by the router’s owner that controlled which routes were chosen and which routes were propagated to neighbors."]
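So far, my only workaround has been to replace every 'i.e.' and 'e.g.' with 'ie' and 'eg', which is both inefficient and grammatically incorrect. I have been tinkering with Python's regex library, and I suspect it can give me the answer I want, but my knowledge of it is novice at best.
This is my first time posting a question here, so I apologize if I have used incorrect formatting or wording.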
【Comments】:
-
This function may work for you. Example: ibb.co/FB4GX2m
-
That is a cool toy example. There is a whole field devoted to this problem, and it is not regular expressions.
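-
For the narrow case described in the question, a regex split with negative lookbehinds may be enough. The following is only a minimal sketch, not the function from the linked example: the helper name `split_sentences` and the sample text are made up for illustration, and it assumes that "i.e." and "e.g." are the only abbreviations that need protecting and that every real sentence boundary is a period followed by whitespace.

```python
import re

# Split on a period followed by whitespace, unless that period is the one
# closing "i.e." or "e.g." (Python lookbehinds must be fixed-width, so each
# abbreviation gets its own assertion).
_SENTENCE_END = re.compile(r"(?<!i\.e)(?<!e\.g)\.\s+")

def split_sentences(text):
    """Hypothetical helper: return the sentences found in text."""
    parts = _SENTENCE_END.split(text.strip())
    return [p.strip() for p in parts if p.strip()]

# Made-up sample text, not the document from the question.
sample = (
    "Routing grew complex. ISPs began to support routing policies, "
    "i.e. goals held by the router's owner. Some used filters, e.g. prefix lists."
)
sentences = split_sentences(sample)
for s in sentences:
    print(s)
```

As in the list shown in the question, the separating periods are consumed by the split and only the final sentence keeps its period. Other abbreviations such as "Dr." or "etc." would still be split, and a sentence that genuinely ends in "i.e." or "e.g." would be merged with the next one, which is where the dedicated sentence-segmentation tools mentioned above tend to do better.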