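【Posted】: 2019-11-03 22:49:32
【Question】:
I am trying to extract a URL from rows that each list multiple URLs.
Specifically, I want to select the first instance of twitter.com/dog_rates/xxxxxxx in each row and drop the remaining data.
Sample of the text to extract from
Input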
1. twitter.com/dog_rates/status/892420643555336193/photo/1 (desired version)
2. www.gofundme.com/3yd6y1c,twitter.com/dog_rates/status/878281511006478336/photo/1
3. m.facebook.com/story.php?story_fbid=1888712391349242&id=1506300642923754&refsrc=ht.co%2FURVffYPPjY&_rdr,twitter.com/dog_rates/status/812503143955202048/photo/1,twitter.com/dog_rates/status/812503143955202048/photo/1
4. www.gofundme.com/sams-smile,twitter.com/dog_rates/status/810984652412424192/photo/1,twitter.com/dog_rates/status/709901256215666688/photo/1,twitter.com/dog_rates/status/709901256215666688/photo/1,twitter.com/dog_rates/status/709901256215666688/photo/1,twitter.com/dog_rates/status/709901256215666688/photo/1
5. twitter.com/dog_rates/status/888804989199671297/photo/1,twitter.com/dog_rates/status/888804989199671297/photo/1
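I tried extracting the URL with slicing, but ran into trouble because the rows hold different numbers of URLs, with varying lengths and delimiter positions.
Expected result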
twitter.com/dog_rates/status/892420643555336193/photo/1
twitter.com/dog_rates/status/878281511006478336/photo/1
twitter.com/dog_rates/status/812503143955202048/photo/1
twitter.com/dog_rates/status/810984652412424192/photo/1
twitter.com/dog_rates/status/888804989199671297/photo/1
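One possible way to get this result is a regular expression rather than slicing, since a pattern does not depend on URL length or position. The following is only a minimal sketch: it assumes each URL ends at the next comma or whitespace (the sample rows suggest this, but the post does not confirm it), and the names `DOG_RATES_URL` and `first_dog_rates_url` are hypothetical.

```python
import re

# Assumption: a URL ends at the next comma or whitespace character.
DOG_RATES_URL = re.compile(r"twitter\.com/dog_rates/[^,\s]+")


def first_dog_rates_url(line):
    """Return the first twitter.com/dog_rates/... URL in line, or None."""
    match = DOG_RATES_URL.search(line)
    return match.group(0) if match else None


# A few of the sample rows from the question.
rows = [
    "twitter.com/dog_rates/status/892420643555336193/photo/1",
    "www.gofundme.com/3yd6y1c,twitter.com/dog_rates/status/878281511006478336/photo/1",
    "www.gofundme.com/sams-smile,twitter.com/dog_rates/status/810984652412424192/photo/1,"
    "twitter.com/dog_rates/status/709901256215666688/photo/1",
]

for row in rows:
    print(first_dog_rates_url(row))
```

If the rows live in a pandas column, the same pattern wrapped in a capture group could be passed to `Series.str.extract`, for example `df["expanded_urls"].str.extract(r"(twitter\.com/dog_rates/[^,\s]+)", expand=False)`, where `df` and the column name `expanded_urls` are assumptions, not names given in the question.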
【Comments】:
- How do you decide where the desired URL ends? At a comma?
Tags: python string pandas extract