Split column based on last digit found

【Question title】: Split column based on last digit found
【Posted】: 2020-07-14 05:46:58
【Question description】:

I have a dataframe with a column that contains an address followed by some text.

For example:

Address
123 Fake St, Boulder, CO 80304 Attached Dwelling/
345 Main St, Boulder, CO 80304 Vacant Land/Lots
456 Cool Dr, Erie, CO 80516 Attached Dwelling/Building

This is what I want to end up with:

Address                               Type
123 Fake St, Boulder, CO 80304        Attached Dwelling/
345 Main St, Boulder, CO 80304        Vacant Land/Lots
456 Cool Dr, Erie, CO 80516           Attached Dwelling/Building

I thought the following might work: use a regex to find the first digit, but searching from right to left. However, I get the error "ValueError: Columns must be the same length as key".

df[['Address', 'Type']] = df['Address'].str.rsplit('\d', n=1, expand=True)

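【Question comments】:

Don't you mean based on the last digit? If you split on the first digit, you will get 123 and then another column.
Thanks. Corrected the title :)
Just split on the whitespace to the left of which the last digit is found. See the answer below.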

Tags: python pandas

【Solution 1】:

If you want to use split, split on the whitespace that has five digits immediately to its left, and expand the result:

 df.Address.str.split(r'(?<=\d{5})\s+', expand=True)


                         0                           1
0  123 Fake St, Boulder, CO 80304          Attached Dwelling/
1  345 Main St, Boulder, CO 80304            Vacant Land/Lots
2     456 Cool Dr, Erie, CO 80516  Attached Dwelling/Building
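To get from that two-column result to the layout asked for in the question, the pieces can be assigned back to named columns. The following is only a sketch: the sample frame is rebuilt from the question's data, the variable name `parts` is made up for illustration, and it assumes every address ends in a five-digit ZIP code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Address": [
        "123 Fake St, Boulder, CO 80304 Attached Dwelling/",
        "345 Main St, Boulder, CO 80304 Vacant Land/Lots",
        "456 Cool Dr, Erie, CO 80516 Attached Dwelling/Building",
    ]
})

# Split once on the whitespace that is preceded by five digits
# (assumed here to be the ZIP code). A raw string keeps \d and \s intact.
# A multi-character pattern is treated as a regular expression by str.split.
parts = df["Address"].str.split(r"(?<=\d{5})\s+", n=1, expand=True)

# `parts` is a hypothetical intermediate name; its columns are 0 and 1.
df["Address"] = parts[0]
df["Type"] = parts[1]

print(df)
```

Note that `n=1` makes the split happen at the first run of five digits followed by whitespace, so a five-digit house number at the start of an address would be split in the wrong place. The sample data does not contain such a case.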

【Discussion】:

Did this help?

【Solution 2】:

Apparently there is a known issue that rsplit does not work with regular expressions (SO question, open issue).
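That would explain the error in the question: if the pattern is taken literally, no row contains a backslash followed by "d", nothing is split, and the result has a single column that cannot be assigned to two. The sketch below first shows that, then a possible workaround with `str.extract`. The workaround is not from the original answer; it assumes the last digit in each string belongs to the address and is followed by whitespace.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Address": [
        "123 Fake St, Boulder, CO 80304 Attached Dwelling/",
        "345 Main St, Boulder, CO 80304 Vacant Land/Lots",
        "456 Cool Dr, Erie, CO 80516 Attached Dwelling/Building",
    ]
})

# rsplit looks for the literal two characters "\d", finds none,
# and so returns a single column instead of two.
literal = df["Address"].str.rsplit(r"\d", n=1, expand=True)
print(literal.shape)

# Assumed workaround: the greedy ".*\d" runs up to the last digit that is
# followed by whitespace; everything after that whitespace becomes Type.
# Named groups become the column names of the resulting frame.
extracted = df["Address"].str.extract(r"^(?P<Address>.*\d)\s+(?P<Type>.*)$")
print(extracted)
```

If the trailing text can itself contain a digit followed by a space, this pattern would cut in the wrong place, so the lookbehind split from Solution 1 may be the safer choice for such data.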

【Discussion】: