Splitting a Column into two columns leaving blanks (answers)

【Question title】: Splitting a Column into two columns leaving blanks
【Posted】: 2020-10-28 04:07:03
【Question description】:

I have a DataFrame in pandas formatted like this.

(df)
School ID      Column 1 
School 1       AD6000         
School 2       3000TO4000      
School 3       5000TO6000      
School 4       AC2000         
School 5       BB3300        
School 6       9000TO9900      
....

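What I want to do is take the rows where Column 1 contains the word "TO" as a delimiter and split them into two new columns, while keeping the original column. The result would look like this.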

(df)
School ID      Column 1          Column 2     Column 3
School 1       AD6000            NaN          NaN
School 2       3000TO4000        3000         4000
School 3       5000TO6000        5000         6000
School 4       AC2000            NaN          NaN
School 5       BB3300            NaN          NaN
School 6       9000TO9900        9000         9900
....

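This is the code I thought would work, but it leaves Column 2 and Column 3 blank instead of putting the numbers to the left and right of TO into their own columns.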

df[['Column 2','Column 3']] = df['Column 1'].str.extract(r'(\d+)TO(\d+)')

Thanks for your help.

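【Question comments】:

Tags: python pandas dataframe

【Solution 1】:

This happens because the right-hand side is a DataFrame with different column names (0, 1), and pandas cannot find Column 2 or Column 3 in that DataFrame.

You can pass the underlying NumPy array instead of the DataFrame: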

df[['Column 2','Column 3']] = df['Column 1'].str.extract(r'(\d+)TO(\d+)').values

Output:

  School ID    Column 1 Column 2 Column 3
0  School 1      AD6000      NaN      NaN
1  School 2  3000TO4000     3000     4000
2  School 3  5000TO6000     5000     6000
3  School 4      AC2000      NaN      NaN
4  School 5      BB3300      NaN      NaN
5  School 6  9000TO9900     9000     9900
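
For reference, here is a self-contained sketch of the approach above. The way `df` is built is an assumption (the question only shows the printed frame), and the exact behaviour of the plain DataFrame assignment may differ between pandas versions, so only the array-based assignment is shown.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (construction is assumed).
df = pd.DataFrame({
    "School ID": ["School 1", "School 2", "School 3",
                  "School 4", "School 5", "School 6"],
    "Column 1": ["AD6000", "3000TO4000", "5000TO6000",
                 "AC2000", "BB3300", "9000TO9900"],
})

# str.extract returns a DataFrame whose columns are labelled 0 and 1.
extracted = df["Column 1"].str.extract(r"(\d+)TO(\d+)")

# Assigning the raw array avoids any alignment on those 0/1 labels;
# values are placed by position into the two new columns.
df[["Column 2", "Column 3"]] = extracted.values

print(df)
```

Rows without a digits-TO-digits pattern get NaN in both new columns. Note that the extracted values are strings, so a numeric conversion (for example with `pd.to_numeric`) would be a separate step if numbers are needed.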

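【Comments】:

With this I get an error saying "None of [Index(['Column 1', 'Column 2'], dtype='object', name=3)] are in the [columns]"
This works on my system with pandas 1.1.5. It is also related to this question.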

【Solution 2】:

Use

new = df["Column 1"].str.split("TO", n=1, expand=True)

and give the resulting columns new names

df["Col1"] = new[0]
df["Col2"] = new[1]

【Comments】:

Again, what about entries like OTTO123?
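
To make that concern concrete, here is a small sketch comparing the split approach with a regex extract. The series `s` is made up for illustration, and anchoring the pattern with `^` and `$` is one possible tightening, not something taken from either answer.

```python
import pandas as pd

# Hypothetical sample: one real range, one plain code, and the
# "OTTO123" style entry mentioned in the comment.
s = pd.Series(["3000TO4000", "AD6000", "OTTO123"])

# Splitting on the literal "TO" cuts at its first occurrence, wherever it is.
parts = s.str.split("TO", n=1, expand=True)
# "AD6000"  keeps its full value in column 0 (not NaN), column 1 is missing
# "OTTO123" is cut into "OT" and "123"

# Requiring digits on both sides of TO only matches genuine ranges.
nums = s.str.extract(r"^(\d+)TO(\d+)$")
# Only "3000TO4000" matches; the other rows are NaN in both columns.

print(parts)
print(nums)
```

With the split approach, rows that were never ranges still end up with a value in the first new column, which differs from the NaN layout asked for in the question.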