【Posted】: 2021-07-28 14:53:13
【Question】:
Given the two DataFrames below:
df1:
   id  address                                      price
0  1   8563 Parker Ave. Lexington, NC 27292         3
1  2   242 Bellevue Lane Appleton, WI 54911         3
2  3   771 Greenview Rd. Greenfield, IN 46140       5
3  4   93 Hawthorne Street Lakeland, FL 33801       6
4  5   8952 Green Hill Street Gettysburg, PA 17325  3
5  6   7331 S. Sherwood Dr. New Castle, PA 16101    4
df2:
   state  street            quantity
0  PA     S. Sherwood       12
1  IN     Hawthorne Street  3
2  NC     Parker Ave.       7
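For each row of df2, if both its state and its street appear in the address column of df1, I want to merge that df2 row onto the matching df1 row.
How can I do this in Pandas? Thanks.
Expected result df: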
   id  address                                      ...  street       quantity
0  1   8563 Parker Ave. Lexington, NC 27292         ...  Parker Ave.  7.00
1  2   242 Bellevue Lane Appleton, WI 54911         ...  NaN          NaN
2  3   771 Greenview Rd. Greenfield, IN 46140       ...  NaN          NaN
3  4   93 Hawthorne Street Lakeland, FL 33801       ...  NaN          NaN
4  5   8952 Green Hill Street Gettysburg, PA 17325  ...  NaN          NaN
5  6   7331 S. Sherwood Dr. New Castle, PA 16101    ...  S. Sherwood  12.00
[6 rows x 6 columns]
My test code:
df2['addr'] = df2['state'].astype(str) + df2['street'].astype(str)
pat = '|'.join(r'\b{}\b'.format(x) for x in df2['addr'])
df1['addr'] = df1['address'].str.extract('(' + pat + ')', expand=False)
df = df1.merge(df2, on='addr', how='left')
Output:
   id  address                                      ...  street_y  quantity_y
0  1   8563 Parker Ave. Lexington, NC 27292         ...  NaN       nan
1  2   242 Bellevue Lane Appleton, WI 54911         ...  NaN       nan
2  3   771 Greenview Rd. Greenfield, IN 46140       ...  NaN       nan
3  4   93 Hawthorne Street Lakeland, FL 33801       ...  NaN       nan
4  5   8952 Green Hill Street Gettysburg, PA 17325  ...  NaN       nan
5  6   7331 S. Sherwood Dr. New Castle, PA 16101    ...  NaN       nan
[6 rows x 10 columns]
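A likely reason every row comes back NaN is that the test code glues state and street into a single key such as 'PAS. Sherwood', and no address contains that string literally (in the addresses the street comes first and the state much later). The sketch below rebuilds the two frames from the tables above and checks this; the names keys, hits and extracted are only illustrative.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "address": [
        "8563 Parker Ave. Lexington, NC 27292",
        "242 Bellevue Lane Appleton, WI 54911",
        "771 Greenview Rd. Greenfield, IN 46140",
        "93 Hawthorne Street Lakeland, FL 33801",
        "8952 Green Hill Street Gettysburg, PA 17325",
        "7331 S. Sherwood Dr. New Castle, PA 16101",
    ],
    "price": [3, 3, 5, 6, 3, 4],
})
df2 = pd.DataFrame({
    "state": ["PA", "IN", "NC"],
    "street": ["S. Sherwood", "Hawthorne Street", "Parker Ave."],
    "quantity": [12, 3, 7],
})

# Same key construction as the test code: state and street concatenated.
keys = df2["state"].astype(str) + df2["street"].astype(str)
print(keys.tolist())  # ['PAS. Sherwood', 'INHawthorne Street', 'NCParker Ave.']

# For each key, does it occur as a substring of any address?
hits = [any(k in a for a in df1["address"]) for k in keys]
print(hits)  # [False, False, False]

# Consequently the extract step finds nothing and the merge key is all NaN.
pat = "|".join(r"\b{}\b".format(x) for x in keys)
extracted = df1["address"].str.extract("(" + pat + ")", expand=False)
print(extracted.isna().all())  # True
```

Since the merge key on the df1 side is NaN for every row, the left merge has nothing to match against, which is consistent with the output shown.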
【Discussion】:
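One possible approach, offered as a sketch rather than a definitive answer: test street and state separately instead of as one concatenated key. The code below pairs every address with every df2 row using a cross join (how="cross", which needs pandas 1.2 or later), keeps the pairs where the street is a substring of the address and the state appears as a whole word, then left-merges the survivors back onto df1. The helper contains_both and the names pairs and matched are hypothetical, not from the question.

```python
import re
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "address": [
        "8563 Parker Ave. Lexington, NC 27292",
        "242 Bellevue Lane Appleton, WI 54911",
        "771 Greenview Rd. Greenfield, IN 46140",
        "93 Hawthorne Street Lakeland, FL 33801",
        "8952 Green Hill Street Gettysburg, PA 17325",
        "7331 S. Sherwood Dr. New Castle, PA 16101",
    ],
    "price": [3, 3, 5, 6, 3, 4],
})
df2 = pd.DataFrame({
    "state": ["PA", "IN", "NC"],
    "street": ["S. Sherwood", "Hawthorne Street", "Parker Ave."],
    "quantity": [12, 3, 7],
})

# Pair every address with every (state, street) candidate.
pairs = df1[["id", "address"]].merge(df2, how="cross")


def contains_both(row) -> bool:
    """True if the street is a substring of the address and the
    state code appears in it as a whole, case-sensitive word."""
    street_ok = row["street"] in row["address"]
    state_pat = r"\b{}\b".format(re.escape(row["state"]))
    state_ok = re.search(state_pat, row["address"]) is not None
    return street_ok and state_ok


# Keep only the pairs where both conditions hold.
matched = pairs[pairs.apply(contains_both, axis=1)]

# Attach the matching df2 columns to df1; unmatched rows get NaN.
df = df1.merge(matched.drop(columns="address"), on="id", how="left")
print(df[["id", "state", "street", "quantity"]])
```

On the sample data only ids 1 and 6 pick up a street and quantity; id 4 contains "Hawthorne Street" but its state is FL, so it stays NaN, as in the expected result. Two caveats: the cross join builds len(df1) * len(df2) rows, so it may not suit large frames, and an address that matches several df2 rows would appear more than once in the result.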
Tags: python-3.x pandas dataframe