[Posted]: 2019-05-23 10:03:00
[Question]:
For example, I have a DataFrame like this:
name eventlist
0 a [{'t': '1234', 'n': 'user_engagem1'},{'t': '2345', 'n': 'user_engagem2'},{'t': '3456', 'n': 'user_engagem3'}]
1 b [{'t': '2345', 'n': 'user_engagem4'},{'t': '1345', 'n': 'user_engagem5'},{'t': '1356', 'n': 'user_engagem6'},{'t': '1345', 'n': 'user_engagem5'},{'t': '1359', 'n': 'user_engagem6'}]
2 c [{'t': '1334', 'n': 'user_engagem3'},{'t': '2345', 'n': 'user_engagem4'},{'t': '3556', 'n': 'user_engagem2'}]
I tried re.findall on a single string and it seems to work, giving a result like ['1234', '2345', '3456'], but I can't work out how to apply it to the DataFrame.
import re

# Code 1: applied to a single string, works
s = "[{'t': '1234', 'n': 'user_engagem'},{'t': '2345', 'n': 'user_engagem'},{'t': '3456', 'n': 'user_engagem'}]"
print(re.findall(r"t': '(.+?)', '", s))
# Code 2: applied to the DataFrame, does not work
df['t'] = df['eventlist'].str.findall(r"t': '(.+?)', '", df['eventlist'])
print(df)
I would like to get a result like this:
name eventlist
0 a ['1234', '2345', '3456']
1 b ['2345', '1345', '1356', '1345', '1359']
2 c ['1334', '2345', '3556']
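One possible way to get this list column is sketched below. It assumes that `eventlist` holds the string representation of each list, as in the single-string test (if the column holds actual lists of dicts, `.str.findall` would not apply). `Series.str.findall` takes the pattern and optional regex flags only, so the Series is not passed again as a second argument. The sample frame is rebuilt here from the data shown in the question.

```python
import pandas as pd

# Sample data copied from the question; each eventlist cell is assumed to be a string.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "eventlist": [
        "[{'t': '1234', 'n': 'user_engagem1'},{'t': '2345', 'n': 'user_engagem2'},"
        "{'t': '3456', 'n': 'user_engagem3'}]",
        "[{'t': '2345', 'n': 'user_engagem4'},{'t': '1345', 'n': 'user_engagem5'},"
        "{'t': '1356', 'n': 'user_engagem6'},{'t': '1345', 'n': 'user_engagem5'},"
        "{'t': '1359', 'n': 'user_engagem6'}]",
        "[{'t': '1334', 'n': 'user_engagem3'},{'t': '2345', 'n': 'user_engagem4'},"
        "{'t': '3556', 'n': 'user_engagem2'}]",
    ],
})

# Apply the regex to every row; the capture group returns only the 't' values.
df["t"] = df["eventlist"].str.findall(r"'t': '(.+?)'")

print(df[["name", "t"]])
```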
Or, even better, a result like this:
name t_first t_last
0 a 1234 3456
1 b 2345 1359
2 c 1334 3556
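A minimal sketch of how the first and last `t` per row might be obtained follows. It swaps the regex for `ast.literal_eval`, on the assumption that each `eventlist` cell is either a real list of dicts or its string representation. The helper `extract_t` is a hypothetical name introduced for illustration. "First" and "last" are taken by position in the list, not as the minimum and maximum, which matches the `1359` shown for row b.

```python
import ast

import pandas as pd

# Sample data copied from the question; cells are assumed to be strings here.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "eventlist": [
        "[{'t': '1234', 'n': 'user_engagem1'},{'t': '2345', 'n': 'user_engagem2'},"
        "{'t': '3456', 'n': 'user_engagem3'}]",
        "[{'t': '2345', 'n': 'user_engagem4'},{'t': '1345', 'n': 'user_engagem5'},"
        "{'t': '1356', 'n': 'user_engagem6'},{'t': '1345', 'n': 'user_engagem5'},"
        "{'t': '1359', 'n': 'user_engagem6'}]",
        "[{'t': '1334', 'n': 'user_engagem3'},{'t': '2345', 'n': 'user_engagem4'},"
        "{'t': '3556', 'n': 'user_engagem2'}]",
    ],
})


def extract_t(events):
    """Return the 't' values from a list of dicts or from its string form."""
    if isinstance(events, str):
        events = ast.literal_eval(events)  # safely parse the Python literal
    return [event["t"] for event in events]


t_values = df["eventlist"].apply(extract_t)

# .str[i] indexes into each list element-wise; an empty list would give NaN.
result = pd.DataFrame({
    "name": df["name"],
    "t_first": t_values.str[0],
    "t_last": t_values.str[-1],
})

print(result)
```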
[Discussion]:
Tags: python json regex pandas dictionary