[Posted]: 2017-08-15 22:14:49
[Question]:
I have the following list of (key, list-of-pairs) tuples:
[('HOMICIDE', [('2017', 1)]),
('DECEPTIVE PRACTICE', [('2017', 14), ('2016', 14), ('2015', 10), ('2013', 4), ('2014', 3)]),
('ROBBERY', [('2017', 1)])]
How can I flatten it so that each key is paired with a single (year, count) tuple, like this:
[('HOMICIDE', ('2017', 1)),
('DECEPTIVE PRACTICE', ('2015', 10)),
('DECEPTIVE PRACTICE', ('2014', 3)),
('DECEPTIVE PRACTICE', ('2017', 14)),
('DECEPTIVE PRACTICE', ('2016', 14))]
When I try to use map, it throws "AttributeError: 'list' object has no attribute 'map'". This is what I tried:
rdd = sc.parallelize([('HOMICIDE', [('2017', 1)]), ('DECEPTIVE PRACTICE', [('2017', 14), ('2016', 14), ('2015', 10), ('2013', 4), ('2014', 3)])])
y = rdd.map(lambda x : (x[0],tuple(x[1])))
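For reference, the flattening being asked for can be sketched in plain Python, with no Spark session required. The helper name `flatten_pairs` is hypothetical and used only for illustration. The commented lines show what is assumed to be the equivalent on the `rdd` from the question, using PySpark's `flatMapValues` or `flatMap`. The error message suggests `.map` was called on an ordinary Python list (for example the inner list of pairs, or a collected result) rather than on an RDD, though that is a guess from the message alone.

```python
# Pure-Python sketch of the flattening; runs without Spark.
data = [
    ('HOMICIDE', [('2017', 1)]),
    ('DECEPTIVE PRACTICE', [('2017', 14), ('2016', 14), ('2015', 10),
                            ('2013', 4), ('2014', 3)]),
    ('ROBBERY', [('2017', 1)]),
]


def flatten_pairs(records):
    """Emit one (key, item) tuple for every item in each key's list."""
    return [(key, item) for key, items in records for item in items]


flat = flatten_pairs(data)

# Assumed equivalents on the RDD from the question (needs a SparkContext `sc`):
#   rdd.flatMapValues(lambda items: items)
#   rdd.flatMap(lambda kv: [(kv[0], item) for item in kv[1]])
```

Unlike `map`, which returns exactly one output per input record, `flatMap` and `flatMapValues` let each record expand into several, which is what turns one key with five pairs into five separate records.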
[Discussion]:
Tags: python apache-spark pyspark