【Posted】: 2020-10-18 00:43:39
【Question】:
I want to convert these records:
{"timestamp":"1601093713","name":"exmple1.com","type":"mx","value":"20 alt1.aspmx.l.google.com"}
{"timestamp":"1601093713","name":"exmple1.com","type":"mx","value":"20 alt2.aspmx.l.google.com"}
{"timestamp":"1601093713","name":"exmple1.com","type":"mx","value":"30 aspmx2.googlemail.com"}
{"timestamp":"1601093713","name":"exmple1.com","type":"mx","value":"30 aspmx3.googlemail.com"}
{"timestamp":"1601093713","name":"exmple2.com","type":"mx","value":"20 alt1.aspmx.l.google.com"}
{"timestamp":"1601093713","name":"exmple2.com","type":"mx","value":"20 alt2.aspmx.l.google.com"}
{"timestamp":"1601093713","name":"exmple2.com","type":"mx","value":"30 aspmx2.googlemail.com"}
{"timestamp":"1601093713","name":"exmple2.com","type":"mx","value":"30 aspmx3.googlemail.com"}
test.printSchema()
root
|-- name: string (nullable = true)
|-- timestamp: string (nullable = true)
|-- type: string (nullable = true)
|-- value: string (nullable = true)
Using PySpark, I want to combine the mx values that share the same name into a single row. The result I want:
{ "timestamp":"1601093713", "name":"exmple1.com", "type":"mx", "value":" alt1.aspmx.l.google.com,alt2.aspmx.l.google.com , aspmx2.googlemail.com, aspmx3.googlemail.com" }
{ "timestamp":"1601093713", "name":"exmple2.com", "type":"mx", "value":" alt1.aspmx.l.google.com, alt2.aspmx.l.google.com , aspmx2.googlemail.com, aspmx3.googlemail.com" }
【Discussion】:
Tags: python apache-spark pyspark apache-spark-sql