【Posted】:2022-06-10 16:57:35
【Question】:
I have created a Spark DataFrame in order to build a graph with GraphX, which is Spark's graph API and, as far as I understand, accepts the Spark DataFrame format. So now I have data like this:
+--------------------+----------------+------+
|           hotel_url|          author|rating|
+--------------------+----------------+------+
|Hotel_Review-g194...|    violettaf340|     5|
|Hotel_Review-g194...|       Lagaiuzza|     5|
|Hotel_Review-g194...|      ashleyn763|     5|
|Hotel_Review-g194...|     DavideMauro|     5|
|Hotel_Review-g194...|        Alemma11|     4|
|Hotel_Review-g194...|       ladispoli|     4|
|Hotel_Review-g303...|       LiliT0URS|     3|
|Hotel_Review-g303...|     Amandainldn|     4|
|Hotel_Review-g303...|TwoMonkeysTravel|     5|
|Hotel_Review-g303...|     BiancaB3358|     4|
|Hotel_Review-g303...|    Brett-Sweden|     4|
|Hotel_Review-g303...|      analuizade|     5|
|Hotel_Review-g303...|          heckfy|     5|
|Hotel_Review-g303...|  MatheusMedrado|     3|
|Hotel_Review-g303...|TwoMonkeysTravel|     5|
|Hotel_Review-g303...|          SaStar|     4|
|Hotel_Review-g303...|   chrisbG2838DY|     4|
|Hotel_Review-g303...|        virninha|     5|
|Hotel_Review-g303...|    AugustusC_13|     5|
|Hotel_Review-g303...|         AnnaMir|     5|
+--------------------+----------------+------+
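My question is: how can I create, from this Spark DataFrame, a graph whose relationships have the form [ (Node: hotel_url) --- (weight: rating) --- (Node: author) ]?
The desired relationship can also be seen in the given figure.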
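To make the requested structure concrete, here is a minimal sketch in plain Python (not Spark) of the vertex and edge sets implied by the table. The `Review` type, the `build_graph` helper, and the column names `id`/`src`/`dst`/`weight` are assumptions made for illustration; `src` and `dst` follow the naming convention that DataFrame-based graph packages such as GraphFrames expect, and this is not presented as the GraphX API itself.

```python
from collections import namedtuple

# Hypothetical row type mirroring the three DataFrame columns.
Review = namedtuple("Review", ["hotel_url", "author", "rating"])


def build_graph(reviews):
    """Return (vertices, edges) for a hotel-author graph weighted by rating.

    vertices maps a node id to its kind ("hotel" or "author").
    edges is a list of dicts with src, dst and weight keys.
    Assumes hotel URLs and author names never collide as ids.
    """
    vertices = {}
    edges = []
    for r in reviews:
        vertices[r.hotel_url] = "hotel"
        vertices[r.author] = "author"
        # One edge per review row; duplicate rows yield parallel edges.
        edges.append({"src": r.hotel_url, "dst": r.author, "weight": r.rating})
    return vertices, edges


# A few rows from the table; truncated URLs are kept exactly as printed.
sample = [
    Review("Hotel_Review-g194...", "violettaf340", 5),
    Review("Hotel_Review-g194...", "Lagaiuzza", 5),
    Review("Hotel_Review-g303...", "LiliT0URS", 3),
    Review("Hotel_Review-g303...", "TwoMonkeysTravel", 5),
    Review("Hotel_Review-g303...", "TwoMonkeysTravel", 5),
]

vertices, edges = build_graph(sample)
for e in edges:
    print(e["src"], "--", e["weight"], "--", e["dst"])
```

In Spark, the same two collections would presumably be derived from the DataFrame itself: a vertex table from the distinct values of `hotel_url` and `author`, and an edge table from the rows with `rating` carried as the weight. Whether repeated rows, such as the two `TwoMonkeysTravel` reviews, should be kept as parallel edges or deduplicated is a design choice the question leaves open.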
【Comments】:
Tags: python apache-spark pyspark spark-graphx