【Posted】: 2021-03-24 14:54:29
【Problem description】:
Suppose I have a list of keywords in Scala:
val keywords = List("pineapple", "lemon")
and a DataFrame like this:
+---+-------------------------------------------+
|ID |Body |
+---+-------------------------------------------+
|123|I contain both keywords pineapple and lemon|
|456|I sadly don't contain anything... |
|789|Pineapple's are delicious |
+---+-------------------------------------------+
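How can I transform this DataFrame so that it gains a new column listing the keywords that Body contains? The result I'm looking for is something like: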
+---+-------------------------------------------+------------------+
|ID |Body |Contains_Keywords |
+---+-------------------------------------------+------------------+
|123|I contain both keywords pineapple and lemon|[pineapple, lemon]|
|456|I sadly don't contain anything... |[] |
|789|Pineapple's are delicious |[pineapple] |
+---+-------------------------------------------+------------------+
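For reference, here is a minimal sketch of one way the desired column could be produced, using a UDF. It is not necessarily the best or only approach. It assumes case-insensitive substring matching (so that "Pineapple's" counts as containing "pineapple"), a local SparkSession, and hypothetical names `KeywordMatch`, `matchedKeywords`, and `containsKeywords` that are not part of the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object KeywordMatch {
  val keywords = List("pineapple", "lemon")

  // Hypothetical helper: returns the keywords found in `body`.
  // Assumption: matching is a case-insensitive substring test,
  // and a null Body yields an empty list.
  def matchedKeywords(body: String): Seq[String] =
    if (body == null) Seq.empty
    else {
      val lower = body.toLowerCase
      keywords.filter(k => lower.contains(k))
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("keyword-match-sketch")
      .getOrCreate()
    import spark.implicits._

    // The sample data from the question.
    val df = Seq(
      (123, "I contain both keywords pineapple and lemon"),
      (456, "I sadly don't contain anything..."),
      (789, "Pineapple's are delicious")
    ).toDF("ID", "Body")

    // Wrap the helper as a UDF and add the new array column.
    val containsKeywords = udf(matchedKeywords _)
    val result = df.withColumn("Contains_Keywords", containsKeywords(col("Body")))

    result.show(false)
    spark.stop()
  }
}
```

Because this is a plain substring test, a Body containing "lemonade" would also be reported as containing "lemon". If whole-word matching is wanted, the helper would need to tokenize the text or use a word-boundary regex instead.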
【Comments】:
Tags: scala apache-spark