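【Posted】: 2017-07-26 19:21:12
【Question】:
I have a large Spark SQL statement that I am trying to break into smaller pieces to make the code more readable. I don't want to join the pieces, only to merge the results.
The SQL statement that currently works: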
val dfs = x.map(field => spark.sql(s"""
  select 'test' as Table_Name,
         '$field' as Column_Name,
         min($field) as Min_Value,
         max($field) as Max_Value,
         approx_count_distinct($field) as Unique_Value_Count,
         (
           SELECT 100 * approx_count_distinct($field)/count(1)
           from tempdftable
         ) as perc
  from tempdftable
"""))
I am trying to pull the following subquery out of the SQL above:
(SELECT 100 * approx_count_distinct($field)/count(1) from tempdftable) as perc
using this logic:
val Perce = x.map(field => spark.sql(s"(SELECT 100 * approx_count_distinct($field)/count(1) from parquetDFTable)"))
and then merge this val Perce back into the first large SQL statement with the statement below, but it does not work:
val dfs = x.map(field => spark.sql(s"""
  select 'test' as Table_Name,
         '$field' as Column_Name,
         min($field) as Min_Value,
         max($field) as Max_Value,
         approx_count_distinct($field) as Unique_Value_Count,
         '""" + Perce + """'
  from tempdftable
"""))
How can this be written?
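One possible direction, offered only as a sketch: in the attempt above, Perce is a collection of DataFrames, so concatenating it into the SQL text inserts its string representation rather than a subquery. Keeping the extracted piece as a plain SQL string avoids that. The object name SqlFragments and the helpers percExpr and profileSql are hypothetical names introduced for illustration, and the table name defaults to tempdftable as in the working statement.

```scala
object SqlFragments {
  // Hypothetical helper: the percentage subquery kept as a plain SQL string
  // (not a DataFrame), so it can be spliced into a larger statement.
  def percExpr(field: String, table: String = "tempdftable"): String =
    s"(SELECT 100 * approx_count_distinct($field) / count(1) FROM $table)"

  // Hypothetical helper: the full profiling statement for one column,
  // built from the fragment above.
  def profileSql(field: String, table: String = "tempdftable"): String =
    s"""
       |SELECT 'test' AS Table_Name,
       |       '$field' AS Column_Name,
       |       min($field) AS Min_Value,
       |       max($field) AS Max_Value,
       |       approx_count_distinct($field) AS Unique_Value_Count,
       |       ${percExpr(field, table)} AS perc
       |FROM $table
       |""".stripMargin
}
```

With the SparkSession spark and the column list x from the question, x.map(f => spark.sql(SqlFragments.profileSql(f))) would then produce one DataFrame per column, and reduce(_ union _) could stack them into a single result, assuming the per-column result schemas are compatible.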
【Discussion】:
Tags: scala apache-spark apache-spark-sql spark-streaming spark-dataframe