【Posted】: 2019-08-06 15:12:34
【Question】:
This is my Hive query, taken directly from the TPC-DS toolkit:
WITH customer_total_return
AS (SELECT sr_customer_sk AS ctr_customer_sk,
sr_store_sk AS ctr_store_sk,
Sum(sr_fee) AS ctr_total_return
FROM store_returns,
date_dim
WHERE sr_returned_date_sk = d_date_sk
AND d_year = 2000
GROUP BY sr_customer_sk,
sr_store_sk)
SELECT TOP 100 c_customer_id
FROM customer_total_return ctr1,
store,
customer
WHERE ctr1.ctr_total_return > (SELECT Avg(ctr_total_return) * 1.2
FROM customer_total_return ctr2
WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk)
AND s_store_sk = ctr1.ctr_store_sk
AND s_state = 'TN'
AND ctr1.ctr_customer_sk = c_customer_sk
ORDER BY c_customer_id;
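However, I get the following error when I try to run it:
FAILED: ParseException line 11:11 cannot recognize input near 'TOP' '100' 'c_customer_id' in selection target
My understanding is that TOP 100 is not valid syntax in HiveQL. How can I rewrite the query correctly?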
【Comments】:
- Use LIMIT. And use proper JOIN syntax.
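
Following the comment above, one possible rewrite might look like the sketch below. It drops TOP 100 in favor of a trailing LIMIT 100 and replaces the comma-separated table list with explicit JOIN ... ON clauses. Note that store_avg_return, sar, and return_threshold are names made up for this illustration; they stand in for the correlated subquery from the question, on the assumption that the Hive version in use may not accept a correlated scalar subquery in WHERE. If your Hive version does support it, keeping the original subquery and changing only TOP and the joins should also work. This has not been run against a real TPC-DS dataset.

```sql
-- Sketch only: same tables and columns as the query in the question.
WITH customer_total_return AS (
  SELECT sr_customer_sk AS ctr_customer_sk,
         sr_store_sk    AS ctr_store_sk,
         SUM(sr_fee)    AS ctr_total_return
  FROM store_returns
  JOIN date_dim
    ON sr_returned_date_sk = d_date_sk
  WHERE d_year = 2000
  GROUP BY sr_customer_sk,
           sr_store_sk
),
-- Hypothetical helper CTE (not in the original query): per-store threshold,
-- i.e. 1.2 times the average total return for that store.
store_avg_return AS (
  SELECT ctr_store_sk,
         AVG(ctr_total_return) * 1.2 AS return_threshold
  FROM customer_total_return
  GROUP BY ctr_store_sk
)
SELECT c_customer_id
FROM customer_total_return ctr1
JOIN store_avg_return sar
  ON ctr1.ctr_store_sk = sar.ctr_store_sk
JOIN store
  ON s_store_sk = ctr1.ctr_store_sk
JOIN customer
  ON ctr1.ctr_customer_sk = c_customer_sk
WHERE ctr1.ctr_total_return > sar.return_threshold
  AND s_state = 'TN'
ORDER BY c_customer_id
LIMIT 100;  -- HiveQL equivalent of SELECT TOP 100
```

LIMIT goes at the very end of the statement, after ORDER BY, so the 100 rows returned are the first 100 in c_customer_id order, which matches what TOP 100 with ORDER BY would give in SQL Server.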