【Posted】: 2022-01-15 12:22:10
【Question】:
Given the following table:
CREATE TABLE main
(
`job_id` UUID,
`request_time` DateTime,
`host_id` UInt8,
`status_code` LowCardinality(String),
)
ENGINE = MergeTree
ORDER BY request_time
SETTINGS index_granularity = 8192
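I am trying to get every status per host together with its relative percentage. To do that, I need to count the rows grouped by host and status, then divide each count by the total number of rows for that host.
For example, this query would work in MySQL: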
SELECT
main.host_id,
main.status_code,
COUNT() AS status_count,
COUNT() / sub.host_total * 100 AS percent
FROM
main
INNER JOIN (
SELECT host_id, COUNT() AS host_total
FROM main
GROUP BY host_id
) AS sub ON (sub.host_id = main.host_id)
GROUP BY
main.host_id,
main.status_code
But ClickHouse throws:
DB::Exception: Unknown identifier: host_total; there are columns: host_id, status_code, count(): While processing host_id, status_code, count() AS status_count, (count() / host_total) * 100 AS percent. (UNKNOWN_IDENTIFIER)
Probably because correlated (dependent) subqueries are not supported.
It was suggested that I use a CTE, so I tried this:
WITH sub AS (
SELECT host_id, COUNT() AS host_total
FROM main
GROUP BY host_id
)
SELECT
main.host_id,
main.status_code,
COUNT() AS status_count,
COUNT() / (SELECT host_total FROM sub WHERE sub.host_id = main.host_id) * 100 AS percent
FROM
main
GROUP BY
main.host_id,
main.status_code
But still no luck:
DB::Exception: Missing columns: 'main.host_id' while processing query: 'SELECT host_total FROM sub WHERE host_id = main.host_id', required columns: 'host_total' 'host_id' 'main.host_id' 'host_total' 'host_id' 'main.host_id': While processing (SELECT host_total FROM sub WHERE sub.host_id = main.host_id) AS _subquery20: While processing count() / ((SELECT host_total FROM sub WHERE sub.host_id = main.host_id) AS _subquery20): While processing (count() / ((SELECT host_total FROM sub WHERE sub.host_id = main.host_id) AS _subquery20)) * 100 AS percent. (UNKNOWN_IDENTIFIER)
【Discussion】:
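A possible direction, offered as a sketch rather than a confirmed fix: the first error looks like it comes from host_total being neither a GROUP BY key nor wrapped in an aggregate, so it is not available after aggregation. Under that assumption, wrapping it in any() (safe here because host_total is constant within each host_id) may be enough. The second query below is an alternative that avoids the self-join by using a window function; it assumes a ClickHouse release where window functions are available (on older releases they were experimental and gated behind the allow_experimental_window_functions setting). Neither query has been run against the table above.

```sql
-- Variant 1: keep the JOIN, but aggregate host_total so it is valid after GROUP BY.
-- any() returns one value per group; every row of a host carries the same host_total.
SELECT
    main.host_id,
    main.status_code,
    count() AS status_count,
    count() / any(sub.host_total) * 100 AS percent
FROM main
INNER JOIN
(
    SELECT host_id, count() AS host_total
    FROM main
    GROUP BY host_id
) AS sub ON sub.host_id = main.host_id
GROUP BY
    main.host_id,
    main.status_code;

-- Variant 2: aggregate once, then compute the per-host total with a window function.
SELECT
    host_id,
    status_code,
    status_count,
    status_count / sum(status_count) OVER (PARTITION BY host_id) * 100 AS percent
FROM
(
    SELECT host_id, status_code, count() AS status_count
    FROM main
    GROUP BY host_id, status_code
)
ORDER BY host_id, status_code;
```

Variant 2 reads the table once instead of twice, which may matter on a large MergeTree table; Variant 1 stays closer to the original MySQL query.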
Tags: sql group-by common-table-expression percentage clickhouse