【Posted】: 2021-01-15 11:36:19
【Question】:
I am trying to build a query from the following data:
time   user_id  adver_id  tactic_id
time1  123      adv1      tac1
time2  123      adv1      tac1
time3  123      adv1      tac2
time4  124      adv1      tac1
time6  125      adv2      tac3
time7  123      adv2      tac1
The expected result should look like this:
adver_id  adver_id_overlap  tactic_id  tactic_id_overlap  unique_users  total_records
adv1      adv1              tac1       tac1               2             3
adv1      adv1              tac1       tac2               1             2
adv1      adv2              tac1       tac1               1             2
...
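One reading that is consistent with all three sample rows: unique_users counts the users who have at least one record for the left (adver_id, tactic_id) pair and at least one for the overlap pair, and total_records counts the left pair's records that belong to those users. This reading is inferred from the sample only and is not stated in the question. A minimal Python sketch of it, where `overlap_row` is a hypothetical helper name:

```python
from collections import Counter

# Sample rows from the question: (time, user_id, adver_id, tactic_id).
ROWS = [
    ("time1", 123, "adv1", "tac1"),
    ("time2", 123, "adv1", "tac1"),
    ("time3", 123, "adv1", "tac2"),
    ("time4", 124, "adv1", "tac1"),
    ("time6", 125, "adv2", "tac3"),
    ("time7", 123, "adv2", "tac1"),
]


def overlap_row(rows, left, right):
    """Return (unique_users, total_records) for two (adver_id, tactic_id) pairs.

    Assumed reading (inferred from the sample, not confirmed by the asker):
    unique_users  = users with at least one record for both pairs
    total_records = records of the left pair belonging to those users
    """
    # Number of records per (user, pair).
    per_user = Counter((user, (adv, tac)) for _, user, adv, tac in rows)
    left_users = {user for user, pair in per_user if pair == left}
    right_users = {user for user, pair in per_user if pair == right}
    shared = left_users & right_users
    total = sum(per_user[(user, left)] for user in shared)
    return len(shared), total


if __name__ == "__main__":
    print(overlap_row(ROWS, ("adv1", "tac1"), ("adv1", "tac1")))  # (2, 3)
    print(overlap_row(ROWS, ("adv1", "tac1"), ("adv1", "tac2")))  # (1, 2)
    print(overlap_row(ROWS, ("adv1", "tac1"), ("adv2", "tac1")))  # (1, 2)
```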
I tried this query:
WITH adver_id_subquery AS
(
    SELECT
        user_id,
        adver_id AS adver_id
    FROM dataset1
    GROUP BY user_id, adver_id
),
tactic_id_subquery AS
(
    SELECT
        user_id,
        tactic_id AS tactic_id
    FROM dataset1
    GROUP BY user_id, tactic_id
)
SELECT
    table1.adver_id AS adver_id,
    table1.adver_id AS adver_id_overlap,
    table2.tactic_id AS tactic_id,
    table2.tactic_id AS tactic_id_overlap,
    COUNT(*) AS unique_users
FROM adver_id_subquery AS table1
CROSS JOIN tactic_id_subquery AS table2
WHERE table1.user_id = table2.user_id
GROUP BY adver_id, adver_id_overlap, tactic_id, tactic_id_overlap
ORDER BY adver_id, adver_id_overlap, tactic_id, tactic_id_overlap
But the result is a bit different from what I need:
adver_id  adver_id_overlap  tactic_id  tactic_id_overlap  unique_users
adv1      adv1              tac1       tac1               2
adv1      adv1              tac2       tac2               1
adv2      adv2              tac1       tac1               1
adv2      adv2              tac2       tac2               1
adv2      adv2              tac3       tac3               1
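The result above seems to contain only same-value pairs, e.g. adv1-adv1, tac1-tac1, tac2-tac2. I would like to see overlaps, e.g. tac1-tac2, tac2-tac3. Also, I cannot get total_records; COUNT(*) appears to give unique_users instead.
Any help getting the desired result is appreciated.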
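One possible direction, sketched under assumptions: first aggregate to one row per user and (adver_id, tactic_id) pair, then self-join that aggregate on user_id, so that every pair a user has is matched with every pair of the same user, including different ones. total_records is taken here to mean the left pair's record count for the overlapping users; that reproduces the three expected rows but is only inferred from them. The query is run through Python's sqlite3 module purely so it can be checked locally. The `per_user` CTE, the `a`/`b` aliases, and the `run_overlap` helper are illustrative names, and the query has not been run on BigQuery.

```python
import sqlite3

# Sample rows from the question: (time, user_id, adver_id, tactic_id).
ROWS = [
    ("time1", 123, "adv1", "tac1"),
    ("time2", 123, "adv1", "tac1"),
    ("time3", 123, "adv1", "tac2"),
    ("time4", 124, "adv1", "tac1"),
    ("time6", 125, "adv2", "tac3"),
    ("time7", 123, "adv2", "tac1"),
]

OVERLAP_SQL = """
WITH per_user AS (
    -- One row per user and (adver_id, tactic_id) pair, with that user's record count.
    SELECT user_id, adver_id, tactic_id, COUNT(*) AS records
    FROM dataset1
    GROUP BY user_id, adver_id, tactic_id
)
SELECT
    a.adver_id     AS adver_id,
    b.adver_id     AS adver_id_overlap,
    a.tactic_id    AS tactic_id,
    b.tactic_id    AS tactic_id_overlap,
    COUNT(*)       AS unique_users,   -- per_user holds one row per user and pair
    SUM(a.records) AS total_records   -- assumed meaning: left pair's records only
FROM per_user AS a
JOIN per_user AS b ON a.user_id = b.user_id
GROUP BY a.adver_id, b.adver_id, a.tactic_id, b.tactic_id
ORDER BY a.adver_id, b.adver_id, a.tactic_id, b.tactic_id
"""


def run_overlap(rows):
    """Load the rows into an in-memory SQLite table and run the overlap query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE dataset1 "
            "(time TEXT, user_id INTEGER, adver_id TEXT, tactic_id TEXT)"
        )
        conn.executemany("INSERT INTO dataset1 VALUES (?, ?, ?, ?)", rows)
        return conn.execute(OVERLAP_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_overlap(ROWS):
        print(row)
```

On the sample data this returns ten rows, among them the three listed in the expected result. Mirrored pairs such as tac2-tac1 are returned as well, and because total_records is counted from the left pair only, a mirrored row can carry a different total than its counterpart.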
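【Discussion】:
-
Please explain the logic behind the result you want. What does "overlap" mean?
-
Hi @GordonLinoff, this is to show the overlap between different adver_id values, and between tactics, in terms of the unique users targeted and the total records. For example, I can see that tact1 and tact2 were seen by 2 unique users, and the total records we have is 3. Hope that makes sense.
Tags: sql google-bigquery overlap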