Implement Rank without using analytic function: answers

【Question title】: Implement Rank without using analytic function
【Posted】: 2017-10-20 19:44:23
【Question】:

I would like to know whether there is a way to reproduce the SQL analytic (window) functions below without using the built-in functions.

SELECT *,
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rownum,
    DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS denserank,
    RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk
FROM emp;

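【Comments】:

I don't want to do this in MySQL. Would a SELF JOIN work?
Nice to see you have put a lot of work into this question. You might consider accepting an answer (in particular, look at Gordon Linoff's excellent answer).

Tags: sql ansi-sql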

【Solution 1】:

Here are equivalent expressions for the three functions:

select emp.*,
       -- row_number: rows ahead of this one plus the row itself; emp_id breaks salary ties
       (select count(*)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              (emp2.salary > emp.salary or
               (emp2.salary = emp.salary and emp2.emp_id <= emp.emp_id)
              )
       ) as "row_number",
       -- rank: 1 + number of rows with a strictly higher salary
       (select 1 + count(*)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              emp2.salary > emp.salary
       ) as "rank",
       -- dense_rank: number of distinct salaries at or above this one
       (select count(distinct emp2.salary)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              emp2.salary >= emp.salary
       ) as "dense_rank"
from emp;

This assumes that a unique emp_id column exists, so that "row_number" can break ties between rows with the same salary.
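To make the behaviour on ties concrete, here is a minimal sketch that runs the three correlated subqueries against a small in-memory SQLite table. The sample rows, the choice of SQLite via Python's `sqlite3` module, and the aliases `row_num`, `rnk` and `dense_rnk` are assumptions made for illustration; they are not part of the original answer.

```python
import sqlite3

# Hypothetical sample data: two departments, both containing salary ties.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 300), (2, 10, 200), (3, 10, 200), (4, 10, 100),
     (5, 20, 500), (6, 20, 500)],
)

QUERY = """
SELECT emp.emp_id, emp.dept_id, emp.salary,
       -- row number: rows ahead of this one plus the row itself
       (SELECT COUNT(*) FROM emp emp2
         WHERE emp2.dept_id = emp.dept_id
           AND (emp2.salary > emp.salary
                OR (emp2.salary = emp.salary AND emp2.emp_id <= emp.emp_id))
       ) AS row_num,
       -- rank: 1 + rows with a strictly higher salary
       (SELECT 1 + COUNT(*) FROM emp emp2
         WHERE emp2.dept_id = emp.dept_id
           AND emp2.salary > emp.salary
       ) AS rnk,
       -- dense rank: distinct salaries at or above this one
       (SELECT COUNT(DISTINCT emp2.salary) FROM emp emp2
         WHERE emp2.dept_id = emp.dept_id
           AND emp2.salary >= emp.salary
       ) AS dense_rnk
FROM emp
ORDER BY emp.emp_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)  # (emp_id, dept_id, salary, row_num, rnk, dense_rnk)
```

In department 10 the two employees earning 200 get row numbers 2 and 3, share rank 2 and dense rank 2, and the employee earning 100 then gets rank 4 but dense rank 3.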

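【Comments】:

@teja . . . All three should work whether or not there are duplicates. row_number() depends on a unique key existing in the table, as noted in the answer.
Is there another solution that does this without using correlated subqueries?
@Vijiy . . . Questions should be asked as questions, not in comments, along with sample data, desired results, and a database tag.
I was looking for the authoritative answer to this exact question (because Spark Structured Streaming does not support window/analytic functions on non-timestamp columns). And lo and behold, straight from the horse's mouth: Gordon Linoff.
Ouch, I just realized the above uses correlated subqueries, which are not supported in Spark Structured Streaming.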
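
On the question raised in the comments about avoiding correlated subqueries: one possible rewrite is a self-join followed by GROUP BY. The sketch below covers rank and dense rank only. It is run on SQLite through Python's `sqlite3` module with hypothetical sample rows, and the alias `hi` and the output names `rnk` and `dense_rnk` are illustrative assumptions. It has not been checked against Spark Structured Streaming, so whether it lifts the limitation mentioned above remains an open question.

```python
import sqlite3

# Hypothetical sample data: two departments, both containing salary ties.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 300), (2, 10, 200), (3, 10, 200), (4, 10, 100),
     (5, 20, 500), (6, 20, 500)],
)

# Each row e is paired with every row hi in the same department that has a
# strictly higher salary. The LEFT JOIN keeps top earners, which have no match.
QUERY = """
SELECT e.emp_id, e.dept_id, e.salary,
       1 + COUNT(hi.emp_id)          AS rnk,       -- COUNT skips the NULLs of unmatched rows
       1 + COUNT(DISTINCT hi.salary) AS dense_rnk  -- distinct higher salaries
FROM emp e
LEFT JOIN emp hi
       ON hi.dept_id = e.dept_id
      AND hi.salary > e.salary
GROUP BY e.emp_id, e.dept_id, e.salary
ORDER BY e.emp_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)  # (emp_id, dept_id, salary, rnk, dense_rnk)
```

The join produces one intermediate row per (employee, higher-paid colleague) pair, so like the correlated form it does quadratic work within each department.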

【Solution 2】:

You can do this with a correlated subquery.

select dept_id,salary,
(select count(*) from emp e1 where e1.dept_id=e.dept_id and e1.salary>=e.salary) as rnum
from emp e

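This works well when there are no ties. When salaries tie, every tied row gets the same number (the position of the last row in its tie group), so the result is no longer a true row number.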
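
As a quick check of what the `>=` count does on tied salaries, here is a small sketch using SQLite through Python's `sqlite3` module. The sample rows are hypothetical and chosen to contain ties; the query is the one from this answer with an ORDER BY added for stable output.

```python
import sqlite3

# Hypothetical sample data: two departments, both containing salary ties.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 300), (2, 10, 200), (3, 10, 200), (4, 10, 100),
     (5, 20, 500), (6, 20, 500)],
)

# Count the rows in the same department whose salary is at least this row's.
QUERY = """
SELECT dept_id, salary,
       (SELECT COUNT(*) FROM emp e1
         WHERE e1.dept_id = e.dept_id
           AND e1.salary >= e.salary) AS rnum
FROM emp e
ORDER BY dept_id, salary DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)  # (dept_id, salary, rnum)
```

With this data the two 200 salaries in department 10 both get 3 and the two 500 salaries in department 20 both get 2. That matches neither ROW_NUMBER (which would give 2 and 3) nor RANK or DENSE_RANK (which would both give 2 for the 200 rows).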

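【Comments】:

How do you implement rownum when there are ties?
A row number with ties is a dense rank.
@RossBush How can it be a dense rank? ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) assigns unique numbers even when there are duplicates.
What I mean is: if you order by salary within a department and give each member of the department a row number representing that order, but keep a placeholder for ties (every tied row gets the same row number), then that is the row_number with ties on salary and dept_id.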

【Solution 3】:

This works in all cases:

select DEPT_ID, SALARY,
    (select count(*)+1 from emp r where r.SALARY>o.SALARY and r.dept_id=o.dept_id) rank,
    (select count(distinct SALARY)+1 from emp r where r.SALARY>o.SALARY and r.dept_id=o.dept_id) d_rank,
    (select count(*)+1 from (select x.*,rownum rn from (select emp.* from emp order by DEPT_ID asc,salary desc) x) r where r.rn<o.rn and r.dept_id=o.dept_id) rownumm
from (select x.*,rownum rn from (select emp.* from emp order by DEPT_ID asc,salary desc) x) o
order by DEPT_ID,salary desc;

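For rank: use (count of rows in the same department with a salary greater than the current row's) + 1.

For dense rank: same as rank, but count the distinct salary values greater than the current row's, then add 1.

For row_number: build a nested query that generates a rownum for each row, which is different for every row. On top of that, apply the same logic as for rank: (count of rows in the same department whose rownum is smaller than the current row's rownum) + 1.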
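
The three rules above can be sketched on SQLite through Python's `sqlite3` module. SQLite has no Oracle-style `rownum` pseudocolumn, so this sketch swaps in a different mechanism: it numbers the rows once in Python and stores them in a helper table. The helper table name `numbered`, the sample rows, and the extra `emp_id` tiebreaker in the ordering are all assumptions made so that the numbering is deterministic; none of them appear in the original answer.

```python
import sqlite3

# Hypothetical sample data: two departments, both containing salary ties.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 300), (2, 10, 200), (3, 10, 200), (4, 10, 100),
     (5, 20, 500), (6, 20, 500)],
)

# Stand-in for Oracle's rownum: number the ordered rows once and store them.
ordered = conn.execute(
    "SELECT emp_id, dept_id, salary FROM emp "
    "ORDER BY dept_id ASC, salary DESC, emp_id ASC"
).fetchall()
conn.execute(
    "CREATE TABLE numbered (rn INTEGER, emp_id INTEGER, dept_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO numbered VALUES (?, ?, ?, ?)",
    [(rn, *row) for rn, row in enumerate(ordered, start=1)],
)

QUERY = """
SELECT o.dept_id, o.salary,
       -- rank: rows with a greater salary, plus 1
       (SELECT COUNT(*) + 1 FROM numbered r
         WHERE r.dept_id = o.dept_id AND r.salary > o.salary) AS rnk,
       -- dense rank: distinct greater salaries, plus 1
       (SELECT COUNT(DISTINCT r.salary) + 1 FROM numbered r
         WHERE r.dept_id = o.dept_id AND r.salary > o.salary) AS d_rank,
       -- row number: rows in the department with a smaller rn, plus 1
       (SELECT COUNT(*) + 1 FROM numbered r
         WHERE r.dept_id = o.dept_id AND r.rn < o.rn) AS rownumm
FROM numbered o
ORDER BY o.rn
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)  # (dept_id, salary, rnk, d_rank, rownumm)
```

Storing the numbering once means both sides of the comparison see the same `rn` values. The original query instead evaluates the numbered inline view twice, which yields consistent numbers only if both evaluations order tied rows the same way.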

【Comments】:

Please add some explanation of why your code works in all cases, so that others can learn from it.
Sure Jason, I am doing that.