【Question title】: How to unroll all nodes of a JSON tree in PostgreSQL?
【Posted】: 2019-09-04 07:34:08
【Question】:

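Suppose we have a table with a jsonb column that represents a tree of nodes. Producing a flat list of all tree nodes, one per row, is easy for a simple tree structure using the wonderful recursive CTEs: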

CREATE TEMPORARY TABLE api_schema (id INT, content JSONB);
INSERT INTO api_schema VALUES (1, '[{"name": "A", "category": "tuple", "children": [{"name": "B", "category": "datapoint"}, {"name": "C", "category": "datapoint"}]}]');
INSERT INTO api_schema VALUES (2, '[{"name": "D", "category": "tuple", "children": [{"name": "E", "category": "tuple", "children": [{"name": "F", "category": "datapoint"}]}]}]');

WITH RECURSIVE schema_objects (id, object) AS (
  SELECT id, jsonb_array_elements(content) FROM api_schema
  UNION
  SELECT id, jsonb_array_elements(object->'children') FROM schema_objects
  WHERE object->>'category' != 'datapoint'
) SELECT * FROM schema_objects;

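The tricky part comes when the recursive term needs more logic. In my case, besides the datapoint (no children) and tuple (children is a list) categories, there is also a multivalue category (children is a single node). How can the CTE handle this case?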

A naive rewrite of the CTE looks like this:

INSERT INTO api_schema VALUES (3, '[{"name": "D", "category": "multivalue", "children": {"name": "E", "category": "tuple", "children": [{"name": "F", "category": "datapoint"}]}}]');

WITH RECURSIVE schema_objects (id, object) AS (
  SELECT id, jsonb_array_elements(content) FROM api_schema
  UNION
  SELECT id, CASE WHEN jsonb_typeof(object->'children') = 'array'
               THEN jsonb_array_elements(object->'children')
               ELSE object->'children'
             END AS object
  FROM schema_objects
  WHERE object->>'category' != 'datapoint'
) SELECT * FROM schema_objects;

However, this does not work in Postgres 10:

ERROR:  set-returning functions are not allowed in CASE

Could we write two SELECTs, each covering one category? That is not allowed:

WITH RECURSIVE schema_objects (id, object) AS (
  SELECT id, jsonb_array_elements(content) FROM api_schema
  UNION
  (SELECT id, jsonb_array_elements(object->'children') FROM schema_objects WHERE object->>'category' = 'tuple'
   UNION
   SELECT id, object->'children' FROM schema_objects WHERE object->>'category' = 'multivalue')
) SELECT * FROM schema_objects WHERE id=1;

ERROR:  recursive reference to query "schema_objects" must not appear more than once

An idea floating around the internet is to use a CTE to factor out the CASE, but we are already inside a CTE, so this does not even compile:

WITH RECURSIVE schema_objects (id, object) AS (
  SELECT id, jsonb_array_elements(content) FROM api_schema
  UNION
  WITH schema_children (id, children) AS (
    SELECT id, CASE jsonb_typeof(object->'children') WHEN 'array' THEN object->'children' ELSE jsonb_build_array(object->'children') END AS children
    FROM schema_objects
    WHERE object->>'category' != 'datapoint'
  )
  SELECT id, jsonb_array_elements(children)
  FROM schema_children
) SELECT * FROM schema_objects WHERE id=1;

Postgres also suggests using a LATERAL FROM item, but it is unclear how to put that together when there is only a single "table".

【Comments】:

    Tags: json postgresql recursion tree


    【Solution 1】:

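    While writing the question, I spotted a mistake in one of my earlier attempts :). Using a subselect in FROM solves the problem by re-wrapping non-array children as an array: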

    WITH RECURSIVE schema_objects (id, object) AS (
      SELECT id, jsonb_array_elements(content) FROM api_schema
    
      UNION
    
      SELECT id, jsonb_array_elements(children)
      FROM (SELECT id, CASE jsonb_typeof(object->'children') WHEN 'array' THEN object->'children' ELSE jsonb_build_array(object->'children') END AS children
        FROM schema_objects
        WHERE object->>'category' != 'datapoint'
      ) s
    ) SELECT * FROM schema_objects
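
    To see the rewrite in isolation: the error only concerns a set-returning function placed inside CASE, while a CASE used as the function's argument is fine. A minimal sketch with made-up inline values (the alias t(j) and the sample values are only for illustration):

    ```sql
    -- Rejected on PostgreSQL 10 and later, because the set-returning
    -- function sits inside CASE:
    --   SELECT CASE WHEN jsonb_typeof(j) = 'array'
    --               THEN jsonb_array_elements(j) ELSE j END FROM ...

    -- Accepted: CASE only chooses the argument, and a non-array value
    -- is wrapped in a one-element array before being expanded.
    SELECT jsonb_array_elements(
             CASE jsonb_typeof(j)
               WHEN 'array' THEN j
               ELSE jsonb_build_array(j)
             END
           ) AS element
    FROM (VALUES ('[1, 2]'::jsonb), ('{"name": "E"}'::jsonb)) AS t(j);
    ```

    The array should expand to two rows and the single object to one row.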
    

    However, a big general drawback of this approach is that it is very slow: for 10000 trees of roughly 30 nodes each, I am looking at queries that take minutes.
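
    A variant that may be worth trying (not benchmarked here): call jsonb_array_elements through a LATERAL FROM item, as the Postgres hint suggests, and use UNION ALL instead of UNION. UNION has to compare whole jsonb subtrees to discard duplicates, so dropping it might reduce the cost. This assumes duplicate rows are acceptable or cannot occur, since UNION ALL keeps them. The aliases a, s, e and c are illustrative names only:

    ```sql
    WITH RECURSIVE schema_objects (id, object) AS (
      -- Non-recursive term: one row per top-level node.
      SELECT a.id, e.object
      FROM api_schema AS a
      CROSS JOIN LATERAL jsonb_array_elements(a.content) AS e(object)

      UNION ALL  -- assumption: duplicates need not be removed

      -- Recursive term: schema_objects is referenced exactly once.
      SELECT s.id, c.object
      FROM schema_objects AS s
      CROSS JOIN LATERAL jsonb_array_elements(
        CASE jsonb_typeof(s.object->'children')
          WHEN 'array' THEN s.object->'children'           -- tuple: already a list
          ELSE jsonb_build_array(s.object->'children')     -- multivalue: wrap the single node
        END
      ) AS c(object)
      WHERE s.object->>'category' <> 'datapoint'
    )
    SELECT id, object->>'name' AS name, object->>'category' AS category
    FROM schema_objects;
    ```

    For the three sample rows from the question, this should list nine nodes: A, B, C for id 1 and D, E, F for each of ids 2 and 3. Whether it is actually faster on large data would need to be measured, for example with EXPLAIN ANALYZE.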

    【Comments】:
