【Question title】: PIVOT VIEW using PostgreSQL
【Posted】: 2015-01-04 23:08:55
【Question】:

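I am new to PostgreSQL and am using version 9.4. I have a table of collected measurements stored as strings and need to turn it into a kind of PIVOT table using something that is always up to date, such as a VIEW.
In addition, some values need to be converted, e.g. multiplied by 1000, as you can see for "sensor3" in the example below.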

Source table:

CREATE TABLE source (
    id bigint NOT NULL,
    name character varying(255),
    "timestamp" timestamp without time zone,
    value character varying(32672),
    CONSTRAINT source_pkey PRIMARY KEY (id)
);

INSERT INTO source VALUES
  (15,'sensor2','2015-01-03 22:02:05.872','88.4')
, (16,'foo27'  ,'2015-01-03 22:02:10.887','-3.755')
, (17,'sensor1','2015-01-03 22:02:10.887','1.1704')
, (18,'foo27'  ,'2015-01-03 22:02:50.825','-1.4')
, (19,'bar_18' ,'2015-01-03 22:02:50.833','545.43')
, (20,'foo27'  ,'2015-01-03 22:02:50.935','-2.87')
, (21,'sensor3','2015-01-03 22:02:51.044','6.56');

Contents of the source table:

| id | name      | timestamp                 | value    |
|----+-----------+---------------------------+----------|
| 15 | "sensor2" | "2015-01-03 22:02:05.872" | "88.4"   |
| 16 | "foo27"   | "2015-01-03 22:02:10.887" | "-3.755" |
| 17 | "sensor1" | "2015-01-03 22:02:10.887" | "1.1704" |
| 18 | "foo27"   | "2015-01-03 22:02:50.825" | "-1.4"   |
| 19 | "bar_18"  | "2015-01-03 22:02:50.833" | "545.43" |
| 20 | "foo27"   | "2015-01-03 22:02:50.935" | "-2.87"  |
| 21 | "sensor3" | "2015-01-03 22:02:51.044" | "6.56"   |

Desired final result:

| timestamp                 | sensor1 | sensor2 | sensor3 | foo27   | bar_18  |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" |         | 88.4    |         |         |         |
| "2015-01-03 22:02:10.887" | 1.1704  |         |         | -3.755  |         |
| "2015-01-03 22:02:50.825" |         |         |         | -1.4    |         |
| "2015-01-03 22:02:50.833" |         |         |         |         | 545.43  |
| "2015-01-03 22:02:50.935" |         |         |         | -2.87   |         |
| "2015-01-03 22:02:51.044" |         |         | 6560.00 |         |         |

Using this:

--    CREATE EXTENSION tablefunc;
SELECT *
    FROM
        crosstab(
            'SELECT
                source."timestamp",
                source.name,
                source.value
            FROM
                public.source
            ORDER BY
                1'
            ,
            'SELECT
                DISTINCT
                source.name
            FROM
                public.source
            ORDER BY
                1'
        )
    AS
        (
            "timestamp" timestamp without time zone,
            "sensor1" character varying(32672),
            "sensor2" character varying(32672),
            "sensor3" character varying(32672),
            "foo27" character varying(32672),
            "bar_18" character varying(32672)
        )
    ;

I got this result:

| timestamp                 | sensor1 | sensor2 | sensor3 | foo27   | bar_18  |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" |         |         |         | 88.4    |         |
| "2015-01-03 22:02:10.887" |         | -3.755  | 1.1704  |         |         |
| "2015-01-03 22:02:50.825" |         | -1.4    |         |         |         |
| "2015-01-03 22:02:50.833" | 545.43  |         |         |         |         |
| "2015-01-03 22:02:50.935" |         | -2.87   |         |         |         |
| "2015-01-03 22:02:51.044" |         |         |         |         | 6.56    |

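Unfortunately,

  1. the values are not assigned to the correct columns,
  2. the columns are not dynamic; that is, the query fails as soon as the name column contains an additional entry such as "sensor4", and
  3. I don't know how to change the values of some columns (multiplication).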

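【Comments】:

  • Why varchar(32672)? Why not float or numeric?
  • You need another table whose names match the source table and which holds the scale factors and the desired column order; then drop the "as" clause.
  • @Jasen: Not invented here!
  • Wow! A blast from the past. Are you still using 9.4?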

Tags: sql postgresql pivot case crosstab


【Solution 1】:

Your query works like this:

SELECT * FROM crosstab(
  $$SELECT "timestamp", name
         , CASE name
           WHEN 'sensor3' THEN value::numeric * 1000
       --  WHEN 'sensor9' THEN value::numeric * 9000  -- add more ...
           ELSE value::numeric END AS value
    FROM   source
    ORDER  BY 1, 2$$
 ,$$SELECT unnest('{bar_18,foo27,sensor1,sensor2,sensor3}'::text[])$$
) AS (
  "timestamp" timestamp
, bar_18  numeric
, foo27   numeric
, sensor1 numeric
, sensor2 numeric
, sensor3 numeric);

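To multiply the value for selected columns, use a "simple" CASE expression. But you need to cast to a numeric type first; the example uses value::numeric.
Which raises the question: why not store the values as a numeric type to begin with?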
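As an alternative to hard-coding the factors in the CASE expression, the lookup table suggested in the comments could be sketched roughly as follows. The table name source_scale and its columns are hypothetical; the query assumes the source table from the question and that every value casts cleanly to numeric.

```sql
-- Hypothetical lookup table: one row per item that needs scaling.
CREATE TABLE source_scale (
    name   character varying(255) PRIMARY KEY,
    factor numeric NOT NULL
);

INSERT INTO source_scale VALUES ('sensor3', 1000);

-- Items without an entry in source_scale keep their value (factor 1).
SELECT s."timestamp"
     , s.name
     , s.value::numeric * COALESCE(f.factor, 1) AS value
FROM   source s
LEFT   JOIN source_scale f USING (name)
ORDER  BY 1, 2;
```

A query of this shape could then be passed as the first argument to crosstab() in place of the CASE version, so that adding a factor becomes an INSERT instead of a query change.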

You need to use the crosstab() variant with two parameters. Detailed explanation:
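A sketch of why the values in the question ended up under the wrong headings, assuming the source table from the question and an installed tablefunc extension: the two-parameter form matches categories to output columns by position, not by name, so the column definition list has to follow the order in which the category query returns the names.

```sql
-- Categories in the order crosstab() will see them. For the demo data this
-- sorts as bar_18, foo27, sensor1, sensor2, sensor3.
SELECT DISTINCT name FROM source ORDER BY 1;

-- Output columns declared in that same order, so every value lands under
-- the heading that matches its name.
SELECT *
FROM   crosstab(
         $$SELECT "timestamp", name, value FROM source ORDER BY 1, 2$$
       , $$SELECT DISTINCT name FROM source ORDER BY 1$$
       ) AS ct (
         "timestamp" timestamp
       , bar_18  text
       , foo27   text
       , sensor1 text
       , sensor2 text
       , sensor3 text);
```

Spelling the categories out as an array, as in the query above, has the advantage that the order no longer depends on the table contents or the collation.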

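A truly dynamic crosstab is next to impossible, because SQL requires the result type to be known in advance, at call time at the latest. But you can do something with polymorphic types.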
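One way the polymorphic idea can look, as a rough sketch rather than a definitive implementation: a wrapper function takes a registered row type as an anyelement parameter and builds the column definition list from the system catalog. The names pivot_row and crosstab_typed are hypothetical, the sensor3 scaling is left out for brevity, and the row type still has to be maintained by hand, so this only moves the column list out of the call.

```sql
-- Hypothetical row type describing the pivoted result: one column per item,
-- in the same order as the categories passed to crosstab().
CREATE TYPE pivot_row AS (
    "timestamp" timestamp
  , bar_18  numeric
  , foo27   numeric
  , sensor1 numeric
  , sensor2 numeric
  , sensor3 numeric
);

CREATE OR REPLACE FUNCTION crosstab_typed(_qry text, _cat_qry text, _rowtype anyelement)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Build "col1 type1, col2 type2, ..." from the attributes of the row type
   -- and use it as the column definition list for crosstab().
   RETURN QUERY EXECUTE (
      SELECT format('SELECT * FROM crosstab(%L, %L) AS t(%s)'
                  , _qry
                  , _cat_qry
                  , string_agg(quote_ident(attname) || ' ' || atttypid::regtype
                             , ', ' ORDER BY attnum))
      FROM   pg_attribute
      WHERE  attrelid = pg_typeof(_rowtype)::text::regclass
      AND    attnum > 0
      AND    NOT attisdropped
   );
END
$func$;

-- The row type is passed as a typed NULL; only its type matters.
SELECT *
FROM   crosstab_typed(
         $$SELECT "timestamp", name, value::numeric FROM source ORDER BY 1, 2$$
       , $$SELECT unnest('{bar_18,foo27,sensor1,sensor2,sensor3}'::text[])$$
       , NULL::pivot_row);
```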

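【Comments】:

    【Solution 2】:

    @Erwin: As a comment, this was "too long by 7128 characters"! Anyway:

    Your post pointed me in the right direction, thank you very much, but in my particular case I need it to be truly dynamic. Currently I have 38886 rows with 49 different items (= columns to be pivoted).

    To answer your and @Jasen's pressing question first: the source table layout is not up to me, and I am already very happy to have this data in an RDBMS at all. If it were up to me, I would always store UTC timestamps! But there is also a reason for storing the data as strings: it may contain various data types, such as boolean, integer, float, string, etc.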
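Because the strings may hold non-numeric data, a plain cast would raise an error on the first boolean or free-text value. A minimal sketch of a guarded cast follows; the regular expression is an assumption about what counts as numeric here (optional sign, digits, optional decimal part) and does not accept exponent notation.

```sql
-- Cast only strings that look like plain decimal numbers; everything else
-- becomes NULL instead of raising an error.
SELECT raw_value
     , CASE WHEN raw_value ~ '^[-+]?[0-9]+(\.[0-9]+)?$'
            THEN raw_value::double precision
       END AS numeric_value
FROM  (VALUES ('88.4'), ('-3.755'), ('true'), ('n/a')) AS demo(raw_value);
```

The same CASE guard could be wrapped around the casts in a processing view, so that non-numeric items simply show up as NULL in the pivot.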

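    To avoid further confusion, I created a new demo dataset. Every identifier is prefixed with its kind (table_, column_, view_; I know some people hate this!) to avoid problems with keywords, and the timestamps are changed (--> minutes) for a better overview: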

    --  --------------------------------------------------------------------------
    --  Create demo table of given schema and insert arbitrary data
    --  --------------------------------------------------------------------------
    
        DROP TABLE IF EXISTS table_source;
    
        CREATE TABLE table_source
        (
            column_id BIGINT NOT NULL,
            column_name CHARACTER VARYING(255),
            column_timestamp TIMESTAMP WITHOUT TIME ZONE,
            column_value CHARACTER VARYING(32672),
            CONSTRAINT table_source_pkey PRIMARY KEY (column_id)
        );
    
        INSERT INTO table_source VALUES ( 15,'sensor2','2015-01-03 22:01:05.872','88.4');
        INSERT INTO table_source VALUES ( 16,'foo27' ,'2015-01-03 22:02:10.887','-3.755');
        INSERT INTO table_source VALUES ( 17,'sensor1','2015-01-03 22:02:10.887','1.1704');
        INSERT INTO table_source VALUES ( 18,'foo27' ,'2015-01-03 22:03:50.825','-1.4');
        INSERT INTO table_source VALUES ( 19,'bar_18','2015-01-03 22:04:50.833','545.43');
        INSERT INTO table_source VALUES ( 20,'foo27' ,'2015-01-03 22:05:50.935','-2.87');
        INSERT INTO table_source VALUES ( 21,'seNSor3','2015-01-03 22:06:51.044','6.56');
    
        SELECT * FROM table_source;
    

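    In addition, following @Erwin's suggestion, I created a view that already converts the data type. Besides being fast, it has the nice property that the required conversions can be added for known items without affecting other (new) items.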

    --  --------------------------------------------------------------------------
    --  Create view to process source data
    --  --------------------------------------------------------------------------
    
        DROP VIEW IF EXISTS view_source_processed;
    
        CREATE VIEW
            view_source_processed
        AS
            SELECT
                column_timestamp,
                column_name,
                CASE LOWER( column_name)
                    WHEN LOWER( 'sensor3') THEN CAST( column_value AS DOUBLE PRECISION) * 1000.0
                    ELSE CAST( column_value AS DOUBLE PRECISION)
                END AS column_value
            FROM
                table_source
        ;
    
        SELECT * FROM view_source_processed ORDER BY column_timestamp DESC LIMIT 100;
    

    This is the desired result of the whole exercise:

    --  --------------------------------------------------------------------------
    --  Desired result:
    --  --------------------------------------------------------------------------
    
    /*
    | column_timestamp          | bar_18  | foo27   | sensor1 | sensor2 | seNSor3 |
    |---------------------------+---------+---------+---------+---------+---------|
    | "2015-01-03 22:01:05.872" |         |         |         |    88.4 |         |
    | "2015-01-03 22:02:10.887" |         |  -3.755 |  1.1704 |         |         |
    | "2015-01-03 22:03:50.825" |         |    -1.4 |         |         |         |
    | "2015-01-03 22:04:50.833" |  545.43 |         |         |         |         |
    | "2015-01-03 22:05:50.935" |         |   -2.87 |         |         |         |
    | "2015-01-03 22:06:51.044" |         |         |         |         |    6560 |
    */
    

    This is @Erwin's solution, adapted to the new demo source data. It works perfectly as long as the items (= columns to be pivoted) don't change:

    --  --------------------------------------------------------------------------
    --  Solution by Erwin, modified for changed demo dataset:
    --  http://stackoverflow.com/a/27773730
    --  --------------------------------------------------------------------------
    
    SELECT *
        FROM
            crosstab(
                $$
                    SELECT
                        column_timestamp,
                        column_name,
                        column_value
                    FROM
                        view_source_processed
                    ORDER BY
                        1, 2
                $$
            ,
                $$
                    SELECT
                        UNNEST( '{bar_18,foo27,sensor1,sensor2,seNSor3}'::text[])
                $$
            )
        AS
            (
                column_timestamp timestamp,
                bar_18  DOUBLE PRECISION,
                foo27   DOUBLE PRECISION,
                sensor1 DOUBLE PRECISION,
                sensor2 DOUBLE PRECISION,
                seNSor3 DOUBLE PRECISION
            )
        ;
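As long as the item list is fixed, the same pivot can also be expressed without crosstab() and stored as a view, which is what the question originally asked for. This is a sketch under that assumption; the view name view_pivot_static is hypothetical, and it relies on the aggregate FILTER clause available since PostgreSQL 9.4.

```sql
CREATE OR REPLACE VIEW view_pivot_static AS
SELECT column_timestamp
     , max(column_value) FILTER (WHERE column_name = 'bar_18')  AS bar_18
     , max(column_value) FILTER (WHERE column_name = 'foo27')   AS foo27
     , max(column_value) FILTER (WHERE column_name = 'sensor1') AS sensor1
     , max(column_value) FILTER (WHERE column_name = 'sensor2') AS sensor2
       -- Quoted alias keeps the mixed-case spelling in the result heading.
     , max(column_value) FILTER (WHERE column_name = 'seNSor3') AS "seNSor3"
FROM   view_source_processed
GROUP  BY column_timestamp;

SELECT * FROM view_pivot_static ORDER BY column_timestamp;
```

If several rows share the same timestamp and name, max() keeps the largest value; a different aggregate may be more appropriate depending on the data.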
    

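    While reading the links provided by @Erwin, I found a dynamic SQL example by @Clodoaldo Neto and remembered that I had already done it this way in Transact-SQL; this is my attempt: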

    --  --------------------------------------------------------------------------
    --  Dynamic attempt based on:
    --  http://stackoverflow.com/a/12989297/131874
    --  --------------------------------------------------------------------------
    
    DO $DO$
    
    DECLARE
        list_columns TEXT;
    
        BEGIN
    
            DROP TABLE IF EXISTS temp_table_pivot;
    
            list_columns := (
                SELECT
                    string_agg( DISTINCT column_name, ' ' ORDER BY column_name)
                FROM
                    view_source_processed
            );
    
            EXECUTE(
                FORMAT(
                    $format_1$
                    CREATE TEMP TABLE
                        temp_table_pivot(
                            column_timestamp TIMESTAMP,
                            %1$s
                        )
                    $format_1$
                ,
                    (
                        REPLACE(
                            list_columns,
                            ' ',
                            ' DOUBLE PRECISION, '
                        ) || ' DOUBLE PRECISION'
                    )
                )
            );
    
            EXECUTE(
                FORMAT(
                    $format_2$
                        INSERT INTO temp_table_pivot
                            SELECT
                                *
                            FROM crosstab(
                                $crosstab_1$
                                SELECT
                                    column_timestamp,
                                    column_name,
                                    column_value
                                FROM
                                    view_source_processed
                                ORDER BY
                                    column_timestamp, column_name
                                $crosstab_1$
                            ,
                                $crosstab_2$
                                SELECT DISTINCT
                                    column_name
                                FROM
                                    view_source_processed
                                ORDER BY
                                    column_name
                                $crosstab_2$
                            )
                            AS
                            (
                                column_timestamp TIMESTAMP,
                                %1$s
                            );
                    $format_2$
                ,
                    REPLACE( list_columns, ' ', ' DOUBLE PRECISION, ')
                    ||
                    ' DOUBLE PRECISION'
                )
            );
    
        END;
    
    $DO$;
    
    SELECT * FROM temp_table_pivot ORDER BY column_timestamp DESC LIMIT 100;
    

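    Apart from putting this into a stored procedure, for performance reasons I will try to apply it to an intermediate table into which only new values are inserted. I'll keep you posted!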
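A rough sketch of what the stored-procedure version might look like, assuming view_source_processed from above and an installed tablefunc extension. The function name refresh_temp_table_pivot is hypothetical. Compared with the DO block, the column list is built with format('%I ...'), so item names containing spaces or upper-case letters are quoted properly, and the temp table is created directly from the crosstab() result.

```sql
CREATE OR REPLACE FUNCTION refresh_temp_table_pivot()
  RETURNS void
  LANGUAGE plpgsql AS
$func$
DECLARE
    column_defs text;
BEGIN
    -- One quoted column definition per distinct item, sorted the same way
    -- as the category query passed to crosstab() below.
    -- Assumes column_name is never NULL and at least one item exists.
    SELECT string_agg(format('%I DOUBLE PRECISION', column_name), ', '
                      ORDER BY column_name)
    INTO   column_defs
    FROM  (SELECT DISTINCT column_name FROM view_source_processed) AS items;

    DROP TABLE IF EXISTS temp_table_pivot;

    -- %L embeds the two queries as string literals, %s the column list.
    EXECUTE format(
        'CREATE TEMP TABLE temp_table_pivot AS
         SELECT * FROM crosstab(%L, %L)
         AS ct (column_timestamp timestamp, %s)',
        'SELECT column_timestamp, column_name, column_value
         FROM view_source_processed ORDER BY 1, 2',
        'SELECT DISTINCT column_name FROM view_source_processed ORDER BY 1',
        column_defs
    );
END
$func$;

SELECT refresh_temp_table_pivot();
SELECT * FROM temp_table_pivot ORDER BY column_timestamp DESC LIMIT 100;
```

The temp table is a snapshot, so the function has to be called again whenever new items or rows arrive.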

    Thanks!!!

    L.

    PS: No, I didn't want to answer my own question, but the "comment" field is too small!

    【Comments】:
