【Question title】: How to find the size of a database, schema, or table in Redshift
【Posted】: 2014-03-13 03:08:06
【Question】:

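Team,

My Redshift version is:

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.735

How do I find the database size, tablespace size, schema size, and table size?

The following do not work on Redshift (the version above):

SELECT pg_database_size('db_name');
SELECT pg_size_pretty( pg_relation_size('table_name') );

Is there an alternative, similar to querying DBA_SEGMENTS in Oracle?

For table size I have the query below, but I am not sure what MBYTES means exactly. For the third row, MBYTES = 372. Does that mean 372 MB?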

select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema,
trim(a.name) as Table, b.mbytes, a.rows
from ( select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, count(*) as mbytes
from stv_blocklist group by tbl) b on a.id=b.tbl
order by a.db_id, a.name;
   database    |    schema    |      table       | mbytes |   rows
---------------+--------------+------------------+--------+----------
      postgres | public       | company          |      8 |        1
      postgres | public       | table_data1_1    |      7 |        1
      postgres | proj_schema1 | table_data1      |    372 | 33867540
      postgres | public       | table_data1_2    |     40 |  2000001

(4 rows)

【Comments】:

  • select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema, trim(a.name) as Table, b.mbytes, a.rows from (select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a join pg_class as pgc on pgc.oid = a.id join pg_namespace as pgn on pgn.oid = pgc.relnamespace join pg_database as pgdb on pgdb.oid = a.db_id join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl order by a.db_id, a.name; What exactly does mbytes mean in this query?

Tags: amazon-web-services amazon-redshift


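【Solution 1】:

SVV_TABLE_INFO is a Redshift system view that shows information about user-defined tables (not other system tables) in a Redshift database. The view is visible only to superusers.

To get the size of each table, run the following on your Redshift cluster: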

SELECT "table", size, tbl_rows 
FROM SVV_TABLE_INFO
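  • The table column is the table name.
  • The size column is the table size in MB (counted in 1 MB data blocks).
  • The tbl_rows column is the total number of rows in the table, including rows marked for deletion but not yet vacuumed.

Source

See the Redshift documentation for SVV_TABLE_INFO for other interesting columns you can retrieve from this view.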
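
Since the question also asks about schema and database totals, the same view can be rolled up one level. This is a minimal sketch, assuming the "database", "schema", size, and tbl_rows columns of SVV_TABLE_INFO; the output aliases (table_count, size_mb, size_gb, total_rows) are illustrative names, not part of the view. As far as I know the view covers only the database you are connected to and does not list empty tables, so treat the totals as approximate.

```sql
-- Sketch: per-schema totals from SVV_TABLE_INFO (assumption: size is in 1 MB blocks).
SELECT "database",
       "schema",
       COUNT(*)                     AS table_count,  -- tables listed by the view
       SUM(size)                    AS size_mb,      -- total 1 MB blocks
       ROUND(SUM(size) / 1024.0, 2) AS size_gb,
       SUM(tbl_rows)                AS total_rows    -- includes rows not yet vacuumed
FROM svv_table_info
GROUP BY "database", "schema"
ORDER BY size_mb DESC;
```

Dropping "schema" from both the SELECT list and the GROUP BY should give a single total for the connected database.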

【Comments】:

    【Solution 2】:

    This query is much simpler:

    -- List the 30 largest tables in the cluster

    SELECT 
     "schema"
    ,"table"  AS table_name
    ,ROUND((size/1024.0),2) AS "Size in Gigabytes"
    ,pct_used AS "Physical Disk Used by This Table"
    FROM svv_table_info
    ORDER BY pct_used DESC
    LIMIT 30;
    

    【Comments】:

      【Solution 3】:

      The answers above do not always report the space used correctly. AWS Support provided this query instead:

      SELECT   TRIM(pgdb.datname) AS Database,
               TRIM(a.name) AS Table,
               ((b.mbytes/part.total::decimal)*100)::decimal(5,2) AS pct_of_total,
               b.mbytes,
               b.unsorted_mbytes
      FROM     stv_tbl_perm a
      JOIN     pg_database AS pgdb
        ON     pgdb.oid = a.db_id
      JOIN     ( SELECT   tbl,
                          SUM( DECODE(unsorted, 1, 1, 0)) AS unsorted_mbytes,
                          COUNT(*) AS mbytes
                 FROM     stv_blocklist
                 GROUP BY tbl ) AS b
             ON a.id = b.tbl
      JOIN     ( SELECT SUM(capacity) AS total
                 FROM   stv_partitions
                 WHERE  part_begin = 0 ) AS part
            ON 1 = 1
      WHERE    a.slice = 0
      ORDER BY 4 desc, db_id, name;
      
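
      In the query above, mbytes is COUNT(*) over all of a table's blocks, while unsorted_mbytes counts only the blocks whose unsorted flag is 1. The unsorted figure is therefore already contained in mbytes and is not an extra amount to add on top. A minimal sketch that makes the split explicit, using only the stv_blocklist columns that appear in the query above (sorted_mbytes is a hypothetical alias added for illustration):

      ```sql
      -- Sketch: split each table's blocks into sorted and unsorted parts.
      -- Assumption: one row in stv_blocklist corresponds to one 1 MB block.
      SELECT tbl,
             COUNT(*)                                  AS mbytes,           -- all blocks
             SUM(DECODE(unsorted, 1, 1, 0))            AS unsorted_mbytes,  -- subset of mbytes
             COUNT(*) - SUM(DECODE(unsorted, 1, 1, 0)) AS sorted_mbytes     -- remainder
      FROM stv_blocklist
      GROUP BY tbl
      ORDER BY mbytes DESC;
      ```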

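      【Comments】:

      • This is the one I am using.
      • Does this query intentionally filter on a single slice? WHERE a.slice = 0
      • @imcdnzl: What is unsorted_mbytes? When you calculate the total space, do you need to add mbytes and unsorted_mbytes together?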
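      【Solution 4】:

      A modified version of one of the other answers. It includes the database name, schema name, table name, total row count, size on disk, and unsorted size: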

      -- sort by row count
      select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
          c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
          from stv_tbl_perm a
          join pg_class as pgtbl on pgtbl.oid = a.id
          join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
          join pg_database as pgdb on pgdb.oid = a.db_id
          join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
          join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
          join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
          where a.slice=0
          order by 4 desc, db_id, name;
      
      
      -- sort by space used
      select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
          c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
          from stv_tbl_perm a
          join pg_class as pgtbl on pgtbl.oid = a.id
          join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
          join pg_database as pgdb on pgdb.oid = a.db_id
          join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
          join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
          join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
          where a.slice=0
          order by 6 desc, db_id, name;
      

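      【Comments】:

        【Solution 5】:

        Take a look at this repository; I am sure you will find something useful there.

        https://github.com/awslabs/amazon-redshift-utils

        To answer your question, you can use the following view: https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_space_used_per_tbl.sql

        Then query it however you like. For example: select * from admin.v_space_used_per_tbl;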

        【Comments】:

          【Solution 6】:

          This is what I use (change the database name from 'mydb' to your own database name):

          SELECT CAST(use2.usename AS VARCHAR(50)) AS OWNER
           ,TRIM(pgdb.datname) AS DATABASE
           ,TRIM(pgn.nspname) AS SCHEMA
           ,TRIM(a.NAME) AS TABLE
           ,b.mbytes / 1024.0 AS Gigabytes
           ,a.ROWS
          FROM (
           SELECT db_id
           ,id
           ,NAME
           ,SUM(ROWS) AS ROWS
           FROM stv_tbl_perm a
           GROUP BY db_id
           ,id
           ,NAME
           ) AS a
          JOIN pg_class AS pgc ON pgc.oid = a.id
          LEFT JOIN pg_user use2 ON (pgc.relowner = use2.usesysid)
          JOIN pg_namespace AS pgn ON pgn.oid = pgc.relnamespace
           AND pgn.nspowner > 1
          JOIN pg_database AS pgdb ON pgdb.oid = a.db_id
          JOIN (
           SELECT tbl
           ,COUNT(*) AS mbytes
           FROM stv_blocklist
           GROUP BY tbl
           ) b ON a.id = b.tbl
          WHERE pgdb.datname = 'mydb'
          ORDER BY mbytes DESC
           ,a.db_id
           ,a.NAME;
          

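          Source: https://aboutdatabases.wordpress.com/2015/01/24/amazon-redshift-how-to-get-the-sizes-of-all-tables/

          【Comments】:

          • This does not work. It returns only a small sample of tables, and not the largest ones.
          【Solution 7】:

          I am not sure about grouping by database and schema, but here is a short way to get usage per table: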

          SELECT tbl, name, size_mb FROM
          (
            SELECT tbl, count(*) AS size_mb
            FROM stv_blocklist
            GROUP BY tbl
          ) AS b
          LEFT JOIN
          (select distinct id, name FROM stv_tbl_perm) AS t
          ON id = tbl
          ORDER BY size_mb DESC
          LIMIT 10;
          
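
          The same block count can also be grouped by database. This is a minimal sketch that reuses only the tables and columns already seen in this thread (stv_blocklist.tbl, stv_tbl_perm.id and db_id, pg_database.oid and datname); database_name and the subquery aliases b and t are illustrative. It assumes, as the other queries here do, that a table id maps to a single database.

          ```sql
          -- Sketch: total 1 MB blocks per database.
          SELECT TRIM(pgdb.datname) AS database_name,
                 SUM(b.size_mb)     AS size_mb
          FROM (
            -- one row per table: number of 1 MB blocks it occupies
            SELECT tbl, COUNT(*) AS size_mb
            FROM stv_blocklist
            GROUP BY tbl
          ) AS b
          JOIN (
            -- stv_tbl_perm has one row per slice, so collapse it to one row per table
            SELECT DISTINCT id, db_id
            FROM stv_tbl_perm
          ) AS t ON t.id = b.tbl
          JOIN pg_database AS pgdb ON pgdb.oid = t.db_id
          GROUP BY pgdb.datname
          ORDER BY size_mb DESC;
          ```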

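          【Comments】:

            【Solution 8】:

            Yes, the mbytes value in your example means 372 MB: the query counts rows in stv_blocklist, and each row there represents one 1 MB block. Here is what I have been using: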

            select
              cast(use2.usename as varchar(50)) as owner, 
              pgc.oid,
              trim(pgdb.datname) as Database,
              trim(pgn.nspname) as Schema,
              trim(a.name) as Table,
              b.mbytes,
              a.rows
            from 
             (select db_id, id, name, sum(rows) as rows
              from stv_tbl_perm a
              group by db_id, id, name
              ) as a
             join pg_class as pgc on pgc.oid = a.id
             left join pg_user use2 on (pgc.relowner = use2.usesysid)
             join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
                and pgn.nspowner > 1
             join pg_database as pgdb on pgdb.oid = a.db_id
             join 
               (select tbl, count(*) as mbytes
                from stv_blocklist
                group by tbl
               ) b on a.id = b.tbl
             order by mbytes desc, a.db_id, a.name; 
            

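            【Comments】:

            • This does not seem to return any tables in the public schema.
            • The "pgn.nspowner > 1" condition filters out the public schema, which may be why @SandipPingle did not get any rows back either.
            • Best answer!!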