【Question title】: Table bloat on Postgres
【Posted】: 2019-12-23 14:07:51
【Question】:

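I have a small (~200GB) data warehouse running on Postgres 9.5.15 on an AWS RDS instance.

For the sake of robustness, I insert new data into the analytical schemas (the results of ELT) like this:

  1. Insert the new slice
  2. Delete the old slice with a DELETE command
  3. Vacuum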
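
For concreteness, the three steps above could be sketched as follows. The table names and the load_id column are hypothetical; the question does not show the actual pipeline.

```sql
-- Hypothetical objects: analytics.daily_sales, staging.daily_sales, load_id.
BEGIN;

-- 1. Insert the new slice.
INSERT INTO analytics.daily_sales
SELECT * FROM staging.daily_sales WHERE load_id = 42;

-- 2. Delete the old slice. The rows become dead tuples, not free space.
DELETE FROM analytics.daily_sales WHERE load_id = 41;

COMMIT;

-- 3. Vacuum. VACUUM cannot run inside a transaction block,
--    so it has to come after the COMMIT.
VACUUM (VERBOSE, ANALYZE) analytics.daily_sales;
```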

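I know that Postgres soft-deletes tuples when you run a DELETE or UPDATE command. Given the table sizes, that is not a problem in itself. The problem is that the dead tuples are removed neither by the explicit vacuum in step (3) nor by regular autovacuum. So if the pipeline runs many times, I end up with huge table bloat, which badly hurts table performance, not to mention the wasted storage.

Moreover, when I started investigating, I found that even the system tables have this problem: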

schemaname | relname      | n_live_tup | n_dead_tup | ratio%
pg_catalog | pg_attribute |      46081 |    8339587 |  18097
pg_catalog | pg_depend    |      27375 |    2490507 |   9097
pg_catalog | pg_statistic |      20094 |    1208474 |   6013
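
The question does not show the query behind these numbers. A minimal sketch that yields the same columns from the statistics collector views, assuming the ratio is dead tuples as a percentage of live tuples, might be:

```sql
-- Dead vs. live tuples per table, system catalogs included.
-- NULLIF avoids division by zero for tables with no live rows.
SELECT schemaname,
       relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup, 0)) AS "ratio%"
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

These counters are estimates maintained by the statistics collector, so they can lag behind the real state of the table.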

This may degrade the overall performance of the instance in ways I can't even imagine. When I try to run VACUUM FULL VERBOSE pg_catalog.pg_attribute, it gives me this:

"pg_attribute": found 0 removable, 8387117 nonremovable row versions in 152494 pages

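I have read articles such as "3 reasons for table bloat", but none of the reasons apply (I don't use replication, I have no hanging transactions, etc.). I could use something like pg_repack to remove the bloat on a schedule, but I want to understand why it happens. Also, I definitely don't want to repack system tables.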
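
The usual suspects that keep dead tuples "nonremovable" can be double-checked with the queries below. This is only a diagnostic sketch; all views and columns used here exist in Postgres 9.5.

```sql
-- Sessions that hold a snapshot or a transaction ID, oldest first.
-- An old "idle in transaction" session is enough to block cleanup.
SELECT pid, state, xact_start, backend_xid, backend_xmin,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL OR backend_xid IS NOT NULL
ORDER BY xact_start;

-- Prepared (two-phase) transactions that were never finished.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- Replication slots that pin an old xmin, even if nothing consumes them.
SELECT slot_name, active, xmin, catalog_xmin
FROM pg_replication_slots;
```

If all three come back empty or recent, the cause is more likely elsewhere.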

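My only hypothesis is that vacuum needs to fit all dead tuples into memory, which is limited by the maintenance_work_mem setting (127MB on our instance), and that we need to increase it, but I'd like a second opinion first.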
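
One way to test this hypothesis is to raise the setting for a single session and re-run the vacuum. The value below is illustrative. Note that when the dead-tuple list exceeds maintenance_work_mem, vacuum works in several passes rather than skipping tuples, so this is a check, not a known fix.

```sql
-- Current value for this session.
SHOW maintenance_work_mem;

-- Session-level override; 512MB is an arbitrary example value.
SET maintenance_work_mem = '512MB';

-- Re-run and compare the "removable" / "nonremovable" counts in the output.
VACUUM VERBOSE pg_catalog.pg_attribute;
```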

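【Comments】:

  • Regular DML statements do not affect pg_attribute or pg_depend. Do you create and drop (temp) tables a lot? That is typically one of the reasons the system catalogs get bloated. If the VACUUM FULL didn't remove anything, it means you have at least one open transaction that used the table(s) to which the attributes belong.
  • I do, as part of the ELT pipeline: a single table insert needs something like 3 temp tables before the main insert, and this happens every 3 hours. As for transactions, I'm checking pg_stat_activity and nothing points to these tables.
  • ""pg_attribute": found 0 removable, 8387117 nonremovable row versions in 152494 pages": nonremovable includes both live and recently dead rows. What do you get if you just do a select count(*) from pg_attribute?
  • Did your verbose vacuum produce a line like "DETAIL: 900000 dead row versions cannot be removed yet"?
  • @ChrisSekas In the end I settled on repacking the tables on a schedule. I've added an answer here.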

Tags: postgresql database-administration soft-delete


【Solution 1】:

I ended up writing my own functions to repack the data, and I run them on a schedule:

-- Repack an individual table by copying its rows out and back in.
-- $1 is a schema-qualified table name. It is concatenated into the SQL as-is,
-- so pass trusted input only. TRUNCATE takes an ACCESS EXCLUSIVE lock.
CREATE OR REPLACE FUNCTION admin.repack_table(text)
RETURNS text
AS $$
DECLARE
    stmt text;
BEGIN

    SELECT
     'CREATE TEMP TABLE t1 (LIKE '||$1||');'||chr(10)||
     'INSERT INTO t1 SELECT * FROM '||$1||';'||chr(10)||
     'TRUNCATE TABLE '||$1||';'||chr(10)||
     'INSERT INTO '||$1||' SELECT * FROM t1;'||chr(10)||
     'DROP TABLE t1;'||chr(10)||
     'ANALYZE '||$1||';'
    INTO stmt;
    EXECUTE stmt;
    RETURN $1;
END;
$$ LANGUAGE plpgsql;

-- Repack all tables in a given schema whose dead-tuple count exceeds
-- an optional threshold (default 5000).
CREATE OR REPLACE FUNCTION admin.repack_schema(text, int DEFAULT 5000)
RETURNS TABLE (table_name text)
AS $$
BEGIN
RETURN QUERY (
    WITH
     schema AS (SELECT $1)
    SELECT admin.repack_table(t.table_schema||'.'||t.table_name)
    FROM information_schema.tables t
    WHERE t.table_schema = (SELECT * FROM schema)
    AND t.table_name IN (
        SELECT relname
        FROM pg_stat_all_tables
        WHERE schemaname = (SELECT * FROM schema)
        AND n_dead_tup > $2
        AND n_live_tup < 1000000 -- skip tables that are too large to copy
    )
);
END;
$$ LANGUAGE plpgsql;
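
A usage sketch for the two functions above. The schema and table names are hypothetical, and the answer does not say which scheduler is used.

```sql
-- Repack every table in schema "analytics" that has more than 10000 dead tuples.
-- Returns one row per repacked table.
SELECT * FROM admin.repack_schema('analytics', 10000);

-- Repack a single table, using the schema-qualified name.
SELECT admin.repack_table('analytics.daily_sales');
```

Keep in mind that each call creates and drops a temp table, which itself writes rows to pg_attribute and the other system catalogs.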

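【Discussion】:

    【Solution 2】:

    Bloat score query

    The following SQL query examines each table in the public schema and identifies dead rows (tuples) that are wasting disk space.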

    SELECT schemaname || '.' || relname AS tblnam,
        n_dead_tup,
        (n_dead_tup::float / n_live_tup::float) * 100 AS pfrag
    FROM pg_stat_user_tables
    WHERE schemaname = 'public' AND n_dead_tup > 0 AND n_live_tup > 0
    ORDER BY pfrag DESC;
    

    If this query returns a high percentage (pfrag) of dead tuples, the VACUUM command can be used to reclaim the space.

    A pfrag of 7 is considered high.
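
    A minimal sketch of the VACUUM step, assuming a hypothetical table public.my_table:

    ```sql
    -- Plain VACUUM marks dead tuples as reusable but normally does not
    -- shrink the file on disk. ANALYZE refreshes planner statistics.
    VACUUM (VERBOSE, ANALYZE) public.my_table;

    -- VACUUM FULL rewrites the table and returns space to the operating
    -- system, but holds an ACCESS EXCLUSIVE lock while it runs.
    VACUUM FULL public.my_table;
    ```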


    From wiki.postgres.org

    SELECT
      current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
      ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::float/otta END)::numeric,1) AS tbloat,
      CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
      iname, /*ituples::bigint, ipages::bigint, iotta,*/
      ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::float/iotta END)::numeric,1) AS ibloat,
      CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
    FROM (
      SELECT
        schemaname, tablename, cc.reltuples, cc.relpages, bs,
        CEIL((cc.reltuples*((datahdr+ma-
          (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
        COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
        COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
      FROM (
        SELECT
          ma,bs,schemaname,tablename,
          (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
          (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
        FROM (
          SELECT
            schemaname, tablename, hdr, ma, bs,
            SUM((1-null_frac)*avg_width) AS datawidth,
            MAX(null_frac) AS maxfracsum,
            hdr+(
              SELECT 1+count(*)/8
              FROM pg_stats s2
              WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
            ) AS nullhdr
          FROM pg_stats s, (
            SELECT
              (SELECT current_setting('block_size')::numeric) AS bs,
              CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
              CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
            FROM (SELECT version() AS v) AS foo
          ) AS constants
          GROUP BY 1,2,3,4,5
        ) AS foo
      ) AS rs
      JOIN pg_class cc ON cc.relname = rs.tablename
      JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
      LEFT JOIN pg_index i ON indrelid = cc.oid
      LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
    ) AS sml
    ORDER BY wastedbytes DESC
    

    Wasted disk space in human-readable form

    WITH constants AS (
      SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 4 AS ma
    ), bloat_info AS (
      SELECT
        ma,bs,schemaname,tablename,
        (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
        (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
      FROM (
        SELECT
          schemaname, tablename, hdr, ma, bs,
          SUM((1-null_frac)*avg_width) AS datawidth,
          MAX(null_frac) AS maxfracsum,
          hdr+(
            SELECT 1+count(*)/8
            FROM pg_stats s2
            WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
          ) AS nullhdr
        FROM pg_stats s, constants
        GROUP BY 1,2,3,4,5
      ) AS foo
    ), table_bloat AS (
      SELECT
        schemaname, tablename, cc.relpages, bs,
        CEIL((cc.reltuples*((datahdr+ma-
          (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
      FROM bloat_info
      JOIN pg_class cc ON cc.relname = bloat_info.tablename
      JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
    ), index_bloat AS (
      SELECT
        schemaname, tablename, bs,
        COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
        COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
      FROM bloat_info
      JOIN pg_class cc ON cc.relname = bloat_info.tablename
      JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
      JOIN pg_index i ON indrelid = cc.oid
      JOIN pg_class c2 ON c2.oid = i.indexrelid
    )
    SELECT
      type, schemaname, object_name, bloat, pg_size_pretty(raw_waste) as waste
    FROM
    (SELECT
      'table' as type,
      schemaname,
      tablename as object_name,
      ROUND(CASE WHEN otta=0 THEN 0.0 ELSE table_bloat.relpages/otta::numeric END,1) AS bloat,
      CASE WHEN relpages < otta THEN '0' ELSE (bs*(table_bloat.relpages-otta)::bigint)::bigint END AS raw_waste
    FROM
      table_bloat
        UNION
    SELECT
      'index' as type,
      schemaname,
      tablename || '::' || iname as object_name,
      ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS bloat,
      CASE WHEN ipages < iotta THEN '0' ELSE (bs*(ipages-iotta))::bigint END AS raw_waste
    FROM
      index_bloat) bloat_summary
    ORDER BY raw_waste DESC, bloat DESC
    

    【Discussion】:
