|
| 1 | +# How many tuples can be inserted in a page |
| 2 | + |
| 3 | +> 我每天都会发布一篇新的 PostgreSQL "howto" 文章。加入我的旅程吧 — 订阅、提供反馈、分享! |
| 4 | +
|
| 5 | +在 Postgres 中,所有表都有隐藏的系统列,`ctid` 便是其中之一。通过读取 `ctid`,我们可以看到元组 (元组 = 行的物理版本) 的物理位置,包括页号以及页内偏移量: |
| 6 | + |
| 7 | +```sql |
| 8 | +nik=# create table t0 as select 1 as id; |
| 9 | +SELECT 1 |
| 10 | + |
| 11 | +nik=# select ctid, id from t0; |
| 12 | + ctid | id |
| 13 | +-------+---- |
| 14 | + (0,1) | 1 |
| 15 | +(1 row) |
| 16 | +``` |
| 17 | + |
| 18 | +👉 第 0 页,1 位置处。 |
| 19 | + |
| 20 | +单个 PostgreSQL 页面默认为 8 KiB,可以通过查看 `block_size` 来确认: |
| 21 | + |
| 22 | +```sql |
| 23 | +nik=# show block_size; |
| 24 | + block_size |
| 25 | +------------ |
| 26 | + 8192 |
| 27 | +(1 row) |
| 28 | +``` |
| 29 | + |
| 30 | +一个页面中可以容纳多少个元组?来看一下: |
| 31 | + |
| 32 | +```sql |
| 33 | +nik=# create table t0 as select i |
| 34 | +from generate_series(1, 1000) as i; |
| 35 | +SELECT 1000 |
| 36 | + |
| 37 | +nik=# select count(*) |
| 38 | +from t0 |
| 39 | +where (ctid::text::point)[0] = 0; |
| 40 | + count |
| 41 | +------- |
| 42 | + 226 |
| 43 | +(1 row) |
| 44 | + |
| 45 | +nik=# select pg_column_size(i) from t0 limit 1; |
| 46 | + pg_column_size |
| 47 | +---------------- |
| 48 | + 4 |
| 49 | +(1 row) |
| 50 | +``` |
| 51 | + |
| 52 | +👉 如果我们使用 4 字节的数字,那么可以容纳 226 条元组。此处我使用 `(ctid::text::point)[0]` 将 `ctid` 的值转换为"point"来获取第一个组成部分,即页号。 |
| 53 | + |
| 54 | +即使使用 2 字节的数字或 1 字节的布尔值 (注意,布尔值需要 1 字节,而不是 1 比特),这个数量也是相同的: |
| 55 | + |
| 56 | +```sql |
| 57 | +nik=# drop table t0; |
| 58 | +DROP TABLE |
| 59 | + |
| 60 | +nik=# create table t0 as select true |
| 61 | +from generate_series(1, 1000) as i; |
| 62 | +SELECT 1000 |
| 63 | + |
| 64 | +nik=# select count(*) |
| 65 | + from t0 |
| 66 | + where (ctid::text::point)[0] = 0; |
| 67 | + count |
| 68 | +------- |
| 69 | + 226 |
| 70 | +(1 row) |
| 71 | +``` |
| 72 | + |
| 73 | +为什么还是 226?事实上,值的大小在这里无关紧要,只要小于或等于 8 字节即可。对于每一行,对齐填充都会添加"零",因此每行始终有 8 个字节: |
| 74 | +$$ |
| 75 | +\frac{8192 - 24}{4 + 24 + 8} = 226 |
| 76 | +$$ |
| 77 | +👉 这里我们统计了以下内容: |
| 78 | + |
| 79 | +- 24 字节的页头 (`PageHeaderData`)。 |
| 80 | +- 每个元组指针 — 每个 4 字节 (`ItemIdData`)。 |
| 81 | +- 每个元组头 — 每个 23 字节,填充到 24 字节 (`HeapTupleHeaderData`)。 |
| 82 | +- 每个元组值 — 如果 ≤ 8字节,则填充到 8 字节。 |
| 83 | + |
| 84 | +源码定义了这些结构 (for [PG16](https://github.com/postgres/postgres/blob/REL_16_STABLE/src/include/storage/bufpage.h))。 |
| 85 | + |
| 86 | +**我们能容纳更多元组吗?** |
| 87 | + |
| 88 | +答案是可以的。Postgres 允许创建没有列的表!在这种情况下,计算如下: |
| 89 | +$$ |
| 90 | +\frac{8192 - 24}{4 + 24} = 291 |
| 91 | +$$ |
| 92 | +让我们观察一下 (注意 `SELECT` 子句中的空列): |
| 93 | + |
| 94 | +```sql |
| 95 | +nik=# create table t0 as select |
| 96 | +from generate_series(1, 1000) as i; |
| 97 | +SELECT 1000 |
| 98 | + |
| 99 | +nik=# select count(*) |
| 100 | +from t0 |
| 101 | +where (ctid::text::point)[0] = 0; |
| 102 | + count |
| 103 | +------- |
| 104 | + 291 |
| 105 | +(1 row) |
| 106 | +``` |
| 107 | + |
| 108 | +## 我见 |
| 109 | + |
| 110 | +以 0️⃣ 填充: |
| 111 | + |
| 112 | + |
| 113 | + |
| 114 | +>Here we only need 40 bytes per row excluding the variable sized data and 24-byte tuple header. 8 bytes being saved may not sound like much, but for tables as large as the events table it does begin to matter. For example, when storing 80 000 000 rows this translates to a space saving of at least 610 MB, all by just changing the order of a few columns. |
0 commit comments