Skip to content

Commit 14215d1

Browse files
author
xiongcc
committed
添加第66章:How many tuples can be inserted in a page
1 parent e9ed02a commit 14215d1

File tree

1 file changed

+114
-0
lines changed

1 file changed

+114
-0
lines changed
Lines changed: 114 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,114 @@
1+
# How many tuples can be inserted in a page
2+
3+
> 我每天都会发布一篇新的 PostgreSQL "howto" 文章。加入我的旅程吧 — 订阅、提供反馈、分享!
4+
5+
在 Postgres 中,所有表都有隐藏的系统列,`ctid` 便是其中之一。通过读取 `ctid`,我们可以看到元组 (元组 = 行的物理版本) 的物理位置,包括页号以及页内偏移量:
6+
7+
```sql
8+
nik=# create table t0 as select 1 as id;
9+
SELECT 1
10+
11+
nik=# select ctid, id from t0;
12+
ctid | id
13+
-------+----
14+
(0,1) | 1
15+
(1 row)
16+
```
17+
18+
👉 第 0 页,1 位置处。
19+
20+
单个 PostgreSQL 页面默认为 8 KiB,可以通过查看 `block_size` 来确认:
21+
22+
```sql
23+
nik=# show block_size;
24+
block_size
25+
------------
26+
8192
27+
(1 row)
28+
```
29+
30+
一个页面中可以容纳多少个元组?来看一下:
31+
32+
```sql
33+
nik=# create table t0 as select i
34+
from generate_series(1, 1000) as i;
35+
SELECT 1000
36+
37+
nik=# select count(*)
38+
from t0
39+
where (ctid::text::point)[0] = 0;
40+
count
41+
-------
42+
226
43+
(1 row)
44+
45+
nik=# select pg_column_size(i) from t0 limit 1;
46+
pg_column_size
47+
----------------
48+
4
49+
(1 row)
50+
```
51+
52+
👉 如果我们使用 4 字节的数字,那么可以容纳 226 条元组。此处我使用 `(ctid::text::point)[0]``ctid` 的值转换为"point"来获取第一个组成部分,即页号。
53+
54+
即使使用 2 字节的数字或 1 字节的布尔值 (注意,布尔值需要 1 字节,而不是 1 比特),这个数量也是相同的:
55+
56+
```sql
57+
nik=# drop table t0;
58+
DROP TABLE
59+
60+
nik=# create table t0 as select true
61+
from generate_series(1, 1000) as i;
62+
SELECT 1000
63+
64+
nik=# select count(*)
65+
from t0
66+
where (ctid::text::point)[0] = 0;
67+
count
68+
-------
69+
226
70+
(1 row)
71+
```
72+
73+
为什么还是 226?事实上,值的大小在这里无关紧要,只要小于或等于 8 字节即可。对于每一行,对齐填充都会添加"零",因此每行始终有 8 个字节:
74+
$$
75+
\frac{8192 - 24}{4 + 24 + 8} = 226
76+
$$
77+
👉 这里我们统计了以下内容:
78+
79+
- 24 字节的页头 (`PageHeaderData`)。
80+
- 每个元组指针 — 每个 4 字节 (`ItemIdData`)。
81+
- 每个元组头 — 每个 23 字节,填充到 24 字节 (`HeapTupleHeaderData`)。
82+
- 每个元组值 — 如果 ≤ 8字节,则填充到 8 字节。
83+
84+
源码定义了这些结构 (for [PG16](https://github.com/postgres/postgres/blob/REL_16_STABLE/src/include/storage/bufpage.h))。
85+
86+
**我们能容纳更多元组吗?**
87+
88+
答案是可以的。Postgres 允许创建没有列的表!在这种情况下,计算如下:
89+
$$
90+
\frac{8192 - 24}{4 + 24} = 291
91+
$$
92+
让我们观察一下 (注意 `SELECT` 子句中的空列):
93+
94+
```sql
95+
nik=# create table t0 as select
96+
from generate_series(1, 1000) as i;
97+
SELECT 1000
98+
99+
nik=# select count(*)
100+
from t0
101+
where (ctid::text::point)[0] = 0;
102+
count
103+
-------
104+
291
105+
(1 row)
106+
```
107+
108+
## 我见
109+
110+
以 0️⃣ 填充:
111+
112+
![image-20241030122538478](https://gitee.com/xiongcccc/internals_image/raw/master/docsfiy/image-20241030122538478.png)
113+
114+
>Here we only need 40 bytes per row excluding the variable sized data and 24-byte tuple header. 8 bytes being saved may not sound like much, but for tables as large as the events table it does begin to matter. For example, when storing 80 000 000 rows this translates to a space saving of at least 610 MB, all by just changing the order of a few columns.

0 commit comments

Comments
 (0)