Incremental PAGE backup: Could not read WAL record #394
This may be important too, because the backup is remote:

```
$ sudo -i -u probackup pg_probackup-11 show-config --instance pg_server
# Backup instance information
pgdata = /pgdata/postgresql/11/main
system-identifier = 6896808033732677837
xlog-seg-size = 16777216
external-dirs = /etc/postgresql/11/main
# Connection parameters
pgdatabase = probackup
pghost = pg_server
# Replica parameters
replica-timeout = 5min
# Archive parameters
archive-timeout = 5min
# Logging parameters
log-level-console = OFF
log-level-file = INFO
log-filename = pg_probackup.log
log-directory = /export/backup/log
log-rotation-size = 0TB
log-rotation-age = 0d
# Retention parameters
retention-redundancy = 2
retention-window = 30
wal-depth = 3
# Compression parameters
compress-algorithm = none
compress-level = 1
# Remote access parameters
remote-proto = ssh
remote-host = pg_server
remote-port = 22
remote-user = postgres
```
Another server's backup log
Hello!
The tail of pg_waldump output for 000000010000001E000000FA looks like:

```
$ /usr/lib/postgresql/11/bin/pg_waldump -p ./ -s 1E/FA8B9000
pg_waldump: first record is after 1E/FA8B9000, at 1E/FA8B9CE8, skipping over 3304 B
rmgr: Transaction len (rec/tot): 34/ 34, tx: 41588807, lsn: 1E/FA8B9CE8, prev 1E/FA8B86E8, desc: COMMIT 2021-06-06 01:59:21.586417 MSK
rmgr: Heap        len (rec/tot): 89/ 89, tx: 41588808, lsn: 1E/FA8B9D10, prev 1E/FA8B9CE8, desc: HOT_UPDATE off 4 xmax 41588808 ; new off 5 xmax 0, blkref #0: rel 1663/61568/3050978 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 41588808, lsn: 1E/FA8B9D70, prev 1E/FA8B9D10, desc: COMMIT 2021-06-06 01:59:21.618743 MSK
rmgr: Heap        len (rec/tot): 174/ 174, tx: 41588809, lsn: 1E/FA8B9D98, prev 1E/FA8B9D70, desc: HOT_UPDATE off 10 xmax 41588809 ; new off 16 xmax 0, blkref #0: rel 1663/61568/3043386 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 41588809, lsn: 1E/FA8B9E48, prev 1E/FA8B9D98, desc: COMMIT 2021-06-06 01:59:22.279005 MSK
rmgr: Heap        len (rec/tot): 54/ 54, tx: 41588810, lsn: 1E/FA8B9E70, prev 1E/FA8B9E48, desc: DELETE off 44 KEYS_UPDATED , blkref #0: rel 1663/61568/3047747 blk 0
rmgr: Heap        len (rec/tot): 135/ 135, tx: 41588810, lsn: 1E/FA8B9EA8, prev 1E/FA8B9E70, desc: INSERT off 5, blkref #0: rel 1663/61568/3047747 blk 0
rmgr: Btree       len (rec/tot): 104/ 104, tx: 41588810, lsn: 1E/FA8B9F30, prev 1E/FA8B9EA8, desc: INSERT_LEAF off 13, blkref #0: rel 1663/61568/3835328 blk 1
rmgr: Transaction len (rec/tot): 34/ 34, tx: 41588810, lsn: 1E/FA8B9F98, prev 1E/FA8B9F30, desc: COMMIT 2021-06-06 01:59:22.297038 MSK
rmgr: Heap        len (rec/tot): 54/ 54, tx: 41588811, lsn: 1E/FA8B9FC0, prev 1E/FA8B9F98, desc: DELETE off 41 KEYS_UPDATED , blkref #0: rel 1663/61568/3050651 blk 0
pg_waldump: FATAL: error in WAL record at 1E/FA8B9FC0: unexpected pageaddr 1E/FA900000 in log segment 000000000000001E000000FA, offset 9150464
```
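The "unexpected pageaddr" error means a WAL page header carries an `xlp_pageaddr` that does not match the LSN implied by the page's position in the segment file. As a minimal sketch (not pg_probackup or PostgreSQL code), the expected address can be recomputed from the WAL file name and byte offset:

```python
# Sketch: recompute the pageaddr that pg_waldump expects at a given offset.
# Assumes the default 16 MB segment size reported by show-config (xlog-seg-size).
WAL_SEG_SIZE = 16 * 1024 * 1024

def segment_start_lsn(walfile: str, seg_size: int = WAL_SEG_SIZE) -> int:
    """A WAL file name is TLI (8 hex) + logical xlog id (8 hex) + segment no (8 hex)."""
    high = int(walfile[8:16], 16)   # logical xlog id -> upper 32 bits of the LSN
    low = int(walfile[16:24], 16)   # segment number within that xlog id
    return (high << 32) + low * seg_size

def expected_pageaddr(walfile: str, offset: int) -> str:
    """LSN that a page header at `offset` should carry, in X/XXXXXXXX notation."""
    lsn = segment_start_lsn(walfile) + offset
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:08X}"
```

For segment 000000010000001E000000FA at offset 9150464 this yields 1E/FA8BA000, while the file actually contains a page stamped 1E/FA900000 (which belongs at offset 9437184): the tail of the segment holds data from the wrong position.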
Looks like WAL archive corruption.
These files differ in size from the others:

```
$ du -b /export/backup/wal/pg_server1/000000010000001E000000FA
10620928  /export/backup/wal/pg_server1/000000010000001E000000FA
$ du -b /export/backup/wal/pg_server1/000000010000001E000000F9
16777216  /export/backup/wal/pg_server1/000000010000001E000000F9
$ du -b /export/backup/wal/pg_server1/000000010000001E000000FB
16777216  /export/backup/wal/pg_server1/000000010000001E000000FB
$ du -b /export/backup/wal/pg_server/00000001000000E000000094
9019392   /export/backup/wal/pg_server/00000001000000E000000094
$ du -b /export/backup/wal/pg_server/00000001000000E000000093
16777216  /export/backup/wal/pg_server/00000001000000E000000093
$ du -b /export/backup/wal/pg_server/00000001000000E000000095
16777216  /export/backup/wal/pg_server/00000001000000E000000095
```

WAL files are archived with the command:

```
archive_command = '/usr/bin/pg_probackup-11 archive-push -B /export/backup --instance $(hostname -s) --wal-file-name=%f --remote-user probackup --remote-host file_server'
```
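Since every healthy segment here is exactly xlog-seg-size (16777216) bytes, a quick way to locate other truncated files in the archive is to scan for size mismatches. A minimal sketch (the directory layout and helper name are illustrative, not part of pg_probackup):

```python
import os
import re

# Plain (uncompressed) WAL segment file names: 24 uppercase hex digits.
WAL_NAME_RE = re.compile(r"^[0-9A-F]{24}$")

def find_truncated_segments(archive_dir: str, seg_size: int = 16777216):
    """Return (name, size) for WAL files whose size differs from xlog-seg-size."""
    bad = []
    for entry in os.scandir(archive_dir):
        if entry.is_file() and WAL_NAME_RE.match(entry.name):
            size = entry.stat().st_size
            if size != seg_size:
                bad.append((entry.name, size))
    return sorted(bad)
```

Note this only catches truncation; a segment that is full-size but internally corrupted (like 000000010000001E000000FA above, before it was truncated) would still need pg_waldump to detect.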
Can you look up the PostgreSQL server log at the mtime of those files? It will help to understand what went wrong with
Wow!
I'm surprised! I need some time to find out the details!
But how
Is there a later record in the PostgreSQL log about successful archiving of
The moment when
in the postgres log looks like
In the log on
Interesting.
It may indicate that we somehow ignored an out-of-space condition the second time archive-push ran.
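An ignored out-of-space condition typically comes down to an unchecked short write: `write()` on a full filesystem can either fail with ENOSPC or write fewer bytes than requested, and a caller that ignores the return value ends up with a truncated file it believes was archived. A hedged illustration of the defensive pattern (a hypothetical helper, not pg_probackup's actual code):

```python
import os

def write_segment(path: str, data: bytes) -> None:
    """Write a file durably, failing loudly on short writes or ENOSPC."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):
            n = os.write(fd, data[written:])  # raises OSError(ENOSPC) when disk is full
            if n == 0:
                raise OSError(f"short write on {path}")
            written += n
        os.fsync(fd)  # make sure the data is on disk before reporting success
    finally:
        os.close(fd)
    # Belt and braces: verify the on-disk size before declaring the segment archived.
    if os.path.getsize(path) != len(data):
        raise OSError(f"size mismatch after writing {path}")
```

If the error path were swallowed on a retry (e.g. the second run sees a partial file already present and skips it), the archive would keep a truncated segment exactly like the ones shown above.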
But not
The Postgres log at this time
Sure, my mistake.
Can you please provide a chunk of
I wrote about this in comment #394 (comment). If this is not enough, then here is a little more:
I found a bug that could potentially lead to something like this.
It is not very clear why Postgres suddenly forgot about 00000001000000E000000094 and moved on to 00000001000000E000000095 .
I posted the log section in full; most likely the logging level differs from the first server's, I need to check. I wanted to do that yesterday but got distracted ...
Strange, the log settings seem to be the same ... On another project I have seen a situation where only errors that occur during archiving end up in the log, while successful archiving does not appear in the log at all in the pg_probackup context. Could the cause be that in the config I set log-level-console to off and log-level-file to info?
Yes, it could.
Got it, I'll have to change it back ;)
Can you provide the mount options of the filesystem the backups are stored on?
I managed to reproduce it.
TODO:
Don't ask me why it is like that, I don't know the answer myself :)
Fixed.
Thanks!
I have a problem with incremental backup (PAGE).
The backup log looks like
My system is
I have this problem on two different servers and have no idea what is wrong.