2023-09-2702:32:17.127 UTC [1] LOG: received fast shutdown request 2023-09-2702:32:17.181 UTC [1] LOG: aborting any active transactions 2023-09-2702:32:17.434 UTC [1] LOG: background worker "logical replication launcher" (PID 26) exited withexit code 1 2023-09-2702:32:17.481 UTC [21] LOG: shutting down 2023-09-2702:32:17.880 UTC [1] LOG: databasesystemis shut down
Postgres “invalid resource manager ID in primary checkpoint record” and “could not locate a valid checkpoint record”
1 2 3 4 5 6 7 8 9 10
2023-09-2702:33:23.189 UTC [1] LOG: starting PostgreSQL 13.5on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.3.1_git20211027) 10.3.120211027, 64-bit 2023-09-2702:33:23.190 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 2023-09-2702:33:23.190 UTC [1] LOG: listening on IPv6 address "::", port 5432 2023-09-2702:33:23.199 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2023-09-2702:33:23.210 UTC [21] LOG: databasesystem was shut down at 2023-09-2702:32:22 UTC 2023-09-2702:33:23.210 UTC [21] LOG: invalid resource manager ID inprimarycheckpointrecord 2023-09-2702:33:23.210 UTC [21] PANIC: could not locate a validcheckpointrecord 2023-09-2702:33:24.657 UTC [1] LOG: startup process (PID 21) was terminated by signal 6: Aborted 2023-09-2702:33:24.657 UTC [1] LOG: aborting startup due to startup process failure 2023-09-2702:33:24.659 UTC [1] LOG: databasesystemis shut down
如上, WAL 文件已损坏, 应该如何恢复?
恢复步骤
🐾Warning:
目的是启动 Postgres 恢复应用的正常运行. 数据可能存在丢失.
这是一个 TTRSS feed 应用, 只供我自己使用, 只要能启动起来, 丢失一点数据无所谓.
首先, Postgres Pod 在 CrashBackoffLoop, 无法进行任何操作, 首要任务是使 Pod 启动起来, 不要关闭. 这里通过在 Deployment 添加一些命令来实现. 如下:
2023-09-2704:03:25.172 UTC [1] LOG: starting PostgreSQL 13.5on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.3.1_git20211027) 10.3.120211027, 64-bit 2023-09-2704:03:25.173 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 2023-09-2704:03:25.173 UTC [1] LOG: listening on IPv6 address "::", port 5432 2023-09-2704:03:25.179 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2023-09-2704:03:25.187 UTC [20] LOG: databasesystem was shut down at 2023-09-2704:02:42 UTC 2023-09-2704:03:25.210 UTC [1] LOG: databasesystemis ready to accept connections