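Recover metadata
This topic describes how to recover the metadata of a StarRocks cluster when FE nodes encounter different exceptions.
Generally, you need to recover metadata only when one of the issues described in this topic occurs.
Identify the issue you have encountered and follow the corresponding solution. It is recommended that you perform the actions recommended in this topic.
FE nodes fail to start
FE nodes may fail to start if the metadata is corrupted or becomes incompatible with the cluster after a rollback.
Metadata incompatibility after rollback
After you downgrade a StarRocks cluster, FE nodes may fail to start because the metadata written by the pre-downgrade version is incompatible with the downgraded version.
If you encounter the following exception while downgrading the cluster, you can confirm that this issue has occurred: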
UNKNOWN Operation Type xxx
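Follow these steps to recover the metadata and start the FE nodes:
1. Stop all FE nodes.
2. Back up the metadata directory meta_dir of all FE nodes.
3. Add the configuration metadata_ignore_unknown_operation_type = true to the configuration file fe.conf of all FE nodes.
4. Start all FE nodes, and check whether the data and metadata are intact.
5. If both the data and metadata are intact, execute the following statement to create an image file for the metadata:
   ALTER SYSTEM CREATE IMAGE;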
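6. After the new image file has been transmitted to the directory meta/image of all FE nodes, remove the configuration item metadata_ignore_unknown_operation_type = true from the configuration files of all FE nodes, and restart the FE nodes.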
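The backup and configuration steps above can be sketched as a few shell helpers. This is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes the FE process on the node is already stopped and that fe.conf uses one `key = value` entry per line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; all paths passed to them are deployment-specific assumptions.

# Copy the metadata directory to a timestamped sibling directory and print the backup path.
backup_meta_dir() {
  local meta_dir="${1%/}"   # drop a trailing slash so the backup lands next to meta_dir
  local backup_dir="${meta_dir}.bak.$(date +%Y%m%d%H%M%S)"
  cp -a "$meta_dir" "$backup_dir" && echo "$backup_dir"
}

# Append "key = value" to fe.conf unless an entry for that key already exists.
add_fe_config() {
  local fe_conf="$1" key="$2" value="$3"
  if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$fe_conf"; then
    # Make sure the new entry starts on its own line.
    if [ -n "$(tail -c 1 "$fe_conf")" ]; then
      echo >> "$fe_conf"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$fe_conf"
  fi
}

# Delete every entry for the given key; the previous file is kept as fe.conf.bak.
remove_fe_config() {
  local fe_conf="$1" key="$2"
  sed -i.bak -E "/^[[:space:]]*${key}[[:space:]]*=/d" "$fe_conf"
}
```

On each FE node, this could be used as `backup_meta_dir /path/to/fe/meta` followed by `add_fe_config /path/to/fe/conf/fe.conf metadata_ignore_unknown_operation_type true`, where both paths are placeholders. Once the new image file has reached every FE node, `remove_fe_config` can drop the temporary entry again before the restart.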
Metadata corruption
Corruption of either the BDBJE metadata or the StarRocks metadata prevents FE nodes from restarting.
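Each BDBJE issue described in the following sections is recognized by a characteristic string in the FE log. As a triage aid, the sketch below scans a log file for those strings and prints the matching section names. The function name `scan_fe_log` and the lookup table are illustrative assumptions. The signature strings are taken from the error messages quoted in this topic, and a match only suggests which section to read first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every issue whose signature occurs in the log file.
scan_fe_log() {
  local log_file="$1"
  local signature issue
  # Each table row is "signature|section name"; grep -F treats the signature as a fixed string.
  while IFS='|' read -r signature issue; do
    if grep -qF -- "$signature" "$log_file"; then
      echo "$issue"
    fi
  done <<'EOF'
recoveryTracker should overlap or follow on disk last|VLSN Bug
HARD_RECOVERY|RollbackException
removeReplicaDb|ReplicaWriteException
INSUFFICIENT_LOG|InsufficientLogException
HANDSHAKE_ERROR|HANDSHAKE_ERROR
DatabaseNotFoundException|DatabaseNotFoundException
EOF
}
```

For example, `scan_fe_log /path/to/fe/log/fe.log` (the path is a placeholder) prints one line per matched issue and nothing when no signature is found.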
BDBJE metadata corruption
VLSN Bug
You can identify the VLSN Bug by the following error message:
recoveryTracker should overlap or follow on disk last
VLSN of 6,684,650 recoveryFirst= 6,684,652
UNEXPECTED_STATE_FATAL: Unexpected internal state, unable to continue.
Environment is invalid and must be closed.
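Follow these steps to fix this issue:
1. Clear the metadata directory meta_dir of the FE node that throws this exception.
2. Restart the current FE node using the Leader FE node as the helper.
   # Replace <leader_ip> with the IP address (priority_networks) of the Leader FE node,
   # and replace <leader_edit_log_port> (Default: 9010) with the edit_log_port of the Leader FE node.
   ./fe/bin/start_fe.sh --helper <leader_ip>:<leader_edit_log_port> --daemon
- This bug has been fixed in StarRocks v3.1. You can avoid this issue by upgrading your cluster to v3.1 or later.
- This solution does not apply if more than half of the FE nodes have encountered this issue. In that case, you must follow the instructions provided in Measures as the last resort to fix the issue.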
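When clearing meta_dir, a cautious variation is to move its contents aside instead of deleting them, so they can be restored if the restart does not go as planned. The sketch below takes that approach. The function name `clear_meta_dir` is hypothetical, and it assumes the FE process on the node is stopped and that the parent directory has enough free space.

```shell
#!/usr/bin/env bash
# Hypothetical helper: empty meta_dir by moving its contents into a timestamped
# sibling directory, then print where the old contents were moved to.
clear_meta_dir() {
  local meta_dir="${1%/}"
  if [ ! -d "$meta_dir" ]; then
    echo "not a directory: $meta_dir" >&2
    return 1
  fi
  local stash_dir="${meta_dir}.cleared.$(date +%Y%m%d%H%M%S)"
  mkdir -p "$stash_dir" || return 1
  # -mindepth 1 skips meta_dir itself; -maxdepth 1 moves only top-level entries (dotfiles included).
  find "$meta_dir" -mindepth 1 -maxdepth 1 -exec mv {} "$stash_dir"/ \;
  echo "$stash_dir"
}

# Intended use on the failing node (paths and addresses are placeholders):
#   clear_meta_dir /path/to/fe/meta
#   ./fe/bin/start_fe.sh --helper <leader_ip>:<leader_edit_log_port> --daemon
```

After the node has rejoined the cluster and synchronized its metadata, the stashed directory can be removed manually.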
RollbackException
You can identify this issue by the following error message:
must rollback 1 total commits(1 of which were durable) to the earliest point indicated by transaction id=-14752149 time=2022-01-12 14:36:28.21 vlsn=28,069,415 lsn=0x1174/0x16e durable=false in order to rejoin the replication group. All existing ReplicatedEnvironment handles must be closed and reinstantiated. Log files were truncated to file 0x4467, offset 0x269, vlsn 28,069,413 HARD_RECOVERY: Rolled back past transaction commit or abort. Must run recovery by re-opening Environment handles Environment is invalid and must be closed.
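This issue occurs when the Leader FE node writes BDBJE metadata but hangs before it can synchronize the metadata to the other Follower FE nodes. After the restart, the original Leader node becomes a Follower, and its metadata is corrupted.
To fix this issue, simply restart the node that throws the exception. The restart clears the dirty metadata.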
ReplicaWriteException
You can identify this issue by the keyword removeReplicaDb in the FE log fe.log:
Caused by: com.sleepycat.je.rep.ReplicaWriteException: (JE 18.3.16) Problem closing transaction 25000090. The current state is:REPLICA. The node transitioned to this state at:Fri Feb 23 01:31:00 UTC 2024 Problem seen replaying entry NameLN_TX/14 vlsn=1,902,818,939 isReplicated="1" txn=-953505106 dbop=REMOVE Originally thrown by HA thread: REPLICA 10.233.132.23_9010_1684154162022(6)
at com.sleepycat.je.rep.txn.ReadonlyTxn.disallowReplicaWrite(ReadonlyTxn.java:114) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.DbTree.checkReplicaWrite(DbTree.java:880) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.DbTree.doCreateDb(DbTree.java:579) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.DbTree.createInternalDb(DbTree.java:507) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.cleaner.ExtinctionScanner.openDb(ExtinctionScanner.java:357) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.cleaner.ExtinctionScanner.prepareForDbExtinction(ExtinctionScanner.java:1703) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.DbTree.doRemoveDb(DbTree.java:1208) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.DbTree.removeReplicaDb(DbTree.java:1261) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replay.applyNameLN(Replay.java:996) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replay.replayEntry(Replay.java:722) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replica$ReplayThread.run(Replica.java:1225) ~[starrocks-bdb-je-18.3.16.jar:?]
This issue occurs because the BDBJE version of the current FE node (v18.3.*) does not match that of the Leader FE node (v7.3.7).
Follow these steps to fix this issue:
1. Drop the Follower or Observer node that throws this exception.
   -- To drop a Follower node, replace <follower_host> with the IP address (priority_networks) of the Follower node,
   -- and replace <follower_edit_log_port> (Default: 9010) with the edit_log_port of the Follower node.
   ALTER SYSTEM DROP FOLLOWER "<follower_host>:<follower_edit_log_port>";
   -- To drop an Observer node, replace <observer_host> with the IP address (priority_networks) of the Observer node,
   -- and replace <observer_edit_log_port> (Default: 9010) with the edit_log_port of the Observer node.
   ALTER SYSTEM DROP OBSERVER "<observer_host>:<observer_edit_log_port>";
2. Add the node back to the cluster.
   -- Add a Follower node.
   ALTER SYSTEM ADD FOLLOWER "<follower_host>:<follower_edit_log_port>";
   -- Add an Observer node.
   ALTER SYSTEM ADD OBSERVER "<observer_host>:<observer_edit_log_port>";
3. Clear the metadata directory meta_dir of the node.
4. Restart the node using the Leader FE node as the helper.
   # Replace <leader_ip> with the IP address (priority_networks) of the Leader FE node,
   # and replace <leader_edit_log_port> (Default: 9010) with the edit_log_port of the Leader FE node.
   ./fe/bin/start_fe.sh --helper <leader_ip>:<leader_edit_log_port> --daemon
5. After the node has returned to a healthy state, upgrade the BDBJE package in the cluster to starrocks-bdb-je-18.3.16.jar (or upgrade the StarRocks cluster to v3.0 or later). Perform this operation on the Follower nodes first, and then on the Leader node.
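Because the DROP and ADD statements in steps 1 and 2 differ only in the node role and address, they can be generated rather than typed by hand. The sketch below prints both statements for one node. The function name `fe_readd_sql` is hypothetical, and it only produces text: it does not connect to the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the DROP and ADD statements for one FE node.
# Arguments: role (FOLLOWER or OBSERVER), host, edit_log_port (optional, defaults to 9010).
fe_readd_sql() {
  local role="$1" host="$2" port="${3:-9010}"
  case "$role" in
    FOLLOWER|OBSERVER) ;;
    *) echo "role must be FOLLOWER or OBSERVER" >&2; return 1 ;;
  esac
  printf 'ALTER SYSTEM DROP %s "%s:%s";\n' "$role" "$host" "$port"
  printf 'ALTER SYSTEM ADD %s "%s:%s";\n' "$role" "$host" "$port"
}
```

The output can be reviewed and then run in a SQL session on the Leader FE node. Assuming a MySQL-compatible client is available, something like `fe_readd_sql FOLLOWER <follower_host> | mysql -h <leader_ip> -P <query_port> -u <user> -p` would submit both statements, where the connection parameters are placeholders for your deployment.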
InsufficientLogException
You can identify this issue by the following error message:
xxx INSUFFICIENT_LOG: Log files at this node are obsolete. Environment is invalid and must be closed.
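This issue occurs because the Follower node requires a full metadata synchronization. The following situations may cause it:
- The metadata on the Follower node lags behind that on the Leader node, and the Leader node has already checkpointed its metadata. The Follower node cannot apply incremental updates to its metadata, so a full metadata synchronization is required.
- The original Leader node wrote metadata and checkpointed it, but failed to synchronize the metadata to the Follower nodes before hanging. After the restart, the original Leader node becomes a Follower node. Because the dirty metadata has already been checkpointed, the node cannot remove it incrementally, so a full metadata synchronization is required.
Note that this exception is also thrown when a new Follower node joins the cluster. In that case, no action is required. If an existing Follower node or the Leader node throws this exception, you only need to restart the node.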
HANDSHAKE_ERROR: Error during the handshake between two nodes
You can identify this issue by the following error message:
2023-11-13 21:51:55,271 WARN (replayer|82) [BDBJournalCursor.wrapDatabaseException():97] failed to get DB names for 1 times!Got EnvironmentFailureExce
com.sleepycat.je.EnvironmentFailureException: (JE 18.3.16) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 18.3.16) 10.26.5.115_9010_1697071897979(1):/data1/meta/bdb A replica with the name: 10.26.5.115_9010_1697071897979(1) is already active with the Feeder:null HANDSHAKE_ERROR: Error during the handshake between two nodes. Some validity or compatibility check failed, preventing further communication between the nodes. Environment is invalid and must be closed.
at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:230) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1835) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.dbi.EnvironmentImpl.checkOpen(EnvironmentImpl.java:1844) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.Environment.checkOpen(Environment.java:2697) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.Environment.getDatabaseNames(Environment.java:2455) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.starrocks.journal.bdbje.BDBEnvironment.getDatabaseNamesWithPrefix(BDBEnvironment.java:478) ~[starrocks-fe.jar:?]
at com.starrocks.journal.bdbje.BDBJournalCursor.refresh(BDBJournalCursor.java:177) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr$5.runOneCycle(GlobalStateMgr.java:2148) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr$5.run(GlobalStateMgr.java:2216) ~[starrocks-fe.jar:?]
Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 18.3.16) 10.26.5.115_9010_1697071897979(1):/data1/meta/bdb A replica with the name: 10.26.5.115_9010_1697071897979(1) is already active with the Feeder:null HANDSHAKE_ERROR: Error during the handshake between two nodes. Some validity or compatibility check failed, preventing further communication between the nodes. Environment is invalid and must be closed. Originally thrown by HA thread: UNKNOWN 10.26.5.115_9010_1697071897979(1) Originally thrown by HA thread: UNKNOWN 10.26.5.115_9010_1697071897979(1)
at com.sleepycat.je.rep.stream.ReplicaFeederHandshake.negotiateProtocol(ReplicaFeederHandshake.java:198) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.stream.ReplicaFeederHandshake.execute(ReplicaFeederHandshake.java:250) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:709) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:485) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:412) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1869) ~[starrocks-bdb-je-18.3.16.jar:?]
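This issue occurs when the Leader node hangs and the remaining Follower nodes try to elect a new Leader node, while the original Leader node comes back online. The Follower nodes then try to establish a new connection with the original Leader node. However, because the old connection still exists, the Leader node rejects the connection request. Once the request is rejected, the Follower node marks its environment as invalid and throws this exception.
To fix this issue, you can increase the JVM memory size or use the G1 GC algorithm.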
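As an illustration of that fix, the fragment below shows how the two options might look in fe.conf. It assumes that your deployment sets the FE JVM options through a JAVA_OPTS entry in that file. The heap size is a placeholder rather than a tuned recommendation, and the FE node needs a restart for the change to take effect.

```shell
# fe.conf (fragment). Values are illustrative assumptions.
# -Xmx raises the maximum JVM heap size; -XX:+UseG1GC selects the G1 garbage collector.
# Keep any other options that your existing JAVA_OPTS entry already contains.
JAVA_OPTS="-Xmx16g -XX:+UseG1GC"
```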
DatabaseNotFoundException
You can identify this issue by the following error message:
2024-01-05 12:47:21,087 INFO (main|1) [BDBEnvironment.ensureHelperInLocal():340] skip check local environment because helper node and local node are identical.
2024-01-05 12:47:21,339 ERROR (MASTER 172.17.0.1_9112_1704430041062(-1)|1) [StarRocksFE.start():186] StarRocksFE start failed
com.sleepycat.je.DatabaseNotFoundException: (JE 18.3.16) _jeRepGroupDB
at com.sleepycat.je.rep.impl.RepImpl.openGroupDb(RepImpl.java:1974) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.RepImpl.getGroupDb(RepImpl.java:1912) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.RepGroupDB.reinitFirstNode(RepGroupDB.java:1439) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.RepNode.reinitSelfElect(RepNode.java:1686) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.RepNode.startup(RepNode.java:874) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.node.RepNode.joinGroup(RepNode.java:2153) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.impl.RepImpl.joinGroup(RepImpl.java:618) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.ReplicatedEnvironment.joinGroup(ReplicatedEnvironment.java:558) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:619) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:464) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:538) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.sleepycat.je.rep.util.DbResetRepGroup.reset(DbResetRepGroup.java:262) ~[starrocks-bdb-je-18.3.16.jar:?]
at com.starrocks.journal.bdbje.BDBEnvironment.initConfigs(BDBEnvironment.java:188) ~[starrocks-fe.jar:?]
at com.starrocks.journal.bdbje.BDBEnvironment.setup(BDBEnvironment.java:174) ~[starrocks-fe.jar:?]
at com.starrocks.journal.bdbje.BDBEnvironment.initBDBEnvironment(BDBEnvironment.java:153) ~[starrocks-fe.jar:?]
at com.starrocks.journal.JournalFactory.create(JournalFactory.java:31) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.initJournal(GlobalStateMgr.java:1201) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.initialize(GlobalStateMgr.java:1150) ~[starrocks-fe.jar:?]
at com.starrocks.StarRocksFE.start(StarRocksFE.java:129) ~[starrocks-fe.jar:?]
at com.starrocks.StarRocksFE.main(StarRocksFE.java:83) ~[starrocks-fe.jar:?]