Deal with missing snapshot after ZooKeeper upgrade from version 3.4 to 3.5 or later.
I encountered the following error after upgrading ZooKeeper from 3.4.13 to 3.5.9.
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: [2021-08-03 22:24:07,002] ERROR Unable to load database on disk (org.apache.zookeeper.server.quorum.QuorumPeer)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: java.io.IOException: No snapshot found, but there are log entries. Something is broken!
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:240)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:904)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:890)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: [2021-08-03 22:24:07,004] ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: java.lang.RuntimeException: Unable to run quorum server
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:941)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:890)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: Caused by: java.io.IOException: No snapshot found, but there are log entries. Something is broken!
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:240)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:904)
Aug 03 22:24:07 kafka2.example.org zookeeper-server-start.sh[62023]: ... 4 more
The solution is to append the snapshot.trust.empty=true option to the ZooKeeper
configuration, which tells the server to skip this check.
$ sudo -u kafka cat /opt/kafka/kafka/config/zookeeper.properties
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/kafka/zookeeper_data
clientPort=2181
server.1=kafka1.example.org:2888:3888
server.2=kafka2.example.org:2888:3888
server.3=kafka3.example.org:2888:3888
snapshot.trust.empty=true
4lw.commands.whitelist=*
Remember to restart ZooKeeper.
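One way to confirm that the node is serving again after the restart is the srvr four-letter command, which the 4lw.commands.whitelist=* line in the configuration permits. The following is a minimal sketch, assuming nc is installed and the client port is 2181 as configured above; parse_mode and zk_mode are hypothetical helper names, not part of ZooKeeper.

```shell
#!/bin/sh
# Extract the value of the "Mode:" line (for example leader or follower)
# from srvr output supplied on standard input.
parse_mode() {
    awk -F': ' '/^Mode:/ { print $2 }'
}

# Hypothetical helper: send the srvr four-letter command to a node
# and print its mode. Host and port default to localhost:2181.
zk_mode() {
    host="${1:-localhost}"
    port="${2:-2181}"
    echo srvr | nc "$host" "$port" | parse_mode
}

# Usage: zk_mode kafka2.example.org 2181
```

If the node has not joined the quorum, the response contains no Mode line and the helper prints nothing.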
The snapshot will be created automatically when the node becomes the leader or when snapCount
is reached (100,000 transactions by default), so do not remove this option immediately after the upgrade.
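Before removing the option, it may be worth checking that a snapshot file has actually been written. ZooKeeper stores snapshot.<zxid> and log.<zxid> files in the version-2 subdirectory of dataDir. A minimal sketch, assuming the dataDir from the configuration above; has_snapshot is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given ZooKeeper data directory
# contains at least one snapshot.* file under version-2.
has_snapshot() {
    data_dir="$1"
    for f in "$data_dir"/version-2/snapshot.*; do
        # When the glob matches nothing, the pattern stays literal
        # and the -f test fails, so the loop falls through.
        if [ -f "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Usage:
# has_snapshot /opt/kafka/zookeeper_data && echo "snapshot present" || echo "no snapshot yet"
```

The check only looks for the presence of a file; it does not validate the snapshot contents.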
Please read "Fails to load database with missing snapshot file but valid transaction log file" for more information.