Steps to Reproduce
Install self-hosted, then run out of disk space. Kafka/Zookeeper fail and cannot be recovered (see logs below); the installation is effectively doomed.
Expected Result
The service should not break to the point where it cannot be recovered. It could, for example, monitor free disk space and shut itself down cleanly. I'd rather lose a batch of transactions than lose the entire installation.
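A minimal sketch of the kind of guard I mean, assuming a POSIX shell and the default data directory; nothing like this exists in self-hosted today, and the function name and threshold are mine:

```shell
# Hypothetical preflight guard (not part of self-hosted): refuse to start
# the broker when the data directory's filesystem is nearly full, so Kafka
# fails fast instead of corrupting its log segments mid-write.
check_free_space() {
    dir="$1"
    min_free_mb="${2:-512}"
    # df -Pm prints sizes in 1 MB blocks; column 4 of line 2 is "Available".
    free_mb=$(df -Pm "$dir" | awk 'NR==2 {print $4}')
    if [ "$free_mb" -lt "$min_free_mb" ]; then
        echo "only ${free_mb} MB free in ${dir}; refusing to start" >&2
        return 1
    fi
}

# Example: run before launching the broker.
# check_free_space /var/lib/kafka/data 512 || exit 1
```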
Actual Result
===> Launching kafka ...
[2024-06-17 04:55:46,504] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2024-06-17 04:55:47,568] INFO Starting the log cleaner (kafka.log.LogCleaner)
[2024-06-17 04:55:47,915] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
[2024-06-17 04:55:47,936] INFO [SocketServer listenerType=ZK_BROKER, nodeId=1001] Created data-plane acceptor and processors for endpoint : ListenerName(PLAINTEXT) (kafka.network.SocketServer)
[2024-06-17 04:55:48,020] INFO Creating /brokers/ids/1001 (is it secure? false) (kafka.zk.KafkaZkClient)
[2024-06-17 04:55:48,033] INFO Stat of the created znode at /brokers/ids/1001 is: 1478,1478,1718600148028,1718600148028,1,0,0,72130214439944228,194,0,1478
(kafka.zk.KafkaZkClient)
[2024-06-17 04:55:48,034] INFO Registered broker 1001 at path /brokers/ids/1001 with addresses: PLAINTEXT://kafka:9092, czxid (broker epoch): 1478 (kafka.zk.KafkaZkClient)
[2024-06-17 04:55:48,242] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
[2024-06-17 04:55:48,259] WARN [Controller id=1001, targetBrokerId=1001] Connection to node 1001 (kafka/172.19.0.13:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2024-06-17 04:55:48,260] WARN [RequestSendThread controllerId=1001] Controller 1001's connection to broker kafka:9092 (id: 1001 rack: null) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to kafka:9092 (id: 1001 rack: null) failed.
at org.apache.kafka.clients.NetworkClientUtils.awaitReady(NetworkClientUtils.java:70)
at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:298)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:251)
at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:130)
[2024-06-17 04:55:48,341] INFO [SocketServer listenerType=ZK_BROKER, nodeId=1001] Enabling request processing. (kafka.network.SocketServer)
[2024-06-17 04:55:48,344] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.DataPlaneAcceptor)
[2024-06-17 04:56:20,444] ERROR Error while appending records to ingest-transactions-0 in dir /var/lib/kafka/data (org.apache.kafka.storage.internals.log.LogDirFailureChannel)
java.io.IOException: No space left on device
at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:79)
at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:280)
at org.apache.kafka.common.record.MemoryRecords.writeFullyTo(MemoryRecords.java:90)
at org.apache.kafka.common.record.FileRecords.append(FileRecords.java:188)
at kafka.log.LogSegment.append(LogSegment.scala:160)
at kafka.log.LocalLog.append(LocalLog.scala:439)
at kafka.log.UnifiedLog.append(UnifiedLog.scala:911)
at kafka.log.UnifiedLog.appendAsLeader(UnifiedLog.scala:719)
at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:1313)
at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:1301)
at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$6(ReplicaManager.scala:1277)
at scala.collection.StrictOptimizedMapOps.map(StrictOptimizedMapOps.scala:28)
at scala.collection.StrictOptimizedMapOps.map$(StrictOptimizedMapOps.scala:27)
at scala.collection.mutable.HashMap.map(HashMap.scala:35)
at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:1265)
at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:868)
at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:686)
at kafka.server.KafkaApis.handle(KafkaApis.scala:180)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:153)
at java.base/java.lang.Thread.run(Thread.java:829)
[2024-06-17 04:56:20,445] WARN [ReplicaManager broker=1001] Stopping serving replicas in dir /var/lib/kafka/data (kafka.server.ReplicaManager)
[2024-06-17 04:56:20,464] WARN [ReplicaManager broker=1001] Broker 1001 stopped fetcher for partitions snuba-queries-0,outcomes-0,scheduled-subscriptions-transactions-0,events-0,cdc-0,profiles-call-tree-0,snuba-generic-metrics-sets-commit-log-0,__consumer_offsets-0,scheduled-subscriptions-events-0,outcomes-billing-0,ingest-performance-metrics-0,events-subscription-results-0,snuba-dead-letter-generic-events-0,transactions-0,snuba-dead-letter-replays-0,processed-profiles-0,snuba-dead-letter-metrics-0,snuba-attribution-0,scheduled-subscriptions-generic-metrics-distributions-0,snuba-generic-metrics-counters-commit-log-0,ingest-events-0,metrics-subscription-results-0,snuba-generic-metrics-gauges-commit-log-0,profiles-0,scheduled-subscriptions-generic-metrics-counters-0,scheduled-subscriptions-generic-metrics-sets-0,scheduled-subscriptions-generic-metrics-gauges-0,generic-metrics-subscription-results-0,snuba-transactions-commit-log-0,snuba-spans-0,ingest-replay-events-0,ingest-sessions-0,ingest-transactions-0,ingest-attachments-0,snuba-metrics-0,monitors-clock-tick-0,snuba-metrics-summaries-0,snuba-dead-letter-group-attributes-0,shared-resources-usage-0,ingest-monitors-0,ingest-occurrences-0,transactions-subscription-results-0,generic-events-0,snuba-dead-letter-generic-metrics-0,snuba-metrics-commit-log-0,ingest-metrics-0,group-attributes-0,snuba-generic-metrics-0,event-replacements-0,snuba-dead-letter-querylog-0,snuba-commit-log-0,snuba-generic-metrics-distributions-commit-log-0,ingest-replay-recordings-0,snuba-generic-events-commit-log-0,scheduled-subscriptions-metrics-0 and stopped moving logs for partitions because they are in the failed log directory /var/lib/kafka/data. (kafka.server.ReplicaManager)
[2024-06-17 04:56:20,464] WARN Stopping serving logs in dir /var/lib/kafka/data (kafka.log.LogManager)
[2024-06-17 04:56:20,466] ERROR Shutdown broker because all log dirs in /var/lib/kafka/data have failed (kafka.log.LogManager)
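For reference, one way to confirm which volume filled the disk is with standard Docker commands (this assumes the default Docker data root `/var/lib/docker`; adjust the path if yours differs):

```shell
# Per-volume and per-container disk usage as Docker sees it:
docker system df -v
# How full the filesystem backing the Docker volumes is:
df -h /var/lib/docker
```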
And Zookeeper's logs:
Using log4j config /etc/kafka/log4j.properties
===> User
uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
===> Configuring ...
Running in Zookeeper mode...
===> Running preflight checks ...
===> Check if /var/lib/kafka/data is writable ...
===> Check if Zookeeper is healthy ...
[2024-06-17 06:00:49,813] ERROR Unable to resolve address: zookeeper:2181 (org.apache.zookeeper.client.StaticHostProvider)
java.net.UnknownHostException: zookeeper: Name or service not known
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:930)
at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1543)
at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1386)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1307)
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88)
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141)
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204)
[2024-06-17 06:00:49,818] WARN Session 0x0 for server zookeeper:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. (org.apache.zookeeper.ClientCnxn)
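The `UnknownHostException` presumably just means the zookeeper container never came up on the compose network. A way to inspect why it is marked unhealthy (the container name below is the one compose reports in the errors that follow):

```shell
# Is the zookeeper service running, and what does its health check say?
docker compose ps zookeeper
docker inspect --format '{{json .State.Health}}' sentry-self-hosted-zookeeper-1
# Zookeeper's own logs usually show the corrupted snapshot left behind
# by the full disk:
docker compose logs zookeeper
```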
Shutting down and restarting fails with
dependency failed to start: container sentry-self-hosted-zookeeper-1 is unhealthy
Reinstalling fails with
dependency failed to start: container sentry-self-hosted-zookeeper-1 is unhealthy
Error in install/bootstrap-snuba.sh:3.
'$dcr snuba-api bootstrap --no-migrate --force' exited with status 1
-> ./install.sh:main:36
--> install/bootstrap-snuba.sh:source:3
Tried to follow the troubleshooting guide:
sentry@workhorse:~/self-hosted$ docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --list
[+] Creating 1/0
✔ Container sentry-self-hosted-zookeeper-1 Created 0.0s
[+] Running 1/1
✔ Container sentry-self-hosted-zookeeper-1 Started 0.4s
dependency failed to start: container sentry-self-hosted-zookeeper-1 is unhealthy
Tried the nuclear option, but then the reinstall fails with
Volume "sentry-self-hosted_sentry-nginx-cache" Created
external volume "sentry-zookeeper" not found
Error in install/upgrade-clickhouse.sh:15.
'$dc up -d clickhouse' exited with status 1
-> ./install.sh:main:25
--> install/upgrade-clickhouse.sh:source:15
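The `external volume "sentry-zookeeper" not found` error suggests the wipe removed volumes that the compose file declares as external, which the installer does not recreate on its own. Recreating them by hand (volume names taken from the error above; the data in them is of course already gone) should let `./install.sh` get past this point:

```shell
# Compose refuses to auto-create volumes marked "external"; recreate the
# missing ones manually before re-running the installer:
docker volume create sentry-zookeeper
docker volume create sentry-kafka   # if this one was removed as well
./install.sh
```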
Event ID
No response
Self-Hosted Version
24.500
CPU Architecture
x86_64
Docker Version
26.1.4
Docker Compose Version
2.27.1