Thanos Compact stuck in continuous restart loop when index file missing #2067

Closed
zshearin opened this issue Jan 28, 2020 · 2 comments

zshearin commented Jan 28, 2020

Thanos, Prometheus and Golang version used:
quay.io/thanos/thanos:v0.8.1
prom/prometheus:v2.15.2

Object Storage Provider:
MinIO

What happened:
For an unknown reason, the index file was not generated for one chunk of data (a block). The Thanos compactor is then unable to keep running or to process other blocks: it appears to get stuck in an infinite loop, trying to process the same block and restarting itself each time. I understand that if the index file was not generated, the block is corrupted and its data may be lost (see: #1199).

What you expected to happen:
Thanos compact should recognize the missing index file and move on to other blocks (maybe label the block as corrupted?). Again, I know this is not an ideal situation and we don't want corrupted data.
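
To make the requested behavior concrete, here is a minimal sketch in Python (not Thanos code, which is written in Go). `MissingIndexError`, `PassResult`, and `downsample_pass` are hypothetical names used only for illustration. The idea is that a missing index is recorded per block and the pass continues, rather than the whole command failing.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


class MissingIndexError(Exception):
    """Hypothetical error: the block has no index file."""


@dataclass
class PassResult:
    done: List[str] = field(default_factory=list)       # blocks processed successfully
    corrupted: List[str] = field(default_factory=list)  # blocks skipped, no index


def downsample_pass(block_ids: Iterable[str],
                    downsample: Callable[[str], None]) -> PassResult:
    """Run `downsample` on every block, skipping blocks without an index."""
    result = PassResult()
    for block_id in block_ids:
        try:
            downsample(block_id)
        except MissingIndexError:
            # Record the block as corrupted and keep going, instead of
            # aborting the whole pass and restarting the process.
            result.corrupted.append(block_id)
            continue
        result.done.append(block_id)
    return result
```

Whether a skipped block should also be labeled in object storage is a separate design question that this sketch leaves open.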

How to reproduce it (as minimally and precisely as possible):
Delete the index file of a block before compaction and observe what the Thanos compactor does.

Full logs to relevant components:

Logs

level=info ts=2020-01-28T14:20:42.425782486Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:21:19.22068824Z caller=downsample.go:257 msg="downloaded block" id=01DZGKF24PTVWYYDKHK2QQJWSY duration=36.774476362s
level=warn ts=2020-01-28T14:21:20.816846802Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=error ts=2020-01-28T14:21:20.817577859Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=debug ts=2020-01-28T14:21:21.655147131Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:21:21.656081744Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:21:21.656956672Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:21:21.657577135Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:21:21.657612449Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:21:21.657674683Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:21:21.657755143Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:21:21.657795762Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:21:21.672008844Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:21:21.672030732Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:21:21.672157218Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:21:21.672210144Z caller=compact.go:287 msg="download meta" block=01DZGKF24PTVWYYDKHK2QQJWSY
level=debug ts=2020-01-28T14:21:21.672374608Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:21:21.672352587Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:21:21.672447155Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:21:21.672599683Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:21:21.672692418Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:21:21.672834474Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:21:21.672891794Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:21:21.672953071Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:21:21.673083333Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:21:21.682036997Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:21:21.720506582Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:21:21.721117506Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:21:21.734779628Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:21:21.734831595Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:21:58.443756286Z caller=downsample.go:257 msg="downloaded block" id=01DZGKF24PTVWYYDKHK2QQJWSY duration=36.691801418s
level=warn ts=2020-01-28T14:22:00.195984936Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=error ts=2020-01-28T14:22:00.196474337Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=debug ts=2020-01-28T14:22:01.195721393Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:22:01.197106332Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:22:01.198239165Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:22:01.198940926Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:22:01.198984365Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:22:01.199216734Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:22:01.199283789Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:22:01.199426338Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:22:01.207355739Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:22:01.207395322Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:22:01.207642528Z caller=compact.go:287 msg="download meta" block=01DZGKF24PTVWYYDKHK2QQJWSY
level=debug ts=2020-01-28T14:22:01.207784288Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:22:01.20800913Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:22:01.208137345Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:22:01.208210533Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:22:01.208388163Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:22:01.208459126Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:22:01.208593085Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:22:01.208668139Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:22:01.208775168Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:22:01.208968113Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:22:01.253854786Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:22:01.254187671Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:22:01.254430363Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:22:01.297586517Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:22:01.297647075Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:22:38.178567378Z caller=downsample.go:257 msg="downloaded block" id=01DZGKF24PTVWYYDKHK2QQJWSY duration=36.825093204s
level=warn ts=2020-01-28T14:22:39.787451429Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=error ts=2020-01-28T14:22:39.787841632Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=debug ts=2020-01-28T14:22:40.59473739Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:22:40.595642099Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:22:40.596485353Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:22:40.597140894Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:22:40.597179572Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:22:40.597276286Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:22:40.597374311Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:22:40.597378463Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:22:40.604509933Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:22:40.604537007Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:22:40.604885196Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:22:40.605006272Z caller=compact.go:287 msg="download meta" block=01DZGKF24PTVWYYDKHK2QQJWSY
level=debug ts=2020-01-28T14:22:40.605124748Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:22:40.605231311Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:22:40.605320203Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:22:40.605349108Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:22:40.605457979Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:22:40.605473782Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:22:40.605602333Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:22:40.605639148Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:22:40.605980815Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:22:40.610525408Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:22:40.612040723Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:22:40.612635075Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:22:40.627137885Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:22:40.627175662Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:23:17.500291129Z caller=downsample.go:257 msg="downloaded block" id=01DZGKF24PTVWYYDKHK2QQJWSY duration=36.853673737s
level=warn ts=2020-01-28T14:23:19.072964365Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=error ts=2020-01-28T14:23:19.073599923Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=debug ts=2020-01-28T14:23:19.914668469Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:23:19.91549431Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:23:19.916345976Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:23:19.917027479Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:23:19.917078384Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:23:19.917149581Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:23:19.917230113Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:23:19.917230233Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:23:19.938526724Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:23:19.938571683Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:23:19.93858042Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:23:19.938675192Z caller=compact.go:287 msg="download meta" block=01DZGKF24PTVWYYDKHK2QQJWSY
level=debug ts=2020-01-28T14:23:19.939122615Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:23:19.939627515Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:23:19.941130042Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:23:19.941183097Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:23:19.941568642Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:23:19.941656724Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:23:19.941638717Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:23:19.941707505Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:23:19.941783078Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:23:19.947273066Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:23:19.956042814Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:23:19.957096788Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:23:19.96922879Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:23:19.969263079Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:23:56.399449754Z caller=downsample.go:257 msg="downloaded block" id=01DZGKF24PTVWYYDKHK2QQJWSY duration=36.410051757s
level=warn ts=2020-01-28T14:23:57.952841758Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=error ts=2020-01-28T14:23:57.953114685Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: input block index not valid: open index file: try lock file: open /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/index: no such file or directory"
level=debug ts=2020-01-28T14:23:58.75191713Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:23:58.75282442Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:23:58.753725738Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:23:58.754374327Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:23:58.754428327Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:23:58.754513715Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:23:58.754625629Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:23:58.754652791Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:23:58.763598211Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:23:58.763654224Z caller=compact.go:287 msg="download meta" block=01DZGKF24PTVWYYDKHK2QQJWSY
level=debug ts=2020-01-28T14:23:58.763690343Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:23:58.763941025Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:23:58.763926496Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:23:58.764033439Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:23:58.764110261Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:23:58.764219684Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:23:58.764435173Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:23:58.763944245Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:23:58.764580558Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:23:58.764605588Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:23:58.768164871Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:23:58.787912713Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:23:58.790662603Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:23:58.790841734Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:23:58.802192972Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:23:58.802229906Z caller=compact.go:271 msg="start first pass of downsampling"
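
The restart loop in the logs above can be confirmed by counting how often the same block shows up in the fatal error. The Python sketch below assumes log lines shaped like the ones pasted here; `failed_blocks` is a hypothetical helper, not part of Thanos.

```python
import re
from collections import Counter

# Block ULID (26 characters) in the path of the "no such file" error,
# as it appears in the err="..." field of the log lines above.
_MISSING_INDEX = re.compile(
    r"open \S+/([0-9A-Z]{26})/index: no such file or directory"
)


def failed_blocks(log_lines):
    """Count fatal 'running command failed' errors per block with a missing index."""
    counts = Counter()
    for line in log_lines:
        if 'msg="running command failed"' not in line:
            continue
        match = _MISSING_INDEX.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A block that is counted once per restart, while no other block is ever downsampled, matches the behavior described in this issue.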

I checked that block and found that, for an unknown reason, it was missing its index file. After I deleted the block, the Thanos compactor component eventually recovered:

level=info ts=2020-01-28T14:23:58.790662603Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:23:58.790841734Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:23:58.802192972Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:23:58.802229906Z caller=compact.go:271 msg="start first pass of downsampling"
level=warn ts=2020-01-28T14:24:13.569133864Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569321769Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569356763Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569384929Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569421803Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569460214Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569498549Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569531497Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569590816Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.56963111Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.569669461Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY/chunks: directory not empty"
level=warn ts=2020-01-28T14:24:13.570851158Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY: directory not empty"
level=warn ts=2020-01-28T14:24:13.570901501Z caller=objstore.go:156 msg="failed to remove file on partial dir download error" file=/data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY err="remove /data/downsample/01DZGKF24PTVWYYDKHK2QQJWSY: directory not empty"
level=warn ts=2020-01-28T14:24:14.766740723Z caller=prober.go:154 msg="changing probe status" status=unhealthy reason="error executing compaction: first pass of downsampling failed: downsampling to 5 min: download block 01DZGKF24PTVWYYDKHK2QQJWSY: get file: The specified key does not exist."
level=error ts=2020-01-28T14:24:14.76724114Z caller=main.go:215 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: download block 01DZGKF24PTVWYYDKHK2QQJWSY: get file: The specified key does not exist."
level=debug ts=2020-01-28T14:24:15.588223712Z caller=main.go:122 msg="maxprocs: Leaving GOMAXPROCS=[12]: CPU quota undefined"
level=info ts=2020-01-28T14:24:15.589255282Z caller=main.go:170 msg="Tracing will be disabled"
level=info ts=2020-01-28T14:24:15.590047222Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2020-01-28T14:24:15.590649112Z caller=compact.go:341 msg="starting compact node"
level=info ts=2020-01-28T14:24:15.590688643Z caller=prober.go:114 msg="changing probe status" status=ready
level=info ts=2020-01-28T14:24:15.59075377Z caller=main.go:353 msg="listening for requests and metrics" component=compact address=0.0.0.0:10902
level=info ts=2020-01-28T14:24:15.590816215Z caller=prober.go:143 msg="changing probe status" status=healthy
level=info ts=2020-01-28T14:24:15.590865876Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:24:15.599163754Z caller=compact.go:287 msg="download meta" block=01DZK4HN2D34PKE784R3JT6PB8
level=debug ts=2020-01-28T14:24:15.599246865Z caller=compact.go:287 msg="download meta" block=01DZNQ2HSRJ9Q8ZWEG71BMK5GS
level=debug ts=2020-01-28T14:24:15.599394366Z caller=compact.go:287 msg="download meta" block=01DZM03DCXG2KEVKMXYBA99SKQ
level=debug ts=2020-01-28T14:24:15.599319912Z caller=compact.go:287 msg="download meta" block=01DZJ951BSHB6HHGMPZJN8VSD0
level=debug ts=2020-01-28T14:24:15.599657828Z caller=compact.go:287 msg="download meta" block=01DZ818HQVPNVKP0GE8D2PFS6J
level=debug ts=2020-01-28T14:24:15.599790304Z caller=compact.go:287 msg="download meta" block=01DZP4MX9YYVZ4KYF3WTR0WG1K
level=debug ts=2020-01-28T14:24:15.599874987Z caller=compact.go:287 msg="download meta" block=01DZP2XZGFREG363K173W9PTGT
level=debug ts=2020-01-28T14:24:15.599845917Z caller=compact.go:287 msg="download meta" block=01DZNXT2HZBNR9EYN818E95XA3
level=debug ts=2020-01-28T14:24:15.599994915Z caller=compact.go:287 msg="download meta" block=01DZMVKZMDFASKV2G83H6KTGYY
level=debug ts=2020-01-28T14:24:15.6001384Z caller=compact.go:287 msg="download meta" block=01DZP4NYMA97CVP9BPTGWET90W
level=debug ts=2020-01-28T14:24:15.600186198Z caller=compact.go:287 msg="download meta" block=01DZP6BV1P6R7ZQ8CTXWAA4N2G
level=debug ts=2020-01-28T14:24:15.600442895Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:24:15.604138426Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:24:15.605419756Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:24:15.606379936Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:24:15.616318893Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:24:15.616354982Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:25:00.643248245Z caller=downsample.go:257 msg="downloaded block" id=01DZJ951BSHB6HHGMPZJN8VSD0 duration=45.007297668s
level=info ts=2020-01-28T14:41:25.332694746Z caller=streamed_block_writer.go:219 msg="finalized downsampled block" mint=1579910400000 maxt=1580083200000 ulid=01DZP8NZD6D28A917VCYRAY24Z resolution=300000
level=info ts=2020-01-28T14:41:25.332814404Z caller=downsample.go:284 msg="downsampled block" from=01DZJ951BSHB6HHGMPZJN8VSD0 to=01DZP8NZD6D28A917VCYRAY24Z duration=15m55.843747521s
level=debug ts=2020-01-28T14:41:47.326194782Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/meta.json dst=debug/metas/01DZP8NZD6D28A917VCYRAY24Z.json bucket=demo-bucket
level=debug ts=2020-01-28T14:41:49.898299421Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000001 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000001 bucket=demo-bucket
level=debug ts=2020-01-28T14:41:52.243912111Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000002 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000002 bucket=demo-bucket
level=debug ts=2020-01-28T14:41:54.602850229Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000003 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000003 bucket=demo-bucket
level=debug ts=2020-01-28T14:41:58.859799988Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000004 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000004 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:02.479744803Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000005 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000005 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:05.040938974Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000006 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000006 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:07.150053951Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000007 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000007 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:09.261433314Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000008 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000008 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:11.476885228Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000009 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000009 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:13.827536732Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000010 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000010 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:16.121796762Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000011 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000011 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:18.576641672Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000012 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000012 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:21.066873548Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000013 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000013 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:23.338533853Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000014 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000014 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:25.446479393Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000015 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000015 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:27.319480786Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/chunks/000016 dst=01DZP8NZD6D28A917VCYRAY24Z/chunks/000016 bucket=demo-bucket
level=debug ts=2020-01-28T14:42:36.649024421Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/index dst=01DZP8NZD6D28A917VCYRAY24Z/index bucket=demo-bucket
level=debug ts=2020-01-28T14:42:36.759756435Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/index.cache.json dst=01DZP8NZD6D28A917VCYRAY24Z/index.cache.json bucket=demo-bucket
level=debug ts=2020-01-28T14:42:36.765952723Z caller=objstore.go:91 msg="uploaded file" from=/data/downsample/01DZP8NZD6D28A917VCYRAY24Z/meta.json dst=01DZP8NZD6D28A917VCYRAY24Z/meta.json bucket=demo-bucket
level=info ts=2020-01-28T14:42:36.766045255Z caller=downsample.go:298 msg="uploaded block" id=01DZP8NZD6D28A917VCYRAY24Z duration=49.520249132s
level=info ts=2020-01-28T14:42:37.943654647Z caller=compact.go:277 msg="start second pass of downsampling"
level=info ts=2020-01-28T14:42:38.095817535Z caller=compact.go:282 msg="downsampling iterations done"
level=info ts=2020-01-28T14:42:38.095908902Z caller=retention.go:17 msg="start optional retention"
level=info ts=2020-01-28T14:42:38.114436054Z caller=retention.go:46 msg="optional retention apply done"
level=info ts=2020-01-28T14:42:38.114532791Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:42:38.115916832Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:42:38.11615322Z caller=compact.go:287 msg="download meta" block=01DZP8NZD6D28A917VCYRAY24Z
level=debug ts=2020-01-28T14:42:38.11701229Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:42:38.117586822Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:42:38.118196663Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:42:38.131683866Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:42:38.131724779Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:42:38.156132968Z caller=compact.go:277 msg="start second pass of downsampling"
level=info ts=2020-01-28T14:42:38.176393435Z caller=compact.go:282 msg="downsampling iterations done"
level=info ts=2020-01-28T14:42:38.176441767Z caller=retention.go:17 msg="start optional retention"
level=info ts=2020-01-28T14:42:38.198332882Z caller=retention.go:46 msg="optional retention apply done"
level=info ts=2020-01-28T14:44:15.591191771Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:44:15.59545029Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:44:15.597166504Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP82S0F8GREDF8ZM6PDZ170
level=info ts=2020-01-28T14:44:15.597212812Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:44:15.59742025Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:44:15.609173795Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:44:15.609219124Z caller=compact.go:271 msg="start first pass of downsampling"
level=info ts=2020-01-28T14:44:15.628892325Z caller=compact.go:277 msg="start second pass of downsampling"
level=info ts=2020-01-28T14:44:15.647938686Z caller=compact.go:282 msg="downsampling iterations done"
level=info ts=2020-01-28T14:44:15.647985758Z caller=retention.go:17 msg="start optional retention"
level=info ts=2020-01-28T14:44:15.663966582Z caller=retention.go:46 msg="optional retention apply done"
level=info ts=2020-01-28T14:49:15.591214679Z caller=compact.go:1063 msg="start sync of metas"
level=debug ts=2020-01-28T14:49:15.595528417Z caller=compact.go:287 msg="download meta" block=01DZP82S0F8GREDF8ZM6PDZ170
level=debug ts=2020-01-28T14:49:15.595565711Z caller=compact.go:287 msg="download meta" block=01DZP9SPKK0Z8GA4MQFTATP7DC
level=debug ts=2020-01-28T14:49:15.598094135Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP9SPKK0Z8GA4MQFTATP7DC
level=info ts=2020-01-28T14:49:15.59816125Z caller=compact.go:1069 msg="start of GC"
level=info ts=2020-01-28T14:49:15.598420973Z caller=compact.go:1075 msg="start of compaction"
level=info ts=2020-01-28T14:49:15.615299837Z caller=compact.go:264 msg="compaction iterations done"
level=info ts=2020-01-28T14:49:15.615337358Z caller=compact.go:271 msg="start first pass of downsampling"
level=info
ts=2020-01-28T14:49:15.633718744Z caller=compact.go:277 msg="start second pass of downsampling" level=info ts=2020-01-28T14:49:15.65181404Z caller=compact.go:282 msg="downsampling iterations done" level=info ts=2020-01-28T14:49:15.651845188Z caller=retention.go:17 msg="start optional retention" level=info ts=2020-01-28T14:49:15.668905547Z caller=retention.go:46 msg="optional retention apply done" level=info ts=2020-01-28T14:54:15.591244527Z caller=compact.go:1063 msg="start sync of metas" level=debug ts=2020-01-28T14:54:15.596355904Z caller=compact.go:287 msg="download meta" block=01DZP9SPKK0Z8GA4MQFTATP7DC level=debug ts=2020-01-28T14:54:15.598541543Z caller=compact.go:309 msg="block is too fresh for now" block=01DZP9SPKK0Z8GA4MQFTATP7DC level=info ts=2020-01-28T14:54:15.598642484Z caller=compact.go:1069 msg="start of GC" level=info ts=2020-01-28T14:54:15.598924204Z caller=compact.go:1075 msg="start of compaction" level=info ts=2020-01-28T14:54:15.613441393Z caller=compact.go:264 msg="compaction iterations done" level=info ts=2020-01-28T14:54:15.613500509Z caller=compact.go:271 msg="start first pass of downsampling" level=info ts=2020-01-28T14:54:15.635370095Z caller=compact.go:277 msg="start second pass of downsampling" level=info ts=2020-01-28T14:54:15.653292745Z caller=compact.go:282 msg="downsampling iterations done" level=info ts=2020-01-28T14:54:15.653343805Z caller=retention.go:17 msg="start optional retention" level=info ts=2020-01-28T14:54:15.671422298Z caller=retention.go:46 msg="optional retention apply done"

@zshearin changed the title from "Thanos Compact stuck in infinite loop when index file missing" to "Thanos Compact stuck in continuous restart loop when index file missing" on Jan 28, 2020
@bwplotka
Member

Can you check the latest version? This was fixed in v0.10.1. I believe it might be connected to this issue.

Please upgrade the compactor to v0.10.1 (make sure only one is running), then delete the malformed downsampled block from the bucket. Deleting it should be fine, as you should still have the raw data available for this time frame.

If after this you can still repro, let us know - otherwise closing (:
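To find which block to delete, one option is to scan a listing of the bucket for blocks that have a `meta.json` but no `index`. The following is only a minimal sketch, not part of Thanos: `blocks_missing_index` and the `keys` listing are hypothetical, and the object layout (`<block ULID>/index`, `<block ULID>/meta.json`, `<block ULID>/chunks/...`) is assumed from the `dst=` paths in the logs above. It only reports candidates and deletes nothing.

```python
import re
from collections import defaultdict

# Block IDs are ULIDs: 26 characters of Crockford base32 (no I, L, O, U).
ULID_RE = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")


def blocks_missing_index(object_keys):
    """Return sorted block IDs that have a meta.json but no index object.

    object_keys is a hypothetical flat listing of object names in the
    bucket, e.g. "01DZP8NZD6D28A917VCYRAY24Z/index".
    """
    files_by_block = defaultdict(set)
    for key in object_keys:
        block_id, sep, rest = key.partition("/")
        # Skip anything that is not under a block directory (e.g. debug/metas/...).
        if sep and ULID_RE.match(block_id):
            files_by_block[block_id].add(rest)
    return sorted(
        block_id
        for block_id, files in files_by_block.items()
        if "meta.json" in files and "index" not in files
    )


# Illustrative listing only; the second block is missing its index.
keys = [
    "01DZP8NZD6D28A917VCYRAY24Z/meta.json",
    "01DZP8NZD6D28A917VCYRAY24Z/index",
    "01DZP8NZD6D28A917VCYRAY24Z/chunks/000001",
    "01DZGKF24PTVWYYDKHK2QQJWSY/meta.json",
    "01DZGKF24PTVWYYDKHK2QQJWSY/chunks/000001",
    "debug/metas/01DZP8NZD6D28A917VCYRAY24Z.json",
]
print(blocks_missing_index(keys))
```

How the listing is produced depends on the object store client in use (MinIO here), so that part is left out on purpose. Review the reported IDs by hand before removing anything from the bucket.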

@zshearin
Author

Okay, great, thank you! Thanks for the quick response, @bwplotka!
