Fix failing e2e test #7620

Merged
saswatamcode merged 1 commit into thanos-io:main from the fix_e2e branch
Aug 12, 2024


harry671003
Contributor

harry671003 commented Aug 9, 2024

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

There seems to be a bug in avalanche: when only --metric-interval is set, no time series are returned and avalanche makes no write requests. See: write.go#L143

Using --series-interval and --sample-interval seems to fix the test.

I haven't had a chance to look at the avalanche bug in depth, but this PR should unblock the Thanos e2e tests.

Verification

  • CI Workflows should pass

harry671003 force-pushed the fix_e2e branch 6 times, most recently from 08f7728 to 0e6cb7f, on August 12, 2024 17:03
harry671003 marked this pull request as ready for review on August 12, 2024 17:26
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Member

saswatamcode left a comment

LGTM nice catch! But to prevent this, we should probably pin the image we use here too?

edit: Avalanche only has a main tag, so can't really replace. So merging!

Contributor

MichaHoffmann left a comment

Thank you!

saswatamcode merged commit 49617f4 into thanos-io:main on Aug 12, 2024
20 checks passed
harry671003 deleted the fix_e2e branch on August 12, 2024 18:15
saswatamcode pushed a commit to saswatamcode/thanos that referenced this pull request Aug 13, 2024
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
saswatamcode pushed a commit to saswatamcode/thanos that referenced this pull request Aug 13, 2024
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
saswatamcode added a commit that referenced this pull request Aug 13, 2024
* Proxy: Query goroutine leak when `store.response-timeout` is set (#7618)

time.AfterFunc() returns a time.Timer object whose C field is nil,
according to the documentation. A goroutine blocks forever on reading
from a `nil` channel, leading to a goroutine leak on random slow
queries.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* pkg/clientconfig: fix TLS configs with only CA (#7634)

065e3dd introduced a regression: TLS configurations for Thanos Ruler
query and alerting with only a CA file failed to load.

For instance, the following snippet is a valid query configuration:

```
- static_configs:
  - prometheus.example.com:9090
  scheme: https
  http_config:
    tls_config:
      ca_file: /etc/ssl/cert.pem
```

The test fixtures (CA, certificate and key files) are copied from
prometheus/common and are valid until 2072.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
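
As a rough illustration of the #7634 entry above, a CA-only configuration means the loader has to treat the client certificate and key as optional. The sketch below is an assumption about how such a loader could look, built only on the Go standard library; `tlsOptions` and `newTLSConfig` are hypothetical names and this is not the `pkg/clientconfig` code.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// tlsOptions is a hypothetical struct for illustration only.
type tlsOptions struct {
	CAFile   string
	CertFile string
	KeyFile  string
}

// newTLSConfig builds a tls.Config in which every file is optional.
// A CA file alone is accepted; a client certificate is loaded only
// when both the certificate and the key are given.
func newTLSConfig(o tlsOptions) (*tls.Config, error) {
	cfg := &tls.Config{}

	if o.CAFile != "" {
		pem, err := os.ReadFile(o.CAFile)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("no certificates found in CA file")
		}
		cfg.RootCAs = pool
	}

	// Reject a certificate without a key, or a key without a certificate.
	if (o.CertFile == "") != (o.KeyFile == "") {
		return nil, errors.New("cert file and key file must be set together")
	}
	if o.CertFile != "" {
		cert, err := tls.LoadX509KeyPair(o.CertFile, o.KeyFile)
		if err != nil {
			return nil, fmt.Errorf("load client key pair: %w", err)
		}
		cfg.Certificates = []tls.Certificate{cert}
	}
	return cfg, nil
}

func main() {
	// No files at all: an empty but valid config.
	cfg, err := newTLSConfig(tlsOptions{})
	fmt.Println(cfg != nil, err)

	// A certificate without its key is rejected.
	_, err = newTLSConfig(tlsOptions{CertFile: "client.crt"})
	fmt.Println(err)
}
```

With this shape, passing only `CAFile` (for example the `ca_file` path from the snippet above) sets `RootCAs` and leaves `Certificates` empty.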

* Cut patch release v0.36.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix failing e2e test (#7620)

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>
saswatamcode added a commit that referenced this pull request Aug 14, 2024
* CHANGELOG: Mark 0.36 as in progress

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate v0.36.0-rc.0 (#7490)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate 0.36.0 rc.1 (#7510)

* *: fix server grpc histograms (#7493)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Close endpoints after the gRPC server has terminated (#7509)

Endpoints are currently closed as soon as we receive a SIGTERM or SIGINT.
This causes in-flight queries to get cancelled since outgoing connections
get closed instantly.

This commit moves the endpoints.Close call after the grpc server shutdown
to make sure connections are available as long as the server is running.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release candidate v0.36.0-rc.1

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release v0.36.0 (#7578)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut patch release `v0.36.1` (#7636)

* Proxy: Query goroutine leak when `store.response-timeout` is set (#7618)

time.AfterFunc() returns a time.Timer object whose C field is nil,
according to the documentation. A goroutine blocks forever on reading
from a `nil` channel, leading to a goroutine leak on random slow
queries.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* pkg/clientconfig: fix TLS configs with only CA (#7634)

065e3dd introduced a regression: TLS configurations for Thanos Ruler
query and alerting with only a CA file failed to load.

For instance, the following snippet is a valid query configuration:

```
- static_configs:
  - prometheus.example.com:9090
  scheme: https
  http_config:
    tls_config:
      ca_file: /etc/ssl/cert.pem
```

The test fixtures (CA, certificate and key files) are copied from
prometheus/common and are valid until 2072.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Cut patch release v0.36.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix failing e2e test (#7620)

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>
hczhu-db pushed a commit to databricks/thanos that referenced this pull request Aug 22, 2024
* Proxy: Query goroutine leak when `store.response-timeout` is set (thanos-io#7618)

time.AfterFunc() returns a time.Timer object whose C field is nil,
according to the documentation. A goroutine blocks forever on reading
from a `nil` channel, leading to a goroutine leak on random slow
queries.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* pkg/clientconfig: fix TLS configs with only CA (thanos-io#7634)

065e3dd introduced a regression: TLS configurations for Thanos Ruler
query and alerting with only a CA file failed to load.

For instance, the following snippet is a valid query configuration:

```
- static_configs:
  - prometheus.example.com:9090
  scheme: https
  http_config:
    tls_config:
      ca_file: /etc/ssl/cert.pem
```

The test fixtures (CA, certificate and key files) are copied from
prometheus/common and are valid until 2072.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Cut patch release v0.36.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix failing e2e test (thanos-io#7620)

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>
GiedriusS pushed a commit to vinted/thanos that referenced this pull request Aug 28, 2024
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

4 participants