Downgrade the version of Apache Curator from 5.5.0 to 5.3.0 to avoid a bug in the new version #16425

Merged: 1 commit into apache:master, May 10, 2024

asdf2014
Member

Fixes #16411

Release note


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@cryptoe
Contributor

cryptoe commented May 10, 2024

I think this is important enough to hold the Druid 30 release.

@cryptoe cryptoe merged commit cb7c2c1 into apache:master May 10, 2024
88 checks passed
@asdf2014 asdf2014 deleted the downgrade-curator branch May 10, 2024 09:53
adarshsanjeev pushed a commit to adarshsanjeev/druid that referenced this pull request May 10, 2024
@gianm
Contributor

gianm commented May 10, 2024

Are there any fixes in Curator 5.4 or 5.5 for other bugs that would be important for us? I ask because, if so, we could stay with the newer Curator and work around this bug, for example by closing and recreating our LeaderLatch when the ZK session changes.

@@ -75,7 +75,7 @@
 <java.version>8</java.version>
 <project.build.resourceEncoding>UTF-8</project.build.resourceEncoding>
 <aether.version>0.9.0.M2</aether.version>
-<apache.curator.version>5.5.0</apache.curator.version>
+<apache.curator.version>5.3.0</apache.curator.version>
Contributor

There could be a comment here about why we're on this version, so the rationale doesn't get forgotten. (A link to the Curator JIRA is best.)

Member Author

@gianm Done. FYI, #16444

cryptoe pushed a commit that referenced this pull request May 10, 2024
…a bug in the new version (#16425) (#16430)

Co-authored-by: Benedict Jin <asdf2014@apache.org>
@gianm
Contributor

gianm commented May 10, 2024

Just reviewed the release notes. By downgrading from 5.5 to 5.3 we do lose various fixes. These sound like they could be important:

  • [CURATOR-504] - Race conditions in LeaderLatch after reconnecting to ensemble
  • [CURATOR-638] - Curator disconnect from zookeeper when IPs change [seems especially relevant to k8s environments]
  • [CURATOR-644] - CLONE - Race conditions in LeaderLatch after reconnecting to ensemble [a live-lock issue; we believe the fix for this bug introduced the split-brain problem; so rolling back would reintroduce the live-lock issue]
  • [CURATOR-649] - Background exception was not retry-able or retry gave up [robustness]

With regard to downgrading Curator to 5.3 in Druid 30, I think we should be especially careful of these issues, and in particular CURATOR-638 and possible impact on k8s environments.

Alternative approaches include:

  • Stay with Curator 5.5 and work around this bug, for example by closing and recreating our LeaderLatch when the ZK session changes.
  • Stay with Curator 5.5 and don't work around this bug; wait for release of Curator 5.7 which will hopefully include a fix.
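
The first alternative above (close and recreate the latch when the ZK session changes) could look roughly like the sketch below. It is deliberately library-free: ElectionLatch, RecordingLatch, and SessionScopedLatch are hypothetical names invented for illustration, not Curator or Druid classes, and the onSession hook assumes some caller can report the current session id. In real code the latch would be Curator's LeaderLatch and the trigger would come from the client's connection-state notifications; that wiring is not shown here.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a leader-election handle such as Curator's LeaderLatch.
// This is NOT the Curator API; only the start/close lifecycle is modeled.
interface ElectionLatch {
    void start();
    void close();
}

// Trivial implementation that only records lifecycle calls, so the sketch is runnable.
final class RecordingLatch implements ElectionLatch {
    boolean started;
    boolean closed;

    @Override public void start() { started = true; }
    @Override public void close() { closed = true; }
}

// Owns one latch at a time and replaces it whenever the reported session id changes.
final class SessionScopedLatch<T extends ElectionLatch> {
    private final Supplier<T> factory;
    private T current;
    private long sessionId;

    SessionScopedLatch(Supplier<T> factory, long initialSessionId) {
        this.factory = factory;
        this.sessionId = initialSessionId;
        this.current = factory.get();
        this.current.start();
    }

    // Assumed to be called by whatever hook observes the current session id.
    // Returns true if the latch was closed and recreated.
    synchronized boolean onSession(long newSessionId) {
        if (newSessionId == sessionId) {
            return false; // same session: keep the existing latch
        }
        current.close();          // give up any leadership tied to the old session
        current = factory.get();  // a fresh latch re-enters the election
        current.start();
        sessionId = newSessionId;
        return true;
    }

    synchronized T current() {
        return current;
    }
}
```

The point of the design is that a latch never outlives the session it was created in, so leadership state from an old session cannot linger. Whether this actually avoids the two-leader problem would still need to be verified against the real Curator behavior.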

gianm pushed a commit to gianm/druid that referenced this pull request May 10, 2024
adarshsanjeev added a commit to adarshsanjeev/druid that referenced this pull request May 14, 2024
adarshsanjeev added a commit that referenced this pull request May 14, 2024
…o avoid a bug in the new version (#16425) (#16430)" (#16445)

This reverts commit f3d207c.
@razinbouzar razinbouzar mentioned this pull request Jun 17, 2024
10 tasks
kfaraz added a commit to kfaraz/druid that referenced this pull request Jul 3, 2024
…o avoid a bug in the new version (apache#16425)"

This reverts commit cb7c2c1.
abhishekagarwal87 pushed a commit that referenced this pull request Jul 3, 2024
…o avoid a bug in the new version (#16425)" (#16688)

This reverts commit cb7c2c1.
Successfully merging this pull request may close these issues.

2 Coordinators Elected Leader
3 participants