Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a retry logic and a service availability check function for high availability. #136

Merged
merged 6 commits into from
Apr 30, 2024

Conversation

OlehPalanskyi
Copy link
Contributor

@OlehPalanskyi OlehPalanskyi commented Apr 14, 2024

Description

Problem:
If there are two servers in the hosts field (example: hosts "server1:9200,server2:9200"), automatic switching to the available host does not work. If server1 is unavailable, it does not switch to server2, and the service writes an error and freezes:
"Unexpected error raised. Stopping the timer. title=:in_opensearch_timer error_class=Faraday::ConnectionFailed error="EOFError (EOFError)""

Solution:
I added a retry logic in case of connection failures.
I created a service availability check function for high availability. New function get_reachable_hosts in in_opensearch.rb
I used one library opensearch to solve this problem.
And added ignoring any clear_scroll errors.

Checklist

  • tests added
  • tests passing
  • README.OpenSearchInput.md updated
  • History.md and version in gemspec are untouched
  • backward compatible
Monosnap Dockerfile—2024-04-14 14-18-21

@OlehPalanskyi OlehPalanskyi marked this pull request as draft April 14, 2024 11:41
@OlehPalanskyi OlehPalanskyi changed the title Added a retry logic for both client creation and search operations in case of connection failures. Added a retry logic and a service availability check function for high availability. Apr 15, 2024
Copy link
Collaborator

@cosmo0920 cosmo0920 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some of the my concerns. No worries, the direction of the patch should be reasonable but we need to handle for error cases and retrying period precisely.

lib/fluent/plugin/in_opensearch.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/in_opensearch.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/in_opensearch.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/in_opensearch.rb Show resolved Hide resolved
@OlehPalanskyi
Copy link
Contributor Author

Thank you for your comments
I'll rework my patch based on your suggestions.

lib/fluent/plugin/in_opensearch.rb Outdated Show resolved Hide resolved
…on and search operations in case of connection failures.

Signed-off-by: Oleh Palanskyi <trahterber@gmail.com>

Signed-off-by: Oleh <Trahterber@gmail.com>
Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
I created a service availability check function for high availability. New function get_reachable_hosts in in_opensearch.rb
I used one library opensearch to solve this problem.

Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
Signed-off-by: Oleh  <trahterber@gmail.com>

Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
Signed-off-by: Oleh  <trahterber@gmail.com>

Signed-off-by: OlehPalanskyi <Trahterber@gmail.com>
Copy link
Collaborator

@cosmo0920 cosmo0920 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work. It's brilliant work! All of the implementation should be exceeded my expectations.

@cosmo0920
Copy link
Collaborator

@kenhys Any concerns?

@cosmo0920
Copy link
Collaborator

cosmo0920 commented Apr 26, 2024

On thing, I had a nitpick though. The retry_state often should be used inside of <buffer> sections. However, Input plugins of Fluentd do not have buffering mechanism. So, it should be wrong usage if we use <buffer> for this retrying functionality.
How you do think this nitpick, @kenhys or @daipom ?
The main concern is including retrying configurations should be going to some of the messing up for configuration parameters. So, we need to handle them within the dedicated sections somehow but it's not mandatory but optional thing.

@cosmo0920 cosmo0920 merged commit 89fe756 into fluent:main Apr 30, 2024
11 of 12 checks passed
kenhys added a commit that referenced this pull request Oct 2, 2024
Mainly improved retry logic and refresh aws credentials.

#116
#119
#131
#136
#142
#143
kenhys added a commit that referenced this pull request Oct 2, 2024
Mainly improved retry logic and refresh aws credentials.

#116
#119
#131
#136
#142
#143
Signed-off-by: Kentaro Hayashi <hayashi@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants