Silently dropped logs with out of box config #1255
Comments
Hi @laupow - thank you for reporting!
@laupow Sorry for the late response. By using a graceful shutdown period together with liveness and readiness probes, we ensure that logs arrive without gaps. We observed that in fluent-bit below. In addition, we are going to improve load balancing and HPA by disabling keepalive for fluent-bit (#1495).
Awesome, thanks for the update. Looking forward to
Hi @laupow, let me close this issue. Please let me know if the problem still exists.
Describe the bug
Higher volume log environments need better configuration guardrails to ensure logs aren't dropped silently.
Recently, two different engineers expected logs in our production environment and found none.
One instance was a long-running service with intermittently missing messages (screenshot attached). Another instance was a new Deployment whose logs were not captured (the logs were verified with `kubectl logs <pod>`; screenshot in ticket).
Logs
Logs available in ticket
Command used to install/upgrade Collection
with Helm 2
Configuration
To Reproduce
I have not been able to reproduce the issue. On Dec 15 we manually changed the HPA minimum from 3 to 7, and nobody has reported issues since then, but 🤷
The issue occurs in our production environment so there is somewhat of a disincentive to reproduce it :)
Expected behavior
Provide a clear signal (pod crash, log message) when there is a capacity issue or other case that might cause logs to drop.
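To make the request concrete, here is a minimal sketch of the kind of signal meant above. It is not part of the collection. It assumes the fluentd pods expose Prometheus text metrics and that the metric names match those commonly exported by fluent-plugin-prometheus; the thresholds and the `capacity_warnings` helper are hypothetical.

```python
import re

# Assumed metric names (as commonly exported by fluent-plugin-prometheus).
# The thresholds are hypothetical and would need tuning per environment.
WATCHED = {
    "fluentd_output_status_buffer_queue_length": 10.0,
    "fluentd_output_status_retry_count": 0.0,
}

# Matches "name{labels} value" or "name value" in Prometheus text format.
# Simplification: label values containing "}" are not handled.
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def capacity_warnings(metrics_text, thresholds=WATCHED):
    """Return one warning string per watched sample above its threshold."""
    warnings = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, labels, raw = match.group(1), match.group(2) or "", match.group(3)
        if name not in thresholds:
            continue
        try:
            value = float(raw)
        except ValueError:
            continue  # ignore samples whose value cannot be parsed
        if value > thresholds[name]:
            warnings.append(
                f"{name}{labels} = {value:g} exceeds {thresholds[name]:g}"
            )
    return warnings
```

Something like this could run against each pod's scraped metrics text and emit a log message or alert whenever the returned list is non-empty, so that capacity pressure shows up before anyone notices missing logs.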
Environment (please complete the following information):
Collection version (e.g. `helm ls -n sumologic`): 1.3.1
Kubernetes version (e.g. `kubectl version`): 1.15.11
Anything else do we need to know
There are fluentd pod restarts, but they haven't correlated with times when logs are missing.
fluentd pod metrics, HPA minimum adjusted Dec 15
Sumo collector volume