Expose more information in clusterloader2 logs #293

Open
4 of 5 tasks
wojtek-t opened this issue Nov 12, 2018 · 6 comments
Labels
area/clusterloader kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability.

Comments

@wojtek-t
Member

wojtek-t commented Nov 12, 2018

There are a couple of things that we definitely need:

  • more state about the pods of a given controlling object (number pending, number waiting, whether any were deleted, etc.), mostly copying this logic:
    https://github.com/kubernetes/kubernetes/blob/master/test/utils/runners.go#L803
  • the pod-startup-time latency measurement should output something similar to what we currently do (for debugging purposes)
  • show more clearly where a given test finished:
W1112 13:18:48.029] I1112 13:18:48.029322    9960 clusterloader.go:127] Test testing/density/config.yaml ran successfully!"

This line is not very visible in the logs.

  • We currently print information about nodes that is extremely helpful for debugging (this is currently part of the density test). It would be useful to add that too, probably as part of cluster loader initialization.
  • You need to audit the logging in measurements: a number of glog calls should actually be real failures that fail the test at the end (though not immediately). I can imagine this as something like "Reduce flakiness of density test" kubernetes#66239 (comment), or as a special measurement that collects errors internally (logging them with glog as they happen) and fails at the end if any errors were reported (this should be simpler than a separate flakes.txt file).

I guess there may be more, but let's start with those.

/assign @krzysied

@wojtek-t
Member Author

@kubernetes/sig-scalability-bugs

@k8s-ci-robot k8s-ci-robot added sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. kind/bug Categorizes issue or PR as related to a bug. labels Nov 12, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 10, 2019
@wojtek-t
Member Author

/remove-lifecycle stale

@krzysied - what's the status of this?

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 11, 2019
@krzysied
Contributor

@wojtek-t The first 4 points are done. The last one is partially done: there are no errors that immediately fail the test; however, there is no flakes.txt file.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 15, 2019
@wojtek-t
Member Author

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 16, 2019