Double dipping on job service #505

Open
johnml1135 opened this issue Oct 7, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@johnml1135
Collaborator

The Engine and Job servers both appear to be doing the same tasks (for instance, each monitors ClearML health and runs a full job server). This duplication is likely unneeded; there should be a clearer separation between what the Engine Server does and what the Job Server does.

@johnml1135 johnml1135 added the bug Something isn't working label Oct 7, 2024
@johnml1135 johnml1135 self-assigned this Oct 7, 2024
@johnml1135
Collaborator Author

So, can we make the Engine Server do only ClearML health checking and the Job Server only queue and report on jobs?

@ddaspit
Contributor

ddaspit commented Oct 8, 2024

Although unlikely, it is possible that some services are having difficulty connecting to ClearML while others are not. The point of a service's health check is to determine whether that service is healthy, not whether the whole system is healthy.
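
To illustrate the per-service point, here is a minimal Python sketch. It is not the project's actual code: `ServiceHealth`, `check_service`, the service names, and the stand-in probe functions are all hypothetical, and a real probe would call ClearML rather than return a fixed value. It only shows that when each service runs its own probe, one service can report unhealthy while another reports healthy.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceHealth:
    """Hypothetical per-service health result (not from the project)."""
    service: str
    healthy: bool
    detail: str


def check_service(service: str, ping_clearml: Callable[[], bool]) -> ServiceHealth:
    """Report whether this one service can reach ClearML right now."""
    try:
        ok = ping_clearml()
    except Exception as exc:  # a failed connection counts as unhealthy
        return ServiceHealth(service, False, f"ClearML unreachable: {exc}")
    return ServiceHealth(service, ok, "ok" if ok else "ClearML ping failed")


def failing_ping() -> bool:
    """Stand-in probe that simulates a service that cannot reach ClearML."""
    raise ConnectionError("timed out")


# Each service runs its own probe, so results can differ between services.
probes: Dict[str, Callable[[], bool]] = {
    "engine-server": lambda: True,   # assumed: this service reaches ClearML
    "job-server": failing_ping,      # assumed: this service does not
}
results = {name: check_service(name, probe) for name, probe in probes.items()}

for health in results.values():
    print(health)
```

Under these assumptions, a single shared check would hide the second service's connection problem, which is the case the per-service check is meant to catch.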
