webhook autoscaler doesn't recognize organization target #534
Comments
@yfried Could you share with us the webhook payload of the event whose delivery ID is |
@yfried This is the relevant code in the controller. You could probably try debugging it and see if it's really working as intended on your webhook payload: https://github.com/actions-runner-controller/actions-runner-controller/blob/082245c5db64e023cd79604d9b158f336770e3fe/controllers/horizontal_runner_autoscaler_webhook.go#L170-L184 |
@mumoshu
{
"action": "created",
"check_run": {
"id": 1234567890,
"node_id": "MDg6Q2hlY2tSdW4yNTM4NzI2Nzk0",
"head_sha": "0989e14252273b848d0a6bd20c58ec41558c02ce",
"external_id": "f73aec57-e22c-559d-9485-1eb8b431809b",
"url": "https://api.github.com/repos/Myorg-Private/myrepo/check-runs/1234567890",
"html_url": "https://github.com/Myorg-Private/myrepo/runs/1234567890",
"details_url": "https://github.com/Myorg-Private/myrepo/runs/1234567890",
"status": "queued",
"conclusion": null,
"started_at": "2021-05-09T11:46:42Z",
"completed_at": null,
"output": {
"title": null,
"summary": null,
"text": null,
"annotations_count": 0,
"annotations_url": "https://api.github.com/repos/Myorg-Private/myrepo/check-runs/1234567890/annotations"
},
"name": "Run - yarn test:integration, node 14.x",
"check_suite": {
"id": 2690524465,
"node_id": "MDEwOkNoZWNrU3VpdGUyNjkwNTI0NDY1",
"head_branch": "main",
"head_sha": "0989e14252273b848d0a6bd20c58ec41558c02ce",
"status": "queued",
"conclusion": null,
"url": "https://api.github.com/repos/Myorg-Private/myrepo/check-suites/2690524465",
"before": "c4e307e05ceba8530355c5df987c2220402956dc",
"after": "0989e14252273b848d0a6bd20c58ec41558c02ce",
"pull_requests": [
],
"app": {
"id": 15368,
"slug": "github-actions",
"node_id": "MDM6QXBwMTUzNjg=",
"owner": {
"login": "github",
"id": 9919,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjk5MTk=",
"avatar_url": "https://avatars.githubusercontent.com/u/9919?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/github",
"html_url": "https://github.com/github",
"followers_url": "https://api.github.com/users/github/followers",
"following_url": "https://api.github.com/users/github/following{/other_user}",
"gists_url": "https://api.github.com/users/github/gists{/gist_id}",
"starred_url": "https://api.github.com/users/github/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/github/subscriptions",
"organizations_url": "https://api.github.com/users/github/orgs",
"repos_url": "https://api.github.com/users/github/repos",
"events_url": "https://api.github.com/users/github/events{/privacy}",
"received_events_url": "https://api.github.com/users/github/received_events",
"type": "Organization",
"site_admin": false
},
"name": "GitHub Actions",
"description": "Automate your workflow from idea to production",
"external_url": "https://help.github.com/en/actions",
"html_url": "https://github.com/apps/github-actions",
"created_at": "2018-07-30T09:30:17Z",
"updated_at": "2019-12-10T19:04:12Z",
"permissions": {
"actions": "write",
"checks": "write",
"contents": "write",
"deployments": "write",
"issues": "write",
"metadata": "read",
"organization_packages": "write",
"packages": "write",
"pages": "write",
"pull_requests": "write",
"repository_hooks": "write",
"repository_projects": "write",
"security_events": "write",
"statuses": "write",
"vulnerability_alerts": "read"
},
"events": [
"check_run",
"check_suite",
"create",
"delete",
"deployment",
"deployment_status",
"fork",
"gollum",
"issues",
"issue_comment",
"label",
"milestone",
"page_build",
"project",
"project_card",
"project_column",
"public",
"pull_request",
"pull_request_review",
"pull_request_review_comment",
"push",
"registry_package",
"release",
"repository",
"repository_dispatch",
"status",
"watch",
"workflow_dispatch",
"workflow_run"
]
},
"created_at": "2021-05-09T11:46:41Z",
"updated_at": "2021-05-09T11:46:41Z"
},
"app": {
"id": 15368,
"slug": "github-actions",
"node_id": "MDM6QXBwMTUzNjg=",
"owner": {
"login": "github",
"id": 9919,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjk5MTk=",
"avatar_url": "https://avatars.githubusercontent.com/u/9919?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/github",
"html_url": "https://github.com/github",
"followers_url": "https://api.github.com/users/github/followers",
"following_url": "https://api.github.com/users/github/following{/other_user}",
"gists_url": "https://api.github.com/users/github/gists{/gist_id}",
"starred_url": "https://api.github.com/users/github/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/github/subscriptions",
"organizations_url": "https://api.github.com/users/github/orgs",
"repos_url": "https://api.github.com/users/github/repos",
"events_url": "https://api.github.com/users/github/events{/privacy}",
"received_events_url": "https://api.github.com/users/github/received_events",
"type": "Organization",
"site_admin": false
},
"name": "GitHub Actions",
"description": "Automate your workflow from idea to production",
"external_url": "https://help.github.com/en/actions",
"html_url": "https://github.com/apps/github-actions",
"created_at": "2018-07-30T09:30:17Z",
"updated_at": "2019-12-10T19:04:12Z",
"permissions": {
"actions": "write",
"checks": "write",
"contents": "write",
"deployments": "write",
"issues": "write",
"metadata": "read",
"organization_packages": "write",
"packages": "write",
"pages": "write",
"pull_requests": "write",
"repository_hooks": "write",
"repository_projects": "write",
"security_events": "write",
"statuses": "write",
"vulnerability_alerts": "read"
},
"events": [
"check_run",
"check_suite",
"create",
"delete",
"deployment",
"deployment_status",
"fork",
"gollum",
"issues",
"issue_comment",
"label",
"milestone",
"page_build",
"project",
"project_card",
"project_column",
"public",
"pull_request",
"pull_request_review",
"pull_request_review_comment",
"push",
"registry_package",
"release",
"repository",
"repository_dispatch",
"status",
"watch",
"workflow_dispatch",
"workflow_run"
]
},
"pull_requests": [
]
},
"repository": {
"id": 1231231231,
"node_id": "MDEwOlJlcG9zaXRvcnkzMjAwNTMwOTg=",
"name": "myrepo",
"full_name": "Myorg-Private/myrepo",
"private": true,
"owner": {
"login": "Myorg-Private",
"id": 11223344,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjMyMzYxMTk5",
"avatar_url": "https://avatars.githubusercontent.com/u/11223344?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/Myorg-Private",
"html_url": "https://github.com/Myorg-Private",
"followers_url": "https://api.github.com/users/Myorg-Private/followers",
"following_url": "https://api.github.com/users/Myorg-Private/following{/other_user}",
"gists_url": "https://api.github.com/users/Myorg-Private/gists{/gist_id}",
"starred_url": "https://api.github.com/users/Myorg-Private/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/Myorg-Private/subscriptions",
"organizations_url": "https://api.github.com/users/Myorg-Private/orgs",
"repos_url": "https://api.github.com/users/Myorg-Private/repos",
"events_url": "https://api.github.com/users/Myorg-Private/events{/privacy}",
"received_events_url": "https://api.github.com/users/Myorg-Private/received_events",
"type": "Organization",
"site_admin": false
},
"html_url": "https://github.com/Myorg-Private/myrepo",
"description": "We are merging all of our customer facing applications into a single, seamless experience...my.myenterprise.com",
"fork": false,
"url": "https://api.github.com/repos/Myorg-Private/myrepo",
"forks_url": "https://api.github.com/repos/Myorg-Private/myrepo/forks",
"keys_url": "https://api.github.com/repos/Myorg-Private/myrepo/keys{/key_id}",
"collaborators_url": "https://api.github.com/repos/Myorg-Private/myrepo/collaborators{/collaborator}",
"teams_url": "https://api.github.com/repos/Myorg-Private/myrepo/teams",
"hooks_url": "https://api.github.com/repos/Myorg-Private/myrepo/hooks",
"issue_events_url": "https://api.github.com/repos/Myorg-Private/myrepo/issues/events{/number}",
"events_url": "https://api.github.com/repos/Myorg-Private/myrepo/events",
"assignees_url": "https://api.github.com/repos/Myorg-Private/myrepo/assignees{/user}",
"branches_url": "https://api.github.com/repos/Myorg-Private/myrepo/branches{/branch}",
"tags_url": "https://api.github.com/repos/Myorg-Private/myrepo/tags",
"blobs_url": "https://api.github.com/repos/Myorg-Private/myrepo/git/blobs{/sha}",
"git_tags_url": "https://api.github.com/repos/Myorg-Private/myrepo/git/tags{/sha}",
"git_refs_url": "https://api.github.com/repos/Myorg-Private/myrepo/git/refs{/sha}",
"trees_url": "https://api.github.com/repos/Myorg-Private/myrepo/git/trees{/sha}",
"statuses_url": "https://api.github.com/repos/Myorg-Private/myrepo/statuses/{sha}",
"languages_url": "https://api.github.com/repos/Myorg-Private/myrepo/languages",
"stargazers_url": "https://api.github.com/repos/Myorg-Private/myrepo/stargazers",
"contributors_url": "https://api.github.com/repos/Myorg-Private/myrepo/contributors",
"subscribers_url": "https://api.github.com/repos/Myorg-Private/myrepo/subscribers",
"subscription_url": "https://api.github.com/repos/Myorg-Private/myrepo/subscription",
"commits_url": "https://api.github.com/repos/Myorg-Private/myrepo/commits{/sha}",
"git_commits_url": "https://api.github.com/repos/Myorg-Private/myrepo/git/commits{/sha}",
"comments_url": "https://api.github.com/repos/Myorg-Private/myrepo/comments{/number}",
"issue_comment_url": "https://api.github.com/repos/Myorg-Private/myrepo/issues/comments{/number}",
"contents_url": "https://api.github.com/repos/Myorg-Private/myrepo/contents/{+path}",
"compare_url": "https://api.github.com/repos/Myorg-Private/myrepo/compare/{base}...{head}",
"merges_url": "https://api.github.com/repos/Myorg-Private/myrepo/merges",
"archive_url": "https://api.github.com/repos/Myorg-Private/myrepo/{archive_format}{/ref}",
"downloads_url": "https://api.github.com/repos/Myorg-Private/myrepo/downloads",
"issues_url": "https://api.github.com/repos/Myorg-Private/myrepo/issues{/number}",
"pulls_url": "https://api.github.com/repos/Myorg-Private/myrepo/pulls{/number}",
"milestones_url": "https://api.github.com/repos/Myorg-Private/myrepo/milestones{/number}",
"notifications_url": "https://api.github.com/repos/Myorg-Private/myrepo/notifications{?since,all,participating}",
"labels_url": "https://api.github.com/repos/Myorg-Private/myrepo/labels{/name}",
"releases_url": "https://api.github.com/repos/Myorg-Private/myrepo/releases{/id}",
"deployments_url": "https://api.github.com/repos/Myorg-Private/myrepo/deployments",
"created_at": "2020-12-09T19:04:06Z",
"updated_at": "2021-05-06T14:27:15Z",
"pushed_at": "2021-05-07T19:15:14Z",
"git_url": "git://github.com/Myorg-Private/myrepo.git",
"ssh_url": "git@github.com:Myorg-Private/myrepo.git",
"clone_url": "https://github.com/Myorg-Private/myrepo.git",
"svn_url": "https://github.com/Myorg-Private/myrepo",
"homepage": "https://my.myenterprise.com",
"size": 5296,
"stargazers_count": 5,
"watchers_count": 5,
"language": "JavaScript",
"has_issues": true,
"has_projects": true,
"has_downloads": true,
"has_wiki": true,
"has_pages": false,
"forks_count": 1,
"mirror_url": null,
"archived": false,
"disabled": false,
"open_issues_count": 49,
"license": null,
"forks": 1,
"open_issues": 49,
"watchers": 5,
"default_branch": "main"
},
"organization": {
"login": "Myorg-Private",
"id": 11223344,
"node_id": "MDEyOk9yZ2FuaXphdGlvbjMyMzYxMTk5",
"url": "https://api.github.com/orgs/Myorg-Private",
"repos_url": "https://api.github.com/orgs/Myorg-Private/repos",
"events_url": "https://api.github.com/orgs/Myorg-Private/events",
"hooks_url": "https://api.github.com/orgs/Myorg-Private/hooks",
"issues_url": "https://api.github.com/orgs/Myorg-Private/issues",
"members_url": "https://api.github.com/orgs/Myorg-Private/members{/member}",
"public_members_url": "https://api.github.com/orgs/Myorg-Private/public_members{/member}",
"avatar_url": "https://avatars.githubusercontent.com/u/11223344?v=4",
"description": null
},
"enterprise": {
"id": 4321,
"slug": "myenterprise",
"name": "Myenterprise",
"node_id": "MDEwOkVudGVycHJpc2U0MzYx",
"avatar_url": "https://avatars.githubusercontent.com/b/4321?v=4",
"description": "",
"website_url": "https://www.myenterprise.com/",
"html_url": "https://github.com/enterprises/myenterprise",
"created_at": "2020-10-15T17:08:47Z",
"updated_at": "2021-04-29T14:54:21Z"
},
"sender": {
"login": "myuser",
"id": 3214567,
"node_id": "MDQ6VXNlcjg1NTY0OTU=",
"avatar_url": "https://avatars.githubusercontent.com/u/3214567?v=4",
"gravatar_id": "",
"url": "https://api.github.com/users/myuser",
"html_url": "https://github.com/myuser",
"followers_url": "https://api.github.com/users/myuser/followers",
"following_url": "https://api.github.com/users/myuser/following{/other_user}",
"gists_url": "https://api.github.com/users/myuser/gists{/gist_id}",
"starred_url": "https://api.github.com/users/myuser/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/myuser/subscriptions",
"organizations_url": "https://api.github.com/users/myuser/orgs",
"repos_url": "https://api.github.com/users/myuser/repos",
"events_url": "https://api.github.com/users/myuser/events{/privacy}",
"received_events_url": "https://api.github.com/users/myuser/received_events",
"type": "User",
"site_admin": false
}
}
|
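As a quick sanity check on a payload like the one above, the fields that identify the repository and its owner can be pulled out directly. This is only a minimal Python sketch, not the controller's Go code: the `scale_target_keys` helper is a hypothetical name, and the idea that these are the fields that matter for finding a scale target is an assumption based on this thread.

```python
import json


def scale_target_keys(payload: dict) -> dict:
    """Extract the fields that identify the repo and owner of a webhook event."""
    repo = payload.get("repository") or {}
    owner = repo.get("owner") or {}
    return {
        "repository": repo.get("full_name"),  # e.g. "Myorg-Private/myrepo"
        "owner": owner.get("login"),          # e.g. "Myorg-Private"
        "owner_type": owner.get("type"),      # "Organization" or "User"
    }


# Trimmed-down version of the check_run payload shared above.
sample = json.loads("""
{
  "action": "created",
  "repository": {
    "full_name": "Myorg-Private/myrepo",
    "owner": {"login": "Myorg-Private", "type": "Organization"}
  },
  "organization": {"login": "Myorg-Private"}
}
""")

keys = scale_target_keys(sample)
print(keys)
```

If `owner_type` comes back as anything other than "Organization", or any value is missing, the payload itself would be the first thing to look at.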
@yfried Thanks! The webhook payload seems fine. Theoretically speaking, the only remaining possibility seems to be that you have two or more HorizontalRunnerAutoscalers that target a RunnerDeployment for the organization. Could you verify that by running |
No. |
@yfried Thanks. Can you see this message in your log? https://github.com/actions-runner-controller/actions-runner-controller/blob/082245c5db64e023cd79604d9b158f336770e3fe/controllers/horizontal_runner_autoscaler_webhook.go#L377 What's included in the |
No, this message isn't in my logs, which is why I'm guessing it doesn't know to look for org |
@yfried Thanks. Maybe the next possibility is that we aren't indexing the HRA correctly? In other words, can you confirm that it isn't matching any HRA here, by adding your own log statement there? https://github.com/actions-runner-controller/actions-runner-controller/blob/082245c5db64e023cd79604d9b158f336770e3fe/controllers/horizontal_runner_autoscaler_webhook.go#L261-L269 |
… repo and org runners Adds what I used while verifying #534
Adds some helpful debug log messages I have used while verifying #534
@yfried Hey! I've made a few changes to the controller myself and gave it a shot. Webhook-based autoscaling on organizational runners seems to work for me. Note that I use envsubst to replace envvars like TEST_ORG with my own github org for testing |
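For readers without envsubst at hand, the same kind of substitution can be sketched in Python with the standard library. The manifest fragment and the `render` helper below are illustrative assumptions; only the `TEST_ORG` variable name comes from the comment above. Note one difference: envsubst replaces unset variables with an empty string, while `Template.substitute` raises `KeyError`.

```python
import os
from string import Template

# Hypothetical manifest fragment for illustration; only TEST_ORG is from the thread.
MANIFEST_TEMPLATE = """\
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: org-runnerdeploy
spec:
  template:
    spec:
      organization: ${TEST_ORG}
"""


def render(template: str, env: dict) -> str:
    """Replace ${VAR} placeholders; raises KeyError if a variable is missing."""
    return Template(template).substitute(env)


# Fall back to a placeholder org name when TEST_ORG is not set in the environment.
rendered = render(
    MANIFEST_TEMPLATE,
    {"TEST_ORG": os.environ.get("TEST_ORG", "my-test-org")},
)
print(rendered)
```

Failing loudly on a missing variable is a deliberate choice here, since an empty `organization` value is easy to overlook in a rendered manifest.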
Is there a plan to handle multiple org. wide |
@awoimbee Could you elaborate? You can have one HRA and one RunnerDeployment per GitHub organization and that should just work. |
@mumoshu -> I have to create multiple organization-wide RunnerDeployments and also multiple HRAs. It seems like the webhook scaling doesn't handle having multiple organization-wide HRAs (yet). Is there a plan to handle this use case? |
@awoimbee How can you differentiate runners? Basically, webhook event payloads do not contain information about which runner (with certain labels, groups, orgs, repositories, etc.) the webhook event is going to trigger a workflow job run on. If you have unique enough |
// So depending on your requirement, you'd need to raise feature requests to GitHub, not us. |
BTW, to be clear, although this issue says So I still believe this was likely some user error, even though there might be some documentation or operational enhancements we need to make. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I found the same problem on our setup. This issue is caused by an HRA without minReplicas and maxReplicas. |
@fabiano-amaral Thanks for the info! Could you clarify a bit more? I have no clue how the minReplicas and maxReplicas of an HRA would affect it, because actions-runner-controller doesn't use those fields for filtering the right HRA for a webhook event. |
@mumoshu Min and max MAYBE can be used to set the maximum number of jobs that can be started at the same time, so you can limit how many pods spawn, and minReplicas can set idle jobs waiting for a webhook event. But this is only my theory; I haven't debugged the source code. This information could be in the documentation. |
@awoimbee Hey! We don't yet have out-of-box support for handling multiple sets of organizational runners with a single HRA, which is what you asked for. But we do now have With that it should be pretty easy to set up multiple HRA+RunnerDeployment pairs. Does it solve your original issue? Or were you talking about anything else? |
@fabiano-amaral Thanks a lot for the info! I have not reproduced your issue yet, but I'd definitely keep your information in my mind and report back if there's any update 👍 |
Hi @mumoshu, did you reach any conclusion on this? I am seeing the same thing. I started testing webhook-based autoscaling on GitHub Enterprise and get the same message; the log shows nothing about finding an organization runner:
2022-07-26T10:52:06Z DEBUG controllers.webhookbasedautoscaler Found 0 HRAs by key {"key": "engineering/ReleaseTesting"}
apiVersion: actions.summerwind.dev/v1alpha1
So I tried the simplest approach from the docs. The runner is now in a runner group and has specific tags; does that have anything to do with it? The idea is for us to have multiple runner groups in one org, basically one per team, so we can divide permissions. So I have multiple runner controllers, and that all works fine. But now I wanted to test out scaling to limit the number of runners just waiting idle, and I get the above. |
Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event {"event": "check_run"
scaleUpTriggers:
- githubEvent:
    workflowJob: {} <---- your configured event
  duration: "30m" |
Hmmm. And it will pick up anything that comes in |
It will pick up the configured event, i.e. the child key of the
Look for the workflow_job event:
githubEvent:
  workflowJob: {}
Look for the check_run event:
githubEvent:
  checkRun:
    ...
Look for the pull_request event:
githubEvent:
  pullRequest:
    ...
We are removing all event types other than We currently don't have any child keys for the |
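The event-to-key correspondence listed above can be written down as a small lookup. This is a hedged Python sketch of the idea only: `EVENT_TO_TRIGGER_KEY` and `trigger_matches` are hypothetical names, and the real controller (written in Go) may apply further filters beyond the presence of the child key.

```python
# Webhook event name -> child key under githubEvent, as listed in the comment above.
EVENT_TO_TRIGGER_KEY = {
    "workflow_job": "workflowJob",
    "check_run": "checkRun",
    "pull_request": "pullRequest",
}


def trigger_matches(event_name: str, scale_up_triggers: list) -> bool:
    """Return True if any configured trigger has the child key for this event."""
    key = EVENT_TO_TRIGGER_KEY.get(event_name)
    if key is None:
        return False
    return any(key in (t.get("githubEvent") or {}) for t in scale_up_triggers)


# The trigger from the HRA discussed above, expressed as a Python structure.
triggers = [{"githubEvent": {"workflowJob": {}}, "duration": "30m"}]

print(trigger_matches("workflow_job", triggers))
print(trigger_matches("check_run", triggers))
```

With the `workflowJob` trigger configured, only `workflow_job` deliveries match; a `check_run` delivery finds nothing to scale.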
So did I set it up correctly for the workflow_job event? I just took the example from the docs and tried to replicate it, but for some reason it does not work. Is there something that I need to change for the HRA to do the scaling of runners? I am using these versions for the helm chart: |
Your HRA looks correct, the controller logs suggest it's receiving
2022-07-26T10:52:06Z DEBUG controllers.webhookbasedautoscaler Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event {"event": "check_run", "hookID": "326", "delivery": "fd99f650-0cd0-11ed-8638-58a055482577", "checkRun.status": "completed", "action": "completed"}
See the event key value for what webhook event ARC received:
{"event": "check_run", "hookID": "326", "delivery": "fd99f650-0cd0-11ed-8638-58a055482577", "checkRun.status": "completed", "action": "completed"}
This is a misalignment between your configured webhook on github.com and your HRA; you need to take a look at your webhook configuration on github.com. |
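The same check can be done mechanically on a captured log line. The following is a rough Python sketch, assuming the structured fields always trail the message as a single JSON object (as in the line quoted above); `parse_log_fields` is a hypothetical helper, not part of ARC.

```python
import json


def parse_log_fields(line: str) -> dict:
    """Return the trailing JSON object of a controller log line, or {} if none."""
    start = line.find("{")
    if start == -1:
        return {}
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return {}


log_line = (
    '2022-07-26T10:52:06Z DEBUG controllers.webhookbasedautoscaler '
    'Scale target not found. If this is unexpected, ensure that there is exactly one '
    'repository-wide or organizational runner deployment that matches this webhook event '
    '{"event": "check_run", "hookID": "326", '
    '"delivery": "fd99f650-0cd0-11ed-8638-58a055482577", '
    '"checkRun.status": "completed", "action": "completed"}'
)

fields = parse_log_fields(log_line)
configured_event = "workflow_job"  # what the HRA in this thread was set up for
if fields.get("event") != configured_event:
    print(f"received {fields.get('event')!r}, but the HRA expects {configured_event!r}")
```

A mismatch like this points at the webhook settings on the GitHub side rather than at the HRA.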
OK, you were correct. I tried the check_run event and it worked, so I went back to the webhook: I hadn't included the workflow jobs event to be picked up. But now I have another problem, or is this behavior normal? As soon as the jobs finished, all runners (pods) were killed, and all runners are now in offline mode. I understood that they should stay online for the scaleDownDelaySecondsAfterScaleOut: 300 amount of time, then get removed one by one, and also be removed from my list of runners. Or did I misunderstand that? |
Use the
Sounds odd but I would first move over to the |
Yeah, I moved to the workflow_job event and I still have 5 runners that are in offline mode. I will wait some more to see, but they got removed as soon as the jobs completed and are in offline mode. I will run the same test workflow again to see if it creates 5 new runners and then leaves them offline as well after it's finished. If it does, I will open a new issue with all the details. |
There was some weird behavior yesterday, so I will delete everything today and, if I find some time, bring it all up from zero so I have a clean slate and can then list all the steps. But yesterday, for example, after all runners were killed (as soon as they finished their jobs) they were offline, and still are, but about 5 minutes later 4 new runners were spun up even though no action was called, no webhook, nothing. And then after some time, I didn't see when, they were removed, and those were not left in the offline state. |
ARC tries its best to call the "Remove Runner" GitHub Actions API on every runner being shut down so that there will (ideally) be no runners left hanging "offline". If it fails to do so, or GitHub Actions fails to handle the API call at all, you might end up with dangling "offline" runners that you have to clean up manually. I guess, though, it's very rare 🤔
What were the desired replicas of your RunnerDeployment at that time? Also, did the desired replicas go down to 4 from a larger value when it brought up the 4 new runners for no use? If it did receive |
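For the manual cleanup of dangling "offline" runners mentioned above, one possible approach is to list the organization's self-hosted runners through the GitHub REST API and delete the offline ones. The Python sketch below only builds the requests and leaves the actual call commented out. The org name, token, and sample runner data are placeholders, and the `api.github.com` base URL is an assumption that would differ on GitHub Enterprise Server.

```python
import json
import urllib.request

API = "https://api.github.com"  # assumption: github.com; GHES uses a different base URL


def offline_runner_ids(listing: dict) -> list:
    """Pick the IDs of runners reported as offline in a list-runners response."""
    return [r["id"] for r in listing.get("runners", []) if r.get("status") == "offline"]


def delete_request(org: str, runner_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one organization runner."""
    return urllib.request.Request(
        f"{API}/orgs/{org}/actions/runners/{runner_id}",
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


# Example response shape; runner names and IDs here are made up.
listing = json.loads("""
{"total_count": 3, "runners": [
  {"id": 11, "name": "runner-a", "status": "offline", "busy": false},
  {"id": 12, "name": "runner-b", "status": "online", "busy": true},
  {"id": 13, "name": "runner-c", "status": "offline", "busy": false}
]}
""")

for rid in offline_runner_ids(listing):
    req = delete_request("my-org", rid, token="<token>")
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req)  # uncomment to actually delete the runner
```

Keeping the filter and the request builder separate makes it easy to review which runners would be removed before sending anything.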
OK, sorry for not getting back on this, I was assigned to other things :) The first thing I should mention is that I am doing all of this on a cluster running an older version, 1.20.11.
Did I understand correctly that they should stay for as long as scaleDownDelaySecondsAfterScaleOut is set, and then be scaled down one by one? Or, because I am using the workflowJob GitHub event, does the HRA scale down because it got the event that the job is completed? But the first question of all: should we spend time on this, given that the cluster version is 1.20.11? If not, I understand completely and will look at it when we upgrade and come back to you. But if this is not supposed to happen even on this version, then we can take a look at it together. I can share my screen if it would make things faster. |
I keep getting this in logs:
Looks like it can't pull the org name or type from the event payload (But I might be misunderstanding the code)
This is my config:
I'm deploying the controller using helm chart from my own branch which shouldn't matter, but just in case...
Here's my values file: