"Tailorbird" initiative: making CI your friend #9506
Comments
Added myself to the list of volunteers!
I already started doing the
I made a script to gather data from the nightly jobs. It is basic stuff, actually:

import os
import re

import requests

workflow_id = 'ci-nightly.yaml'
list_workflow_runs_url = 'https://api.github.com/repos/kata-containers/kata-containers/actions/workflows/' + workflow_id + '/runs'
headers = {"Accept": "application/vnd.github+json", "X-GitHub-Api-Version": "2022-11-28"}
token = os.getenv("GITHUB_TOKEN")
if token is not None:
    headers['Authorization'] = "Bearer " + token

# Get the 10 most recent workflow runs.
# TODO: parameterize it!
r = requests.get("%s?per_page=10" % (list_workflow_runs_url), headers=headers)
r.raise_for_status()

page_size = 100
runs_map = []
for run in r.json()['workflow_runs']:
    entry = {'id': run['id'],
             'created_at': run['created_at'],
             'conclusion': None,
             'jobs': []}
    # Skip job collection for runs that are still in progress.
    if run['status'] == "in_progress":
        runs_map.append(entry)
        continue
    entry['conclusion'] = run['conclusion']

    # Paginate, as the jobs of a run can span several pages.
    total_count = -1
    page = 1
    while True:
        jobs_request = requests.get("%s?per_page=%s&page=%s" % (run['jobs_url'], page_size, page), headers=headers)
        jobs_request.raise_for_status()
        jobs_page = jobs_request.json()
        for job in jobs_page['jobs']:
            entry['jobs'].append({'name': job['name'], 'run_id': job['run_id'],
                                  'conclusion': job['conclusion']})
        total_count = max(total_count, jobs_page['total_count'])
        # Stop once every job is collected, or on an empty page to avoid looping forever.
        if not jobs_page['jobs'] or len(entry['jobs']) >= total_count:
            break
        page += 1
    runs_map.append(entry)


def collect_jobs_stats(workflows_runs):
    '''
    Return a map of {'runs': NUMBER, 'fails': NUMBER} indexed by job name.
    '''
    stats = {}
    for run in workflows_runs:
        for job in run['jobs']:
            job_stat = stats.get(job['name'], {'runs': 0, 'fails': 0})
            job_stat['runs'] += 1
            # Anything other than 'success' is counted as a failure here.
            if job['conclusion'] != 'success':
                job_stat['fails'] += 1
            stats[job['name']] = job_stat
    return stats


jobs_stats = collect_jobs_stats(runs_map)
# Only report the test jobs of the nightly workflow.
regex = re.compile('kata-containers-ci-on-push / run-.*-tests.*')
for name, stat in jobs_stats.items():
    if regex.match(name):
        print('%s: (%s) fail=%s' % (name, stat['runs'], stat['fails']))

@ldoktor @beraldoleal ^^^^ in case you have free cycles to help with bugs and improvements. I just ran it; see the results below. Notice that sometimes the parent job fails but the child jobs are not charged with the failure. This is something that could be improved in the script, although interpreting the data by eye isn't difficult.
@wainersm - optional suggestion - I wonder whether using the
There is a gh plugin that does that for us, iirc by default goes over the last 100 jobs:
I pasted the output in our slack channel, but pasting here too for visibility:
@beraldoleal - that's a cool plugin. I played a bit with the options and came up with:
Very nice, @beraldoleal, with the
Hey @stevenhorsman @beraldoleal @ldoktor, thanks for the feedback on the script. The One thing that intrigued me, though, is that I asked the tool to generate statistics for the last 10 days, but the "run count" of most jobs was "16" and I was expecting "~10" (more or less 10, because someone might have triggered the workflows manually).
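One way to check whether manual triggers explain the extra runs is to group the runs of the last N days by their trigger. The following is only a sketch: it assumes a list of run objects shaped like the `workflow_runs` entries returned by the GitHub REST API (with `created_at` and `event` fields), and `count_runs_by_event` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_runs_by_event(workflow_runs, days=10, now=None):
    """Count runs created in the last `days` days, grouped by trigger event.

    `workflow_runs` is assumed to be a list of dicts with 'created_at'
    (e.g. '2024-04-20T00:12:34Z') and 'event' (e.g. 'schedule',
    'workflow_dispatch') keys.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for run in workflow_runs:
        created = datetime.strptime(
            run['created_at'], '%Y-%m-%dT%H:%M:%SZ'
        ).replace(tzinfo=timezone.utc)
        if created >= cutoff:
            counts[run['event']] += 1
    return counts
```

If the `schedule` count is close to 10 and the remainder falls under `workflow_dispatch`, manual triggers would account for the difference; otherwise the cause lies elsewhere.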
Hi folks, Generated the report today again, considering the last 10 executions:
Above report is more accurate because I fixed two problems on my script:
Note that it is counting 'cancelled' as 'failed'. I might change that in the future. The new version:

import os
import re

import requests

workflow_id = 'ci-nightly.yaml'
list_workflow_runs_url = 'https://api.github.com/repos/kata-containers/kata-containers/actions/workflows/' + workflow_id + '/runs'
headers = {"Accept": "application/vnd.github+json", "X-GitHub-Api-Version": "2022-11-28"}
token = os.getenv("GITHUB_TOKEN")
if token is not None:
    headers['Authorization'] = "Bearer " + token

# Get the 10 most recent workflow runs.
# TODO: parameterize it!
r = requests.get("%s?per_page=10" % (list_workflow_runs_url), headers=headers)
r.raise_for_status()

page_size = 100
runs_map = []
for run in r.json()['workflow_runs']:
    entry = {'id': run['id'],
             'created_at': run['created_at'],
             'conclusion': None,
             'jobs': []}
    # Skip job collection for runs that are still in progress.
    if run['status'] == "in_progress":
        runs_map.append(entry)
        continue
    entry['conclusion'] = run['conclusion']

    # Paginate, as the jobs of a run can span several pages.
    total_count = -1
    page = 1
    while True:
        jobs_request = requests.get("%s?per_page=%s&page=%s" % (run['jobs_url'], page_size, page), headers=headers)
        jobs_request.raise_for_status()
        jobs_page = jobs_request.json()
        for job in jobs_page['jobs']:
            entry['jobs'].append({'name': job['name'], 'run_id': job['run_id'],
                                  'conclusion': job['conclusion']})
        total_count = max(total_count, jobs_page['total_count'])
        # Stop once every job is collected, or on an empty page to avoid looping forever.
        if not jobs_page['jobs'] or len(entry['jobs']) >= total_count:
            break
        page += 1
    runs_map.append(entry)


def collect_jobs_stats(workflows_runs):
    '''
    Return a map of {'runs': NUMBER, 'fails': NUMBER, 'skips': NUMBER} indexed by job name.
    '''
    stats = {}
    for run in workflows_runs:
        for job in run['jobs']:
            job_stat = stats.get(job['name'], {'runs': 0, 'fails': 0, 'skips': 0})
            job_stat['runs'] += 1
            if job['conclusion'] != 'success':
                if job['conclusion'] == 'skipped':
                    job_stat['skips'] += 1
                else:  # failed and cancelled
                    job_stat['fails'] += 1
            stats[job['name']] = job_stat
    return stats


jobs_stats = collect_jobs_stats(runs_map)
# Only report the test jobs of the nightly workflow.
regex = re.compile('kata-containers-ci-on-push / run-.*-tests.*')
for name, stat in jobs_stats.items():
    if regex.match(name):
        print('%s: (%s) fail=%s skips=%s' % (name, stat['runs'], stat['fails'], stat['skips']))
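Regarding the note that cancelled jobs are counted as failures: one possible way to separate them is to map each job conclusion to its own bucket. This is a sketch, not the script's actual behavior. `collect_jobs_stats_split` and the bucket names are hypothetical, and the mapping assumes the conclusion strings `success`, `failure`, `timed_out`, `cancelled`, and `skipped` as reported by the GitHub API; anything else (including jobs with no conclusion yet) lands in `other`.

```python
from collections import Counter

# Assumed mapping from job conclusion to a reporting bucket.
# The bucket names are illustrative choices for this sketch.
BUCKETS = {
    'success': 'passes',
    'failure': 'fails',
    'timed_out': 'fails',
    'cancelled': 'cancels',
    'skipped': 'skips',
}


def collect_jobs_stats_split(workflows_runs):
    """Return a map of Counter objects indexed by job name.

    Each Counter has a 'runs' total plus one count per bucket; buckets that
    never occurred read as 0.
    """
    stats = {}
    for run in workflows_runs:
        for job in run['jobs']:
            counter = stats.setdefault(job['name'], Counter())
            counter['runs'] += 1
            counter[BUCKETS.get(job['conclusion'], 'other')] += 1
    return stats
```

With this shape, the report line could print `stat['fails']`, `stat['cancels']`, and `stat['skips']` separately, so a cancelled nightly run no longer inflates the failure count.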
Context
At the Virtual Kata Containers PTG Planning of April 2024 there was a discussion session, led by @jodh-intel, about the problems that Kata developers have faced with CI. Please see the topics and notes of that session at https://etherpad.opendev.org/p/kata-ptg-planning-april-2024#L160 . We ended the session with a list of volunteers (myself, @ldoktor , @stevenhorsman , @gkurz , @littlejawa) to build a "task force" aiming to improve the CI situation as much as possible.
We want CI to be your friend!
Work items
Find them on the dashboard: https://github.com/orgs/kata-containers/projects/46/views/1
Old table:
Done criteria
When will we be done with this initiative?
Syncing up
TBD - Every X days meeting? or Slack?
Volunteers
We need help, and everybody is welcome! Please add your name:
@ldoktor , @stevenhorsman , @gkurz , @littlejawa, @sprt
Additional information
Common Tailorbird is a mostly green bird which has a stable population