Host summary generation for large inventories is too slow #6991

Closed
ryanpetrello opened this issue May 11, 2020 · 6 comments

Comments

@ryanpetrello
Contributor

ryanpetrello commented May 11, 2020

ISSUE TYPE
  • Bug Report
SUMMARY

Generate an inventory with 5000 hosts, and run a playbook.
Observe that the playbook_on_stats event takes a very long time to show up (several minutes or more, depending on the total number of hosts and how your inventory is structured).

This code is super duper crazy slow (mostly because it generates lots of insert and update queries for large inventories):

https://github.com/ansible/awx/blob/devel/awx/main/models/events.py#L482
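
To illustrate why per-host queries hurt at this scale, here is a minimal sketch that compares one INSERT per host against multi-row INSERTs issued in batches. It uses the standard-library `sqlite3` module rather than the Django ORM so that it runs on its own; the `host_summary` table and the function names `make_db`, `save_one_by_one`, and `save_in_batches` are hypothetical and are not taken from the AWX code. It only shows the general batching idea, not what the linked code or any fix actually does.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical summary table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_summary (host_name TEXT PRIMARY KEY, ok INTEGER NOT NULL)"
    )
    return conn


def save_one_by_one(conn, stats):
    """Issue one INSERT statement per host; return the number of statements."""
    statements = 0
    for host_name, ok in stats.items():
        conn.execute(
            "INSERT INTO host_summary (host_name, ok) VALUES (?, ?)",
            (host_name, ok),
        )
        statements += 1
    conn.commit()
    return statements


def save_in_batches(conn, stats, batch_size=400):
    """Issue one multi-row INSERT per batch; return the number of statements.

    The batch size of 400 rows (800 bound parameters) is an arbitrary choice
    that stays under the 999-parameter limit of older SQLite builds.
    """
    rows = list(stats.items())
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        params = [value for row in batch for value in row]
        conn.execute(
            f"INSERT INTO host_summary (host_name, ok) VALUES {placeholders}",
            params,
        )
        statements += 1
    conn.commit()
    return statements


# Same shape as the stats in the report: 5000 hosts, 5 "ok" tasks each.
stats = {f"Host {i}": 5 for i in range(5000)}
slow_conn, fast_conn = make_db(), make_db()
slow_statements = save_one_by_one(slow_conn, stats)
fast_statements = save_in_batches(fast_conn, stats)
```

Both variants end up with the same rows; only the number of statements sent to the database differs. With a networked database each statement is a round trip, so the per-host variant grows linearly with inventory size.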

import pytest

# Import path assumed from the AWX source tree; adjust if the models live elsewhere.
from awx.main.models import Host, Inventory, Job, JobEvent


@pytest.mark.django_db
def test_host_summary_generation():
    hostnames = [f'Host {i}' for i in range(5000)]
    inv = Inventory()
    inv.save()
    for h in hostnames:
        Host(name=h, inventory_id=inv.id).save()
    j = Job(inventory=inv)
    j.save()
    # A single playbook_on_stats event reporting 5 "ok" tasks for every host
    JobEvent.create_from_data(
        job_id=j.pk,
        parent_uuid='abc123',
        event='playbook_on_stats',
        event_data={'ok': {hostname: 5 for hostname in hostnames}}
    ).save()

    je = JobEvent.objects.first()
    je._update_host_summary_from_stats(hostnames)
    # Exactly one summary row should exist per host
    assert j.job_host_summaries.count() == len(hostnames)
    assert sorted([s.host_name for s in j.job_host_summaries.all()]) == sorted(hostnames)
@AlanCoding
Member

Sounds like a rinse-and-repeat of what #6290 did, just for all the other non-notification playbook_on_stats processing.

@ryanpetrello
Contributor Author

@AlanCoding yep, exactly.

@ryanpetrello
Contributor Author

This probably isn't incredibly urgent/pressing, because this code has existed in this slow state for quite some time as far as I can tell.

ryanpetrello added a commit to ryanpetrello/awx that referenced this issue May 11, 2020
@ryanpetrello
Contributor Author

Here's a playbook_on_stats event with 5000 hosts:

image

Almost an entire minute to process this event.

@ryanpetrello
Contributor Author

Here's a playbook_on_stats event with 5000 hosts after my PR:

image

@kdelee
Member

kdelee commented Jul 7, 2020

Given that event processing time is affected by the general backlog of other events to process, I was not able to get a very consistent measurement, but I can say that processing events on jobs that act on thousands of hosts happens in a reasonable amount of time, and that lag in event processing no longer appears to be correlated with inventory size.

Now the strongest correlation is with "when" the event occurred: if the system was backed up at that time, processing is slow; if not, it can be fast.

Closing as it is a net improvement.

kdelee closed this as completed Jul 7, 2020
kdelee self-assigned this Jul 7, 2020