-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fleet] Agent activity flyout enhancements #179161
[Fleet] Agent activity flyout enhancements #179161
Conversation
🤖 GitHub commentsExpand to view the GitHub comments
Just comment with:
|
Did a partial push with the WIP "show more" function functionality to discuss some open implementation questions - @juliaElastic @criamico if you could have a look, you probably have the most context 🙏 Question 1: backend pagination behaviourThe current backend implementation for retrieving action status items from
As discussed in the issue, it would make sense for agent policy actions to support pagination. The question is what should we expect from the parameters, since we concatenate two results. For example, with
I temporarily went with the first option in this commit. Note that because of the way agent policy actions are processed (see below), they seem to be roughly halved, so with
Question 2: agent policy actions processingMy original intention was to introduce UI logic to disable the "Show more" button if we don't have any more actions to load. I found it doable with agent actions, since we can just get the number of total hits from the ES query and then know whether there are more we can fetch, but it's not as easy with agent policy actions. The reason is that they are processed in the following way:
I decided to scrap the UI logic to disable the "Show more" button for now, since I can't see a way to predict how many policy change actions we have in total. Any thoughts on that? Question 3: UI loading implementationI'm not sure about this WIP implementation, maybe you can help: I originally intended to only load actions as needed, e.g.:
But I ran into issues because the component re-renders often and includes refresh logic, so it fetches the actions often. I found it difficult to fit this with pagination, so I eventually resorted to a simpler approach where |
Thanks for the detailed explanation!
Good point, I would go with your suggestion, I think there is no harm to show more actions rather than less, so it requires less clicking of
The reason of multiple policy docs with the same revision is that kibana creates one doc per revision with
I see your point, I'll give this some thought. |
Thanks @jillguyonnet for the thoroughly investigation! Regarding the pagination, you mention that:
It's kinda odd to request 20 items and get 30, but if there's no easy way to fix the agent policy actions it should be fine; I agree with @juliaElastic that is better to get more than fewer. |
Thank you @juliaElastic and @criamico for your input! I agree that it would be ideal if we got 20 (combined) action results if On that, thanks for the explanation about the multiple policy docs, it makes sense to add a filter for |
Related to the 3. question, I think it's unlikely that the users would have more than 1k actions and that they would click on The alternative would be to do multiple API calls on every refresh to query each page separately, and the results might get inconsistent (e.g. a new action appears on the first page, and the last action on the first page moves to the second page). Another idea to reload the in progress actions only, because completed ones are not going to be updated on the backend. This would add some complexity though to reorder on UI and merge with the completed actions that are not reloaded. I would go with the first approach to start with, and revisit later if the performance becomes an issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see the refactor to smaller components!
|
||
useEffect(() => { | ||
loadActions(); | ||
}, [loadActions, nActions]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the purpose of this additional useEffect
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fetch the actions when we want more of them (after clicking "Show more"). I think it wasn't working without it, but I can check that, as I've done a lot of back and forth with the UI code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think it should work without, because nActions
is a dependency of loadActions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks, I'll check it. Definitely don't want more useEffect
than we need 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a comment on why the second useEffect
is needed? I tried without it and didn't work, I don't have a better idea now than to leave it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, sorry for not commenting back on this question. I wasn't successful either, I will add a comment 👍
By the way, something I've observed while working on this "Show more" functionality: something potentially very confusing for the user is that newly loaded actions are not necessarily older because of that concatenation between agent and agent policy actions. For instance, say that on first query we have:
In both cases we are getting the 20 most recent items and we got all the agent policy actions (there are no more). When we query more, we get an additional 20 agent actions between 2024-01-12 and 2024-01-15, which are more recent than the oldest (agent policy) actions we already have. In the UI, this means they'll get listed above and not below and the user might be wondering if the button works. I'm wondering if that already caused confusion without the pagination, as the flyout appeared to drop recent elements and keep older ones. I'm not sure how easy it would be to fix this - one idea could be some logic where we introduce a shared time window and fetch both types of actions until we reach the pagination limit, which definitely sounds more complex than the current implementation. I'll give this more thought, but please share if you have any suggestions or concerns. |
Good catch, I think this definitely caused confusion before.
Good idea, we could do something like query one day at a time in a loop until we reach N results, though it might end up a lot of queries. |
I like those suggestions! Maybe it would be worth exploring that, I'm concerned that the current implementation might be too confusing (this may have to do with the fact that I spent an hour trying to debug an inexistent issue because I couldn't see that my new actions were higher up and it looked like nothing had happened 😬). |
...ions/fleet/sections/agents/agent_list_page/components/agent_activity_flyout/jump_to_date.tsx
Outdated
Show resolved
Hide resolved
I've had a think about the API pagination. I think the following approach would ensure we preserve combined sorting by timestamp (between the two kinds of actions):
So the query becomes bigger as |
About the UI loading implementation, where
We should probably verify if the agent activity flyout has a timeout that refetches the actions every x seconds, but if that's the case then having the This is just an idea, but maybe we could show a button that opens a new static view in the flyout where we show the summary of the selected actions and the |
Agreed, either a dynamic list (e.g. 20 most recent actions, periodically refreshed) or a static list (no refresh, load more functionality) would make sense to me. Your suggestion might be a good improvement. I also think it might be worth understanding why the component is rendered so many times in the first place. |
@@ -71,7 +71,10 @@ export const ActivityItem: React.FunctionComponent<{ | |||
id="xpack.fleet.agentActivityFlyout.completedDescription" | |||
defaultMessage="Completed {date}" | |||
values={{ | |||
date: formattedTime(action.completionTime), | |||
date: | |||
action.completionTime === '0001-01-01T00:00:00.000Z' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, it sounds like a bug, maybe we could add a comment where this value is coming from, if we want to investigate later
@juliaElastic @criamico @kpollich Can I ask for some input on points 1 and 5: 1. Review errors button in agent listIt's worth noting that at the moment actions data is only fetched when the flyout is open. I was checking how we might implement this functionality in a minimal way, but I don't think we can get around querying new actions from the page component, either periodically or after changes are made (the latter way might not even work, because it might take some time for an error to happen, e.g. in the case of a failed agent upgrade). Can you see any cause for concern about this? 5. View agents button needs to be updated to work with new Agent Policy Change sectionLonger question (apologies), which starts with my understanding of how policy change actions are generated on the fly in this function. Given a policy change action, say:
I would expect that this means that 1 agent was affected by the policy change at that time. However, say I assign another agent to this policy a few days later and then make a change to the policy. Then the above change action becomes:
Which doesn't make sense to me, as this change was really applied to 1 agent, so it looks like a bug. I've dug into the code and I think I know why that is: I think that older actions (revision less than current) have an incorrect number of assigned agents, which is the number of assigned agents on the current revision. Details:
I came across this as I was investigating how the "View Agents" button might work for policy changes, but unless I'm mistaken this would involve fixing this first and making sure we so have access to this historical data. Perhaps we would need is to generate and store policy change actions that record which agents were affected (this is what is done with agent actions). I've hidden the "View Agents" button for policy changes as a temporary measure. |
I'm not against including the action status poll without opening the flyout, but I'm wondering if this small feature is worth adding it.
This is what I had in mind when implementing the policy change agent activity: So when The logic also considers older revisions, e.g. This logic is not perfect, like you said, it is sometimes incorrect when you reassign agents to another policy e.g. if you do a policy change and then reassign, the previous policy change would say 0 agents, because there are no longer any agents with that policy, and there is no way to know which agents were assigned in the past. The way to fix this would be to add agent docs for policy change to keep track, though that would be some overhead and potential changes needed in fleet-server to ignore them. I was thinking about this before, but wasn't sure if we need this much complexity for Agent activity. The current logic is best effort. |
Thank you @juliaElastic for your quick feedback! Review errors button in agent listYes, I wasn't sure either if this small feature would be worth the extra API calls. Your suggestions is interesting, though, I'll give it a try. View agents button needs to be updated to work with new Agent Policy Change sectionThanks for confirming 🙏 I agree with you that the changes to introduce proper policy change history seem too complex for the purpose of this issue. For the UI, I suppose we have a choice then: either hide the "View Agents" button for policy changes, or show agents currently assigned to the policy (which sort of matches what the API is doing, but may confuse the user if more agents are assigned). |
I would go with showing |
How about adding a tooltip that says |
@juliaElastic How exactly did you set this up? In the absence of a fleet server, I only see the |
I had |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for addressing the review comments!
Thank you for your thorough testing and feedback @juliaElastic! The CI failures look unrelated now, mostly timeouts. QQ - were you able to successfully test the "Review errors" badge? I found it a little tricky to get errors on purpose. |
@elasticmachine merge upstream |
I haven't tested that, I can try to test locally. If hard to reproduce an error, we could try inserting errors manually to ES. |
@juliaElastic I pushed a fix along with a screen recording and updated steps in the PR description. Let me know if that looks OK to you 🙏 |
} from '../../../../hooks'; | ||
import { AgentStatusKueryHelper, ExperimentalFeaturesService } from '../../../../services'; | ||
import { AGENT_POLICY_SAVED_OBJECT_TYPE, SO_SEARCH_LIMIT } from '../../../../constants'; | ||
|
||
import { getKuery } from '../utils/get_kuery'; | ||
|
||
const REFRESH_INTERVAL_MS = 30000; | ||
const REFRESH_INTERVAL_MS = 10000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to refresh every 10s? I would keep the 30s to keep the API requests down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I changed that for testing only! Sorry, reverting.
💚 Build Succeeded
Metrics [docs]Module Count
Async chunks
Page load bundle
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work here @jillguyonnet! I took a look through the code and things look very well factored overall. Breaking down the activity flyout into smaller components definitely has a noticeable impact on the readability of the code here - made it very easy to review. 🚀
Summary
Closes #153232
Enhancements addressed in this PR:
GET /agents/action_status
.POST /agents
to allowactionIds
to either refer to agent or agent policy actions ids. This allow the "View agents" button in the Agent activity flyout to work with agent policy change actions.Steps for testing
Screenshots and screen recordings
Agent activity flyout with no date filter selected and remaining actions to load:
Agent activity flyout with a date filter selected and all actions loaded:
Agent activity flyout with a date filter selected and no actions for that date:
Tooltip on the "View agents" button for policy changes:
"Review errors" badge with 1 new error:
"Review errors" badge tooltip details (1 and 2 errors):
"Show more" and date filtering functionalities:
Screen.Recording.2024-04-05.at.16.38.28.mov
Tooltip on the "View agents" button for policy changes:
Screen.Recording.2024-04-05.at.17.02.19.mov
"Review errors" badge:
Screen.Recording.2024-04-10.at.12.42.39.mov
Checklist