[Alerting UI] Use recorded start/duration/end times from event log for rule details page #101662

Open
ymao1 opened this issue Jun 8, 2021 · 2 comments
Labels
enhancement New value added to drive a business result estimate:small Small Estimated Level of Effort Feature:Alerting/RulesManagement Issues related to the Rules Management UX Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@ymao1
Contributor

ymao1 commented Jun 8, 2021

With this PR, we are persisting additional information to the event log about the start, duration and end time of each alert (instance). We should be able to simplify the calculations that are currently being performed to get the start time of active alerts (for duration calculation).

Note:

  • While we could use the duration that's now persisted in the event log, that value is updated only when the rule executes so it may be misleading for rules that execute at a longer interval. Probably better to use the recorded start time and calculate the duration to date.
  • Whatever changes we make should still work with event log documents written prior to this update.
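The two notes above could be combined roughly as follows. This is only a sketch under assumptions: the `ActiveInstanceEventDoc` shape, the `recordedStart` field name, and the `legacyStart` parameter are hypothetical and are not taken from the actual event log schema. The idea is to prefer the recorded start time, compute the duration up to "now" rather than trusting the persisted duration, and fall back to a start time derived the old way for documents written before the update.

```typescript
// Hypothetical, simplified view of an active-instance event log document.
// The real event log schema may name or nest these fields differently.
interface ActiveInstanceEventDoc {
  timestamp: string; // when the rule execution wrote this event (ISO 8601)
  recordedStart?: string; // start time persisted by newer versions; absent in older docs
}

// Returns the alert's duration to date in milliseconds,
// or undefined when no usable start time is known.
function durationToDateMs(
  event: ActiveInstanceEventDoc,
  now: Date,
  legacyStart?: string // start derived the old way, for docs written before the update
): number | undefined {
  const start = event.recordedStart ?? legacyStart;
  if (start === undefined) return undefined;

  const startMs = Date.parse(start);
  if (Number.isNaN(startMs)) return undefined;

  // Clamp at zero in case of clock skew between the writer and the reader.
  return Math.max(0, now.getTime() - startMs);
}
```

Because the duration is derived from `now`, it keeps growing between executions of a rule with a long interval, which is the concern raised in the first note.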
@ymao1 ymao1 added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Jun 8, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@ymao1 ymao1 changed the title [Alerting UI] Show actual alert duration if available in rule details page [Alerting UI] Use recorded start/duration/end times from event log for rule details page Jun 8, 2021
@gmmorris gmmorris added Project:ObservabilityOfAlerting Alerting team project for observability of alerting. Feature:Alerting/RulesManagement Issues related to the Rules Management UX and removed Project:ObservabilityOfAlerting Alerting team project for observability of alerting. labels Jun 30, 2021
@gmmorris gmmorris added the loe:medium Medium Level of Effort label Jul 14, 2021
@pmuellr
Member

pmuellr commented Aug 3, 2021

I was searching for this issue, couldn't find it, so created another one, which I'll close. But I added some additional detail, so copying that in here:

The calculation of the "alert duration" that we are doing today is expensive. The relevant code is in this module: x-pack/plugins/alerting/server/lib/alert_instance_summary_from_event_log.ts, which is called from here:

let events: IEvent[];
try {
  const queryResults = await eventLogClient.findEventsBySavedObjectIds('alert', [id], {
    page: 1,
    per_page: 10000,
    start: parsedDateStart.toISOString(),
    end: dateNow.toISOString(),
    sort_order: 'desc',
  });
  events = queryResults.data;
} catch (err) {

Basically, in previous releases, we didn't have the alert duration available directly in active-instance events, so we had to calculate it by finding the closest new-instance event. The date on the new-instance event becomes the activeStartDate in the AlertInstanceStatus:

export interface AlertInstanceStatus {
  status: AlertInstanceStatusValues;
  muted: boolean;
  actionGroupId?: string;
  actionSubgroup?: string;
  activeStartDate?: string;
}
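To make the legacy lookup concrete, here is a rough sketch of how a start date could be derived from a newest-first list of events. It is not the code in alert_instance_summary_from_event_log.ts: the `InstanceLogEvent` shape and the function name are hypothetical, and the `recovered-instance` action name is an assumption that does not appear in this thread.

```typescript
// Hypothetical, flattened event shape used only for this illustration.
interface InstanceLogEvent {
  action: 'new-instance' | 'active-instance' | 'recovered-instance';
  instanceId: string;
  timestamp: string; // ISO 8601
}

// Expects events sorted newest first (matching sort_order: 'desc' in the query).
// Returns the timestamp of the closest new-instance event for the instance,
// or undefined when the instance has recovered or no such event is in the window.
function legacyActiveStartDate(
  eventsNewestFirst: InstanceLogEvent[],
  instanceId: string
): string | undefined {
  for (const event of eventsNewestFirst) {
    if (event.instanceId !== instanceId) continue;
    // A recovery seen before any new-instance event means there is no
    // active start date to report from this window.
    if (event.action === 'recovered-instance') return undefined;
    if (event.action === 'new-instance') return event.timestamp;
    // active-instance events carry no start information here; keep walking back.
  }
  // The query window did not reach back far enough to find a new-instance event.
  return undefined;
}
```

The final `return undefined` is the case where the query window is too short: the alert is active, but its start cannot be recovered from the returned documents.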

There were problems with this approach anyway, as the query getting the event log docs may not have gone far enough back to find a relevant new-instance event. But the big win will be not having to return all the event log docs to get the alert duration: we can just get the last active-instance event, which contains the duration.
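As an illustration of reading the timing from the last active-instance event, here is a minimal sketch. The `TimedLogEvent` shape and the `recordedStart` and `recordedDurationMs` field names are hypothetical stand-ins for whatever the event log actually persists. Returning `undefined` for documents without recorded timing leaves room for a caller to fall back to the older calculation.

```typescript
// Hypothetical, flattened event shape used only for this illustration.
interface TimedLogEvent {
  action: string;
  instanceId: string;
  recordedStart?: string; // absent in docs written before the update
  recordedDurationMs?: number; // as of the last rule execution, if persisted
}

interface RecordedTiming {
  start: string;
  durationMs?: number;
}

// Expects events sorted newest first. Only the first matching
// active-instance event is inspected; older events are never read.
function recordedTimingFor(
  eventsNewestFirst: TimedLogEvent[],
  instanceId: string
): RecordedTiming | undefined {
  const latest = eventsNewestFirst.find(
    (e) => e.action === 'active-instance' && e.instanceId === instanceId
  );
  if (latest === undefined || latest.recordedStart === undefined) {
    return undefined; // no event, or an older doc without recorded timing
  }
  return { start: latest.recordedStart, durationMs: latest.recordedDurationMs };
}
```

Since only the newest matching event is used, a query feeding this function would in principle need far fewer documents than the 10000 requested today.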

Another optimization we could make is to move the alert duration into the task manager state (shape here), since I think we already have to get the task manager state whenever we calculate the alert instance summary:

const metaSchema = t.partial({
  lastScheduledActions: t.intersection([
    t.partial({
      subgroup: t.string,
    }),
    t.type({
      group: t.string,
      date: DateFromString,
    }),
  ]),
});

@gmmorris gmmorris added enhancement New value added to drive a business result estimate:small Small Estimated Level of Effort labels Aug 13, 2021
@gmmorris gmmorris removed the loe:medium Medium Level of Effort label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

5 participants