[Fleet] Request diagnostics #142369

juliaElastic · 2022-09-30T14:39:10Z

Summary

Request diagnostics action

Added a new action for a single agent (available on the Agent details page and in the Agent list row actions) to request diagnostics.
Clicking the action makes an API request that creates a REQUEST_DIAGNOSTICS type action in the .fleet-actions index.
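
As a rough illustration of what such an action document might contain, here is a minimal sketch. The field names (`id`, `type`, `agents`, `created_at`) follow the `createAgentAction` call quoted later in this conversation; the interface and the helper `buildRequestDiagnosticsAction` are hypothetical and not the PR's actual code.

```typescript
// Hypothetical shape of a REQUEST_DIAGNOSTICS action document.
// Field names mirror the createAgentAction call quoted later in this thread.
interface RequestDiagnosticsAction {
  id: string;
  type: 'REQUEST_DIAGNOSTICS';
  agents: string[];
  created_at: string;
}

// Illustrative helper (an assumption, not part of the PR): builds the
// document that would be stored for a single-agent request.
function buildRequestDiagnosticsAction(
  agentId: string,
  id: string,
  now: Date = new Date()
): RequestDiagnosticsAction {
  return {
    id,
    type: 'REQUEST_DIAGNOSTICS',
    agents: [agentId],
    created_at: now.toISOString(),
  };
}

console.log(buildRequestDiagnosticsAction('agent-1', 'action-1'));
```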

Diagnostics uploads display

When the action is submitted, the user is navigated to the new Agent Details / Diagnostics tab, which shows the list of pending and completed diagnostics file uploads. The information comes from the /action_status endpoint (for action status) and the /uploads endpoint (for file name and path).
Clicking a diagnostics link should download the file as a zip.
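
The way the two sources could be combined into one list can be sketched as follows. The response shapes, status names, and the helper `mergeDiagnostics` are simplified assumptions for illustration; the real payloads of /action_status and /uploads may differ.

```typescript
// Assumed, simplified shapes; the real API payloads may differ.
type DiagnosticsStatus = 'IN_PROGRESS' | 'COMPLETE' | 'FAILED' | 'EXPIRED';

interface ActionStatusItem {
  actionId: string;
  status: DiagnosticsStatus;
}

interface UploadItem {
  actionId: string;
  name: string;
  filePath: string;
}

interface DiagnosticsRow {
  actionId: string;
  status: DiagnosticsStatus;
  name?: string;
  filePath?: string;
}

// Hypothetical helper: one row per action, enriched with the file name and
// path when a matching upload exists.
function mergeDiagnostics(statuses: ActionStatusItem[], uploads: UploadItem[]): DiagnosticsRow[] {
  const uploadsByAction = new Map<string, UploadItem>();
  for (const upload of uploads) {
    uploadsByAction.set(upload.actionId, upload);
  }
  return statuses.map((item) => {
    const base: DiagnosticsRow = { actionId: item.actionId, status: item.status };
    const upload = uploadsByAction.get(item.actionId);
    return upload ? { ...base, name: upload.name, filePath: upload.filePath } : base;
  });
}
```

Pending requests have no upload yet, so their rows carry only the status; the UI can render them without a download link.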

Failed uploads display:

The Expired status was not specified separately in the design, so it is shown like the Failed status (with a warning icon).
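
A minimal sketch of that display rule, assuming illustrative status and icon names (the PR's actual values are not shown in this conversation):

```typescript
// Assumed status and icon names, for illustration only.
type UploadStatus = 'READY' | 'IN_PROGRESS' | 'AWAITING_UPLOAD' | 'FAILED' | 'EXPIRED';
type StatusIcon = 'download' | 'loading' | 'warning';

// Expired uploads share the warning icon with failed ones, because the
// design did not specify a separate treatment for them.
function iconForStatus(status: UploadStatus): StatusIcon {
  switch (status) {
    case 'READY':
      return 'download';
    case 'IN_PROGRESS':
    case 'AWAITING_UPLOAD':
      return 'loading';
    case 'FAILED':
    case 'EXPIRED':
      return 'warning';
  }
}
```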

Mock data (blocker)

Currently returning mock data in the /uploads API, because of a blocker in Kibana File Service, see here.

Bulk action

Added bulk action too:

Shows up in agent activity:

The Fleet Server / Agent changes are not there yet, though FS delivers the action and Agents ack it (this looks like the default behavior for unknown actions as well).

Confirmation modal

Added a confirmation modal that appears when clicking the action button anywhere except the Request diagnostics button on the Diagnostics page.
Open question:

Do we want to display the confirmation window on the Diagnostics page button too?

Download

Generated file path to download in this format: /api/fleet/agents/files/{fileId}/{fileName}

Decided not to use the files plugin's API because it doesn't have the Fleet authorization around it.

Screen recording demonstrating the download of an agent diagnostics zip file that I uploaded using the Fleet Server upload API (using Dan's PR locally):
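
For illustration, a download path in that format could be generated as in the sketch below. The helper name `buildDownloadPath` is hypothetical, and URL-encoding the segments is an assumption about what a safe implementation would do, not something the PR states.

```typescript
// API root as it appears in the path format above.
const API_ROOT = '/api/fleet';

// Hypothetical helper: fills the {fileId} and {fileName} segments.
// Encoding each segment is an assumption made for safety in this sketch.
function buildDownloadPath(fileId: string, fileName: string): string {
  return `${API_ROOT}/agents/files/${encodeURIComponent(fileId)}/${encodeURIComponent(fileName)}`;
}

console.log(buildDownloadPath('file-1', 'elastic-agent-diagnostics.zip'));
```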

file_download.mov

Notification

Added a toast message that shows up when a diagnostics file becomes ready while the user is on the Diagnostics tab.
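
One way to decide when to show such a toast is to compare two consecutive polls of the diagnostics list, as sketched below. The `READY` status name, the entry shape, and the helper `findNewlyReady` are assumptions for illustration; the PR's actual detection logic is not shown here.

```typescript
// Assumed entry shape; the status name 'READY' is illustrative.
interface DiagnosticsEntry {
  id: string;
  status: string;
}

// Hypothetical helper: returns entries that were already listed but not
// ready in the previous poll and are ready now. Entries seen for the first
// time are skipped so that the initial page load does not trigger toasts.
function findNewlyReady(
  previous: DiagnosticsEntry[],
  current: DiagnosticsEntry[]
): DiagnosticsEntry[] {
  const previousStatus = new Map<string, string>();
  for (const entry of previous) {
    previousStatus.set(entry.id, entry.status);
  }
  return current.filter((entry) => {
    const before = previousStatus.get(entry.id);
    return entry.status === 'READY' && before !== undefined && before !== 'READY';
  });
}
```

The caller would show one toast per returned entry and then keep `current` as the next `previous`.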

diag_notif.mov

Checklist

Delete any items that are not applicable to this PR.

Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Unit or functional tests were updated or added to match the most common scenarios

elasticmachine · 2022-10-05T08:17:02Z

Pinging @elastic/fleet (Team:Fleet)

joshdover · 2022-10-05T08:18:55Z

@juliaElastic super excited to see so much progress on this already. Do we need to wait for the Fleet Server and Agent dependencies to be merged before we merge this?

juliaElastic · 2022-10-05T08:23:04Z

@joshdover yes, we need the backend implementation of the action, otherwise it would be an action that never completes.
We also need a real implementation of the uploads API, which is currently blocked.

juliaElastic · 2022-10-05T15:45:18Z

@elasticmachine merge upstream

juliaElastic · 2022-11-08T09:21:40Z

ResponseOps changes look good!

We're chatting about this PR right now, since we don't think we've reviewed your usage of task manager - but can now see it when new task types get added (via the check in check_registered_task_types.ts). Thinking it would be good to get an overview of how you're using it, we may have some thoughts/hints/tips.

@pmuellr See the reasoning here: #138870
To summarize, we have Fleet bulk actions that are being run in async mode for more than 10k agents. The Kibana Task Manager is used to catch errors and trigger retry of actions in a Task, and also to check completion in a Task after 5m.
The same logic is used for all types of actions (upgrade, unenroll, reassign, update tags, request diagnostics).

juliaElastic · 2022-11-08T09:51:57Z

x-pack/plugins/fleet/common/experimental_features.ts

@@ -15,6 +15,7 @@ export const allowedExperimentalValues = Object.freeze({
  createPackagePolicyMultiPageLayout: true,
  packageVerification: true,
  showDevtoolsRequest: true,
+  showRequestDiagnostics: false,


Added a feature flag to hide the Request diagnostics action by default in 3 places: Agent list single action, bulk action, and Agent details action.
For local testing, set this boolean to true.
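
A minimal sketch of how a flag like this can gate an action list. The flag object copies the diff above; the helper `visibleAgentActions` and the other action labels are hypothetical and only illustrate the gating, not Fleet's actual UI code.

```typescript
// Flag values as shown in the diff above.
const allowedExperimentalValues = Object.freeze({
  createPackagePolicyMultiPageLayout: true,
  packageVerification: true,
  showDevtoolsRequest: true,
  showRequestDiagnostics: false,
});

type ExperimentalFeatures = Record<keyof typeof allowedExperimentalValues, boolean>;

// Hypothetical helper: the action labels are illustrative placeholders.
function visibleAgentActions(features: ExperimentalFeatures): string[] {
  const actions = ['Reassign', 'Unenroll', 'Upgrade'];
  if (features.showRequestDiagnostics) {
    actions.push('Request diagnostics');
  }
  return actions;
}

// Default: the action is hidden.
console.log(visibleAgentActions(allowedExperimentalValues));
// Local testing: override the flag to show the action.
console.log(visibleAgentActions({ ...allowedExperimentalValues, showRequestDiagnostics: true }));
```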

Feature flag off:

Toggle on:

Thanks for adding this!


juliaElastic · 2022-11-08T15:53:28Z

x-pack/plugins/fleet/server/services/agents/uploads.ts

+    return actionResults.hits.hits.map((hit) => ({
+      actionId: hit._source?.action_id as string,
+      timestamp: hit._source?.['@timestamp'],
+      fileId: hit._source?.data?.file_id as string,


This takes the file_id from the action results. That part is not yet implemented on the agent side; I asked about it here: elastic/elastic-agent#1631 (comment)
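
A self-contained sketch of the mapping in the diff above, using simplified hit types. Unlike the diff, which casts with `as string`, this version skips hits that lack an `action_id` or `file_id`; that is a design choice of the sketch, and the type and function names are hypothetical.

```typescript
// Simplified, assumed shape of an action result search hit.
interface ActionResultHit {
  _source?: {
    action_id?: string;
    '@timestamp'?: string;
    data?: { file_id?: string };
  };
}

interface UploadRef {
  actionId: string;
  timestamp: string | undefined;
  fileId: string;
}

// Hypothetical helper: keeps only hits that carry both an action id and a
// file id, so callers never see a reference without a downloadable file.
function toUploadRefs(hits: ActionResultHit[]): UploadRef[] {
  const refs: UploadRef[] = [];
  for (const hit of hits) {
    const source = hit._source;
    const actionId = source?.action_id;
    const fileId = source?.data?.file_id;
    if (actionId === undefined || fileId === undefined) {
      continue;
    }
    refs.push({ actionId, timestamp: source?.['@timestamp'], fileId });
  }
  return refs;
}
```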

juliaElastic · 2022-11-09T08:29:22Z

...applications/fleet/sections/agents/agent_details_page/components/agent_diagnostics/index.tsx

+        >
+          <FormattedMessage
+            id="xpack.fleet.requestDiagnostics.calloutText"
+            defaultMessage="Diagnostics files are stored in Elasticsearch, and as such can incur storage costs. Fleet will automatically remove old diagnostics files after 30 days."


This callout text might have to be updated if we agree on a different max age than 30 days.

kibana-ci · 2022-11-09T09:21:19Z

💚 Build Succeeded

Buildkite Build
Commit: 89888ae

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| `fleet` | 733 | 735 | +2 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| `fleet` | 898 | 912 | +14 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| `fleet` | 861.4KB | 868.8KB | +7.4KB |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| `fleet` | 115.5KB | 116.9KB | +1.4KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| `fleet` | 1001 | 1015 | +14 |

ESLint disabled in files

| id | before | after | diff |
| --- | --- | --- | --- |
| `osquery` | 1 | 2 | +1 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| `enterpriseSearch` | 19 | 21 | +2 |
| `fleet` | 59 | 65 | +6 |
| `osquery` | 108 | 113 | +5 |
| `securitySolution` | 440 | 446 | +6 |
| total | | | +19 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| `enterpriseSearch` | 20 | 22 | +2 |
| `fleet` | 67 | 73 | +6 |
| `osquery` | 109 | 115 | +6 |
| `securitySolution` | 517 | 523 | +6 |
| total | | | +20 |

History

💔 Build #86339 failed 3cfaf2e
💔 Build #86214 failed b3123fa
💔 Build #86142 failed c0755cf
💔 Build #86131 failed a486196
💛 Build #85818 was flaky b164e02

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @juliaElastic

criamico

I tested the branch locally. I'm not sure if the upload part can be tested yet, but up to that point it works as per the requirements.
In my local environment, the UI gets stuck in a loading state when requesting the diagnostics, so it's difficult to test the rest of the flow.

I just have a question. From the agent activity it is possible to see that the diagnostics were requested, but it's hard to know for which agent. That's totally fine for bulk requests, but would it make sense to consider adding the agent ID for single requests?

criamico

Code LGTM 🚢

criamico · 2022-11-09T15:37:24Z

x-pack/plugins/fleet/common/constants/routes.ts

@@ -127,6 +127,8 @@ export const AGENT_API_ROUTES = {
  BULK_UNENROLL_PATTERN: `${API_ROOT}/agents/bulk_unenroll`,
  REASSIGN_PATTERN: `${API_ROOT}/agents/{agentId}/reassign`,
  BULK_REASSIGN_PATTERN: `${API_ROOT}/agents/bulk_reassign`,
+  REQUEST_DIAGNOSTICS_PATTERN: `${API_ROOT}/agents/{agentId}/request_diagnostics`,


Could you add the new endpoints to the docs? That can be done in a separate PR as well.
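
For readers unfamiliar with these route constants, the sketch below shows how a pattern such as REQUEST_DIAGNOSTICS_PATTERN could be turned into a concrete path. The helper `fillPattern` is hypothetical (Fleet has its own route services), and the `API_ROOT` value is taken from the download path format mentioned in the PR description.

```typescript
// Values as they appear in this PR conversation.
const API_ROOT = '/api/fleet';
const REQUEST_DIAGNOSTICS_PATTERN = `${API_ROOT}/agents/{agentId}/request_diagnostics`;

// Hypothetical helper: replaces each {name} placeholder with its encoded
// value and fails loudly if a parameter is missing.
function fillPattern(pattern: string, params: Record<string, string | undefined>): string {
  return pattern.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = params[key];
    if (value === undefined) {
      throw new Error(`Missing route parameter: ${key}`);
    }
    return encodeURIComponent(value);
  });
}

console.log(fillPattern(REQUEST_DIAGNOSTICS_PATTERN, { agentId: 'agent-1' }));
```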

juliaElastic · 2022-11-09T16:03:42Z

I tested the branch locally. I'm not sure if the upload part can be tested yet, but up to that point it works as per the requirements. In my local environment, the UI gets stuck in a loading state when requesting the diagnostics, so it's difficult to test the rest of the flow.

Thanks for the review!
I can document the steps I used to test e2e locally (checking out the fleet-server PR for the upload API, uploading a test file with a script, and adding some code to Kibana that immediately saves the action result with the file ID when taking the action).

I just have a question. From the agent activity it is possible to see that the diagnostics were requested, but it's hard to know for which agent. That's totally fine for bulk requests, but would it make sense to consider adding the agent ID for single requests?

We have an enhancement to add more insight into the agents actioned: #141206

I think in the Request Diagnostics case this shouldn't be an issue, because when taking the action on a single agent, the UI should navigate to the Agent Details / Diagnostics tab.

juliaElastic · 2022-11-10T08:19:26Z

Here are the steps I used to test e2e locally:

Check out the fleet-server PR for the upload API: File Upload Feature fleet-server#1902
Start Fleet Server locally:

make local
./bin/fleet-server --config fleet-server.dev.yml 2>&1

Upload a test file with a script: extract this zip to your fleet-server directory, then run `go run ./upload.go elastic-agent-diagnostics-2022-10-04T09-54-34Z-00.zip`
upload_test.zip
Add some code to Kibana that immediately saves the action result with the file ID when taking the action.
Replace the requestDiagnostics function in x-pack/plugins/fleet/server/services/agents/request_diagnostics.ts:


export async function requestDiagnostics(
  esClient: ElasticsearchClient,
  agentId: string
): Promise<{ actionId: string }> {
  const id = new Date().toISOString();
  const response = await createAgentAction(esClient, {
    agents: [agentId],
    created_at: new Date().toISOString(),
    type: 'REQUEST_DIAGNOSTICS',
    id,
  });
  await bulkCreateAgentActionResults(
    esClient,
    [agentId].map((agent) => ({
      agentId: agent,
      actionId: id,
      // Replace with the actual file id; you can check it in the console with `GET .fleet-agent-files/_search`.
      data: {
        file_id: 'a59cdf67-65d9-4070-90ee-6b7c488215aa.b324764f-3631-40a5-8ff1-756fbbfbc3ac',
      },
    }))
  );
  return { actionId: response.id };
}

Add a data parameter to bulkCreateAgentActionResults in actions.ts:


export async function bulkCreateAgentActionResults(
  esClient: ElasticsearchClient,
  results: Array<{
    actionId: string;
    agentId: string;
    error?: string;
    data?: any;
  }>
): Promise<void> {
  if (results.length === 0) {
    return;
  }

  const bulkBody = results.flatMap((result) => {
    const body = {
      '@timestamp': new Date().toISOString(),
      action_id: result.actionId,
      agent_id: result.agentId,
      error: result.error,
      data: result.data,
    };

    return [
      {
        create: {
          _id: uuid.v4(),
        },
      },
      body,
    ];
  });

  await esClient.bulk({
    index: AGENT_ACTIONS_RESULTS_INDEX,
    body: bulkBody,
    refresh: 'wait_for',
  });
}

Go to the Agent Details UI and trigger the Request Diagnostics action. The file should immediately be available to download, and the downloaded zip file should match the one you uploaded.

juliaElastic · 2023-01-12T13:27:38Z

NOTE: this feature is turned off in 8.6 as the backend is not ready yet. The feature is planned to be enabled in 8.7.

Related to elastic/observability-docs#2506 The change for #142369 did not make it into the 8.6 release.


amolnater-qasource · 2023-02-27T08:47:17Z

Hi Team,

We have created 11 test cases for this feature under our Fleet test suite at the following link:

Fleet>Agent Diagnostics

Please let us know if any other scenario is missing from our end.
Thanks

juliaElastic added 2 commits September 30, 2022 13:12

request diagnostics action API

0ec98c2

agent diagnostics action UI

af2e1d6

juliaElastic added release_note:feature Makes this part of the condensed release notes v8.6.0 labels Sep 30, 2022

juliaElastic self-assigned this Sep 30, 2022

juliaElastic changed the title ~~Feat/request diagnostics~~ [WIP] Request diagnostics Sep 30, 2022

juliaElastic and others added 3 commits October 3, 2022 13:43

call diagnostics api when clicking button, added modal

7a47627

showing diagnostics uploads list with mock data

03e6ab6

Merge branch 'main' into feat/request-diagnostics

6cc9d88

juliaElastic marked this pull request as ready for review October 5, 2022 08:14

juliaElastic requested a review from a team as a code owner October 5, 2022 08:14

juliaElastic changed the title ~~[WIP] Request diagnostics~~ [Fleet] Request diagnostics Oct 5, 2022

botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Oct 5, 2022

juliaElastic and others added 5 commits October 5, 2022 11:18

bulk request diagnostics

478c3d5

query action status to show diagnostics status

c559655

fix for failed status display

f36b539

changed implementation to query files index in /uploads API

c0dcc51

Merge branch 'main' into feat/request-diagnostics

46b9000

kibanamachine and others added 2 commits October 5, 2022 10:11

Merge branch 'main' into feat/request-diagnostics

383c4bf

implemented file download

e795a39

juliaElastic mentioned this pull request Oct 6, 2022

[Fleet] Add workflow for requesting and downloading agent diagnostics from Fleet UI #141074

Closed

19 tasks

juliaElastic added 3 commits October 6, 2022 13:12

changed implementation to add downlaod headers for file

ed1a5d1

added toast when a diagnostics became ready

d537961

added tests on request diagnostics and uploads

f165ee7

juliaElastic force-pushed the feat/request-diagnostics branch from 77ee9bf to f165ee7 Compare October 7, 2022 13:32

fixed checks

a15ba47

juliaElastic and others added 2 commits November 8, 2022 10:24

Merge branch 'main' into feat/request-diagnostics

448f0fa

added feature flag to hide request diagnostics action

a486196

juliaElastic commented Nov 8, 2022


juliaElastic added 2 commits November 8, 2022 11:18

fixed checks

c0755cf

fixed test

b3123fa

nchaulet self-requested a review November 8, 2022 14:02

kpollich self-requested a review November 8, 2022 14:20

juliaElastic commented Nov 8, 2022


x-pack/plugins/fleet/server/services/agents/uploads.ts (outdated)

removed mock data and changed the query to return diagnostic files

820edee

juliaElastic commented Nov 8, 2022


juliaElastic and others added 3 commits November 8, 2022 16:54

Merge branch 'main' into feat/request-diagnostics

3cfaf2e

fixed integration test

d622ed3

added error handling, extracted index names as constants

89888ae

juliaElastic commented Nov 9, 2022


criamico reviewed Nov 9, 2022


criamico approved these changes Nov 9, 2022


juliaElastic merged commit c7cdd00 into elastic:main Nov 10, 2022

kibanamachine added the backport:skip This commit does not require backporting label Nov 10, 2022

juliaElastic added release_note:skip Skip the PR/issue when compiling release notes and removed release_note:feature Makes this part of the condensed release notes labels Jan 12, 2023

dedemorton mentioned this pull request Jan 12, 2023

[REQUEST]: Remove release notes item added in 8.6 for kibana pull 142369 elastic/observability-docs#2505

Closed

dedemorton added a commit that referenced this pull request Jan 12, 2023

Remove changelog entry for #142369 and edit Fleet entries

2264056

Related to elastic/observability-docs#2506 The change for #142369 did not make it into the 8.6 release.

dedemorton mentioned this pull request Jan 12, 2023

[DOCS] Remove changelog entry for #142369 and edit Fleet entries #148842

Merged
