Merge branch 'main' into ts-4-5-2
brianseeders authored Dec 9, 2021
2 parents 8c2bce2 + ebe11e3 commit 2872a20
Showing 408 changed files with 14,085 additions and 8,789 deletions.
68 changes: 0 additions & 68 deletions docs/developer/getting-started/debugging.asciidoc
@@ -130,71 +130,3 @@ Once you're finished, you can stop Kibana normally, then stop the {es} and APM s
----
./scripts/compose.py stop
----

=== Using {kib} server logs
{kib} Logs is a great way to see what's going on in your application and to debug performance issues. Navigating through a large number of generated logs can be overwhelming, and following are some techniques that you can use to optimize the process.

Start by defining a problem area that you are interested in. For example, you might be interested in seeing how a particular {kib} Plugin is performing, so no need to gather logs for all of {kib}. Or you might want to focus on a particular feature, such as requests from the {kib} server to the {es} server.
Depending on your needs, you can configure {kib} to generate logs for a specific feature.
[source,yml]
----
logging:
  appenders:
    file:
      type: file
      fileName: ./kibana.log
      layout:
        type: json
### gather all the Kibana logs into a file
logging.root:
  appenders: [file]
  level: all
### or gather a subset of the logs
logging.loggers:
  ### responses to an HTTP request
  - name: http.server.response
    level: debug
    appenders: [file]
  ### result of a query to the Elasticsearch server
  - name: elasticsearch.query
    level: debug
    appenders: [file]
  ### logs generated by my plugin
  - name: plugins.myPlugin
    level: debug
    appenders: [file]
----
WARNING: Kibana's `file` appender is configured to produce logs in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format. It's the only format that includes the meta information necessary for https://www.elastic.co/guide/en/apm/agent/nodejs/current/log-correlation.html[log correlation] out-of-the-box.

The next step is to define what https://www.elastic.co/observability[observability tools] are available.
For a better experience, set up an https://www.elastic.co/guide/en/apm/get-started/current/observability-integrations.html[Observability integration] provided by Elastic to debug your application with the <<debugging-logs-apm-ui, APM UI.>>
To debug something quickly without setting up additional tooling, you can work with <<plain-kibana-logs, the plain {kib} logs.>>

[[debugging-logs-apm-ui]]
==== APM UI
*Prerequisites* {kib} logs are configured to be in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format to include tracing identifiers.

To debug {kib} with the APM UI, you must set up the APM infrastructure. You can find instructions for the setup process
https://www.elastic.co/guide/en/apm/get-started/current/observability-integrations.html[on the Observability integrations page].

Once you set up the APM infrastructure, you can enable the APM agent and put {kib} under load to collect APM events. To analyze the collected metrics and logs, use the APM UI as demonstrated https://www.elastic.co/guide/en/kibana/master/transactions.html#transaction-trace-sample[in the docs].

[[plain-kibana-logs]]
==== Plain {kib} logs
*Prerequisites* {kib} logs are configured to be in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format to include tracing identifiers.

Open {kib} Logs and search for an operation you are interested in.
For example, suppose you want to investigate the response times for queries to the `/api/telemetry/v2/clusters/_stats` {kib} endpoint.
Open Kibana Logs and search for the HTTP server response for the endpoint. It looks similar to the following (some fields are omitted for brevity).
[source,json]
----
{
"message":"POST /api/telemetry/v2/clusters/_stats 200 1014ms - 43.2KB",
"log":{"level":"DEBUG","logger":"http.server.response"},
"trace":{"id":"9b99131a6f66587971ef085ef97dfd07"},
"transaction":{"id":"d0c5bbf14f5febca"}
}
----
You are interested in the https://www.elastic.co/guide/en/ecs/current/ecs-tracing.html#field-trace-id[trace.id] field, which is a unique identifier of a trace. The `trace.id` provides a way to group multiple events, like transactions, which belong together. You can search for `"trace":{"id":"9b99131a6f66587971ef085ef97dfd07"}` to get all the logs that belong to the same trace. This enables you to see how many {es} requests were triggered during the `9b99131a6f66587971ef085ef97dfd07` trace, what they looked like, what {es} endpoints were hit, and so on.
76 changes: 42 additions & 34 deletions docs/settings/task-manager-settings.asciidoc
@@ -9,51 +9,59 @@ Task Manager runs background tasks by polling for work on an interval. You can

[float]
[[task-manager-settings]]
==== Task Manager settings
==== Task Manager settings

[cols="2*<"]
|===
| `xpack.task_manager.max_attempts`
| The maximum number of times a task will be attempted before being abandoned as failed. Defaults to 3.

| `xpack.task_manager.poll_interval`
| How often, in milliseconds, the task manager will look for more work. Defaults to 3000 and cannot be lower than 100.

| `xpack.task_manager.request_capacity`
| How many requests can Task Manager buffer before it rejects new requests. Defaults to 1000.
`xpack.task_manager.max_attempts`::
The maximum number of times a task will be attempted before being abandoned as failed. Defaults to 3.

| `xpack.task_manager.max_workers`
| The maximum number of tasks that this Kibana instance will run simultaneously. Defaults to 10.
Starting in 8.0, it will not be possible to set the value greater than 100.
`xpack.task_manager.poll_interval`::
How often, in milliseconds, the task manager will look for more work. Defaults to 3000 and cannot be lower than 100.

| `xpack.task_manager.`
`monitored_stats_health_verbose_log.enabled`
| This flag will enable automatic warn and error logging if task manager self detects a performance issue, such as the time between when a task is scheduled to execute and when it actually executes. Defaults to false.
`xpack.task_manager.request_capacity`::
How many requests can Task Manager buffer before it rejects new requests. Defaults to 1000.

| `xpack.task_manager.`
`monitored_stats_health_verbose_log.`
`warn_delayed_task_start_in_seconds`
| The amount of seconds we allow a task to delay before printing a warning server log. Defaults to 60.
`xpack.task_manager.max_workers`::
The maximum number of tasks that this Kibana instance will run simultaneously. Defaults to 10.
Starting in 8.0, it will not be possible to set the value greater than 100.

| `xpack.task_manager.ephemeral_tasks.enabled`
| Enables an experimental feature that executes a limited (and configurable) number of actions in the same task as the alert which triggered them.
These action tasks will reduce the latency of the time it takes an action to run after it's triggered, but are not persisted as SavedObjects.
These non-persisted action tasks have a risk that they won't be run at all if the Kibana instance running them exits unexpectedly. Defaults to false.
`xpack.task_manager.monitored_stats_health_verbose_log.enabled`::
This flag will enable automatic warn and error logging if task manager self detects a performance issue, such as the time between when a task is scheduled to execute and when it actually executes. Defaults to false.

`xpack.task_manager.monitored_stats_health_verbose_log.warn_delayed_task_start_in_seconds`::
The amount of seconds we allow a task to delay before printing a warning server log. Defaults to 60.

`xpack.task_manager.ephemeral_tasks.enabled`::
Enables an experimental feature that executes a limited (and configurable) number of actions in the same task as the alert which triggered them.
These action tasks will reduce the latency of the time it takes an action to run after it's triggered, but are not persisted as SavedObjects.
These non-persisted action tasks have a risk that they won't be run at all if the Kibana instance running them exits unexpectedly. Defaults to false.

`xpack.task_manager.ephemeral_tasks.request_capacity`::
Sets the size of the ephemeral queue defined above. Defaults to 10.

| `xpack.task_manager.ephemeral_tasks.request_capacity`
| Sets the size of the ephemeral queue defined above. Defaults to 10.
|===

[float]
[[task-manager-health-settings]]
==== Task Manager Health settings
==== Task Manager Health settings

Settings that configure the <<task-manager-health-monitoring>> endpoint.

[cols="2*<"]
|===
| `xpack.task_manager.`
`monitored_task_execution_thresholds`
| Configures the threshold of failed task executions at which point the `warn` or `error` health status is set under each task type execution status (under `stats.runtime.value.execution.result_frequency_percent_as_number[${task type}].status`). This setting allows configuration of both the default level and a custom task type specific level. By default, this setting is configured to mark the health of every task type as `warning` when it exceeds 80% failed executions, and as `error` at 90%. Custom configurations allow you to reduce this threshold to catch failures sooner for task types that you might consider critical, such as alerting tasks. This value can be set to any number between 0 to 100, and a threshold is hit when the value *exceeds* this number. This means that you can avoid setting the status to `error` by setting the threshold at 100, or hit `error` the moment any task fails by setting the threshold to 0 (as it will exceed 0 once a single failure occurs).

|===
`xpack.task_manager.monitored_task_execution_thresholds`::
Configures the threshold of failed task executions at which point the `warn` or
`error` health status is set under each task type execution status
(under `stats.runtime.value.execution.result_frequency_percent_as_number[${task type}].status`).
+
This setting allows configuration of both the default level and a
custom task type specific level. By default, this setting is configured to mark
the health of every task type as `warning` when it exceeds 80% failed executions,
and as `error` at 90%.
+
Custom configurations allow you to reduce this threshold to catch failures sooner
for task types that you might consider critical, such as alerting tasks.
+
This value can be set to any number between 0 to 100, and a threshold is hit
when the value *exceeds* this number. This means that you can avoid setting the
status to `error` by setting the threshold at 100, or hit `error` the moment
any task fails by setting the threshold to 0 (as it will exceed 0 once a
single failure occurs).
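
The "exceeds" rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not Task Manager's implementation: the function name `execution_health`, its parameters, and the `"OK"` label are hypothetical, and the defaults of 80 and 90 are the ones quoted in the setting description.

```python
def execution_health(failed_percent, warn_threshold=80.0, error_threshold=90.0):
    """Classify a task type by its percentage of failed executions.

    Hypothetical helper: a threshold is hit only when the failure
    percentage strictly exceeds it, as the setting description states.
    """
    if failed_percent > error_threshold:
        return "error"
    if failed_percent > warn_threshold:
        return "warn"
    return "OK"
```

Under this reading, `execution_health(100, error_threshold=100)` never reports `error`, because 100 does not exceed 100, while with `warn_threshold=0` and `error_threshold=0` any non-zero failure percentage reports `error`.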
23 changes: 10 additions & 13 deletions docs/settings/telemetry-settings.asciidoc
@@ -17,29 +17,26 @@ See our https://www.elastic.co/legal/privacy-statement[Privacy Statement] to lea
[[telemetry-general-settings]]
==== General telemetry settings

[cols="2*<"]
|===
|[[telemetry-enabled]] `telemetry.enabled`
| Set to `true` to send cluster statistics to Elastic. Reporting your

[[telemetry-enabled]] `telemetry.enabled`::
Set to `true` to send cluster statistics to Elastic. Reporting your
cluster statistics helps us improve your user experience. Your data is never
shared with anyone. Set to `false` to disable statistics reporting from any
browser connected to the {kib} instance. Defaults to `true`.

| `telemetry.sendUsageFrom`
| Set to `'server'` to report the cluster statistics from the {kib} server.
`telemetry.sendUsageFrom`::
Set to `'server'` to report the cluster statistics from the {kib} server.
If the server fails to connect to our endpoint at https://telemetry.elastic.co/, it assumes
it is behind a firewall and falls back to `'browser'` to send it from users' browsers
when they are navigating through {kib}. Defaults to `'server'`.

|[[telemetry-optIn]] `telemetry.optIn`
| Set to `true` to automatically opt into reporting cluster statistics. You can also opt out through
[[telemetry-optIn]] `telemetry.optIn`::
Set to `true` to automatically opt into reporting cluster statistics. You can also opt out through
*Advanced Settings* in {kib}. Defaults to `true`.

| `telemetry.allowChangingOptInStatus`
| Set to `true` to allow overwriting the <<telemetry-optIn, `telemetry.optIn`>> setting via the {kib} UI. Defaults to `true`. +

|===

`telemetry.allowChangingOptInStatus`::
Set to `true` to allow overwriting the <<telemetry-optIn, `telemetry.optIn`>> setting via the {kib} UI. Defaults to `true`. +
+
[NOTE]
============
When `false`, <<telemetry-optIn, `telemetry.optIn`>> must be `true`. To disable telemetry and not allow users to change that parameter, use <<telemetry-enabled, `telemetry.enabled`>>.
2 changes: 1 addition & 1 deletion docs/user/alerting/alerting-troubleshooting.asciidoc
@@ -15,7 +15,7 @@ Rules and connectors log to the Kibana logger with tags of [alerting] and [actio

[source, txt]
--------------------------------------------------
server log [11:39:40.389] [error][alerting][alerting][plugins][plugins] Executing Alert "5b6237b0-c6f6-11eb-b0ff-a1a0cbcf29b6" has resulted in Error: Saved object [action/fdbc8610-c6f5-11eb-b0ff-a1a0cbcf29b6] not found
server log [11:39:40.389] [error][alerting][alerting][plugins][plugins] Executing Rule "5b6237b0-c6f6-11eb-b0ff-a1a0cbcf29b6" has resulted in Error: Saved object [action/fdbc8610-c6f5-11eb-b0ff-a1a0cbcf29b6] not found
--------------------------------------------------
Some of the resources, such as saved objects and API keys, may no longer be available or valid, yielding error messages about those missing resources.

@@ -170,7 +170,7 @@ And see the errors for the rules you might provide the next search query:
}
],
},
"message": "alert executed: .index-threshold:30d856c0-b14b-11eb-9a7c-9df284da9f99: 'test'",
"message": "rule executed: .index-threshold:30d856c0-b14b-11eb-9a7c-9df284da9f99: 'test'",
"error" : {
"message" : "Saved object [action/ef0e2530-b14a-11eb-9a7c-9df284da9f99] not found"
},
2 changes: 2 additions & 0 deletions docs/user/index.asciidoc
@@ -45,3 +45,5 @@ include::management.asciidoc[]
include::api.asciidoc[]

include::plugins.asciidoc[]

include::troubleshooting.asciidoc[]
@@ -1020,7 +1020,7 @@ This log message tells us that when Task Manager was running one of our rules, i

For example, in this case, we’d expect to see a corresponding log line from the Alerting framework itself, saying that the rule failed. You should look in the Kibana log for a line similar to the log line below (probably shortly before the Task Manager log line):

Executing Alert "27559295-44e4-4983-aa1b-94fe043ab4f9" has resulted in Error: Unable to load resource ‘/api/something’
Executing Rule "27559295-44e4-4983-aa1b-94fe043ab4f9" has resulted in Error: Unable to load resource ‘/api/something’

This would confirm that the error did in fact happen in the rule itself (rather than the Task Manager) and it would help us pin-point the specific ID of the rule which failed: 27559295-44e4-4983-aa1b-94fe043ab4f9
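
When many such lines are present, pulling out the rule ID by hand gets tedious. The following Python sketch extracts the ID and the error text from a log line. It assumes the message has the shape shown in the example above; the pattern and the function name `parse_execution_error` are illustrative, not part of Kibana. It accepts both the older "Executing Alert" wording and the newer "Executing Rule" wording.

```python
import re

# Assumed message shape, based on the example log line above.
EXECUTION_ERROR = re.compile(
    r'Executing (?:Alert|Rule) "(?P<rule_id>[^"]+)" '
    r"has resulted in Error: (?P<error>.*)"
)


def parse_execution_error(line):
    """Return (rule_id, error_message), or None if the line does not match."""
    match = EXECUTION_ERROR.search(line)
    if match is None:
        return None
    return match.group("rule_id"), match.group("error")
```

The returned ID can then be used to look up the failing rule.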

70 changes: 70 additions & 0 deletions docs/user/troubleshooting.asciidoc
@@ -0,0 +1,70 @@
[[kibana-troubleshooting]]
== Troubleshooting

=== Using {kib} server logs
{kib} Logs is a great way to see what's going on in your application and to debug performance issues. Navigating through a large number of generated logs can be overwhelming, and following are some techniques that you can use to optimize the process.

Start by defining a problem area that you are interested in. For example, you might be interested in seeing how a particular {kib} Plugin is performing, so no need to gather logs for all of {kib}. Or you might want to focus on a particular feature, such as requests from the {kib} server to the {es} server.
Depending on your needs, you can configure {kib} to generate logs for a specific feature.
[source,yml]
----
logging:
  appenders:
    file:
      type: file
      fileName: ./kibana.log
      layout:
        type: json
### gather all the Kibana logs into a file
logging.root:
  appenders: [file]
  level: all
### or gather a subset of the logs
logging.loggers:
  ### responses to an HTTP request
  - name: http.server.response
    level: debug
    appenders: [file]
  ### result of a query to the Elasticsearch server
  - name: elasticsearch.query
    level: debug
    appenders: [file]
  ### logs generated by my plugin
  - name: plugins.myPlugin
    level: debug
    appenders: [file]
----
WARNING: Kibana's `file` appender is configured to produce logs in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format. It's the only format that includes the meta information necessary for https://www.elastic.co/guide/en/apm/agent/nodejs/current/log-correlation.html[log correlation] out-of-the-box.

The next step is to define what https://www.elastic.co/observability[observability tools] are available.
For a better experience, set up an https://www.elastic.co/guide/en/apm/get-started/current/observability-integrations.html[Observability integration] provided by Elastic to debug your application with the <<debugging-logs-apm-ui, APM UI.>>
To debug something quickly without setting up additional tooling, you can work with <<plain-kibana-logs, the plain {kib} logs.>>

[[debugging-logs-apm-ui]]
==== APM UI
*Prerequisites* {kib} logs are configured to be in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format to include tracing identifiers.

To debug {kib} with the APM UI, you must set up the APM infrastructure. You can find instructions for the setup process
https://www.elastic.co/guide/en/apm/get-started/current/observability-integrations.html[on the Observability integrations page].

Once you set up the APM infrastructure, you can enable the APM agent and put {kib} under load to collect APM events. To analyze the collected metrics and logs, use the APM UI as demonstrated https://www.elastic.co/guide/en/kibana/master/transactions.html#transaction-trace-sample[in the docs].

[[plain-kibana-logs]]
==== Plain {kib} logs
*Prerequisites* {kib} logs are configured to be in https://www.elastic.co/guide/en/ecs/master/ecs-reference.html[ECS JSON] format to include tracing identifiers.

Open {kib} Logs and search for an operation you are interested in.
For example, suppose you want to investigate the response times for queries to the `/api/telemetry/v2/clusters/_stats` {kib} endpoint.
Open Kibana Logs and search for the HTTP server response for the endpoint. It looks similar to the following (some fields are omitted for brevity).
[source,json]
----
{
"message":"POST /api/telemetry/v2/clusters/_stats 200 1014ms - 43.2KB",
"log":{"level":"DEBUG","logger":"http.server.response"},
"trace":{"id":"9b99131a6f66587971ef085ef97dfd07"},
"transaction":{"id":"d0c5bbf14f5febca"}
}
----
You are interested in the https://www.elastic.co/guide/en/ecs/current/ecs-tracing.html#field-trace-id[trace.id] field, which is a unique identifier of a trace. The `trace.id` provides a way to group multiple events, like transactions, which belong together. You can search for `"trace":{"id":"9b99131a6f66587971ef085ef97dfd07"}` to get all the logs that belong to the same trace. This enables you to see how many {es} requests were triggered during the `9b99131a6f66587971ef085ef97dfd07` trace, what they looked like, what {es} endpoints were hit, and so on.
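
If you prefer to script the search, the grouping by `trace.id` can be sketched in Python. This is a minimal sketch, assuming the log file contains one JSON object per line; the helper names `logs_for_trace` and `response_time_ms` are hypothetical, and the response-time pattern is inferred only from the example `message` shown above.

```python
import json
import re


def logs_for_trace(lines, trace_id):
    """Yield parsed log records whose trace.id equals trace_id."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        trace = record.get("trace")
        if isinstance(trace, dict) and trace.get("id") == trace_id:
            yield record


# Assumed format: "<METHOD> <path> <status> <N>ms - <size>", as in the example.
RESPONSE_TIME = re.compile(r" (\d+)ms\b")


def response_time_ms(record):
    """Return the response time in milliseconds from a record's message, or None."""
    match = RESPONSE_TIME.search(record.get("message", ""))
    return int(match.group(1)) if match else None


# Usage sketch (the path is the one from the logging configuration example):
# with open("./kibana.log") as log_file:
#     for record in logs_for_trace(log_file, "9b99131a6f66587971ef085ef97dfd07"):
#         print(record["log"]["logger"], record["message"])
```

Reading the file line by line keeps memory use flat even for large logs, and records without a `trace` object are skipped.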
5 changes: 3 additions & 2 deletions package.json
@@ -107,7 +107,7 @@
"@elastic/datemath": "link:bazel-bin/packages/elastic-datemath",
"@elastic/elasticsearch": "npm:@elastic/elasticsearch-canary@^8.0.0-canary.35",
"@elastic/ems-client": "8.0.0",
"@elastic/eui": "41.0.0",
"@elastic/eui": "41.2.3",
"@elastic/filesaver": "1.1.2",
"@elastic/node-crypto": "1.2.1",
"@elastic/numeral": "^2.5.1",
@@ -369,7 +369,7 @@
"redux-thunks": "^1.0.0",
"regenerator-runtime": "^0.13.3",
"remark-parse": "^8.0.3",
"remark-stringify": "^9.0.0",
"remark-stringify": "^8.0.3",
"require-in-the-middle": "^5.1.0",
"reselect": "^4.0.0",
"resize-observer-polyfill": "^1.5.1",
@@ -570,6 +570,7 @@
"@types/kbn__dev-utils": "link:bazel-bin/packages/kbn-dev-utils/npm_module_types",
"@types/kbn__docs-utils": "link:bazel-bin/packages/kbn-docs-utils/npm_module_types",
"@types/kbn__es-archiver": "link:bazel-bin/packages/kbn-es-archiver/npm_module_types",
"@types/kbn__es-query": "link:bazel-bin/packages/kbn-es-query/npm_module_types",
"@types/kbn__i18n": "link:bazel-bin/packages/kbn-i18n/npm_module_types",
"@types/kbn__i18n-react": "link:bazel-bin/packages/kbn-i18n-react/npm_module_types",
"@types/license-checker": "15.0.0",