tools/lightning: update Lightning docs for v2.1.9 (#1087)
* lightning: fix typos in the command line passed to tidb-lightning-ctl

Also included the `--config` parameter so that even blindly copying the command
would still work most of the time.

* lightning: updated the list of metrics

* lightning: update configuration and memory requirement of Importer

* lightning: change release version to 2.1.9

* lightning: adjust settings for tikv/tikv#4423 and tikv/tikv#4611

* Apply suggestions from code review

Co-Authored-By: kennytm <kennytm@gmail.com>

* tools/lightning: fix remaining `max-open-engines` typo
kennytm authored May 8, 2019
1 parent a75a42f commit d654dc4
Showing 3 changed files with 97 additions and 30 deletions.
20 changes: 16 additions & 4 deletions tools/lightning/deployment.md
@@ -38,7 +38,7 @@ To achieve the best performance, it is recommended to use the following hardware
- `tikv-importer`:

- 32+ logical cores CPU
- 32 GB+ memory
- 40 GB+ memory
- 1 TB+ SSD, preferring higher IOPS (≥ 8000 is recommended)
* The disk should be larger than the total size of the top N tables, where N = max(index-concurrency, table-concurrency).
- 10 Gigabit network card (capable of transferring at ≥300 MB/s)
@@ -50,6 +50,8 @@ If you have sufficient machines, you can deploy multiple Lightning/Importer serv
> **Notes:**
>
> `tidb-lightning` is a CPU intensive program. In an environment with mixed components, the resources allocated to `tidb-lightning` must be limited. Otherwise, other components might not be able to run. It is recommended to set the `region-concurrency` to 75% of CPU logical cores. For instance, if the CPU has 32 logical cores, you can set the `region-concurrency` to 24.
>
> `tikv-importer` stores intermediate data in RAM to speed up the import process. The typical memory usage can be calculated from the configuration as **(`max-open-engines` × `write-buffer-size` × 2) + (`num-import-jobs` × `region-split-size` × 2)**; see the worked example below. If the speed of writing to disk is slow, the memory usage could be even higher due to buffering.
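
For a quick estimate, the following sketch plugs illustrative values into the formula above. The values of `write-buffer-size`, `num-import-jobs`, and `region-split-size` below are assumptions for illustration only; substitute the ones from your own `tikv-importer.toml`:

```sh
# Rough estimate of tikv-importer memory usage with assumed values:
#   max-open-engines  = 8       (as in the sample configuration below)
#   write-buffer-size = 1 GB    (assumed)
#   num-import-jobs   = 24      (assumed)
#   region-split-size = 512 MB  (assumed)
GB=$((1024 * 1024 * 1024))
MB=$((1024 * 1024))
echo "$(( (8 * GB * 2 + 24 * 512 * MB * 2) / GB )) GB"
# Prints "40 GB", in line with the 40 GB+ memory recommendation above.
```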

Additionally, the target TiKV cluster should have enough space to absorb the new data.
Besides [the standard requirements](../../op-guide/recommendation.md), the total free space of the target TiKV cluster should be larger than **Size of data source × [Number of replicas](../../FAQ.md#is-the-number-of-replicas-in-each-region-configurable-if-yes-how-to-configure-it) × 2**.
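
For example, importing a 500 GB data source into a cluster with 3 replicas requires more than 500 GB × 3 × 2 = 3 TB of free space on the target TiKV cluster.
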
@@ -167,7 +169,7 @@ You can find deployment instructions in [TiDB Quick Start Guide](https://pingcap

Download the TiDB-Lightning package (choose the same version as that of the TiDB cluster):

- **v2.1.6**: https://download.pingcap.org/tidb-v2.1.6-linux-amd64.tar.gz
- **v2.1.9**: https://download.pingcap.org/tidb-v2.1.9-linux-amd64.tar.gz
- **v2.0.9**: https://download.pingcap.org/tidb-lightning-v2.0.9-linux-amd64.tar.gz
- Latest unstable version: https://download.pingcap.org/tidb-lightning-test-xx-latest-linux-amd64.tar.gz

@@ -214,7 +216,11 @@ Download the TiDB-Lightning package (choose the same version as that of the TiDB
# The algorithm at level-0 is used to compress KV data.
# The algorithm at level-6 is used to compress SST files.
# The algorithms at level-1 to level-5 are unused for now.
compression-per-level = ["lz4", "no", "no", "no", "no", "no", "zstd"]
compression-per-level = ["lz4", "no", "no", "no", "no", "no", "lz4"]
[rocksdb.writecf]
# (same as above)
compression-per-level = ["lz4", "no", "no", "no", "no", "no", "lz4"]
[import]
# The directory to store engine files.
@@ -231,6 +237,12 @@ Download the TiDB-Lightning package (choose the same version as that of the TiDB
#stream-channel-window = 128
# Maximum number of open engines.
max-open-engines = 8
# Maximum upload speed (bytes per second) from Importer to TiKV.
#upload-speed-limit = "512MB"
# Minimum ratio of target store available space: store_available_space / store_capacity.
# Importer pauses uploading SST if the availability ratio of the target store is less than this
# value, to give PD enough time to balance regions.
min-available-ratio = 0.05
```
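
With the sample value above, `min-available-ratio = 0.05` means uploads pause whenever less than 5% of the target store's capacity remains free, giving PD time to balance regions before more SST files arrive.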

3. Run `tikv-importer`.
@@ -265,7 +277,7 @@ Download the TiDB-Lightning package (choose the same version as that of the TiDB
# The sum of these two values must not exceed the max-open-engines setting
# for tikv-importer.
index-concurrency = 2
table-concurrency = 8
table-concurrency = 6
# The concurrency number of data. It is set to the number of logical CPU
# cores by default. When deploying together with other components, you can
24 changes: 18 additions & 6 deletions tools/lightning/errors.md
@@ -45,7 +45,11 @@ Try the latest version! Maybe there is new speed improvement.

**Solutions**:

1. Delete the corrupted data with `tidb-lightning-ctl --error-checkpoint-destroy=all`, and restart Lightning to import the affected tables again.
1. Delete the corrupted data via `tidb-lightning-ctl`, and restart Lightning to import the affected tables again.

```sh
tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all
```

2. Consider using an external database to store the checkpoints (change `[checkpoint] dsn`) to reduce the target database's load.
@@ -55,7 +59,11 @@ Try the latest version! Maybe there is new speed improvement.
**Solutions**:
If the error was caused by invalid data source, delete the imported data using `tidb-lightning-ctl --error-checkpoint-destroy=all` and start Lightning again.
If the error was caused by an invalid data source, delete the imported data using `tidb-lightning-ctl` and start Lightning again.
```sh
tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all
```
See the [Checkpoints control](../../tools/lightning/checkpoints.md#checkpoints-control) section for other options.
@@ -65,13 +73,17 @@ See the [Checkpoints control](../../tools/lightning/checkpoints.md#checkpoints-c
**Solutions**:
1. Increase the value of `max-open-engine` setting in `tikv-importer.toml`. This value is typically dictated by the available memory. This could be calculated as:
1. Increase the value of the `max-open-engines` setting in `tikv-importer.toml`. This value is typically dictated by the available memory and can be calculated as:
Max Memory Usage ≈ `max-open-engine` × `write-buffer-size` × `max-write-buffer-number`
Max Memory Usage ≈ `max-open-engines` × `write-buffer-size` × `max-write-buffer-number`
2. Decrease the value of `table-concurrency` + `index-concurrency` so it is less than `max-open-engine`.
2. Decrease the value of `table-concurrency` + `index-concurrency` so that their sum does not exceed `max-open-engines` (see the example after this list).
3. Restart `tikv-importer` to forcefully remove all engine files. This also removes all partially imported tables, thus it is required to run `tidb-lightning-ctl --error-checkpoint-destroy=all`.
3. Restart `tikv-importer` to forcefully remove all engine files (stored in `./data.import/` by default). This also removes all partially imported tables, so you need to clear the outdated checkpoints afterwards:
```sh
tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all
```
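
As a quick check of the rule in step 2: with the sample settings from the deployment guide (`index-concurrency = 2`, `table-concurrency = 6`, `max-open-engines = 8`), the sum 2 + 6 = 8 does not exceed `max-open-engines`, so this error should not occur under normal operation.
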
## cannot guess encoding for input file, please convert to UTF-8 manually
83 changes: 63 additions & 20 deletions tools/lightning/monitor.md
@@ -63,26 +63,66 @@ Metrics provided by `tikv-importer` are listed under the namespace `tikv_import_

- **`tikv_import_rpc_duration`** (Histogram)

Bucketed histogram of importing RPC duration. Labels:
Bucketed histogram of the total duration needed to complete an RPC action (see the query example at the end of this section). Labels:

- **request**: RPC name, e.g. `open_engine`, `import_engine`, etc.
- **request**: `switch_mode` / `open_engine` / `write_engine` / `close_engine` / `import_engine` / `cleanup_engine` / `compact_cluster` / `upload` / `ingest` / `compact`
- **result**: `ok` / `error`

- **`tikv_import_write_chunk_bytes`** (Histogram)

Bucketed histogram of importing write chunk bytes.
Bucketed histogram of the uncompressed size of a block of KV pairs received from Lightning.

- **`tikv_import_write_chunk_duration`** (Histogram)

Bucketed histogram of importing write chunk duration.
Bucketed histogram of the time needed to receive a block of KV pairs from Lightning.

- **`tikv_import_upload_chunk_bytes`** (Histogram)

Bucketed histogram of importing upload chunk bytes.
Bucketed histogram of the compressed size of a chunk of an SST file uploaded to TiKV.

- **`tikv_import_upload_chunk_duration`** (Histogram)

Bucketed histogram of importing upload chunk duration.
Bucketed histogram of the time needed to upload a chunk of an SST file to TiKV.

- **`tikv_import_range_delivery_duration`** (Histogram)

Bucketed histogram of the time needed to deliver a range of KV pairs into a `dispatch-job`.

- **`tikv_import_split_sst_duration`** (Histogram)

Bucketed histogram of the time needed to split off a range from the engine file into a single SST file.

- **`tikv_import_sst_delivery_duration`** (Histogram)

Bucketed histogram of the time needed to deliver an SST file from a `dispatch-job` to an `ImportSSTJob`.

- **`tikv_import_sst_recv_duration`** (Histogram)

Bucketed histogram of the time needed to receive an SST file from a `dispatch-job` in an `ImportSSTJob`.

- **`tikv_import_sst_upload_duration`** (Histogram)

Bucketed histogram of the time needed to upload an SST file from an `ImportSSTJob` to a TiKV node.

- **`tikv_import_sst_chunk_bytes`** (Histogram)

Bucketed histogram of the compressed size of the SST file uploaded to a TiKV node.

- **`tikv_import_sst_ingest_duration`** (Histogram)

Bucketed histogram of the time needed to ingest an SST file into TiKV.

- **`tikv_import_each_phase`** (Gauge)

Indicates the running phase. Values can be 1, meaning running inside the phase, or 0, meaning outside the phase. Labels:

- **phase**: `prepare` / `import`

- **`tikv_import_wait_store_available_count`** (Counter)

Counts the number of times a TiKV node is found to have insufficient space when uploading SST files. Labels:

- **store_id**: The TiKV store ID.
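
Since these metrics are standard Prometheus histograms, they can be queried with `histogram_quantile`. The following is a sketch only, assuming Prometheus scrapes `tikv-importer` and is reachable at `localhost:9090`; adjust the address, quantile, and labels to your setup:

```sh
# 99th-percentile duration of import_engine RPCs over the last 5 minutes.
# Prometheus exposes histogram buckets under the metric name plus a "_bucket" suffix.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=histogram_quantile(0.99, sum(rate(tikv_import_rpc_duration_bucket{request="import_engine"}[5m])) by (le))'
```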

### `tidb-lightning`

@@ -98,7 +138,7 @@ Metrics provided by `tidb-lightning` are listed under the namespace `lightning_*

Counting idle workers. Values should be less than the `*-concurrency` settings and are typically zero. Labels:

- **name**: `table` / `region` / `io`
- **name**: `table` / `index` / `region` / `io` / `closed-engine`

- **`lightning_kv_encoder`** (Counter)

@@ -113,6 +153,13 @@ Metrics provided by `tidb-lightning` are listed under the namespace `lightning_*
- **state**: `pending` / `written` / `closed` / `imported` / `altered_auto_inc` / `checksum` / `analyzed` / `completed`
- **result**: `success` / `failure`

- **`lightning_engines`** (Counter)

Counting number of engine files processed and their status. Labels:

- **state**: `pending` / `written` / `closed` / `imported` / `completed`
- **result**: `success` / `failure`

- **`lightning_chunks`** (Counter)

Counting number of chunks processed and their status. Labels:
@@ -123,34 +170,30 @@ Metrics provided by `tidb-lightning` are listed under the namespace `lightning_*

Bucketed histogram of the time needed to import a table.

- **`lightning_block_read_seconds`** (Histogram)
- **`lightning_row_read_bytes`** (Histogram)

Bucketed histogram of the time needed to read a block of SQL rows from the data source.
Bucketed histogram of the size of a single SQL row.

- **`lightning_block_read_bytes`** (Histogram)
- **`lightning_row_encode_seconds`** (Histogram)

Bucketed histogram of the size of a block of SQL rows.
Bucketed histogram of the time needed to encode a single SQL row into KV pairs.

- **`lightning_block_encode_seconds`** (Histogram)
- **`lightning_row_kv_deliver_seconds`** (Histogram)

Bucketed histogram of the time needed to encode a block of SQL rows into KV pairs.
Bucketed histogram of the time needed to deliver a set of KV pairs corresponding to a single SQL row (see the query example at the end of this section).

- **`lightning_block_deliver_seconds`** (Histogram)

Bucketed histogram of the time needed to deliver a block of KV pairs.
Bucketed histogram of the time needed to deliver a block of KV pairs to Importer.

- **`lightning_block_deliver_bytes`** (Histogram)

Bucketed histogram of the size of a block of KV pairs.
Bucketed histogram of the uncompressed size of a block of KV pairs delivered to Importer.

- **`lightning_chunk_parser_read_block_seconds`** (Histogram)

Bucketed histogram of the time needed by the data file parser to read a block.

- **`lightning_chunk_parser_read_row_seconds`** (Histogram)

Bucketed histogram of the time needed by the data file parser to read a row.

- **`lightning_checksum_seconds`** (Histogram)

Bucketed histogram of the time taken to compute the checksum of a table.
Expand All @@ -159,4 +202,4 @@ Metrics provided by `tidb-lightning` are listed under the namespace `lightning_*

Bucketed histogram of the time taken to acquire an idle worker. Labels:

- **name**: `table` / `region` / `io`
- **name**: `table` / `index` / `region` / `io` / `closed-engine`
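
A similar sketch can help locate a slow stage on the Lightning side, for example by comparing the time spent encoding each row with the time spent delivering the resulting KV pairs (same assumption as above that Prometheus is reachable at `localhost:9090`):

```sh
# 95th-percentile time per SQL row: encoding vs delivery of the resulting KV pairs.
# If delivery dominates, the write path towards Importer is likely the bottleneck.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(lightning_row_encode_seconds_bucket[5m])) by (le))'
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(lightning_row_kv_deliver_seconds_bucket[5m])) by (le))'
```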
