cherry pick pingcap#2088 to release-3.0
Signed-off-by: sre-bot <sre-bot@pingcap.com>
anotherrachel authored and sre-bot committed Apr 3, 2020
1 parent be1fbc5 commit 2b7ed73
Showing 13 changed files with 275 additions and 25 deletions.
2 changes: 1 addition & 1 deletion TOC.md
Original file line number Diff line number Diff line change
Expand Up @@ -255,8 +255,8 @@
- [Certificate-Based Authentication](/reference/security/cert-based-authentication.md)
+ Transactions
- [Overview](/reference/transactions/overview.md)
- [Transaction Model](/reference/transactions/transaction-model.md)
- [Isolation Levels](/reference/transactions/transaction-isolation.md)
- [Optimistic Transactions](/reference/transactions/transaction-optimistic.md)
- [Pessimistic Transactions](/reference/transactions/transaction-pessimistic.md)
+ System Databases
- [`mysql`](/reference/system-databases/mysql.md)
Expand Down
16 changes: 15 additions & 1 deletion glossary.md
Expand Up @@ -6,6 +6,20 @@ category: glossary

# Glossary

## A

### ACID

ACID refers to the four key properties of a transaction: atomicity, consistency, isolation, and durability. Each of these properties is described below.

- **Atomicity** means that either all the changes of an operation are performed, or none of them are. TiDB ensures the atomicity of the [Region](#region) that stores the Primary Key to achieve the atomicity of transactions.

- **Consistency** means that transactions always bring the database from one consistent state to another. In TiDB, data consistency is ensured before writing data to the memory.

- **Isolation** means that a transaction in process is invisible to other transactions until it completes. This allows concurrent transactions to read and write data without sacrificing consistency. TiDB currently supports the isolation level of `REPEATABLE READ`.

- **Durability** means that once a transaction is committed, it remains committed even in the event of a system failure. TiKV uses persistent storage to ensure durability.

## L

### leader/follower/learner
Expand Down Expand Up @@ -64,4 +78,4 @@ Schedulers are components in PD that generate scheduling tasks. Each scheduler i

### Store

A store refers to the storage node in the TiKV cluster (an instance of `tikv-server`). Each store has a corresponding TiKV instance.
A store refers to the storage node in the TiKV cluster (an instance of `tikv-server`). Each store has a corresponding TiKV instance.
2 changes: 1 addition & 1 deletion key-features.md
Expand Up @@ -27,7 +27,7 @@ We believe that being able to replicate in both directions lowers the risk when

TiDB internally shards tables into small range-based chunks that we refer to as "Regions". Each Region defaults to approximately 100 MiB in size, and TiDB uses a two-phase commit internally to ensure that Regions are maintained in a transactionally consistent way.

Transactions in TiDB are strongly consistent, with snapshot isolation level consistency. For more information, see transaction [behavior and performance differences](/reference/transactions/transaction-model.md). This makes TiDB more comparable to traditional relational databases in semantics than some of the newer NoSQL systems using eventual consistency.
Transactions in TiDB are strongly consistent, with snapshot isolation level consistency. For more information, see transaction [behavior and performance differences](/reference/transactions/transaction-isolation.md). This makes TiDB more comparable to traditional relational databases in semantics than some of the newer NoSQL systems using eventual consistency.

These behaviors are transparent to your application(s), which only need to connect to TiDB using a MySQL 5.7 compatible client library.

Expand Down
Binary file added media/2pc-in-tidb.png
Binary file added media/optimistic-transaction-metric.png
Original file line number Diff line number Diff line change
Expand Up @@ -250,19 +250,19 @@ set @@global.tidb_distsql_scan_concurrency = 10

- Scope: SESSION | GLOBAL
- Default value: 10
- When a transaction encounters retryable errors (such as transaction conflicts, over slow transaction commit, or table schema changes), this transaction can be re-executed. This variable is used to set the maximum number of the retries.
- This variable is used to set the maximum number of retries. When a transaction encounters retryable errors (such as transaction conflicts, a very slow transaction commit, or table schema changes), the transaction is re-executed up to this number of times. Note that setting `tidb_retry_limit` to `0` disables the automatic retry.

### tidb_disable_txn_auto_retry

- Scope: SESSION | GLOBAL
- Default: on
- This variable is used to set whether to disable automatic retry of explicit transactions. The default value of `on` means that transactions will not automatically retry in TiDB and `COMMIT` statements might return errors that need to be handled in the application layer.
- This variable is used to set whether to disable the automatic retry of explicit transactions. The default value of `on` means that transactions will not automatically retry in TiDB and `COMMIT` statements might return errors that need to be handled in the application layer.

Setting the value to `off` means that TiDB will automatically retry transactions, resulting in fewer errors from `COMMIT` statements. Be careful when making this change, because it might result in lost updates.

This variable does not affect automatically committed implicit transactions and internally executed transactions in TiDB. The maximum retry count of these transactions is determined by the value of `tidb_retry_limit`.

To decide whether you can enable automatic retry, see [automatic retry and anomalies caused by automatic retry](/reference/transactions/transaction-isolation.md#automatic-retry-and-transactional-anomalies-caused-by-automatic-retry).
For more details, see [limits of retry](/reference/transactions/transaction-optimistic.md#limits-of-retry).
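
As a minimal sketch of how the two variables above might be combined, the following statements adjust them at the session scope. The retry count of `3` is an arbitrary illustrative value, not a recommendation, and whether automatic retry is safe depends on whether your application can tolerate lost updates.

```sql
-- Inspect the current values of both variables.
SHOW VARIABLES LIKE 'tidb_retry_limit';
SHOW VARIABLES LIKE 'tidb_disable_txn_auto_retry';

-- Opt in to automatic retry of explicit transactions for this session only.
-- Caution: this might result in lost updates.
SET @@session.tidb_disable_txn_auto_retry = off;

-- Cap the number of retries (3 is an assumed, illustrative value).
SET @@session.tidb_retry_limit = 3;

-- Alternatively, disable automatic retry entirely.
SET @@session.tidb_retry_limit = 0;
```

Changing the session scope first keeps the effect limited to one connection, which makes it easier to observe the behavior before changing the `GLOBAL` value.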

### tidb_backoff_weight

Expand Down
2 changes: 1 addition & 1 deletion reference/mysql-compatibility.md
Expand Up @@ -14,7 +14,7 @@ However, TiDB does not support some of MySQL features or behaves differently fro

> **Note:**
>
> This page refers to general differences between MySQL and TiDB. Please also see the dedicated pages for [Security](/reference/security/compatibility.md) and [Transaction Model](/reference/transactions/transaction-model.md) compatibility.
> This page refers to general differences between MySQL and TiDB. Refer to the dedicated pages for [Security](/reference/security/compatibility.md) and [Pessimistic Transaction Model](/reference/transactions/transaction-pessimistic.md#difference-with-mysql-innodb) compatibility.

## Unsupported features

Expand Down
2 changes: 1 addition & 1 deletion reference/sql/statements/load-data.md
Expand Up @@ -48,5 +48,5 @@ Records: 815264 Deleted: 0 Skipped: 0 Warnings: 0
## See also

* [INSERT](/reference/sql/statements/insert.md)
* [Transaction Model](/reference/transactions/transaction-model.md)
* [Optimistic Transaction Model](/reference/transactions/transaction-optimistic.md)
* [Import Example Database](/how-to/get-started/import-example-database.md)
4 changes: 2 additions & 2 deletions reference/sql/statements/select.md
Expand Up @@ -75,8 +75,8 @@ The `SELECT` statement is used to read data from TiDB.
|`HAVING where_condition` | The `HAVING` clause and the `WHERE` clause are both used to filter the results. The `HAVING` clause filters the results of `GROUP BY`, while the `WHERE` clause filters the results before aggregation. |
|`ORDER BY` | The `ORDER BY` clause is used to sort the data in ascending or descending order, based on columns, expressions or items in the `select_expr` list.|
|`LIMIT` | The `LIMIT` clause can be used to constrain the number of rows. `LIMIT` takes one or two numeric arguments. With one argument, the argument specifies the maximum number of rows to return, the first row to return is the first row of the table by default; with two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return.|
| `FOR UPDATE` | The `SELECT FOR UPDATE` clause locks all the data in the result sets to detect concurrent updates from other transactions. Data that match the query conditions but do not exist in the result sets are not read-locked, such as the row data written by other transactions after the current transaction is started. TiDB uses the [Optimistic Transaction Model](/reference/transactions/transaction-model.md). The transaction conflicts are not detected in the statement execution phase. Therefore, the current transaction does not block other transactions from executing `UPDATE`, `DELETE` or `SELECT FOR UPDATE` like other databases such as PostgreSQL. In the committing phase, the rows read by `SELECT FOR UPDATE` are committed in two phases, which means they can also join the conflict detection. If write conflicts occur, the commit fails for all transactions that include the `SELECT FOR UPDATE` clause. If no conflict is detected, the commit succeeds. And a new version is generated for the locked rows, so that write conflicts can be detected when other uncommitted transactions are being committed later.|
|`LOCK IN SHARE MODE` | To guarantee compatibility, TiDB parses these three modifiers, but will ignore them.|
| `FOR UPDATE` | The `SELECT FOR UPDATE` clause locks all the data in the result sets to detect concurrent updates from other transactions. Data that match the query conditions but do not exist in the result sets are not read-locked, such as the row data written by other transactions after the current transaction is started. TiDB uses the [Optimistic Transaction Model](/reference/transactions/transaction-optimistic.md), in which transaction conflicts are not detected in the statement execution phase. Therefore, unlike other databases such as PostgreSQL, the current transaction does not block other transactions from executing `UPDATE`, `DELETE` or `SELECT FOR UPDATE`. In the committing phase, the rows read by `SELECT FOR UPDATE` are committed in two phases, which means they also take part in the conflict detection. If write conflicts occur, the commit fails for all transactions that include the `SELECT FOR UPDATE` clause. If no conflict is detected, the commit succeeds, and a new version is generated for the locked rows so that write conflicts can be detected when other uncommitted transactions are committed later. When the pessimistic transaction model is used, the behavior is basically the same as in other databases. Refer to [Difference with MySQL InnoDB](/reference/transactions/transaction-pessimistic.md#difference-with-mysql-innodb) for details. |
|`LOCK IN SHARE MODE` | To guarantee compatibility, TiDB parses these three modifiers, but will ignore them. |

## Examples

Expand Down
61 changes: 55 additions & 6 deletions reference/transactions/overview.md
@@ -1,15 +1,15 @@
---
title: Transactions
summary: Learn how to use the distributed transaction statements.
summary: Learn about transactions in TiDB.
category: reference
aliases: ['/docs/sql/transaction/']
---

# Transactions

TiDB supports complete distributed transactions. This document introduces transaction-related statements, explicit and implicit transactions, isolation levels, and laziness checks for transactions.
TiDB supports complete distributed transactions. Both the [optimistic transaction model](/reference/transactions/transaction-optimistic.md) and the [pessimistic transaction model](/reference/transactions/transaction-pessimistic.md) (introduced in TiDB 3.0) are available. This document introduces transaction-related statements, explicit and implicit transactions, isolation levels, lazy check for constraints, and transaction sizes.

The common variables include `autocommit`, [`tidb_disable_txn_auto_retry`](/reference/configuration/tidb-server/tidb-specific-variables.md#tidb_disable_txn_auto_retry), and [`tidb_retry_limit`](/reference/configuration/tidb-server/tidb-specific-variables.md#tidb_retry_limit).
The common variables include [`autocommit`](#autocommit), [`tidb_disable_txn_auto_retry`](/reference/configuration/tidb-server/tidb-specific-variables.md#tidb_disable_txn_auto_retry), and [`tidb_retry_limit`](/reference/configuration/tidb-server/tidb-specific-variables.md#tidb_retry_limit).

## Common syntax

Expand Down Expand Up @@ -71,9 +71,9 @@ Syntax:
SET autocommit = {0 | 1}
```

If you set the value of `autocommit` to `1`, the status of the current session is autocommit. If you set the value of `autocommit` to `0`, the status of the current session is non-autocommit. The value of `autocommit` is `1` by default.
When `autocommit = 1` (default), the status of the current session is autocommit. That is, statements are automatically committed immediately following their execution.

When autocommit is enabled, statements are automatically committed immediately following their execution. When autocommit is disabled, statements are only committed when you execute the `COMMIT` statement.
When `autocommit = 0`, the status of the current session is non-autocommit. That is, statements are only committed when you manually execute the `COMMIT` statement.
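
The difference between the two settings can be sketched as follows. The table `my_table` and its columns `id` and `a` are illustrative assumptions, not objects that TiDB provides.

```sql
-- Autocommit enabled (default): the UPDATE is committed as soon as it finishes.
SET autocommit = 1;
UPDATE my_table SET a = 'new_value' WHERE id = 1;

-- Autocommit disabled: the UPDATE stays uncommitted until COMMIT is executed.
SET autocommit = 0;
UPDATE my_table SET a = 'newer_value' WHERE id = 1;
COMMIT;

-- Restore the default for the rest of the session.
SET autocommit = 1;
```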

> **Note:**
>
Expand All @@ -95,7 +95,7 @@ SET @@GLOBAL.autocommit = {0 | 1};

## Explicit and implicit transaction

TiDB supports explicit transactions (`[BEGIN|START TRANSACTION]` and `COMMIT`) and implicit transactions (`SET autocommit = 1`).
TiDB supports explicit transactions (use `[BEGIN|START TRANSACTION]` and `COMMIT` to define the start and end of the transaction) and implicit transactions (`SET autocommit = 1`).

If you set the value of `autocommit` to `1` and start a new transaction through the `[BEGIN|START TRANSACTION]` statement, autocommit is disabled until `COMMIT` or `ROLLBACK` is executed, which makes the transaction explicit.

Expand Down Expand Up @@ -154,3 +154,52 @@ rollback;
```

In the above example, the second `insert` statement fails, and this transaction does not insert any data into the database because `rollback` is called.

## Transaction sizes

In TiDB, a transaction that is either too small or too large can impair the overall performance.

### Small transactions

TiDB uses the default autocommit setting (that is, `autocommit = 1`), which automatically issues a commit when executing each SQL statement. Therefore, each of the following three statements is treated as a transaction:

```sql
UPDATE my_table SET a = 'new_value' WHERE id = 1;
UPDATE my_table SET a = 'newer_value' WHERE id = 2;
UPDATE my_table SET a = 'newest_value' WHERE id = 3;
```

In this case, latency increases because each statement, as its own transaction, goes through the two-phase commit, which takes more execution time.

To improve the execution efficiency, you can use an explicit transaction instead, that is, to execute the above three statements within a transaction:

```sql
START TRANSACTION;
UPDATE my_table SET a = 'new_value' WHERE id = 1;
UPDATE my_table SET a = 'newer_value' WHERE id = 2;
UPDATE my_table SET a = 'newest_value' WHERE id = 3;
COMMIT;
```

Similarly, it is recommended to execute `INSERT` statements within an explicit transaction.
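
For `INSERT` statements, the same pattern might look like the following sketch. The table `my_table`, its columns, and the inserted values are illustrative assumptions.

```sql
-- Group several INSERT statements so that they share one two-phase commit.
START TRANSACTION;
INSERT INTO my_table (id, a) VALUES (4, 'value_4');
INSERT INTO my_table (id, a) VALUES (5, 'value_5');
INSERT INTO my_table (id, a) VALUES (6, 'value_6');
COMMIT;
```

If any statement fails and the application issues `ROLLBACK` instead of `COMMIT`, none of the rows are inserted.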

> **Note:**
>
> The single-threaded workloads in TiDB might not fully use TiDB's distributed resources, so the performance of TiDB is lower than that of a single-instance deployment of MySQL. This difference is similar to the case of transactions with higher latency in TiDB.

### Large transaction

Due to the requirement of the two-phase commit, a large transaction can lead to the following issues:

* OOM (Out of Memory) when excessive data is written in the memory
* More conflicts in the prewrite phase
* Long duration before transactions are actually committed

Therefore, TiDB intentionally imposes some limits on transaction sizes:

* The total number of SQL statements in a transaction is no more than 5,000 (default)
* Each key-value pair is no more than 6 MB

For each transaction, it is recommended to keep the number of SQL statements between 100 and 500 to achieve optimal performance.

TiDB sets a default limit of 100 MB for the total size of key-value pairs, which can be modified by the `txn-total-size-limit` configuration item in the configuration file. The maximum value of `txn-total-size-limit` is 10 GB. The actual size limit of one transaction also depends on the memory capacity. When executing large transactions, the memory usage of the TiDB process is approximately 6 times larger than the total size of transactions.
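
One common way to stay under these limits is to split a large modification into smaller batches. The following is a minimal sketch, assuming an illustrative table `my_table` and filter value. The batch size of 500 rows is an arbitrary choice and is unrelated to the statement-count recommendation above.

```sql
-- With autocommit = 1 (the default), each DELETE below is its own transaction,
-- so no single transaction has to hold all of the matching rows.
DELETE FROM my_table WHERE a = 'old_value' LIMIT 500;

-- Re-run the same statement from the application
-- until it reports 0 affected rows.
DELETE FROM my_table WHERE a = 'old_value' LIMIT 500;
```

The trade-off is that the batches are no longer atomic as a whole: if the process stops midway, the rows deleted by earlier batches stay deleted.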