pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/

Home Page:https://pingcap.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

memory consumption improvement for handling partitioned table

mzhang77 opened this issue · comments

Feature Request

Is your feature request related to a problem? Please describe:
Memory consumption increase with number of partitions in table:

# 16 partitions --> 5GB memory 
MySQL> explain analyze select * from tb1 order by pk1,pk2,pk3 asc limit 125268248, 8947732;
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| id                         | estRows      | actRows   | task      | access object            | execution info                                                                                                                                                                                                                                                                                                                                                            | operator info                   | memory  | disk |
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| Limit_17                   | 134215980.00 | 8947732   | root      |                          | time:1m51.7s, loops:8740, RU:770310.452994                                                                                                                                                                                                                                                                                                                                | offset:125268248, count:8947732 | N/A     | N/A  |
| └─TableReader_16           | 134215980.00 | 134215980 | root      | partition:all            | time:1m51.6s, loops:131071, cop_task: {num: 6552, max: 259.7ms, min: 885.5µs, avg: 53.7ms, p95: 139.8ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 4m10.1s, tot_wait: 991ms, rpc_num: 6555, rpc_time: 5m51.4s, copr_cache_hit_ratio: 0.00, build_task_duration: 1.25ms, max_distsql_concurrency: 2}, backoff{regionMiss: 6ms}                                 | data:Limit_15                   | 5.00 GB | N/A  |
|   └─Limit_15               | 134215980.00 | 134449519 | cop[tikv] |                          | tikv_task:{proc max:136ms, min:0s, avg: 29.2ms, p80:64ms, p95:76ms, iters:157102, tasks:6552}, scan_detail: {total_process_keys: 134449519, total_process_keys_size: 39759319473, total_keys: 134456071, get_snapshot_time: 538.2ms, rocksdb: {key_skipped_count: 158332965, block: {cache_hit_count: 25882, read_count: 1285158, read_byte: 8.16 GB, read_time: 3.43s}}} | offset:0, count:134215980       | N/A     | N/A  |
|     └─TableFullScan_14     | 134215980.00 | 134449519 | cop[tikv] | table:tb1                | tikv_task:{proc max:136ms, min:0s, avg: 29.1ms, p80:64ms, p95:76ms, iters:157102, tasks:6552}                                                                                                                                                                                                                                                                             | keep order:true                 | N/A     | N/A  |
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
4 rows in set (1 min 51.76 sec)
# 32 partitions --> 10GB memory
MySQL [ep_foundation_core_entities]> explain analyze select * from tb1 order by pk1,pk2,pk3 asc limit 177730960, 8886548;
+----------------------------+--------------+-----------+-----------+-------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| id                         | estRows      | actRows   | task      | access object                 | execution info                                                                                                                                                                                                                                                                                                                                                            | operator info                   | memory  | disk |
+----------------------------+--------------+-----------+-----------+-------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| Limit_17                   | 186617508.00 | 8886538   | root      |                               | time:2m39s, loops:8680, RU:983811.667225                                                                                                                                                                                                                                                                                                                                  | offset:177730960, count:8886548 | N/A     | N/A  |
| └─TableReader_16           | 186617508.00 | 186617498 | root      | partition:all                 | time:2m39s, loops:182245, cop_task: {num: 8947, max: 689.9ms, min: 871µs, avg: 68.9ms, p95: 234.8ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 5m49.8s, tot_wait: 1.55s, rpc_num: 8947, rpc_time: 10m16.4s, copr_cache_hit_ratio: 0.00, build_task_duration: 1.23ms, max_distsql_concurrency: 2}                                                              | data:Limit_15                   | 10.4 GB | N/A  |
|   └─Limit_15               | 186617508.00 | 186617498 | cop[tikv] |                               | tikv_task:{proc max:176ms, min:0s, avg: 29.7ms, p80:64ms, p95:80ms, iters:217469, tasks:8947}, scan_detail: {total_process_keys: 186617498, total_process_keys_size: 56687627028, total_keys: 186626445, get_snapshot_time: 917.4ms, rocksdb: {key_skipped_count: 226889259, block: {cache_hit_count: 39229, read_count: 1844413, read_byte: 9.31 GB, read_time: 4.53s}}} | offset:0, count:186617508       | N/A     | N/A  |
|     └─TableFullScan_14     | 186617508.00 | 186617498 | cop[tikv] | table:tb1                     | tikv_task:{proc max:176ms, min:0s, avg: 29.7ms, p80:64ms, p95:80ms, iters:217469, tasks:8947}                                                                                                                                                                                                                                                                             | keep order:true                 | N/A     | N/A  |
+----------------------------+--------------+-----------+-----------+-------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
4 rows in set (2 min 39.06 sec)
# 64 partitions --> 20GB memory
MySQL [ep_foundation_core_entities]> explain analyze select * from tb1 order by pk1,pk2,pk3 asc limit 125268248, 8947732;
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| id                         | estRows      | actRows   | task      | access object            | execution info                                                                                                                                                                                                                                                                                                                                                            | operator info                   | memory  | disk |
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
| Limit_17                   | 134215980.00 | 8947732   | root      |                          | time:2m19.6s, loops:8740, RU:974951.967167                                                                                                                                                                                                                                                                                                                                | offset:125268248, count:8947732 | N/A     | N/A  |
| └─TableReader_16           | 134215980.00 | 134215980 | root      | partition:all            | time:2m19.5s, loops:131071, cop_task: {num: 6272, max: 2.99s, min: 649µs, avg: 201.9ms, p95: 644ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 3m43.5s, tot_wait: 847.3ms, rpc_num: 6273, rpc_time: 21m6.3s, copr_cache_hit_ratio: 0.02, build_task_duration: 1.43ms, max_distsql_concurrency: 2}, backoff{regionMiss: 2ms}                                    | data:Limit_15                   | 21.6 GB | N/A  |
|   └─Limit_15               | 134215980.00 | 135363725 | cop[tikv] |                          | tikv_task:{proc max:116ms, min:0s, avg: 25.2ms, p80:56ms, p95:68ms, iters:156854, tasks:6272}, scan_detail: {total_process_keys: 134438250, total_process_keys_size: 39771543878, total_keys: 134444406, get_snapshot_time: 486.8ms, rocksdb: {key_skipped_count: 158470825, block: {cache_hit_count: 1309865, read_count: 2880, read_byte: 13.4 MB, read_time: 19.4ms}}} | offset:0, count:134215980       | N/A     | N/A  |
|     └─TableFullScan_14     | 134215980.00 | 135363725 | cop[tikv] | table:tb1                | tikv_task:{proc max:116ms, min:0s, avg: 25.2ms, p80:56ms, p95:68ms, iters:156854, tasks:6272}                                                                                                                                                                                                                                                                             | keep order:true                 | N/A     | N/A  |
+----------------------------+--------------+-----------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+---------+------+
4 rows in set (2 min 19.62 sec)

# table defintion
CREATE TABLE `tb1` (
  `pk1` bigint(20) NOT NULL,
  `pk2` bigint(20) NOT NULL,
  `pk3` varchar(101) NOT NULL,
  `column1` bigint(20) NOT NULL,
  `column2` varchar(100) DEFAULT 'string',
  `column3` bigint(20) DEFAULT NULL,
  `column4` bigint(20) NOT NULL,
  `column5` bigint(20) NOT NULL,
  `column6` bigint(20) NOT NULL,
  `column7` varchar(100) NOT NULL,
  PRIMARY KEY (`pk1`,`pk2`,`pk3`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY HASH (`pk1`) PARTITIONS 16

Describe the feature you'd like:
tidb fetches limit+offset number of rows from every partition into memory, then do sort merge, that's why memory consumption is proportional to number of partitions.

tidb could do sort merge on the fly while fetching data from all partitions, so that total memory consumption will not be proportional to number of partitions.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:
The above SQLs are from PingCAP paid customer.