ClickHouse 作用于主键判断词的单调函数优化
ClickHouse 会通过主键剪枝不要的数据块(granules),但如果主键被其他函数包裹:
- 若函数是单调的:可以剪枝(pruning)
- 若函数非单调:全表扫描
例子
Github Issue
作者通过 toString
调用主键,却无法命中主键索引,详见 链接。
下面贴出最小例子,可以 ClickHouse Playground 简单验证。
单调函数
CREATE TABLE users (uid Int16, name String, age Int16) ENGINE=MergeTree() ORDER BY (uid);
INSERT INTO users VALUES (1, 'John', 33);
INSERT INTO users VALUES (2, 'Ksenia', 48);
INSERT INTO users VALUES (3, 'Alice', 50);
EXPLAIN indexes=1 SELECT count(1) FROM users WHERE (uid + 2) > 3;
Condition
成功剪枝:
Expression ((Projection + Before ORDER BY))
Aggregating
Expression (Before GROUP BY)
Filter (WHERE)
ReadFromMergeTree (default.users)
Indexes:
PrimaryKey
Keys:
uid
Condition: (plus(uid, 2) in [4, +Inf))
Parts: 2/3
Granules: 2/3
非单调函数
CREATE TABLE users (uid Int16, name String, age Int16) ENGINE=MergeTree() ORDER BY (uid);
INSERT INTO users VALUES (1, 'John', 33);
INSERT INTO users VALUES (2, 'Ksenia', 48);
INSERT INTO users VALUES (3, 'Alice', 50);
EXPLAIN indexes=1 SELECT count(1) FROM users WHERE (uid * 2) > 3;
Condition
无法优化,全表扫描:
Expression ((Projection + Before ORDER BY))
Aggregating
Expression (Before GROUP BY)
Filter (WHERE)
ReadFromMergeTree (default.users)
Indexes:
PrimaryKey
Condition: true
Parts: 3/3
Granules: 3/3
源码
ClickHouse 版本
本节源码采用 ClickHouse 版本 23.4,时间 2023-06-01。
主键判断词优化主要在 src/Storages/MergeTree/KeyCondition.cpp 和 src/Storages/MergeTree/KeyCondition.h 中。