Precendence Climbing
优先级爬升 是 运算符优先级解析器 中一种紧凑、高效且灵活的实现算法,是当前业界广泛采用的方案。[1]
优先级
算法将一个运算表达式当成数个嵌套的子表达式组成,每个表达式的优先级是其最低优先级运算符的优先级。
举例:
2 + 3 * 4 * 5 - 6
+
、-
的优先级为 1,*
、/
的优先级为 2,因此:
2 + 3 * 4 * 5 - 6
|---------------| : prec 1
|-------| : prec 2
更复杂的例子,引入优先级为 3 的 ^
运算符:
2 + 3 ^ 2 * 3 + 4
|---------------| : prec 1
|-------| : prec 2
|---| : prec 3
关联性
双目运算符除了优先级还有关联性的概念,左关联运算符更倾向于左操作数,右关联运算符更倾向于右操作数。
举例:
+
、-
、*
、/
是左关联运算符
2 + 3 + 4 = (2 + 3) + 4
^
是右关联运算符
2 ^ 3 ^ 4 = 2 ^ (3 ^ 4)
算法
首先定义几个概念:
- Atoms(原子):数字或者括号内的表达式
- Expressions(表达式):由被操作符连接起来的原子组成
优先级爬升 是面向操作符(operator)的算法,伪代码如下:
compute_expr(min_prec):
result = compute_atom()
while cur token is a binary operator with precedence >= min_prec:
prec, assoc = precedence and associativity of current token
if assoc is left:
next_min_prec = prec + 1
else:
next_min_prec = prec
rhs = compute_expr(next_min_prec)
result = compute operator(result, rhs)
return result
例子:
2 + 3 ^ 2 * 3 + 4
解析流程如下:
* compute_expr(1) # Initial call on the whole expression
* compute_atom() --> 2
* compute_expr(2) # Loop entered, operator '+'
* compute_atom() --> 3
* compute_expr(3)
* compute_atom() --> 2
* result --> 2 # Loop not entered for '*' (prec < '^')
* result = 3 ^ 2 --> 9
* compute_expr(3)
* compute_atom() --> 3
* result --> 3 # Loop not entered for '+' (prec < '*')
* result = 9 * 3 --> 27
* result = 2 + 27 --> 29
* compute_expr(2) # Loop entered, operator '+'
* compute_atom() --> 4
* result --> 4 # Loop not entered - end of expression
* result = 29 + 4 --> 33
Precedence climbing - what it aims to achieve
So the basic goal of the algorithm is the following: treat an expression as a bunch of nested sub-expressions, where each sub-expression has in common the lowest precedence level of the the operators it contains.[2]
Associativity
Binary operators, in addition to precedence, also have the concept of associativity. Simply put, left associative operators stick to the left stronger than to the right; right associative operators vice versa.[2:1]