Project 3 Leaderboard Optimization

This is a walkthrough of the leaderboard portion of my CMU 15-445 Project 3; it does not cover the preceding baseline implementation. The end result: an average of under 2000 across the three queries, landing in the top ten of the leaderboard.


Source: joey-wang

Query 1: TopNPerGroupPlan

Start from the example:

CREATE TABLE t1(x INT, y INT, z INT);
SELECT x, y FROM (
  SELECT x, y, rank() OVER (partition by x order by y) as rank
  FROM t1
) WHERE rank <= 3;

First, use explain in the Bustub shell to get the current plan tree and analyze how the query executes:

=== OPTIMIZER ===
Projection { exprs=["#0.0", "#0.1"] }
  Filter { predicate=(#0.2<=3) }
    WindowFunc {
      columns=#0.0, #0.1, placeholder,
      window_functions={
        2=>{ function_arg=1, type=rank, partition_by=["#0.0"], order_by=[("Default", "#0.1")] }
      }
    }
      SeqScan { table=t1 }

There are four plan nodes, executed in this order:

SeqScan -> WindowFunc -> Filter -> Projection

When the nodes appear in the order Projection <- Filter <- WindowFunc, and the Projection drops the rank column produced by the WindowFunc, the three nodes can be merged into a single TopNPerGroupPlan, which returns each group's top-N tuples directly after the scan:

SeqScan -> TopNPerGroupPlan

So we can add a new optimization pattern in optimizer.h:

auto OptimizeProjectionFilterWindowAsTopGroupN(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef;

You will notice that each of bustub's built-in optimizer rules lives in its own .cpp file, so for consistency you would naturally create a new OptimizeProjectionFilterWindowAsTopGroupN.cpp and add it to CMakeLists. Unfortunately, Gradescope does not allow new files: to test online, you must implement the rule in the designated optimizer_custom_rules.cpp. This makes optimizer_custom_rules.cpp increasingly bloated; hopefully the course improves this someday...

The core idea is a special priority queue that retains multiple tuples, keeping the top N tuples for each group. Note that tuples that compare equal must all be kept; see p3.leaderboard_test_q1_window for an example. A hash map handles the grouping. The flow: take a tuple, compute its group value, use that value as the key to find the group's priority queue in the hash map, then insert. The priority queue just needs to be customized to automatically retain only the top N tuples.
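This flow can be sketched with a self-contained toy version over (group, value) integer pairs. Plain ints stand in for tuples and group keys here, and BoundedHeap and TopNPerGroup are illustrative names rather than bustub code; the eviction rule mirrors the tie-keeping behavior described above (a value tied with the N-th smallest is kept):

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <unordered_map>
#include <utility>
#include <vector>

// Bounded max-heap keeping the values with rank() <= n, ties included:
// the largest value is evicted only if every copy of it can go while
// still leaving at least n entries behind.
class BoundedHeap {
 public:
  explicit BoundedHeap(std::size_t n) : n_(n) {}

  void Push(int v) {
    pq_.push(v);
    ++freq_[v];
    while (pq_.size() > n_ && pq_.size() - freq_[pq_.top()] >= n_) {
      int top = pq_.top();
      std::size_t cnt = freq_[top];
      freq_.erase(top);
      for (std::size_t i = 0; i < cnt; ++i) pq_.pop();  // drop every copy
    }
  }

  std::vector<int> Drain() {
    std::vector<int> out;
    while (!pq_.empty()) { out.push_back(pq_.top()); pq_.pop(); }
    return out;  // descending order; output order does not matter here
  }

 private:
  std::size_t n_;
  std::priority_queue<int> pq_;                // max-heap on values
  std::unordered_map<int, std::size_t> freq_;  // value -> multiplicity
};

// group -> values whose rank() within the group is <= n
std::unordered_map<int, std::vector<int>> TopNPerGroup(
    const std::vector<std::pair<int, int>> &rows, std::size_t n) {
  std::unordered_map<int, BoundedHeap> heaps;
  for (auto [group, value] : rows) {
    heaps.emplace(group, BoundedHeap(n)).first->second.Push(value);
  }
  std::unordered_map<int, std::vector<int>> out;
  for (auto &[group, heap] : heaps) out[group] = heap.Drain();
  return out;
}
```

For example, with rows (1,5), (1,3), (1,3), (1,9) and n = 2, the two 3s have rank 1 and the 5 has rank 3, so group 1 keeps only the two 3s.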

OptimizeProjectionFilterWindowAsTopGroupN itself inspects three plan nodes in turn: if they match the pattern "Projection <- Filter <- WindowFunc, with the Projection dropping the rank column produced by the WindowFunc", they are rewritten into a TopNPerGroup node. Use explain (o) in the shell to verify the optimization fires.

The implementation:

auto Optimizer::OptimizeProjectionFilterWindowAsTopGroupN(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef {
  std::vector<AbstractPlanNodeRef> optimized_children{};
  for (const auto &child : plan->children_) {
    optimized_children.emplace_back(OptimizeProjectionFilterWindowAsTopGroupN(child));
  }
  auto optimized_plan = plan->CloneWithChildren(std::move(optimized_children));

  // First node is Projection.
  if (optimized_plan->GetType() == PlanType::Projection) {
    const auto &projection_plan = dynamic_cast<const ProjectionPlanNode &>(*optimized_plan);
    auto &first_child = projection_plan.GetChildren().at(0);
    // Second node is Filter.
    if (first_child->GetType() == PlanType::Filter) {
      const auto &filter_child = dynamic_cast<const FilterPlanNode &>(*first_child);
      auto &second_child = filter_child.GetChildren().at(0);
      // Third node is Window.
      if (second_child->GetType() == PlanType::Window) {
        const auto &window_child = dynamic_cast<const WindowFunctionPlanNode &>(*second_child);
        // Try to find the rank column's index.
        int rank_idx = -1;
        std::string rank_name;
        for (size_t col = 0; col < window_child.columns_.size(); ++col) {
          if (window_child.window_functions_.find(col) != window_child.window_functions_.end()) {
            // This column is produced by a window function.
            if (window_child.window_functions_.at(col).type_ == WindowFunctionType::Rank) {
              rank_idx = col;
              rank_name = window_child.OutputSchema().GetColumn(rank_idx).GetName();
              break;
            }
          }
        }
        // No rank column: nothing to optimize.
        if (rank_idx == -1) {
          goto OptimizeProjectionFilterWindowAsTopGroupN_END;
        }
        // If the rank column survives the projection, we cannot drop it.
        for (const auto &column : projection_plan.OutputSchema().GetColumns()) {
          if (column.GetName() == rank_name) {
            goto OptimizeProjectionFilterWindowAsTopGroupN_END;
          }
        }
        // Projection removed rank: optimize into a TopNPerGroup node.
        // Find the group bys.
        const std::vector<AbstractExpressionRef> &group_by =
            window_child.window_functions_.begin()->second.partition_by_;
        // Find the order bys.
        const std::vector<std::pair<OrderByType, AbstractExpressionRef>> &order_by =
            window_child.window_functions_.begin()->second.order_by_;
        // Find N from the filter predicate `rank <= N`.
        auto value_expr = dynamic_cast<const ConstantValueExpression &>(*filter_child.predicate_->GetChildAt(1));
        int n = value_expr.val_.CastAs(TypeId::INTEGER).GetAs<int>();
        return std::make_shared<TopNPerGroupPlanNode>(projection_plan.output_schema_, window_child.children_.at(0),
                                                      group_by, order_by, n);
      }
    }
  }

OptimizeProjectionFilterWindowAsTopGroupN_END:
  return optimized_plan;
}

In topn_per_group_executor.cpp:

void TopNPerGroupExecutor::Init() {
  child_executor_->Init();
  partition_top_n_.clear();

  size_t top_n = plan_->n_;
  auto group_by = plan_->GetGroupBy();
  CompareTuple comparator(&plan_->GetOrderBy(), &child_executor_->GetOutputSchema());

  // Use a priority queue to store the top N tuples for each group.
  Tuple tuple;
  RID rid;
  while (child_executor_->Next(&tuple, &rid)) {  // O(n log n)
    AggregateKey group_by_key = ConstructGroupByTuple(tuple, group_by);
    auto it = partition_top_n_.find(group_by_key);
    if (it == partition_top_n_.end()) {
      it = partition_top_n_.emplace(group_by_key, TopNTuplePriorityQueue(top_n, comparator)).first;
    }
    it->second.Push(tuple);
  }

  iter_ = partition_top_n_.begin();
}

auto TopNPerGroupExecutor::Next(Tuple *tuple, RID *rid) -> bool {
  if (iter_ != partition_top_n_.end()) {
    if (iter_->second.Empty()) {
      // Current group's queue is drained: erase it and move to the next one.
      iter_ = partition_top_n_.erase(iter_);
      return Next(tuple, rid);
    }
    *tuple = iter_->second.Pop();
    *rid = tuple->GetRid();
    return true;
  }
  return false;
}

The customized priority queue:

class TopNTuplePriorityQueue {
 public:
  explicit TopNTuplePriorityQueue(std::size_t n, CompareTuple comparator)
      : n_(n), pq_(comparator), freq_map_(comparator) {}

  void Push(const Tuple &tuple) {
    pq_.push(tuple);
    freq_map_[tuple] += 1;
    MaintainSize();
  }

  auto Pop() -> Tuple {
    if (pq_.empty()) [[unlikely]] {
      throw std::runtime_error("TopNTuplePriorityQueue is empty");
    }
    Tuple top_tuple = pq_.top();
    pq_.pop();
    return top_tuple;
  }

  auto Empty() -> bool { return pq_.empty(); }

 private:
  std::size_t n_;
  std::priority_queue<Tuple, std::vector<Tuple>, CompareTuple> pq_;
  std::map<Tuple, size_t, CompareTuple> freq_map_;

  void MaintainSize() {
    while (pq_.size() > n_) {
      // If removing every copy of the top tuple would leave fewer than n, stop.
      if (pq_.size() - freq_map_[pq_.top()] < n_) {
        break;
      }
      // Remove every copy of the top tuple.
      while (freq_map_[pq_.top()] > 1) {
        freq_map_[pq_.top()] -= 1;
        pq_.pop();
      }
      freq_map_.erase(pq_.top());
      pq_.pop();
    }
  }
};

The TupleComparator and a few other bits in top_n_per_group.h are left for the reader; the idea is exactly as described above. Note that the output does not need to follow any particular order such as ascending; returning the right set of tuples is enough.
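A sketch of what such a comparator can look like, over plain int rows. Row, OrderBy, and CompareRow are simplified stand-ins for bustub's Tuple, the (OrderByType, expression) pairs, and the real TupleComparator; the point is the strict weak ordering over multiple order-by keys:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Row = std::vector<int>;
// Each entry: (column index, descending?). Simplified stand-in for
// bustub's std::pair<OrderByType, AbstractExpressionRef> list.
using OrderBy = std::vector<std::pair<std::size_t, bool>>;

// Strict weak ordering: returns true if a sorts before b. Used as the
// heap comparator, the "largest" row under this ordering is the first
// candidate for eviction from the bounded priority queue.
struct CompareRow {
  const OrderBy *order_by;
  bool operator()(const Row &a, const Row &b) const {
    for (auto [col, desc] : *order_by) {
      if (a[col] == b[col]) continue;  // tie on this key, try the next one
      return desc ? a[col] > b[col] : a[col] < b[col];
    }
    return false;  // rows equal on every order-by key
  }
};
```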

This passes leaderboard test q1. My time was a bit over 7000, ranking around tenth, which is on the slow side among those who optimized, so there was clearly room for improvement.

Time for some profiling 👀:
(profiling screenshot)
It turns out **MaintainSize()** in TopNTuplePriorityQueue costs more than I expected: MaintainSize accounts for over half of Push's time, and **freq_map_[]** accounts for over half of MaintainSize's. So a red-black-tree-based map will not do; std::map<Tuple, size_t, CompareTuple> freq_map_ can be replaced with an unordered_map.

void MaintainSize() {
  while (pq_.size() > n_) {
    // If removing every copy of the top tuple would leave fewer than n, stop.
    uint32_t top_tuple_freq = freq_map_[pq_.top()];
    if (pq_.size() - top_tuple_freq < n_) {
      break;
    }
    // Remove every copy of the top tuple.
    freq_map_.erase(pq_.top());
    for (size_t i = 0; i < top_tuple_freq; ++i) {
      pq_.pop();
    }
  }
}

...

std::unordered_map<Tuple, size_t, TupleHash, TupleEqual> freq_map_;

Re-testing, q1 dropped from 7000+ straight to 4000+, more than 40% faster. Profiling really does pay off as a guide for optimization.

Profiling again after the optimization:
(profiling screenshot)
Push() is now much cheaper. Still, some wizards on the leaderboard get down to 1000+; I have no idea how they do it.

Query 2: Too Many Joins!

Analysis

A villager from Yanhui Village wrote the following SQL:

CREATE TABLE t4(x int, y int);
CREATE TABLE t5(x int, y int);
CREATE TABLE t6(x int, y int);

SELECT * FROM t4, t5, t6
WHERE (t4.x = t5.x) AND (t5.y = t6.y) AND (t4.y >= 1000000)
AND (t4.y < 1500000) AND (t6.x >= 100000) AND (t6.x < 150000);

Clearly the intent is to join t4 and t5 on x, and t5 and t6 on y, but instead of merging step by step, all the conditions were dumped into one WHERE clause. Without optimization, executing this query takes two Cartesian products (unconditional joins) over t4, t5, and t6, producing a huge intermediate table, and then filters by scanning that table.

Of course, you have already implemented part of the optimizer, including hash joins, so with my earlier implementation this actually becomes a hash-join-based Cartesian product of t4 and t5, then a keyed hash join with t6, and finally a filter node.

The exact plan tree can be inspected with explain (o):

=== OPTIMIZER ===
Filter { predicate=(((((#0.0=#0.2)and(#0.1>=1000000))and(#0.1<1500000))and(#0.4>=100000))and(#0.4<150000)) }
  | (t4.x:INTEGER, t4.y:INTEGER, t5.x:INTEGER, t5.y:INTEGER, t6.x:INTEGER, t6.y:INTEGER)
  HashJoin { type=Inner, left_key=["#0.3"], right_key=["#1.1"] }
    | (t4.x:INTEGER, t4.y:INTEGER, t5.x:INTEGER, t5.y:INTEGER, t6.x:INTEGER, t6.y:INTEGER)
    HashJoin { type=Inner, left_key=[], right_key=[] }
      | (t4.x:INTEGER, t4.y:INTEGER, t5.x:INTEGER, t5.y:INTEGER)
      SeqScan { table=t4 } | (t4.x:INTEGER, t4.y:INTEGER)
      SeqScan { table=t5 } | (t5.x:INTEGER, t5.y:INTEGER)
    SeqScan { table=t6 } | (t6.x:INTEGER, t6.y:INTEGER)

Exactly what you see here depends on how you implemented Optimizer::OptimizeNLJAsHashJoin() earlier.

About Hash Join

OptimizeNLJAsHashJoin is actually fairly fiddly. Judging by the Gradescope leaderboard, many people implemented it incorrectly and only passed the basic tests because those tests are not thorough; the leaderboard tests then expose the bugs.

So here is my hash join extraction approach: recursively walk the tree of NLJ_plan.predicate; whenever we find an Equal comparison whose two children are both ColumnValueExpression with their GetTupleIdx() on opposite sides, add it to the hash join keys and remove it from the expression tree. Whatever remains of the expression tree afterwards is used to build a Filter placed above the hash join. When generating that filter, remember to adjust the col_idx of each ColumnValueExpression.
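The key-extraction recursion can be sketched self-contained over a toy expression tree. Expr, ExtractJoinKeys, and the side/column fields are illustrative stand-ins for bustub's expression classes; kOther stands for anything that cannot become a hash-join key (inequalities, constants, same-side comparisons):

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy expression tree: AND nodes over leaf comparisons. kEq leaves record
// which plan side (0 = left child, 1 = right child) each column comes from.
struct Expr {
  enum Kind { kAnd, kEq, kOther };
  Kind kind = kOther;
  int lhs_side = 0, lhs_col = 0, rhs_side = 0, rhs_col = 0;  // for kEq
  std::shared_ptr<Expr> l, r;                                // for kAnd
};
using ExprRef = std::shared_ptr<Expr>;

// Split a conjunctive predicate: equalities whose columns come from opposite
// sides become join keys; everything else is returned as the residual
// predicate for a Filter above the join. Returns nullptr when fully consumed.
ExprRef ExtractJoinKeys(const ExprRef &e, std::vector<int> &left_cols,
                        std::vector<int> &right_cols) {
  if (e == nullptr) return nullptr;
  if (e->kind == Expr::kAnd) {
    ExprRef l = ExtractJoinKeys(e->l, left_cols, right_cols);
    ExprRef r = ExtractJoinKeys(e->r, left_cols, right_cols);
    if (l == nullptr) return r;  // one conjunct fully consumed
    if (r == nullptr) return l;
    auto n = std::make_shared<Expr>();
    n->kind = Expr::kAnd;
    n->l = l;
    n->r = r;
    return n;
  }
  if (e->kind == Expr::kEq && e->lhs_side != e->rhs_side) {
    // Normalize so the left plan's column lands in left_cols.
    if (e->lhs_side == 0) {
      left_cols.push_back(e->lhs_col);
      right_cols.push_back(e->rhs_col);
    } else {
      left_cols.push_back(e->rhs_col);
      right_cols.push_back(e->lhs_col);
    }
    return nullptr;  // consumed by the hash join
  }
  return e;  // stays in the residual filter
}
```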

An example:

SELECT * FROM t4, t5
WHERE (t4.x = t5.x) AND (t4.x = t4.y) AND (t4.z >= t5.z) AND (t5.x = 100);

This query is optimized into:

Filter: (t4.x = t4.y) AND (t4.z >= t5.z) AND (t5.x = 100)
└── HashJoin: t4.x = t5.x
    ├── Sequential Scan: t4
    └── Sequential Scan: t5

Of the four conditions, only $t4.x = t5.x$ can go into the hash join: $t4.x = t4.y$ touches only the left child, $t4.z >= t5.z$ is an inequality rather than an equality, and $t5.x = 100$ touches only the right child and has a constant operand. So those three move up into the filter.

There is also the more complex case where the NLJ predicate contains an OR connective, e.g.:

SELECT * FROM t4, t5
WHERE (t4.x = t5.x) OR (t4.y = t5.y);

My current approach is to keep the NLJ whenever an OR appears, but in principle it could be optimized into:

SELECT *
FROM t4, t5
WHERE t4.x = t5.x
UNION ALL
SELECT *
FROM t4, t5
WHERE t4.y = t5.y;

Or, expressed as a tree:

Union All
├── HashJoin: t4.x = t5.x
│   ├── Sequential Scan: t4
│   └── Sequential Scan: t5
└── HashJoin: t4.y = t5.y
    ├── Sequential Scan: t4
    └── Sequential Scan: t5

That is, split the OR-connected parts into two hash joins, then union the results.
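One subtlety worth noting with this rewrite: a tuple pair satisfying both conditions appears twice under UNION ALL, so the second branch needs an extra AND NOT (t4.x = t5.x), or the union must deduplicate. A tiny self-contained check (all names here are illustrative; pairs model (x, y) rows):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using PairRow = std::pair<int, int>;  // (x, y)

// Row count of the OR-join: every (a, b) with a.x == b.x || a.y == b.y.
std::size_t OrJoinCount(const std::vector<PairRow> &l, const std::vector<PairRow> &r) {
  std::size_t n = 0;
  for (const auto &a : l)
    for (const auto &b : r)
      if (a.first == b.first || a.second == b.second) ++n;
  return n;
}

// Simulated two-branch UNION ALL rewrite. A pair matching both conditions
// is counted by both branches unless the second branch excludes x-matches.
std::size_t UnionRewriteCount(const std::vector<PairRow> &l, const std::vector<PairRow> &r,
                              bool exclude_overlap) {
  std::size_t n = 0;
  for (const auto &a : l)
    for (const auto &b : r) {
      if (a.first == b.first) ++n;  // branch 1: join on x
      if (a.second == b.second && !(exclude_overlap && a.first == b.first)) ++n;  // branch 2: join on y
    }
  return n;
}
```

With the overlap exclusion, the rewrite matches the OR-join row count exactly.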

Optimization

Back to the query at hand: it deliberately piles the predicates near the top of the node tree, so it mainly tests the optimizer's predicate pushdown.

# Query Plan Node Tree

Filter: (t4.x = t5.x) AND (t4.y >= 1000000) AND (t4.y < 1500000)
        AND (t6.x >= 100000) AND (t6.x < 150000)

└── HashJoin: t5.y = t6.y
    ├── HashJoin: (no keys, cross product)
    │   ├── Sequential Scan: t4
    │   └── Sequential Scan: t5
    └── Sequential Scan: t6

Drawing out the plan node tree shows that, with my earlier implementation, all the predicates float at the top. The task now is to push the predicates down, so that later optimization passes can merge them into the NLJ, seq_scan, and other nodes.

Add the optimization function in optimizer.h:

auto OptimizePredicatePushdown(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef;

Again implemented in optimizer_custom_rules.cpp:

auto Optimizer::OptimizePredicatePushdown(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef {
  std::vector<AbstractPlanNodeRef> optimized_children{};
  for (const auto &child : plan->children_) {
    optimized_children.emplace_back(OptimizePredicatePushdown(child));
  }
  auto optimized_plan = plan->CloneWithChildren(std::move(optimized_children));

  // Plan should be a Filter node.
  if (optimized_plan->GetType() == PlanType::Filter) {
    auto filter_plan = dynamic_cast<FilterPlanNode *>(optimized_plan.get());
    std::vector<std::shared_ptr<ComparisonExpression>> pushdowns;
    filter_plan->predicate_ = ParsePredicate(filter_plan->predicate_, filter_plan->OutputSchema(), pushdowns);
    // No predicate could be pushed down, simply return.
    if (pushdowns.empty()) {
      return optimized_plan;
    }
    // The whole predicate could be pushed down, remove the filter node.
    if (filter_plan->predicate_ == nullptr) {
      return Pushdown(filter_plan->children_[0], pushdowns);
    }
    // Only part of the predicate could be pushed down.
    return filter_plan->CloneWithChildren({Pushdown(filter_plan->children_[0], pushdowns)});
  }

  return optimized_plan;
}

A few helper methods are omitted here; the idea is as follows.

First we need the predicate, which lives in the Filter node, so whenever we meet a Filter we call ParsePredicate() to extract it. Its input, filter_plan->predicate_, is a tree; we traverse it recursively, collecting the AND-connected ComparisonExpressions into the pushdowns vector, and whenever we hit an OR we stop and return that subtree unchanged. Expressions added to pushdowns are removed from the original tree.
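The conjunct-collection part of that traversal can be sketched self-contained. Pred and SplitConjuncts are illustrative stand-ins (a real ParsePredicate would also check that each comparison is actually pushable); the point is how AND nodes are split while OR subtrees stay opaque:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy predicate tree: AND nodes can be split into independent conjuncts;
// an OR node is opaque and stays where it is.
struct Pred {
  enum Kind { kAnd, kOr, kCmp };
  Kind kind = kCmp;
  int id = 0;  // identifies a leaf comparison
  std::shared_ptr<Pred> l, r;
};
using PredRef = std::shared_ptr<Pred>;

// Move every AND-connected comparison into `pushdowns`, removing it from
// the tree; OR subtrees are returned unchanged. Returns nullptr when the
// whole predicate was consumed.
PredRef SplitConjuncts(const PredRef &p, std::vector<int> &pushdowns) {
  if (p == nullptr) return nullptr;
  if (p->kind == Pred::kCmp) {
    pushdowns.push_back(p->id);
    return nullptr;  // consumed
  }
  if (p->kind == Pred::kOr) return p;  // opaque: keep in place
  PredRef l = SplitConjuncts(p->l, pushdowns);
  PredRef r = SplitConjuncts(p->r, pushdowns);
  if (l == nullptr) return r;
  if (r == nullptr) return l;
  auto n = std::make_shared<Pred>();
  n->kind = Pred::kAnd;
  n->l = l;
  n->r = r;
  return n;
}
```

For (cmp1 AND or_subtree) AND cmp2, the two comparisons are collected and only the OR subtree remains as the residual predicate.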

With the predicates to push down in hand, pass them and the child node into the pushdown function **Pushdown()**. Pushdown() handles each plan node type differently: some nodes can be passed straight through, some block the pushdown, and some fork it (join nodes). Roughly:

auto Optimizer::Pushdown(const AbstractPlanNodeRef &plan,
                         std::vector<std::shared_ptr<ComparisonExpression>> &pushdowns) -> AbstractPlanNodeRef {
  if (pushdowns.empty()) {
    return plan;
  }
  switch (plan->GetType()) {
    case PlanType::NestedLoopJoin: {
      TODO: ... ...
      return optimized_plan;
    }

    // Pass straight through these nodes.
    case PlanType::Filter:
    case PlanType::Limit:
    case PlanType::Sort: {
      return plan->CloneWithChildren(std::vector<AbstractPlanNodeRef>{Pushdown(plan->GetChildAt(0), pushdowns)});
    }

    // These block the pushdown: generate a filter plan above them.
    case PlanType::SeqScan:
    case PlanType::MockScan:
    case PlanType::Aggregation:
    case PlanType::Window:
    case PlanType::Delete:
    case PlanType::Update:
    case PlanType::Insert:
    case PlanType::Projection:
    case PlanType::InitCheck: {
      return std::make_shared<FilterPlanNode>(plan->output_schema_, ConcatencateComparisonAnd(pushdowns), plan);
    }

    // Derived plan nodes: these should only be generated after pushdown runs.
    case PlanType::TopNPerGroup:
    case PlanType::TopN:
    case PlanType::HashJoin:
    case PlanType::NestedIndexJoin: {
      UNREACHABLE("This kind of node should not appear in pushdown. They should be made after pushdown.");
    }

    default: {
      UNREACHABLE("Not supported planNode type.");
    }
  }
}

The trickiest case is NestedLoopJoin: remember to rewrite the col_idx of every predicate that continues down into the right subtree.
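The index rewrite itself is simple arithmetic: a join's output schema is the left child's columns followed by the right child's, so a column landing in the right child keeps index old_idx minus the left column count. A minimal sketch (RebaseColumn is an illustrative name):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Rebase a column index over a join's output schema onto one of its
// children. Returns (goes_right, rebased_idx): columns below the left
// column count stay as-is in the left child; the rest shift down for
// the right child.
std::pair<bool, std::size_t> RebaseColumn(std::size_t col_idx, std::size_t left_cols) {
  if (col_idx < left_cols) {
    return {false, col_idx};  // push into the left child, index unchanged
  }
  return {true, col_idx - left_cols};  // push into the right child
}
```

In the q2 plan, the join output is (t4.x, t4.y, t5.x, t5.y, t6.x, t6.y) with a 4-column left child, so the predicate on t6.x (#0.4) becomes #0.0 inside the right child, which matches the pushed-down SeqScan filter shown below.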

When the pushdown reaches a mergeable node such as seq_scan or NLJ, just build a filter node above it; the other optimization passes will merge it for you later. I placed OptimizePredicatePushdown near the front of Optimizer::OptimizeCustom, so predicates are pushed down first and merged afterwards. This keeps the passes cleanly layered and makes full use of the optimization functions already written.

After implementing the pushdown:

=== OPTIMIZER ===
HashJoin { type=Inner, left_key=["#0.3"], right_key=["#1.1"] }
  | (t4.x:INTEGER, t4.y:INTEGER, t5.x:INTEGER, t5.y:INTEGER, t6.x:INTEGER, t6.y:INTEGER)
  HashJoin { type=Inner, left_key=["#0.0"], right_key=["#1.0"] }
    | (t4.x:INTEGER, t4.y:INTEGER, t5.x:INTEGER, t5.y:INTEGER)
    SeqScan { table=t4, filter=((#0.1<1500000)and(#0.1>=1000000)) }
      | (t4.x:INTEGER, t4.y:INTEGER)
    SeqScan { table=t5 } | (t5.x:INTEGER, t5.y:INTEGER)
  SeqScan { table=t6, filter=((#0.0<150000)and(#0.0>=100000)) }
    | (t6.x:INTEGER, t6.y:INTEGER)
HashJoin: t5.y = t6.y
| Output: (t4.x, t4.y, t5.x, t5.y, t6.x, t6.y)
├── HashJoin: t4.x = t5.x
│   | Output: (t4.x, t4.y, t5.x, t5.y)
│   ├── SeqScan: table=t4
│   │   | Filter: (t4.y >= 1000000) AND (t4.y < 1500000)
│   │   | Output: (t4.x, t4.y)
│   └── SeqScan: table=t5
│       | No Filter
│       | Output: (t5.x, t5.y)
└── SeqScan: table=t6
    | Filter: (t6.x >= 100000) AND (t6.x < 150000)
    | Output: (t6.x, t6.y)

The predicates have been pushed down to the right places.

Leaderboard q2 now takes a bit over 1900, ranking eighth.

Ideas for further optimization

  1. Join reorder: estimate table sizes, then reorder the joins so that small tables are joined before large ones.
  2. Pick the hash join build side: estimate the sizes of the two inputs and build the hash map from the smaller one.
  3. Limit/aggregation pushdown. The leaderboard tests do not exercise this, so implementing it will not improve your ranking.
  4. Reuse query results. The leaderboard appears to run each query 10 times, so caching and reusing results could drastically speed up runs 2 through 10.
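Idea 4 can be sketched as a cache keyed by query text; ResultCache and GetOrCompute are illustrative names, and `execute` stands in for a real executor run. Note this is only sound if the underlying tables do not change between runs, which appears to hold for the repeated leaderboard executions:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

using ResultRow = std::vector<int>;

// Memoize query results by query text: the first run executes, later
// identical queries return the stored rows without executing.
class ResultCache {
 public:
  template <typename Exec>
  const std::vector<ResultRow> &GetOrCompute(const std::string &query, Exec execute) {
    auto it = cache_.find(query);
    if (it == cache_.end()) {
      ++misses_;
      it = cache_.emplace(query, execute()).first;  // run once, store rows
    } else {
      ++hits_;
    }
    return it->second;
  }

  int hits_ = 0, misses_ = 0;

 private:
  std::unordered_map<std::string, std::vector<ResultRow>> cache_;
};
```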

Query 3: The Mad Data Scientist

After realizing what he had done, a data scientist who had bought Hengtai real estate wrote these strange queries:

create table t7(v int, v1 int, v2 int);
create table t8(v4 int);

explain SELECT v, d1, d2 FROM (
SELECT v,
MAX(v1) AS d1, MIN(v1), MAX(v2), MIN(v2),
MAX(v1) + MIN(v1), MAX(v2) + MIN(v2),
MAX(v1) + MAX(v1) + MAX(v2) AS d2
FROM t7 LEFT JOIN (SELECT v4 FROM t8 WHERE 1 == 2) ON v < v4
GROUP BY v
);

There is plenty of oddness here, such as the always-false

WHERE 1 == 2

and values that are computed but never output:

MIN(v1), MAX(v2), MIN(v2),
MAX(v1) + MIN(v1), MAX(v2) + MIN(v2),

The first, the always-false expression, tests the optimizer's **constant folding**; the second, eliminating the redundant computation, tests its **column pruning**.

Before optimization:

=== OPTIMIZER ===
Projection { exprs=["#0.0", "#0.1", "#0.7"] } | (__subquery#0.t7.v:INTEGER, __subquery#0.d1:INTEGER, __subquery#0.d2:INTEGER)

Projection { exprs=["#0.0", "#0.1", "#0.2", "#0.3", "#0.4", "(#0.5+#0.6)", "(#0.7+#0.8)", "((#0.9+#0.10)+#0.11)"] } | (__subquery#0.t7.v:INTEGER, __subquery#0.d1:INTEGER, __subquery#0.__item#2:INTEGER, __subquery#0.__item#3:INTEGER, __subquery#0.__item#4:INTEGER, __subquery#0.__item#5:INTEGER, __subquery#0.__item#6:INTEGER, __subquery#0.d2:INTEGER)

Agg { types=["max", "min", "max", "min", "max", "min", "max", "min", "max", "max", "max"],
aggregates=["#0.1", "#0.1", "#0.2", "#0.2", "#0.1", "#0.1", "#0.2", "#0.2", "#0.1", "#0.1", "#0.2"],
group_by=["#0.0"] }
| (t7.v:INTEGER, agg#0:INTEGER, agg#1:INTEGER, agg#2:INTEGER, agg#3:INTEGER, agg#4:INTEGER, agg#5:INTEGER, agg#6:INTEGER, agg#7:INTEGER, agg#8:INTEGER, agg#9:INTEGER, agg#10:INTEGER)

Filter { predicate=(#0.0<#0.3) } | (t7.v:INTEGER, t7.v1:INTEGER, t7.v2:INTEGER, __subquery#1.t8.v4:INTEGER)

HashJoin { type=Left, left_key=[], right_key=[] } | (t7.v:INTEGER, t7.v1:INTEGER, t7.v2:INTEGER, __subquery#1.t8.v4:INTEGER)

SeqScan { table=t7 } | (t7.v:INTEGER, t7.v1:INTEGER, t7.v2:INTEGER)

SeqScan { table=t8, filter=(1=2) } | (t8.v4:INTEGER)
Projection: (__subquery#0.t7.v, __subquery#0.d1, __subquery#0.d2)
└── Projection: (__subquery#0.t7.v, __subquery#0.d1, __subquery#0.d2, derived columns)
    └── Aggregation: Group by (t7.v), Aggregates: max/min
        └── Filter: (t7.v < __subquery#1.t8.v4)
            └── HashJoin: Left Join (t7, t8)
                ├── Sequential Scan: t7
                └── Sequential Scan: t8 (filtered: (1=2))

Constant Folding

Constant folding means the optimizer evaluates computable expressions ahead of time; otherwise the expression is recomputed for every incoming tuple.

Add the declaration in optimizer.h and implement it in optimizer_custom_rules.cpp:

auto Optimizer::OptimizeConstantFolding(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef {
  std::vector<AbstractPlanNodeRef> optimized_children{};
  for (const auto &child : plan->children_) {
    optimized_children.emplace_back(OptimizeConstantFolding(child));
  }
  auto optimized_plan = plan->CloneWithChildren(std::move(optimized_children));

  // For a Filter, fold its predicate.
  if (optimized_plan->GetType() == PlanType::Filter) {
    auto filter_plan = dynamic_cast<FilterPlanNode *>(optimized_plan.get());
    auto new_predicate = FoldingPredicate(filter_plan->predicate_);
    // If the new predicate folded to a constant:
    // true -> return the (optimized) child; false -> return an empty Values node.
    if (auto new_predicate_const = dynamic_cast<const ConstantValueExpression *>(new_predicate.get())) {
      if (new_predicate_const->Evaluate({}, Schema({})).GetAs<bool>()) {
        return filter_plan->children_[0];
      }
      return std::make_shared<ValuesPlanNode>(optimized_plan->output_schema_,
                                              std::vector<std::vector<AbstractExpressionRef>>{});
    }
    // The new predicate is not a constant: return a new Filter.
    return std::make_shared<FilterPlanNode>(filter_plan->output_schema_, new_predicate, filter_plan->children_[0]);
  }

  return optimized_plan;
}

After implementing constant folding:

# Query Execution Plan

Projection: (output columns: #0.0, #0.1, #0.7)
└── Projection: (output columns: #0.0, #0.1, #0.2, #0.3, #0.4, (#0.5 + #0.6), (#0.7 + #0.8), ((#0.9 + #0.10) + #0.11))
    └── Aggregation: Group By (#0.0), Aggregates (max/min on #0.1, #0.2)
        └── Filter: (#0.0 < #0.3)
            └── HashJoin: Left Join
                ├── Sequential Scan: t7
                └── Values: (empty set, rows=0)

The seq_scan of t8 has been folded into an empty Values node. The join itself could now be optimized too: if the right child is empty, directly return the left child (left join) or an empty result (inner join). But for now the join cannot simply be deleted, because the schema changes after the join.

Only the filter's expressions are folded here. That is already enough for the tests, but you could also fold the expressions in every other plan type that carries them; I will not paste that and leave it to the reader.

Column Pruning

Look at the execution tree after constant folding:

Projection: (output columns: #0.0, #0.1, #0.7)

The final result outputs only #0.0, #0.1, and #0.7, yet the aggregate computes a long list:

Projection: (output columns: #0.0, #0.1, #0.2, #0.3, #0.4, (#0.5 + #0.6), (#0.7 + #0.8), ((#0.9 + #0.10) + #0.11)

So whenever a Projection's child is another Projection or an Aggregation, the child can be pruned, avoiding unnecessary computation and memory. Note that column pruning must proceed top-down, otherwise the pruning will be incomplete.

The second Projection should be pruned into:

Projection: (output columns: #0.0, #0.1, #0.2, #0.3, #0.4, (#0.5 + #0.6), (#0.7 + #0.8), ((#0.9 + #0.10) + #0.11))

↓

Projection: (output columns: #0.0, #0.1, ((#0.9 + #0.10) + #0.11))

For a Projection node, handle the cases where the child is a Projection and where it is an Aggregation separately:

  • If the child is a Projection, prune the child using the parent, then replace the parent with the pruned child.
  • If the child is an Aggregation, check each column used by the Projection against the Aggregation's outputs and delete those that never appear; also check for and remove duplicate aggregates.
auto Optimizer::OptimizeColumnPruning(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef {
  std::shared_ptr<const AbstractPlanNode> optimized_plan = plan;

  if (optimized_plan->GetType() == PlanType::Projection) {
    auto projection_plan = dynamic_cast<const ProjectionPlanNode *>(optimized_plan.get());
    auto child_plan = projection_plan->GetChildAt(0);

    // Pruning when the child is a Projection.
    if (child_plan->GetType() == PlanType::Projection) {
      // Collect the used columns.
      std::vector<uint32_t> used_col_idxs;
      used_col_idxs.reserve(optimized_plan->output_schema_->GetColumnCount());
      for (const auto &expr : projection_plan->expressions_) {
        CollectUsedColumnIdx(expr, used_col_idxs);
      }
      std::sort(used_col_idxs.begin(), used_col_idxs.end());
      // Prune the child.
      auto child_projection_plan = dynamic_cast<const ProjectionPlanNode *>(child_plan.get());
      std::vector<AbstractExpressionRef> pruned_exprs;
      std::vector<Column> pruned_columns;
      pruned_exprs.reserve(used_col_idxs.size());
      pruned_columns.reserve(used_col_idxs.size());
      for (uint32_t idx : used_col_idxs) {
        pruned_exprs.push_back(child_projection_plan->expressions_[idx]);
        pruned_columns.push_back(child_projection_plan->output_schema_->GetColumn(idx));
      }
      // Replace the parent by its pruned child.
      auto optimized_child_plan = std::make_shared<ProjectionPlanNode>(std::make_shared<Schema>(pruned_columns),
                                                                      pruned_exprs, child_plan->children_[0]);
      return OptimizeColumnPruning(optimized_child_plan);
    }

    // Pruning when the child is an Aggregation.
    if (child_plan->GetType() == PlanType::Aggregation) {
      // Collect the used columns.
      std::vector<uint32_t> used_col_idxs;
      used_col_idxs.reserve(optimized_plan->output_schema_->GetColumnCount());
      for (const auto &expr : projection_plan->expressions_) {
        CollectUsedColumnIdx(expr, used_col_idxs);
      }
      std::sort(used_col_idxs.begin(), used_col_idxs.end());
      // Prune the child.
      auto child_aggregation_plan = dynamic_cast<const AggregationPlanNode *>(child_plan.get());
      size_t group_col_length = child_aggregation_plan->group_bys_.size();
      std::vector<AbstractExpressionRef> pruned_aggregates;
      std::vector<AggregationType> pruned_agg_types;
      std::vector<Column> pruned_columns;
      pruned_aggregates.reserve(used_col_idxs.size());
      pruned_agg_types.reserve(used_col_idxs.size());
      pruned_columns.reserve(used_col_idxs.size());
      for (size_t i = 0; i < group_col_length; ++i) {
        pruned_columns.push_back(child_aggregation_plan->output_schema_->GetColumn(i));
      }
      for (uint32_t idx : used_col_idxs) {  // Could be optimized with binary search / upper_bound.
        if (idx >= group_col_length) {
          pruned_aggregates.push_back(child_aggregation_plan->aggregates_[idx - group_col_length]);
          pruned_agg_types.push_back(child_aggregation_plan->agg_types_[idx - group_col_length]);
          pruned_columns.push_back(child_aggregation_plan->output_schema_->GetColumn(idx));
        }
      }
      // Make the new pruned child node.
      auto optimized_aggr = std::make_shared<AggregationPlanNode>(
          std::make_shared<Schema>(pruned_columns), child_aggregation_plan->children_[0],
          child_aggregation_plan->group_bys_, pruned_aggregates, pruned_agg_types);
      // Rewrite the parent node's schema and expressions.
      std::vector<AbstractExpressionRef> pruned_exprs;
      pruned_exprs.reserve(used_col_idxs.size());
      for (const auto &expr : projection_plan->expressions_) {
        pruned_exprs.push_back(PrunedProjectionExpression(expr, used_col_idxs));
      }
      optimized_plan = std::make_shared<ProjectionPlanNode>(projection_plan->output_schema_, pruned_exprs, optimized_aggr);
      return optimized_plan->CloneWithChildren({OptimizeColumnPruning(optimized_plan->children_[0])});
    }
  }

  if (optimized_plan->GetType() == PlanType::SeqScan || optimized_plan->GetType() == PlanType::MockScan ||
      optimized_plan->GetType() == PlanType::IndexScan) {
    return optimized_plan;
  }

  std::vector<AbstractPlanNodeRef> new_children{};
  for (const auto &child : plan->children_) {
    new_children.emplace_back(OptimizeColumnPruning(child));
  }
  return plan->CloneWithChildren(std::move(new_children));
}
At this point I also found a bug in my OptimizeNLJAsHashJoin(): when attempting to generate a new filter node above a join node, if the join type is not Inner, the NLJ must be kept as-is rather than converted to a hash join.

My OptimizeNLJAsHashJoin() tries to split off an extra filter node, but with a Left Join the joined tuples may contain null values, and the new filter could wrongly discard all of those null-containing tuples. So for a Left Join, the filtering condition must stay inside the join node and cannot be split out. This again shows how incomplete the basic Gradescope tests are: code can pass all of them and still contain serious bugs.

After debugging and optimizing:

Projection { exprs=["#0.0", "#0.1", "((#0.2+#0.3)+#0.4)"] }
  Agg { types=["max", "max", "max", "max"],
        aggregates=["#0.1", "#0.1", "#0.1", "#0.2"],
        group_by=["#0.0"] }
    NestedLoopJoin { type=Left, predicate=(#0.0<#1.0) }
      MockScan { table=__mock_t7 }
      Values { rows=0 }

This already passes q3 fairly quickly, but notice that the aggregate still contains duplicated computation:

types=     ["max",  "max",  "max",  "max"],
aggregates=["#0.1", "#0.1", "#0.1", "#0.2"]

Lining these up, the max over #0.1 is clearly computed three times, so the aggregation can be deduplicated further.

void Optimizer::CollectUsedColumnIdx(const AbstractExpressionRef &expr, std::vector<uint32_t> &col_idxs) {
  if (dynamic_cast<const ArithmeticExpression *>(expr.get()) != nullptr) {
    CollectUsedColumnIdx(expr->children_[0], col_idxs);
    CollectUsedColumnIdx(expr->children_[1], col_idxs);
    return;
  }
  if (const auto *column_value_expr = dynamic_cast<const ColumnValueExpression *>(expr.get())) {
    if (std::find(col_idxs.begin(), col_idxs.end(), column_value_expr->GetColIdx()) == col_idxs.end()) {
      col_idxs.push_back(column_value_expr->GetColIdx());
    }
    return;
  }
  if (dynamic_cast<const ConstantValueExpression *>(expr.get()) != nullptr) {
    return;
  }
  UNREACHABLE("Projection expressions should only contain arithmetic, column and constant expressions");
}

auto Optimizer::PrunedProjectionExpression(const AbstractExpressionRef &expr, std::vector<int> idx_map)
    -> AbstractExpressionRef {
  // Note: idx_map maps old column indices to their post-pruning indices.
  if (dynamic_cast<const ArithmeticExpression *>(expr.get()) != nullptr) {
    return expr->CloneWithChildren({PrunedProjectionExpression(expr->children_[0], idx_map),
                                    PrunedProjectionExpression(expr->children_[1], idx_map)});
  }
  if (const auto *column_value_expr = dynamic_cast<const ColumnValueExpression *>(expr.get())) {
    int new_col_idx = idx_map[column_value_expr->GetColIdx()];
    return std::make_shared<ColumnValueExpression>(column_value_expr->GetTupleIdx(), new_col_idx,
                                                   column_value_expr->GetReturnType());
  }
  return expr;
}
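The RearrangeColIdxs helper called in OptimizeColumnPruning below is not shown; here is a plausible self-contained sketch consistent with how re_direct and used_col_idxs are built there. Kept columns receive consecutive new indices in sorted order, while entries in re_direct were deduplicated away and map to their surviving twin's new index:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Build old-index -> new-index map after pruning and deduplication.
// used_col_idxs is sorted; positions not listed in it stay at -1.
std::vector<int> RearrangeColIdxs(const std::unordered_map<uint32_t, uint32_t> &re_direct,
                                  const std::vector<uint32_t> &used_col_idxs) {
  std::vector<int> idx_map(used_col_idxs.empty() ? 0 : used_col_idxs.back() + 1, -1);
  uint32_t next = 0;
  for (uint32_t old_idx : used_col_idxs) {
    auto it = re_direct.find(old_idx);
    if (it != re_direct.end()) {
      idx_map[old_idx] = static_cast<int>(it->second);  // deduplicated: reuse twin's slot
    } else {
      idx_map[old_idx] = static_cast<int>(next++);      // kept: next output slot
    }
  }
  return idx_map;
}
```

For example, with used columns {0, 1, 2, 3} where column 2 is a duplicate redirected to the new index of column 1, the map becomes 0→0, 1→1, 2→1, 3→2.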

auto Optimizer::OptimizeColumnPruning(const AbstractPlanNodeRef &plan) -> AbstractPlanNodeRef {
std::shared_ptr<const AbstractPlanNode> optimized_plan = plan;

if (optimized_plan->GetType() == PlanType::Projection) {
auto projection_plan = dynamic_cast<const ProjectionPlanNode *>(optimized_plan.get());
auto child_plan = projection_plan->GetChildAt(0);

// Pruning when child is Projection.
if (child_plan->GetType() == PlanType::Projection) {
// Collect used cols.
std::vector<uint32_t> used_col_idxs;
used_col_idxs.reserve(optimized_plan->output_schema_->GetColumnCount());
for (const auto &expr : projection_plan->expressions_) {
CollectUsedColumnIdx(expr, used_col_idxs);
}
std::sort(used_col_idxs.begin(), used_col_idxs.end());
// Prune child.
auto child_projection_plan = dynamic_cast<const ProjectionPlanNode *>(child_plan.get());
std::vector<AbstractExpressionRef> pruned_exprs;
std::vector<Column> pruned_columns;
pruned_exprs.reserve(used_col_idxs.size());
pruned_columns.reserve(used_col_idxs.size());
for (uint32_t idx : used_col_idxs) {
pruned_exprs.push_back(child_projection_plan->expressions_[idx]);
pruned_columns.push_back(child_projection_plan->output_schema_->GetColumn(idx));
}
// Replace parent by its optimized child.
auto optimized_child_plan = std::make_shared<ProjectionPlanNode>(std::make_shared<Schema>(pruned_columns),
pruned_exprs, child_plan->children_[0]);
return OptimizeColumnPruning(optimized_child_plan);
}

    // Pruning when the child is an Aggregation.
    if (child_plan->GetType() == PlanType::Aggregation) {
      auto child_aggregation_plan = dynamic_cast<const AggregationPlanNode *>(child_plan.get());

      // Collect the used columns.
      std::vector<uint32_t> used_col_idxs;
      used_col_idxs.reserve(optimized_plan->output_schema_->GetColumnCount());
      size_t group_cols = child_aggregation_plan->GetGroupBys().size();
      for (size_t i = 0; i < group_cols; ++i) {
        // Group-by columns must always be kept.
        used_col_idxs.push_back(i);
      }
      for (const auto &expr : projection_plan->expressions_) {
        CollectUsedColumnIdx(expr, used_col_idxs);
      }
      std::sort(used_col_idxs.begin(), used_col_idxs.end());

      // Prune the child.
      size_t group_col_length = child_aggregation_plan->group_bys_.size();
      std::vector<AbstractExpressionRef> pruned_aggregates;
      std::vector<AggregationType> pruned_agg_types;
      std::vector<Column> pruned_columns;
      // For aggregate deduplication, keyed by [expression.ToString() + AggregationType].
      std::unordered_map<std::string, uint32_t> exist_expr;
      std::unordered_map<uint32_t, uint32_t> re_direct;

      pruned_aggregates.reserve(used_col_idxs.size());
      pruned_agg_types.reserve(used_col_idxs.size());
      pruned_columns.reserve(used_col_idxs.size());

      for (size_t i = 0; i < group_col_length; ++i) {
        pruned_columns.push_back(child_aggregation_plan->output_schema_->GetColumn(i));
      }
      for (uint32_t idx : used_col_idxs) {  // Could be sped up with std::upper_bound since the list is sorted.
        if (idx >= group_col_length) {  // idx >= group_col_length means this is an aggregate column.
          auto aggr_expr = child_aggregation_plan->aggregates_[idx - group_col_length];
          auto aggr_type = child_aggregation_plan->agg_types_[idx - group_col_length];
          std::string aggr_pair = aggr_expr->ToString() + std::to_string(static_cast<int>(aggr_type));
          if (exist_expr.count(aggr_pair) == 0) {
            // Not seen yet: add to the pruned lists.
            pruned_aggregates.push_back(aggr_expr);
            pruned_agg_types.push_back(aggr_type);
            pruned_columns.push_back(child_aggregation_plan->output_schema_->GetColumn(idx));
            exist_expr[aggr_pair] = pruned_columns.size() - 1;
          } else {
            // Already seen: record a redirect to the existing slot.
            re_direct[idx] = exist_expr.find(aggr_pair)->second;
          }
        }
      }

      // Make the new optimized child node.
      auto optimized_aggr = std::make_shared<AggregationPlanNode>(
          std::make_shared<Schema>(pruned_columns), child_aggregation_plan->children_[0],
          child_aggregation_plan->group_bys_, pruned_aggregates, pruned_agg_types);

      // Rewrite the parent node's expressions against the pruned child schema.
      std::vector<AbstractExpressionRef> pruned_exprs;
      pruned_exprs.reserve(used_col_idxs.size());
      std::vector<int> idx_map = RearrangeColIdxs(re_direct, used_col_idxs);

      for (const auto &expr : projection_plan->expressions_) {
        pruned_exprs.push_back(PrunedProjectionExpression(expr, idx_map));
      }
      optimized_plan =
          std::make_shared<ProjectionPlanNode>(projection_plan->output_schema_, pruned_exprs, optimized_aggr);
      return optimized_plan->CloneWithChildren({OptimizeColumnPruning(optimized_plan->children_[0])});
    }
  }

  if (optimized_plan->GetType() == PlanType::SeqScan || optimized_plan->GetType() == PlanType::MockScan ||
      optimized_plan->GetType() == PlanType::IndexScan) {
    return optimized_plan;
  }

  std::vector<AbstractPlanNodeRef> new_children{};
  for (const auto &child : plan->children_) {
    new_children.emplace_back(OptimizeColumnPruning(child));
  }
  return plan->CloneWithChildren(std::move(new_children));
}

auto Optimizer::RearrangeColIdxs(std::unordered_map<uint32_t, uint32_t> &re_direct,
                                 std::vector<uint32_t> &used_col_idxs) -> std::vector<int> {
  // Initialize the result map and the deletion-offset array.
  int max_col_idx = static_cast<int>(used_col_idxs.back());
  std::vector<int> col_idx_map(max_col_idx + 1, -1);
  std::vector<int> del(max_col_idx + 2, 0);

  // Cache the position of each column index inside used_col_idxs.
  std::vector<uint32_t> idx_map(max_col_idx + 1, used_col_idxs.size());
  for (int i = 0; i <= max_col_idx; ++i) {
    idx_map[i] = std::find(used_col_idxs.begin(), used_col_idxs.end(), i) - used_col_idxs.begin();
  }

  // First pass: fill col_idx_map and mark redirected (dropped) slots in del.
  for (int i = 0; i <= max_col_idx; ++i) {
    uint32_t idx = idx_map[i];
    if (idx < used_col_idxs.size()) {
      if (re_direct.count(i)) {
        // Redirected to an already-kept aggregate slot; its own slot is dropped.
        col_idx_map[i] = re_direct[i];
        del[idx + 1] -= 1;
      } else {
        // Not redirected: tentatively use its position in used_col_idxs.
        col_idx_map[i] = idx;
      }
    }
  }

  // Compute cumulative deletion offsets.
  for (size_t i = 1; i < del.size(); ++i) {
    del[i] += del[i - 1];
  }

  // Final adjustment: shift non-redirected slots left past the dropped duplicates.
  for (int i = 0; i <= max_col_idx; ++i) {
    uint32_t idx = idx_map[i];
    if (col_idx_map[i] != -1 && !re_direct.count(idx)) {
      col_idx_map[i] += del[col_idx_map[i]];
    }
  }

  return col_idx_map;
}

After optimization:

Projection { exprs=["#0.0", "#0.1", "((#0.1+#0.1)+#0.2)"] } | (__subquery#0.t7.v:INTEGER, __subquery#0.d1:INTEGER, __subquery#0.d2:INTEGER)
  Agg { types=["max", "max"], aggregates=["#0.1", "#0.2"], group_by=["#0.0"] } | (t7.v:INTEGER, agg#0:INTEGER, agg#10:INTEGER)
    NestedLoopJoin { type=Left, predicate=(#0.0<#1.0) } | (t7.v:INTEGER, t7.v1:INTEGER, t7.v2:INTEGER, __subquery#1.t8.v4:INTEGER)
      SeqScan { table=t7 } | (t7.v:INTEGER, t7.v1:INTEGER, t7.v2:INTEGER)
      Values { rows=0 } | (__subquery#1.t8.v4:INTEGER)

As the plan shows, the aggregation now computes only two max aggregates; the duplicate has been merged away.

With this pruning, Q3's time drops to just over 700 ms, and all the other tests still pass.

Postscript

Because the provided base tests are far from comprehensive, I wrote some of my own, but even those are incomplete. So my code almost certainly still has bugs; they just have not surfaced yet and do not affect the online tests. To use the online grader you must follow certain code-layout rules: you cannot add new files, some functions must live in specific files, and most of the starter code cannot be modified. These restrictions are quite limiting, and I sincerely hope the course improves this in the future and gives us more freedom, at the very least the ability to add files.

My total score across the three leaderboard tests currently ranks in the top ten, and in fact not many people in 23fall truly finished all of the project 3 leaderboard. The popular claim that everyone and their dog has done CMU 15-445 really is a myth: in 23fall, fewer than 20 people completed the entire project 3 leaderboard, let alone project 4.

I consider project 3 the most important project. Anyone who has done the first two knows that projects 1 and 2 have little to do with the overall structure of a database; after finishing them, I had gained almost no deeper understanding of how a database is organized. Project 3 is what truly takes you into Bustub's execution pipeline, and only after finishing it did I have a genuinely intuitive picture of each database component. I strongly recommend working through this project carefully yourself.

The code posted here reads like a mess. I later refactored it somewhat, but by now I am exhausted and cannot write any more. Then again, the messiness also makes it harder for readers to follow, which indirectly protects the course from being plagiarized.