During my internship a common task is to identify patterns in CFG/AST and optimize them. I think this can be done in C++ metaprogramming. So I take a try this weekend. Here is an example code to show how it works: pattern matching in C++.

In this code we have five components.

The Pattern drives all matching process. It takes the desired pattern as its template parameter, and tries to match the input CFG/AST node.

To traverse the tree-like CFG/AST hierarchy, we need constructors. They are just empty classes such as following ones:

template <typename Root, typename T0, typename T1> class cons2;
template <typename Root, typename T0> class cons1;

We use them as a “type label” to destruct a root composition type by C++ partial specialization. So no class body is required for them.

We also want to traverse in the type hierarchy. For example, cast an Expr to a more concrete type AddExpr. For this task we need descenders. You can provide your cast(usually a dynamic_cast) for specific source and destination types.

Given constructors, descenders, and Pattern, we can do basic matching.

Besides, we have guards and dispatchers to match patterns at a finner level. The guards are decorators to a pattern to indicate we want to perform some specific operations.

In the example code, we have one guard which is called out. This guard will tell the driver that we need the value of current matched pattern. The actual action of a guard will be directed to a dispatcher. For example, to output the value to a place.

If you want to check not only the type of one pattern, but also the actual value of it, you can achieve this by adding another guard and providing your check code in dispatcher.

In this metaprogramming style the user can write very clean and intuitive matching code. Look at this one:

typedef cons2<AddExpr,
                    out<Symbol, 1>,
                    out<Constant, 0>>> Pattern;

And we can have a systematic way to retrieve the value of a matched pattern.

However, it may increase the compilation time dramatically because it generates code at compilation time. Also it cannot handle very long patterns because its recursion nature would blow the system stack.