Previously I had posted (http://wp.me/p3XwYf-3y) about how we can use Optiq (now called Calcite) to run SQL queries over a custom data source. Optiq lets us push down to the data source certain SQL query functions like scanning a subset of columns, or filtering rows based on a particular column values. This query optimization is the core of Optiq’s functionality by letting us create rules that find matching operations in query and push down those operations to the data source.
How Optiq query push down works?
Above picture depicts how query push down works in Optiq. A query comprises of an expression tree, the nodes are operations like scanning a table, filtering rows, grouping, aggregating etc. The scan operation is usually the leaf node. Optiq query push down works in the following manner:
1. Query push down is supported on Optiq tables that have overridden the toRel method returning a RelNode. A RelNode extends TableAccessRelBase which has an overridden register method that adds a set of RelOpt rules.
2. RelOpt rules register the rules in their constructor method. Creating a rule involves creating an operand with children. This will be matched with incoming query and when it matches, the onMatch method in the Rule class will be called. Some example of creating rules are at the end of this article.
3. When a query execution matches a rule and onMatch method is called, We intercept the query here and collect details from the query as to what are the projections and filters. Projections are the columns that are required by a query and filter specifies the condition on columns to filter the rows. Think of a filter as the where clause in SQL and projections as the select clause.
4. At the end of the onMatch method we transform the query call to a new TableAccessRelBase instance passing along the projections and filters we intercepted.
5. Multiple rules can match a query expression but Optiq figures out the best rule and calls the implement method on the new TableAccessRelBase created. It’s also possible for us to provide stats to help with cost based optimization.
6. The implement method in above step is responsible to call a push down method in our Optiq table that accepts the projections and filters. The table pushed down these to the data source produces an iterator only for the selected projections and and internally filters the rows. This can result in huge performance boost as the data source is in the best position to do these operations faster.
7. Once the operations are pushed down, Optiq does not redo pushed operations, however it can handle other operations like joins, aggregates which we did not push down.
Example push down rules
Let’s consider an example table: user with columns: id, name, age, country.
Rule code: operand(FilterRel.class, operand(YourTableScan.class, none()))
Example query that matches: select id, name from user where age > 10 and country = ‘IN’
Here, all columns of the table are present in the query with filter condition (age > 10) and (country = ‘IN’).
Rule code: operand(ProjectRel.class, operand(JavaBeanTableScan.class, none()))
Example query that matches: select country, avg(age) from user group by country order by country
Here, country and age are projections that will be pushed down. Optiq will take care of the aggregation and ordering if we have not pushed them down.
3. Filter on project
Rule code: operand(FilterRel.class, operand(ProjectRel.class, operand(YourTableScan.class, none())))
Example query that matches: select name, age from user where age < 18.
The difference between plain filter and this rule is that the column in the where condition is one of the fields in the select clause. And not all rows of the table are used in this query.
4. Project on filter on project
Rule code: operand( ProjectRel.class, operand( FilterRel.class, operand(ProjectRel.class, operand(YourTableScan.class, none()))))
Example query that matches: select name, age from user where country = ‘IN’
Here the column in the where condition is not part of the select clause.
I did not find much documentation on Optiq push down concepts and hence documented here as much as I have researched. Do checkout my test project that I used for this purpose, you could fork it if you want play around and test Optiq: