Electronic Conference Proceedings

Robustness in Automatic Physical Database Design

Authors

Abstract

Automatic physical database design tools rely on "what-if" interfaces to the query optimizer to estimate the execution time of the training query workload under different candidate physical designs. The tools use these what-if interfaces to recommend physical designs that minimize the estimated execution time of the input training workload. In this paper, we argue that minimizing estimated execution time alone can lead to designs with inherent problems. In particular, if the optimizer makes an error in estimating the execution time of some workload queries, then the recommended physical design may actually harm the workload instead of benefiting it. In this sense, the physical design is risky. Moreover, if the production queries are slightly different from the training queries, the recommended physical design may not benefit them at all. In this sense, the physical design is not general. We define Risk and Generality as two new metrics to evaluate the quality of a proposed physical database design, and we show how to extend the objective function being optimized by a generic physical design tool to take these measures into account. We have implemented a physical design advisor in PostgreSQL, and we use it to experimentally demonstrate the usefulness of our approach. We show that our two new metrics result in physical designs that are more robust, which means that the user can implement them with a higher degree of confidence. This is particularly important as we move towards truly zero-administration database systems, in which a DBA has no opportunity to vet the recommendations of the physical design tool before they are applied.

Session

Research Session 5: Physical Design