The efficiency of a query execution plan depends on the accuracy of the selectivity estimates given to the query optimiser by the cost model. The cost model makes simplifying assumptions in order to produce said estimates in a timely manner. These assumptions lead to selectivity estimation errors that have dramatic effects on the quality of the resulting query execution plans. A convenient and ubiquitous assumption among current cost models is that attributes are independent of each other. However, this ignores potential correlations, which can have a huge negative impact on the accuracy of the cost model. In this paper we attempt to relax the attribute value independence assumption without unreasonably deteriorating the accuracy of the cost model. We propose a novel approach based on a particular type of Bayesian network called a Chow-Liu tree to approximate the distribution of attribute values inside each relation of a database. Our results on the TPC-DS benchmark show that our method is an order of magnitude more precise than other approaches whilst remaining reasonably efficient in terms of time and space.
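To make the Chow-Liu idea concrete, here is a minimal sketch, not the implementation evaluated in the paper. It assumes a tiny, hypothetical in-memory relation with three attributes (country, city, gender), builds a maximum spanning tree over pairwise mutual information, and compares the tree-based selectivity of a conjunction of equality predicates with the estimate obtained under attribute value independence. All function names and the toy data are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log

# Hypothetical toy relation: (country, city, gender). City determines country.
ROWS = [
    ("FR", "Paris", "F"), ("FR", "Paris", "M"),
    ("FR", "Lyon", "F"), ("FR", "Lyon", "M"),
    ("US", "NYC", "F"), ("US", "NYC", "M"),
    ("US", "Austin", "F"), ("US", "Austin", "M"),
]


def mutual_information(rows, i, j):
    """Empirical mutual information between attributes i and j."""
    n = len(rows)
    pi = Counter(r[i] for r in rows)
    pj = Counter(r[j] for r in rows)
    pij = Counter((r[i], r[j]) for r in rows)
    return sum((c / n) * log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())


def chow_liu_edges(rows, n_attrs):
    """Maximum spanning tree over pairwise mutual information (Kruskal)."""
    weighted = sorted(((mutual_information(rows, i, j), i, j)
                       for i, j in combinations(range(n_attrs), 2)),
                      reverse=True)
    parent = list(range(n_attrs))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # keep the edge only if it joins two components
            parent[ri] = rj
            edges.append((i, j))
    return edges


def tree_selectivity(rows, edges, values, root=0):
    """P(root) times the product of P(child | parent) along the tree."""
    sel = sum(r[root] == values[root] for r in rows) / len(rows)
    visited, frontier = {root}, [root]
    while frontier:
        p = frontier.pop()
        for i, j in edges:
            if p not in (i, j):
                continue
            c = j if i == p else i
            if c in visited:
                continue
            parent_count = sum(r[p] == values[p] for r in rows)
            if parent_count == 0:
                return 0.0
            both = sum(r[p] == values[p] and r[c] == values[c] for r in rows)
            sel *= both / parent_count
            visited.add(c)
            frontier.append(c)
    return sel


def independent_selectivity(rows, values):
    """Product of marginals: the attribute value independence assumption."""
    sel = 1.0
    for k, v in enumerate(values):
        sel *= sum(r[k] == v for r in rows) / len(rows)
    return sel


EDGES = chow_liu_edges(ROWS, 3)
QUERY = ("FR", "Paris", "F")
TREE_SEL = tree_selectivity(ROWS, EDGES, QUERY)
INDEP_SEL = independent_selectivity(ROWS, QUERY)
```

On this toy relation the true selectivity of the query is 1/8. The tree-based estimate recovers it because the tree keeps the country-city dependency, whereas the independence assumption yields 1/16 and so underestimates by a factor of two.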
We describe what mimetic interpolation is and why it is critical for some pre- and post-processing tasks. A simple test case shows how using bilinear interpolation for a flux calculation introduces numerical errors that depend on the grid, the number of segments and the number of quadrature points. In contrast, mimetic interpolation will return the exact result regardless of the grid resolution and the number of segments.
Alex Pletzer and Wolfgang Hayek and Jorge Bornemann
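The contrast between bilinear and mimetic interpolation can be illustrated with a minimal sketch on a single unit cell. This is not the authors' test case or implementation. It assumes a hypothetical stream function psi(x, y) = x^2 y^2, a velocity field (u, v) = (dpsi/dy, -dpsi/dx), a straight path between two opposite cell corners, midpoint quadrature on each segment, and lowest-order edge basis functions for the mimetic variant. Under these assumptions the exact flux is psi(1, 1) - psi(0, 0) = 1.

```python
def psi(x, y):
    """Hypothetical stream function used only for this illustration."""
    return x * x * y * y


def velocity(x, y):
    """(u, v) = (dpsi/dy, -dpsi/dx) for the stream function above."""
    return 2 * x * x * y, -2 * x * y * y


def bilinear(c00, c10, c01, c11, xi, eta):
    """Bilinear interpolation of corner values on the unit cell."""
    return (c00 * (1 - xi) * (1 - eta) + c10 * xi * (1 - eta)
            + c01 * (1 - xi) * eta + c11 * xi * eta)


def path_points(n_segments):
    """Straight path from corner (0, 0) to corner (1, 1)."""
    return [(k / n_segments, k / n_segments) for k in range(n_segments + 1)]


def flux_bilinear(n_segments):
    """Flux from nodal vectors, interpolated bilinearly at segment midpoints."""
    corners = [velocity(0, 0), velocity(1, 0), velocity(0, 1), velocity(1, 1)]
    us = [c[0] for c in corners]
    vs = [c[1] for c in corners]
    pts = path_points(n_segments)
    total = 0.0
    for (xa, ya), (xb, yb) in zip(pts, pts[1:]):
        xm, ym = 0.5 * (xa + xb), 0.5 * (ya + yb)
        u = bilinear(*us, xm, ym)
        v = bilinear(*vs, xm, ym)
        total += u * (yb - ya) - v * (xb - xa)
    return total


def flux_mimetic(n_segments):
    """Flux from edge-integrated values and lowest-order edge basis functions."""
    e_bottom = psi(1, 0) - psi(0, 0)  # integral along the edge y = 0
    e_top = psi(1, 1) - psi(0, 1)     # integral along the edge y = 1
    e_left = psi(0, 1) - psi(0, 0)    # integral along the edge x = 0
    e_right = psi(1, 1) - psi(1, 0)   # integral along the edge x = 1
    pts = path_points(n_segments)
    total = 0.0
    for (xa, ya), (xb, yb) in zip(pts, pts[1:]):
        xm, ym = 0.5 * (xa + xb), 0.5 * (ya + yb)
        # The interpolated form is linear along a straight segment, so the
        # midpoint rule integrates it exactly.
        total += (xb - xa) * (e_bottom * (1 - ym) + e_top * ym)
        total += (yb - ya) * (e_left * (1 - xm) + e_right * xm)
    return total
```

With 1, 2 and 4 segments the bilinear variant gives 1.0, 1.25 and 1.3125, so its answer drifts with the number of segments. The mimetic variant gives 1 in every case, up to rounding, because the segment contributions telescope to the difference of psi at the path end points. In this sketch that exactness relies on the end points being grid nodes.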