Abstract
One of the most challenging aspects of managing a very large data
warehouse is identifying how queries will behave before they start
executing. Yet knowing their performance characteristics - their
runtimes and resource usage - can solve two important problems. First,
every database vendor struggles with managing unexpectedly long-running
queries. When these long-running queries can be identified before
they start, they can be rejected or scheduled when they will not
cause extreme resource contention for the other queries in the system.
Second, deciding whether a system can complete a given workload in
a given time period (or a bigger system is necessary) depends on
knowing the resource requirements of the queries in that workload.
We have developed a system that uses machine learning to accurately
predict the performance metrics of database queries whose execution
times range from milliseconds to hours. For training and testing
our system, we used both real customer queries and queries generated
from an extended set of TPC-DS templates. The extensions mimic queries
that caused customer problems. We used these queries to compare how
accurately different techniques predict metrics such as elapsed time,
records used, disk I/Os, and message bytes.
The most promising technique was not only the most accurate, but also
predicted these metrics simultaneously and using only information
available prior to query execution. We validated the accuracy of
this machine learning technique on a number of HP Neoview configurations.
We were able to predict individual query elapsed time within 20%
of its actual time for 85% of the test queries. Most importantly,
we were able to correctly identify both the short and long-running
(up to two hour) queries to inform workload management and capacity
planning.
Users
Please
log in to take part in the discussion (add own reviews or comments).