Schedule

Using PostgreSQL statistics to optimize performance

Talk, July 16th

Using PostgreSQL statistics to optimize performance pdf Download

Statistical information collected by PostgreSQL produces a great impact on system performance. Knowing the data distribution statistics, the optimizer can estimate correctly the number of required rows, amount of memory needed and choose the fastest query execution plan. In certain rare cases, however, it makes a mistake, and this is when an intervention by a DBA is required.<br /> <br /> In addition to information about the distribution of data, PostgreSQL also gathers statistics about accesses to tables and indexes, function calls and even individual request calls by means of pg_stat_statements extension. This information, unlike the data distributions statistics, is of more use to system administrators than to database itself and aids in a great way for encountering and eliminating performance bottlenecks.<br /> <br /> In the talk, I will show how statistical information is collected, why is it so important and how to read and interpret it properly; what parameters can you tweak in specific cases, how to choose the optimal indexes and how to rewrite a query to correct a mistake of the scheduler.