Get query performance statistics with PostgreSQL


Kevin German

Originally published on kaleman.netlify.app


#postgres #database #sql

In this article, you will learn how to use some hidden PostgreSQL functions to get useful information about your queries running on PostgreSQL.

The problem

Have you ever tried to identify performance issues in your application? Maybe some of them are in the code (say, a map over thousands of items...), or maybe the problem comes from something else: poorly written SQL queries.

As a developer, sooner or later you will have to deal with SQL. And you will probably have to work with queries that other people wrote, or even queries that you yourself created in the past.

The problem is that without the right tools and information, it's very difficult to identify a slow query. Why?

Some queries are slower with more data

For example, consider a simple query that joins multiple tables. In your local environment, with probably 10 users, the query won't be slow (and if it is, it's much easier to spot!).

Some queries require an index

Indexing is probably the main cause of performance issues: both the absence of an index and its presence can cause problems. With a small dataset, you can't see whether a query needs an index or not. Worse (or better, depending on how you look at it), PostgreSQL may skip the index entirely if the dataset is small enough that a sequential (i.e., row-by-row) scan is cheaper.
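You can watch the planner make this choice with EXPLAIN. A minimal sketch, assuming a hypothetical users table with an index named idx_users_email on its email column:

-- On a tiny table the planner may prefer reading every row,
-- even though idx_users_email exists; on a large table the
-- same query should switch to an index scan.
explain analyze
select * from users where email = 'someone@example.com';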

If the problem only shows up in a production environment, it is very difficult to identify, and there is a high chance that the end user will discover it before you do.

This approach (waiting for the user to say the app is slow) is very reactive: you must wait for the problem to occur before working on a solution. But what if we could have that information before the problem occurs?

This scenario is why some PostgreSQL views exist. These maintenance views are a gold mine for developers who want to track the performance of their queries. Let's talk more about them!

The solution: PostgreSQL maintenance views

PostgreSQL has many views for this purpose. Some of them give us statistics about disk I/O and network usage. Others let us see replication statistics and the like. Here we'll talk about three views that can help you track down query problems: pg_stat_user_tables, pg_stat_user_indexes, and pg_stat_statements.
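If you're curious how many of these views your server exposes, you can list them straight from the catalog; a quick sketch using the standard pg_views catalog view:

-- every built-in statistics view starts with pg_stat
select viewname
from pg_catalog.pg_views
where viewname like 'pg_stat%'
order by viewname;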

pg_stat_user_tables

This view shows statistics about each table, per schema (there is one row per table), and provides information such as the number of sequential scans PostgreSQL has performed on the table, how many select/insert operations have been performed on it, and so on.
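To get a quick feel for the columns, you can peek at a single table's row; a sketch, assuming a hypothetical orders table:

-- seq_scan / seq_tup_read: sequential scans and the rows they read
-- idx_scan: index scans; n_tup_ins/upd/del: write activity
select relname, seq_scan, seq_tup_read, idx_scan,
       n_tup_ins, n_tup_upd, n_tup_del
from pg_stat_user_tables
where relname = 'orders';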


[Screenshot: sample pg_stat_user_tables output]

As you can see, for the first row there was 1 sequential scan, and that scan read 939 rows. There were 2 index scans, and they returned 2 rows. The numbers are low because I'm using a local database, but they should be much higher in a production database.

This view, besides all the other useful information, lets us answer something really interesting: which of my tables need an index? You can easily answer this question by looking at the seq_scan and seq_tup_read columns!

select schemaname,
       relname,
       seq_scan,
       seq_tup_read,
       seq_tup_read / seq_scan as avg,
       idx_scan
from pg_stat_user_tables
where seq_scan > 0
order by seq_tup_read desc
limit 25;

Running this query returns the following

[Screenshot: tables ranked by sequential scan activity]

As you can see, it's a good idea to add an index to these tables because they've recently been used in sequential scans. With more data and more execution time, this query gives you a good overview of how your tables are behaving.
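Once you've picked a candidate table and column, adding the index itself is a one-liner. A hypothetical example, assuming the orders table from before is frequently filtered by customer_id:

-- 'concurrently' builds the index without locking out writes;
-- note it cannot run inside a transaction block
create index concurrently idx_orders_customer_id on orders (customer_id);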

pg_stat_user_indexes

While adding indexes solves many problems, they're not the holy grail, and they come at a price: disk space. The results are good, yes, we all agree on that. But worse than having no index is having a useless one. Why? First, it takes up disk space on your database server; indexes on large tables can be very expensive and get very, very large. Second, the index has to be updated on every write to the table. Of course, recalculating a useless index is like paying for food you don't eat!

So if you add an index, make sure it makes sense.

But what if you're working on a code base and database schema that you didn't design? Is it the end of the world? Absolutely not! PostgreSQL views to the rescue! The pg_stat_user_indexes view can show you the frequency of use of your indexes, along with the space they occupy.

[Screenshot: sample pg_stat_user_indexes output]

As you can see in the image above, some of my primary keys haven't been used yet. But that doesn't give us many details, because we don't know how much space each index occupies! We can obtain this information by using the pg_relation_size function with the indexrelid from our results.


select schemaname,
       relname,
       indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) as size_idx,
       pg_size_pretty(sum(pg_relation_size(indexrelid)) over (order by idx_scan, indexrelid)) as total
from pg_stat_user_indexes
order by 6;

[Screenshot: index usage counts with index sizes]

The output of this query shows indexes that haven't been used in a while, along with their space consumption. This can give you an idea of which indexes to look at.

Note that the result of this query does not mean that you should drop all unused indexes. You should always investigate why the index is not in use before deleting it!
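If your investigation does conclude that an index is dead weight, it can be removed online. A sketch, reusing the hypothetical index name from earlier:

-- 'concurrently' avoids blocking other queries during the drop;
-- like its create counterpart, it cannot run inside a transaction block
drop index concurrently idx_orders_customer_id;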

pg_stat_statements

This is probably the most useful one. It's hard to understand why this view isn't enabled by default! It must be enabled in the PostgreSQL configuration before you can use it.

Enable it

To enable this view, we need to add it to the shared_preload_libraries list. Since I'm using Docker & Docker Compose to manage my database, I can just add an option to the start command so that it looks like this:

postgres:
  container_name: postgres
  image: postgres:10
  restart: always
  ports:
    - "5432:5432"
  environment:
    - POSTGRES_PASSWORD=${PG_PASSWORD:-postgres}
    - PGDATA=/var/lib/postgresql/data
  command:
    - "postgres"
    - "-c"
    - "shared_preload_libraries=pg_stat_statements"

After that, when you restart PostgreSQL, the library will be loaded along with the DBMS.
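You can confirm that the module was actually loaded after the restart:

show shared_preload_libraries;
-- the output should include pg_stat_statements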

create extension

After the library is loaded, you need to enable it as an extension. You can do this by running the following query:

create extension pg_stat_statements;

If this query doesn't return an error, you're done! Let's confirm this by running:

select * from pg_stat_statements;

[Screenshot: sample pg_stat_statements output]

From this view we can get very good information about the performance of our queries. For example, we have the number of calls made to a specific query, the mean_time of execution across all those calls, and even the stddev_time (standard deviation) of the calls, to see whether queries have a consistent execution time or how much they vary.

In this view, you can even see how many rows a query returned, whether those rows came from cache or disk, and so on!
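For example, the shared_blks_hit and shared_blks_read columns tell you how many blocks a query got from the buffer cache versus disk, so you can compute a per-query cache hit ratio; a sketch:

select substring(query, 1, 40) as query,
       calls,
       shared_blks_hit,
       shared_blks_read,
       -- nullif avoids division by zero for queries that touched no blocks
       round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_ratio
from pg_stat_statements
order by shared_blks_read desc
limit 10;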

With all this information, it's easy to get a list of the most expensive queries and understand why they're expensive.

select round((100 * total_time / sum(total_time) over ())::numeric, 2) as percent,
       round(total_time::numeric, 2) as total,
       calls,
       round(mean_time::numeric, 2) as mean,
       stddev_time,
       substring(query, 1, 40) as query
from pg_stat_statements
order by total_time desc
limit 10;

[Screenshot: top 10 most expensive queries]

With this query, you now have a list of the top 10 most expensive queries: how much total time they took, how often they were called, and how much their execution time varies.

That way you can keep track of which queries are taking the longest and try to fix them (or at least understand why they're working the way they do).
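Keep in mind that pg_stat_statements accumulates its numbers from the moment the extension was enabled, so after fixing a slow query you may want to start counting from zero again:

-- wipes the collected statistics so new measurements start fresh
select pg_stat_statements_reset();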

Conclusion

Using PostgreSQL to monitor PostgreSQL is very useful and can direct you to the right place to understand your application's performance and any issues you may be having.

I hope you enjoyed the article and learned something from it!

Note: This article was also published on my blog. Still trying to find a good domain name for it lol

