PostgreSQL: Monitor sequence scans with Perl

February 12, 2014

Missing indexes, especially on huge tables, can have a very negative impact on the duration of an SQL query. Without a usable index the query planner will choose a sequential scan, which means that the query reads through the table row by row to find the required data. When the table holds only 100 rows, you will probably not even notice the sequential scan, but when it holds 1,000,000 rows or more, adding suitable indexes will usually make searches much faster.

The example script uses a Storable state file and stores the statistics as a JSON object in a PostgreSQL database.

First let’s take a look at the query we will be executing:

SELECT schemaname, relname, seq_tup_read 
FROM pg_stat_all_tables 
WHERE seq_tup_read > '0'
      AND relname NOT LIKE 'pg_%'
ORDER BY seq_tup_read desc

As you can see, PostgreSQL exposes all the information we need about our tables in a single statistics view, called pg_stat_all_tables. Its seq_tup_read column holds the number of live rows fetched by sequential scans on each table, which is exactly what we want to watch.

Just reading out this information is not going to be enough, because the counters are cumulative: they keep counting from the last time the statistics were reset, which on a production database can be a long time ago. To see what happened recently, we have to compare the current values with those from a previous run (hence the Storable state file).
Our plan is to run the script from a cron job, every 5 minutes.
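
The comparison between two runs boils down to subtracting the previous counter from the current one, per table. A minimal sketch of that step, assuming two hash references keyed by "schemaname:relname"; the compute_increase helper and the sample numbers are made up for illustration and are not part of the script below:

```perl
use strict;
use warnings;

# Hypothetical helper: compare two snapshots of seq_tup_read counters.
# Both arguments are hash references keyed by "schemaname:relname".
sub compute_increase {
    my ($previous, $current) = @_;
    my %result;
    for my $key (keys %{$current}) {
        # Tables not seen in the previous run have no baseline yet
        next unless defined $previous->{$key};
        my $increase = $current->{$key} - $previous->{$key};
        # Percentage growth relative to the previous counter value
        my $pct = $previous->{$key} > 0 ? $increase / $previous->{$key} * 100 : 0;
        $result{$key} = { increase => $increase, pct => $pct };
    }
    return \%result;
}

# Made-up sample values for two consecutive runs
my $previous = { 'public:orders' => 1000, 'public:users' => 500 };
my $current  = { 'public:orders' => 1500, 'public:users' => 500, 'public:new' => 10 };

my $diff = compute_increase($previous, $current);
for my $key (sort keys %{$diff}) {
    printf "%s: +%d (%.1f%%)\n", $key, $diff->{$key}{increase}, $diff->{$key}{pct};
}
```

A table that shows up for the first time is skipped here because there is nothing to compare it with yet; it gets a baseline on the next run.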

The statistics are also stored as a JSON object in a database, so that we can keep a history of them and build a web interface on top of it at a later stage.
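
The table that receives the JSON is not shown in this post, so the following is only a guess at a workable layout. The table name mydbschema.seq_tup_read and the data column are taken from the INSERT statement in the script further down; the id and created_at columns and the stats_table_ddl helper are assumptions added here to make the history usable:

```perl
use strict;
use warnings;

# Assumed layout for the statistics table. Only the table name and the
# "data" column appear in the script; "id" and "created_at" are guesses.
sub stats_table_ddl {
    my ($table) = @_;
    return <<"EOF";
CREATE TABLE $table (
    id         serial PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    data       json NOT NULL
)
EOF
}

print stats_table_ddl('mydbschema.seq_tup_read');
```

The printed statement could be run once against the statistics database, for example through the do method of a DBI handle connected to it.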

Furthermore, the script calls setuid to become the postgres user (much like su - postgres on the command line), so that it can connect through the PostgreSQL UNIX socket. This only works when the script is started as root.

use strict;
use warnings;
use utf8;

use DBI;
use DateTime;
use POSIX qw/setuid/;
use Storable qw/retrieve nstore/;
use Text::ASCIITable;
use JSON;

my $db   = 'mydatabase';
if(scalar @ARGV){
    $db = shift @ARGV;
}

my $host = '/var/run/postgresql';
my $user = 'postgres';
my $pass = undef;

my $state_db   = 'database_statistics';
my $state_host = '192.168.1.1';
my $state_user = 'skeletor';
my $state_pass = 'he-manisawhimp';

my $state_file = '/var/tmp/sequence_read.state';

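# setuid to postgres (requires root privileges)
my $uid = getpwnam('postgres');
die "Unknown user 'postgres'\n" unless defined $uid;
setuid($uid) or die "Could not setuid to postgres: $!\n";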

# define and open up the state file
my $state = {};
$state = retrieve $state_file if -f $state_file;

my $now      = DateTime->now;

# Connect to the database which we want to monitor
my $dbh = DBI->connect("dbi:Pg:dbname=$db;host=$host", $user, $pass)
                or die "Could not connect to database: $DBI::errstr\n";

# Connect to the database that will be used to store the statistics
my $state_dbh = DBI->connect("dbi:Pg:dbname=$state_db;host=$state_host", $state_user, $state_pass)
                or die "Could not connect to the State database '$state_db': $DBI::errstr\n";

my $sql = <<EOF;
SELECT schemaname, relname, seq_tup_read
FROM pg_stat_all_tables
WHERE seq_tup_read > '0'
      AND relname NOT LIKE 'pg_%'
ORDER BY seq_tup_read desc
EOF

# Get the statistics
my $results = $dbh->selectall_arrayref( $sql, undef);

# Store the statistics as a JSON object in the second database
eval {
    # do() returns undef on failure unless RaiseError is set, so die explicitly
    $state_dbh->do('INSERT INTO mydbschema.seq_tup_read (data) VALUES(?)', undef, encode_json($results))
        or die $state_dbh->errstr;
};
if($@){
    print "Insert into state-db failed: $@\n";
}

# Prepare a nice ASCII table for output
my $t = Text::ASCIITable->new({ headingText => 'Seq Tup Read ' . $now->ymd('-') . ' ' . $now->hms(':')});
$t->setCols('Schema Name','Relation Name ', 'Seq Tup Read', 'Increase (delta)');

my $row_count = 0;
foreach my $r (@{$results}){
    last if $row_count > 25;

    my (@values) = (@{$r});
    my ($increase, $delta) = (0, 0);
    # State key for this table: "schemaname:relname"
    my $key = $r->[0].':'.$r->[1];
    # Calculate the increase and its delta
    if(defined $state->{last}{$key}{seq_tup_read}){
        $increase = $r->[2] - $state->{last}{$key}{seq_tup_read};
        $delta    = $increase / $state->{last}{$key}{seq_tup_read} * 100;
        my $str = sprintf '%.0f (%.4f %%)', $increase, $delta;
        push @values, ($str);
    }
    else {
        push @values, '0 (0%)';
    }
    # Store this information for the next run of the script
    $state->{last}{$key}{seq_tup_read} = $r->[2];
    $state->{last}{$key}{delta}        = $delta;
    $state->{last}{$key}{increase}     = $increase;

    # Only add the information to ASCII output table if there was an increase
    next unless $increase > 0;
    $t->addRow(@values);
    $row_count++;
}
# Print out the ASCII table
print $t;

nstore $state, $state_file;
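
The state file is an ordinary Storable dump, so it can be inspected outside the script as well. A small sketch of the store/retrieve round trip, using a temporary file instead of /var/tmp/sequence_read.state and made-up counter values:

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Use a throwaway file rather than the real state file
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Same shape as the script's state: {last}{"schema:table"}{...}
my $state = {
    last => {
        'public:orders' => { seq_tup_read => 1500, increase => 500, delta => 50 },
    },
};

# nstore writes in network byte order, so the file is portable across machines
nstore($state, $file);

# Read it back and list the stored counters
my $loaded = retrieve($file);
for my $key (sort keys %{ $loaded->{last} }) {
    printf "%s seq_tup_read=%d\n", $key, $loaded->{last}{$key}{seq_tup_read};
}
```

Pointing retrieve at the real state file in the same way shows the counters the script will use as the baseline for its next run.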

Johnny Morano Author

3 comments

Talk PostgreSQL says:
August 30, 2014 at 20:28

Nice, I was working on something like this but with bash scripts and a repository database. Your approach appears to be quicker to implement. I never thought of storing the data as a JSON object. More flexible than my approach. Thanks for posting.
Hendrik says:
February 13, 2015 at 09:25

Hi there,

Could you perhaps give a small schema for the destination/stats db?
