PostgreSQL: Monitor unused indexes

February 11, 2014

On large database systems with many tables and many indexes, it is easy to lose track of what is actually being used and what is just consuming disk space.
If indexes are not closely monitored, unused ones take up space and, moreover, cost CPU cycles, because every index has to be kept up to date when its table is written to.

Statistics about indexes can be easily retrieved from the PostgreSQL database system. All required information is available in two system relations (a statistics view and a catalog table):

pg_stat_user_indexes
pg_index

When joining these two relations, the interesting information is found in the following columns:

idx_scan: the number of index scans initiated on this index; 0 means the query planner has not used it since the statistics were last reset
idx_tup_read: the number of index entries returned by scans on this index
idx_tup_fetch: the number of live table rows fetched by simple index scans using this index

A neat function called pg_relation_size() returns the on-disk size of a relation, in this case the index.

Based on this information, the monitoring query is built as follows. Unique indexes are excluded because they enforce a constraint even when they are never scanned:

SELECT 
    relid::regclass AS table, 
    indexrelid::regclass AS index, 
    pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size, 
    idx_tup_read, 
    idx_tup_fetch, 
    idx_scan
FROM 
    pg_stat_user_indexes 
    JOIN pg_index USING (indexrelid) 
WHERE 
    idx_scan = 0 
    AND indisunique IS FALSE

Now, all we need to do is write a script which stores this information in some kind of file and periodically reports on the statistics.

First of all, we need a configuration file.
I’ve chosen YAML because it is so versatile.

It contains two important pieces of information:

The database credentials
The path to the state file

Example:

dsn: "dbi:Pg:host=/var/run/postgresql;database=testdb"
user: postgres
pass:
state_file: /var/tmp/monitor_unused_indexes.state

As you can see, we connect to the PostgreSQL database through its UNIX socket.

The script will use Text::ASCIITable to output the statistics in a nice table. Storable is used to save our statistics to disk.

The script below checks whether an index has been unused for a timespan of 30 days. If so, it reports this index to STDOUT.
To do that, we store a score and a timestamp for each unused index in the state file.
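
To make the shape of that state file concrete, here is a minimal sketch of the per-index score and timestamp, round-tripped through Storable. The index name and the values are made up for illustration, and the data is written to a temporary file rather than the configured state_file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Storable qw/nstore retrieve/;
use File::Temp qw/tempfile/;

# Hypothetical index name and values, for illustration only.
my $state = {
    unused_indexes => {
        'public.orders_created_idx' => {
            score     => 3,           # consecutive daily runs that saw idx_scan = 0
            first_hit => 1392076800,  # epoch seconds of the first such run
        },
    },
};

# Write to a throwaway file instead of the real state file.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

nstore $state, $path;         # nstore writes in network byte order
my $loaded = retrieve $path;  # returns a reference to the same structure

for my $idx (sort keys %{ $loaded->{unused_indexes} }) {
    my $entry = $loaded->{unused_indexes}{$idx};
    printf "%s score=%d first_hit=%d\n", $idx, $entry->{score}, $entry->{first_hit};
}
```

Keying the hash by index name keeps the lookup per query result row trivial.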

#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use DBI;
use Storable qw/nstore retrieve/;
use YAML qw/LoadFile/;
use POSIX qw/setuid/;
use Getopt::Long;
use DateTime;
use Text::ASCIITable;

my $cfg_file = './monitor_unused_indexes.yaml';
my $verbose = 0;
# GetOptions needs references to the variables it should fill in
GetOptions("cfg=s"     => \$cfg_file,
           "verbose|v" => \$verbose,
        );

my $sql = <<EOS;
SELECT 
    relid::regclass AS table, 
    indexrelid::regclass AS index, 
    pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size, 
    idx_tup_read, 
    idx_tup_fetch, 
    idx_scan
FROM 
    pg_stat_user_indexes 
    JOIN pg_index USING (indexrelid) 
WHERE 
    idx_scan = 0 
    AND indisunique IS FALSE
EOS

my ($cfg) = LoadFile($cfg_file);

# suid to postgres, or whatever user is configured in the config.yaml file
setuid(scalar getpwnam $cfg->{user});

# Connect to the database
my $dbh = DBI->connect($cfg->{dsn}, $cfg->{user}, $cfg->{pass})
            or die "Could not connect to database: $! (DBI ERROR: ".$DBI::errstr.")\n";

# Load the previous state, or start with an empty one on the first run
my $state = {};
if(-f $cfg->{state_file}){
    $state = retrieve $cfg->{state_file};
}

# Fetch the statistics
my $results = $dbh->selectall_arrayref( $sql, undef );

my $now_dt   = DateTime->now;

# Initialize the ASCII table
my $t = Text::ASCIITable->new({ headingText => 'INDEX STATISTICS'});
$t->setCols(qw/Table Index Index_Size idx_tup_read idx_tup_fetch idx_scan/);

# Analyze the results
my %seen;
foreach my $r (@$results){
    if($verbose){
        $t->addRow(@{$r});
    }
    # Only update the state if --verbose was not specified.
    # This way the script can be checked manually with --verbose many times and executed, for instance,
    # from a cronjob once a day without --verbose
    else {
        my $idx = $r->[1];
        $seen{$idx} = 1;
        if(defined $state->{unused_indexes}{$idx}){
            my $first_dt = DateTime->from_epoch( epoch => $state->{unused_indexes}{$idx}{first_hit} );
            # Compare full dates, not just the day of the month
            if($first_dt->add(days => $state->{unused_indexes}{$idx}{score})->ymd eq $now_dt->ymd ) {
                $state->{unused_indexes}{$idx}{score}++;
            }
            else {
                $state->{unused_indexes}{$idx}{score}     = 1;
                $state->{unused_indexes}{$idx}{first_hit} = $now_dt->epoch;
            }
        }
        else {
            $state->{unused_indexes}{$idx}{score}     = 1;
            $state->{unused_indexes}{$idx}{first_hit} = $now_dt->epoch;
        }
    }
}

# Forget indexes that no longer show up as unused (they have been scanned or dropped)
unless($verbose){
    foreach my $idx (keys %{ $state->{unused_indexes} || {} }){
        delete $state->{unused_indexes}{$idx} unless $seen{$idx};
    }
}

# Print out the statistics table, if --verbose was specified
print $t if $verbose;

# Store the statistics to disk in a state file (left untouched by --verbose runs)
nstore $state, $cfg->{state_file} unless $verbose;

foreach my $idx (sort keys %{ $state->{unused_indexes} }){
    my $first_dt = DateTime->from_epoch( epoch => $state->{unused_indexes}{$idx}{first_hit} );
    if( $first_dt->add(days => 30) <= $now_dt ){
        my $line = "Index: $idx ready for deletion";
        $line .= " (score: " . $state->{unused_indexes}{$idx}{score};
        $line .= " | first_hit: " . DateTime->from_epoch(epoch => $state->{unused_indexes}{$idx}{first_hit})->ymd . ")";

        print $line."\n" if $verbose;
    }
}
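
The scoring idea in the script (extend the streak only when a run lands on the expected next day, otherwise start over) can be looked at in isolation. The following is a rough sketch under a few assumptions: it uses only core modules instead of DateTime, it compares full calendar dates in UTC, and the helper names ymd_utc and update_entry are hypothetical and not part of the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX qw/strftime/;

# Calendar date (UTC) of an epoch timestamp, formatted as YYYY-MM-DD.
sub ymd_utc {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d', gmtime($epoch));
}

# Hypothetical helper: update one state entry for an index that is unused today.
# $entry is undef or { score => N, first_hit => EPOCH }; $now is epoch seconds.
sub update_entry {
    my ($entry, $now) = @_;
    my $day = 24 * 60 * 60;
    if ($entry && ymd_utc($entry->{first_hit} + $entry->{score} * $day) eq ymd_utc($now)) {
        # The run lands on the expected next day: extend the streak.
        return { score => $entry->{score} + 1, first_hit => $entry->{first_hit} };
    }
    # First sighting, a skipped day, or a second run on the same day: start over.
    return { score => 1, first_hit => $now };
}

my $t0 = 1392076800;                      # 2014-02-11 00:00:00 UTC
my $e  = update_entry(undef, $t0);        # first sighting
$e = update_entry($e, $t0 + 86400);       # run on the following day
print "score after two daily runs: $e->{score}\n";
$e = update_entry($e, $t0 + 4 * 86400);   # two days were skipped
print "score after a gap: $e->{score}\n";
```

Because the streak only grows when exactly one day has passed, this kind of rule fits a job that runs once a day, for instance from cron.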

Johnny Morano Author

3 comments

SK says:
November 16, 2016 at 22:55

Very Good Information. Thank you.
Rohan says:
July 29, 2017 at 09:10

Very Helpful article and script. Thanks for sharing.
Fabiano says:
August 4, 2018 at 12:28

Nice script! Thank you for sharing your idea and handson!

1) Why not only execute the query instead of the script to check index usage? If we use the script to create historic data about usage, can’t the pg_stat_user_indexes pg_index tables give us that information?

2) If I really need the script, what frequency it needs to be called?

Thank you!