Postgresql 9.3: Creating an index on a JSON attribute

December 27, 2013

Recently I’ve discovered some very interesting new features in the PostgreSQL 9.3 database.
First of all, a new data type has been introduced: JSON. Together with this new data type, new functions were also introduced.

These new features now simply for instance saving web forms in your Postgresql database. Or actually any kind of dynamic data, such as for instance Perl hashes. Plus, thanks to the new JSON functions, this data can be easily searched and indexed.

Let’s start with creating a test table.

CREATE SEQUENCE data_seq    
    START WITH 1    
    INCREMENT BY 1    
    NO MINVALUE    
    NO MAXVALUE    
    CACHE 1;

CREATE TABLE data (    
    id bigint DEFAULT nextval('data_seq'::regclass) NOT NULL,
    form_name TEXT,
    form_data JSON
);

I’ve inserted into this table 100k rows of test data with a very simple Perl script.

#!/usr/bin/perl
use strict;
use DBI;
use AnyEvent;
use AnyEvent::Util;
$AnyEvent::Util::MAX_FORKS = 25;

print "Inserting test data...\n";
my $cv = AnyEvent-&gt;condvar;
$cv-&gt;begin;
foreach my $d (0..100000){
    $cv-&gt;begin;
    fork_call {
        my($d) = @_;
        my $name = do{local $/; open my $c, '-|', 'pwgen -B -s -c1 64'; &lt;$c&gt;};
        chomp($name);
        my $dbh = DBI-&gt;connect("dbi:Pg:host=/var/run/postgresql;dbname=test;port=5432",'postgres', undef);
        $dbh-&gt;do(qq{insert into data (form_name,form_data) VALUES('test_form', '{"c":{"d":"ddddd"},"name":"$name","b":"bbbbb", "count":$d}')});
        $dbh-&gt;disconnect;
        return $d;
    } $d,
    sub {
        my ($count) = @_;
        print "$d ";
        $cv-&gt;end;
    }
} 
$cv-&gt;end;
$cv-&gt;recv;
print "\n\nDone\n";

Now let’s assume that the JSON data we are going to insert (or have inserted) always contains the attribute field ‘name’. On this attribute we will create the following database index:

CREATE INDEX ON data USING btree (form_name, json_extract_path_text(form_data,'name'));

The above example creates a multi-column index.

Now let’s a make our first test.
The first test will not use the index we have created previously.

EXPLAIN ANALYZE VERBOSE SELECT * FROM data WHERE form_name = 'test_form' AND form_data-&gt;&gt;'name' = 'cbcO5twuPnAYJ1VLV6gsEv9zWs2AbQxQ9PoALLr2w6Rwpr2PtoQHCCK0hyOMuIME';
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on data  (cost=0.00..4337.28 rows=500 width=102) (actual time=28.608..129.945 rows=1 loops=1)
   Filter: ((data.form_name = 'test_form'::text) AND ((data.form_data -&gt;&gt; 'name'::text) = 'cbcO5twuPnAYJ1VLV6gsEv9zWs2AbQxQ9PoALLr2w6Rwpr2PtoQHCCK0hyOMuIME'::text))
   Rows Removed by Filter: 100000
 Total runtime: 129.968 ms
(5 rows)

130ms for searching through 100k rows, is actually quite ok.

Now let’s see how we can speed up this query by using the index we’ve created.

EXPLAIN ANALYZE VERBOSE SELECT * FROM data WHERE form_name = 'test_form' AND json_extract_path_text(form_data,'name') = 'cbcO5twuPnAYJ1VLV6gsEv9zWs2AbQxQ9PoALLr2w6Rwpr2PtoQHCCK0hyOMuIME';
                                                                             QUERY PLAN                                                                                                
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using data_form_name_json_extract_path_text_idx on data  (cost=0.42..8.44 rows=1 width=102) (actual time=0.056..0.057 rows=1 loops=1)
   Index Cond: ((data.form_name = 'test_form'::text) AND (json_extract_path_text(data.form_data, VARIADIC '{name}'::text[]) = 'cbcO5twuPnAYJ1VLV6gsEv9zWs2AbQxQ9PoALLr2w6Rwpr2PtoQHCCK0hyOMuIME'::text))
 Total runtime: 0.084 ms
(4 rows)

0.084ms! That’s is about 1625 times faster! What makes this index extremely interesting is that the index has only been created on one attribute of the JSON data and not on the entire JSON data. This will keep the index data small and thus will be kept longer in your database’ memory.

Johnny Morano Author

Datatables and Perl (and a little bit of jQuery)

byJohnny Morano

October 9, 2013

Postgresql: Monitor unused indexes

byJohnny Morano

February 11, 2014

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Hand-Picked Top-Read Stories

A monitoring solution with Docker

Jenkins to manage a libvirt infrastructure with Terraform

Using multipath together with mdadm on Debian

Trending Tags

Postgresql 9.3: Creating an index on a JSON attribute

Leave a Reply Cancel reply

Previous Post

Datatables and Perl (and a little bit of jQuery)

Next Post

Postgresql: Monitor unused indexes

Postgresql 9.3: Creating an index on a JSON attribute

Leave a Reply Cancel reply

Previous Post

Next Post

Related Posts