Table of Contents

PostgreSQL Index Types

Not everything should be a B-Tree ;-)

https://postgresql.us/events/pgopen2019/sessions/session/647/slides/45/look-it-up.pdf

For example, 'bloom' indexes might be useful in case of dynamic filters like this:

WHERE picalike_brand='soliver' and picalike_gender='women' and shop_id='some_de_crawler'

The advantage over compound indexes is that you do not need to create one for every possible subset.

Take a Deeper Look

To analyze the values and distribution of a certain column in a relation:

\x
select * from pg_stats where tablename='new_filters' and attname='picalike_brand' limit 1;

Abuse FTS

When the goal is to use generic filters like

'a & b', 'a | b' or 'a - b'

and several columns are possible, it might be useful to use the ::tsvector, ::tsquery types. Those are mainly for full text search, but they allow to perform search queries like

'pg_women & pc_clothing__dresses'

in a very compact and efficient format.

But extra care must be taken, since to_tsvector transforms the data, so in our case, just pass a string to insert like that

'pg_women pg_clothing__dresses'::tsvector

. Some special characters like '&' and ':' are not allowed.

Finally, one can use

SELECT * FROM dingens where vector_column @@ 'pg_women & pc_clothing__dresses

to lookup everything that contains both filters.

Bloom

https://www.postgresql.org/docs/12/bloom.html

Trade-off: N btree indexes vs. 1 bloom index .. space versus time

Cluster Tables with an Index

To remove the clustering from a table

alter table oneshot set without cluster;

Reorganize a table by the order induced by some index

 cluster filters using picalike_category

Refresh a clustering

cluster filters

More info: https://www.postgresql.org/docs/12/sql-cluster.html

Include columns into a cluster

HINT: not possible with GIN indexes :-(

create index multi_meta on oneshot using gin ("multi") include ("metadata");

→ ERROR: access method “gin” does not support included columns