psql_sim_db
Similarity Database PostgreSQL
The goal is to extend the PostgreSQL server with functions that determine similarity for color and shape features.
- The first issue is that pg_similarity provides a Hamming distance but no Hellinger distance, so pg_similarity was patched to add it.
- Functions were marked as PARALLEL SAFE to allow parallel queries.
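For reference, the Hellinger distance between two discrete distributions p and q is sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2). The sketch below is not the pg_similarity patch itself. It is a plain SQL stand-in under the hypothetical name hellinger_ref, and it assumes that features are stored as float8[] histograms of equal length that each sum to 1. It also shows the keyword order PostgreSQL accepts for the marking (PARALLEL SAFE) and one way to check that the marking took effect.

```sql
-- Hypothetical pure-SQL reference, NOT the patched pg_similarity function.
-- Assumes both arrays are non-negative histograms of equal length;
-- unnest() pads the shorter array with NULLs, which sum() then ignores.
CREATE OR REPLACE FUNCTION hellinger_ref(p float8[], q float8[])
RETURNS float8
LANGUAGE sql
IMMUTABLE STRICT PARALLEL SAFE
AS $$
    SELECT sqrt(sum(power(sqrt(a) - sqrt(b), 2))) / sqrt(2::float8)
    FROM unnest(p, q) AS t(a, b)
$$;

-- A function that already exists can be marked after the fact.
ALTER FUNCTION hellinger_ref(float8[], float8[]) PARALLEL SAFE;

-- Check the marking: proparallel is 's' (safe), 'r' (restricted) or 'u' (unsafe).
SELECT proname, proparallel
FROM pg_proc
WHERE proname = 'hellinger_ref';

-- Sanity check: identical histograms should give 0, disjoint ones 1.
SELECT hellinger_ref('{0.5,0.5}', '{0.5,0.5}') AS same,
       hellinger_ref('{1,0}', '{0,1}')         AS disjoint;
```

A C implementation inside the extension will be much faster than this SQL version. The stand-in is only meant for cross-checking results on small inputs.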
Blockers:
- When a SQL query uses a common table expression to join in the reference product, no parallel query plan is created. If the features are provided as literal values instead, it works as expected. As a consequence, the reference product's features have to be fetched in a separate step first. That database lookup needs to be extremely fast, with practically no added latency, which in turn means each frontend needs a local database to query features from.
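The two query shapes described above might look roughly like this. The relation shop_mv, the columns product_id and features, the product id 42, and the function hellinger_ref are all hypothetical placeholders, not names taken from the actual setup. The PostgreSQL documentation lists scans of common table expressions as parallel restricted, which may explain the observed behaviour.

```sql
-- Shape 1: reference features joined in through a CTE.
-- Observed behaviour: no parallel plan is created.
EXPLAIN
WITH ref AS (
    SELECT features FROM shop_mv WHERE product_id = 42
)
SELECT s.product_id,
       hellinger_ref(s.features, ref.features) AS dist
FROM shop_mv AS s
CROSS JOIN ref
ORDER BY dist
LIMIT 20;

-- Shape 2, step 1: the frontend fetches the reference features first.
SELECT features FROM shop_mv WHERE product_id = 42;

-- Shape 2, step 2: the fetched features go back in as a literal value.
-- Observed behaviour: a parallel plan is created as expected.
-- The array literal here is made-up example data.
EXPLAIN
SELECT s.product_id,
       hellinger_ref(s.features, '{0.2,0.3,0.5}'::float8[]) AS dist
FROM shop_mv AS s
ORDER BY dist
LIMIT 20;
```

Shape 2 costs one extra round trip per request, which is why the lookup in step 1 has to be cheap.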
Indexing and performance remarks:
- Clustering the shop materialized view by category improves performance slightly.
- Only the categories index is used with parallel queries; the compound (categories, genders) index is not, so the categories index is faster.
- Adding included columns, or even building an index over all columns, did not trigger an index-only scan when the query calls pg_similarity functions.
- A genders index is not used when a parallel scan runs on the categories index.
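The index variants compared above could be set up along these lines. This is a sketch under assumptions: shop_mv, the index names, and the columns product_id and features are hypothetical, categories and genders are assumed to be plain text columns, and hellinger_ref stands in for the patched similarity function. The comments repeat the observations from the list; they are not guaranteed planner behaviour.

```sql
-- Single-column index: the one observed to be used with parallel plans.
CREATE INDEX shop_mv_categories_idx ON shop_mv (categories);

-- Compound index: observed not to be used with parallel plans.
CREATE INDEX shop_mv_cat_gen_idx ON shop_mv (categories, genders);

-- Covering index (INCLUDE needs PostgreSQL 11 or later): observed not to
-- trigger an index-only scan once similarity functions are called.
-- Large feature arrays may exceed the btree index row size limit.
CREATE INDEX shop_mv_cat_incl_idx
    ON shop_mv (categories) INCLUDE (product_id, features);

-- Physically order the materialized view by category.
CLUSTER shop_mv USING shop_mv_categories_idx;

-- Refresh statistics and the visibility map; index-only scans depend on the latter.
VACUUM ANALYZE shop_mv;

-- Inspect which index and which scan type the planner actually picks.
EXPLAIN (ANALYZE, BUFFERS)
SELECT s.product_id,
       hellinger_ref(s.features, '{0.2,0.3,0.5}'::float8[]) AS dist
FROM shop_mv AS s
WHERE s.categories = 'shoes'
  AND s.genders = 'women'
ORDER BY dist
LIMIT 20;
```

Comparing the EXPLAIN output with only one of the indexes present at a time makes it easier to attribute a plan change to a specific index.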
Keywords: sim psql postgres postgresql hamming divergence bignet shape color
psql_sim_db.txt · Last modified: 2024/04/11 14:23 by 127.0.0.1