Prefiltering and Postfiltering (since v0.4.0)
VectorChord provides flexible filtering mechanisms to improve vector search performance and accuracy through prefiltering and postfiltering strategies. These filtering approaches determine when query conditions (like WHERE clauses) are applied during the vector similarity search process.
Structure
- Prefiltering: Apply an additional filter before the search (original) step.
- Postfiltering: Use the PostgreSQL native filter after the search (original) step.
Configuration
You can control the filtering strategy using the vchordrq.prefilter setting:
-- Enable prefiltering (default: off)
SET vchordrq.prefilter = on;
-- Use postfiltering
SET vchordrq.prefilter = off;
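As a minimal sketch of how the setting combines with a filtered search, the query below assumes a hypothetical table `items` with a `category` column and an `embedding` column that already has a vchordrq index. The table, column names, and query vector are illustrative assumptions, not part of VectorChord.

```sql
-- Assumption: items(id, category, embedding) exists and embedding
-- is covered by a vchordrq index. All names here are hypothetical.

-- Enable prefiltering for the current session.
SET vchordrq.prefilter = on;

-- The WHERE clause is the filter; ORDER BY ... LIMIT is the vector search.
SELECT id
FROM items
WHERE category = 'books'
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
LIMIT 10;

-- Restore the default for the rest of the session.
RESET vchordrq.prefilter;
```

Scoping the change with SET and RESET keeps it local to one session, so other queries keep the default behavior.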
WARNING
For complex queries, such as join queries, whether prefiltering takes effect depends heavily on the planner/optimizer. Evaluate it carefully with experiments.
Performance Trade-offs
Use prefiltering when:
- Your filtering conditions are highly selective (eliminate many candidates)
- You are using a Rerank In Table index, where prefiltering can significantly reduce latency
- The filter is simple
Use postfiltering when:
- Your filtering conditions are less selective
- The filtering logic is complex and might benefit from having more candidates available
- The filter is a costly operation
Example | All rows | Selected rows | Select rate |
---|---|---|---|
A weakly selective filter | 1000 | 900 | 90% |
A moderately selective filter | 1000 | 300 | 30% |
A highly selective filter | 1000 | 10 | 1% |
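To place your own filter in the table above, you can measure its select rate directly. This is a rough sketch using standard PostgreSQL aggregates; the table `items` and the predicate `category = 'books'` are hypothetical stand-ins for your own data and filter.

```sql
-- Assumption: items and the category predicate are placeholders.
SELECT
    count(*) AS all_rows,
    count(*) FILTER (WHERE category = 'books') AS selected_rows,
    -- NULLIF avoids division by zero on an empty table.
    round(
        100.0 * count(*) FILTER (WHERE category = 'books')
        / NULLIF(count(*), 0),
        1
    ) AS select_rate_pct
FROM items;
```

A low `select_rate_pct` means the filter is highly selective, which is the case where prefiltering is more likely to help.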
WARNING
If your filter is not a pure function and has side effects, enabling vchordrq.prefilter could change its behavior.
Based on our experimental results, the QPS speedup at different select rates is as follows:
- 200% speedup at a select rate of 1%
- No significant speedup (5%) at a select rate of 10%
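Because the benefit depends on your data and on the query plan, it may be worth comparing both settings on your own workload. The sketch below runs the same query under each setting with PostgreSQL's EXPLAIN (ANALYZE); the table `items`, its columns, and the query vector are hypothetical. A single run measures latency rather than QPS, so repeat it several times before drawing conclusions.

```sql
-- Assumption: items(id, category, embedding) with a vchordrq index
-- on embedding. All names are hypothetical.

-- Baseline: postfiltering.
SET vchordrq.prefilter = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM items
WHERE category = 'books'
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
LIMIT 10;

-- Same query with prefiltering enabled.
SET vchordrq.prefilter = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM items
WHERE category = 'books'
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
LIMIT 10;

RESET vchordrq.prefilter;
```

Compare the reported execution times and buffer counts between the two plans, and check that both plans actually use the vector index.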
