I find MongoDB's ESR (Equality, Sort, Range) Guideline[0] quite helpful in that regard; it applies to SQL databases as well, since nearly all of them use B-trees too.
> Index keys correspond to document fields. In most cases, applying the ESR (Equality, Sort, Range) Guideline to arrange the index keys helps to create a more efficient compound index.
> Ensure that equality fields always come first. Applying equality to the leading field(s) of the compound index allows you to take advantage of the rest of the field values being in sorted order. Choose whether to use a sort or range field next based on your index's specific needs:
> * If avoiding in-memory sorts is critical, place sort fields before range fields (ESR)
> * If your range predicate in the query is very selective, then put it before sort fields (ERS)
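A minimal SQL sketch of that ordering, to make it concrete (my own illustration, not from the MongoDB docs; the orders table and its columns are made up):

    -- Equality (status), then Sort (created_at), then Range (amount)
    CREATE INDEX orders_esr_idx ON orders (status, created_at, amount);

    SELECT *
    FROM orders
    WHERE status = 'shipped'       -- E: equality on the leading column
      AND amount > 100             -- R: range predicate last
    ORDER BY created_at;           -- S: the index order satisfies the sort, no in-memory sort needed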
I am split on SQL. On one hand I love the declarative approach and the fact that I can improve the run-time complexity of my queries just by adding an index and leaving the queries as is. On the other hand, I hate how the run-time complexity of my queries can suddenly go from linear to quadratic when the statistics are not up to date and the query planner misjudges the number of rows returned by a complex sub-query, falling back to a sequential scan instead of an index lookup.
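In Postgres, a quick way to check whether stale statistics are behind such a flip is to compare estimated and actual row counts, then re-analyze (just a sketch, with made-up table names):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM orders
    WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU');
    -- If the planner's "rows=" estimates are far off the actual counts, refresh the stats:
    ANALYZE customers;
    ANALYZE orders;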
I would be interested in a query execution language where you are more explicit about the use of indexes and joins, and where a static analysis can calculate the worst-case run time complexity of a given query. Does anyone know if something like that exists?
[0] https://www.mongodb.com/docs/manual/tutorial/equality-sort-r...
atombender 13 hours ago [-]
I've had the same thought. I would be interested in a query language which allows me to write a query plan, including explicitly scanning indexes using specific methods.
I like the "plumbing on the outside" approach of databases like ClickHouse where you have to know the performance characteristics of the query execution engine to design performant schemas and queries.
The challenge here is that query planners are quite good at determining the best access path based on statistical information about the data. For example, "id = 1" should probably use a fast index scan because it's a point query with a single row, but "category = 42" might have a million rows with the value 42 and would be faster with a sequential scan. But the problem doesn't surface until you have that much data. Query planners adapt to scale, while a hard-coded query plan wouldn't. That's not even getting started on things like joins (where join side order matters a lot) and index scans on multiple columns.
I'm not aware of any systems that are like this. Some of the early database systems (ISAM/VSAM, hierarchical databases like CODASYL) were a bit like this, before the relational model allowed a single unified data paradigm to be expressed as a general-purpose query language.
nijave 12 hours ago [-]
What gets me is when it's faster to build an index and rerun the query from scratch than run the original.
nitwit005 7 hours ago [-]
That's less a SQL issue than a general issue with attempting to optimize based off of heuristics. Generally when "hints" are supported, that's what'll happen every time (barring the hint not being executable).
But, remember that a lot of users wanting to query things aren't going to be app developers. They'll be wanting to do a one off query or run some report. You'll tend to need an optimizer no matter what.
magicalhippo 24 hours ago [-]
We had a case where a single OR was a massive performance problem on MSSQL, but not at all on the Sybase SQLAnywhere we're migrating away from. Which one might consider slightly ironic given the origins of MSSQL...
Anyway, the solution was to manually rewrite the query as a UNION ALL of the two cases which was fast on both. I'm still annoyed though by the fact that MSSQL couldn't just have done that for me.
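The shape of that rewrite, roughly (a sketch with invented table and column names, not their actual query):

    -- Original: one OR across two differently-indexed columns
    SELECT * FROM orders
    WHERE billing_email = 'a@example.com' OR shipping_email = 'a@example.com';

    -- Rewritten: each branch can use its own index
    SELECT * FROM orders WHERE billing_email = 'a@example.com'
    UNION ALL
    SELECT * FROM orders WHERE shipping_email = 'a@example.com';
    -- (careful: rows matching both predicates come back twice - see the reply below)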
dspillett 13 hours ago [-]
One thing to be careful of with the UNION ALL method, is that if you have some rows that match more than one of the clauses in your set of ORs then you will have duplicate results to screen out. This won't happen if you are checking for multiple values in one field, obviously, but is something to be wary of when using this method to optimise kitchen sink queries more generally.
Slapping a DISTINCT in isn't the answer, if it was then you'd just use UNION instead of UNION ALL, because that often makes the query planner call for a lot of excess work to be done. I once found that wrapping the main kitchen sink query in a CTE and applying DISTINCT in the SELECT that calls it has the desired effect, but that is risky as it relies on undefined behaviour that may change at a later date and tank your performance. If the number of rows being returned is never going to be large, and your situation allows multiple statements, a safer option is to pull the data into a temporary table and SELECT from that with DISTINCT. Or you could de-dupe in the next layer instead of the DB, or just accept dupes if exact cardinality isn't important and your users aren't going to care about the occasional double-row (i.e. a simple quick-search feature).
And, of course, sometimes you want the duplicates, again maybe in the case of a quick search feature (where you split the results into categories and each duplicate of a row is likely in a different category, especially if the categories align with the search clauses that are being ORed).
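Another option, not in the list above, is to make the branches disjoint up front so UNION ALL can't produce duplicates in the first place (a sketch on the same invented columns as the example further up; IS DISTINCT FROM keeps rows with a NULL billing_email in the second branch):

    SELECT * FROM orders
    WHERE billing_email = 'a@example.com'
    UNION ALL
    SELECT * FROM orders
    WHERE shipping_email = 'a@example.com'
      AND billing_email IS DISTINCT FROM 'a@example.com';  -- excludes rows the first branch already returned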
magicalhippo 9 hours ago [-]
Good point. In the described case the OR terms were guaranteed to be disjoint, in a way the query planner could easily figure out. Which Sybase's planner did.
However for cases like described in the article, you'd need to handle that.
While I like CTEs, I've had more consistent luck with subqueries. They also compose more easily.
getnormality 23 hours ago [-]
I am very grateful for databases but I have so many stories of having to manhandle them into doing what seems like it should be obvious to a reasonable query optimizer. Writing one must be very hard.
nasretdinov 17 hours ago [-]
If an optimiser was as smart as a human it would take potentially minutes to come up with a (reasonably good) SQL execution plan for any non-trivial query :)
dspillett 13 hours ago [-]
> it would take potentially minutes
This is a key part of the problem, and something that people don't realise about query planners. The goal of the planner is not to find the best query plan no matter what, or even to find the best plan at all, it is instead to try to find a good enough plan quickly.
The QP's two constraints (find something good enough, do so very quickly) are often diametrically opposed. It must be an interesting bit of code to work on, especially as new features are slowly added to the query language.
nasretdinov 10 hours ago [-]
Yeah, exactly. You need to optimise for the overall query duration, including the optimiser itself, and obviously everyone's workload is different, so the right balance may as well not exist at all
hans_castorp 11 hours ago [-]
SQL Anywhere doesn't really have "Sybase" roots.
It started out as Watcom SQL, then after Watcom was acquired by PowerSoft, it was renamed to "SQL Anywhere".
After Sybase acquired PowerSoft it was later renamed to "Adaptive Server Anywhere". I think SAP renamed it back to "SQL Anywhere" after they acquired Sybase.
frollogaston 18 hours ago [-]
Postgres sometimes handles this for you, but I'm not sure exactly when it's able to do that, so I do UNION ALL.
swasheck 22 hours ago [-]
disjunctions and stats could be a pretty nasty combination in mssql. i think it got a bit better ca. 2016 with the CE updates, but i’ve had quite a few occurrences where the solutions were the union all approach
Glyptodon 1 days ago [-]
If optimization is really as simple as applying De Morgan's laws, surely it could be done within the query planner if that really is the main optimization switch? Or am I misreading this somehow?
Edit: I guess the main difference is that it's just calculating separate sets and then merging them, which isn't really DeMorgan's, but a calculation approach.
Sesse__ 1 days ago [-]
In most cases, the end result is a lot messier than it looks in this minimal example. Consider a case where the query is a 12-way join with a large IN list somewhere (which is essentially an OR); would it still look as attractive to duplicate that 12-way join a thousand times?
There _are_ optimizers that try to represent this kind of AND/OR expression as sets of distinct ranges in order to at least be able to consider whether a thousand different index scans are worth it or not, with rather limited success (it doesn't integrate all that well when you get more than one table, and getting cardinalities for that many index scans can be pretty expensive).
contravariant 1 days ago [-]
I'm not seeing De Morgan. I am seeing inclusion exclusion, which is just a neat trick all round. Highly recommend remembering it.
I imagine negative filters to be a bit inefficient as well, though maybe not for a simple count.
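Spelled out for the two-set case being discussed (a sketch; the table and columns are placeholders, not the article's actual schema):

    -- |A OR B| = |A| + |B| - |A AND B|
    SELECT (SELECT count(*) FROM users WHERE email        = 'x@example.com')
         + (SELECT count(*) FROM users WHERE backup_email = 'x@example.com')
         - (SELECT count(*) FROM users WHERE email        = 'x@example.com'
                                         AND backup_email = 'x@example.com') AS matching_rows;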
alterom 8 hours ago [-]
Inclusion-exclusion is just the generalization of De Morgan's law for more than 2 sets.
And the example shows exactly two sets.
So it's exactly De Morgan's law.
renhanxue 1 days ago [-]
The core issue the article is pointing to is that most database indexes are B-trees, so if you have a predicate of the form (col_a = 'foo' OR col_b = 'foo'), then it is impossible to use a single B-tree lookup to find all rows that match the predicate. You'd have to do two lookups and then merge the sets. Some query optimizers can do that, or at least things that are similar in spirit (e.g. Postgres bitmap index scan), but it's much more expensive than a regular index lookup.
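In Postgres terms that looks roughly like this (a sketch; table and index names invented):

    CREATE INDEX t_col_a_idx ON t (col_a);
    CREATE INDEX t_col_b_idx ON t (col_b);

    EXPLAIN SELECT * FROM t WHERE col_a = 'foo' OR col_b = 'foo';
    -- Typically planned as a BitmapOr over two Bitmap Index Scans (one per index):
    -- two lookups whose results are merged, which works but is heavier than a
    -- single straight index scan.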
jiggawatts 1 days ago [-]
> but it's much more expensive than a regular index lookup.
It doesn't have to be, it just "is" in some database engines for various historical reasons.
I.e.: PostgreSQL 18 is the first version to support "B-tree Skip Scan" operators: https://neon.com/postgresql/postgresql-18/skip-scan-btree
Other database engines are capable of this kind of thing to various degrees.
zbentley 11 hours ago [-]
The linked article’s optimization applies to compound index queries, not “OR” condition optimization.
Unrelated or not, skip-scan will be useful in some cases. However, the cases where it adds noticeable benefit are the cases where a separate index should have been used for leading columns anyway (and in memory-constrained/frequently-cold-cache situations, a separate index might even be faster). If you can’t even begin to guess at the order of magnitude of cardinality, or if your leading-column-lacking queries are quite rare and not that perf-sensitive on a big table exerting index cache pressure, then skip scans make sense.
jiggawatts 2 hours ago [-]
Skip-scan or similar code can solve the problem if you have a compound index on both columns.
The query planner can include the entire matching range for the 'A' column and check the 'B' column for matches via skip-scan.
This is possible in principle, but I don't believe most (any?) database engines use this specific approach.
It could be optimal if the results are required in A,B sorted order.
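A sketch of that hypothetical strategy with a single compound index (to be clear, this is the idea described above, not something I can point to in a shipping engine):

    CREATE INDEX t_a_b_idx ON t (a, b);

    -- For: SELECT * FROM t WHERE a = 'foo' OR b = 'foo';
    --   1. ordinary index range scan of the entries where a = 'foo'
    --   2. skip-scan: for each distinct value of a, seek to (a, b = 'foo') and check b
    -- The output naturally comes back in (a, b) order, which is why this could win
    -- when that ordering is required anyway.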
renhanxue 11 hours ago [-]
B-tree skip scan does not address the problem in the article at all.
cryptonector 21 hours ago [-]
SQLite3 turns ORs into UNIONs.
ahoka 17 hours ago [-]
I think creating alternative plans is easy, knowing which one will be the fastest is the hard part.
ntonozzi 1 days ago [-]
I really like the extension pattern. I wish more of the tables at my company used it.
Another huge benefit that we're realizing as we move some of our heaviest tables to this pattern is that it makes it really easy to index a core set of fields in Elasticsearch, along with a tag to associate it with a particular domain model. This has drastically cut down on our search-specific denormalization and let us avoid expensive index schema updates.
Not sure on which Postgres version this was tested with, but the first example runs in about 2ms with my Postgres 17 installation ("cold cache"). It uses a BitmapOr on the two defined indexes.
https://notebin.de/?5ff1d00b292e1cd5#AU4Gg8hnY6RAmS9LoZ18xWn...
This used the setup.sql from the linked GitHub repository.
ethanseal 10 hours ago [-]
When you say cold cache, did you clear the os page cache as well as the postgres buffercaches? After setup.sql, the cache will be warmish - I get 4ms on the first run. I'm using postgres 17.5
See https://github.com/ethan-seal/ors_expensive/blob/main/benchm... where I use dd to clear the os page cache.
I did not explicitly evict the Postgres buffer cache, but using pg_buffercache to evict all buffers for the table, yields a runtime of 23ms for me (still going for the BitmapOr).
This article by pganalyze talks about it: https://pganalyze.com/blog/5mins-postgres-17-pg-buffercache-...
https://notebin.de/?ac3fcf55e6850f47#ERXndRrqp3X4zEWX5EC3dZU...
Which plan does Postgres choose in your case that results in 100ms?
Given the buffer reads seem close to yours, I believe it's page cache.
prein 1 days ago [-]
This sort of thing is why looking at generated SQL while developing instead of just trusting the ORM to write good queries is so important.
I find query planning (and databases in general) to be very difficult to reason about, basically magic. Does anyone have some recommended reading or advice?
It's very readable - I always ask new hires and interns to read it.
progmetaldev 21 hours ago [-]
This website taught me a ton, even after I thought I knew more than enough about performance. Just seeing how different databases generate and execute their SQL is a huge boon (and sometimes extremely surprising when looking at one DBMS to another).
grandfugue 12 hours ago [-]
ORM often produces horrible queries that are impossible for humans to digest. I think there are two factors. First, queries are constructed incrementally and mechanically. There is no overview for the generator to understand what developers want to compute, and no channel for developers to specify the intention. I anticipate this will change w/ AI in the near future. Second, ORM models data following the dogmatic data normalization, on which queries are destined to be horrible. I believe that people should take a moment to view their data, think what computations they want to do on top, estimate how expensive they may be, and finally settle on a reasonable model overall. Ask the ORM (or maybe AI) to help with constructing and sending queries and assembling results. But do not delegate data modeling out. With the right data modeling that fits the computations, queries can't be that bad.
progmetaldev 21 hours ago [-]
If you are looking to squeeze every ounce of performance from your entire application stack, I'd say you should be looking at everything your ORM produces. The ORM is basically to speed up your developers time to production, but most ORMs will have some cases where they generate terrible SQL, and you can usually run your own SQL in a stored procedure if the generated SQL is sub-optimal. I've done this quite a few times with Microsoft's Entity Framework, but as new versions come out, it's become less common for me to have to do this. Usually I need to drop to a stored procedure for code that allows searching a large number of columns, in addition to sorting on all the columns that display. I also use stored procedures for multi-table joins with a WHERE clause, when using Entity Framework. You still need to look at your generated queries, but the code is nothing like it used to be under Entity Framework under the .NET Framework (at least in my experience - YMMV - you should never just let your ORM create SQL without reviewing what it is coming up with).
parshimers 1 days ago [-]
https://pages.cs.wisc.edu/~dbbook/ is a great overview, specifically chapters 13 and 14 on query optimization. it is difficult to reason about though, and every compiler is different. it takes time and enough examples to look at a query and have an intuition for what the plan should look like, and if there's something the compiler is not handling well.
jalk 1 days ago [-]
It's a big help if you know how to retrieve and interpret execution plans for the database you use.
hobs 1 days ago [-]
Yes, I was going to say, seeing the generated SQL can be almost useless depending on the execution plan.
When you have a solid view of the schema and data sizes you can start to be more predictive about what your code will actually do, THEN you can layer on the complexity of the ORM hell code.
tehjoker 23 hours ago [-]
Query Planners were considered "AI", at least among some folks, back in the day just FYI
VladVladikoff 11 hours ago [-]
I absolutely adore LLMs for SQL help. I’m no spring chicken with SQL but so many times now I’ve taken a poorly optimized query, run it with ‘explain’ in front of it, and dumped it into an LLM asking to improve performance, and the results are really great! Performance vastly improved and I have yet to see it make a single mistake.
datadrivenangel 7 hours ago [-]
And the nice thing about this is that a SQL query can easily be tested to see if optimizations change the outputs!
Animats 1 days ago [-]
The query optimizer knows how many items are in each index, but has no advance idea how many items will be in the result of a JOIN. An "a OR b" query on a table with millions of rows might have three hits on A, or millions of hits.
The optimal query strategy for the two cases is very different.
Has anyone put machine learning in an SQL query optimizer yet?
cogman10 1 days ago [-]
> Has anyone put machine learning in an SQL query optimizer yet?
Yes, I think everyone has? At very least I know that MSSQL has because we semi regularly run into problems with it :).
MSSQL keeps track of query statistics and uses those in future planning. SOMETIMES it just so happens that the optimization for the general case makes the outlier 100x slower which kills general performance.
cameronh90 21 hours ago [-]
For a while I was maintaining software that supported both MSSQL and PGSQL, and I found that, when comparing like-for-like without DB-specific tuning, MSSQL produced better query plans on average. On a database without write contention, I'd often see 30% better performance on MSSQL.
However, it was also much more likely to hit an AWFUL pathological case which completely wrecked the performance as you describe. Combined with pessimistic locking, we ended up with far more overnight support calls from the MSSQL backend than from the PGSQL backend. Usually because it suddenly decided to switch query plan at 1AM.
I wonder if there's a trade-off where an optimizer produces better average query plans but worse outliers.
Sesse__ 14 hours ago [-]
> I wonder if there's a trade-off where an optimizer produces better average query plans but worse outliers.
There is. A (now) classic case is running a nested loop with a sequential scan on the inner side. Works great if the outer side produces few rows as expected, a disaster if there's a misestimation and you have a thousand rows there.
The MSSQL and Postgres planners are different enough that you can't really say it's about one thing, though (MSSQL is one of the few implementations of the Cascades framework in a sort-of hybrid AFAIK, Postgres is straight-up System R).
fabian2k 17 hours ago [-]
I think the big difference is that PostgreSQL doesn't cache query plans at all by default. Only if you use prepared statements. My understanding is that MSSQL does heavily cache them.
It sounds plausible to me that caching would often lead to significant performance improvements overall, but trigger bad plans much more often since the plans are not re-evaluated based on the statistics of each single query. So in Postgres you'd get individual queries with pathological performance when the statistics are off, in MSSQL all executions of that query have bad performance until the plan is re-evaluated again.
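For reference, the Postgres behaviour being described looks like this (a sketch; the statement name and table are invented):

    PREPARE recent_orders (int) AS
      SELECT * FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT 10;

    EXECUTE recent_orders(42);
    -- The first five executions are planned per call ("custom plans"); after that the
    -- planner may switch to a cached generic plan, which is where a bad plan can stick.
    -- Since v12 this can be overridden per session:
    SET plan_cache_mode = force_custom_plan;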
magicalhippo 18 hours ago [-]
We're experiencing the same with MSSQL, and for our most important queries have started adding a constant-valued dummy column to the SELECT section whose value changes every few minutes. Essentially an integer equal to UNIX time divided by 600 or similar.
That way a cached bad plan can't cause issues for more than a few minutes, which is acceptable for our use-case.
It's a sledgehammer but it was easy to add and it works.
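Roughly what that looks like (my own sketch of the trick described above, not their actual code; the bucket value is interpolated as a literal by the application so the query text itself changes every ten minutes):

    -- application computes bucket = floor(unix_time / 600) and bakes it into the SQL text
    SELECT 2923411 AS plan_bucket,   -- changes every 600s, so a bad cached plan lives ~10 minutes at most
           o.id, o.total
    FROM   orders o
    WHERE  o.status = 'open';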
zbentley 11 hours ago [-]
In Oracle many years ago, we did the same thing by prepending a SQL comment to the query string. You’d think that the plan cacher normalizes queries first, but I guess not.
That might work in your case as well, without requiring modifications in logic to support the dummy field?
jayd16 23 hours ago [-]
At 100x, it seems like you could run both optimal strategies every time, let them race, and still come out way ahead.
cogman10 23 hours ago [-]
You double (or more) the IO and potentially the CPU time, which is why this isn't done. It's also not always 100x, that just happens often enough. Sometimes it's only 2x or 1.5x. It's impossible to know which situation you are in, and the hard thing is the outliers will be slow either way.
ethanseal 1 days ago [-]
I think having a way to build statistics on the join itself would be helpful for this. Similar to how extended statistics^1 can help when column distributions aren't independent of each other.
But this may require some basic materialized views, which postgres doesn't really have.
[1]: https://www.postgresql.org/docs/current/planner-stats.html#P...
In order to keep it up to date, the developer has to tell postgres to refresh the data and postgres will do all the work from scratch.
Incremental Materialized views are _hard_. This^2 article goes through how Materialize does it.
MSSQL does it really well from what I understand. They only have a few restrictions, though I've never used a MSSQL materialized view in production.^3
[1]: https://www.postgresql.org/docs/current/rules-materializedvi... [2]: https://www.scattered-thoughts.net/writing/materialize-decor... [3]: https://learn.microsoft.com/en-us/sql/t-sql/statements/creat...
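For anyone unfamiliar, the extended statistics mentioned above look like this in Postgres (a minimal sketch; the table and columns are invented):

    -- Tell the planner that city and zip are correlated rather than independent
    CREATE STATISTICS orders_city_zip_stats (dependencies, ndistinct)
      ON city, zip FROM orders;
    ANALYZE orders;   -- populates the new statistics object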
As somebody who implemented manual incremental materialized tables using triggers, yeah it's pretty dang hard to make sure you get all the edge cases in which the data can mutate.
throwaway81523 1 days ago [-]
Some traditional planners try some different plans on random subsets of the data to see which plan works best. Don't need machine learning, just Bayes' rule.
Sesse__ 1 days ago [-]
> The query optimizer knows how many items are in each index, but has no advance idea how many items will be in the result of a JOIN.
Query optimizers definitely try to estimate cardinalities of joins. It's a really, really hard problem, but the typical estimate is _much_ better than “eh, no idea”.
singron 1 days ago [-]
There are many papers on ML for query planners. You can search for "Learned Query Optimization". Some use ML just for the cardinality estimation.
jalk 1 days ago [-]
Can we agree that this only applies to queries where all the filter conditions use cols with indexes? If no indexes can be used, a single full table scan with OR surely is faster than multiple full table scans.
ethanseal 1 days ago [-]
Absolutely. Though I don't recall seeing multiple sequential scans without a self-join or subquery. A basic filter within a sequential scan/loop is the most naive/simplest way of performing queries like these, so postgres falls back to that.
Also, fwiw, BitmapOr is only used with indexes: https://pganalyze.com/docs/explain/other-nodes/bitmap-or.
jalk 1 days ago [-]
That was the extreme case - you'd get the multi-scan if a casual reader tried your (neat, by all means) AND-only query on a non-indexed table (or a partially indexed one, for that matter).
ethanseal 1 days ago [-]
Gotcha, I misunderstood your comment. The multiple counts are definitely a very contrived example to demonstrate the overhead of BitmapOr and the general risk of sequential scans.
galkk 1 days ago [-]
I strongly dislike the way the problem is presented and the "solution" is promoted. The author mentions a merge join with a count on top, and if the database supports index merges, it can be extremely efficient in the described scenario. There are a lot of real optimizations that can be baked into such merges that the author chooses to ignore.
The generalized guidance without even mentioning a database server as a baseline, without showing plans and/or checking if the database can be guided, is just bad teaching.
It looks like author discovered a trick and tries to hammer everything into it by overgeneralizing.
ethanseal 1 days ago [-]
For sure, there's definitely a lot of cool techniques (and I'm not aware of all of them)! And the first example is very much contrived to show a small example.
I'm not super familiar with the term index merge - this seems to be the term for a BitmapOr/BitmapAnd? Is there another optimization I'm missing?
The article links to my code for my timings here: https://github.com/ethan-seal/ors_expensive
There is an optimization that went in the new release of PostgreSQL I'm excited about that may affect this - I'm not sure. See https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...
> I'm not super familiar with the term index merge - this seems to be the term for a BitmapOr/BitmapAnd?
Different databases will use similar terms for different operations, but I would guess that the comment refers to something similar to MySQL's index merge (which is essentially reading the row IDs of all the relevant ranges, then deduplicating them, then doing the final scan; it's similar to but less flexible than Postgres' BitmapOr).
ethanseal 1 days ago [-]
Cool. I'll have to read up on that.
ape4 1 days ago [-]
The OR query certainly states user intent better than the rewrite. So it's better at communicating with a future programmer.
sgarland 1 days ago [-]
So add a comment. “X is easier to understand” shouldn’t mean taking a 100x performance hit.
sgarland 1 days ago [-]
As an aside, MySQL will optimize WHERE IN better than OR, assuming that the predicates are all constants, and not JSON. Specifically, it sorts the IN array and uses binary search.
That said, I’m not sure if it would have any impact on this specific query; I’d need to test.
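For clarity, that's the single-column, constant-list case (a sketch; table and column names invented):

    -- OR chain on one column...
    SELECT * FROM orders WHERE status = 'new' OR status = 'paid' OR status = 'shipped';
    -- ...vs. the IN form, where MySQL sorts the list once and binary-searches it per row:
    SELECT * FROM orders WHERE status IN ('new', 'paid', 'shipped');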
renhanxue 1 days ago [-]
The article is specifically discussing cases where you have predicates on different columns OR'ed together, like col_a = 'foo' OR col_b = 'foo'.
sgarland 1 days ago [-]
Oof, misread that.
cyanydeez 1 days ago [-]
Yeah, just watch Community. Every OR splits the universe, duh.
to11mtm 1 days ago [-]
If ORs are expensive you've probably got a poor table design or the ORs are for some form of nasty query that is more UI/Dashboard based.
A good table has the right indexes, and a good API to deal with the table is using the RIGHT indexes in its criteria to get a good result.