Regression for queries with BNL plans
Product | Affected Versions | Related Issues | Fixed In |
---|---|---|---|
YSQL | v2024.1, v2024.2, v2025.1 | #28222 | v2024.2.5, v2025.1.1 |
Description
After upgrading YugabyteDB from version 2.20 to 2024.1, 2024.2, or 2025.1, a regression may be observed in query performnace. This issue arises when the query planner starts to choose Hash join or Merge join over Batch nested loop join (BNL).
This change in plan selection occurs under these specific conditions:
- BNL join is enabled in v2.20.
- The Cost-based optimizer (CBO) is not enabled.
As part of enabling BNL join by default, BNL plan cost was adjusted which caused the optimizer to incorrectly prefer Hash join and Merge join to be chosen instead of BNL in specific scenarios. This may lead to performance regression if the execution plan with BNL is significantly faster than Hash join or Merge join.
Mitigation
-
Execute the specific query which has regressed with plan hints with any of the following options:
-
Disable all the joins except BNL.
/*+ Set(enable_nestloop off) Set(enable_hashjoin off) Set(enable_mergejoin off) */
-
Set the BNL hint to force the use of BNL.
/*+ YbBatchedNL (t1 t2)*/
This requires tracking of the query plans if they were using BNL before the upgrade.
-
-
Enable cost-based Optimizer (CBO). This approach also requires running Analyze on the tables. Enabling CBO can change the query plans for various other queries.
-
If you upgrade to version 2024.2.5 or later, then set the following flags to enable legacy BNL mode:
- If your cluster has
yb_enable_optimizer_statistics = off
, setyb_enable_cbo = legacy_bnl_mode
. - If your cluster has
yb_enable_optimizer_statistics = on
, setyb_enable_cbo = legacy_stats_bnl_mode
.
- If your cluster has
Details
Example
This example illustrates how a query plan changes after an upgrade, leading to a potential performance regression.
-
Set up the tables:
drop table if exists t1, t2; create table t1 (a int, b int); insert into t1 select i, i from generate_series(1, 100) i; create table t2 (k1 int, k2 int, k3 int, k4 int, v int, primary key (k1 asc)); create unique index i_t2_k2 on t2 (k2); create index i_t2_k3 on t2 (k3); insert into t2 select i, i, (i / 10) * 10, i % 10, i from generate_series(1, 1000000) i;
-
Default plan on v2.20.9 (with BNL enabled and CBO disabled):
explain (summary off) select * from t1 join t2 on t1.a = t2.k3;
QUERY PLAN -------------------------------------------------------------------------- YB Batched Nested Loop Join (cost=0.00..105.32 rows=5000 width=28) Join Filter: (t1.a = t2.k3) -> Seq Scan on t1 (cost=0.00..100.00 rows=1000 width=8) -> Index Scan using i_t2_k3 on t2 (cost=0.00..5.20 rows=10 width=20) Index Cond: (k3 = ANY (ARRAY[t1.a, $1, $2, ..., $1023])) (5 rows)
-
Default plan on v2024.2 (with BNL enabled and CBO disabled):
explain (summary off) select * from t1 join t2 on t1.a = t2.k3;
QUERY PLAN -------------------------------------------------------------------- Merge Join (cost=299.66..379.66 rows=5000 width=28) Merge Cond: (t1.a = t2.k3) -> Sort (cost=149.83..152.33 rows=1000 width=8) Sort Key: t1.a -> Seq Scan on t1 (cost=0.00..100.00 rows=1000 width=8) -> Sort (cost=149.83..152.33 rows=1000 width=20) Sort Key: t2.k3 -> Seq Scan on t2 (cost=0.00..100.00 rows=1000 width=20) (8 rows)
After the upgrade, the same query chooses a merge join plan. Note that depending on the query, the BNL join may also be replaced by a hash join plan.
How to enable BNL
BNL is enabled by setting any of the following:
-
The ysql_yb_bnl_batch_size flag to a value greater than 1.
-
The yb_bnl_batch_size configuration parameter using ysql_pg_conf_csv.
How to enable CBO
CBO is enabled by configuring the following flags:
-
Prior to version 2024.2.4, set the configuration parameter yb_enable_base_scans_cost_model or the flag ysql_yb_enable_base_scans_cost_model to ON.
-
For version 2024.2.4 and later, set the configuration parameter yb_enable_cbo or the flag ysql_yb_enable_cbo to ON.