What's new in the YugabyteDB v2.20 LTS release series
Release announcements
Release notes
LTS - Feature availability
All features in stable releases are considered to be GA unless marked otherwise.
What follows are the release notes for the YugabyteDB v2.20 release series. Content will be added as new notable features and changes are available in the patch releases of the YugabyteDB v2.20 release series.
For an RSS feed of all release series to track the latest product updates, point your feed reader to the RSS feed for releases.
Technical Advisories
v2.20.3.0 - April 9, 2024
Build: 2.20.3.0-b68
Third-party licenses: YugabyteDB, YugabyteDB Anywhere
Downloads
Docker
docker pull yugabytedb/yugabyte:2.20.3.0-b68
New feature
- Roll back upgrades. Ability to roll back a database upgrade in-place and restore the cluster to its state before the upgrade. You can roll back a database upgrade only to the pre-upgrade release. EA
- Added support for Red Hat Enterprise Linux 9.3 on x86-based systems. Refer to Operating system support for the complete list of supported operating systems.
Improvements
YSQL
- Alters temporary namespace naming in YB to pg_temp_<tserver-uuid>_<backend_id> from pg_temp_<backend_id>, making them unique across nodes and preventing temp tables overwriting or deletion. #19255
- Treats REFRESH MATERIALIZED VIEW as a non-disruptive change, preventing unnecessary transaction terminations. The default option, REFRESH MATERIALIZED VIEW NONCONCURRENTLY, modifies metadata but without making a disruptive alteration. #20420
- Displays distinct prefix keys explicitly in the explain output, enhancing the clarity of indexing for users. #20831
- Allows correct initialization and propagation of ybDataSent and ybDataSentForCurrQuery flags for both parent and child transactions, for enhanced error handling and retry mechanism. #18638
- Optimizes the get_tablespace_distance function, enhancing the speed of the yb_is_local_table YSQL function. Reduces query time by caching GeolocationDistance value. #20860
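The temporary namespace change in #19255 can be illustrated with a minimal sketch. The helper name and the sample UUID strings are hypothetical and only mirror the two name patterns quoted above; this is not YugabyteDB code.

```python
from typing import Optional


def temp_namespace_name(backend_id: int, tserver_uuid: Optional[str] = None) -> str:
    """Build a temp namespace name following the patterns in the release note.

    Without a tserver UUID this mimics the old pg_temp_<backend_id> form;
    with one it mimics the new pg_temp_<tserver-uuid>_<backend_id> form.
    """
    if tserver_uuid is None:
        return f"pg_temp_{backend_id}"
    return f"pg_temp_{tserver_uuid}_{backend_id}"


# Two backends on different nodes can share the same backend id.
old_names = {temp_namespace_name(7), temp_namespace_name(7)}
new_names = {
    temp_namespace_name(7, "uuid-a"),  # hypothetical tserver UUIDs
    temp_namespace_name(7, "uuid-b"),
}
```

With the old pattern both backends map to one name, so their temporary tables could overwrite each other; with the new pattern the names stay distinct.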
DocDB
- Limits the number of rows returned per transaction per tablet in pg_locks to avoid potential memory issues during batch inserts, and includes additional fields to indicate partial lock info. #20765
- Introduces a new YSQL configuration parameter yb_locks_txn_locks_per_tablet to limit the number of rows returned by pg_locks, preventing the system from running out of memory during large transactions. #19934
- Introduces a new metric for untracked memory which eliminates the need to hardcode child memory trackers. Now users can effortlessly monitor untracked memory for better resource management. #18683
- Limits the number of tablets per node, and hastens reaching the desired number of tablets by lowering the values of FLAGS_tablet_split_low_phase_shard_count_per_node to 1 and FLAGS_tablet_split_low_phase_size_threshold_bytes to 128_MB. #20579
- Adds a new retrying master-to-master task, allowing for the API AreNodesSafeToTakeDown to check if it's safe to remove or upgrade certain nodes without disrupting overall cluster health. #17562
- Introduces AreNodesSafeToTakeDown API that ensures safe node removals during cluster upgrades or maintenance operations by checking tablet health and follower lag, facilitating seamless and risk-free updates. #17562
- Enables monitoring of master leader heartbeat delays through a new RPC in the MasterAdmin, ensuring undesired lags can be readily detected and mitigated. #18788
- Introduces two new metrics to alert users when CreateTable function tries to exceed the tablet count limits and to highlight when a tablet server is managing more tablets than its capacity allows. #20668
- Adjusts the tablet guardrail mechanism to wait for TServer registrations when master leadership changes, avoiding false-positive CREATE TABLE rejections due to incomplete TServer data. A new gflag now controls this wait duration. #20667
- Adds a flag FLAGS_tablet_split_min_size_ratio to control tablet splitting based on SST file sizes, ensuring better control over tablet size imbalance. #21458
CDC
- Added a test to certify the safe_time set during GetChanges call, reducing data loss during network failures. Ensures consistent safe_hybrid_time in multiple GetChanges calls. #21240
Bug fixes
YSQL
- Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries and improving performance. #21004
- Eliminates risk of data loss by ensuring only the first statement in SQL mutation batches is retried in the event of a transaction conflict, when using PostgreSQL extended query protocol. #21297
- Temporarily reverts new field additions to the PgsqlResponsePB proto to address upgrade failures encountered when transitioning from versions 2.14/2.16 to 2.20.2. #21229
- Fixes table rewrite issue on non-colocated tables/matviews in colocated DB, ensuring the new table uses the original table's colocation setting. Includes a workaround for GH issue 20914. #20856
- Reduces excessive storage metric updates during EXPLAIN ANALYZE operation, enhancing performance by incorporating storage_metrics_version in YBCPgExecStats and YbInstrumentation. #20917
- Prevents simultaneous send of read and write operations in the same RPC request that could lead to inconsistent read results, by ensuring that, in case of multiple operations, all buffered ones are flushed first. #20864
- Redesigned expression tree walkers to properly handle NULL nodes in queries with subselects used in an index condition, preventing planner crashes. #21133
- Enforces stricter locking mechanisms during concurrent updates on different columns of the same row, to maintain data consistency and prevent "write-skew anomaly within a row". Adds a new gflag ysql_skip_row_lock_for_update to toggle the new row-level locking behavior. #15196
- Adjusts heartbeat mechanism to shut down when an "Unknown Session" error occurs, reducing log alerts. This benefits idle connections with expired sessions. #21264
- Allows BNLs on outer and inner tables, even if the inner table has "unbatchable" join restrictions that can't accept batches of inputs, enhancing queries with complex join conditions. #21366
- Incorporates checks of inequality filters in the YSQL layer to avoid transmitting trivially false inequalities, preventing undesired behavior from DocDB iterators. #21383
- Allows ModifyTable EXPLAIN statements to run as single-row transactions, decreasing latency. Also enables logging for transaction types when yb_debug_log_docdb_requests is enabled. #19604
- Corrects an issue where certain unbatchable filters weren't detected during indexpath formation when indexpath accepted batched values from multiple relations. #21292
DocDB
- Fixes a race condition on kv_store_.colocation_to_table to prevent undefined behavior and re-enables packed row feature for colocated tables, enhancing data writing and compaction processes. #20638
- Clears pending_deletes_ on failed delete tasks thus preventing tablets from being incorrectly retained after task failure or completion. This rectifies a race condition and allows the Load Balancer to perform operations on specific tablets and Tablet Servers. #13156
- Enhances universe upgrade process by incrementing ClusterConfig version during an update and adds checks to prevent universe_uuid modification. Also introduces a yb-ts-cli to clear universe uuid if necessary, improving troubleshooting capabilities. #21491
- Reflects the actual columns locked in conflict resolution instead of the shared in-memory locks in pg_locks, providing more accurate output for waiting transactions. #18399
- Modifies the DocDB system by shifting the acquirement of submit_token_ of the WriteQuery to the post-conflict resolution phase to prevent DDL requests from being blocked, thus optimizing both reads and writes for continued performance and enhanced data consistency. #20730
- Corrects transaction queue behavior allowing multiple waiters for a single transaction per tablet, thereby resolving conflicts and enhancing transaction handling capability. #18394
- Incorporates detection of recently aborted transactions into the transaction coordinator with a new flag clear_deadlocked_txns_info_older_than_seconds. #14165, #19257
- Disables the packed row feature for colocated tables, effectively preventing a possible encounter with the underlying issue in 21218 during debugging. #21218
- Prevents TServers from crashing due to duplication of tablets in two drives, occurring after repairing a faulty drive, by preventing the creation of new tablets during the remote bootstrap (RBS) process. #20754
- Eliminates potential issues with colocated tables during heavy DDL operations and compaction, reducing the risk of crashing on newer builds (2.20.0+) where the packed row feature is default. #21244
- Allows database drop operations to proceed smoothly by ignoring missing streams errors and skipping replication checks for already dropped tables. #21070
- Allows ListTabletServers to handle heartbeats older than 24 days by adjusting the setting to the maximum int32 value, avoiding system crash. #21096
- Includes the indexed_table_id with the index in table listings, eliminating the need for a second lookup to associate a main table with an index. #21159
- Corrects RPATH setting for OpenLDAP libraries that prevents system libraries being picked up or not found. Also refactors library_packager.py for improved library dependency categorization. #21236
- Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. #20912
- Early aborts transactions that fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. #21328
- Corrects block cache metrics discrepancy by ensuring Statistics object passes into LRUCache from TableCache for accurate updates. #21407
- Fixes a segmentation fault in yb-master by checking for a null pointer before dereferencing it, addressing an issue in the CDC run on 2.23.0.0-b37-arm. #21648
- Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. #8744
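The "24 days" figure in the ListTabletServers fix (#21096) matches the point at which a millisecond count no longer fits in a signed 32-bit integer. The sketch below works through that arithmetic. It assumes the heartbeat age is measured in milliseconds, which the note does not state, and the clamp helper is a hypothetical illustration of capping at the maximum int32 value.

```python
INT32_MAX = 2**31 - 1              # largest value a signed 32-bit integer can hold
MS_PER_DAY = 24 * 60 * 60 * 1000   # milliseconds in one day

# Number of days of milliseconds that fit in an int32 (roughly 24.86).
days_until_overflow = INT32_MAX / MS_PER_DAY


def clamp_to_int32(age_ms: int) -> int:
    """Cap an age in milliseconds at INT32_MAX instead of letting it overflow.

    Hypothetical helper: it illustrates the capping idea only and is not
    taken from the YugabyteDB source.
    """
    return min(age_ms, INT32_MAX)


thirty_days_ms = 30 * MS_PER_DAY   # exceeds INT32_MAX, so it would be capped
```

Under that assumption, any heartbeat older than about 24.86 days needs the cap, which is consistent with the threshold quoted in the note.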
CDC
- Introduces a fix for data loss issue caused by faulty update of cdc_sdk_safe_time during explicit checkpointing, along with tests to ensure validity. #15718
- Fixed the decoding of NUMERIC value in CDC records to prevent precision loss by ensuring that the decoded string is not converted to scientific notation if its length is more than 20 characters. #20414
- Fixes issue with CDC packed rows, now ensures a single record for large insert operations, providing consistent data regardless of row size. #20310
- Ensures consistency in CDCSDKYsqlTest.TestLargeTxnWithExplicitStream test by setting FLAGS_cdc_max_stream_intent_records value from 40 to 41, overcoming the issue of multiple records for a single insert when packed row size exceeds ysql_packed_row_size_limit. #20310
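The NUMERIC decoding fix (#20414) concerns values being rendered in scientific notation and losing digits. The following Python sketch shows the general pitfall with the standard decimal module. It is an analogy only and does not reproduce the CDC decoder, and the function name is hypothetical.

```python
from decimal import Decimal


def numeric_to_plain_string(value: Decimal) -> str:
    """Render a decimal in fixed-point notation, never in scientific notation."""
    return format(value, "f")


big = Decimal("12345678901234567890.123456789")

# Going through a binary float drops digits and switches to exponent notation.
lossy = repr(float(big))

# Formatting the decimal directly keeps every digit.
exact = numeric_to_plain_string(big)

# A value stored with a positive exponent is expanded to plain digits.
expanded = numeric_to_plain_string(Decimal("1E+25"))
```

Keeping the value as a string end to end, as the later v2.20.2.1 note describes, avoids the lossy path entirely.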
Other
- Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919
- Reduces disruptions by throttling the master process log messages related to "tablet server has a pending delete" into 20-second intervals. #19331
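Throttling a repeated log message to one emission per interval, as described for the "tablet server has a pending delete" message (#19331), can be sketched as follows. The class and parameter names are hypothetical, and the real implementation in the master process may differ.

```python
import time
from typing import Callable, Dict, Hashable


class ThrottledLogger:
    """Emit at most one message per key within each interval (sketch only)."""

    def __init__(self, interval_s: float = 20.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval_s = interval_s
        self._clock = clock
        self._last_emit: Dict[Hashable, float] = {}

    def log(self, key: Hashable, message: str,
            sink: Callable[[str], None] = print) -> bool:
        """Send message to sink unless the same key was logged too recently."""
        now = self._clock()
        last = self._last_emit.get(key)
        if last is not None and now - last < self._interval_s:
            return False  # suppressed: still inside the throttle window
        self._last_emit[key] = now
        sink(message)
        return True
```

Injecting the clock keeps the sketch easy to exercise without waiting 20 seconds.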
v2.20.2.2 - April 1, 2024
Build: 2.20.2.2-b1
Third-party licenses: YugabyteDB, YugabyteDB Anywhere
Downloads
Docker
docker pull yugabytedb/yugabyte:2.20.2.2-b1
This is a YugabyteDB Anywhere-only release, with no changes to the database.
v2.20.2.1 - March 22, 2024
Build: 2.20.2.1-b3
Third-party licenses: YugabyteDB, YugabyteDB Anywhere
Downloads
Docker
docker pull yugabytedb/yugabyte:2.20.2.1-b3
Bug fixes
DocDB
- Enhances universe upgrade process by incrementing ClusterConfig version during an update and adds checks to prevent universe_uuid modification. Also, introduces a yb-ts-cli to clear universe UUID if necessary, improving troubleshooting capabilities. #21491
CDC
- Ensures numeric values are decoded without precision loss by utilizing string representation with no length limit and the PostgreSQL numeric_out method. #20414
v2.20.2.0 - March 4, 2024
Build: 2.20.2.0-b145
Third-party licenses: YugabyteDB, YugabyteDB Anywhere
Downloads
Docker
docker pull yugabytedb/yugabyte:2.20.2.0-b145
New features
- Added support for read-only mode for DR replica, currently supported for YSQL tables.
- Tablet splitting now can be enabled on tables with CDC configured. #18479
- Transactional CDC now supports consistent snapshots, ensuring data integrity during replication to a sink. Snapshots, ordered by commit time across all tables and tablets, establish a reliable replication order. #19682
Improvements
YSQL
- Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution, avoiding disruption while users transition from the use of advisory locks. #19974
- Shifts from the test flag "FLAGS_TEST_enable_db_catalog_version_mode" to the TP flag "FLAGS_ysql_enable_db_catalog_version_mode", enhancing user control over concurrent DDL execution across different databases. #12417
- Issues a notice for unsafe ALTER TABLE operations, including for ADD COLUMN...DEFAULT, to indicate existing rows won't be backfilled with the default value, enhancing user awareness. Suppression possible by setting ysql_suppress_unsafe_alter_notice flag to true. #19360
- Added sorting capabilities to BatchNestedLoopJoin to return the rows in the same order as NestedLoopJoin. #19589
- Replaces the ysql_max_read_restart_attempts and ysql_max_write_restart_attempts flags with yb_max_query_layer_retries, applies limit to Read Committed isolation statement retries, and adjusts retry_backoff_multiplier and retry_min_backoff defaults. #20359
- Disallows the creation of a temporary index with a tablespace, preventing client hangs and providing a clear error message for temporary index creation with set tablespace. #19368
- Enables index tablespace modification through the ALTER INDEX SET TABLESPACE command and regulates column statistics using the ALTER INDEX ALTER COLUMN SET STATISTICS command. Also, allows the creation and alteration of materialized views with the specified tablespace. Suppress the beta feature warning by enabling ysql_beta_feature_tablespace_alteration flag. #6639
- Adds function to log the memory contexts of a specified backend process, enhancing memory usage monitoring and allowing users to troubleshoot memory-related issues more effectively. #14025
DocDB
- Allows customizing retryable request timeouts to respect client-side YCQL and YSQL timings, optimizing log replay and preventing the tserver from rejecting requests that exceed durations. Adjusts default retryable request timeout to 660 seconds and offers a configuration to eliminate server-side retention of retryable requests with FLAGS_retryable_request_timeout_secs=0. #18736
- Speeds up TServer Init by optimizing the handling of deleted and tombstoned tablets. It ensures faster startup by using a new flag num_open_tablets_metadata_simultaneously, which sets the number of threads for opening tablet metadata. This allows for parallel opening of metadata, improving response times even in cases with large numbers of tablets. Additionally, the handling of tablets marked as Deleted or Tombstoned is managed asynchronously, marking tombstoned tablets as dirty for inclusion in the next heartbeat. #15088
- Logs all instances of tablet metadata creation and update, enabling additional insights for troubleshooting in cases of multiple meta records for the same tablet. #20042
- Adds a 10-second delay (auto_flags_apply_delay_ms) between AutoFlag config updates and their application, allowing all tservers to receive the new config before applying it. This change enhances configuration consistency and update safety. #19932
- Enhances thread safety by setting Wthread-safety-reference to check when guarded members are passed by reference and resolving all build errors resulting from this change. #19365
- Automates recovery of index tables impacted by a bug, preventing performance degradation and disk size leak, by ensuring schema.table_properties.retain_delete_markers is reset to false when index backfilling is done. #19731
- Enhances the demote_single_auto_flag yb-admin command by returning distinct error messages for invalid process_name, flag_name, or non-promoted flag, thereby aiding easier identification of errors. Replaces HasSubstring with ASSERT_STR_CONTAINS in the AutoFlags yb-admin test. #20004
- Introduces support for Upgrade and Downgrade of universes with xCluster links, enhancing compatibility checks for AutoFlags during these operations. Data is replicated between two universes only if the AutoFlags of the Target universe are compatible with the Source universe. The compatibility check, stored in ProducerEntryPB, triggers if the AutoFlag configuration changes during upgrades and rollbacks. This prevents unnecessary RPC calls if no AutoFlag configurations have changed. Also includes fixes for cds initialization bugs in TestThreadHolder. #19518
- Offers redesigned server level aggregation for metrics, thus introducing more metrics for enhanced debugging. Removes several unused URL parameters and makes the new output compatible with YBA and YBM, preventing double-counting issues in charts. Drops unused JSON and Prometheus callbacks from MetricEntity for a cleaner design. #18078
- Allows a limit on the addition of new tablet replicas in a cluster to conserve CPU resources, with safeguards for downscaling. Introduces test flags for controlling memory reservation and tablet replica per core limits. #16177
- Introduces verbose logging for global and per table state transitions in the load balancer to facilitate easier debugging. #20289
- Reduces server initialization time by eliminating the accumulation of deleted tablet superblocks during startup, through a modification in the DeleteTablet operation. #19840
- Enables automatic recovery of index tables affected by a bug, verifying their backfilling status and correcting the retain_delete_markers property to enhance performance. #20247
- Includes single shard (fast-path) transactions in the pg_locks by querying single shard waiters registered with the local waiting transaction registry at the corresponding Tserver, ensuring more complete transaction tracking and lock status reporting. #18195
- Creates Prometheus metrics for server hard and soft memory limits which allow detailed insight into TServer or master memory usage regardless of Google flag defaults; aids in creating dashboards charts to monitor utilization close to soft limit or TServer TCMalloc overhead. #20578
- Enables control over the batching of non-deferred indexes during backfill via a new flag, improving index management. #20213
yugabyted
- Corrects the CPU usage reporting in the sankey diagram by filtering nodes based on region selection on the performance page. #19991
- Repairs yugabyted-ui functionality when using custom YSQL and YCQL ports by passing values to yugabyted-ui, ensuring the correct operation of the user interface. #20406
Bug fixes
YSQL
- Reduces Proc struct consumption by enabling cleanup for killed background workers, preventing webserver start-up failure after 8 attempts. #20154
- Frees up large volumes of unused memory from the webserver after processing queries, enhancing periodic workloads by reallocating held memory to the OS more effectively. #20040
- Introduces two new boolean flags for YSQL webserver debugging and failure identification. The first will log every endpoint access and its list of arguments before running the path handler. The second will print basic tcmalloc stats after the path handler and garbage collection have run. #20157
- Rectifies an issue causing segmentation faults when the postmaster acquires already owned LWLock, enhancing stability during process cleanup. Uses KilledProcToCleanup instead of MyProc if acquiring locks under the postmaster. #20166
- Restarts postmaster in critical sections. This applies to backend PG code; defines and improves handling of errors in these sections. #20255
- Introduces the pg_stat_statements.yb_qtext_size_limit flag to control the maximum file size read into memory, preventing system crashes due to oversized or corrupt qtext files. #20211
- Caps the number of attempts to read inconsistent backend entries to 1000 for safer operation and better visibility, limiting potential infinite loops in /rpcz calls and logging every hundredth attempt. #20274
- Resolves segmentation faults in the webserver SIGHUP handler during cleanup. #20309
- Displays consistent wait-start times in pg_locks view for waiting transactions. #18603, #20120
- Reduces excessive memory consumption during secondary index scans. #20275
- Allows the finalize_plan function to now apply finalize_primnode on PushdownExprs, ensuring Parameter nodes in subplans transfer values accurately, especially in Parallel Queries. #19694
- Return correct results when Batch Nested Loop join is used for queries involving Nested LEFT JOINs on LATERAL views. #19642, #19946
- Upgrades "Unknown session" error to FATAL permitting drivers to instantly terminate stale connections, minimizing manual user intervention. #16445
- Rectifies correctness issues when BatchNestedLoop join is used and the join condition contained a mix of equality and non-equality filters. #20531
- Mitigates incorrect data entry into the default partition by incrementing the schema version when creating a new partition, enhancing data consistency across all connections. #17942
- Rectifies correctness issue when join on inequality condition and join columns contains NULL values. #20642
- Rectifies correctness issue when queries involving outer joins and aggregate use BNL. #20660
- Corrects the Batch Nested Loop's optimization logic for proper handling of cases where the given limit matches the outer table's exact size, ensuring accurate query results. #20707
- Prevents "Not enough live tablet servers to create table" error during ALTER TABLE SET TABLESPACE by correctly supplying the placement_uuid, even when creating a table in the same tablespace. #14984
- Addresses a bug that caused backup failure due to the absence of yb_catalog_version, by ensuring the function's existence post-normal migration. #18507
- Corrects an error in the aggregate scans' pushdown eligibility criteria to prevent wrong results from being returned when PG recheck is not expected, but YB preliminary check is required to filter additional rows. #20709
- Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. #20396
- Fixes a MISMATCHED_SCHEMA error when upgrading from version 2.16 to 2.21 by introducing a 2-second delay for catalog version propagation when a breaking DDL statement is detected. #20842
- Returns correct results for join queries using BatchNestedLoop join and DISTINCT scan using the range Index during inner table scan. #20827
- Renders a fix for memory corruption issue that caused failure in creating a valid execution plan for SELECT DISTINCT queries. Enables successful execution of queries without errors and prevents server connection closures by disabling distinct pushdown. This fix improves the stability and effectiveness of SELECT DISTINCT queries. #20893
- Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries. #21004
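The capped read loop described in #20274 (at most 1000 attempts, logging every hundredth) follows a common pattern, sketched below. The function name and the shape of the read_once callback are hypothetical and are not taken from the YSQL source.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def read_with_attempt_cap(
    read_once: Callable[[], Tuple[bool, Optional[T]]],
    max_attempts: int = 1000,
    log_every: int = 100,
    log: Callable[[str], None] = print,
) -> Optional[T]:
    """Retry read_once until it reports a consistent value or the cap is hit.

    read_once returns (consistent, value). Every log_every-th failed attempt
    is logged so a stuck reader is visible without flooding the log.
    """
    for attempt in range(1, max_attempts + 1):
        consistent, value = read_once()
        if consistent:
            return value
        if attempt % log_every == 0:
            log(f"entry still inconsistent after {attempt} attempts")
    return None  # give up instead of looping forever
```

The cap turns a potential infinite loop into a bounded one, and the sparse logging gives a progress signal while the loop runs.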
DocDB
- Resolves potential WriteQuery leak issue in CQL workloads, ensuring proper execution and destruction of queries, while preventing possible tablet shutdown blockages during conflict resolution failure. #19919
- Mitigates the issue of uneven tablet partitions and multiple pollers writing to the same consumer tablet by only applying intents on the consumer that match the producer tablet's key range. If some keys/values are filtered out from a batch, it will not delete the consumer's intents, as they may be needed by subsequent applications. Also guarantees idempotency, even if some apply records stutter and fetch older changes. #19728
- Reduces chances of transaction conflicts upon promotion by delaying the sending of UpdateTransactionStatusLocation RPCs until after the first PROMOTED heartbeat response is received, enhancing transaction consistency and accuracy. #17319
- Sets kMinAutoFlagsConfigVersion to 1 to resolve flag configuration mismatch issue. #19985
- Unblocks single shard waiters once a blocking subtransaction rolls back, by applying identical conflict check logic for both distributed transactions and single shard transactions. #20113
- Eliminates a race condition that can occur when simultaneous calls to SendAbortToOldStatusTabletIfNeeded try to send the abort RPC, thus preventing avoidable FATALs for failed geo promotions. #17113
- Fixes behavior of Tcmalloc sampling; deprecates the enable_process_lifetime_heap_sampling flag in favor of directly setting profiler_sample_freq_bytes for controlling tcmalloc sampling, enhancing control over sampling process. #20236
- Resolves instances of the leaderless tablet endpoint incorrectly reporting a tablet as leaderless post-leader change, by tweaking the detection logic to depend on the last occurrence of a valid leader, ensuring more accurate tablet reporting. #20124
- Allows early termination of ReadCommitted transactions with a kConflict error, enhancing overall system throughput by eliminating unnecessary blockages without waiting for the next rpc restart. #20329
- Fixes FATAL errors occurring during tablet participant shutdown due to in-progress RPCs by ensuring rpcs_.Shutdown is invoked after all status resolvers have been shut down. #19823
- Modifies SysCatalog tablet's retryable request retention duration to consider both YQL and YSQL client timeouts, reducing the likelihood of "request is too old" errors during YSQL DDLs. #20330
- Handles backfill responses gracefully even when they overlap across multiple operations, reducing risks of crashes and errors due to network delays or slow masters. #20510
- Reintroduces bloom filters use during multi-row insert, improving conflict resolution and rectifying missing conflict issues, while also addressing GH 20648 problem. #20398, #20648
- Fixes handling of duplicate placement blocks in under-replication endpoint for better compatibility and correct replica counting, preventing misrepresentation of under-replicated tablespaces. #20657
- Logs the first failure during setup replication instead of the last error, facilitating better error diagnosis. #20689
- Implements a retry mechanism to acquire the shared in-memory locks during waiter resumption. Rather than failing after a single attempt, it now schedules retrying until the request's deadline, reducing request failures due to heavy contention. #20651, #19032, #19859
- Reduces log warnings in normal situations by downgrading repeated waiter resumption alerts to VLOG(1), benefiting from the direct signaling of transaction resolution. #19573
- Disables the packed row feature for colocated tables to prevent possible write failures post complications, as a workaround while investigating issue 20638. #21047
- Resolves database deletion failures by skipping replication checks for dropped tables during the database drop process, addressing errors related to missing streams. #21070
- Addresses a race condition on kv_store_.colocation_to_table reading and permits packed row features for colocated tables, tackling undefined behaviors and failed table writes. #20638
- Disables the packed row feature for colocated tables to prevent the issue 20638 that causes subsequent write failures after certain compactions. #21047
- Corrects RPATH setting for OpenLDAP libraries, preventing the system from picking up wrong versions or failing to find them at all. #21236
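The waiter-resumption change (#20651) replaces a single lock attempt with retries that continue until the request's deadline. A minimal sketch of that pattern follows. The names, the fixed retry delay, and the callback shape are assumptions made for illustration and do not describe DocDB's actual scheduling.

```python
import time
from typing import Callable


def acquire_until_deadline(
    try_acquire: Callable[[], bool],
    deadline: float,
    retry_delay_s: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Keep calling try_acquire until it succeeds or the deadline passes.

    deadline is an absolute time on the same scale as clock(). A single
    failed attempt no longer fails the request; only running out of time does.
    """
    while True:
        if try_acquire():
            return True
        if clock() >= deadline:
            return False  # out of time: report failure to the caller
        sleep(retry_delay_s)
```

Tying the retry budget to the deadline rather than to an attempt count means heavily contended requests get as many chances as their timeout allows.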
CDC
- Corrects the computation of the cdcsdk_sent_lag metric to prevent steep, disproportionate increases by updating the last_sent_record_time when a SafePoint record is spotted, in addition to DMLs and READ ops. #15415
Other
- Adjusts tserver start and tserver stop scripts to successfully terminate all running PG processes, irrespective of their PID digit count. #19817
- Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919
v2.20.1.3 - January 25, 2024
Build: 2.20.1.3-b3
Third-party licenses: YugabyteDB, YugabyteDB Anywhere
Downloads
Docker
docker pull yugabytedb/yugabyte:2.20.1.3-b3
Bug fixes
YSQL
- Fixed index scans where a query with join on inequality condition incorrectly returns rows with NULL in the key column. #20642
DocDB
v2.20.1.2 - January 17, 2024
Downloads
Use 2.20.1.3 or later.
Bug fixes
YSQL
- Fix BNL local join lookup equality function. #20531
CDC
- Fix addition of new tables to stream metadata after drop table. #20428
v2.20.1.1 - January 11, 2024
Downloads
Use 2.20.1.3 or later.
Improvements
YSQL
- In Read Committed Isolation, limit the number of retry attempts when the aborted query is retried. #20359
Bug fixes
YSQL
- Return correct results when Batch Nested Loop join is used for queries involving Nested LEFT JOINs on LATERAL views. #19642, #19946
- Added a regression test for nested correlated subquery. #20316
CDC
- Fixed decimal type precision while decoding CDC record. #20414
DocDB
- In Read Committed Isolation, immediately abort transactions when conflict is detected. #20329
v2.20.1.0 - December 27, 2023
Downloads
Use 2.20.1.3 or later.
Improvements
YSQL
- Adjusts ysql_dump when run with --include-yb-metadata argument, to circumvent unnecessary automatic index backfilling. This avoids unnecessary RPC calls. #19457
- Imports the "pgcrypto: Check for error return of px_cipher_decrypt" upstream PostgreSQL patch; OpenSSL 3.0+ prerequisite. #19732
- Imports the upstream PG commit "Disable OpenSSL EVP digest padding in pgcrypto"; OpenSSL 3.0+ prerequisite. #19733
- Imports an upstream PostgreSQL commit that adds an alternative output when OpenSSL 3 doesn't load legacy modules; OpenSSL 3.0+ prerequisite. #19734
- Removes the need for a table scan when adding a new column with a NOT NULL constraint and a non-volatile DEFAULT value, enhancing ADD COLUMN operations. #19355
- Mitigates CVE-2023-39417 by importing a security-focused upstream Postgres commit from REL_11_STABLE for YSQL users. #14419
- Fixes CVE-2020-1720 by importing upstream Postgres commit from REL_11_STABLE, enabling support for ALTER <object> DEPENDS ON EXTENSION in future for objects like function, procedure, routine, trigger, materialized view, and index. #14419
- Modifies the planner to allow ordered index scans with IN conditions on lower columns; this takes advantage of the YB LSM indexes, which maintain index order. #19576
- Increase the oom_score_adj (Out of Memory score adjustment) of the YSQL webserver to 900, the same as PG backends, prioritizing its termination when it significantly consumes memory. #20028
- Blocks the use of advisory locks in YSQL, as they are currently unsupported. #18954
- Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution, avoiding disruption while users transition from the use of advisory locks. #19974
DocDB
- Enable checks for trailing zeros in SST data files to help identify potential corruption. #19691
- Includes version information in the error message when the yb process crashes due to AutoFlags being enabled on an unsupported older version, enhancing the ease of identifying the issue. #16181
- Prints long trace logs with a continuation marker when a trace is split into multiple LOG(INFO) outputs, improving log readability. #19532, #19808
- Upgrades OpenSSL to version 3.0.8, disabling Linuxbrew builds and updating glog for stack unwinding based on the backtrace function. #19736
- Expands debug information to help investigate SELECT command errors that imply faults in read path processing or provisional record writing. #19876
- Adds support for the rocksdb_max_sst_write_retries flag: the maximum number of attempts to write an SST file when corruption is detected after a write (default 0, which means no retries). Implemented for both flushes and compactions. #19730
- Balances load during tablet creation across all drives, preventing bottlenecks and underutilized drives during remote-bootstrapping; uses the total number of tablets assigned for tie-breaking. #19846
- Adds a tserver page to display all ongoing Remote Bootstrap sessions, and shows the source of bootstrap sessions in the Last Status field, for improved visibility. #19568
- Enhances error reporting from XClusterPollers by storing just the error code instead of detailed status, making it safer against master and Tserver restarts and reducing memory usage. #19455
- Incorporates a JoinStringsLimitCount utility for displaying only the first 20 elements in a large array, saving memory and space while logging or reporting tablet IDs. #19527
- Adds a sanity check to prevent restoration of the initial sys catalog snapshot when the master_join_existing_universe flag is set. #19357
- Parallelizes RPCs in DoGetLockStatus to fetch locks swiftly, enhancing the database's response time while retrieving old transactions. #18034
- Simplifies xCluster replication failure debugging via a new get_auto_flags_config yb-admin command, returning the current AutoFlags config. #20046
- Enables automatic recovery of index tables affected by #19544, improving performance and disk size management by removing excess tombstones in the SST files. #19731
- Introduces verbose logging for global and per table state transitions in the load balancer to facilitate easier debugging. #20289
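The JoinStringsLimitCount item above (#19527) describes truncating a long list of tablet IDs to its first 20 elements before logging. The following Python sketch illustrates that idea only; the function name, signature, and the "and N more" suffix are assumptions made for illustration, not the actual C++ utility or its output format.

```python
def join_strings_limit_count(items, limit=20, sep=", "):
    """Join at most `limit` items and note how many were left out.

    Hypothetical illustration: the real utility's signature and suffix
    wording are not given in these release notes.
    """
    items = list(items)
    shown = sep.join(str(item) for item in items[:limit])
    omitted = len(items) - limit
    if omitted > 0:
        # Only mention omitted elements when the list was actually truncated.
        return f"{shown} and {omitted} more"
    return shown


# Example: 25 made-up tablet IDs are reduced to the first 20 plus a count.
tablet_ids = [f"tablet-{i:03d}" for i in range(25)]
print(join_strings_limit_count(tablet_ids))
```

Capping the element count keeps log lines bounded in size regardless of how many tablets are involved.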
CDC
- Allows the catalog manager to eliminate erroneous entries from the cdc_state table for newly split tablets; prevents race conditions by reversing the order of operations during the CleanUpCDCStreamsMetadata process. #19746
yugabyted
- Resolves 0.0.0.0 to 127.0.0.1 as the IP for master, tserver and yugabyted-UI when 0.0.0.0 is specified as the advertise_address in the yugabyted start command. #18580
- Ensures the yugabyted UI can fetch and display Alert messages from different nodes by adjusting the /alerts endpoint in the API server to include a node_address parameter. #19972
- Eliminates the deprecated use_initial_sys_catalog_snapshot gflag in yugabyted, thus reducing log warning messages. #20056
Other
- Improves indentation of multi-line remote trace entry output, for readability. #19758
- Implements a preventative safeguard in tserver operations by adding a check that a tablet has explicitly been deleted before issuing a DeleteTablet command, minimizing data loss risks and enhancing reliability. This feature, enabled with master_enable_deletion_check_for_orphaned_tablets=true, is upgrade and downgrade safe. #18332
Bug fixes
YSQL
- Allows the postprocess script in pg_regress to run on alternative expected files alongside default_expectfile and platform_expectfile, fixing unintended mismatches. #19737
- Prevents stuck PostgreSQL processes and ensures successful acquisition of locks in the future. This enhancement particularly aids in preventing deadlocks when creating replication slots with duplicate names. #19509
- Ensures OID uniqueness within a database in YB environment by introducing a new per-database OID allocator, avoiding risk of collisions across multiple nodes or tenants. This change is safe for upgrades and rollbacks, enabled by a new gflag ysql_enable_pg_per_database_oid_allocator. #16130
- Prevents postmaster crashes during connection termination cleanup by using the killed process's ProcStruct to wait on a lock. #18000
- Fixes the distinct query iteration to distinguish between deleted and live keys; this prevents missing live rows during distinct queries on tables with dead tuples. #19911
- Fixes a bug when type-checking bound tuple IN conditions involving binary columns like UUID. #19753
- Ensures PK modification does not disable Row-Level Security (RLS). #19815
- Tracks spinlock acquisition during process initialisation, and reduces the time to detect spinlock deadlock. #18272, #18265
- Reverts to PG restart in the rare event of process termination during initiation or cleanup; this avoids needing to determine what subset of shared memory items to clean. #19945
- Prevents unexpected crashes during plan creation when joining two varchar columns. Handles the derivation of element_typeid in Batched Nested Loop (BNL) queries; casts both sides to text to ensure a valid equality expression. #20003
- Reduces Proc struct consumption by enabling cleanup for killed background workers, preventing webserver start-up failure after 8 attempts. #20154
- Frees up large volumes of unused memory from the webserver after processing queries, enhancing periodic workloads by reallocating held memory to the OS more effectively. #20040
- Introduces two new boolean flags for YSQL webserver debugging and failure identification. The first logs every endpoint access and its list of arguments before running the path handler. The second prints basic tcmalloc stats after the path handler and garbage collection have run. #20157
- Rectifies an issue causing segmentation faults when the postmaster acquires an already owned LWLock, enhancing stability during process cleanup. Uses KilledProcToCleanup instead of MyProc if acquiring locks under the postmaster. #20166
- Improves insight into webserver memory usage with the :13000/memz and :13000/webserver-heap-prof endpoints, which print tcmalloc stats and display current or peak allocations respectively. #20157
- Restarts postmaster in critical sections. This applies to backend PG code; defines and improves handling of errors in these sections. #20255
- Caps the number of attempts to read inconsistent backend entries to 1000 for safer operation and better visibility, limiting potential infinite loops in /rpcz calls and logging every hundredth attempt. #20274
- Resolves segmentation faults in the webserver SIGHUP handler during cleanup. #20309
- Allows the finalize_plan function to apply finalize_primnode on PushdownExprs, ensuring Parameter nodes in subplans transfer values accurately, especially in Parallel Queries. #19694
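The /rpcz fix above (#20274) bounds a retry loop at 1000 attempts and logs every hundredth attempt. The pattern can be sketched in Python as follows; `read_once` and the function name are hypothetical stand-ins, and the real change lives in the YSQL backend code rather than in Python.

```python
import logging

MAX_ATTEMPTS = 1000  # cap stated in the release note
LOG_EVERY = 100      # "every hundredth attempt"


def read_consistent_entry(read_once, max_attempts=MAX_ATTEMPTS, log_every=LOG_EVERY):
    """Retry read_once() until it yields a consistent entry or the cap is hit.

    read_once is a hypothetical callable returning (entry, is_consistent).
    Returns the entry, or None when every attempt was inconsistent.
    """
    for attempt in range(1, max_attempts + 1):
        entry, consistent = read_once()
        if consistent:
            return entry
        if attempt % log_every == 0:
            # Periodic logging gives visibility without flooding the log.
            logging.warning("entry still inconsistent after %d attempts", attempt)
    # Giving up replaces what could otherwise be an infinite loop.
    return None
```

A caller would treat a None result as "entry unavailable" and skip it rather than wait indefinitely.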
DocDB
- Enables Masters to decrypt the universe key registry correctly when a Master turns into a leader following Remote Bootstrap. #19513
- Fixes the retryable requests mem-tracker metric so it's aggregated at table level. #19301
- Ensures data consistency by checking whether there is a pending apply record, preventing the removal of large transactions that are still applying intents during bootstrap. #19359
- Prevents a write to freed memory; RefinedStream::Connected returns status instead of calling context_->Destroy(status). #19727
- Allows serving remote bootstrap from a follower peer, that is, the initial remote log anchor request is at the follower's last logged operation id index, reducing the probability of fallback to bootstrap from the leader; this could increase system efficiency under high load. #19536
- Fixes Master's tablet_overhead memory tracker to display consumption correctly, ensuring accurate reflection of data in the Memory Breakdown UI page, and aligns MemTracker metric names between TServer and Master. #19904
- Resolves intermittent index creation failure for empty YCQL tables by validating the is_running state instead of checking the index state directly; prevents inaccurate values in old indexes and ensures retain_delete_markers is correctly set to false. #19933
- Addresses a deadlock issue during tablet shutdown with wait-queues enabled by modifying the Wait-Queue shutdown path to set shutdown flags during StartShutdown and process callbacks during CompleteShutdown. #19867
- Reduces race conditions in MasterChangeConfigTest.TestBlockRemoveServerWhenConfigHasTransitioningServer by letting asynchronous threads operate on ExternalMaster* copies, not current_masters. #19927
- Allows only a single heap profile to run at a time, restricting concurrent profile runs to avoid misinterpretation of sampling frequency reset values. #19841
- Eliminates misleading Transaction Metadata Missing errors during a deadlock by replacing them with more accurate deadlock-specific messages. #20016
- Fixes behavior of Tcmalloc sampling; deprecates the enable_process_lifetime_heap_sampling flag in favor of directly setting profiler_sample_freq_bytes for controlling tcmalloc sampling, enhancing control over sampling process. #20236
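The single-heap-profile restriction above (#19841) amounts to a non-blocking mutual-exclusion guard: a second request is rejected rather than queued. A minimal Python sketch of that pattern, assuming a hypothetical `collect` callable that performs the profiling work (the actual server code is C++ and its error handling is not described in these notes):

```python
import threading

# One process-wide lock guards the profiler; threading.Lock is not reentrant.
_profile_lock = threading.Lock()


def run_heap_profile(collect):
    """Run collect() unless another heap profile is already in progress.

    collect is a hypothetical callable doing the actual profiling work.
    """
    # Try to take the lock without waiting; fail fast if it is held.
    if not _profile_lock.acquire(blocking=False):
        raise RuntimeError("a heap profile is already running")
    try:
        return collect()
    finally:
        # Always release so a failed profile does not block later ones.
        _profile_lock.release()
```

Rejecting the second caller, instead of making it wait, keeps each profile's sampling settings from being altered by an overlapping run.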
CDC
- Corrects a deadlock during the deletion of a database with CDC streams by ensuring namespace-level CDC stream does not add the same stream per table more than once. #19879
yugabyted
- Adds a check for a join flag to avoid an error when starting two different local RF-1 YugabyteDB instances on macOS. #20018
Other
- Rectifies the CentOS GCC11 and background worker asan build issues by utilizing stack allocation and eliminating the unsupported std::string::erase with const iterators. #20147, #20087
- Fixes segmentation fault in stats collector after Postmaster reset by cleaning up after terminated backends and properly resetting shared memory and structures, resulting in improved stability and reliability of the system. #19672
- Allows custom port configurations in multi-region/zone YugabyteDB clusters to prevent connectivity problems during cluster setup. #15334
v2.20.0.2 - December 15, 2023
Downloads
Use 2.20.1.3 or later.
This is a YugabyteDB Anywhere-only release, with no changes to the database.
v2.20.0.1 - December 5, 2023
Downloads
Use 2.20.1.3 or later.
Bug fixes
- [19753] [YSQL] Fix tuple IN condition type-checking in the presence of binary columns
- [19904] [DocDB] Fix Master's tablet_overhead mem_tracker zero consumption issue
- [19911] [DocDB] Avoid AdvanceToNextRow seeking past live rows upon detecting a deleted row during distinct iteration
- [19933] [DocDB] Backfill done may not be triggered for empty YCQL table
v2.20.0.0 - November 13, 2023
Downloads
Use 2.20.1.3 or later.
Highlights
- Support for OIDC token-based authentication via Azure AD. This allows YSQL database users to sign in to YugabyteDB universes using their JSON Web Token (JWT) as their password.
- Support for Transactional CDC.
- Wait-on conflict concurrency control with pg_locks. This feature is especially important to users that have workloads with many multi-tablet statements, which may have multiple sessions contending for the same locks concurrently. Wait-on conflict concurrency control helps avoid a deadlock scenario as well as ensure fairness so a session is not starved by new sessions. EA
- Catalog metadata caching to improve YSQL connection scalability and performance with faster warm up times. EA
- Support for ALTER COLUMN TYPE with on-disk changes is now GA for production deployments. This feature expands ALTER TABLE capabilities with a broader range of data type modifications, including changes to the type itself, not just size increments.
New Features
- [13686] [DocDB] Introduce the ability to Rollback And Demote an AutoFlag
- [18748] [CDCSDK] Change default checkpointing type to EXPLICIT during stream creation
Improvements
- [4906] [YSQL] Uncouple variables is_single_row_txn from is_non_transactional
- [9647] [DocDB] Add metric for the number of running tablet peers on a YB-TServer
- [12751] [CDCSDK] Made maxAttempts and sleepTime for retrying RPCs configurable in AsyncClient
- [13358] [YSQL] YSQL DDL Atomicity Part 5: YB-Backup support
- [16177] [DocDB] Mechanisms to reject table creation requests based on cluster resources, switch the tablet limit gflags from test flags to runtime flags
- [16785] [DocDB] Add tracking for event stats metrics
- [17904] [DocDB] Prevent YB-TServer heartbeats to master leader in a different universe
- [18055] [DocDB] Resume waiters in consistent order across tablets
- [18335] [DocDB] Reject ChangeConfig requests for sys catalog when another server is amidst transition
- [18384] [14114] [DocDB] Throw consistent error on detected deadlocks
- [18522] [YSQL] Change default unit for yb_fetch_size_limit to bytes from kilobytes
- [18940] [DocDB] Metric to track active WriteQuery objects
- [19071] [DocDB] Add estimated bytes/count to pprof memory pages
- [19099] [YSQL] [DocDB] Adopt the trace correctly in pg_client_session
- [19203] [xCluster] Add xCluster YB-TServer UI page, Switch to throughput in the YB-TServer UI pages from MiBps to KiBps
- [19221] [DocDB] Improved timeout handling for YCQL index scan
- [19272] [DocDB] Clear ResumedWaiterRunner queue during WaitQueue Shutdown
- [19274] [xCluster] Update xCluster apply safe time if there are no active transactions
- [19292] [CDCSDK] Publish snapshot_key in GetCheckpointResponse only if it is present
- [19295] [yugabyted] Password Authentication showing enabled in UI even when it's not turned on.
- [19301] [DocDB] Make re-tryable requests mem-tracker metric to be aggregated for Prometheus
- [19351] Harden more YSQL backends manager tests
- [19353] [xCluster] Introduce XClusterManager, Cleanup code on master and XClusterConsumer
- [19417] [DocDB] Support tracing UpdateConsensus API
- [19454] build: Move macOS support off of mac11
- [19482] [CDCSDK] Add GFlag to disable tablet split on tables part of CDCSDK stream
- [19532] [DocDB] Allow Traces to get past Glog's 30k limit
- [YSQL] Disable backends manager by default
- [YSQL] Some extra logging in Postgres, Automatically determine upstream branch name from local branch name
Bug fixes
- [17025] [xCluster] Calculate replication lag metrics for split tablet children
- [17229] [DocDB] Prevent GC of any schema packings referenced in xCluster configuration
- [18081] [19535] [DocDB] Update TabletState only for tablets that are involved in write, Ignore statuses from old status tablet on promotion
- [18157] [DocDB] Don't pack rows without liveness column
- [18540] [xCluster] [TabletSplitting] Fix setting opid in cdc_state for split children
- [18711] [YSQL] Fix pg_stat_activity taking exclusive lock on t-server session
- [18732] [DocDB] Don't update cache in LookupByIdRpc if the fetched tablet has been split
- [18909] [YSQL] Fix ALTER TYPE on temporary tables
- [18911] [19382] [YSQL] Fix ALTER TYPE null constraint violation and failure for range key tables
- [19021] [YSQL] Fix bug in computation of semi/anti join factors during inner unique joins
- [19033] [DocDB] PGSQL operation triggered during parent tablet shutting down may be not retried
- [19063] [YSQL] Fix table rewrite on a table referenced through a partitioned table's foreign key
- [19308] [YSQL] Fix bug where tuple IN filters were not bound to the request
- [19316] [19314] [yugabyted] Bugs regarding --join flag provided.
- [19329] [YSQL] Avoid O(n^2) complexity in list traversal while generating ybctid for IN operator
- [19348] [CDCSDK] Only delete removed tablets from cdc_state in CleanUpCDCStreamsMetadata
- [19384] [YSQL] Fix handling of case where a given RowCompareExpression cannot be bound
- [19385] [CDCSDK] Fix WAL GC issue for tables added after stream creation
- [19394] [CDCSDK] Only populate key to GetChangesRequest when it is not null, Set snapshot key from correct parameter in explicit checkpointing
- [19407] [YSQL] Single shard operations should pick read time only after conflict resolution
- [19414] [yugabyted] UI bugs
- [19434] [CDCSDK] Do not completely fail while fetching tablets if one table hits error
- [19440] [DocDB] Fix bug where invalid filter key was passed to Iterator initialization in backwards scans
- [19450] [xCluster] Fix CDCServiceTestMultipleServersOneTablet.TestUpdateLagMetrics
- [19497] [DocDB] Collect end-to-end traces only if the parent trace is non-null
- [19514] [DocDB] Do not ignore replicas on dead YB-TServer in under-replicated endpoint
- [19523] [xCluster] Fix TSAN error in xClusterConfig
- [19544] [DocDB] Hash and Range Indexes grow in size even after rows are deleted, leading to slower queries
- [19546] [DocDB] Add missing call to Prepare in PlayChangeMetadataRequest
- [19605] [YSQL] Fix test failures encountered upon enabling DDL Atomicity
- [19663] [DocDB] Fix AutoFlagsMiniClusterTest.PromoteOneFlag test