What's new in the YugabyteDB v2.20 LTS release series

LTS - Feature availability

All features in stable releases are considered to be GA unless marked otherwise.

What follows are the release notes for the YugabyteDB v2.20 release series. Content will be added as new notable features and changes are available in the patch releases of the YugabyteDB v2.20 release series.

For an RSS feed of all release series to track the latest product updates, point your feed reader to the RSS feed for releases.

Technical Advisories

  • Impacts: v2.20.0.0 to 2.20.1.2 - TA-20648 - Index update can be wrongly applied on batch writes
  • Impacts: v2.20.1.0 to 2.20.1.3 - TA-20827 - Correctness issue for queries using SELECT DISTINCT

v2.20.3.0 - April 9, 2024

Build: 2.20.3.0-b68

Third-party licenses: YugabyteDB, YugabyteDB Anywhere

Downloads

Docker

docker pull yugabytedb/yugabyte:2.20.3.0-b68

New feature

  • Roll back upgrades. Ability to roll back a database upgrade in-place and restore the cluster to its state before the upgrade. You can roll back a database upgrade only to the pre-upgrade release. EA

  • Added support for Red Hat Enterprise Linux 9.3 on x86-based systems. Refer to Operating system support for the complete list of supported operating systems.

Improvements

YSQL

  • Changes temporary namespace naming in YB from pg_temp_<backend_id> to pg_temp_<tserver-uuid>_<backend_id>, making the names unique across nodes and preventing temporary tables from being overwritten or deleted. #19255
  • Treats REFRESH MATERIALIZED VIEW as a non-disruptive change, preventing unnecessary transaction terminations. The default option, REFRESH MATERIALIZED VIEW NONCONCURRENTLY, modifies metadata but without making a disruptive alteration. #20420
  • Displays distinct prefix keys explicitly in the explain output, enhancing the clarity of indexing for users. #20831
  • Allows correct initialization and propagation of ybDataSent and ybDataSentForCurrQuery flags for both parent and child transactions, for enhanced error handling and retry mechanism. #18638
  • Optimizes the get_tablespace_distance function, enhancing the speed of the yb_is_local_table YSQL function. Reduces query time by caching GeolocationDistance value. #20860
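
The temporary namespace naming change (#19255) can be illustrated with a minimal Python sketch that builds both the old and the new names. The helper names and the sample node identifiers are hypothetical, and the exact way the tserver UUID is rendered inside a real namespace name is an assumption rather than something stated in these notes.

```python
def old_temp_namespace(backend_id: int) -> str:
    """Old scheme: only the backend ID, so names can repeat across nodes."""
    return f"pg_temp_{backend_id}"


def new_temp_namespace(tserver_uuid: str, backend_id: int) -> str:
    """New scheme: the tserver UUID prefix makes the name unique per node."""
    return f"pg_temp_{tserver_uuid}_{backend_id}"


# Two backends with the same backend ID on two hypothetical nodes.
node_a, node_b = "aaaa1111", "bbbb2222"
old_collides = old_temp_namespace(3) == old_temp_namespace(3)
new_collides = new_temp_namespace(node_a, 3) == new_temp_namespace(node_b, 3)
print(old_collides, new_collides)
```

With the old scheme both nodes produce the same name, which is the collision the change is meant to prevent.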

DocDB

  • Limits the number of rows returned per transaction per tablet in pg_locks to avoid potential memory issues during batch inserts, and includes additional fields to indicate partial lock info. #20765
  • Introduces a new YSQL configuration parameter yb_locks_txn_locks_per_tablet to limit the number of rows returned by pg_locks, preventing the system from running out of memory during large transactions. #19934
  • Introduces a new metric for untracked memory which eliminates the need to hardcode child memory trackers. Now users can effortlessly monitor untracked memory for better resource management. #18683
  • Limits the number of tablets per node, and hastens reaching the desired number of tablets by lowering the values of FLAGS_tablet_split_low_phase_shard_count_per_node to 1 and FLAGS_tablet_split_low_phase_size_threshold_bytes to 128 MB. #20579
  • Adds a new retrying master-to-master task, allowing for the API AreNodesSafeToTakeDown to check if it's safe to remove or upgrade certain nodes without disrupting overall cluster health. #17562
  • Introduces AreNodesSafeToTakeDown API that ensures safe node removals during cluster upgrades or maintenance operations by checking tablet health and follower lag, facilitating seamless and risk-free updates. #17562
  • Enables monitoring of master leader heartbeat delays through a new RPC in the MasterAdmin, ensuring undesired lags can be readily detected and mitigated. #18788
  • Introduces two new metrics to alert users when CreateTable function tries to exceed the tablet count limits and to highlight when a tablet server is managing more tablets than its capacity allows. #20668
  • Adjusts the tablet guardrail mechanism to wait for TServer registrations when master leadership changes, avoiding false-positive CREATE TABLE rejections due to incomplete TServer data. A new gflag now controls this wait duration. #20667
  • Adds a flag FLAGS_tablet_split_min_size_ratio to control tablet splitting based on SST file sizes, ensuring better control over tablet size imbalance. #21458
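
The pg_locks row limit described above (#20765, #19934) can be pictured as a cap applied per transaction per tablet, together with a marker that tells the reader the result is partial. The following is a conceptual sketch only: the function name, the tuple it returns, and the sample lock strings are hypothetical and do not reflect how DocDB implements the limit.

```python
from typing import List, Tuple


def limit_locks(locks: List[str], max_locks: int) -> Tuple[List[str], bool]:
    """Return at most max_locks entries plus a flag for partial results.

    max_locks plays the role of a per-transaction, per-tablet limit such as
    yb_locks_txn_locks_per_tablet; the flag stands in for the additional
    "partial lock info" fields mentioned in the notes.
    """
    truncated = len(locks) > max_locks
    return locks[:max_locks], truncated


# A hypothetical batch insert holding five locks on one tablet.
rows, partial = limit_locks(["k1", "k2", "k3", "k4", "k5"], max_locks=3)
print(rows, partial)
```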

CDC

  • Added a test to certify the safe_time set during GetChanges call, reducing data loss during network failures. Ensures consistent safe_hybrid_time in multiple GetChanges calls. #21240

Bug fixes

YSQL

  • Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries and improving performance. #21004
  • Eliminates risk of data loss by ensuring only the first statement in SQL mutation batches is retried in the event of a transaction conflict, when using PostgreSQL extended query protocol. #21297
  • Temporarily reverts new field additions to the PgsqlResponsePB proto to address upgrade failures encountered when transitioning from versions 2.14/2.16 to 2.20.2. #21229
  • Fixes table rewrite issue on non-colocated tables/matviews in colocated DB, ensuring the new table uses the original table's colocation setting. Includes a workaround for GH issue 20914. #20856
  • Reduces excessive storage metric updates during EXPLAIN ANALYZE operation, enhancing performance by incorporating storage_metrics_version in YBCPgExecStats and YbInstrumentation. #20917
  • Prevents simultaneous send of read and write operations in the same RPC request that could lead to inconsistent read results, by ensuring that, in case of multiple operations, all buffered ones are flushed first. #20864
  • Redesigned expression tree walkers to properly handle NULL nodes in queries with subselects used in an index condition, preventing planner crashes. #21133
  • Enforces stricter locking mechanisms during concurrent updates on different columns of the same row, to maintain data consistency and prevent a "write-skew anomaly within a row". Adds a new gflag ysql_skip_row_lock_for_update to toggle the new row-level locking behavior. #15196
  • Adjusts heartbeat mechanism to shut down when an "Unknown Session" error occurs, reducing log alerts. This benefits idle connections with expired sessions. #21264
  • Allows BNLs on outer and inner tables, even if the inner table has "unbatchable" join restrictions that can't accept batches of inputs, enhancing queries with complex join conditions. #21366
  • Incorporates checks of inequality filters in the YSQL layer to avoid transmitting trivially false inequalities, preventing undesired behavior from DocDB iterators. #21383
  • Allows ModifyTable EXPLAIN statements to run as single-row transactions, decreasing latency. Also enables logging for transaction types when yb_debug_log_docdb_requests is enabled. #19604
  • Corrects an issue where certain unbatchable filters weren't detected during indexpath formation when indexpath accepted batched values from multiple relations. #21292
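
As an illustration of what a "trivially false" inequality is (#21383), the sketch below detects a pair of bounds that no value can satisfy, such as x > 5 AND x < 3. It is a simplified model over generic ordered values; the function name and parameters are hypothetical, and the real check in the YSQL layer may cover more cases.

```python
from typing import Optional


def is_trivially_false(lower: Optional[float], upper: Optional[float],
                       lower_inclusive: bool = True,
                       upper_inclusive: bool = True) -> bool:
    """Return True when no value can satisfy both bounds."""
    if lower is None or upper is None:
        return False  # an open-ended range can always match something
    if lower > upper:
        return True   # for example x > 5 AND x < 3
    if lower == upper:
        # x >= 5 AND x <= 5 is satisfiable; x > 5 AND x <= 5 is not.
        return not (lower_inclusive and upper_inclusive)
    return False


print(is_trivially_false(5, 3))                         # x >= 5 AND x <= 3
print(is_trivially_false(5, 5, lower_inclusive=False))  # x > 5 AND x <= 5
print(is_trivially_false(3, 5))                         # x >= 3 AND x <= 5
```

A filter for which this returns True can be answered with an empty result without sending a request to storage.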

DocDB

  • Fixes a race condition on kv_store_.colocation_to_table to prevent undefined behavior and re-enables packed row feature for colocated tables, enhancing data writing and compaction processes. #20638
  • Clears pending_deletes_ on failed delete tasks thus preventing tablets from being incorrectly retained after task failure or completion. This rectifies a race condition and allows the Load Balancer to perform operations on specific tablets and Tablet Servers. #13156
  • Enhances universe upgrade process by incrementing ClusterConfig version during an update and adds checks to prevent universe_uuid modification. Also introduces a yb-ts-cli command to clear the universe UUID if necessary, improving troubleshooting capabilities. #21491
  • Reflects the actual columns locked in conflict resolution instead of the shared in-memory locks in pg_locks, providing more accurate output for waiting transactions. #18399
  • Shifts the acquisition of the WriteQuery's submit_token_ to the post-conflict resolution phase to prevent DDL requests from being blocked, thus optimizing both reads and writes for continued performance and enhanced data consistency. #20730
  • Corrects transaction queue behavior allowing multiple waiters for a single transaction per tablet, thereby resolving conflicts and enhancing transaction handling capability. #18394
  • Incorporates detection of recently aborted transactions into the transaction coordinator with a new flag clear_deadlocked_txns_info_older_than_seconds. #14165, #19257
  • Disables the packed row feature for colocated tables, effectively preventing a possible encounter with the underlying issue in 21218 during debugging. #21218
  • Prevents TServers from crashing due to duplication of tablets in two drives, occurring after repairing a faulty drive, by preventing the creation of new tablets during the remote bootstrap (RBS) process. #20754
  • Eliminates potential issues with colocated tables during heavy DDL operations and compaction, reducing the risk of crashing on newer builds (2.20.0+) where the packed row feature is default. #21244
  • Allows database drop operations to proceed smoothly by ignoring missing streams errors and skipping replication checks for already dropped tables. #21070
  • Allows ListTabletServers to handle heartbeats older than 24 days by adjusting the setting to the maximum int32 value, avoiding system crash. #21096
  • Includes the indexed_table_id with the index in table listings, eliminating the need for a second lookup to associate a main table with an index. #21159
  • Corrects the RPATH setting for OpenLDAP libraries, preventing the system from picking up wrong versions or failing to find them at all. Also refactors library_packager.py for improved library dependency categorization. #21236
  • Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. #20912
  • Aborts transactions early when they fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. #21328
  • Corrects block cache metrics discrepancy by ensuring Statistics object passes into LRUCache from TableCache for accurate updates. #21407
  • Fixes a segmentation fault in yb-master by checking for a null pointer before dereferencing it, addressing an issue in the CDC run on 2.23.0.0-b37-arm. #21648
  • Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. #8744
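
The "24 days" figure in the ListTabletServers fix (#21096) is consistent with a millisecond count held in a signed 32-bit integer. The arithmetic below shows why, assuming the heartbeat age is measured in milliseconds, which these notes do not state explicitly. The clamp function is a hypothetical illustration of "adjusting the setting to the maximum int32 value".

```python
INT32_MAX = 2**31 - 1              # largest signed 32-bit value
MS_PER_DAY = 24 * 60 * 60 * 1000   # milliseconds in one day


def clamp_to_int32(elapsed_ms: int) -> int:
    """Cap an elapsed time so it always fits in a signed 32-bit field."""
    return min(elapsed_ms, INT32_MAX)


days_representable = INT32_MAX / MS_PER_DAY
print(round(days_representable, 2))       # a little under 25 days

thirty_days_ms = 30 * MS_PER_DAY
print(clamp_to_int32(thirty_days_ms) == INT32_MAX)
```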

CDC

  • Introduces a fix for data loss issue caused by faulty update of cdc_sdk_safe_time during explicit checkpointing, along with tests to ensure validity. #15718
  • Fixed the decoding of NUMERIC value in CDC records to prevent precision loss by ensuring that the decoded string is not converted to scientific notation if its length is more than 20 characters. #20414
  • Fixes issue with CDC packed rows, now ensures a single record for large insert operations, providing consistent data regardless of row size. #20310
  • Ensures consistency in the CDCSDKYsqlTest.TestLargeTxnWithExplicitStream test by raising the FLAGS_cdc_max_stream_intent_records value from 40 to 41, overcoming the issue of multiple records for a single insert when packed row size exceeds ysql_packed_row_size_limit. #20310

Other

  • Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919
  • Reduces disruptions by throttling the master process log messages related to "tablet server has a pending delete" into 20-second intervals. #19331
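
Throttling a repeated log message to one line per 20-second interval (#19331) can be sketched as follows. The class and its injectable clock are hypothetical and exist only to show the idea; the master process uses its own logging facilities.

```python
import time


class ThrottledLogger:
    """Emit a message at most once per interval_secs."""

    def __init__(self, interval_secs: float = 20.0, clock=time.monotonic):
        self.interval_secs = interval_secs
        self.clock = clock          # injectable so the sketch is testable
        self._last_emit = None

    def log(self, message: str) -> bool:
        """Print the message if the interval has elapsed; report whether it did."""
        now = self.clock()
        if self._last_emit is not None and now - self._last_emit < self.interval_secs:
            return False            # suppressed: still inside the interval
        self._last_emit = now
        print(message)
        return True


# Drive the logger with a fake clock instead of waiting 20 real seconds.
fake_now = [0.0]
logger = ThrottledLogger(20.0, clock=lambda: fake_now[0])
first = logger.log("tablet server has a pending delete")
fake_now[0] = 10.0
second = logger.log("tablet server has a pending delete")
fake_now[0] = 20.0
third = logger.log("tablet server has a pending delete")
```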

v2.20.2.2 - April 1, 2024

Build: 2.20.2.2-b1

Third-party licenses: YugabyteDB, YugabyteDB Anywhere

Downloads

Docker

docker pull yugabytedb/yugabyte:2.20.2.2-b1

This is a YugabyteDB Anywhere-only release, with no changes to the database.

v2.20.2.1 - March 22, 2024

Build: 2.20.2.1-b3

Third-party licenses: YugabyteDB, YugabyteDB Anywhere

Downloads

Docker

docker pull yugabytedb/yugabyte:2.20.2.1-b3

Bug fixes

DocDB

  • Enhances universe upgrade process by incrementing ClusterConfig version during an update and adds checks to prevent universe_uuid modification. Also introduces a yb-ts-cli command to clear the universe UUID if necessary, improving troubleshooting capabilities. #21491

CDC

  • Ensures numeric values are decoded without precision loss by utilizing string representation with no length limit and the PostgreSQL numeric_out method. #20414
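
The difference between scientific notation and a plain string representation of a numeric value (#20414) can be shown with Python's decimal module. This is an analogy only: Decimal stands in for the PostgreSQL numeric type, and the helper name is hypothetical; it is not the code path used by the CDC decoder.

```python
from decimal import Decimal


def numeric_to_plain_string(value: Decimal) -> str:
    """Render a decimal in fixed-point form, never in scientific notation."""
    return format(value, "f")


big = Decimal("1E+25")
print(str(big))                      # 1E+25
print(numeric_to_plain_string(big))  # the same value written out in full

# Going through a binary float instead would lose digits.
precise = Decimal("123456789012345678901234.5678")
print(numeric_to_plain_string(precise) == "123456789012345678901234.5678")
print(Decimal(float(precise)) == precise)
```

Keeping the value as a string for its whole length avoids both the notation change and the rounding that a floating-point round trip would introduce.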

v2.20.2.0 - March 4, 2024

Build: 2.20.2.0-b145

Third-party licenses: YugabyteDB, YugabyteDB Anywhere

Downloads

Docker

docker pull yugabytedb/yugabyte:2.20.2.0-b145

New features

  • Added support for read-only mode for DR replica, currently supported for YSQL tables.

  • Tablet splitting now can be enabled on tables with CDC configured. #18479

  • Transactional CDC now supports consistent snapshots, ensuring data integrity during replication to a sink. Snapshots, ordered by commit time across all tables and tablets, establish a reliable replication order. #19682

Improvements

YSQL

  • Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution, avoiding disruption while users transition from the use of advisory locks. #19974
  • Shifts from the test flag "FLAGS_TEST_enable_db_catalog_version_mode" to the TP flag "FLAGS_ysql_enable_db_catalog_version_mode", enhancing user control over concurrent DDL execution across different databases. #12417
  • Issues a notice for unsafe ALTER TABLE operations, including for ADD COLUMN...DEFAULT, to indicate existing rows won't be backfilled with the default value, enhancing user awareness. Suppress the notice by setting the ysql_suppress_unsafe_alter_notice flag to true. #19360
  • Added sorting capabilities to BatchNestedLoopJoin to return the rows in the same order as NestedLoopJoin. #19589
  • Replaces the ysql_max_read_restart_attempts and ysql_max_write_restart_attempts flags with yb_max_query_layer_retries, applies limit to Read Committed isolation statement retries, and adjusts retry_backoff_multiplier and retry_min_backoff defaults. #20359
  • Disallows the creation of a temporary index with a tablespace, preventing client hangs and providing a clear error message for temporary index creation with set tablespace. #19368
  • Enables index tablespace modification through the ALTER INDEX SET TABLESPACE command and regulates column statistics using the ALTER INDEX ALTER COLUMN SET STATISTICS command. Also, allows the creation and alteration of materialized views with the specified tablespace. Suppress the beta feature warning by enabling ysql_beta_feature_tablespace_alteration flag. #6639
  • Adds function to log the memory contexts of a specified backend process, enhancing memory usage monitoring and allowing users to troubleshoot memory-related issues more effectively. #14025
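
The retry settings mentioned above (#20359) combine a retry limit with a minimum backoff and a backoff multiplier. A minimal sketch of how such settings typically interact is shown below. The function, the upper bound on the delay, and the sample values are all hypothetical; they are not the actual defaults or the actual retry logic of the query layer.

```python
from typing import List


def backoff_schedule(max_retries: int, min_backoff_ms: float,
                     multiplier: float, max_backoff_ms: float) -> List[float]:
    """Return the wait before each retry, growing geometrically up to a cap."""
    delays = []
    delay = min_backoff_ms
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff_ms))
        delay *= multiplier
    return delays


# Hypothetical values: 4 retries, starting at 10 ms, doubling, capped at 50 ms.
schedule = backoff_schedule(max_retries=4, min_backoff_ms=10,
                            multiplier=2, max_backoff_ms=50)
print(schedule)  # [10, 20, 40, 50]
```

Once the retry limit is reached, the statement fails instead of retrying indefinitely, which is the effect of applying the limit to Read Committed statement retries.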

DocDB

  • Allows customizing retryable request timeouts to respect client-side YCQL and YSQL timings, optimizing log replay and preventing the tserver from rejecting requests that exceed durations. Adjusts default retryable request timeout to 660 seconds and offers a configuration to eliminate server-side retention of retryable requests with FLAGS_retryable_request_timeout_secs=0. #18736
  • Speeds up TServer Init by optimizing the handling of deleted and tombstoned tablets. It ensures faster startup by using a new flag num_open_tablets_metadata_simultaneously, which sets the number of threads for opening tablet metadata. This allows for parallel opening of metadata, improving response times even in cases with large numbers of tablets. Additionally, the handling of tablets marked as Deleted or Tombstoned is managed asynchronously, marking tombstoned tablets as dirty for inclusion in the next heartbeat. #15088
  • Logs all instances of tablet metadata creation and update, enabling additional insights for troubleshooting in cases of multiple meta records for the same tablet. #20042
  • Adds a 10-second delay (auto_flags_apply_delay_ms) between AutoFlag config updates and their application, allowing all tservers to receive the new config before applying it. This change enhances configuration consistency and update safety. #19932
  • Enhances thread safety by setting Wthread-safety-reference to check when guarded members are passed by reference and resolving all build errors resulting from this change. #19365
  • Automates recovery of index tables impacted by a bug, preventing performance degradation and disk size leak, by ensuring schema.table_properties.retain_delete_markers is reset to false when index backfilling is done. #19731
  • Enhances the demote_single_auto_flag yb-admin command by returning distinct error messages for invalid process_name, flag_name, or non-promoted flag, thereby aiding easier identification of errors. Replaces HasSubstring with ASSERT_STR_CONTAINS in the AutoFlags yb-admin test. #20004
  • Introduces support for Upgrade and Downgrade of universes with xCluster links, enhancing compatibility checks for AutoFlags during these operations. Data replication between two universes is only allowed if AutoFlags of the Target universe are compatible with the Source universe. The compatibility check, stored in ProducerEntryPB, triggers if the AutoFlag configuration changes during upgrades and rollbacks. This prevents unnecessary RPC calls if no AutoFlag configurations have changed. Also includes fixes for cds initialization bugs in TestThreadHolder. #19518
  • Offers redesigned server level aggregation for metrics, thus introducing more metrics for enhanced debugging. Removes several unused URL parameters and makes the new output compatible with YBA and YBM, preventing double-counting issues in charts. Drops unused JSON and Prometheus callbacks from MetricEntity for a cleaner design. #18078
  • Allows a limit on the addition of new tablet replicas in a cluster to conserve CPU resources, with safeguards for downscaling. Introduces test flags for controlling memory reservation and tablet replica per core limits. #16177
  • Introduces verbose logging for global and per table state transitions in the load balancer to facilitate easier debugging. #20289
  • Reduces server initialization time by eliminating the accumulation of deleted tablet superblocks during startup, through a modification in the DeleteTablet operation. #19840
  • Enables automatic recovery of index tables affected by a bug, verifying their backfilling status and correcting the retain_delete_markers property to enhance performance. #20247
  • Includes single shard (fast-path) transactions in the pg_locks by querying single shard waiters registered with the local waiting transaction registry at the corresponding Tserver, ensuring more complete transaction tracking and lock status reporting. #18195
  • Creates Prometheus metrics for server hard and soft memory limits which allow detailed insight into TServer or master memory usage regardless of Google flag defaults; aids in creating dashboard charts to monitor utilization close to the soft limit or TServer TCMalloc overhead. #20578
  • Enables control over the batching of non-deferred indexes during backfill via a new flag, improving index management. #20213
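
Opening tablet metadata in parallel with a configurable number of threads (#15088) follows a common worker-pool pattern, sketched below. The function and the stand-in loader are hypothetical; num_threads plays the role of the num_open_tablets_metadata_simultaneously flag, and the real TServer startup code is not shown in these notes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def open_all_metadata(tablet_ids: List[str],
                      open_one: Callable[[str], str],
                      num_threads: int) -> List[str]:
    """Open metadata for every tablet using up to num_threads workers.

    Results come back in the same order as tablet_ids.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(open_one, tablet_ids))


def fake_open(tablet_id: str) -> str:
    """Stand-in for reading one tablet's metadata from disk."""
    return f"meta:{tablet_id}"


opened = open_all_metadata(["t1", "t2", "t3"], fake_open, num_threads=2)
print(opened)
```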

yugabyted

  • Corrects the CPU usage reporting in the sankey diagram by filtering nodes based on region selection on the performance page. #19991

  • Repairs yugabyted-ui functionality when using custom YSQL and YCQL ports by passing values to yugabyted-ui, ensuring the correct operation of the user interface. #20406

Bug fixes

YSQL

  • Reduces Proc struct consumption by enabling cleanup for killed background workers, preventing webserver start-up failure after 8 attempts. #20154
  • Frees up large volumes of unused memory from the webserver after processing queries, enhancing periodic workloads by reallocating held memory to the OS more effectively. #20040
  • Introduces two new boolean flags for YSQL webserver debugging and failure identification. The first will log every endpoint access and its list of arguments before running the path handler. The second will print basic tcmalloc stats after the path handler and garbage collection have run. #20157
  • Rectifies an issue causing segmentation faults when the postmaster acquires already owned LWLock, enhancing stability during process cleanup. Uses KilledProcToCleanup instead of MyProc if acquiring locks under the postmaster. #20166
  • Restarts postmaster in critical sections. This applies to backend PG code; defines and improves handling of errors in these sections. #20255
  • Introduces the pg_stat_statements.yb_qtext_size_limit flag to control the maximum file size read into memory, preventing system crashes due to oversized or corrupt qtext files. #20211
  • Caps the number of attempts to read inconsistent backend entries to 1000 for safer operation and better visibility, limiting potential infinite loops in /rpcz calls and logging every hundredth attempt. #20274
  • Resolves segmentation faults in the webserver SIGHUP handler during cleanup. #20309
  • Displays consistent wait-start times in the pg_locks view for waiting transactions. #18603, #20120
  • Reduces excessive memory consumption during secondary index scans. #20275
  • Allows the finalize_plan function to now apply finalize_primnode on PushdownExprs, ensuring Parameter nodes in subplans transfer values accurately, especially in Parallel Queries. #19694
  • Return correct results when Batch Nested Loop join is used for queries involving Nested LEFT JOINs on LATERAL views. #19642, #19946
  • Upgrades "Unknown session" error to FATAL permitting drivers to instantly terminate stale connections, minimizing manual user intervention. #16445
  • Rectifies correctness issues when BatchNestedLoop join is used and the join condition contains a mix of equality and non-equality filters. #20531
  • Mitigates incorrect data entry into the default partition by incrementing the schema version when creating a new partition, enhancing data consistency across all connections. #17942
  • Rectifies a correctness issue when a join uses an inequality condition and the join columns contain NULL values. #20642
  • Rectifies a correctness issue when queries involving outer joins and aggregates use BNL. #20660
  • Corrects the Batch Nested Loop's optimization logic for proper handling of cases where the given limit matches the outer table's exact size, ensuring accurate query results. #20707
  • Prevents "Not enough live tablet servers to create table" error during ALTER TABLE SET TABLESPACE by correctly supplying the placement_uuid, even when creating a table in the same tablespace. #14984
  • Addresses a bug that caused backup failure due to the absence of yb_catalog_version, by ensuring the function's existence post-normal migration. #18507
  • Corrects an error in the aggregate scans' pushdown eligibility criteria to prevent wrong results from being returned when PG recheck is not expected, but YB preliminary check is required to filter additional rows. #20709
  • Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. #20396
  • Fixes a MISMATCHED_SCHEMA error when upgrading from version 2.16 to 2.21 by introducing a 2-second delay for catalog version propagation when a breaking DDL statement is detected. #20842
  • Return correct result for join queries using BatchNestedLoop join and DISTINCT scan using the range Index during inner table scan. #20827
  • Fixes a memory corruption issue that caused failures when creating a valid execution plan for SELECT DISTINCT queries. Disabling distinct pushdown lets these queries execute without errors and prevents server connection closures, improving the stability of SELECT DISTINCT queries. #20893
  • Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries. #21004
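
The bounded read loop described above (#20274), with a cap of 1000 attempts and a log line on every hundredth attempt, can be sketched as follows. The function signature and the callback shapes are hypothetical; only the two numbers come from these notes.

```python
from typing import Callable, Optional, Tuple


def read_consistent_entry(read_once: Callable[[], Tuple[Optional[str], bool]],
                          max_attempts: int = 1000,
                          log_every: int = 100,
                          log: Callable[[str], None] = print) -> Optional[str]:
    """Retry read_once until it reports a consistent entry or attempts run out.

    read_once returns (entry, consistent). Returns None if every attempt
    was inconsistent, instead of looping forever.
    """
    for attempt in range(1, max_attempts + 1):
        entry, consistent = read_once()
        if consistent:
            return entry
        if attempt % log_every == 0:
            log(f"backend entry still inconsistent after {attempt} attempts")
    return None


# A reader that never becomes consistent: the loop stops at the cap.
messages = []
result = read_consistent_entry(lambda: (None, False), log=messages.append)
print(result, len(messages))
```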

DocDB

  • Resolves potential WriteQuery leak issue in CQL workloads, ensuring proper execution and destruction of queries, while preventing possible tablet shutdown blockages during conflict resolution failure. #19919
  • Mitigates the issue of uneven tablet partitions and multiple pollers writing to the same consumer tablet by only applying intents on the consumer that match the producer tablet's key range. If some keys/values are filtered out from a batch, it will not delete the consumer's intents, as they may be needed by subsequent applications. Also guarantees idempotency, even if some apply records stutter and fetch older changes. #19728
  • Reduces chances of transaction conflicts upon promotion by delaying the sending of UpdateTransactionStatusLocation RPCs until after the first PROMOTED heartbeat response is received, enhancing transaction consistency and accuracy. #17319
  • Sets kMinAutoFlagsConfigVersion to 1 to resolve flag configuration mismatch issue. #19985
  • Unblocks single shard waiters once a blocking subtransaction rolls back, by applying identical conflict check logic for both distributed transactions and single shard transactions. #20113
  • Eliminates a race condition that can occur when simultaneous calls to SendAbortToOldStatusTabletIfNeeded try to send the abort RPC, thus preventing avoidable FATALs for failed geo promotions. #17113
  • Fixes behavior of Tcmalloc sampling; deprecates the enable_process_lifetime_heap_sampling flag in favor of directly setting profiler_sample_freq_bytes for controlling tcmalloc sampling, enhancing control over sampling process. #20236
  • Resolves instances of the leaderless tablet endpoint incorrectly reporting a tablet as leaderless post-leader change, by tweaking the detection logic to depend on the last occurrence of a valid leader, ensuring more accurate tablet reporting. #20124
  • Allows early termination of ReadCommitted transactions with a kConflict error, enhancing overall system throughput by eliminating unnecessary blockages without waiting for the next rpc restart. #20329
  • Fixes FATAL errors occurring during tablet participant shutdown due to in-progress RPCs by ensuring rpcs_.Shutdown is invoked after all status resolvers have been shut down. #19823
  • Modifies SysCatalog tablet's retryable request retention duration to consider both YQL and YSQL client timeouts, reducing the likelihood of "request is too old" errors during YSQL DDLs. #20330
  • Handles backfill responses gracefully even when they overlap across multiple operations, reducing risks of crashes and errors due to network delays or slow masters. #20510
  • Reintroduces the use of bloom filters during multi-row insert, improving conflict resolution and rectifying missing conflict issues, while also addressing the problem in GH 20648. #20398, #20648
  • Fixes handling of duplicate placement blocks in under-replication endpoint for better compatibility and correct replica counting, preventing misrepresentation of under-replicated tablespaces. #20657
  • Logs the first failure during setup replication instead of the last error, facilitating better error diagnosis. #20689
  • Implements a retry mechanism to acquire the shared in-memory locks during waiter resumption. Rather than failing after a single attempt, it now schedules retrying until the request's deadline, reducing request failures due to heavy contention. #20651, #19032, #19859
  • Reduces log warnings in normal situations by downgrading repeated waiter resumption alerts to VLOG(1), benefiting from the direct signaling of transaction resolution. #19573
  • Disables the packed row feature for colocated tables to prevent possible write failures after compactions, as a workaround while investigating issue 20638. #21047
  • Resolves database deletion failures by skipping replication checks for dropped tables during the database drop process, addressing errors related to missing streams. #21070
  • Addresses a race condition on kv_store_.colocation_to_table reading and permits packed row features for colocated tables, tackling undefined behaviors and failed table writes. #20638
  • Disables the packed row feature for colocated tables to prevent the issue 20638 that causes subsequent write failures after certain compactions. #21047
  • Corrects RPATH setting for OpenLDAP libraries, preventing the system from picking up wrong versions or failing to find them at all. #21236
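
The key-range filtering described above (#19728), where only intents that match the producer tablet's key range are applied, can be pictured with the sketch below. It assumes half-open ranges and treats an empty end key as unbounded; both conventions, along with the function name and the sample keys, are assumptions made for illustration.

```python
from typing import List, Tuple


def filter_to_key_range(records: List[Tuple[bytes, int]],
                        start_key: bytes,
                        end_key: bytes) -> List[Tuple[bytes, int]]:
    """Keep (key, value) pairs with start_key <= key < end_key.

    An empty end_key means the range has no upper bound (assumption).
    """
    return [(key, value) for key, value in records
            if key >= start_key and (end_key == b"" or key < end_key)]


batch = [(b"a", 1), (b"m", 2), (b"z", 3)]
print(filter_to_key_range(batch, b"b", b"n"))  # only the b"m" record
print(filter_to_key_range(batch, b"m", b""))   # b"m" and everything after it
```

Records that fall outside the range are skipped rather than deleted, which matches the note that filtered-out intents may still be needed by a later apply.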

CDC

  • Corrects the computation of the cdcsdk_sent_lag metric to prevent steep, disproportionate increases by updating the last_sent_record_time when a SafePoint record is spotted, in addition to DMLs and READ ops. #15415

Other

  • Adjusts tserver start and tserver stop scripts to successfully terminate all running PG processes, irrespective of their PID digit count. #19817
  • Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919

v2.20.1.3 - January 25, 2024

Build: 2.20.1.3-b3

Third-party licenses: YugabyteDB, YugabyteDB Anywhere

Downloads

Docker

docker pull yugabytedb/yugabyte:2.20.1.3-b3

Bug fixes

YSQL

  • Fixed index scans where a query with join on inequality condition incorrectly returns rows with NULL in the key column. #20642

DocDB

  • Restore bloom filter usage during multi-row insert. #20398, #20648

v2.20.1.2 - January 17, 2024

Downloads

Use 2.20.1.3 or later.

Bug fixes

YSQL

  • Fix BNL local join lookup equality function. #20531

CDC

  • Fix addition of new tables to stream metadata after drop table. #20428

v2.20.1.1 - January 11, 2024

Downloads

Use 2.20.1.3 or later.

Improvements

YSQL

  • In Read Committed Isolation, limit the number of retry attempts when the aborted query is retried. #20359

Bug fixes

YSQL

  • Return correct results when Batch Nested Loop join is used for queries involving Nested LEFT JOINs on LATERAL views. #19642, #19946

  • Added a regression test for nested correlated subquery. #20316

CDC

  • Fixed decimal type precision while decoding CDC record. #20414

DocDB

  • In Read Committed Isolation, immediately abort transactions when conflict is detected. #20329

v2.20.1.0 - December 27, 2023

Downloads

Use 2.20.1.3 or later.

Improvements

YSQL

  • Adjusts ysql_dump when run with --include-yb-metadata argument, to circumvent unnecessary automatic index backfilling. This avoids unnecessary RPC calls. #19457
  • Imports the pgcrypto: Check for error return of px_cipher_decrypt upstream PostgreSQL patch; OpenSSL 3.0+ prerequisite. #19732
  • Imports the upstream PG commit Disable OpenSSL EVP digest padding in pgcrypto; OpenSSL 3.0+ prerequisite. #19733
  • Imports an upstream PostgreSQL commit that adds an alternative output when OpenSSL 3 doesn't load legacy modules; OpenSSL 3.0+ prerequisite. #19734
  • Removes the need for a table scan when adding a new column with a NOT NULL constraint and a non-volatile DEFAULT value, enhancing ADD COLUMN operations. #19355
  • Mitigates CVE-2023-39417 by importing a security-focused upstream Postgres commit from REL_11_STABLE for YSQL users. #14419
  • Fixes CVE-2020-1720 by importing upstream Postgres commit from REL_11_STABLE, enabling support for ALTER <object> DEPENDS ON EXTENSION in the future for objects like function, procedure, routine, trigger, materialized view, and index. #14419
  • Modifies the planner to allow ordered index scans with IN conditions on lower columns; this takes advantage of the YB LSM indexes, which maintain index order. #19576
  • Increases the oom_score_adj (Out of Memory score adjustment) of the YSQL webserver to 900, the same as PG backends, prioritizing its termination when it consumes a significant amount of memory. #20028
  • Blocks the use of advisory locks in YSQL, as they are currently unsupported. #18954
  • Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution, avoiding disruption while users transition from the use of advisory locks. #19974
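The ordered index scan change for IN conditions (#19576) can be pictured with a small model. The sketch below is not the YSQL planner. It assumes a hypothetical index on columns (a, b), represented as a sorted Python list, with the IN condition on the leading column; the names INDEX and ordered_in_scan are illustrative only. It shows why visiting the IN values in sorted order returns rows that are already in index order, so no separate sort step is needed.

```python
from bisect import bisect_left

# Hypothetical stand-in for an LSM index on (a, b): entries kept in key order.
INDEX = sorted([(1, 5), (1, 9), (2, 1), (2, 7), (3, 3), (3, 4), (4, 2)])


def ordered_in_scan(index, in_values):
    """Yield index entries whose leading column is in in_values, in index order."""
    # Visit the IN values in sorted order (duplicates dropped) so that the
    # concatenated output follows the index order.
    for value in sorted(set(in_values)):
        # Seek to the first entry whose leading column is >= value.
        pos = bisect_left(index, (value,))
        # Read forward while the leading column still matches.
        while pos < len(index) and index[pos][0] == value:
            yield index[pos]
            pos += 1


rows = list(ordered_in_scan(INDEX, [3, 1]))
print(rows)
```

Each IN value becomes one seek followed by a short forward read, which is the access pattern an order-preserving index makes cheap.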

DocDB

  • Enables checks for trailing zeros in SST data files to help identify potential corruption. #19691
  • Includes version information in the error message when the yb process crashes due to AutoFlags being enabled on an unsupported older version, enhancing the ease of identifying the issue. #16181
  • Prints long trace logs with a continuation marker when a trace is split into multiple LOG(INFO) outputs. This is for log readability. #19532, #19808
  • Upgrades OpenSSL to version 3.0.8, disabling Linuxbrew builds and updating glog for stack unwinding based on the backtrace function. #19736
  • Expands debug information to help investigate SELECT command errors that imply faults in read path processing or provisional record writing. #19876
  • Adds support for rocksdb_max_sst_write_retries flag: maximum allowed number of attempts to write SST file in case of detected corruption after write (by default 0 which means no retries). Implemented for both flushes and compactions. #19730
  • Balances load during tablet creation across all drives, preventing bottlenecks and underutilized drives during remote-bootstrapping; uses total number of tablets assigned for tie-breaking. #19846
  • Adds tserver page to display all ongoing Remote Bootstrap sessions, and shows the source of bootstrap sessions in the Last Status field, for improved visibility. #19568
  • Enhances error reporting from XClusterPollers by storing just the error code instead of detailed status, making it safer against master and Tserver restarts and reducing memory usage. #19455
  • Incorporates a JoinStringsLimitCount utility for displaying only the first 20 elements in a large array, saving memory and space while logging or reporting tablet IDs. #19527
  • Adds a sanity check to prevent restoration of the initial sys catalog snapshot when the master_join_existing_universe flag is set. #19357
  • Parallelizes RPCs in DoGetLockStatus to fetch locks swiftly, enhancing the database's response time while retrieving old transactions. #18034
  • Simplifies xCluster replication failure debugging via a new get_auto_flags_config yb-admin command, returning the current AutoFlags config. #20046
  • Enables automatic recovery of index tables affected by #19544, improving performance and disk size management by removing excess tombstones in the SST files. #19731
  • Introduces verbose logging for global and per table state transitions in the load balancer to facilitate easier debugging. #20289
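The idea behind the JoinStringsLimitCount utility (#19527) is to cap how many elements of a large array reach a log line. The following is a rough Python analogue, not the actual C++ utility: the function name, the separator, and the format of the trailing summary are assumptions made for illustration. Only the limit of 20 elements comes from the release note.

```python
def join_strings_limit_count(items, limit=20, sep=", "):
    """Join at most `limit` items and summarize the remainder with a count."""
    items = list(items)
    shown = sep.join(str(item) for item in items[:limit])
    omitted = len(items) - limit
    if omitted > 0:
        # The suffix format is an assumption for illustration only.
        return f"{shown}{sep}... ({omitted} more)"
    return shown


# Example: 25 hypothetical tablet IDs collapse to 20 entries plus a count.
tablet_ids = [f"tablet-{i}" for i in range(25)]
summary = join_strings_limit_count(tablet_ids)
print(summary)
```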

CDC

  • Allows the catalog manager to eliminate erroneous entries from the cdc_state table for newly split tablets; prevents race conditions by reversing the order of operations during the CleanUpCDCStreamsMetadata process. #19746

yugabyted

  • Resolves 0.0.0.0 to 127.0.0.1 as the IP for master, tserver and yugabyted-UI when 0.0.0.0 is specified as the advertise_address in yugabyted start command. #18580
  • Ensures the yugabyted UI can fetch and display Alert messages from different nodes by adjusting the /alerts endpoint in the API server to include a node_address parameter. #19972
  • Eliminates the deprecated use_initial_sys_catalog_snapshot gflag in yugabyted, thus reducing log warning messages. #20056

Other

  • Better indentation of multi-line remote trace entry output, for readability. #19758
  • Implements a preventative safeguard in tserver operations by adding a check that a tablet has explicitly been deleted before issuing a DeleteTablet command, minimizing data loss risks and enhancing reliability. This feature, enabled with master_enable_deletion_check_for_orphaned_tablets=true, is upgrade and downgrade safe. #18332

Bug fixes

YSQL

  • Allows the postprocess script in pg_regress to run on alternative expected files alongside default_expectfile and platform_expectfile, fixing unintended mismatches. #19737
  • Prevents stuck PostgreSQL processes and ensures successful acquisition of locks in the future. This enhancement particularly aids in preventing deadlocks when creating replication slots with duplicate names. #19509
  • Ensures OID uniqueness within a database in YB environment by introducing a new per-database OID allocator, avoiding risk of collisions across multiple nodes or tenants. This change is safe for upgrades and rollbacks, enabled by a new gflag ysql_enable_pg_per_database_oid_allocator. #16130
  • Prevents postmaster crashes during connection termination cleanup by using the killed process's ProcStruct to wait on a lock. #18000
  • Fixes the distinct query iteration to distinguish between deleted and live keys; this prevents missing live rows during distinct queries on tables with dead tuples. #19911
  • Fixes bug when type-checking bound tuple IN conditions involving binary columns like UUID. #19753
  • Ensures PK modification does not disable Row-Level Security (RLS). #19815
  • Tracks spinlock acquisition during process initialisation, and reduces the time to detect spinlock deadlock. #18272, #18265
  • Reverts to PG restart in rare event of process termination during initiation or cleanup; this avoids needing to determine what subset of shared memory items to clean. #19945
  • Prevents unexpected crashes during plan creation when joining two varchar columns. Handles the derivation of element_typeid in Batched Nested Loop (BNL) queries; casts both sides to text to ensure valid equality expression. #20003
  • Reduces Proc struct consumption by enabling cleanup for killed background workers, preventing webserver start-up failure after 8 attempts. #20154
  • Frees up large volumes of unused memory from the webserver after processing queries, enhancing periodic workloads by reallocating held memory to the OS more effectively. #20040
  • Introduces two new boolean flags for YSQL webserver debugging and failure identification. The first logs every endpoint access and its list of arguments before running the path handler. The second prints basic tcmalloc stats after the path handler and garbage collection have run. #20157
  • Rectifies an issue causing segmentation faults when the postmaster acquires already owned LWLock, enhancing stability during process cleanup. Uses KilledProcToCleanup instead of MyProc if acquiring locks under the postmaster. #20166
  • Improves insight into webserver memory usage through :13000/memz and :13000/webserver-heap-prof, which print tcmalloc stats and display current or peak allocations, respectively. #20157
  • Restarts postmaster in critical sections. This applies to backend PG code; defines and improves handling of errors in these sections. #20255
  • Caps the number of attempts to read inconsistent backend entries to 1000 for safer operation and better visibility, limiting potential infinite loops in /rpcz calls and logging every hundredth attempt. #20274
  • Resolves segmentation faults in the webserver SIGHUP handler during cleanup. #20309
  • Allows the finalize_plan function to now apply finalize_primnode on PushdownExprs, ensuring Parameter nodes in subplans transfer values accurately, especially in Parallel Queries. #19694
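The cap on reads of inconsistent backend entries (#20274) follows a common bounded-loop pattern. The sketch below illustrates it in Python under stated assumptions: read_once is a hypothetical callable that returns an entry plus a consistency flag, and returning None when the cap is reached is an illustrative choice, not the documented behavior. The limit of 1000 attempts and the logging on every hundredth attempt come from the release note.

```python
import logging

MAX_READ_ATTEMPTS = 1000  # cap stated in the release note
LOG_EVERY = 100           # log every hundredth attempt


def read_consistent_entry(read_once, max_attempts=MAX_READ_ATTEMPTS):
    """Call read_once() until it reports a consistent entry or the cap is hit.

    read_once is a hypothetical callable returning (entry, is_consistent).
    Returns the entry, or None if no consistent read happened within the cap
    (the None return is an assumption for this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        entry, is_consistent = read_once()
        if is_consistent:
            return entry
        if attempt % LOG_EVERY == 0:
            logging.warning(
                "backend entry still inconsistent after %d attempts", attempt
            )
    return None


# Example: a fake reader that only becomes consistent on its third call.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    return "entry", calls["n"] >= 3


result = read_consistent_entry(flaky_read)
print(result, calls["n"])
```

Bounding the loop turns a potential hang into a visible, logged failure.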

DocDB

  • Enables Masters to decrypt the universe key registry correctly when a Master turns into a leader following Remote Bootstrap. #19513
  • Fixes the retryable requests mem-tracker metric so it's aggregated at table level. #19301
  • Ensures data consistency by checking whether there is a pending apply record, preventing the removal of large transactions that are still applying intents during bootstrap. #19359
  • Prevents a write to freed memory; RefinedStream::Connected returns status instead of calling context_->Destroy(status). #19727
  • Allows serving remote bootstrap from a follower peer, i.e. initial remote log anchor request is at the follower's last logged operation id index, reducing probability of fallback to bootstrap from the leader; this could increase system efficiency under high load. #19536
  • Corrects Master's tablet_overhead memory tracker to correctly display consumption, ensuring accurate reflection of data in Memory Breakdown UI page, and aligns MemTracker metric names between TServer and Master. #19904
  • Resolves intermittent index creation failure for empty YCQL tables by validating the is_running state instead of checking the index state directly; prevents inaccurate values in old indexes and ensures retain_delete_markers is correctly set to false. #19933
  • Addresses a deadlock issue during tablet shutdown with wait-queues enabled by modifying the Wait-Queue shutdown path to set shutdown flags during StartShutdown and process callbacks during CompleteShutdown. #19867
  • Reduces race conditions in MasterChangeConfigTest.TestBlockRemoveServerWhenConfigHasTransitioningServer by letting asynchronous threads operate on ExternalMaster* copies, not current_masters. #19927
  • Allows only a single heap profile to run at a time, restricting concurrent profile runs to avoid misinterpretation of sampling frequency reset values. #19841
  • Eliminates misleading Transaction Metadata Missing errors during a deadlock by replacing them with more accurate deadlock-specific messages. #20016
  • Fixes the behavior of tcmalloc sampling; deprecates the enable_process_lifetime_heap_sampling flag in favor of directly setting profiler_sample_freq_bytes, giving more direct control over the sampling process. #20236
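The single-heap-profile restriction (#19841) amounts to rejecting a second profile request while one is in progress. A minimal Python sketch of that pattern follows. It is not the YugabyteDB implementation: run_heap_profile, the collect callable, and the rejection message are all hypothetical names chosen for illustration.

```python
import threading

# One process-wide lock guards the (hypothetical) heap profiler.
_profile_lock = threading.Lock()


def run_heap_profile(collect):
    """Run collect() only if no other heap profile is in progress.

    Returns (True, result) on success, or (False, reason) when rejected.
    """
    # Non-blocking acquire: a concurrent caller is rejected rather than
    # queued, so two profiles never interleave their sampling settings.
    if not _profile_lock.acquire(blocking=False):
        return False, "heap profile already running"
    try:
        return True, collect()
    finally:
        _profile_lock.release()


# Example: a request issued while a profile is running is turned away.
def profile_that_triggers_another():
    return run_heap_profile(lambda: "inner profile")


ok, inner = run_heap_profile(profile_that_triggers_another)
print(ok, inner)
```

Rejecting rather than queuing keeps any sampling settings changed by the running profile from being captured and later restored by a second one.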

CDC

  • Corrects a deadlock during the deletion of a database with CDC streams by ensuring namespace-level CDC stream does not add the same stream per table more than once. #19879

yugabyted

  • Adds a check for a join flag to avoid an error when starting two different local RF-1 YugabyteDB instances on macOS. #20018

Other

  • Rectifies the CentOS GCC11 and background worker asan build issues by utilizing stack allocation and eliminating the unsupported std::string::erase with const iterators. #20147, #20087
  • Fixes segmentation fault in stats collector after Postmaster reset by cleaning up after terminated backends and properly resetting shared memory and structures, resulting in improved stability and reliability of the system. #19672
  • Allows custom port configurations in multi-region/zone YugabyteDB clusters to prevent connectivity problems during cluster setup. #15334

v2.20.0.2 - December 15, 2023

Downloads

Use 2.20.1.3 or later.

This is a YugabyteDB Anywhere-only release, with no changes to the database.

v2.20.0.1 - December 5, 2023

Downloads

Use 2.20.1.3 or later.

Bug fixes

  • [19753] [YSQL] Fix tuple IN condition type-checking in the presence of binary columns
  • [19904] [DocDB] Fix Master's tablet_overhead mem_tracker zero consumption issue
  • [19911] [DocDB] Avoid AdvanceToNextRow seeking past live rows upon detecting a deleted row during distinct iteration
  • [19933] [DocDB] Backfill done may not be triggered for empty YCQL table

v2.20.0.0 - November 13, 2023

Downloads

Use 2.20.1.3 or later.

Highlights

  • Support for OIDC token-based authentication via Azure AD. This allows YSQL database users to sign in to YugabyteDB universes using their JSON Web Token (JWT) as their password.
  • Support for Transactional CDC.
  • Wait-on conflict concurrency control with pg_locks. This feature is especially important to users that have workloads with many multi-tablet statements, which may have multiple sessions contending for the same locks concurrently. Wait-on conflict concurrency control helps avoid a deadlock scenario as well as ensure fairness so a session is not starved by new sessions. EA
  • Catalog metadata caching to improve YSQL connection scalability and performance with faster warm up times. EA
  • Support for ALTER COLUMN TYPE with on-disk changes is now GA for production deployments. This feature expands ALTER TABLE capabilities with a broader range of data type modifications, including changes to the type itself, not just size increments.

New Features

  • [13686] [DocDB] Introduce the ability to Rollback And Demote an AutoFlag
  • [18748] [CDCSDK] Change default checkpointing type to EXPLICIT while stream creating

Improvements

  • [4906] [YSQL] Uncouple variables is_single_row_txn from is_non_transactional
  • [9647] [DocDB] Add metric for the number of running tablet peers on a YB-TServer
  • [12751] [CDCSDK] Made maxAttempts and sleepTime for retrying RPCs configurable in AsyncClient
  • [13358] [YSQL] YSQL DDL Atomicity Part 5: YB-Backup support
  • [16177] [DocDB] Mechanisms to reject table creation requests based on cluster resources, switch the tablet limit gflags from test flags to runtime flags
  • [16785] [DocDB] Add tracking for event stats metrics
  • [17904] [DocDB] Prevent YB-TServer heartbeats to master leader in a different universe
  • [18055] [DocDB] Resume waiters in consistent order across tablets
  • [18335] [DocDB] Reject ChangeConfig requests for sys catalog when another server is amidst transition
  • [18384] [14114] [DocDB] Throw consistent error on detected deadlocks
  • [18522] [YSQL] Change default unit for yb_fetch_size_limit to bytes from kilobytes
  • [18940] [DocDB] Metric to track active WriteQuery objects
  • [19071] [DocDB] Add estimated bytes/count to pprof memory pages
  • [19099] [YSQL] [DocDB] Adopt the trace correctly in pg_client_session
  • [19203] [xCluster] Add xCluster YB-TServer UI page, Switch to throughput in the YB-TServer UI pages from MiBps to KiBps
  • [19221] [DocDB] Improved timeout handling for YCQL index scan
  • [19272] [DocDB] Clear ResumedWaiterRunner queue during WaitQueue Shutdown
  • [19274] [xCluster] Update xCluster apply safe time if there are no active transactions
  • [19292] [CDCSDK] Publish snapshot_key in GetCheckpointResponse only if it is present
  • [19295] [yugabyted] Password Authentication showing enabled in UI even when it's not turned on.
  • [19301] [DocDB] Make re-tryable requests mem-tracker metric to be aggregated for Prometheus
  • [19351] Harden more YSQL backends manager tests
  • [19353] [xCluster] Introduce XClusterManager, Cleanup code on master and XClusterConsumer
  • [19417] [DocDB] Support tracing UpdateConsensus API
  • [19454] build: Move macOS support off of mac11
  • [19482] [CDCSDK] Add GFlag to disable tablet split on tables part of CDCSDK stream
  • [19532] [DocDB] Allow Traces to get past Glog's 30k limit
  • [YSQL] Disable backends manager by default
  • [YSQL] Some extra logging in Postgres, Automatically determine upstream branch name from local branch name

Bug fixes

  • [17025] [xCluster] Calculate replication lag metrics for split tablet children
  • [17229] [DocDB] Prevent GC of any schema packings referenced in xCluster configuration
  • [18081] [19535] [DocDB] Update TabletState only for tablets that are involved in write, Ignore statuses from old status tablet on promotion
  • [18157] [DocDB] Don't pack rows without liveness column
  • [18540] [xCluster] [TabletSplitting] Fix setting opid in cdc_state for split children
  • [18711] [YSQL] Fix pg_stat_activity taking exclusive lock on t-server session
  • [18732] [DocDB] Don't update cache in LookupByIdRpc if the fetched tablet has been split
  • [18909] [YSQL] Fix ALTER TYPE on temporary tables
  • [18911] [19382] [YSQL] Fix ALTER TYPE null constraint violation and failure for range key tables
  • [19021] [YSQL] Fix bug in computation of semi/anti join factors during inner unique joins
  • [19033] [DocDB] PGSQL operation triggered during parent tablet shutting down may be not retried
  • [19063] [YSQL] Fix table rewrite on a table referenced through a partitioned table's foreign key
  • [19308] [YSQL] Fix bug where tuple IN filters were not bound to the request
  • [19316] [19314] [yugabyted] Bugs regarding --join flag provided.
  • [19329] [YSQL] Avoid O(n^2) complexity in list traversal while generating ybctid for IN operator
  • [19348] [CDCSDK] Only delete removed tablets from cdc_state in CleanUpCDCStreamsMetadata
  • [19384] [YSQL] Fix handling of case where a given RowCompareExpression cannot be bound
  • [19385] [CDCSDK] Fix WAL GC issue for tables added after stream creation
  • [19394] [CDCSDK] Only populate key to GetChangesRequest when it is not null, Set snapshot key from correct parameter in explicit checkpointing
  • [19407] [YSQL] Single shard operations should pick read time only after conflict resolution
  • [19414] [yugabyted] UI bugs
  • [19434] [CDCSDK] Do not completely fail while fetching tablets if one table hits error
  • [19440] [DocDB] Fix bug where invalid filter key was passed to Iterator initialization in backwards scans
  • [19450] [xCluster] Fix CDCServiceTestMultipleServersOneTablet.TestUpdateLagMetrics
  • [19497] [DocDB] Collect end-to-end traces only if the parent trace is non-null
  • [19514] [DocDB] Do not ignore replicas on dead YB-TServer in under-replicated endpoint
  • [19523] [xCluster] Fix TSAN error in xClusterConfig
  • [19544] [DocDB] Hash and Range Indexes grow in size even after rows are deleted, leading to slower queries
  • [19546] [DocDB] Add missing call to Prepare in PlayChangeMetadataRequest
  • [19605] [YSQL] Fix test failures encountered upon enabling DDL Atomicity
  • [19663] [DocDB] Fix AutoFlagsMiniClusterTest.PromoteOneFlag test