What's new in the v2.21 preview release series

What follows are the release notes for the YugabyteDB v2.21 release series. Content will be added as new notable features and changes become available in the patch releases of the YugabyteDB v2.21 release series.

For an RSS feed of all release series, point your feed reader to the RSS feed for releases.

Changes to supported operating systems

YugabyteDB 2.21.0.0 and newer releases do not support v7 Linux versions (CentOS 7, Red Hat Enterprise Linux 7, Oracle Enterprise Linux 7.x), Amazon Linux 2, or Ubuntu 18. If you're currently using one of these Linux versions, upgrade to a supported OS version before installing YugabyteDB v2.21. Refer to Operating system support for the complete list of supported operating systems.

v2.21.0.0 - March 26, 2024

Build: 2.21.0.0-b545

Downloads

Docker:

docker pull yugabytedb/yugabyte:2.21.0.0-b545

Highlights

PostgreSQL Emulation mode

We're pleased to announce the tech preview of the new PostgreSQL emulation mode in the 2.21.0.0 release. This mode enables you to take advantage of many new improvements in both code compatibility and performance parity, making it even easier to lift and shift your applications from PostgreSQL to YugabyteDB. When this mode is turned on, YugabyteDB uses the Read Committed isolation level, the Wait-on-Conflict concurrency mode for predictable P99 latencies, and the new Cost Based Optimizer. The optimizer takes advantage of the distributed storage layer architecture and includes query pushdowns, LSM indexes, and batched nested loop joins to offer PostgreSQL-like performance.

You can enable the emulation mode by passing the enable_pg_parity_tech_preview flag to yugabyted when bringing up your cluster.

For example, from your YugabyteDB home directory, run the following command:

./bin/yugabyted start --enable_pg_parity_tech_preview

New YugabyteDB Kubernetes Operator

A preliminary version of the completely rewritten YugabyteDB Kubernetes Operator is available in Tech Preview. The new operator automates the deployment, scaling, and management of YugabyteDB clusters in Kubernetes environments. It streamlines database operations, reducing manual effort for developers and operators.

For more information, refer to the YugabyteDB Kubernetes Operator GitHub project.

New features

  • New Kubernetes Operator. Automated deployment and management of clusters via the Kubernetes operator pattern. Includes support for YugabyteDB universes as a Kubernetes custom resource. Backup, upgrade, scale-out, scale-in, and more are possible on this Kubernetes custom resource. TP

  • Backup and restore support in yugabyted. yugabyted now supports backup and restore of databases and keyspaces. You can also upload backups to public clouds, including AWS and GCP. TP

  • YSQL: DDL concurrency. Support for isolating DDLs per database. Specifically, a DDL in one database does not cause catalog cache refreshes or abort transactions due to a breaking change in another database. TP

  • YSQL: DDL atomicity. Ensures that YSQL DDLs are fully atomic between the YSQL and DocDB layers: if an error occurs, they are fully rolled back, and on success they are fully applied. Currently, inconsistencies between the two layers are rare but can happen. TP

  • YSQL: Lower latency for large scans with size-based fetching. Adds a static, size-based fetch limit that controls how many rows can be returned in one request from DocDB. TP

  • YSQL: ALTER TABLE support. TP Adds support for the following variants of ALTER TABLE ADD COLUMN:

    • with a SERIAL data type
    • with a volatile DEFAULT
    • with a PRIMARY KEY
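
The size-based fetching item in the list above can be pictured with a small conceptual model. The sketch below is only an illustration of limiting a response by payload size rather than by row count; it does not reflect DocDB's actual implementation, and the function name paginate_by_size and the sample row sizes are hypothetical.

```python
from typing import Iterable, Iterator, List


def paginate_by_size(rows: Iterable[bytes], size_limit: int) -> Iterator[List[bytes]]:
    """Group rows into responses whose total payload stays within size_limit bytes.

    A response always carries at least one row, so a single oversized row
    is returned on its own instead of stalling the scan.
    (Hypothetical helper for illustration only.)
    """
    page: List[bytes] = []
    page_bytes = 0
    for row in rows:
        # Close the current response before this row would push it over the limit.
        if page and page_bytes + len(row) > size_limit:
            yield page
            page, page_bytes = [], 0
        page.append(row)
        page_bytes += len(row)
    if page:
        yield page


# Made-up row sizes: three 400-byte rows followed by one 1500-byte row.
rows = [b"x" * 400, b"y" * 400, b"z" * 400, b"w" * 1500]
pages = list(paginate_by_size(rows, size_limit=1024))
page_sizes = [[len(r) for r in p] for p in pages]
print(page_sizes)
```

With a size limit, wide rows produce fewer rows per response and narrow rows produce more, which keeps the size of each response predictable regardless of row width.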

Change log

Improvements

YSQL
  • Offers consistent, specific deadlock error reporting regardless of when a transaction realizes its aborted state, through in-memory storage of recently deadlocked transaction information. #18384, #14114
  • Introduces a new model for estimating DocDB seek and next operations, enhancing the accuracy of cost calculations for index lookups, especially when various types of index filters are applied. #19354
  • Modifies the BNL costing model to charge for unmatched outer tuples in semi/anti/inner unique joins, enhancing the accuracy of join ordering for efficient query execution. #19054
  • Introduces a new flag index_scan_prefer_sequential_scan_for_boundary_condition that potentially enhances speed in range-sharded databases by utilizing sequential scan over Local Skip scan under specified conditions. #16178
  • Allows testing of seek and next estimations through added Java tests, guarding against potential regressions. #19082
  • Corrects the computation of semi/anti join factors for inner unique joins, addressing a bug in the costing code that incorrectly estimated the fraction of outer join tuples having a match. This adjustment improves the accuracy of join clause selectivity computations and, in turn, query planning. Additionally, fixes a bug in final_cost_nestloop where outer_matched_rows was inaccurately set to 0, improving query estimation and execution. #19021
  • Reintroduces the use of Local Skip scan for index scanning with primary key filters in range sharded databases, reversing a previous change due to identified correctness issues. #16178
  • Alters the YSQLDump to generate CREATE INDEX NONCONCURRENTLY instead of CREATE INDEX, preventing automated index back-filling in the backup-restore, thereby accelerating the process. #19457
  • Mitigates CVE-2023-39417 by incorporating an upstream Postgres commit from REL_11_STABLE, which prevents the substitution of extension schemas or owners matching ["$']. #14419
  • Offers quick regression tests for CBO using the cbo_stat_dump and cbo_stat_load tools, enhancing developer productivity and performance feedback by rapidly validating CBO changes through the TAQO framework. #19657
  • Ensures Row Level Security (RLS) policy remains intact during table rewrite by accurately copying both relrowsecurity and relforcerowsecurity fields. #19815
  • Sets the tuple count to 1000 for all tables appearing empty or unanalyzed when yb_enable_optimizer_statistics is true, improving Cost-Based Optimizer's query plan selection. #16825
  • Imports upstream postgres commit from REL_11_STABLE as a preventive measure for future support of DEPENDS ON EXTENSION for objects like FUNCTION, PROCEDURE, etc, mitigating potential risks like CVE-2020-1720 and CVE-2023-39417. #14419
  • Introduces sorting abilities to BNL nodes, matching their sorting properties to that of other joins, with a GUC flag yb_bnl_optimize_first_batch controlling it, enhancing performance especially in presence of small LIMIT clauses. #19589
  • Enables tracking and aggregating of table mutation counts at the cluster level by sending the counts to an auto-analyze service, easing automatic triggering of ANALYZE when mutation thresholds are exceeded. #15670
  • Ensures response cache invalidation when temporary tables are discarded without altering the catalog version, avoiding discrepancies while utilizing the advantages of session-bound modifications. #19178
  • Includes MyDatabaseId in the T-server cache key to resolve stale shared relation issues as a result of different databases sharing T-server cache entries. #19363
  • Streamlines YSQL DDL functionality by replacing the IsTransactionalDdlStatement function with the YbGetDdlMode function, offering more cohesiveness through enums instead of booleans for significant DDL modes while enabling easier addition of new modes. #19178
  • Enables the upgrade to OpenSSL 3.0+ by importing the upstream PostgreSQL commit Disable OpenSSL EVP digest padding in pgcrypto. #19733
  • Enables importing of the upstream PG commit, preparing the platform for OpenSSL 3.0+ upgrades. #19734
  • Blocks the use of advisory locks in YSQL and responds to the external client with an error message when they are requested. #18954
  • Imports the pgcrypto: Check for error return of px_cipher_decrypt upstream PG commit essential for upgrading OpenSSL to 3.0+. #19732
  • Adjusts the webserver's Out Of Memory (OOM) score through the yb_webserver_oom_score_adj flag (default 900) to prevent unnecessary shutdowns while allowing quick termination if it starts consuming excessive memory. #20028
  • Sets yb_bnl_batch_size to 1024 and yb_prefer_bnl to true by default, ensuring that BNLs replace nested loop joins without altering non-NL join plans. #19273
  • Replaces remaining unnecessary scans of the pg_inherits table with cache lookups, reducing wasteful calls to the YB-Master and optimizing DDL operations. Fixes a structuring bug in the INHERITSRELID cache for better future compatibility. #10478
  • Enables READ COMMITTED isolation by default in debug builds, eliminates setting a transaction to READ ONLY via pg_hint_plan, and updates certain tests to instead run explicitly in REPEATABLE READ. #18462
  • Introduces a new flag, ysql_use_relcache_file, to control the use of relcache init file, helping regulate Postgres backend memory usage, and modify unpredicted system table preloading, reducing overall memory usage. #19226
  • Introduces asynchronous support for ALTER INDEX SET TABLESPACE, ALTER INDEX ALTER COLUMN SET STATISTICS, CREATE MATERIALIZED VIEW with TABLESPACE, and ALTER MATERIALIZED VIEW SET TABLESPACE enhancing database flexibility, with a traceable warning for beta features that can be muted by adjusting the ysql_beta_feature_tablespace_alteration flag to true. #6639
  • Changes the default unit for yb_fetch_size_limit from kilobytes to bytes, allowing the size limit to be set to non-integer kilobyte values and enhancing query performance during upgrades. #18522
  • Enables Postgres' parallel query feature and implements parallel scan of YB tables in YBSeqScan, IndexScan, and IndexOnlyScan nodes, resulting in potentially faster query results. #18095
  • Replaces outdated PGConn Fetch* functions with more robust versions for improved database testing, now supporting additional BasePGType and OptionalPGType elements. #19906
  • Prevents creation of an index with TABLESPACE on a temporary table, averting client hangs and instead returning the error message ERROR: cannot set tablespace for temporary index. #19368
  • Offers more context to the wait states in tserver layer by adding Active Session History (ASH) metadata to Perform RPCs, providing insights for PGPROC and ASH collectors. Updates yb_enable_ash GFlag and assures upgrade/downgrade safety. #19135
  • Reduces contention and potential deadlock risk during the execution of pg_stat_activity request by introducing a transaction cache at the t-server, which stores the active sessions and their transaction mapping. This allows the request to access the cache under a shared lock, alleviating the need for an exclusive lock. #18711
  • Resolves the record type not registered error that appeared when retrieving fieldnames for batched index condition expressions in YB Batched Nested Loop through bypassing fieldname resolution for indecipherable batched expressions. #19094
  • Trims unnecessary master RPC calls during connection initialization by removing YB_YQL_PREFETCHER_NO_CACHE enum value and introducing YBCStartSysTablePrefetchingNoCache function. #19304
  • Enables the PgIndexBackfillTest.NoAbortTxn C++ test for explicit flag setting, increasing its resilience against any default changes in YSQL backend manager flags. #19351
  • Strengthens PgIndexBackfillTest.NoAbortTxn and other tests to endure potential YSQL backends manager flags' default value alterations, thereby boosting resilience. #19351
  • Enables unified server functionality following process termination by resorting to restarting the postmaster for a crashed or killed Postgres backend, contributing to simplicity and fewer bugs. #19180
  • Resolves an issue with RowCompareExpression bindings that previously led to incorrect results and occasional crashes in YbBindScanKeys by accounting for unique PgGate request conditions. #19384
  • Reduces unnecessary error logs related to tablespace during initdb by checking the FLAGS_create_initial_sys_catalog_snapshot before initiating the tablespace refresh task. #19386
  • Eliminates unnecessary error logs during initdb bootstrap process by checking for the existence of pg_yb_tablegroup catalog only in non-bootstrap mode. #19387
  • Enhances read committed isolation by enabling each statement to pick a read time on docdb when possible, ensuring more efficient operations and adding a test for this functionality. #19397
  • Removes the TransactionCache class shifting session's transactions' information closer to the session in the SessionInfo structure, averting a potential deadlock scenario by ensuring smoother test execution when per-database catalog version mode is activated. #18711
  • Corrects the handling of RowCompareExpression bindings in YbBindScanKeys to prevent inaccurate results and potential system crashes. #19384
  • Launches the yb_auh extension, building the foundation for the Active Universe History project with a circular buffer for wait events storage and a background worker for local tserver and PG backends polling. New Gflags are introduced: enable_yb_auh, yb_auh.circular_buffer_size, yb_auh.sampling_interval, and yb_auh.sample_size. Default settings are disabled, 16 MB, 1000 ms, and 500, respectively. #19127
  • Adds pg_hint_plan syntax and functionality to control batched nested loop joins, allows setting hints YbBatchedNL(t1 t2) and NoYbBatchedNL, and modifies yb_prefer_bnl handling. Also, it removes BNL's dependency on enable_nestloop and adjusts cost model. #19494
  • Enables the modification of is_single_row_txn for finer control over non-transactional writes required by COPY, index backfill, or when yb_disable_transactional_writes is set, preventing issues during non-bufferable operations for single row transactions. #4906
  • Introduces a new PG function yb_active_session_history_internal and a corresponding view yb_active_session_history for easier querying, which require the Gflag TEST_yb_enable_ash to be enabled; errors will occur otherwise. #19128
  • Enables fetching of ASH samples from all PG processes, excluding prepared transactions, background workers, and backends without set ASH metadata, using a newly-created Postgres backend. #19129
  • Introduces a NOTICE for potentially unsafe ALTER TABLE operations (such as altering primary key, altering type), ensuring users are aware of the risks. To suppress this notice, adjust the ysql_suppress_unsafe_alter_notice gflag to true. #19360
  • Adds a new column with both a NOT NULL constraint and a non-volatile DEFAULT value without needing a table scan, leading to faster YSQL Alter Table operations. No table scan is needed as all existing rows will use the non-volatile DEFAULT value in their new column, reducing constraint violation checks time. #19355
  • Simplifies the code in the pg_dml_read file by replacing the DocKeyBuilder helper class with a function and switches from using an arena array to boost::small_vector. #19685
  • Enables an alternative table rewrite approach that only drops and recreates associated DocDB tables and indexes, using the relfilenode field to map a PostgreSQL table OID to the respective DocDB table, resulting in a more efficient way to perform operations such as ALTER TYPE and ADD/DROP primary key. #4034
  • Allows ordered index scans with IN conditions on a lower column, ensuring accurate result order for YB LSM indexes, and generalizes the fix to all such indexes. #19576
  • Enables PgClientServiceImpl to periodically clear its own reserved_oids_map_, enhancing database cleaning and eliminating reliance on TabletServer for scheduling. #19916
  • Optimizes scans not requiring certain row order by allowing parallel scans of multiple partitions and secondary index scans, potentially altering the output row order in some queries without the ORDER BY clause. #13737
  • Replaces deprecated FetchValue with FetchRow, simplifying changes and fixing indentation issues in ‘pg_mini-’ without modifying formatting in other areas. #19918
  • Renames the term Active Universe History to Active Session History for enhanced comprehension. #19948
  • Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution for users to avoid disruption when using advisory locks without actual lock acquisition. #19974
  • Marks the ysql_enable_read_request_caching GFlag as non-runtime since Postgres flags, except PG_FLAGs, cannot be dynamically updated, enhancing cache configuration consistency. #19983
  • Adds a configuration option for altering default key sorting from HASH to ASC in YSQL, facilitating smoother PostgreSQL migrations and efficiently using indexes with ASC sorting, especially for inequality and ORDER BY clause queries. #19937
  • Reworks the wait event format in YSQL and ASH to match the Postgres format, enhancing compatibility and simplifying association of wait events. #19130
  • Enables the start and end of wait events in the PGGate layer through a callback, introducing a new Flusher class, which returns a FlushFuture object providing an updated wait event and flush request duration. #19137, #20022
  • Enables the pushdown of aggregates where the split is AGGSPLIT_INITIAL_SERIAL, thereby effectively forwarding phase 1 results from YB scan to a higher level, labeled as "Noop Aggregate". #19839
  • Enables ALTER TABLE rewrite commands, adding support for ALTER TABLE ADD COLUMN operations and modernizing REINDEX implementation for end-user indexes. #19563
  • Enables ignoring already existing tablespaces during YSQL DB backup-restore process with the newly added flag ignore_existing_tablespaces in the yb_backup.py script. #20334
  • Adjusts preload settings to allow users to specify additional tables in the ysql_catalog_preload_additional_table_list without forcing preloading of default tables. #20290
  • Adds Storage Row statistics to the EXPLAIN (ANALYZE,DIST) output, enabling users to distinguish between work done by the storage layer and the query layer and understand the selectivity of remote filters and index conditions. #12676
  • Reworks TID expectations in index scans for more clarity and convenience by sidelining the use of TID t_self or t_ybctid and ensuring the setting of either yb_agg_slot, xs_hitup, or xs_itup. #20373
  • Refactors IndexScanDesc yb_agg_slot to prevent setting during non-pushdown cases and eliminates return value from ybFetchNext for unnecessary instances, preventing future misuse. #20371
  • Replaces existing retry attempt flags ysql_max_read_restart_attempts and ysql_max_write_restart_attempts with a unified GUC variable yb_max_query_layer_retries to control retries in all isolation levels including Read Committed, with default reset to 60 retries. Defaults for retry_backoff_multiplier and retry_min_backoff adjusted to 1.2 and 10ms respectively. #20359
  • Centralizes all code for creating internal PostgreSQL connections, simplifying usage in ysql_upgrade, ysql index backfill, WaitForYsqlBackendsCatalogVersion and ddl replication. Now utilizes the detailed error message from PGConn::Connect. #20655
  • Revamps the ToString function to create unique responses for optional types (std/boost::optional), enhancing log readability and data relevance. #20719
  • Adds a new GUC yb_explain_hide_non_deterministic_fields to remove non-deterministic fields from EXPLAIN ANALYZE's output, reduces flakiness between runs in pg_regress tests. #19492
  • Corrects formatting errors in the pg_stat_get_activity function, aligns variable names, adds yb_prefix to txn_rpc_timestamp, and applies column indexing based on PG_STAT_GET_ACTIVITY_COLS macro. #20281
  • Relocates Unknown Session Unit Test to pg_libpq, renaming it from PgBackendsTestSessionExpire to PgBackendsSessionExpireTest for convention conformity, enhancing testing protocol. #20545
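
The retry defaults mentioned in the list above (yb_max_query_layer_retries of 60, retry_backoff_multiplier of 1.2, and retry_min_backoff of 10ms) can be made concrete with a little arithmetic. The sketch below assumes a plain exponential backoff in which each delay is the previous one times the multiplier; that formula, the function name backoff_schedule, and the 1000 ms cap are assumptions made for illustration, not values or behavior taken from these release notes.

```python
from typing import List


def backoff_schedule(max_retries: int = 60,
                     min_backoff_ms: float = 10.0,
                     multiplier: float = 1.2,
                     max_backoff_ms: float = 1000.0) -> List[float]:
    """Return the delay before each retry, in milliseconds.

    Assumes simple exponential growth starting at min_backoff_ms.
    max_backoff_ms is a hypothetical cap added so the delays stay bounded.
    """
    delays = []
    delay = min_backoff_ms
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff_ms))
        delay *= multiplier
    return delays


schedule = backoff_schedule()
first_three = [round(d, 2) for d in schedule[:3]]
print(first_three)          # delays before the first three retries
print(len(schedule))        # one delay per allowed retry
print(sum(schedule) / 1000) # total worst-case wait in seconds under these assumptions
```

Under these assumptions the early retries happen within tens of milliseconds, while later retries settle at the cap, so a statement that keeps conflicting backs off without waiting unboundedly.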
YCQL
  • Introduces an UpdateMapRemoveKey API, enabling the removal of specific keys from a Map, leaving all other keys unaffected. #19829
DocDB
  • Introduces yb_read_time GUC variable, usable by superusers to query the database at a specific point in time in the past, specifically aiding backup and restore scenarios. This variable helps generate a database schema of a specific past point using ysql_dump. Make sure it's not set before a DDL operation or during it. Default value is 0, meaning the data is read in real-time, while setting a Unix timestamp (in microseconds) allows reading data as of that time. #19114
  • Accelerates rollback and downgrade processes by introducing capability to demote AutoFlags, offering enhanced control over rollback version and emergency repair functionality with new yb-admin commands. #13686
  • Enables tracking of active WriteQuery objects and outstanding transaction status rpc requests at the tablet level for easier debugging. #18940
  • Introduces an /xcluster UI page for yb-tserver to track real-time statuses of xCluster source streams and target pollers with a capability to reset data following a restart. Also features sorting and a search box for easier navigation. #19203
  • Introduces a read-time flag in ysql_dump, offering a way to dump the database schema as of a specific point in time, improving backup restoration capabilities. #19258
  • Enhances timeout handling for YCQL index scans to avert overruns, resulting in less log spew, ensuring index tablet scans do not timeout prematurely at the YCQLProxy/YBClient side, and eliminating unnecessary repeated master leader requests. #19221
  • Reduces chances of transaction deadlocks and improves fairness in read committed isolation by modifying the order of transactions resumption across all tablets based on xactStartTimestamp. #18055
  • Switches the data transfer rate on the tserver UI from MiBps to KiBps for enhanced precision, considering the typical tablet data transfer range. #19203
  • Reduces tablet shutdown issues and delayed database operations by addressing a bug causing unnecessary blockage in clearing the ResumedWaiterRunner queue during WaitQueue shutdown. #19272
  • Offers redesigned server level aggregation for metrics, thus introducing more metrics for enhanced debugging. Removes several unused URL parameters and makes the new output compatible with YBA and YBM, preventing double-counting issues in charts. Drops unused Json and Prometheus callbacks from MetricEntity for a cleaner design. #18078
  • Replaces glog includes with yb/util, introducing yb VLOG macros for clearer differentiation between INFO and VERBOSE logs, while addressing issues of duplicate includes. #15273
  • Adjusts the verbose level for VLOG macros to help differentiate between INFO and VERBOSE logs, fostering ease in debugging and analysis with better log filtration. #15273
  • Aligns retryable request timeouts with respective YCQL and YSQL client write timeouts, thus reducing unnecessary log replay during YCQL tablet bootstrap. #18736
  • Eliminates duplicate includes from specific files, providing clearer differentiation between INFO and VERBOSE logs for enhanced user debugging experience. #15273
  • Enables a retry mechanism for acquiring shared in-memory locks from the wait-queue during waiter resumption to respect client/statement timeout, reducing request failures and associated latency in contentious workloads. #19032, #19859
  • Accelerates TServer Init by handling deleted and tombstoned tablets asynchronously on startup, so that the RPC port starts quickly. Introduces a new flag num_open_tablets_metadata_simultaneously to set the number of threads for opening tablets' metadata during startup, improving startup time. The modification also takes steps towards deleting the superblock in DeletedTablet. #15088
  • Introduces automatic recovery of index tables affected by a bug, effectively preventing performance degradation and disk size leak by ensuring that tombstones are properly filtered out by compactions once index backfilling is complete. #19731
  • Adds a 10s delay between an AutoFlag config update and its application, ensuring all tservers have the new config before any AutoFlags switch and begin producing new data. Guarantees process continuity by temporarily holding back new configs if the process restarts during apply time. #19932
  • Parallelizes the RPCs made during the DoGetLockStatus process in pg_client_service.cc to expedite fetching locks, enhancing database performance. #18034
  • Introduces support for upgrade and rollback of universes with xCluster links, checking AutoFlag compatibility during configuration changes. Includes error handling and broadcasting of AutoFlag config changes. The aim of these changes is to ensure that the target universe has the superset of specific AutoFlags. #19518
  • Enables logging of all instances of tablet metadata creation/updating, providing additional insights in case of tablet server startup crashes due to multiple meta records for the same tablet. #20042
  • Introduces a new get_auto_flags_config yb-admin command to retrieve the current AutoFlags configuration, aiding in debugging xCluster replication failures. #20046
  • Enhances pg_locks by including results from Single Shard transactions that previously went untracked, enabling users to query these transactions. During upgrades or downgrades to version 2024.1 and above, pg_locks queries may fail due to nodes lacking the newly implemented GetOldSingleShardWaiters service method. #18195
  • Expands load balancer metrics by incorporating tablets_in_wrong_placement, blacklisted_leaders, and tablet_load_variance, enhancing the tracking of load balancer progress. #20118
  • Adds new regular expression filters to the Prometheus metric endpoint by creating a distinct API for YBA, offering server-level aggregation for tablet and table metrics. Users should add version=v2 to the URL for enabling this feature, granting control over metric output filters and determining the scope of metric aggregation effectively. #19943
  • Limits the number of rows returned per transaction per tablet in pg_locks to avoid potential memory issues during batch inserts, and includes additional fields to indicate partial lock info. #20765
  • Introduces a new GUC yb_locks_txn_locks_per_tablet to limit the number of rows returned by pg_locks, preventing the system from running out of memory during large transactions. #19934
  • Allows for the check of zero bytes at the end of SST data files, and enables an error report with the number of zeros once the flag rocksdb_check_sst_file_tail_for_zeros is set to a positive value. #19691
  • Boosts the bootstrap process by reading entries from the offset of the last flushed operation id instead of the segment's beginning, significantly reducing unnecessary reading. For colocated tables, it enforces the replaying of at least two segments when the lazy_flush_superblock is enabled. #18312
  • Prevents tservers from communicating with master leaders in different universe clusters, averting possible data loss, by introducing a new universe_uuid field and an autoflag master_enable_universe_uuid_heartbeat_check to manage the tserver heartbeat checks. #17904
  • Rejects ConfigChange requests for system catalog while another server is transitioning, preventing potential data loss from mistaken quorum formation by new peers. #18335
  • Enables tracing of UpdateConsensus API by activating the collect_update_consensus_traces flag, offering visibility into remote follower traces and adding trace messages to local logs. The feature ensures upgrade/rollback safety and impacts the leader and follower only if both incorporate the change. #19417
  • Introduces the rocksdb_max_sst_write_retries flag to set the number of retry attempts if corruption is detected when writing an SST file, affecting both flushes and compactions. #19730
  • Safeguards the master_join_existing_universe flag to prevent unnecessary initial sys catalog snapshot restoration. #19357
  • Adds a retry mechanism on block checksum mismatches and enhances error logging for better identification of transient read errors. #20102
  • Refines error messages on block checksum failure by including a retry scheme and logging on success or failure, offering better error tracking. #20102
  • Adds a URL parameter, show_help, to the scrape endpoint, enabling control over display of help and metadata information, overriding the export_help_and_type_in_prometheus_metrics GFlag. #19176
  • Renames AsyncClientInitialiser to AsyncClientInitializer for consistency in naming conventions. #19920
  • Introduces flags tablet_replicas_per_gib_limit, tablet_replicas_per_core_limit, and tablet_overhead_size_percentage to customize tablet replication based on cluster resources, enhancing user control over system load balance. #16177
  • Introduces a new script, analyze_test_results.py, to reconcile discrepancies between Spark-based test runner and JUnit-compatible XML test reports, offering more accurate and reliable test results. #18594
  • Allows for YSQL parallel scans by breaking table tablets keyspaces into ranges of similar data size for efficient scanning time. #19341
  • Reduces unwanted logging in LogAfterLoad when a single 0 version is loaded, thus minimizing unnecessary log generation especially when managing many YSQL databases. #18489
  • Introduces AreNodesSafeToTakeDown API that ensures safe node removals during cluster upgrades or maintenance operations by checking tablet health and follower lag, facilitating seamless and risk-free updates. #17562
  • Adds a show-changes command to the sys-catalog-tool to search and provide details of all updated entries marked as ADD, CHANGE, or REMOVE. This needs to be run before update to validate the expected changes in the SysCatalog JSON file. Notably, this command exclusively interacts with the file, without reading or writing to the SysCatalog. #18800
  • Enhances the TCMalloc heap snapshot functionality with additional columns for estimated bytes and samples count from a call stack, allowing direct comparison with the total system memory and accurate proxy for memory usage. #19071
  • Tracks and batches updates for rocksdb and tablet-level event stats metrics, distinguishing between counter and gauge metrics, and exposing them in EXPLAIN (ANALYZE, DIST, DEBUG) and tracing. #16785
  • Adopts the trace outside the block for ensuring correct execution of per-session tracing with standalone traces, and fixes callbacks to adopt the appropriate trace. #19099
  • Modifies the use of scan choices to increase effectiveness in scenarios where only the lower bound is specified, enhancing both speed and performance. #19117
  • Allows tracking of per-RPC wait-states using WaitStateInfo for incoming RPC updates, ensuring safe upgrades and functioning ASH without interfering with existing functionalities. #19138
  • Optimizes PgWire response serialization for large query results, enhancing overall read performance. #19213
  • Reduces high load issues by renaming blocking synchronous YBSession flush functions to TEST_* and replacing them with non-blocking asynchronous versions (FlushAsync). #12165
  • Reduces the safe time lag in the xCluster by sending the apply safe time more frequently when there are no active transactions. #19274
  • Elevates the timeout in TSAN mode for the PgSharedMemTest.TimeOut test, averting potential table creation timeouts. #19313
  • Adds a new retrying master-to-master task, allowing for the API AreNodesSafeToTakeDown to check if it's safe to remove or upgrade certain nodes without disrupting overall cluster health. #17562
  • Replaces EnableVerboseLoggingForModule with google::SetVLOGLevel for a less complex procedure in setting the module log level, eliminating the updating of the vmodule gflag. #19344
  • Renames cdc to xcluster, moves ValidateTableSchema to xrepl_catalog_manager, and renames it to ValidateTableSchemaForXCluster. Revises allow_ycql_transactional_xcluster to be a TEST flag, enhances XClusterManager's ability to handle xCluster-related control logic, and introduces a dedicated XClusterConfig class. #19353
  • Reduces macOS 13.6 linker warnings by updating the compiler to avoid duplicate RPATHs, enables failure on duplicate RPATHs through YB_FAIL_ON_DUPLICATE_RPATH, and cleans up the build system. #19378
  • Enables thread safety analysis for members passed by reference by setting the -Wthread-safety-reference compiler flag, and fixes all resulting build errors for increased stability. #19365
  • Enables the TEST_SYNC_POINT macro in release builds, reducing its impact in production by checking FLAGS_TEST_enable_sync_points before making expensive SyncPoint calls. #19379
  • Introduces XClusterManager to handle all XCluster related control logic in the yb-master, creates a dedicated class XClusterConfig for changes to XClusterConfigInfo, and makes allow_ycql_transactional_xcluster a TEST flag. #19353
  • Adds a skip_indexes command line option to create_snapshot and create_keyspace_snapshot, allowing users to exclude indexes when creating backups in YCQL. #14142
  • Enables a fallback to RPC when request or response exceeds the scope of allocated shared memory, ensuring continued functionality in larger data scenarios. #19430
  • Enhances thread safety analysis by enabling the -Wthread-safety-precise compiler flag, which increases scrutiny on mutex field assignments, and adds the ability to override the compiler type for third-party archive selection using YB_COMPILER_TYPE_FOR_THIRDPARTY environment variable. #19462
  • Simplifies xCluster code by allocating related tests to a separate file, introducing XClusterManager for better control logic, and establishing a dedicated XClusterConfig class for changes to XClusterConfigInfo. #19353
  • Removes a disabled test, enhancing master start in shell mode with either an empty master_addresses or a set master_join_existing_universe flag. #19528
  • Saves memory and disk space by introducing a JoinStringsLimitCount utility, which limits reporting and logging to the first 20 elements of large number arrays like tablet Ids. #19527
  • Filters out tservers in the read cluster when determining whether to add new tablet replicas to the cluster, providing the dual ability to manage CPU usage when maintaining idle tablets and ensure robust front-end work operations. This process includes configuration adjustments to tablet_replicas_per_gib_limit, tablet_replicas_per_core_limit, and tablet_overhead_size_percentage flags. #16177
  • Renames test file "xcluster_ysq_colocated" to "xcluster_ysql_colocated" for enhanced clarity and correction of a previous error. #19531
  • Allows longer GLog traces exceeding 30k limit by splitting output into less than 30k per line and introduces a new Gflag trace_max_dump_size to limit size of printed traces. #19532
  • Adds a metric for running tablet peers per tserver for easy calculation of tablet peers to cores, and tablet peers to memory ratios on YBM clusters. #9647
  • Renames CDCTabletMetrics to XClusterTabletMetrics and several related files, refines metrics retrieval and setting, and enhances handling of race conditions for smoother data management. #20079
  • Switches tablet_replicas_per_core_limit and tablet_replicas_per_gib_limit to runtime flags, for setting and adjusting resource-based tablet limits on-the-go. #16177
  • Enables aggregation of retryable requests mem-tracker metric at table-level for Prometheus by assigning the entity to the mem-tracker after the Tablet opens with the tablet metric entity. #19301
  • Implements a wait period after the addition of new transaction status tablets, enhancing the stability of XClusterYSqlTestConsistentTransactionsTest.UnevenTxnStatusTablets. #19302
  • Upgrades OpenSSL to version 3.0.8, disabling Linuxbrew builds and enabling glog to use the stack unwinding function based on backtrace. #19736
  • Facilitates the use of remote_build.py tool by interpreting arguments for yb_build.sh even when they couldn't be correctly parsed as remote_build.py arguments. #19696
  • Introduces the trace_max_dump_size Gflag (default 25000) for limiting trace print sizes, works around GLog's character limit for printing long traces. #19532, #19769
  • Relocates XClusterConfigInfo and XClusterSafeTimeInfo from catalog_entity_info.h to xcluster_catalog_entity.h, and from catalog_loaders.h to xcluster_catalog_entity.h, respectively. Also, establishes a SingletonMetadataCowWrapper for singleton catalog entities, creates an XClusterManager interface, and transfers xcluster_safe_time_info_ and its functions from CatalogManager to XClusterManager. #19713
  • Facilitates a more rapid server initialization by deleting the superblock within the DeleteTablet process when the delete_type is TABLET_DATA_DELETED, reducing the number of DELETED tablet superblocks at server startup. #19840
  • Introduces a continuation marker for better traceability when a trace segment is split into multiple LOG(INFO) outputs; also adds a new GFlag trace_max_dump_size to limit the size of traces printed. #19532, #19808
  • Generates an enhanced error message displaying the version info when the yb process incorrectly starts on an older version after AutoFlags have been enabled, aiding in easier problem identification. #16181
  • Renames producer_id to replication_group_id in older proto messages, standardizing the replication group identity for enhanced consistency and rollback safety. #19825
  • Centralizes common helper functions for YCQL xcluster tests into XClusterYcqlTestBase for streamlined testing procedures. #19830
  • Balances tablet load more evenly across all drives, preventing bottlenecks during remote-bootstrapping by evenly distributing tablets and utilizing available disk bandwidth. #19846
  • Introduces additional debug logs for troubleshooting SELECT statement errors that could arise from processing non-provisional records or writing provisional records without a hybrid timestamp. #19876
  • Cleans up allocated shared memory objects on TServer startup if the TServer process didn't shut down gracefully. #19988
  • Enhances the demote_single_auto_flag yb-admin command by returning specific error messages for invalid process_name, AutoFlag name, or non-promoted AutoFlag, making identifications easier. #20004
  • Enables monitoring of master leader heartbeat delays through a new RPC in the MasterAdmin, ensuring undesired lags can be readily detected and mitigated. #18788
  • Avoids indefinite mutex lock and TServer thread blockage by correctly handling crashes during request transmission via shared memory. #20050
  • Eliminates usage of UNKNOWN flags in tools, marking them as NON_RUNTIME since dynamic update of these flags is not supported. #20123
  • Renames the misleading cdc xCluster metric entity to xcluster, ensuring an accurate representation without affecting dependencies as services like YBA rely on the unchanged metric name. #20131
  • Establishes a flag to manage indexing backfills, offering control over whether non-deferred indexes should be batched during the backfill operation. #20213
  • Delivers automatic recovery for index tables affected by a bug previously found and addressed, preventing any future performance issues triggered by incorrectly set property values. #20247
  • Changes the "Successfully read [n] ops from disk" log message to verbose logging, lowering the frequency of identical log outputs and boosting performance. #20287
  • Allows configuration of the yb_build.sh script via .git/yb_buildrc and ~/.yb_buildrc bash scripts, to specify implicit arguments or alternative defaults before parsing command line arguments. #20291
  • Converts UNKNOWN flags to either RUNTIME or NON_RUNTIME in DocDB for optimal flag management. #16979
  • Marks the Tserver flag num_concurrent_backfills_allowed as RUNTIME instead of UNKNOWN for better manageability. #20348
  • Upgrades unit test key/certificate pairs from 1024-bit RSA keys to 2048-bit, meeting FIPS 140-2 requirements, and integrates their generation into the build process. #20370
  • Marks the force_global_transactions, ycql_use_local_transaction_tables, and auto_promote_nonlocal_transactions_to_global gflags as runtime, enabling them to be changed directly as required for each new transaction. #20479
  • Organizes AutoFlags management across dedicated MasterAutoFlagsManager, TserverAutoFlagsManager and subset AutoFlagsManagerBase, offering neat code architecture and resolving a bug in Master::InitAutoFlags. #19958
  • Renames cdc::ProducerTabletInfo to cdc::TabletStreamInfo and removes ReplicationGroupId from it, relocates ReplicationGroupId from cdc to xcluster namespace, and introduces xcluster::ProducerTabletInfo to optimize naming consistency. #20452
  • Enables the use of the OpenSSL FIPS module by setting the new openssl_require_fips = true gflag, ensuring FIPS standard compliance for database cluster creation. #20524
  • Adds Prometheus metrics for server hard and soft memory limits, enabling better tracking of memory use in TServer or master and creation of dashboard charts for universes using non-default values. #20578
  • Introduces a helper function that checks if a CowObject has a write lock, offering special functionality in retail mode and debug mode for enhanced thread safety. #20599
  • Eliminates the issue of accessing erased objects in the ClusterLoadBalancer::RunLoadBalancerWithOptions, enhancing the runtime performance. #20673
  • Streamlines bloom filter key calculation by avoiding duplicate calculations. This results in approximately 4.5% tserver time improvement, and overall 1.5% performance boost. #20720
  • Limits the number of tablets per node, and hastens reaching the desired number of tablets by lowering the values of FLAGS_tablet_split_low_phase_shard_count_per_node to 1 and FLAGS_tablet_split_low_phase_size_threshold_bytes to 128_MB. #20579
  • Introduces new auto flags to stave off backward compatibility issues related to version 2.20, ensuring the stable existence of previously promoted AutoFlags during process startup time. #13474
  • Adds verbose logs for frequent global and per-table state changes within a load balancer run for easier debugging. #20289
  • Splits XClusterManager into two separate managers, XClusterSourceManager and XClusterTargetManager, each handling different objects, to enhance code readability and component isolation. #20737
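
As an illustration of the JoinStringsLimitCount item above (#19527), the following minimal Python sketch shows the general idea of logging only the first few elements of a large collection. It is not the YugabyteDB implementation: the function name, its signature, and the "... and N more" suffix are assumptions made for this example. Only the limit of 20 elements comes from the release note.

```python
from typing import Iterable


def join_strings_limit_count(items: Iterable[str], limit: int = 20, sep: str = ", ") -> str:
    """Join at most `limit` items and summarize the remainder.

    Hypothetical helper for illustration only; the suffix format is an assumption.
    """
    items = list(items)
    shown = sep.join(items[:limit])
    omitted = len(items) - limit
    if omitted > 0:
        # Report how many elements were left out instead of printing them all.
        return f"{shown}{sep}... and {omitted} more"
    return shown


# Example: a log line for 1000 tablet IDs stays short.
tablet_ids = [f"tablet-{i}" for i in range(1000)]
print(join_strings_limit_count(tablet_ids))
```

Capping the element count keeps a single log line bounded no matter how many tablets a message refers to, which is where the memory and disk savings come from.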
CDC
  • Integrates CDCSDK stream creation for a namespace into YugabyteDB master, introducing support for fetching a CDC stream via cdcsdk_ysql_replication_slot_name. Invalidates deprecated logic in cdc_service, focusing on YSQL strategies instead. Promotes explicit parameter requirements for request validation when namespace_id is populated. Addresses a race condition and initial checkpoint discrepancy in CreateCDCStream. This change modifies the sys-catalog entry and requires clients to check the autoflag yb_enable_replication_commands. #19211
  • Enables CRUD syntax for Publications in YSQL as part of a YSQL API for CDC via the PG logical replication mechanism, allowing users to specify tables for streaming through CDC. However, CDC does not support certain features, which may limit table selection and result in errors. The change is irreversible due to the introduction of the yb_enable_replication_commands autoflag. #18930, #18933, #18931
  • Allows maxAttempts for RPCs in AsyncClient to be adjustable, decreasing the risk of Too many attempts exceptions occurring in a short period. #12751
  • Enables deletion of CDCSDK streams through replication slot names, advancing the support for SQL syntax for CDC via the PG logical replication model. However, this feature isn't rollback safe and is disabled during upgrades, requiring a subsequent check of the autoflag yb_enable_replication_commands. #19212
  • Introduces support for creating, viewing, and dropping replication slots in YSQL. Adds two interfaces for support, functions pg_create_logical_replication_slot and pg_drop_replication_slot, and Walsender commands CREATE_REPLICATION_SLOT and DROP_REPLICATION_SLOT. Inserts view pg_replication_slots for viewing replication slots. Fixes two issues concerning cleanup of held locks and skipping cache refresh. #19211, #19212, #19509
  • Prevents Object already exists error during consecutive CreateCDCStream and DeleteCDCStream calls by effectively handling the stream delete state, and supports creating a CDCSDK stream for a namespace via SQL syntax. #19211, #19212
  • Automatically forwards CreateCDCStream requests to yb-master for atomic creation of CDCSDK streams, enhancing consistent snapshot capability. This is covered by the ysql_yb_enable_replication_commands flag and temporarily bypasses the requirement for a replication slot name. #18890
  • Improves detection of replication commands, paving the way for new replication slot support. Also adds the ability to create a CDCSDK stream for a namespace via SQL syntax and fixes specific race conditions. #19211, #19212
  • Defines replication slots as active or inactive in YugabyteDB, considering a slot active if it's consumed within the set duration defined by the ysql_cdc_active_replication_slot_window_ms Tserver GFlag. This change allows better visibility into slot activities and prevents dropping of active slots. It also addresses a bug in the WaitForGetChangesToFetchRecords function used in testing. #19211, #19212
  • Supports the creation of CDCSDK stream for a namespace, with the ability to fetch it using cdcsdk_ysql_replication_slot_name. Simultaneously, addresses a race condition problem during the CreateCDCStream operation and ensures proper initial checkpoint setting in cdc_state_table. Introduces limits on replication slots (CDC stream) utilizing a GFlag and reports the status when the slots limit is reached. This change also accommodates the detection of replication commands in yb_is_dml_command. #19211, #19212
  • Enables reading of Decimal and VarInt datatypes in CDC for CQL. #19726
  • Reinstates support for identifying replication commands after a previous rollback. Allows users to create a CDCSDK stream for a namespace and to retrieve a CDC stream using cdcsdk_ysql_replication_slot_name. Addresses a race condition issue between CreateCDCStream and the CatalogManager's background cleanup task and fixes a problem related to the initial checkpoint of tables in the cdc_state_table for CDCSDK. Also reintroduces the ability to determine whether a replication slot (CDC stream) is active or inactive. #19211, #19212
  • Limits the number of replication slots in YSQL with max_replication_slots GFlag, introducing an error code for when the limit is reached, and enhances CDC stream creation. #19211
  • Displays the replication commands conducted by walsenders in the pg_stat_activity section. The new implementation supports the creation of a CDCSDK stream for a namespace via cdcsdk_ysql_replication_slot_name, enables the detection of replication commands without errors, and introduces the limitation of the number of CDCSDK streams by the max_replication_slots GUC. #19211, #19212
  • Expands the range of SQL commands that can be issued to a walsender, increases support for creating CDCSDK stream for a namespace, and guards against a potential race condition between CreateCDCStream and the CatalogManager background cleanup task. #19211, #19212
  • Avoids erroneous deletions from the cdc_state table caused by a race condition during tablet splits by reversing the call order in the CleanUpCDCStreamsMetadata method. #19746
  • Detects replication commands in yb_is_dml_command, supports creating logical replication slots through SQL using CREATE_REPLICATION_SLOT and pg_create_logical_replication_slot. The change includes support for CDCSDK stream creation, imposes limit on replication slots/streams, and resolves a race condition related to CreateCDCStream. #19211, #19212
  • Changes yb_enable_replication_commands from an autoflag to a TEST flag, making it safer and more flexible to enable the replication slots feature by default. Supports YSQL commands for replication slots when the flag is true, and disallows them when the flag is false. It also rectifies a race condition between CreateCDCStream and the CatalogManager background cleanup task. The revision further supports the creation of a CDCSDK stream for a namespace, aiding in the long-term goal of supporting SQL syntax for CDC. #19211, #18890
  • Ensures cleanup of entries from cdcsdk_replication_slots_to_stream_map_ when corresponding entries are deleted from cdc_stream_map_, avoiding potential inconsistencies. #19211
  • Introduces a new yb-admin CLI command and master RPC to enable backfilling of a replication slot name to existing CDCSDK streams, providing manageable streams via YSQL Publication/Replication slot interface. #19261
  • Logs a NOTICE for each unsupported table when creating a publication using the FOR ALL TABLES case in CDC, improving user visibility on skipped tables. #19291
  • Enriches the CDCStreamInfo java class with a new cdcsdk_replication_slot_name field and an accessor method for better support of Publication/Replication slot. #19811
  • Optimizes the CreateCDCStream by eliminating unnecessary sleep statements, preventing a race condition, and ensuring correct initial checkpoint settings for the CDCSDK. Also, this code change introduces support for SQL syntax for CDC using the Postgres logical replication model, allows detecting replication commands without errors and defines whether a replication slot (CDC stream) is active or not. #19211
  • Transforms yb_enable_replication_commands into a runtime PG preview flag, correcting a bug that caused publication commands to always be enabled regardless of flag value. #18930
  • Introduces a GFlag to toggle automatic tablet splitting for tables within a CDCSDK stream, enhancing user control over replication processes. #19482
  • Expands support for two new record types: PG_DEFAULT and PG_NOTHING based on Postgres replica identity types while maintaining backwards compatibility by renaming ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively. A failsafe cdc_enable_postgres_replica_identity autoflag is added. #19260
  • Addresses a test failure in TestCreateCDCStreamForNamespaceLimitReached by specifically adding the record type CHANGE to the stream request. Enables support for two new record types PG_DEFAULT and PG_NOTHING, while retaining the ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes. Adjusts settings using the newly added autoflag cdc_enable_postgres_replica_identity. #19260
  • Introduces support for two new record types, DEFAULT and NOTHING, based on Postgres replica identity types, and renames ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively for backward compatibility. It introduces an autoflag cdc_enable_postgres_replica_identity during CDC stream creation and adjusts the failing test TestCreateCDCStreamForNamespaceLimitReached by specifying the record type CHANGE. #19260
  • Enhances CDCSDK to report tablet splits promptly upon detection, controls data duplication by cross-referencing hash_key bounds, and optimizes the retrieval of child tablets via tablet_peer. #18479
  • Refines the GetCheckpointResponse to indicate snapshot_key presence only when present, enhancing accuracy of bootstrapping and streaming processes. #19292
  • Introduces the UpdateMapUpsertKeyValue API that lets you update specific keys without needing to re-add all keys, allowing for more efficient updates. #19577
  • Enhances the CDC State Table's key update efficiency by selectively updating or removing keys as needed, without having to replace the entire map column. #19577
  • Reactivates the cdcsdk_stream-test for TSAN mode, previously disabled, enhancing overall testing capabilities. #19752
  • Helps ensure failed CDCSDK stream creation processes are rolled back effectively, reducing problems caused by incomplete creations through a ScopeExit mechanism. Manual clean-up may be required in certain failure scenarios until DDL atomicity for alter table statements is implemented. #18934
  • Enables the tests in cdcsdk_snapshot-test to run in TSAN mode, augmenting their utility and coverage. #19752
  • Rectifies the intermittent failure issue in TestReleaseResourcesOnUnpolledSplitTablets by ensuring that UpdatePeersAndMetrics thread refreshes the cached CDC stream metadata if in the initialized state. #18934
  • Alters the default checkpoint type to EXPLICIT during stream creation, ensuring no upgrade or rollback issues due to alterations in the default proto field value. #18748
  • Allows yb-client to apply retries for retryable error codes, preventing the unnecessary resetting of attempts and deadlines when a CDCErrorException is encountered. #19648
  • Releases retention barriers on tables that are not of interest in the CDC Consistent Snapshot feature stream, defined by the new GFlag "cdcsdk_tablet_not_of_interest_timeout_secs". This enhances user control over snapshot consumption. #20146
  • Refactors tests to use ASSERT_EQ assertions, not ASSERT_GE, for checking consumed record count, utilizing GetChangeRecordCount method for more accurate record handling and tablet splitting. #20261
  • Switches the default consistent snapshot option to USE_SNAPSHOT when creating a new stream, and converts the Consistent Snapshot feature to a preview feature guarded by the RUNTIME_PREVIEW flag yb_enable_cdc_consistent_snapshot_streams. #20367
  • Changes the default value of the gflag "cdcsdk_tablet_not_of_interest_timeout_secs" to 4 hours, enhancing the CDC Consistent Snapshot feature, which remains guarded by the PREVIEW flag "yb_enable_cdc_consistent_snapshot_streams". #20378
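
The active/inactive rule for replication slots described above (a slot counts as active if it was consumed within the window set by ysql_cdc_active_replication_slot_window_ms) can be sketched as follows. This is a minimal Python illustration, not YugabyteDB code: the function names, the inclusive boundary, and the 300000 ms window used in the example are all assumptions; the actual default of the flag is not stated in these notes.

```python
def is_slot_active(last_consumed_ms: int, now_ms: int, window_ms: int) -> bool:
    """Return True if the slot was consumed within the last `window_ms` milliseconds.

    Hypothetical helper; treating the boundary as inclusive is an assumption.
    """
    return now_ms - last_consumed_ms <= window_ms


def can_drop_slot(last_consumed_ms: int, now_ms: int, window_ms: int) -> bool:
    """Active slots must not be dropped; inactive ones may be."""
    return not is_slot_active(last_consumed_ms, now_ms, window_ms)


# Example with an illustrative 5-minute window (not the flag's documented default).
WINDOW_MS = 300_000
now = 1_000_000
print(is_slot_active(now - 10_000, now, WINDOW_MS))   # consumed 10 s ago
print(can_drop_slot(now - 600_000, now, WINDOW_MS))   # consumed 10 min ago
```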
yugabyted
  • Integrates client-to-server encryption support for YSQL Connection Manager, securing the connection between the client application, YSQL Connection Manager, and pg_backend by enabling SSL connectivity. Uses the existing use_client_to_server_encryption and certs_for_client_dir flags to enable and configure this feature; certificate files set via ysql_pg_conf_csv and cert-based authentication are not supported. Ensures safe upgrade and rollback without the need for an auto flag or node communication. #19108
  • Publishes YSQL Connection Manager metrics on <tserver_ip_address>:13000/prometheus-metrics, enhancing data monitoring and diagnostics. #19109
  • Alters the format of YSQL connection manager's prometheus metrics on the prometheus-metrics endpoint to include the database as a metric label. #19484
  • Enables faster and more secure Unix socket connections between YSQL Connection Manager and the pg backend on the same machine, replacing the previous TCP/IP connections. Introduces a new flag ysql_conn_mgr_use_unix_conn to configure this feature. #19483
  • Enables the use of the YSQL Connection Manager feature as an alternative in the yb-pgsql java test framework by setting the YB_ENABLE_YSQL_CONN_MGR_IN_TESTS environment variable to true. #19703
  • Allows for the creation of separate pools for each user/database combination in the YSQL Connection Manager, eliminating the need to set the user context at the beginning of each transaction. Updates to the stats/metrics format also enhance database pool tracking. #19722
  • Enables restriction of encryption to the logical connection only in YSQL Connection Manager by setting use_client_to_server_encryption. Physical connections, between the YSQL connection manager and Postgres process on the same machine, are not encrypted, enhancing internal performance without sacrificing secure external communications. #19108
  • Introduces the GUC variable ysql_conn_mgr_sticky_object_count for easier and faster tracking of connection stickiness in YSQL Connection Manager tests, eliminating the need to modify pool sizes. #20067
  • Introduces the GUC variable yb_use_tserver_key_auth for authenticating clients using yb-tserver-key. Removes the "postgres only" requirement for yb-tserver-key authentication and sets ysql_conn_mgr_use_unix_conn as true by default. Requires no HBA changes. #19996
  • Integrates a database migration visualization tool in the yugabyted UI, including a new dashboard for monitoring migration progress and complexity, facilitating smoother transition from other databases. #18782
  • Corrects the CPU usage Sankey diagram to accurately report used and available values, enhancing reliability of performance metrics on the performance page. #19991
  • Enables a new user interface feature in yugabyted for connection management metrics, displaying metrics on active and total logical/physical connections, and providing a clickable banner to navigate to dedicated connections visuals. #18805
  • Rectifies confusion with the yugabyted-UI; password authentication no longer incorrectly shows as enabled for an insecure cluster unless the encryption-at-rest is activated. #19295
  • Rectifies the misalignment in the display of status messages for specific scenarios in yugabyted. #19334
  • Corrects the display of the total number of CPUs on the overview page and ensures live queries show all statuses, not just idle. #19414
  • Offers the ability to set preferred regions using yugabyted CLI for lower latencies, by expanding the functionality of the constraint_value flag, offering a way to assign preference orders to Availability Zones (AZ). #19415
  • Corrects join flag bugs, ensuring a smooth start command even if a node's join IP is not an active master and enables error handling when the placement_uuid from the join IP can't be obtained. Now supports Hostnames and handles edge cases for addresses provided through CLI. #19316, #19314
  • Adjusts the yugabyted start command to interpret 0.0.0.0 as 127.0.0.1 in the advertise_address, aligning with the IP use in master, tserver, and yugabyted-UI. #18580
  • Adds prerequisite checks to confirm if default ports are open before yugabyted starts, resulting in either failure to launch or impaired functionality with warnings depending on the blocked ports. #19504
  • Integrates ysql connection manager stats into the tserver metrics snapshotter, which can be enabled via the metrics_snapshotter_tserver_metrics_whitelist gflag, offering visibility into total logical and physical connections. #18805
  • Allows the metrics whitelist to include the ysql_conn_mgr flag only if the connection manager is enabled, keeping connection manager metrics consistent with the yugabyted UI. #18805
  • Enables Yugabyted UI to display Alert messages from all nodes by directing API calls through the yugabyted API server. #19972
  • Resolves an issue where the UI failed to launch when advertise_address=0.0.0.0 by ensuring 127.0.0.1 is used instead, and adds a connection check for address uniqueness and timeout for tserver API calls. #18580
  • Enables the starting of two different local RF-1 instances on Mac by adding a check for empty join flag during the second node's initiation. #20018
  • Removes the deprecated gflag use_initial_sys_catalog_snapshot, replaced by enable_ysql that is now true by default, eliminating repetitive warning messages on starting yugabyted nodes. #20056
  • Adapts yugabyted-ui to efficiently support Kubernetes (k8s) deployments, ensuring correct function for nodes with only masters. Adds a new bind_address flag for customizing the API server's bind address. #20301
  • Rectifies the malfunction in yugabyted-ui when yugabyted utilizes custom ysql_port and ycql_port values by introducing a new flag for YCQL port number. #20406
  • Updates the yugabyted-ui backend to align with changes in the connection manager stats consumed from the :13000/connections endpoint, catering for removal of pool_name and addition of database_name and user_name. #20494
  • Adds yugabyted-ui support to the K8s OSS Yugabyte helm chart, including new values to control UI and metrics snapshotter activation for enhanced metrics visualisation in the K8s environment. #20344
  • Retains the integrity of user's custom configuration file by associating config flag with start command, and directs updates to a yugabyted generated file within base_dir/conf directory. #20881
  • Allows a smooth restart of the second node in a cluster using the join flag without throwing any errors. #20684
  • Enables a predefined set of gflags related to the pg-parity project using the enable_pg_parity flag in the yugabyted start command. #21221
  • Changes the flag enable_pg_parity to enable_pg_parity_tech_preview for activating a predefined set of gflags related to the pg-parity project with the yugabyted start command. #21221
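
The advertise_address handling mentioned above (#18580), where 0.0.0.0 is interpreted as 127.0.0.1, amounts to a small normalization step. The Python sketch below only illustrates that rule; the function name is hypothetical and this is not the yugabyted source.

```python
def normalize_advertise_address(address: str) -> str:
    """Map the wildcard address to loopback; leave every other address untouched.

    Hypothetical helper illustrating the rule from the release note.
    """
    return "127.0.0.1" if address == "0.0.0.0" else address


# The wildcard address is rewritten, a concrete address is passed through.
print(normalize_advertise_address("0.0.0.0"))
print(normalize_advertise_address("10.0.0.5"))
```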
Other improvements
  • Introduces a strict deletion check for orphaned tablets to prevent erroneous data loss when the master issues DeleteTablets to tservers, with the feature guard master_enable_deletion_check_for_orphaned_tablets=true, ensuring upgrade and downgrade safety. #18332
  • Simplifies reading of remotely fetched traces by introducing proper nesting levels and splitting multi-line trace entries into different lines. #19758
  • Enables monitoring of inbound calls for read and write RPCs without any performance impact, by maintaining and updating WaitStateInfo during execution and annotating waits during I/O and lock/condition waiting. #19143
  • Switches release packaging to use native libraries on lowest common version (centos7 for linux-x86) instead of linuxbrew libraries, introducing changes to the default calculation for linuxbrew builds in the 2.21 release. #19219
  • Redefines release packaging to use native library build instead of linuxbrew, boosting compatibility with later OS versions. Changes the default setting for linuxbrew builds to false. Fixes shellcheck errors in compiler wrapper. #19219
  • Redesigns build options parsing in Jenkins for better compatibility, switching from YB_BUILD_OPTS evaluation to YB_* environment variables, and fixes shellcheck errors in the compiler wrapper. #19219
  • Corrects the Jenkins build error that occurred when YB_BUILD_OPTS was not set, ensuring smooth build operations even in its absence. The change switches the packaging method to use native library build instead of Linuxbrew, offering better compatibility with later OS versions. #19219
  • Ensures consistency at the time of stream creation in the CDC Consistent Snapshot feature by selecting a single common read point across all tablets within the input database. Additionally, guards changes with the TEST flag yb_enable_cdc_consistent_snapshot_streams, set to false by default. Also includes alteration to create stream workflow on the Master side and introduces retention barriers on Regular db, WAL, and IntentsDB. #19678
  • Allows you to preserve information sources during stream creation until snapshot records and related changes are consumed by maintaining retention barriers on WAL/Intents/RegularDB. Also, ensures data consistency during failover scenarios by performing preparations as part of the Apply of Raft operation. Includes support for colocated tables during snapshot stream creation, with a filter to exclude WAL records with commit_time lower than or equal to the snapshot_time. Currently, changes are hidden behind the TEST flag, which will later be an autoflag. #19679
  • Extends MiniCluster with YB Controller servers and introduces graceful shutdown feature, ensuring a smoother testing experience. #19849
  • Extends MiniYBCluster to include YB Controller servers and allows for their graceful shutdown. #19849
  • Introduces snapshot and streaming consumption changes as well as support for colocated tables in the context of consistent snapshot stream, allowing exhaustive and mutually exclusive snapshot and change records. #19680
  • Enhances the yb-admin CLI to support the creation of consistent snapshot streams, increasing control over snapshot options like NOEXPORT_SNAPSHOT and USE_SNAPSHOT. #19682
  • Introduces retention_barrier_no_revision_interval_secs gflag to avoid race conditions in setting retention barriers during stream creation, increasing the consistency of snapshot streams. #20145
  • Introduces a generic task that runs tasks after all tablets are created on new tables and fixes issues that could leave the table in the RUNNING state or schedule tasks before updating the data on disk. #20577
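
The consistent snapshot entries above describe a filter that excludes WAL records with commit_time lower than or equal to snapshot_time, so that snapshot and change records are exhaustive and mutually exclusive. The following is a minimal Python sketch of that partitioning idea only. The WalRecord class and split_by_snapshot function are hypothetical names used for illustration and are not part of YugabyteDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical stand-in for a committed change; not a YugabyteDB type."""
    key: str
    commit_time: int


def split_by_snapshot(records, snapshot_time):
    """Partition records into (covered by snapshot, delivered by streaming)."""
    # Anything committed at or before the snapshot time is already visible
    # in the snapshot, so the streaming side filters it out.
    in_snapshot = [r for r in records if r.commit_time <= snapshot_time]
    # Only strictly later commits are streamed as change records.
    streamed = [r for r in records if r.commit_time > snapshot_time]
    return in_snapshot, streamed


records = [WalRecord("a", 90), WalRecord("b", 100), WalRecord("c", 101)]
in_snapshot, streamed = split_by_snapshot(records, snapshot_time=100)
print([r.key for r in in_snapshot])  # ['a', 'b']
print([r.key for r in streamed])     # ['c']
```

Because the two predicates are complements of each other, every record lands on exactly one side, which is the property the release note calls exhaustive and mutually exclusive.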

Bug fixes

YSQL
  • Allows for ALTER TYPE to run on temporary tables without blocking PG table rewrite, preventing data corruption and enabling smoother transaction handling. #18909
  • Introduces a new per-database PG OID allocator, ensuring OID uniqueness within the database and enhancing horizontal scalability in multi-node and multi-tenancy environments. This new mechanism mitigates OID collisions and allows OID consistency in backup-restore scenarios across clusters. A new GFlag ysql_enable_pg_per_database_oid_allocator is provided to return to the old OID allocator behavior if necessary. #16130
  • Restarts the postmaster when a process is killed during its own initialization or cleanup to prevent potential mishandling of shared memory items. #19945
  • Resolves a bug that incorrectly type-checks bound tuple IN conditions involving binary columns like UUID for releases 2.17.1 and higher, improving database consistency. #19753
  • Adjusts the default values of yb_local_throughput_cost, yb_local_latency_cost, and yb_docdb_remote_filter_overhead_cycles, enhancing performance across most TAQO workloads. #20032
  • Ensures consistent wait start times in pg_locks by tracking the RPC request start time for the waiter instead of the time-out in the wait-queue, providing a more accurate reflection of real progress. #18603, #20120
  • Converts the "Unknown session" error into a FATAL error, allowing drivers to instantly finish a non-responsive connection, enhancing client connection management. #16445
  • Corrects a backup failure issue by ensuring the function yb_catalog_version is introduced, especially in 2.4.x or 2.6.x clusters where it was previously missed due to a YSQL upgrade code bug. #18507
  • Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. #20396
  • Enhances distinct iteration to avoid missing live rows after detecting a deleted row, by making AdvanceToNextRow aware of whether a fetched row is deleted, thereby ensuring no rows are missed during distinct queries on tables with deleted tuples. #19911
  • Enables cleanup after killed backends, fixing an issue where each killed background worker used up a Proc struct. This prevents the webserver from failing after 8 attempts. #20154
  • Releases memory to the operating system after processing each endpoint call, effectively managing large amounts of data produced by long and unique queries and preventing unnecessary accumulation of memory. #20040
  • Eliminates segmentation fault in webserver SIGHUP handler at cleanup by ensuring MyLatch usage in all instances in order to manage process life cycle. #20309
  • Adds a regression test for nested correlated subqueries to guard against reintroducing a previously fixed issue and ensures correct query results, with plans to backport it to relevant branches. #20316
  • Corrects the lookup function in BNL (Block Nested Loop) to ensure matching outer tuples are found accurately when the join condition contains more than just hashable equality filters. #20531
  • Marks BNL plannodes that sort results as unable to project, addressing a regression in sorted BNL's performance and ensuring the accuracy of sorting when a target list changes due to merged overhead projection operators. #20660
  • Extends early termination of index scans for conditions with the form index_column OP NULL to additional btree operators >/>=/</<=, ensuring such conditions no longer send unnecessary data to DocDB. #20642
  • Corrects an error in the aggregate scans' pushdown eligibility criteria to prevent wrong results from being returned when PG recheck is not expected, but YB preliminary check is required to filter additional rows. #20709
  • Corrects the inaccurate detection of constants in distinct prefix computation during distinct index scans, ensuring reliable query results for batch nested loop joins. #20827
  • Fixes a memory corruption issue that caused failures in creating a valid execution plan for SELECT DISTINCT queries. Disabling distinct pushdown lets these queries execute without errors and prevents server connection closures, improving the stability of SELECT DISTINCT queries. #20893
  • Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries and improving performance. #21004
  • Trims down the probability of inaccurate behaviour involving conflicts between single shard INSERT operations by ensuring read times are chosen after conflict resolution, enhancing data consistency. #19407
  • Reduces the time spent on preparing read requests in queries with a large number of operands in the IN operator by avoiding O(n^2) complexity in list traversal when generating ybctids. #19329
  • Refines parameter computation for Nested Loop joins in YSQL, removing the need to manually track relations that can't be batched parameters, thus mitigating bugs and simplifying logic. #19642, #19946
  • Includes additional tests that capture and demonstrably rectify previously recurring errors from Batched Nested Loop Left Join due to incorrectly parameterized batched expressions in multiple loop scenarios. #19642, #19946, #20495
  • Corrects the incrementation timing of pg_stat_user_indexes idx_scan column for LSM index for accurate stat generation, ensuring it no longer increments too early. #17495
  • Reduces spinlock deadlock detection time by 75% for prompt handling of potential freezes and restarts Postmaster when a process holding a spinlock is killed, ensuring successful initiation of new connections. #18272, #18265
  • Prevents potential postmaster crashes during cleanup of killed connections by using the killed process's ProcStruct to wait on an unavailable LWLock. #18000
  • Overhauls the handling of DDL statements, preventing them from restarting in READ COMMITTED mode, better managing DDL transactions, and ensuring more immediate clean-up of DDL transactions. #18761
  • Rectifies the issue of filters not binding to the request by amending the erroneous duplication-check of the bindings on the first column of the row element, enhancing query performance. #19308
  • Resolves an issue by safely dropping all foreign key constraints in one pass, preventing errors when altering a column referenced by a foreign key in partitioned tables. #19063
  • Fixes null constraint violations in ALTER TYPE operations and failures on tables with a range key, ensuring accurate operation and error reduction. #18911, #19382
  • Restores previous conditions after test PgRegressIndex yb_index_scan fails due to a commit reversion. #19477
  • Eliminates unnecessary file creation for views on temporary tables by checking if storage is actually needed. #19522
  • Moves estimated seeks and nexts in the EXPLAIN plan from VERBOSE to DEBUG flag, enhancing Sequential Scan nodes to include these estimates. #19938
  • Corrects DDL Atomicity by cleaning up failed CREATE TABLE operations, allowing for multiple sub-commands in ALTER TABLE ALTER COLUMN TYPE, adequately looking up Materialized views in PG schema, and addressing order field-dependency in DocDB columns. #19605
  • Rectifies the serialization mismatch in YBBatchedNestLoop, reducing errors when Parallel Query is enabled. #19612
  • Corrects an error that prevents the ALTER TABLE SET TABLESPACE command from executing successfully when the cluster has a placement_uuid set, by properly filling in the placement_uuid during validation. #14984
  • Allows transfer of parameter values to and from background workers in Parallel Query by correcting the finalize_plan function, improving Nested Correlated Subquery results. #19694
  • Enables running the postprocess script on alternate expected files in pg_regress, effectively fixing mismatches previously noticed due to its absence. #19737
  • Reduces maintenance time by switching to a less complex implementation of SideBySideDiff.java, thereby eliminating errors from SideBySideDiff.sanityCheckLinesMatch. #19690
  • Prevents PostgreSQL backend crashes induced by assert errors in the YbPgInheritsCache as it now correctly cleans up unreleased references, improving transaction reliability. #19807
  • Safeguards against potential bugs by ensuring that yb_transaction_priority_lower_bound and yb_transaction_priority_upper_bound are disregarded in read committed isolation, irrespective of the enable_wait_queue status. #19921
  • Adjusts the shared relcache init file invalidation to ensure correct refresh of the rel cache after executing DDL statements, ensuring consistency with Postgres results. #19955
  • Streamlines the creation of a publication for all tables in per-database catalog version mode by making updates to pg_yb_catalog_version that bypass CheckCmdReplicaIdentity function, eliminating DDL errors. #19965
  • Eliminates unnecessary catalog version incrementation on no-op GRANT DDL statements to enhance optimization by rectifying a previously missed case. #19981
  • Allows successful dropping of table groups when DDL Atomicity is enabled by verifying if the tables within the group are marked for deletion, instead of ensuring the group is empty. #20002
  • Revises YbSeqScan to send ysql_catalog_version in user-initiated system table requests, ensuring system table scans use an up-to-date catalog and reducing chances of TestPgRegressIndex failure. #20017
  • Rectifies the assertion failure issue in the per-database catalog version mode. The fix updates the conditions for treating DDL statements, eliminating previous failures caused by treating some DDL statements as non-DDL statements. #19975
  • Increases the delay when restarting the test cluster in tsan build to prevent occasional failures in unit test PgOidCollisionTest.TablespaceOidCollision/0. #20008
  • Corrects the method for deriving element_typeid to prevent crashes when running aggregations with join by ensuring it's derived from the RHS of the index condition, not the LHS. #20003
  • Resolves a bug ensuring ddl_transaction_state gets properly reset even if YbIncrementMasterCatalogVersionTableEntry throws an exception, preventing non-global DDL statements from being incorrectly handled as global ones. #20038
  • Prevents a possible system crash in YSQL backends manager by ensuring essential checks are in place before using the job database object. #20060
  • Enforces stricter locking mechanisms during concurrent updates on different columns of the same row, to maintain data consistency and prevent a "write-skew anomaly within a row". Adds a new gflag ysql_skip_row_lock_for_update to toggle the new row-level locking behavior. #15196
  • Ensures removal of both shared and per-database relation cache initialization files during postmaster startup to prevent the reusing of outdated files. #20125
  • Disables CheckCmdReplicaIdentity for tables when yb_non_ddl_txn_for_sys_tables_allowed is set to true, preventing YSQL upgrades from failing during update/delete operations on system tables. #20085, #20143
  • Eliminates the possibility of a segfault during the LWLock process when the postmaster cleans up a killed process, by using KilledProcToCleanup instead of MyProc. #20166
  • Restores PostgreSQL 11 code to its original format, facilitating an easier merge with PG15. #20176
  • Enhances visibility and debugging capabilities by introducing two boolean flags, which log every endpoint access and print basic tcmalloc stats after path handler and garbage collection. Now yb_pg_metrics handles the SIGHUP signal to update flags values. Also adds :13000/memz and :13000/webserver-heap-prof to expose memory usage with a new runtime variable to control tcmalloc sampling. #20157
  • Introduces the pg_stat_statements.yb_qtext_size_limit flag, controlling the maximum file size read into memory, limiting potentially large or corrupt qtext files impacting system memory usage. #20211
  • Unveils fresh insight into webserver memory usage through the creation of :13000/memz and :13000/webserver-heap-prof for printing tcmalloc stats and displaying current or peak allocations, respectively. #20157
  • Rectifies an issue with corrupted state manipulation, caused by processes being killed during writing, by restarting the postmaster whenever a backend is killed unexpectedly in a critical section. This helps avoid infinite loops and CPU overuse, thereby enhancing database stability. #20255
  • Caps retrieval of beentry from localhost:13000/rpcz to 1000 iterations, preventing indefinite waits and ensuring safety even in cases of inconsistent states. #20274
  • Blocks new-version DDL statements in an invalid per-database catalog version configuration to avoid possible stale read/write RPCs and provide accurate results during cluster upgrades. #20300
  • Moves the Active Session History (ASH) code from extension to core Postgres, eliminating the chance of partial feature activation and ensuring control solely through the TEST_yb_enable_ash gflag, enhancing the user's control over the ASH functionality. #20180
  • Enables rollback from PostgreSQL 15 upgrade to preserve PostgreSQL 11 data directory, therefore preventing a loss of stored data such as statistics. #20319
  • Renames the debug field in ExplainState to yb_debug and repositions it to the bottom of the struct for clarity purposes. #20366
  • Reduces memory consumption during secondary index scans by introducing a separate arena for batch operations, lowering the risk of a node running out of memory. #20275
  • Prevents background worker crashes caused by assertion failures in Active Session History (ASH) when MyProcPort is not established. #20338
  • Adds an extra null check to avoid runtime errors when ASH is enabled by default and prevents the execution of ASH code while running initdb, fixing the PcustomeriniAsh test failure. #20362
  • Reduces likelihood of Restart read required error during Cross-DB Concurrent DDLs with per-database catalog version enabled by initiating the function YbInitPinnedCacheIfNeeded before starting the DDL transaction. Also, improper usage of yb_non_ddl_txn_for_sys_tables_allowed with a DDL statement has been rectified. #20303
  • Disables the flaky PcustomeriniAsh.Ash test in tsan builds to ensure accurate and consistent test results. #20387
  • Increases the schema version of the default partition whenever you create a new partition, preventing erroneous data insertion into the default partition due to cache refresh issues. #17942
  • Enhances test environment on Mac by fixing clean-up issues, and introduces a rollback ability for stashed PG11 data during PG15 upgrade. #20319
  • Adds PgClient session id to ASH metadata to support aggregations for tserver wait events based on client session id, controlled by TEST_yb_enable_ash. Safe to upgrade/downgrade. #20242
  • Revamps the initialization of YbPgInheritsCache's hash table to use binary comparison with HASH_BLOBS flag, ensuring correct hash lookups, while also stopping marathon Java partitioning tests on TSAN to prevent timeouts and test failures. #20436
  • Rectifies the mismatched sizes of various ASH fields, ensuring upgrade and downgrade safety, while providing new functionality without disturbing the existing one. Note that if you downgrade, ASH will become unavailable and it is guarded by TEST_yb_enable_ash. #20454
  • Mitigates MISMATCHED_SCHEMA error in cross DB concurrent DDLs with per-database catalog version turned on, by ensuring backends only apply messages sent by themselves. #20340
  • Eliminates tsan warnings in the MetricWatcher helper class by using MetricEntity class, preventing potential test failures. #20580
  • Rectifies potential flakiness in TestYbAsh testEmptyCircularBuffer by ensuring buffer remains empty during idle cluster and excluding certain query samples. #20629
  • Refines the Batch Nested Loop (BNL) first batch building logic to return correct query results when the provisional first batch size equals the outer table's size. #20707
  • Corrects the division by zero error occurring with certain queries when the yb_enable_base_scans_cost_model is activated and yb_fetch_size_limit is enforced by setting a fixed size for result width when it equals zero. #20892
  • Reduces PostgreSQL connection startup timeouts in geo-distributed clusters with a new wait_for_ysql_backends_catalog_version_master_tserver_rpc_timeout_ms GFlag, increasing the default timeout value to 60s from 30s. This alteration only impacts one specific RPC - WaitForYsqlBackendsCatalogVersion, not all RPCs, which should diminish time-out incidents. #18228
  • Renames two columns in the yb_active_session_history view: yql_endpoint_tserver_uuid becomes top_level_node_id, which is more intuitive, and session_id becomes ysql_session_id for clarity. #20920
  • Fixes YSQL upgrade failure from 2.16 to 2.21 by adding a 2-second delay before moving to the next connection if the previous script included a breaking DDL statement. #20842
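
One YSQL fix above (#19329) avoids O(n^2) complexity in list traversal when generating ybctids for large IN lists. The sketch below is a generic Python illustration of that class of problem, not the actual YugabyteDB code: on a singly linked list, fetching each element by index restarts from the head every time, while a single cursor pass visits each node once. All names here are hypothetical.

```python
class Node:
    """Minimal singly linked list node, used for illustration only."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def build_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def collect_by_index(head, n):
    """Quadratic pattern: each lookup walks from the head again."""
    steps = 0
    out = []
    for i in range(n):
        node = head
        for _ in range(i):
            node = node.next
            steps += 1
        out.append(node.value)
    return out, steps


def collect_by_cursor(head):
    """Linear pattern: keep a cursor and advance it once per element."""
    steps = 0
    out = []
    node = head
    while node is not None:
        out.append(node.value)
        node = node.next
        steps += 1
    return out, steps


values = list(range(1000))
head = build_list(values)
slow, slow_steps = collect_by_index(head, len(values))
fast, fast_steps = collect_by_cursor(head)
print(slow == fast, slow_steps, fast_steps)  # True 499500 1000
```

Both functions return the same elements, but the indexed version performs 499500 pointer hops for 1000 operands compared with 1000 for the cursor version, which is why the cost grows so quickly with the size of the IN list.
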
YCQL
  • Solves a concurrency issue in the TestCQLServiceWithCassAuth.TestReadSystemTableAuthenticated unit test by adjusting the CQLServer's shared_pointer reset method. #17779
DocDB
  • Resolves potential WriteQuery leak issue in CQL workloads, ensuring proper execution and destruction of queries, while preventing possible tablet shutdown blockages during conflict resolution failure. #19919
  • Enhances error reporting of cross-cluster pollers, addressing persistence of stale or missed errors and simplifies the corresponding code. Now, instead of storing verbose detailed status, only error codes are stored for efficient memory usage. #19455
  • Refines meta cache updates to avoid overwriting child tablets and consequently causing stale data, ensuring more accurate partition map refreshes. #18732
  • Streamlines transaction processing by updating TabletState only for tablets engaged in writes and ignoring old statuses during transaction promotion, reducing failure errors and boosting consistency. #18081, #19535
  • Resolves an inconsistency problem where indexes grow in size even after delete operations, causing slower query performance. The fix involves intelligent handling of backfill done events on the tablet server side. Note, it only works for newly created indexes and will not auto-recover from current buggy states. #19544
  • Enables wait-on-conflict by default in release builds across all isolation levels. #19837
  • Addresses potential deadlock during tablet shutdown when wait-queues are enabled by refactoring the Wait-Queue shutdown path to execute thread_pool_token_->Shutdown as part of WaitQueue::Impl::CompleteShutdown instead of StartShutdown. #19867
  • Includes a script to ensure no index tables retain delete markers post-backfill, addressing a bug causing indexes to expand in size following row deletion, which slowed queries. The bug affected both YCQL and YSQL APIs for new indexes created with versions 2.14.x/2.16.x/2.18.x and led to increasing storage needs due to accumulated delete markers. This script negates these issues and boosts index performance. #19544
  • Sets kMinAutoFlagsConfigVersion to 1, providing accurate configuration version comparison and reducing potential confusion. #19985
  • Reduces the occurrence of Transaction Metadata Missing errors by accurately reporting deadlocked transactions that may result from multiple aborts in a deadlocked cycle. #20016
  • Enables single shard waiters to progress after a blocking subtransaction rolls back, by applying the same logic used for distributed transactions. #20113
  • Handles backfill responses getting interleaved across different operations more gracefully to prevent crashes caused by slow masters or network delays. #20510
  • Reintroduces the use of bloom filters during multi-row inserts, improving conflict resolution and fixing missed conflicts, while also addressing issue #20648. #20398, #20648
  • Reschedules the resumption of contentious waiters on the same underlying Scheduler::Impl::strand_, which is used for executing incoming rpc calls, instead of reactor threads, thus preventing a fatal issue. #20651
  • Reduces log warnings in normal situations by downgrading repeated waiter resumption alerts to VLOG(1), benefiting from the direct signaling of transaction resolution. #19573
  • Disables the wait-on-conflict feature in 2.21.0 by default to fix a launch-blocking bug linked to multiple requests per session to a single tablet. #20978
  • Reflects the actual columns locked in conflict resolution instead of the shared in-memory locks in pg_locks, providing more accurate output for waiting transactions. #18399
  • Deactivates the packed row feature for colocated tables, averting potential write failure issues identified in 20638 during specific kinds of compactions. #21047
  • Prevents segfaults from pg_locks queries when wait-queues are disabled by explicitly checking that server_->tablet_manager->waiting_txn_registry exists before using it. #20772
  • Fixes a race condition on kv_store_.colocation_to_table to prevent undefined behavior and re-enables packed row feature for colocated tables, enhancing data writing and compaction processes. #20638
  • Modifies the DocDB system by shifting the acquirement of submit_token_ of the WriteQuery to the post-conflict resolution phase to prevent DDL requests from being blocked, thus optimizing both reads and writes for continued performance and enhanced data consistency. #20730
  • Corrects transaction queue behavior allowing multiple waiters for a single transaction per tablet, thereby resolving conflicts and enhancing transaction handling capability. #18394
  • Restores the wait-on-conflict feature in the 2.21.0 branch that was previously disabled due to a bug, now resolved. #20978
  • Filters out external intents beyond producer tablet range to address disparity in tablet partitions, ensuring each consumer tablet only receives relevant intents. This resolves the issue of potential hidden batch records due to erroneous starting of write_ids from zero. #19728
  • Resolves the issue where transactions continue and commit despite supposed immediate abort after promotion, due to a timing gap between sending UpdateTransactionStatusLocation RPCs and reception of the first PROMOTED heartbeat. This update delays the sending of UpdateTransactionStatusLocation RPCs until the first PROMOTED heartbeat is acknowledged. #17319
  • Refines the leaderless tablet detection logic to prevent incorrect reporting of tablets having recently undergone leader changes as leaderless, improving data consistency. #20124
  • Prevents the deletion of active snapshots during a database backup, even if their corresponding tables are dropped, enhancing the reliability of backup operations. #17616
  • Adjusts calculation of replication lag metrics for split tablet children by incorporating parent tablet's last sent/committed record time, promoting greater accuracy in metric results. #17025
  • Addresses the bug where large transactions partially apply to regular RocksDB during tablet server restarts, thus ensuring consistent transaction data after restarts. #19359
  • Allows setting all columns of a row to NULL, resulting in deletion instead of creating a row consisting of NULLs, rectifying an issue during compaction. #18157
  • Corrects an issue where an invalid filter key negatively affected the performance of backwards scans, by improperly passing all SST files through the bloom filter. This update will be applied to versions 2.20 and 2.18. #19440
  • Resolves issues of data validation failure and unreachable nodes by properly setting child checkpoints in cdc_state during tablet splits, curbing log amplification. #18540
  • Allows tracing of outgoing calls only if the current RPC is being traced, reducing excessive memory consumption and logging. #19497
  • Introduces retry logic to synchronize metadata and checkpoint creation during remote bootstrap initialization, reducing inconsistency risks associated with schema packing. #19546
  • Stops Garbage Collection (GC) of schema packings that XCluster config references to avoid data loss during replication, taking into account network partitions and schema changes. #17229
  • Removes a regression that could crash the TServer when replaying alter schema during local bootstrap by adding ANNOTATE_UNPROTECTED_WRITE to CqlPackedRowTest.RemoteBootstrap. #19546
  • Corrects Master's tablet_overhead mem_tracker issue, ensuring it displays accurate memory consumption, addressing discrepancy in MemTracker metric names between TServer and Master. #19904
  • Resolves a race condition in MasterChangeConfigTest.TestBlockRemoveServerWhenConfigHasTransitioningServer by ensuring the launched async thread operates on a copy of ExternalMaster* instead of the mutating current_masters vector. #19927
  • Corrects intermittent index creation failure for empty YCQL tables by evaluating the result of is_running rather than checking index state directly, ensuring accurate retain_delete_markers and reducing potential performance issues. #19933
  • Addresses a PITR restore issue by terminating all active transactions, ensuring inserted or updated data doesn't get omitted, and giving a clear signal about the non-application of such transactions. #14290
  • Adds retries around the leader step down in the PgNamespaceTest.CreateNamespaceFromTemplateLeaderFailover test to allow the target leader time to properly catch up, preventing previous failures. #14316
  • Disables the packed row feature for colocated tables, effectively preventing a possible encounter with the underlying issue in 21218 during debugging. #21218
  • Prevents system crashes caused by the CallHome class calling a pure virtual function due to a timing issue during system shutdown. #18254
  • Corrects an Xcluster Consumer shutdown issue encountered during testing by implementing a temporary mitigation that waits for the Flush with a timeout. #19402
  • Amends RaftGroupMetadata::CreateSubtabletMetadata to update the log prefix, preventing the use of parent tablet ID in child tablet's metadata logging. #19375
  • Resolves crashes in sys-catalog-tool linked with TabletBootstrap failing due to uninitialized transaction_participant_context, enhancing stability. #19412
  • Corrects a previously non-retryable PGSQL operation, preventing errors from being returned back to PG layer during a parent tablet shutdown scenario. #19033
  • Enables transaction promotion in TestPgWaitQueuesRegress for an enhanced testing process. #19575
  • Restores the original behavior of not counting tablets on dead tservers towards the replica count, ensuring accurate representation of under-replicated tablets. #17867
  • Ensures the correct in-memory state for the master coming out of shell mode by fetching the universe key from other masters, enabling proper decryption of the universe key registry. #19513
  • Corrects a lock order inversion in the transaction loader to prevent potential deadlock scenarios. #19508
  • Adds tests for handling indexes in colocated databases in transactional and non-transactional xCluster environments, enhancing database reliability and consistency. Also simplifies WaitForReplicationDrain test helper for easier usage. #18427, #16758
  • Rectifies the issue causing the XClusterYsqlIndexTest.FailedCreateIndex test to fail by altering the over-aggressive DCHECK to an efficient SCHECK to allow for transient ALTER operations. #18967
  • Rectifies the use-after-free issue in RefinedStream::Connected failure path by ensuring a status return rather than causing memory writes to a freed space. #19727
  • Introduces macros that simplify the creation of comma-separated expression lists to a stream, reducing repetition. #19761
  • Redefines the structure of thirdparty_archives.yml by eliminating redundant fields, implementing sensible default values, and introducing blank lines for improved readability between distinct third-party archive build sections. #19883
  • Increases the visibility of Remote Bootstrap (RBS) sessions by adding a dedicated tserver page that lists all ongoing RBS sessions, including the remote log anchor sessions. Additionally, extends the Last status field on the tserver's tablets page to display the source a peer is or has been bootstrapping from. #19568
  • Resolves a maybe-uninitialized compilation error in almalinux8 release gcc11, enhancing the reliability of the code by addressing both identified issues. #19987
  • Rectifies the TestYSQLDumpAsOfTime compilation issue by replacing <int64_t> with <PGUint64>. #19992
  • Eliminates the extra verbosity in MiniCluster logs by removing entries with hk!!. #20007
  • Resolves an issue where the webserver may start prematurely and fail, by ensuring cds::Initialize is called before executing any function on cds::threading::Manager, minimizing race conditions. #20119
  • Introduces an asynchronous interface for PgClient shared memory exchange, allowing for multiple requests and parallel query processing. #20151
  • Displays the errno when unable to open version_metadata.json or auto_flags.json files, providing clarity on the nature of the IO error. #20250
  • Deprecates the enable_process_lifetime_heap_sampling flag, simplifying tcmalloc sampling control to only setting profiler_sample_freq_bytes, which if <=0 disables sampling. #20236
  • Prevents application crashes caused by an interrupted interprocess semaphore which previously threw an exception. #20325
  • Allows early termination of old single statement read-committed transactions facing kConflict errors to enhance system throughput. #20329
  • Eliminates premature shutdowns during transaction status resolution by ensuring the rpcs_.Shutdown only occurs after all status resolvers of the participant have ended, avoiding any in-progress status resolver rpc(s). #19823
  • Reduces potential request is too old errors during YSQL DDLs by setting the SysCatalog tablet's retryable request retain duration to the maximum of YSQL and YQL client timeout. #20330
  • Fixes ./yb_build.sh help to correctly display the help command instead of an error message due to a mismatched function name. #20390
  • Removes non-trivially destructible static initializations from the code, eliminating complexities that could lead to hard-to-identify bugs. #20407
  • Replaces the deprecated exec_program command with execute_process in CMake, resolving issue 20481 and eliminating potential warning CMP0153 for developers. #20481
  • Allows bulk load time reduction by packing all values when inserting a row with multiple values into the PostgreSQL layer. Apply the preview flag -ysql_pack_inserted_value to enable this feature and note it currently uses v1 encoding. #20713
  • Stores the first error from a failed setup replication to ensure more accurate feedback to the user, instead of a final generic error message like Universe is being deleted. #20689
  • Changes the path in yb_build.sh to locate generate_test_truststore.sh in $YB_BUILD_SUPPORT_DIR, solving build failures on GitHub Actions. #20747
  • Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. #20912
  • Aborts transactions early when they fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. #21328
  • Eliminates a race condition that can occur when simultaneous calls to SendAbortToOldStatusTabletIfNeeded try to send the abort RPC, thus preventing avoidable FATALs for failed geo promotions. #17113
  • Changes the initial remote log anchor request to be at the follower's last logged operation id index, reducing the probability of falling back to bootstrapping from the leader and improving the success rate of remote bootstraps. #19536
  • Prevents concurrent heap profiles from running and problematic resetting of sampling frequency, allowing only one heap profile to run at a time. #19841
  • Resolves use-after-move errors detected by clang-tidy's bugprone-use-after-move-check for increased code stability. #20435
  • Resolves issues in the under-replicated endpoint algorithm so that replicas are counted only when a placement block's minimum number of replicas has not yet been met, giving an accurate replica tally for placement blocks. #20657
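The first-error behavior described for failed setup replication (#20689) can be illustrated with a minimal Python sketch. This is not the YugabyteDB implementation; the SetupStatus class, its method names, and the "Table schema mismatch on target" message are hypothetical and exist only to show why keeping the first failure gives more useful feedback than the last one.

```python
from typing import Optional


class SetupStatus:
    """Hypothetical holder that keeps the first failure and ignores later ones."""

    def __init__(self) -> None:
        self.first_error: Optional[str] = None

    def record_error(self, message: str) -> None:
        # Store only the first failure; later errors are often side effects
        # of the cleanup that the first failure triggered.
        if self.first_error is None:
            self.first_error = message

    def user_message(self) -> str:
        # Report the stored first error, or "OK" if nothing failed.
        return self.first_error or "OK"


status = SetupStatus()
status.record_error("Table schema mismatch on target")  # hypothetical root cause
status.record_error("Universe is being deleted")        # generic follow-on error
print(status.user_message())
```

With this approach, the message shown to the user is the root cause rather than the generic error raised during teardown.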
CDC
  • Introduces an additional test case ensuring that only tablets belonging to a dropped table get deleted from the cdc_state table. #19196
  • Eliminates deadlock during the deletion of namespace-level CDC streams, enabling the successful execution of the ysqlsh drop database command even when the database has multiple tables. #19879
  • Resolves an issue preventing newly created tables from being added to the stream metadata and CDC state table after an existing table is dropped, by considering streams in DELETING_METADATA state as well as ACTIVE state during dynamic table addition. #20428
  • Removes only non-active tablets from cdc_state in CleanUpCDCStreamsMetadata, including retaining parent split tablets, to preserve essential data during stream cleaning. #19348
  • Fixes the issue of WAL garbage collection for tables added after stream creation by enabling WAL retention for each such tablet, reducing connector failure. #19385
  • Reinstates the creation of CDC streams with old record types to ensure backwards compatibility and prevent CDC error 9 when the ALL mode is utilized. #19929
  • Fixes the decoding of NUMERIC values in CDC records to prevent precision loss by ensuring that the decoded string is not converted to scientific notation if its length is more than 20 characters. The fix uses the string representation with no limit on length and the Postgres numeric_out method for decoding, which is identical to how numerics are decoded in a PG query. #20414
  • Rectifies an error on the CDCService side, where Merger tried to set the tablet safetime to a lower value. Now, for non-consistent snapshot streams, the commit_time_threshold is set to the safe_hybrid_time value from the request instead of always being set to zero. #20356
  • Rectifies consistent snapshot stream creation by ensuring tablets complete their tasks and snapshot safe opids populate in the cdc_state table for proper initialization. #20477
  • Allows continuation of tablet fetching, even if certain tables face errors, by logging a warning instead of sending unnecessary errors to the client. #19434
  • Rectifies pg_replication_slots view failure prior to any cdc/xCluster stream creation by refining the logic to read the cdc_state_table only when a cdc stream exists. #20073
  • Updates the CDCSDK stream metadata with consistent snapshot-related details and ensures its persistence in the sys_catalog, enhancing the stability and accuracy of data. #20202
  • Fixes flaky behavior in CDCSDKTabletSplitTest by ensuring GetChanges is called after registering children tablets, reducing test failures. #20469
  • Corrects the AsyncYBClient method to pass the explicit_cdc_sdk_opid instead of a null value, ensuring proper snapshot checkpointing and enhancing snapshot resume functionality in EXPLICIT mode. #19394
  • Alleviates a regression in the connector snapshot resume capability by adjusting the key population in GetChangesRequest, ensuring the key is populated only when it is not null. #19394
  • Removes potential crash in DEBUG mode by ensuring each entry returned from the cdc_state_table iteration in pg_replication_slots view is checked with RETURN_NOT_OK before usage. #19894
  • Increases the value of FLAGS_update_min_cdc_indices_interval_secs from 2 to 5, ensuring the CDC state table tablet has enough time to wait for a new leader and correctly update the log. #18156
  • Corrects the calculation of the cdcsdk_sent_lag metric to prevent disproportionate growth, by updating the last_sent_record_time with each SafePoint record, reducing inconsistency between transactions. #15415
  • Eliminates errors in streaming changes from child tablets in CDCSDK by accurately determining the slowest consumer and preventing unnecessary Garbage Collection of intents. #20284
  • Allows propagation of RPC deadline from clients to YB-Master for CreateCDCStream, reducing unnecessary retries and correctly timing out requests. #20583
  • Resolves memory leak errors in the asan environment caused by not freeing YBCStatus from YBCPgExecCreateReplicationSlot in case of AlreadyPresent or LimitReached errors. #20279
  • Resolves CDCLog and CDCService test failures by setting FLAGS_cdcsdk_retention_barrier_no_revision_interval_secs to 0, ensuring upgrade and rollback safety. #20353
  • Rectifies timing issues in the CDCSDKConsistentSnapshotTest.TestRetentionBarrierSettingRace, enhancing stability for TSAN builds via application of WaitFor with an adequate timeout. #20455
  • Prevents write pausing on a tablet for an AlterSchema procedure that is solely setting retention barriers during consistent snapshot stream creation. #20620
  • Triggers a thorough cleanup when stream creation fails to avoid resource misuse, resolving issues caused by late ALTER TABLE responses. #20725
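The precision loss addressed by the NUMERIC decoding fix (#20414) can be shown with a short Python sketch. It uses Python's decimal module rather than the Postgres numeric_out method, and the to_plain_string helper and the sample value are assumptions made for illustration only. The point is that a round trip through scientific notation with a limited number of digits drops information, while a plain fixed-point string does not.

```python
from decimal import Decimal


def to_plain_string(value: Decimal) -> str:
    """Render a Decimal in plain fixed-point notation, never scientific."""
    return format(value, "f")


# Sample value with more than 20 characters (chosen for illustration).
original = Decimal("12345678901234567890.123456789")

# Round trip through scientific notation limited to 16 significant digits.
lossy = Decimal(f"{original:.15E}")

# Round trip through the full-length plain string.
exact = Decimal(to_plain_string(original))

print("lossy round trip matches:", lossy == original)
print("exact round trip matches:", exact == original)
```

Only the plain-string round trip reproduces the original value, which is why keeping the full string representation avoids the loss.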
yugabyted
  • Revises auth failure handling in Ysql Connection Manager to give accurate error messages, prevent broken control connections, and improve error packet handling. #17289, #19781, #19800
  • Adjusts Ysql Conn Mgr Stats setting to align with Ysql Conn Mgr's status, maintaining FALSE setting even when Postgres process is created without a tablet server. #19998
  • Resolves the hanging issue in Odyssey when incoming packet size exceeds a limit, by ensuring COPY_DATA and QUERY message types are fully received before processing. #19245, #19284
  • Maintains sticky object count bi-directionally when creating new sub transactions or returning to parent transactions, aligning count with actual usage. #20071
  • Allows usage of SET LOCAL query to set temporary session parameters for specific transactions, with values reverting after transaction completion. #19556
  • Introduces a JSON endpoint at /api/v1/mem-trackers, enhancing data reliability by avoiding parsing of the HTML page at the /mem-trackers server endpoint for memory usage data. #18057
  • Modifies yugabyted UI apiserver to acquire memory usage data from the new JSON endpoint /api/v1/mem-trackers instead of parsing HTML from /mem-trackers, ensuring more reliability. #18057
Other fixes
  • Ensures the tserver start and tserver stop scripts successfully terminate all running PG processes, regardless of PID length, enhancing process management. #19817
  • Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919
  • Increases the max_stack_depth from 900kB to 950kB for proper execution and lessens the excessive logging triggered by inherits cache in yb_pg_errors.sql. #19443
  • Reduces disruptions by throttling the master process log messages related to "tablet server has a pending delete" into 20-second intervals. #19331
  • Prevents segmentation faults in the stats collector after a Postmaster reset, ensuring the stats collector's operations are uninterrupted even when a query is terminated. #19672
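The log throttling mentioned for the "tablet server has a pending delete" message (#19331) follows a common pattern that can be sketched in Python. This is only an illustration of interval-based throttling, not the master's actual logging code; the ThrottledLogger class, the "pending_delete" key, and the injectable clock are hypothetical names introduced for the example.

```python
import time
from typing import Callable, Dict, List


class ThrottledLogger:
    """Hypothetical logger that emits a given key at most once per interval."""

    def __init__(self, interval_secs: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_secs = interval_secs
        self.clock = clock  # injectable so the example can use a fake clock
        self.last_emitted: Dict[str, float] = {}
        self.emitted: List[str] = []

    def log(self, key: str, message: str) -> bool:
        """Emit the message unless the same key was emitted too recently."""
        now = self.clock()
        last = self.last_emitted.get(key)
        if last is not None and now - last < self.interval_secs:
            return False  # suppressed: still inside the throttle window
        self.last_emitted[key] = now
        self.emitted.append(message)
        print(message)
        return True


# Drive the logger with a fake clock instead of sleeping.
fake_now = [0.0]
logger = ThrottledLogger(interval_secs=20, clock=lambda: fake_now[0])
results = []
for t in (0.0, 5.0, 19.9, 20.0, 41.0):
    fake_now[0] = t
    results.append(
        logger.log("pending_delete", "tablet server has a pending delete"))
```

In this run, the calls at 5.0 and 19.9 seconds are suppressed because they fall within 20 seconds of the previous emitted message.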

Other

  • Streamlines the code base by eliminating over 900 unnecessary includes, splitting oversized .proto files, enhancing protoc-gen-yrpc to produce forward headers for protobuf, and upgrading precompiled headers. Also restructures MasterService by dividing it into smaller services, which improves build times, and moves encryption-related classes. Revised UUID generation now drains less system entropy. #10584
  • Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. #8744
  • Enables passing of username and password to the connect command akin to ysqlsh, permitting direct connection to the desired database/keyspace. #14869
  • Introduces documentation for GFlags pertinent to the bootstrap from closest peer feature in the tserver flags page. #18061
  • Corrects a nonfunctional link in the RBS GFlags description and adds documentation for the bootstrap from closest peer feature. #18061
  • Reduces network requests when running ./yb_build.sh offline for a smoother rebuild process and adds helpful error messages for easier debugging. #19476
  • Rectifies the issue where yugabyted crashes if the yugabyted-ui binary doesn't exist. The cluster now starts with the UI disabled, similar to setting ui=false, and alerts the user with a warning. #16098
  • Resolves the Odyssey build failure on Ubuntu 23.04 when compiling using ./yb_build.sh release gcc13 by addressing the -Werror=address issue. #19959
  • Adjusts previously hardcoded ports such as master_rpc_port, tserver_webserver_port, and master_webserver_port to dynamically accommodate custom configurations, solving connectivity issues in multi-region/zone cluster setups. #15334
  • Ensures better visibility into local calls by tracking them and allowing the DumpRunningRpcs API to fetch them; if rolled back, this functionality becomes unavailable. #19697
  • Transitions primary build and packaging from CentOS 7 to AlmaLinux 8, discontinuing support for Linux operating systems with glibc older than 2.28 for future integrations, while preserving it for versions 2.20 and earlier. #20173