What's new in the v2.21 preview release series
What follows are the release notes for the YugabyteDB v2.21 release series. Content will be added as new notable features and changes are available in the patch releases of the YugabyteDB v2.21 release series.
For an RSS feed of all release series, point your feed reader to the RSS feed for releases.
Changes to supported operating systems
YugabyteDB 2.21.0.0 and newer releases do not support v7 Linux versions (CentOS 7, Red Hat Enterprise Linux 7, Oracle Enterprise Linux 7.x), Amazon Linux 2, or Ubuntu 18. If you're currently using one of these Linux versions, upgrade to a supported OS version before installing YugabyteDB v2.21. Refer to Operating system support for the complete list of supported operating systems.
v2.21.1.0 - June 13, 2024
Build: 2.21.1.0-b271
Downloads
Docker:
docker pull yugabytedb/yugabyte:2.21.1.0-b271
New features
- Bitmap scan support. Combine multiple indexes for more efficient scans. TP
- Active Session History. Get real-time and historical information about active sessions to analyze and troubleshoot performance issues. TP
Change log
View the detailed changelog
Improvements
YSQL
- Enhances logging for DDL transaction conflicts and PG catalog version mismatches by including the DDL command tag and specific log details outside of the log_ysql_catalog_versions gflag. #20084
- Adds a webserver with a Prometheus endpoint on ysql_bench for resilience and scale documentation. #19667
- Alters temporary namespace naming in YB to pg_temp_<tserver-uuid>_<backend_id> from pg_temp_<backend_id>, making the names unique across nodes and preventing temp tables from being overwritten or deleted. #19255
- Enhances ADD/DROP PK using a new table rewrite approach, preserving PG level metadata during rewrite operations and enabling concurrent DML abortions to manage schema version mismatches. #17130
- Introduces new flags in yb_backup.py: backup_tablespaces, restore_tablespaces, and a redefined use_tablespaces, enhancing backup and restore procedures. #20389
- Refines VACUUM warning messages, removes the beta feature sign, and makes it clear that garbage collection of dead tuples is automatic. #18330
- Allows BNL (Batched Nested Loop) joins on different integer types, such as int2 and int4, promoting flexibility in join-orderings. #20715
- Adds a function to log the memory contexts of a specific backend process, helping debug local memory bloat issues. #14025
- Treats REFRESH MATERIALIZED VIEW as a non-disruptive change, preventing unnecessary transaction terminations. The default option, REFRESH MATERIALIZED VIEW NONCONCURRENTLY, modifies metadata but without making a disruptive alteration. #20420
- Resolves ToString function issue which caused non-const references of std/boost::optional objects to display as pointers. #20719
- Renames EXPLAIN field names: Remote Filter to Storage Filter, Remote SQL to Storage SQL, and Remote Index Filter to Storage Index Filter. #14503
- Redefines the ToString function order to prevent compilation failures when used for collections with std::optional. #20887
- Introduces new yb_backup.py flags including backup_roles, restore_roles, ignore_existing_roles, and use_roles with distinct semantics for enhanced role management during backup and restore operations. #20972
- Reduces stack trace duplication in yb_debug_report_error_stacktrace and refines its debugging functionality. #21017
- Now, the yb_prefer_bnl flag takes precedence over a NestLoop hint, ensuring smoother upgrades. #21129
- Enables easier alteration of the ysql_enable_db_catalog_version_mode gflag default value through a new framework. #21127
- Removes unnecessary epoch parameter from LaunchTS and changes epoch back to term in YsqlBackendsManager. #21217
- Introduces a new flag master_ts_ysql_catalog_lease_ms to decrease the lease period to 10 seconds, reducing the waiting time for unresponsive tservers. #21249
- Refines ALTER TYPE functionality to apply the table rewrite approach, enhancing database upgrade processes. #17130
- Displays distinct prefix keys explicitly in the explain output, enhancing the clarity of indexing for users. #20831
- Adds auto gflag ysql_yb_enable_ddl_atomicity_infra to control DDL atomicity feature during the upgrade phase. #21535
- Allows YbInitPinnedCacheIfNeeded to only load the shared pinned cache, enhancing concurrent handling of DDLs in various databases. #21635
- Now logs global-impact DDL statements that increment all database catalog versions. #21826
- Adds a new YSQL view for YCQL statement metrics, allowing it to be joined with YCQL wait events in the yb_active_universe_history table. #20616
- Reduces per-backend memory consumption by reinstating TOAST compression for catalog tables. #21040
- Avoids schema version mismatch errors during ALTER TABLE operations in cases where DDL atomicity is enabled. #21787
- Resolves schema version mismatch errors that occur after an ALTER TABLE operation due to DDL transaction verification in non-debug builds. #21787
- Introduces a new YSQL configuration parameter yb_enable_parallel_append to disable the unannounced feature parallel append. #21934
- Adds new columns to localhost:13000/statements for more comprehensive database management, including user and database IDs along with varied block level statistics. #21735
- Enables DDL atomicity feature by default by altering the defaults of the ysql_yb_ddl_rollback_enabled, report_ysql_ddl_txn_status_to_master, and ysql_ddl_transaction_wait_for_ddl_verification flags. #22097
- Enhances YSQL backfill logging clarity and documentation for PgsqlBackfillSpecPB proto. #21154
- Allows ysql_bench to execute a connection initialization SQL like Hikari CP, useful for setting parameters such as statement_timeout during resilience testing. #19741
- Optimizes the get_tablespace_distance function, enhancing the speed of the yb_is_local_table YSQL function. Reduces query time by caching the GeolocationDistance value. #20860
- Updates the description for the ysql_catalog_preload_additional_tables flag to accurately reflect preloading behavior. #20791
- Avoids utilizing unsupported ybgin index scans by adjusting the ybgin cost estimation, favoring sequential scans for better effectiveness when specific conditions are met, mitigating potential misuse of indices. #9960
- Allows hiding non-deterministic "Memory:" fields in EXPLAIN output using the GUC yb_explain_hide_non_deterministic_fields, primarily for pg15. #20958
- Delivers consistent error messages for aborted transactions that align with those from serialization and deadlock errors. #21043
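Scripts or saved plans that match on the old EXPLAIN field names (Remote Filter, Remote SQL, Remote Index Filter, renamed under #14503 above) may need updating. The following is a minimal sketch of one way to rewrite saved plan text. The function name rename_explain_fields and the sample plan lines are hypothetical and only for illustration; they are not part of YugabyteDB.

```shell
# Hypothetical helper: rewrites the pre-rename EXPLAIN field names to the
# renamed ones (#14503). Reads plan text on stdin, writes to stdout.
rename_explain_fields() {
  sed -e 's/Remote Index Filter/Storage Index Filter/g' \
      -e 's/Remote Filter/Storage Filter/g' \
      -e 's/Remote SQL/Storage SQL/g'
}

# Sample lines are made up for illustration; they are not real EXPLAIN output.
printf '%s\n' '  Remote Filter: (v > 10)' '  Remote Index Filter: (k < 5)' \
  | rename_explain_fields
```

The same three substitutions can be applied to expected-output files in a test suite, for example by redirecting a file into the function.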
YCQL
- Now throws an error when using the unsupported GROUP BY clause in YCQL, with the autoflag ycql_suppress_group_by_error available for compatibility issues. #13956
DocDB
- Marks the unused rpc_queue_limit flag as deprecated in the latest releases. #20830
- Ensures flag validators are defined in the same source file as the flag for reliable initialization order. #20915
- Adds the ability to cancel the ScopeExit action for more efficient resource cleanup during successful function completions. #20595
- Allows the TServer and master memory allocation to automatically adjust based on the available node RAM. #20664
- Adds a log message to notify when a tablet cannot be moved from a blacklisted server due to replication factor constraints. #15624
- Reduces tablet overhead by eliminating unnecessary allocation of the WritableFileWriter in RocksDB's Write-Ahead Log (WAL). #7996
- Introduces a new macro to verify if specific fields in a protobuf message are set, enhancing error tracking. #20802
- Streamlines the process of registering master web server URL paths, making the code less prone to errors. #20858
- Enables all catalog entities to track and abort tasks, ensuring task termination when entities are deleted or master loses leadership. #20859
- Avoids potential deadlock by using xcluster safe time excluding the ddl_queue table when adding new tables. #21076
- Reduces log clutter by making generate_test_certificates.sh only output on failure. #20979
- Renames MultiStepTableTask to MultiStepCatalogEntityTask to ensure broader support for CatalogEntityWithTasks. #20982
- Simplifies use and readability of IsOperationDoneResult by relocating it to is_operation_done_result.h for broader usage. #21085
- Allows immediate halt and return of actual error when table creation fails, instead of waiting for a generic "Timed out" error. #17132
- Enables filtering of files based on hybrid time during conflict resolution to enhance performance and allows row number override in pg_single_tserver-test using a test gflag. #20666
- Allows IsOperationDoneResult to be accessed for non-xcluster usage, enhancing functionality. #21085
- Introduces a new limit on Prometheus metric entries to prevent server overwhelm when there are too many metrics. #18089
- Fixes a race condition on xClusterPoller shutdown to prevent unexpected error_map additions. #21134
- Aligns pointer to the left in arcanist_util/pre_commit_hook.sh for consistent styling. #21329
- Shifts async_client_initializer to the server and minimizes client libraries dependency on server_process, enhancing its proper operation within a server. #21337
- Relocates specific flags to rpc library and common_flags.cc from server_process, and removes FsManager usage in client library. #21338
library. #21338 - Allows the combination of SCHECK and STATUS_EC_FORMAT with the addition of SCHECK_EC_FORMAT. #21373
- Increases follower lag check to 1s * kTimeMultiplier in AreNodesSafeToTakeDown tests, minimizing test flakiness. #21247
CDC
- Preserves CDC stream even when all associated tables are dropped, tying its lifecycle to the database. #21419
- Adds a test to certify the safe_time set during GetChanges call, reducing data loss during network failures. Ensures consistent safe_hybrid_time in multiple GetChanges calls. #21240
- Allows modification of the publication refresh interval using the cdcsdk_publication_list_refresh_interval_secs flag. #21796
- Enables REPLICA IDENTITY syntax for altering table commands in YSQL, allowing control over CDC image information. #20143
- Allows separate creation functions for xcluster and cdcsdk, resolving an issue in stream creation. #20536
- Removes unused includes from CDC files, potentially reducing build times. #21235
- Introduces replica identity in CDC to populate before image records, allowing table-level before image information fetching and retaining in stream metadata. #21314
- Eliminates unnecessary NOTICE messages when setting yb_read_time from walsender, reducing message clutter. #22379
- Fixes CDCSDK flaky test by ensuring the write is persisted before reading to avoid race condition. #20491
- Enables transaction state to be cleared promptly after a table is deleted, preventing table deletion from getting stuck and resulting in faster functionality. #22095
yugabyted
- Allows DROP DATABASE query to accurately check for active connections before succeeding or failing. #20581
- Allows smooth execution of DML queries interspersed with relevant DDL queries by accurately handling unnamed prepared statements. #21367
- Allows setting of YSQL configuration parameters in various scenarios including SET, RESET, RESET ALL, and SET LOCAL using Connection Manager. #19989
- Updates the yugabyted-ui backend to align with changes in the connection manager stats consumed from the :13000/connections endpoint, catering for removal of pool_name and addition of database_name and user_name. #20494
- Allows a smooth restart of the second node in a cluster using the join flag without throwing any errors. #20684
- Runs point-in-time recovery operations for specific databases or keyspaces directly through the new configure point_in_time_recovery sub-command in yugabyted. #20493
- Retains the integrity of the user's custom configuration file by associating the config flag with the start command, and directs updates to a yugabyted generated file within the base_dir/conf directory. #20881
- Facilitates faster loading time for UI by incorporating a local cache of master/tserver addresses in the yugabyted-ui api server. #21181
- Allows separate counting of YSQL and YCQL connections when YSQL connection manager is active. #21182
- Enables a predefined set of gflags related to the pg-parity project using the enable_pg_parity flag in the yugabyted start command. #21221
- Enables parsing of the allowed_preview_flags_csv master flag when given using master_flags. #21364
- Changes the flag enable_pg_parity to enable_pg_parity_tech_preview for activating a predefined set of gflags related to the pg-parity project with the yugabyted start command. #21221
- Adds support for Prepare Statements via simple query protocol in Ysql Connection Manager, ensuring connection stickiness. #19601
Bug fixes
YSQL
- Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. #20396
- Reduces the likelihood of a CHECK failure when restarting a DDL statement in debug build. #20820
- Corrects the inaccurate detection of constants in distinct prefix computation during distinct index scans, ensuring reliable query results for batch nested loop joins. #20827
- Fixes a memory corruption issue that caused failure in creating a valid execution plan for SELECT DISTINCT queries. Enables successful execution of queries without errors and prevents server connection closures by disabling distinct pushdown. This fix improves the stability and effectiveness of SELECT DISTINCT queries. #20893
- Fixes table rewrite issue on non-colocated tables/matviews in colocated DB, ensuring the new table uses the original table's colocation setting. Includes a workaround for GH issue 20914. #20856
- Reduces excessive storage metric updates during EXPLAIN ANALYZE operation, enhancing performance by incorporating storage_metrics_version in YBCPgExecStats and YbInstrumentation. #20917
- Prevents simultaneous send of read and write operations in the same RPC request that could lead to inconsistent read results, by ensuring that, in case of multiple operations, all buffered ones are flushed first. #20864
- Returns accurate data by checking actual column type before fetching in libpq_utils' template functions. #20683
- Prevents YSQL upgrade failure from versions 2.16 to 2.21 by adding a 2-second delay if there's a breaking DDL statement. #20842
- Corrects the division by zero error occurring with certain queries when the yb_enable_base_scans_cost_model is activated and yb_fetch_size_limit is enforced by setting a fixed size for result width when it equals zero. #20892
- Catches and manages expected errors during concurrent DML & DDL operations on the same table. #20953
- Resolves PgMiniTest.CatalogVersionUpdateIfNeeded test failure in perdb catalog version mode. #20985
- Allows new-version DDL in an invalid per-db catalog version configuration during the trial phase, primarily for reversing unproductive upgrades. #20300
- Ensures successful CREATE INDEX operation during the upgrade to per-database catalog version mode, even before the execution of the YSQL migration script. #20300
- Transaction abort error is now considered an expected error in the TestPgDdlConcurrency.testModifiedTableWrite unit test. #21022
- Allows safer execution of DDL statements during the finalization phase of cluster upgrades, reducing risks of data inconsistencies. #21066
- Allows ModifyTable EXPLAIN statements to run as single-row transactions, decreasing latency. Also enables logging for transaction types when yb_debug_log_docdb_requests is enabled. #19604
- Adjusts heartbeat mechanism to shut down when an "Unknown Session" error occurs, reducing log alerts. This benefits idle connections with expired sessions. #21264
- Reduces PostgreSQL connection startup timeouts in geo-distributed clusters with a new wait_for_ysql_backends_catalog_version_master_tserver_rpc_timeout_ms GFlag, increasing the default timeout value to 60s from 30s. This alteration only impacts one specific RPC (WaitForYsqlBackendsCatalogVersion), not all RPCs, which should diminish time-out incidents. #18228
- Changes the index backfill timeout-related flags to lower the possibility of running into timeout-related failures, especially significant when working with YSQL. #10650
- Corrects the "create index" error by adjusting master's operation mode based on pg_yb_catalog_version table checks, ensuring accurate catalog version table mode. #21230
- Reduces delay during master leader changes and cluster startups by having the master wait out the lease period before responding to WaitForYsqlBackendsCatalogVersion requests. #21251
- Grants CREATE privilege on SCHEMA public to all users, enabling PgCatalogVersionTest.DBCatalogVersionGlobalDDL and PgCatalogVersionTest.DBCatalogVersionDisableGlobalDDL tests to pass in both PG11 and PG15. #21326
- Allows BNL's on outer and inner tables, even if the inner table has "unbatchable" join restrictions that can't accept batches of inputs, enhancing queries with complex join conditions. #21366
- Enabling the yb_enable_base_scans_cost_model flag triggers PG selectivity estimation and ignores the yb_enable_optimizer_statistics flag. #21368
- Sets LC_ALL environment variable to C.UTF-8 when running pgrep in yb-ctl, preventing failure due to UTF-8 characters in other processes' names. #21381
- Fixes seg faults in parallel index/indexonly queries with attributes exceeding those in the relation. #21427
- Corrects the scanning direction error for GiST index by verifying if the scan relation is a YB relation and applying NoMovement direction only in that case. #21435
- Corrects YbGate cleanup after errors to ensure proper functioning of tests and eliminates potential segmentation fault. Additionally, enhances error logging mechanism. #21180
YCQL
- Solves CQL check-failure issue for No wait state when using yb_enable_ash without altering the default flag value. #21136
- Allows the deletion of the Cassandra role in YCQLsh without it regenerating upon cluster restart, by adding a flag to mark if the role was previously created. #21057
DocDB
- Clears pending_deletes_ on failed delete tasks, thus preventing tablets from being incorrectly retained after task failure or completion. This rectifies a race condition and allows the Load Balancer to perform operations on specific tablets and Tablet Servers. #13156
- Allows users to specify Gzip stream compression levels, enhancing file fetching speed and RPC performance. #20848
- Ensures Create Table operation fails if Alter Replication encounters an error, enhancing the reliability of replication setup. #21732
- Converts ysql_skip_row_lock_for_update to an auto-flag to resolve compatibility issues during upgrade, preventing incorrect DB record creations that can affect row visibility and integrity. #22057
- Fixes a timeout issue when flushing tablets by handling failed RPC call responses. #20948
- Modifies memory consumption calculations for pending operations to ensure accurate rejection of new writes at bootstrap, preventing loading failures. #21254
- Trims large error messages in AsyncRpc::Failed to prevent hitting memory limit and resulting unavailability. #21402
- Renames and updates the description of the gflag min_segment_size_to_rollover_at_flush for clarity. #21691
- Changes the class of enable_automatic_tablet_splitting gflag from kLocalPersisted (class 2) to kExternal (class 4) to eliminate setup issues with XCluster configurations. #22088
- Allows DML operations on target cluster databases not involved in xCluster replication STANDBY mode. #21245
- Eliminates duplication of the colocation parent table in snapshots created by schedules. #20541
- Enables optional "INCLUDE_NONRUNNING" flag to list all namespaces in yb-admin, aiding in debugging. #20331
- Reduces build failure chances on MacOS by modifying generate_test_certificates.sh to employ third-party openssl instead of system's openssl. #20764
- Allows packing multiple values in Postgres layer for direct insertion into DocDB, reducing insert time and duplication with the use of the ysql_pack_inserted_value gflag. #20713
- Erases errors from the altered universe after merging it back into the original one. #20789
- Fixes test failure by updating the test after adding new tablet_id option to db_options. #20975
- Enables lightweight profiling for identifying and timing slow-performing function call sites using the enable_callsite_profile and enable_callsite_profile_timing flags. #21008
flags. #21008 - Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. #20912
- Allows database drop operations to proceed smoothly by ignoring missing streams errors and skipping replication checks for already dropped tables. #21070
- Switches all builds, excluding ASAN, to Clang 17 while updating the default compiler type selection logic. #21077
- Allows ListTabletServers to handle heartbeats older than 24 days by adjusting the setting to the maximum int32 value, avoiding system crash. #21096
- Switches the ASAN build type to Clang 17, resolves its issues, and now supports Clang 18 compiler type. #21077
- Adds a test to detect missed conflicts with index-only scans from concurrent transactions on non-unique indexes. #20486
- Includes the indexed_table_id with the index in table listings, eliminating the need for a second lookup to associate a main table with an index. #21159
- Ensures only missing replicas are marked as over-replicated, avoiding the incorrect removal of tablet replicas. #21135
- Activates the wait_states-itest for kBackfillIndex_WaitForAFreeSlot. #21239
- Allows DML operations on non-replicated databases and blocks DML only on databases in transactional xCluster replication STANDBY mode. Now only databases part of an inbound transactional xCluster replication group in the xCluster safe time map will have DML operations blocked. Also, certain attributes are moved from tserver to TserverXClusterContext. #21245
- Enables logging stack traces during call site profiling for identifying frequent callers of hot spots. #21305
- Early aborts transactions that fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. #21328
- Corrects block cache metrics discrepancy by ensuring Statistics object passes into LRUCache from TableCache for accurate updates. #21407
- Enables submitting multiple tasks to a thread sub-pool and waiting for all tasks to complete without enforcing sequential execution. #21344
- Disables CppCassandraDriverTest.BatchWriteDuringSoftMemoryLimit to prevent test Spark job cancellations. #21459
- Fixes a segmentation fault in yb-master by checking for a null pointer before dereferencing it, addressing an issue in the CDC run on 2.23.0.0-b37-arm. #21648
- Adds a TSAN suppression to manage the apparent race condition in the function boost::regex_match. #21585
- Eliminates potential FATAL errors during reported tabletPB creation by ensuring retrieval of schema version is atomic. #21340
- Enables the session to outlive the callback by holding a shared pointer to it, preventing potential crashes during concurrent DML queries. #21103
- Prevents yb-master crash by ensuring background task isn't deleted before the callback is invoked. #21773
- Corrects the ClientTest.TestCreateTableWithRangePartition by letting the system select the suitable namespace ID. #21827
- Enables callback completion wait in PollTransactionStatusBase during shutdown to prevent unexpected process termination. #21773
- Allows viewing of the rpc bind addresses in the master leader UI, especially beneficial in cases like k8s where the rpc bind address with the pod DNS is more useful than the broadcast address. #21959
- Reduces unnecessary logging during checkpoint operations by lowering INFO level logs to DEBUG_LEVEL, enhancing log readability. #21658
- Prevents fatal errors by skipping ReserveMarker/AsyncAppend if the tablet peer has already been shut down. #21769
- Enhances YSQL operation by refining task shutdown procedures and avoiding unnecessary task aborts. #21917
- Enhances load balancer efficiency by refining validation logic to block tablet replica additions only for those with a pending delete in progress on the same server, avoiding potential slowdowns during mass tablet replica moves. #21806
- Avoids multiple destruction of the same database connection, preventing system crashes due to simultaneous connection failures. #21738
- Stops fatal errors caused by the re-use of remote log anchor session during remote bootstrap from a non-leader peer. This fix ensures shared pointers are accurately tracked for tablet_peer objects using the = operator, preventing unintentional destruction of underlying objects. #22007
- Corrects a bug causing some tablet metrics to display incorrect metric_type attribute. #21608
- Enables the skip_table_tombstone_check for colocated tables to prevent errors. #22115
- Initializes prev_op to UNKNOWN to prevent AlmaLinux 8 fastdebug gcc11 compilation failures. #21811
- Delays min_running_ht initialization until after the successful completion of tablet bootstrap to prevent unexpected behaviors. #22099
- Resolves the issue of pg_locks query failure due to missing host node UUID in distributed transactions. #22181
- Eliminates latency spikes in conflicting workloads by preventing redundant ProbeTransactionDeadlock rpcs. #22426
- Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. #8744
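The "24 days" in the ListTabletServers fix above (#21096) lines up with the range of a signed 32-bit integer if the heartbeat age is counted in milliseconds. That unit is an assumption here; the release note does not state it. A minimal sketch of the arithmetic, with a hypothetical function name:

```shell
# Assumption: the heartbeat age is held in milliseconds in a signed 32-bit
# integer. int32_ms_horizon is a hypothetical name used for illustration.
int32_ms_horizon() {
  int32_max=2147483647                      # largest signed 32-bit value
  ms_per_hour=$((60 * 60 * 1000))
  ms_per_day=$((24 * ms_per_hour))
  days=$((int32_max / ms_per_day))          # whole days that fit
  hours=$(( (int32_max % ms_per_day) / ms_per_hour ))
  echo "${days}d ${hours}h"
}

int32_ms_horizon   # prints: 24d 20h
```

Under that assumption, any age beyond roughly 24 days and 20 hours cannot be represented, which is consistent with capping the value at the int32 maximum.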
CDC
- Fixes issue with CDC packed rows, now ensures a single record for large insert operations, providing consistent data regardless of row size. #20310
- Introduces a fix for data loss issue caused by faulty update of cdc_sdk_safe_time during explicit checkpointing, along with tests to ensure validity. #15718
- Fixes a NullPointerException in yb-client by adding a check for null in the partitionKey before calling getTablets. #20636
- Resolves the issue of sending empty batches after a failed attempt to add a column on ALTER TABLE. #20871
- Enables filtering out duplicate DDL operations when ysql_ddl_rollback_enabled flag is set to true. #20989
- Reduces replication mismatches and RPC call failures by triggering RPC to random tablet with active tservers. #20717
- Integrates retry logic over FlushTables calls in test to prevent test run failures due to timing out issues. #20778
- Adds retry logic to avoid race condition in TestModifyPrimaryKeyBeforeImage by ensuring historical_max_op_id is updated before calling GetChanges RPC. #20779
- Fixes a memory leak in the walsender process by deep freeing the cached record batch after streaming to the client. #21530
- Adds more debug logs in the walsender to aid in investigating issues like linked data loss. #21465
- Reduces risk of segmentation fault during tablet split tests by safely handling null tablet peers. #21723
- Adds more debug logs for stress run debugging, skips RollbackToSubTransaction RPC to local tserver if not needed, and enhances debugging of the ListReplicationSlots function. #21780, #21519, #21652
- Fixes flaky tests to ensure proper response return when getting consistent changes and stable table addition after stream. #22068
- Removes table level attributes from CDCSDK metrics to avoid tserver crash due to failed DCHECK assertion. #22142
- Fixes the segmentation fault in walsender for dynamic table addition by refreshing stored replica identities and preventing a race condition when creating dynamic tables. #22273
- Solves an issue where CDCSDK incorrectly deduces tablets as not interesting for stream before reaching the configured time limit. #22383
- Assigns the correct "cdc_sdk_safe_time" for child tablets after a tablet split, preventing unintentional barriers or compactions. #20429
- Enhances logging for memory pressure rejections by including blocker memory tracker details and rejected memory requirement. #20776
- Cuts down the number of insert batches and inserts per batch in TestCDCSDKConsistentStreamWithTabletSplit and mends a data race issue. #21315
- Enables support for streaming update operations via Walsender, enhancing PG compatible logical replication support. Now executes schema changes in the logical replication protocol and maintains a record of changes in each table's read_time_ht hybrid time in the PG catalog. Includes handling late ALTER TABLE responses and addressing incomplete cleanup in the case of a stream creation failure. This feature is disabled under test flag ysql_TEST_enable_replication_slot_consumption. #20725
- Prevents failures in decoding change events by refreshing cached_schema_details when executing a new GetChanges request if the client indicates a necessity for the schema. #20698
yugabyted
v2.21.0.1 - May 17, 2024
Build: 2.21.0.1-b1
Downloads
Docker:
docker pull yugabytedb/yugabyte:2.21.0.1-b1
Bug fix
DocDB
Converted the ysql_skip_row_lock_for_update to an auto-flag to resolve compatibility issues during upgrade, preventing incorrect DB record creations that can affect row visibility and integrity.
v2.21.0.0 - March 26, 2024
Download
Use 2.21.0.1
Highlights
Enhanced Postgres Compatibility Mode TP
We're pleased to announce the tech preview of the new Enhanced Postgres Compatibility Mode in the 2.21.0.0 release. This mode enables you to take advantage of many new improvements in both PostgreSQL compatibility and performance parity, making it even easier to lift and shift your applications from PostgreSQL to YugabyteDB. When this mode is turned on, YugabyteDB uses the Read-Committed isolation mode, the Wait-on-Conflict concurrency mode for predictable P99 latencies, and the new Cost Based Optimizer that takes advantage of the distributed storage layer architecture and includes query pushdowns, LSM indexes, and batched nested loop joins to offer PostgreSQL-like performance.
You can enable the compatibility mode by passing the enable_pg_parity_tech_preview flag to yugabyted when bringing up your cluster.
For example, from your YugabyteDB home directory, run the following command:
./bin/yugabyted start --enable_pg_parity_tech_preview
Note: When enabling the cost models, ensure that packed row for colocated tables is enabled by setting the --ysql_enable_packed_row_for_colocated_table flag to true.
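One way to combine the compatibility mode with the packed row setting from the note is sketched below. It assumes that the packed row flag is a tserver flag and that it is passed through yugabyted's --tserver_flags option as a name=value pair; verify both against your version before relying on it. The helper name pg_parity_start_cmd is hypothetical, and the function only prints the command so it can be reviewed first.

```shell
# Assumption: ysql_enable_packed_row_for_colocated_table is a tserver flag
# and is passed via --tserver_flags as name=value.
# pg_parity_start_cmd is a hypothetical helper; it prints, it does not run.
pg_parity_start_cmd() {
  local tserver_flags="ysql_enable_packed_row_for_colocated_table=true"
  printf './bin/yugabyted start --enable_pg_parity_tech_preview --tserver_flags "%s"\n' \
    "$tserver_flags"
}

# Dry run: show the command for review.
pg_parity_start_cmd
```

After reviewing the printed command, run it from your YugabyteDB home directory.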
New YugabyteDB Kubernetes Operator
A preliminary version of the completely rewritten YugabyteDB Kubernetes Operator is available in Tech Preview. The new operator automates the deployment, scaling, and management of YugabyteDB clusters in Kubernetes environments. It streamlines database operations, reducing manual effort for developers and operators.
For more information, refer to the YugabyteDB Kubernetes Operator GitHub project.
New features
- New Kubernetes Operator. Automated deployment and management of clusters via the Kubernetes operator pattern. Includes support for YugabyteDB universes as a Kubernetes custom resource. Backup, upgrade, scale-out, scale-in, and more are possible on this Kubernetes custom resource. TP
- YSQL: DDL concurrency. Support for isolating DDLs per database. Specifically, a DDL in one database does not cause catalog cache refreshes or abort transactions due to a breaking change in another database. TP
- YSQL: DDL atomicity. Ensures that YSQL DDLs are fully atomic between YSQL and DocDB layers; that is, in case of any errors, they are fully rolled back, and in case of success, they are applied fully. Currently, such inconsistencies are rare but can happen. TP
- YSQL: Lower latency for large scans with size-based fetching. A static size based fetch limit value to control how many rows can be returned in one request from DocDB. TP
- YSQL: ALTER TABLE support. TP Adds support for the following variants of ALTER TABLE ADD COLUMN:
  - with a SERIAL data type
  - with a volatile DEFAULT
  - with a PRIMARY KEY
- yugabyted
  - Docker-based deployments. Improves the yugabyted Docker user experience for RF-3 deployments and docker container/host restarts. EA
  - Set preferred regions. The preferred region handles all read and write requests from clients. Use the yugabyted configure data_placement command to specify preferred regions for clusters. EA
  - Backup and restore support in yugabyted. yugabyted now supports backup and restore of databases and keyspaces. You can also upload backups to public clouds, including AWS and GCP. TP
Change log
View the detailed changelog
Improvements
YSQL
- Offers consistent, specific deadlock error reporting regardless of when a transaction realizes its aborted state, through in-memory storage of recently deadlocked transaction information. #18384, #14114
- Introduces a new model for estimating DocDB seek and next operations, enhancing the accuracy of cost calculations for index lookups, especially when various types of index filters are applied. #19354
- Modifies the BNL costing model to charge for unmatched outer tuples in semi/anti/inner unique joins, enhancing the accuracy of join ordering for efficient query execution. #19054
- Introduces a new flag index_scan_prefer_sequential_scan_for_boundary_condition that potentially enhances speed in range-sharded databases by utilizing sequential scan over Local Skip scan under specified conditions. #16178
- Allows testing of seek and next estimations through added Java tests, guarding against potential regressions. #19082
- Corrects the computation of semi/anti join factors for inner unique joins, addressing a bug in the costing code that incorrectly estimated the fraction of outer join tuples having a match. This adjustment enhances the accuracy of join clause selectivity computations, enhancing the database's performance. Additionally, fixes a bug in the final_cost_nestloop where outer_matched_rows were inaccurately set as 0, thus improving query estimation and execution. #19021
- Reintroduces the use of Local Skip scan for index scanning with primary key filters in range sharded databases, reversing a previous change due to identified correctness issues. #16178
- Alters the YSQLDump to generate CREATE INDEX NONCONCURRENTLY instead of CREATE INDEX, preventing automated index back-filling in the backup-restore, thereby accelerating the process. #19457
- Mitigates CVE-2023-39417 by incorporating an upstream Postgres commit from REL_11_STABLE, which prevents the substitution of extension schemas or owners matching ["$']. #14419
- Offers quick regression tests for CBO using the cbo_stat_dump and cbo_stat_load tools, enhancing developer productivity and performance feedback by rapidly validating CBO changes through the TAQO framework. #19657
- Ensures Row Level Security (RLS) policy remains intact during table rewrite by accurately copying both relrowsecurity and relforcerowsecurity fields. #19815
- Sets the tuple count to 1000 for all tables appearing empty or unanalyzed when yb_enable_optimizer_statistics is true, improving Cost-Based Optimizer's query plan selection. #16825
- Imports upstream postgres commit from REL_11_STABLE as a preventive measure for future support of DEPENDS ON EXTENSION for objects like FUNCTION, PROCEDURE, etc, mitigating potential risks like CVE-2020-1720 and CVE-2023-39417. #14419
- Introduces sorting abilities to BNL nodes, matching their sorting properties to that of other joins, with a GUC flag yb_bnl_optimize_first_batch controlling it, enhancing performance especially in presence of small LIMIT clauses. #19589
- Enables tracking and aggregating of table mutation counts at the cluster level by sending the counts to an auto-analyze service, easing automatic triggering of ANALYZE when mutation thresholds exceed. #15670
- Ensures response cache invalidation when temporary tables are discarded without altering the catalog version, avoiding discrepancies while utilizing the advantages of session-bound modifications. #19178
- Includes MyDatabaseId in the T-server cache key to resolve stale shared relation issues as a result of different databases sharing T-server cache entries. #19363
- Streamlines YSQL DDL functionality by replacing the IsTransactionalDdlStatement function with the YbGetDdlMode function, offering more cohesiveness through enums instead of booleans for significant DDL modes while enabling easier addition of new modes. #19178
- Enables the upgrade to OpenSSL 3.0+ by importing the upstream PostgreSQL commit Disable OpenSSL EVP digest padding in pgcrypto. #19733
- Enables importing of the upstream PG commit, preparing the platform for OpenSSL 3.0+ upgrades. #19734
- Blocks the use of advisory locks in YSQL and responds to the external client with an error message when they are requested. #18954
- Imports the pgcrypto: Check for error return of px_cipher_decrypt upstream PG commit essential for upgrading OpenSSL to 3.0+. #19732
- Adjusts the webserver's Out Of Memory (OOM) score through the yb_webserver_oom_score_adj flag (default 900) to prevent unnecessary shutdowns while allowing quick termination if it starts consuming excessive memory. #20028
- Sets yb_bnl_batch_size to 1024 and yb_prefer_bnl to true by default, ensuring BNLs replace nested loop joins without altering non-NL join plans. #19273
- Replaces remaining unnecessary scans of the pg_inherits table with cache lookups, reducing wasteful calls to the YB-Master and optimizing DDL operations. Fixes a structuring bug in the INHERITSRELID cache for better future compatibility. #10478
- Enables READ COMMITTED isolation by default in debug builds, eliminates setting a transaction to READ ONLY via pg_hint_plan, and updates certain tests to instead run explicitly in REPEATABLE READ. #18462
- Introduces a new flag, ysql_use_relcache_file, to control the use of relcache init file, helping regulate Postgres backend memory usage, and modify unpredicted system table preloading, reducing overall memory usage. #19226
- Introduces asynchronous support for ALTER INDEX SET TABLESPACE, ALTER INDEX ALTER COLUMN SET STATISTICS, CREATE MATERIALIZED VIEW with TABLESPACE, and ALTER MATERIALIZED VIEW SET TABLESPACE, enhancing database flexibility, with a traceable warning for beta features that can be muted by adjusting the ysql_beta_feature_tablespace_alteration flag to true. #6639
- Changes the default unit for the yb_fetch_size_limit to bytes from kilobytes, allowing a size limit setting to non-integer kilobyte values, enhancing query performance during upgrades. #18522
- Enables Postgres' parallel query feature and implements parallel scan of YB tables in YBSeqScan, IndexScan, and IndexOnlyScan nodes, resulting in potentially faster query results. #18095
- Replaces outdated PGConn Fetch* functions with more robust versions for improved database testing, now supporting additional BasePGType and OptionalPGType elements. #19906
- Prevents creation of index with TABLESPACE on a temporary table, averting client hangups and displaying an error message: ERROR: cannot set tablespace for temporary index instead. #19368
- Offers more context to the wait states in tserver layer by adding Active Session History (ASH) metadata to Perform RPCs, providing insights for PGPROC and ASH collectors. Updates yb_enable_ash GFlag and assures upgrade/downgrade safety. #19135
- Reduces contention and potential deadlock risk during the execution of pg_stat_activity request by introducing a transaction cache at the t-server, which stores the active sessions and their transaction mapping. This allows the request to access the cache under a shared lock, alleviating the need for an exclusive lock. #18711
- Resolves the record type not registered error that appeared when retrieving fieldnames for batched index condition expressions in YB Batched Nested Loop through bypassing fieldname resolution for indecipherable batched expressions. #19094
- Trims unnecessary master RPC calls during connection initialization by removing YB_YQL_PREFETCHER_NO_CACHE enum value and introducing YBCStartSysTablePrefetchingNoCache function. #19304
- Enables the PgIndexBackfillTest.NoAbortTxn C++ test for explicit flag setting, increasing its resilience against any default changes in YSQL backend manager flags. #19351
- Strengthens PgIndexBackfillTest.NoAbortTxn and other tests to endure potential YSQL backends manager flags' default value alterations, thereby boosting resilience. #19351
- Enables unified server functionality following process termination by resorting to restarting the postmaster for a crashed or killed Postgres backend, contributing to simplicity and fewer bugs. #19180
- Resolves an issue with RowCompareExpression bindings that previously led to incorrect results and occasional crashes in YbBindScanKeys by accounting for unique PgGate request conditions. #19384
- Reduces unnecessary error logs related to tablespace during initdb by checking the FLAGS_create_initial_sys_catalog_snapshot before initiating the tablespace refresh task. #19386
- Eliminates unnecessary error logs during initdb bootstrap process by checking for the existence of pg_yb_tablegroup catalog only in non-bootstrap mode. #19387
- Enhances read committed isolation by enabling each statement to pick a read time on docdb when possible, ensuring more efficient operations and adding a test for this functionality. #19397
- Removes the TransactionCache class shifting session's transactions' information closer to the session in the SessionInfo structure, averting a potential deadlock scenario by ensuring smoother test execution when per-database catalog version mode is activated. #18711
- Corrects the handling of RowCompareExpression bindings in YbBindScanKeys to prevent inaccurate results and potential system crashes. #19384
- Launches the yb_auh extension, building the foundation for the Active Universe History project with a circular buffer for wait events storage and a background worker for local tserver and PG backends polling. New Gflags are introduced: enable_yb_auh, yb_auh.circular_buffer_size, yb_auh.sampling_interval, and yb_auh.sample_size. Default settings are disabled, 16 MB, 1000 ms, and 500, respectively. #19127
- Adds pg_hint_plan syntax and functionality to control batched nested loop joins, allows setting hints YbBatchedNL(t1 t2) and NoYbBatchedNL, and modifies yb_prefer_bnl handling. Also, it removes BNL's dependency on enable_nestloop and adjusts cost model. #19494
- Enables the modification of is_single_row_txn for finer control over non-transactional writes required by COPY, index backfill, or when yb_disable_transactional_writes is set, preventing issues during non-bufferable operations for single row transactions. #4906
- Introduces a new PG function yb_active_session_history_internal and a corresponding view yb_active_session_history for easier querying, which require the Gflag TEST_yb_enable_ash to be enabled; errors will occur otherwise. #19128
- Enables fetching of ASH samples from all PG processes, excluding prepared transactions, background workers, and backends without set ASH metadata, using a newly-created Postgres backend. #19129
- Introduces a NOTICE for potentially unsafe ALTER TABLE operations (such as altering primary key, altering type), ensuring users are aware of the risks. To suppress this notice, adjust the ysql_suppress_unsafe_alter_notice gflag to true. #19360
- Adds a new column with both a NOT NULL constraint and a non-volatile DEFAULT value without needing a table scan, leading to faster YSQL Alter Table operations. No table scan is needed as all existing rows will use the non-volatile DEFAULT value in their new column, reducing constraint violation checks time. #19355
- Simplifies the code in the pg_dml_read file by replacing the DocKeyBuilder helper class with a function and switches from using an arena array to boost::small_vector. #19685
- Enables an alternative table rewrite approach that only drops and recreates associated DocDB tables and indexes, using the relfilenode field to map a PostgreSQL table OID to the respective DocDB table, resulting in a more efficient way to perform operations such as ALTER TYPE and ADD/DROP primary key. #4034
- Allows ordered index scans with IN conditions on a lower column, ensuring accurate result order for YB LSM indexes, and generalizes the fix to all such indexes. #19576
- Enables PgClientServiceImpl to periodically clear its own reserved_oids_map_, enhancing database cleaning and eliminating reliance on TabletServer for scheduling. #19916
- Optimizes scans not requiring certain row order by allowing parallel scans of multiple partitions and secondary index scans, potentially altering the output row order in some queries without the ORDER BY clause. #13737
- Replaces deprecated FetchValue with FetchRow, simplifying changes and fixing indentation issues in ‘pg_mini-’ without modifying formatting in other areas. #19918
- Renames the term Active Universe History to Active Session History for enhanced comprehension. #19948
- Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution for users to avoid disruption when using advisory locks without actual lock acquisition. #19974
- Marks the ysql_enable_read_request_caching GFlag as non-runtime since Postgres flags, except PG_FLAGs, cannot be dynamically updated, enhancing cache configuration consistency. #19983
- Adds a configuration option for altering default key sorting from HASH to ASC in YSQL, facilitating smoother PostgreSQL migrations and efficiently using indexes with ASC sorting, especially for inequality and ORDER BY clause queries. #19937
- Reworks the wait event format in YSQL and ASH to match the Postgres format, enhancing compatibility and simplifying association of wait events. #19130
- Enables the start and end of wait events in the PGGate layer through a callback, introducing a new Flusher class, which returns a FlushFuture object providing an updated wait event and flush request duration. #19137, #20022
- Enables the pushdown of aggregates where the split is AGGSPLIT_INITIAL_SERIAL, thereby effectively forwarding phase 1 results from YB scan to a higher level, labeled as "Noop Aggregate". #19839
- Enables ALTER TABLE rewrite commands, adding support for ALTER TABLE ADD COLUMN operations and modernizing REINDEX implementation for end-user indexes. #19563
- Enables ignoring already existing tablespaces during YSQL DB backup-restore process with the newly added flag ignore_existing_tablespaces in the yb_backup.py script. #20334
- Adjusts preload settings to allow users to specify additional tables in the ysql_catalog_preload_additional_table_list without forcing preloading of default tables. #20290
- Adds Storage Row statistics to the EXPLAIN (ANALYZE,DIST) output, enabling users to distinguish between work done by the storage layer and the query layer and understand the selectivity of remote filters and index conditions. #12676
- Reworks TID expectations in index scans for more clarity and convenience by sidelining the use of TID t_self or t_ybctid and ensuring the setting of either yb_agg_slot, xs_hitup, or xs_itup. #20373
- Refactors IndexScanDesc yb_agg_slot to prevent setting during non-pushdown cases and eliminates return value from ybFetchNext for unnecessary instances, preventing future misuse. #20371
- Replaces existing retry attempt flags ysql_max_read_restart_attempts and ysql_max_write_restart_attempts with a unified GUC variable yb_max_query_layer_retries to control retries in all isolation levels including Read Committed, with default reset to 60 retries. Defaults for retry_backoff_multiplier and retry_min_backoff adjusted to 1.2 and 10ms respectively. #20359
- Centralizes all code for creating internal PostgreSQL connections, simplifying usage in ysql_upgrade, ysql index backfill, WaitForYsqlBackendsCatalogVersion and ddl replication. Now utilizes the detailed error message from PGConn::Connect. #20655
- Revamps the ToString function to create unique responses for optional types (std/boost::optional), enhancing log readability and data relevance. #20719
- Adds a new GUC yb_explain_hide_non_deterministic_fields to remove non-deterministic fields from EXPLAIN ANALYZE's output, reducing flakiness between runs in pg_regress tests. #19492
- Corrects formatting errors in the pg_stat_get_activity function, aligns variable names, adds the yb_ prefix to txn_rpc_timestamp, and applies column indexing based on the PG_STAT_GET_ACTIVITY_COLS macro. #20281
- Relocates Unknown Session Unit Test to pg_libpq, renaming it from PgBackendsTestSessionExpire to PgBackendsSessionExpireTest for convention conformity, enhancing testing protocol. #20545
YCQL
- Introduces an UpdateMapRemoveKey API, enabling the removal of specific keys from a Map, leaving all other keys unaffected. #19829
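The semantics described for UpdateMapRemoveKey can be illustrated with a plain Python dictionary. This sketch does not call YCQL or any driver; remove_map_keys and the sample data are hypothetical and only model the behavior of removing chosen keys while leaving the rest untouched.

```python
def remove_map_keys(current, keys_to_remove):
    """Return a copy of the map without the given keys; other entries are unchanged."""
    doomed = set(keys_to_remove)
    return {key: value for key, value in current.items() if key not in doomed}


settings = {"theme": "dark", "lang": "en", "tz": "UTC"}
updated = remove_map_keys(settings, ["lang"])
print(updated)
```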
DocDB
- Introduces yb_read_time GUC variable, usable by superusers to query the database at a specific point in time in the past, specifically aiding backup and restore scenarios. This variable helps generate a database schema of a specific past point using ysql_dump. Make sure it's not set before a DDL operation or during it. Default value is 0, meaning the data is read in real-time, while setting a Unix timestamp (in microseconds) allows reading data as of that time. #19114
- Accelerates rollback and downgrade processes by introducing capability to demote AutoFlags, offering enhanced control over rollback version and emergency repair functionality with new yb-admin commands. #13686
- Enables tracking of active WriteQuery objects and outstanding transaction status RPC requests at the tablet level for easier debugging. #18940
- Introduces an /xcluster UI page for yb-tserver to track real-time statuses of xCluster source streams and target pollers with a capability to reset data following a restart. Also features sorting and a search box for easier navigation. #19203
- Introduces a read-time flag in ysql_dump, offering a way to dump the database schema as of a specific point in time, improving backup restoration capabilities. #19258
- Enhances timeout handling for YCQL index scans to avert overruns, resulting in less log spew, ensuring index tablet scans do not timeout prematurely at the YCQLProxy/YBClient side, and eliminating unnecessary repeated master leader requests. #19221
- Reduces chances of transaction deadlocks and improves fairness in read committed isolation by modifying the order of transactions resumption across all tablets based on xactStartTimestamp. #18055
- Switches the data transfer rate on the tserver UI from MiBps to KiBps for enhanced precision, considering the typical tablet data transfer range. #19203
- Reduces tablet shutdown issues and delayed database operations by addressing a bug causing unnecessary blockage in clearing the ResumedWaiterRunner queue during WaitQueue shutdown. #19272
- Offers redesigned server level aggregation for metrics, thus introducing more metrics for enhanced debugging. Removes several unused URL parameters and makes the new output compatible with YugabyteDB Anywhere and YugabyteDB Aeon, preventing double-counting issues in charts. Drops unused Json and Prometheus callbacks from MetricEntity for a cleaner design. #18078
- Replaces glog includes with yb/util, introducing yb VLOG macros for clearer differentiation between INFO and VERBOSE logs, while addressing issues of duplicate includes. #15273
- Adjusts the verbose level for VLOG macros to help differentiate between INFO and VERBOSE logs, fostering ease in debugging and analysis with better log filtration. #15273
- Aligns retryable request timeouts with respective YCQL and YSQL client write timeouts, thus reducing unnecessary log replay during YCQL tablet bootstrap. #18736
- Eliminates duplicate includes from specific files, providing clearer differentiation between INFO and VERBOSE logs for enhanced user debugging experience. #15273
- Enables a retry mechanism for acquiring shared in-memory locks from the wait-queue during waiter resumption to respect client/statement timeout, reducing request failures and associated latency in contentious workloads. #19032, #19859
- Accelerates TServer Init by handling deleted and tombstoned tablets asynchronously on startup, therefore, enabling the quick starting of the RPC port. Introduces a new flag num_open_tablets_metadata_simultaneously to set the number of threads for opening tablets' metadata during startup, enhancing the startup time. The modification also takes steps towards deleting the superblock in DeletedTablet. #15088
- Introduces automatic recovery of index tables affected by a bug, effectively preventing performance degradation and disk size leak by ensuring that tombstones are properly filtered out by compactions once index backfilling is complete. #19731
- Adds a 10s delay between an AutoFlag config update and its application, ensuring all tservers have the new config before any AutoFlags switch and begin producing new data. Guarantees process continuity by temporarily holding back new configs if the process restarts during apply time. #19932
- Parallelizes the RPCs made during the DoGetLockStatus process in pg_client_service.cc to expedite fetching locks, enhancing database performance. #18034
- Introduces support for upgrade and rollback of universes with xCluster links, checking AutoFlag compatibility during configuration changes. Includes error handling and broadcasting of AutoFlag config changes. The aim of these changes is to ensure that the target universe has the superset of specific AutoFlags. #19518
- Enables logging of all instances of tablet metadata creation/updating, providing additional insights in case of tablet server startup crashes due to multiple meta records for the same tablet. #20042
- Introduces a new get_auto_flags_config yb-admin command to retrieve the current AutoFlags configuration, aiding in debugging xCluster replication failures. #20046
- Enhances pg_locks by including results from Single Shard transactions that previously went untracked, enabling users to query these transactions. During upgrades or downgrades to version 2024.1 and above, pg_locks queries may fail due to nodes lacking the newly implemented GetOldSingleShardWaiters service method. #18195
- Expands load balancer metrics by incorporating tablets_in_wrong_placement, blacklisted_leaders, and tablet_load_variance, enhancing the tracking of load balancer progress. #20118
- Adds new regular expression filters to the Prometheus metric endpoint by creating a distinct API for YugabyteDB Anywhere, offering server-level aggregation for tablet and table metrics. Users should add version=v2 to the URL for enabling this feature, granting control over metric output filters and determining the scope of metric aggregation effectively. #19943
- Limits the number of rows returned per transaction per tablet in pg_locks to avoid potential memory issues during batch inserts, and includes additional fields to indicate partial lock info. #20765
- Introduces a new GUC yb_locks_txn_locks_per_tablet to limit the number of rows returned by pg_locks, preventing the system from running out of memory during large transactions. #19934
- Allows for the check of zero bytes at the end of SST data files, and enables an error report with the number of zeros once the flag rocksdb_check_sst_file_tail_for_zeros is set to a positive value. #19691
- Boosts the bootstrap process by reading entries from the offset of the last flushed operation id instead of the segment's beginning, significantly reducing unnecessary reading. For colocated tables, it enforces the replaying of at least two segments when the lazy_flush_superblock is enabled. #18312
- Prevents tservers from communicating with master leaders in different universe clusters, averting possible data loss, by introducing a new universe_uuid field and an autoflag master_enable_universe_uuid_heartbeat_check to manage the tserver heartbeat checks. #17904
- Rejects ConfigChange requests for system catalog while another server is transitioning, preventing potential data loss from mistaken quorum formation by new peers. #18335
- Enables tracing of UpdateConsensus API by activating the collect_update_consensus_traces flag, offering visibility into remote follower traces and adding trace messages to local logs. The feature ensures upgrade/rollback safety and impacts the leader and follower only if both incorporate the change. #19417
- Introduces the rocksdb_max_sst_write_retries flag to set the number of retry attempts if corruption is detected when writing SST file, affecting both flushes and compactions. #19730
- Safeguards the master_join_existing_universe flag to prevent unnecessary initial sys catalog snapshot restoration. #19357
- Adds a retry mechanism on block checksum mismatches and enhances error logging for better identification of transient read errors. #20102
- Refines error messages on block checksum failure by including a retry scheme and logging on success or failure, offering better error tracking. #20102
- Adds a URL parameter, show_help, to the scrape endpoint, enabling control over display of help and metadata information, overriding the export_help_and_type_in_prometheus_metrics GFlag. #19176
- Renames AsyncClientInitialiser to AsyncClientInitializer for consistency in naming conventions. #19920
- Introduces flags tablet_replicas_per_gib_limit, tablet_replicas_per_core_limit, and tablet_overhead_size_percentage to customize tablet replication based on cluster resources, enhancing user control over system load balance. #16177
- Introduces a new script, analyze_test_results.py, to reconcile discrepancies between Spark-based test runner and JUnit-compatible XML test reports, offering more accurate and reliable test results. #18594
- Allows for YSQL parallel scans by breaking table tablets keyspaces into ranges of similar data size for efficient scanning time. #19341
- Reduces unwanted logging in LogAfterLoad when a single 0 version is loaded, thus minimizing unnecessary log generation especially when managing many YSQL databases. #18489
- Introduces AreNodesSafeToTakeDown API that ensures safe node removals during cluster upgrades or maintenance operations by checking tablet health and follower lag, facilitating seamless and risk-free updates. #17562
- Adds a show-changes command to the sys-catalog-tool to search and provide details of all updated entries marked as ADD, CHANGE, or REMOVE. This needs to be run before update to validate the expected changes in the SysCatalog JSON file. Notably, this command exclusively interacts with the file, without reading or writing to the SysCatalog. #18800
- Enhances the TCMalloc heap snapshot functionality with additional columns for estimated bytes and samples count from a call stack, allowing direct comparison with the total system memory and accurate proxy for memory usage. #19071
- Tracks and batches updates for rocksdb and tablet-level event stats metrics, distinguishing between counter and gauge metrics, and exposing them in EXPLAIN (ANALYZE, DIST, DEBUG) and tracing. #16785
- Adopts the trace outside the block for ensuring correct execution of per-session tracing with standalone traces, and fixes callbacks to adopt the appropriate trace. #19099
- Modifies the use of scan choices to increase effectiveness in scenarios where only the lower bound is specified, enhancing both speed and performance. #19117
- Allows tracking of per-RPC wait-states using WaitStateInfo for incoming RPC updates, ensuring safe upgrades and functioning ASH without interfering with existing functionalities. #19138
- Optimizes PgWire response serialization for large query results, enhancing overall read performance. #19213
- Reduces high load issues by renaming blocking synchronous YBSession flush functions to TEST_* and replacing them with non-blocking asynchronous versions (FlushAsync). #12165
- Reduces the safe time lag in the xCluster by sending the apply safe time more frequently when there are no active transactions. #19274
- Elevates the timeout in TSAN mode for the PgSharedMemTest.TimeOut test, averting potential table creation timeouts. #19313
- Adds a new retrying master-to-master task, allowing for the API AreNodesSafeToTakeDown to check if it's safe to remove or upgrade certain nodes without disrupting overall cluster health. #17562
- Replaces EnableVerboseLoggingForModule with google::SetVLOGLevel for a less complex procedure in setting the module log level, eliminating the updating of the vmodule gflag. #19344
- Renames cdc to xcluster, moves ValidateTableSchema to xrepl_catalog_manager and renames it to ValidateTableSchemaForXCluster. Revises allow_ycql_transactional_xcluster to be a TEST flag, enhances XClusterManager's ability to handle XCluster related control logic, and launches dedicated XClusterConfig class. #19353
- Reduces macOS 13.6 linker warnings by updating the compiler to avoid duplicate RPATHs, enables failure on duplicate RPATHs through YB_FAIL_ON_DUPLICATE_RPATH, and cleans build system. #19378
- Enables thread safety for members passed by reference by setting the -Wthread-safety-reference compiler flag, fixing all resulting build errors for increased stability. #19365
- Enables TEST_SYNC_POINT macro in release builds, reducing its impact in production by adding the check for FLAGS_TEST_enable_sync_points before making expensive SyncPoint calls. #19379
- Introduces XClusterManager to handle all XCluster related control logic in the yb-master, creates a dedicated class XClusterConfig for changes to XClusterConfigInfo, and makes allow_ycql_transactional_xcluster a TEST flag. #19353
- Adds a skip_indexes command line option to create_snapshot and create_keyspace_snapshot, allowing users to exclude indexes when creating backups in YCQL. #14142
- Enables a fallback to RPC when request or response exceeds the scope of allocated shared memory, ensuring continued functionality in larger data scenarios. #19430
- Enhances thread safety analysis by enabling the -Wthread-safety-precise compiler flag, which increases scrutiny on mutex field assignments, and adds the ability to override the compiler type for third-party archive selection using YB_COMPILER_TYPE_FOR_THIRDPARTY environment variable. #19462
- Simplifies xCluster code by allocating related tests to a separate file, introducing XClusterManager for better control logic, and establishing a dedicated XClusterConfig class for changes to XClusterConfigInfo. #19353
- Removes a disabled test, enhancing master start in shell mode with either an empty
master_addresses
or a setmaster_join_existing_universe
flag. #19528 - Saves memory and disk space by introducing a JoinStringsLimitCount utility, which limits reporting and logging to the first 20 elements of large number arrays like tablet Ids. #19527
- Filters out tservers in the read cluster when determining whether to add new tablet replicas to the cluster, providing the dual ability to manage CPU usage when maintaining idle tablets and ensure robust front-end work operations. This process includes configuration adjustments to
tablet_replicas_per_gib_limit
,tablet_replicas_per_core_limit
, andtablet_overhead_size_percentage
flags. #16177 - Renames test file "xcluster_ysq_colocated" to "xcluster_ysql_colocated" for enhanced clarity and correction of a previous error. #19531
- Allows longer GLog traces exceeding 30k limit by splitting output into less than 30k per line and introduces a new Gflag
trace_max_dump_size
to limit size of printed traces. #19532 - Adds a metric for running tablet peers per tserver for easy calculation of tablet peers to cores, and tablet peers to memory ratios on YBM clusters. #9647
- Renames CDCTabletMetrics to XClusterTabletMetrics and several related files, refines metrics retrieval and setting, and enhances handling of race conditions for smoother data management. #20079
- Switches tablet_replicas_per_core_limit and tablet_replicas_per_gib_limit to runtime flags, for setting and adjusting resource-based tablet limits on-the-go. #16177
- Enables aggregation of retryable requests mem-tracker metric at table-level for Prometheus by assigning the entity to the mem-tracker after the Tablet opens with the tablet metric entity. #19301
- Implements a wait period after the addition of new transaction status tablets, enhancing the stability of XClusterYSqlTestConsistentTransactionsTest.UnevenTxnStatusTablets. #19302
- Upgrades OpenSSL to version 3.0.8, disabling Linuxbrew builds and enabling glog to use the stack unwinding function based on backtrace. #19736
- Facilitates the use of remote_build.py tool by interpreting arguments for yb_build.sh even when they couldn't be correctly parsed as remote_build.py arguments. #19696
- Introduces the trace_max_dump_size Gflag (default 25000) for limiting trace print sizes, works around GLog's character limit for printing long traces. #19532, #19769
- Relocates XClusterConfigInfo and XClusterSafeTimeInfo from catalog_entity_info.h to xcluster_catalog_entity.h, and from catalog_loaders.h to xcluster_catalog_entity.h, respectively. Also, establishes a SingletonMetadataCowWrapper for singleton catalog entities, creates an XClusterManager interface, and transfers xcluster_safe_time_info_ and its functions from CatalogManager to XClusterManager. #19713
- Facilitates a more rapid server initialization by deleting the superblock within the DeleteTablet process when the delete_type is TABLET_DATA_DELETED, reducing the number of DELETED tablet superblocks at server startup. #19840
- Introduces a continuation marker for better traceability when a trace segment is split into multiple LOG(INFO) outputs; also adds a new GFlag trace_max_dump_size to limit the size of traces printed. #19532, #19808
- Generates an enhanced error message displaying the version info when the yb process incorrectly starts on an older version after AutoFlags have been enabled, aiding in easier problem identification. #16181
- Renames producer_id to replication_group_id in older proto messages, standardizing the replication group identity for enhanced consistency and rollback safety. #19825
- Centralizes common helper functions for YCQL xcluster tests into XClusterYcqlTestBase for streamlined testing procedures. #19830
- Balances tablet load more evenly across all drives, preventing bottlenecks during remote-bootstrapping by evenly distributing tablets and utilizing available disk bandwidth. #19846
- Introduces additional debug logs for troubleshooting SELECT statement errors that could arise from processing non-provisional records or writing provisional records without a hybrid timestamp. #19876
- Cleans up allocated shared memory objects on TServer startup if the TServer process didn't shut down gracefully. #19988
- Enhances the demote_single_auto_flag yb-admin command by returning specific error messages for invalid process_name, AutoFlag name, or non-promoted AutoFlag, making identifications easier. #20004
- Enables monitoring of master leader heartbeat delays through a new RPC in the MasterAdmin, ensuring undesired lags can be readily detected and mitigated. #18788
- Avoids indefinite mutex lock and TServer thread blockage by correctly handling crashes during request transmission via shared memory. #20050
- Eliminates usage of UNKNOWN flags in tools, marking them as NON_RUNTIME since dynamic update of these flags is not supported. #20123
- Renames the misleading cdc xCluster metric entity to xcluster, ensuring an accurate representation without affecting dependencies as services like YugabyteDB Anywhere rely on the unchanged metric name. #20131
- Establishes a flag to manage indexing backfills, offering control over whether non-deferred indexes should be batched during the backfill operation. #20213
- Delivers automatic recovery for index tables affected by a bug previously found and addressed, preventing any future performance issues triggered by incorrectly set property values. #20247
- Changes "Successfully read [n]ops from disk." logs to verbose logging, lowering the frequency of identical log outputs and boosting performance. #20287
- Allows configuration of the yb_build.sh script via .git/yb_buildrc and ~/.yb_buildrc bash scripts, to specify implicit arguments or alternative defaults before parsing command line arguments. #20291
- Converts UNKNOWN flags to either RUNTIME or NON_RUNTIME in DocDB for optimal flag management. #16979
- Marks the Tserver flag num_concurrent_backfills_allowed as RUNTIME instead of UNKNOWN for better manageability. #20348
- Upgrades unit test key/certificate pairs from 1024-bit RSA keys to 2048-bit, meeting FIPS 140-2 requirements, and integrates their generation into the build process. #20370
- Marks the force_global_transactions, ycql_use_local_transaction_tables, and auto_promote_nonlocal_transactions_to_global gflags as runtime, enabling them to be changed directly as required for each new transaction. #20479
- Organizes AutoFlags management across dedicated MasterAutoFlagsManager, TserverAutoFlagsManager and subset AutoFlagsManagerBase, offering neat code architecture and resolving a bug in Master::InitAutoFlags. #19958
- Renames cdc::ProducerTabletInfo to cdc::TabletStreamInfo and removes ReplicationGroupId from it, relocates ReplicationGroupId from cdc to xcluster namespace, and introduces xcluster::ProducerTabletInfo to optimize naming consistency. #20452
- Enables the use of the OpenSSL FIPS module by setting the new openssl_require_fips = true gflag, ensuring FIPS standard compliance for database cluster creation. #20524
- Adds Prometheus metrics for server hard and soft memory limits, enabling better tracking of memory use in TServer or master and creation of dashboard charts for universes using non-default values. #20578
- Introduces a helper function that checks if a CowObject has a write lock, offering special functionality in retail mode and debug mode for enhanced thread safety. #20599
- Eliminates the issue of accessing erased objects in the ClusterLoadBalancer::RunLoadBalancerWithOptions, enhancing the runtime performance. #20673
- Streamlines bloom filter key calculation by avoiding duplicate calculations. This results in approximately 4.5% tserver time improvement, and overall 1.5% performance boost. #20720
- Limits the number of tablets per node, and hastens reaching the desired number of tablets by lowering the values of FLAGS_tablet_split_low_phase_shard_count_per_node to 1 and FLAGS_tablet_split_low_phase_size_threshold_bytes to 128_MB. #20579
- Introduces new auto flags to stave off backward compatibility issues related to version 2.20, ensuring the stable existence of previously promoted AutoFlags during process startup time. #13474
- Adds verbose logs for frequent global and per-table state changes within a load balancer run for easier debugging. #20289
- Splits XClusterManager into two separate managers, XClusterSourceManager and XClusterTargetManager, each handling different objects, to enhance code readability and component isolation. #20737
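The JoinStringsLimitCount entry above (#19527) caps how many elements of a large collection are written to logs. As a rough illustration of the idea only, the following Python sketch joins at most the first 20 items and notes how many were left out. The function name join_strings_limit_count, its parameters, and the "and N more" suffix are assumptions made for this example; the actual utility is part of the YugabyteDB C++ code base and its output format may differ.

```python
from typing import Iterable


def join_strings_limit_count(items: Iterable[object], limit: int = 20, sep: str = ", ") -> str:
    """Join at most `limit` items and summarize the rest (hypothetical format)."""
    values = [str(item) for item in items]
    shown = sep.join(values[:limit])
    omitted = len(values) - limit
    if omitted > 0:
        # Assumed suffix; the real utility may word this differently.
        return f"{shown} and {omitted} more"
    return shown


# Example: 25 made-up tablet IDs, only the first 20 are printed in full.
tablet_ids = [f"tablet-{i}" for i in range(25)]
print(join_strings_limit_count(tablet_ids))
```

Bounding the output this way keeps a single log line from growing with the number of tablets while still showing enough IDs to start an investigation.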
CDC
- Integrates CDCSDK stream creation for a namespace into YugabyteDB master, introducing support for garnering a CDC stream via cdcsdk_ysql_replication_slot_name. Invalidates deprecated logic in cdc_service, focusing on YSQL strategies instead. Promotes explicit parameter requirements for request validation when namespace_id is populated. Addresses a race condition and initial checkpoint discrepancy in CreateCDCStream. This alteration modifies sys-catalog entry and necessitates client checking of the autoflag yb_enable_replication_commands. #19211
- Enables CRUD syntax for Publications in YSQL as part of a YSQL API for CDC via the PG logical replication mechanism, allowing users to specify tables for streaming through CDC. However, CDC does not support certain features, which may limit table selection and result in errors. The change is irreversible due to the introduction of the yb_enable_replication_commands autoflag. #18930, #18933, #18931
- Allows maxAttempts for RPCs in AsyncClient to be adjustable, decreasing the risk of "Too many attempts" exceptions occurring in a short period. #12751
- Enables deletion of CDCSDK streams through replication slot names, advancing the support for SQL syntax for CDC via the PG logical replication model. However, this feature isn't rollback safe and is disabled during upgrades, requiring a subsequent check of the autoflag yb_enable_replication_commands. #19212
- Introduces support for creating, viewing, and dropping replication slots in YSQL. Adds two interfaces for support, functions pg_create_logical_replication_slot and pg_drop_replication_slot, and Walsender commands CREATE_REPLICATION_SLOT and DROP_REPLICATION_SLOT. Inserts view pg_replication_slots for viewing replication slots. Fixes two issues concerning cleanup of held locks and skipping cache refresh. #19211, #19212, #19509
- Prevents "Object already exists" error during consecutive CreateCDCStream and DeleteCDCStream calls by effectively handling the stream delete state, and supports creating a CDCSDK stream for a namespace via SQL syntax. #19211, #19212
- Automatically forwards CreateCDCStream requests to yb-master for atomic creation of CDCSDK streams, enhancing consistent snapshot capability. This is covered by the ysql_yb_enable_replication_commands flag and temporarily bypasses the requirement for a replication slot name. #18890
- Unveils enhanced replica command recognition to overcome issues, paving the way for new replication slot support. Also incorporates the ability to create a CDCSDK stream for a namespace via SQL syntax and remedy specific race conditions. #19211, #19212
- Defines replication slots as active or inactive in YugabyteDB, considering a slot active if it's consumed within the set duration defined by the ysql_cdc_active_replication_slot_window_ms Tserver GFlag. This change allows better visibility into slot activities and prevents dropping of active slots. It also addresses a bug in the WaitForGetChangesToFetchRecords function used in testing. #19211, #19212
- Supports the creation of CDCSDK stream for a namespace, with the ability to fetch it using cdcsdk_ysql_replication_slot_name. Simultaneously, addresses a race condition problem during the CreateCDCStream operation and ensures proper initial checkpoint setting in cdc_state_table. Introduces limits on replication slots (CDC stream) utilizing a GFlag and reports the status when the slots limit is reached. This change also accommodates the detection of replication commands in yb_is_dml_command. #19211, #19212
- Enables reading of Decimal and VarInt datatypes in CDC for CQL. #19726
- Reinstates support for identifying replication commands after a previous rollback. Allows users to create a CDCSDK stream for a namespace and to retrieve a CDC stream using cdcsdk_ysql_replication_slot_name. Addresses a race condition issue between CreateCDCStream and the CatalogManager's background cleanup task and fixes a problem related to the initial checkpoint of tables in the cdc_state_table for CDCSDK. Also reintroduces the ability to determine whether a replication slot (CDC stream) is active or inactive. #19211, #19212
- Limits the number of replication slots in YSQL with max_replication_slots GFlag, introducing an error code for when the limit is reached, and enhances CDC stream creation. #19211
- Displays the replication commands conducted by walsenders in the pg_stat_activity section. The new implementation supports the creation of a CDCSDK stream for a namespace via cdcsdk_ysql_replication_slot_name, enables the detection of replication commands without errors, and introduces the limitation of the number of CDCSDK streams by the max_replication_slots GUC. #19211, #19212
- Expands the range of SQL commands that can be issued to a walsender, increases support for creating CDCSDK stream for a namespace, and guards against a potential race condition between CreateCDCStream and CatalogManager background cleanup task. #19211, #19212
- Avoids erroneous deletions from the cdc_state table caused by a race condition during tablet splits by reversing the call order in the CleanUpCDCStreamsMetadata method. #19746
- Detects replication commands in yb_is_dml_command, supports creating logical replication slots through SQL using CREATE_REPLICATION_SLOT and pg_create_logical_replication_slot. The change includes support for CDCSDK stream creation, imposes limit on replication slots/streams, and resolves a race condition related to CreateCDCStream. #19211, #19212
- Changes yb_enable_replication_commands from an autoflag to a TEST flag, making it safer and more flexible for enabling the replication slots feature by default. Supports YSQL commands for replication slots when the flag is true, and disallows them when the flag is false. It also rectifies a race condition between CreateCDCStream and the CatalogManager background cleanup task. The revision further supports the creation of CDCSDK stream for a namespace, aiding in the long-term goal of supporting SQL syntax for CDC. #19211, #18890
- Ensures cleanup of entries from cdcsdk_replication_slots_to_stream_map_ when corresponding entries are deleted from cdc_stream_map_, avoiding potential inconsistencies. #19211
- Introduces a new yb-admin CLI command and master RPC to enable backfilling of a replication slot name to existing CDCSDK streams, providing manageable streams via YSQL Publication/Replication slot interface. #19261
- Logs a NOTICE for each unsupported table when creating a publication using the FOR ALL TABLES case in CDC, improving user visibility on skipped tables. #19291
- Enriches the CDCStreamInfo java class with a new cdcsdk_replication_slot_name field and an accessor method for better support of Publication/Replication slot. #19811
- Optimizes the CreateCDCStream by eliminating unnecessary sleep statements, preventing a race condition, and ensuring correct initial checkpoint settings for the CDCSDK. Also, this code change introduces support for SQL syntax for CDC using the Postgres logical replication model, allows detecting replication commands without errors and defines whether a replication slot (CDC stream) is active or not. #19211
- Transforms yb_enable_replication_commands into a runtime PG preview flag, correcting a bug that caused publication commands to always be enabled regardless of flag value. #18930
- Introduces a GFlag to toggle automatic tablet splitting for tables within a CDCSDK stream, enhancing user control over replication processes. #19482
- Expands support for two new record types: PG_DEFAULT and PG_NOTHING based on Postgres replica identity types while maintaining backwards compatibility by renaming ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively. A failsafe cdc_enable_postgres_replica_identity autoflag is added. #19260
- Addresses a test failure in TestCreateCDCStreamForNamespaceLimitReached by specifically adding the record type CHANGE to the stream request. Enables support for two new record types PG_DEFAULT and PG_NOTHING, while retaining the ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes. Adjusts settings using the newly added autoflag cdc_enable_postgres_replica_identity. #19260
- Introduces support for two new record types, DEFAULT and NOTHING, based on Postgres replica identity types, and renames ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively for backward compatibility. It introduces an autoflag cdc_enable_postgres_replica_identity during CDC stream creation and adjusts the failing test TestCreateCDCStreamForNamespaceLimitReached by specifying the record type CHANGE. #19260
- Enhances CDCSDK to report tablet splits promptly upon detection, controls data duplication by cross-referencing hash_key bounds, and optimizes the retrieval of child tablets via tablet_peer. #18479
- Refines the GetCheckpointResponse to indicate snapshot_key presence only when present, enhancing accuracy of bootstrapping and streaming processes. #19292
- Introduces the UpdateMapUpsertKeyValue API that lets you update specific keys without needing to re-add all keys, allowing for more efficient updates. #19577
- Enhances the CDC State Table's key update efficiency by selectively updating or removing keys as needed, without having to replace the entire map column. #19577
- Reactivates the cdcsdk_stream-test for TSAN mode, previously disabled, enhancing overall testing capabilities. #19752
- Helps ensure failed CDCSDK stream creation processes are rolled back effectively, reducing problems caused by incomplete creations through a ScopeExit mechanism. Manual clean-up may be required in certain failure scenarios until DDL atomicity for alter table statements is implemented. #18934
- Enables the tests in cdcsdk_snapshot-test to run in TSAN mode, augmenting their utility and coverage. #19752
- Rectifies the intermittent failure issue in TestReleaseResourcesOnUnpolledSplitTablets by ensuring that UpdatePeersAndMetrics thread refreshes the cached CDC stream metadata if in the initialized state. #18934
- Alters the default checkpoint type to EXPLICIT during stream creation, ensuring no upgrade or rollback issues due to alterations in the default proto field value. #18748
- Allows yb-client to apply retries for retryable error codes, preventing the unnecessary resetting of attempts and deadlines when a CDCErrorException is encountered. #19648
- Releases retention barriers on tables that are not of interest in the CDC Consistent Snapshot feature stream, defined by the new GFlag "cdcsdk_tablet_not_of_interest_timeout_secs." This enhances user control over snapshot consumption. #20146
- Refactors tests to use ASSERT_EQ assertions, not ASSERT_GE, for checking consumed record count, utilizing the GetChangeRecordCount method for more accurate record handling and tablet splitting. #20261
- Switches the default consistent snapshot option to USE_SNAPSHOT when creating a new stream, and converts the Consistent Snapshot feature to a preview feature guarded by the RUNTIME_PREVIEW flag yb_enable_cdc_consistent_snapshot_streams. #20367
- Modifies the default value of gflag "cdcsdk_tablet_not_of_interest_timeout_secs" to 4 hours, enhancing the CDC Consistent Snapshot feature, which remains guarded by the PREVIEW flag "yb_enable_cdc_consistent_snapshot_streams". #20378
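The replication slot entry above describes a slot as active when it has been consumed within the window set by ysql_cdc_active_replication_slot_window_ms. A minimal Python sketch of that rule follows. The function name, the timestamps, the inclusive boundary, and the 60-second window are all assumptions chosen for illustration; they are not taken from the YugabyteDB source, and the flag's actual default is not stated in these notes.

```python
def is_slot_active(last_consumed_ms: int, now_ms: int, window_ms: int) -> bool:
    """Return True if the slot was consumed within the last `window_ms` milliseconds.

    Treating the boundary as inclusive is an assumption of this sketch.
    """
    return now_ms - last_consumed_ms <= window_ms


WINDOW_MS = 60_000  # hypothetical value, not the flag's documented default

# Consumed 30 seconds ago: inside the window.
print(is_slot_active(last_consumed_ms=100_000, now_ms=130_000, window_ms=WINDOW_MS))  # True
# Consumed 100 seconds ago: outside the window.
print(is_slot_active(last_consumed_ms=100_000, now_ms=200_000, window_ms=WINDOW_MS))  # False
```

Under this reading, a slot that a consumer has stopped polling becomes inactive once the window elapses, and only then can it be dropped.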
yugabyted
- Integrates client-to-server encryption support for YSQL Connection Manager, securing the connection between the client application, YSQL Connection Manager, and pg_backend through enabling SSL connectivity. Uses the existing use_client_to_server_encryption and certs_for_client_dir flags to enable and configure this feature, while not supporting certification files set via ysql_pg_conf_csv and cert-based authentication. Ensures upgrade and rollback safely without the need for an auto flag or node communication. #19108
- Publishes YSQL Connection Manager metrics on <tserver_ip_address>:13000/prometheus-metrics, enhancing data monitoring and diagnostics. #19109
- Alters the format of YSQL Connection Manager's Prometheus metrics on the prometheus-metrics endpoint to include the database as a metric label. #19484
- Enables faster and more secure unix socket connections between YSQL Connection Manager and pg backend on the same machine, replacing the previous TCP/IP connections. Introduces a new flag ysql_conn_mgr_use_unix_conn to configure this feature. #19483
- Enables the use of the YSQL Connection Manager feature as an alternative in the yb-pgsql java test framework by setting the YB_ENABLE_YSQL_CONN_MGR_IN_TESTS environment variable to true. #19703
- Allows for the creation of separate pools for each user/database combination in the YSQL Connection Manager, eradicating the need to set the user context at the beginning of each transaction. Updates to stats/metrics format also enhance database pool tracking. #19722
- Enables restriction of encryption to the logical connection only in YSQL Connection Manager by setting use_client_to_server_encryption. Physical connections, between the YSQL Connection Manager and Postgres process on the same machine, are not encrypted, enhancing internal performance without sacrificing secure external communications. #19108
- Introduces the GUC variable ysql_conn_mgr_sticky_object_count for easier and faster tracking of connection stickiness in YSQL Connection Manager tests, eliminating the need to modify pool sizes. #20067
- Introduces the GUC variable yb_use_tserver_key_auth for authenticating clients using yb-tserver-key. Removes the "postgres only" requirement for yb-tserver-key authentication and sets ysql_conn_mgr_use_unix_conn as true by default. Requires no HBA changes. #19996
- Integrates a database migration visualization tool in the yugabyted UI, including a new dashboard for monitoring migration progress and complexity, facilitating smoother transition from other databases. #18782
- Corrects the CPU usage Sankey diagram to accurately report used and available values, enhancing reliability of performance metrics on the performance page. #19991
- Enables a new user interface feature in yugabyted for connection management metrics, displaying metrics on active and total logical/physical connections, and providing a clickable banner to navigate to dedicated connections visuals. #18805
- Rectifies confusion with the yugabyted-UI; password authentication no longer incorrectly shows as enabled for an insecure cluster unless the encryption-at-rest is activated. #19295
- Rectifies the misalignment in the display of status messages for specific scenarios in yugabyted. #19334
- Corrects the display of the total number of CPUs on the overview page and ensures live queries show all statuses, not just idle. #19414
- Offers the ability to set preferred regions using yugabyted CLI for lower latencies, by expanding the functionality of the constraint_value flag, offering a way to assign preference orders to Availability Zones (AZ). #19415
- Corrects join flag bugs, ensuring a smooth start command even if a node's join IP is not an active master and enables error handling when the placement_uuid from the join IP can't be obtained. Now supports Hostnames and handles edge cases for addresses provided through CLI. #19316, #19314
- Adjusts the yugabyted start command to interpret 0.0.0.0 as 127.0.0.1 in the advertise_address, aligning with the IP use in master, tserver, and yugabyted-UI. #18580
- Adds prerequisite checks to confirm if default ports are open before yugabyted starts, resulting in either failure to launch or impaired functionality with warnings depending on the blocked ports. #19504
- Integrates YSQL Connection Manager stats into the tserver metrics snapshotter, which can be enabled via the metrics_snapshotter_tserver_metrics_whitelist gflag, offering visibility into total logical and physical connections. #18805
- Allows metrics whitelist to include ysql_conn_mgr flag only if the connection manager is enabled, enhancing the accord between connection manager metrics and yugabyted UI. #18805
- Enables Yugabyted UI to display Alert messages from all nodes by directing API calls through the yugabyted API server. #19972
- Resolves an issue where the UI failed to launch when advertise_address=0.0.0.0 by ensuring 127.0.0.1 is used instead, and adds a connection check for address uniqueness and timeout for tserver API calls. #18580
- Enables the starting of two different local RF-1 instances on Mac by adding a check for empty join flag during the second node's initiation. #20018
- Removes the deprecated gflag use_initial_sys_catalog_snapshot, replaced by enable_ysql that is now true by default, eliminating repetitive warning messages on starting yugabyted nodes. #20056
- Adapts yugabyted-ui to efficiently support Kubernetes (k8s) deployments, ensuring correct function for nodes with only masters. A new bind_address flag added for customizing the API server's bind address. #20301
- Rectifies the malfunction in yugabyted-ui when yugabyted utilizes custom ysql_port and ycql_port values by introducing a new flag for YCQL port number. #20406
- Updates the yugabyted-ui backend to align with changes in the connection manager stats consumed from the :13000/connections endpoint, catering for removal of pool_name and addition of database_name and user_name. #20494
- Adds yugabyted-ui support to the K8s OSS Yugabyte helm chart, including new values to control UI and metrics snapshotter activation for enhanced metrics visualisation in the K8s environment. #20344
- Retains the integrity of user's custom configuration file by associating config flag with start command, and directs updates to a yugabyted generated file within base_dir/conf directory. #20881
- Allows a smooth restart of the second node in a cluster using the join flag without throwing any errors. #20684
- Enables a predefined set of gflags related to the pg-parity project using the enable_pg_parity flag in the yugabyted start command. #21221
- Changes the flag enable_pg_parity to enable_pg_parity_tech_preview for activating a predefined set of gflags related to the pg-parity project with the yugabyted start command. #21221
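The YSQL Connection Manager entry above (#19722) mentions separate pools for each user/database combination. The Python sketch below illustrates the general idea of keying pools by that pair, so a connection taken from a pool already belongs to the right user and database. The PoolRegistry class, its methods, and the string placeholders standing in for physical connections are hypothetical and are not the Connection Manager's actual data structures.

```python
from typing import Dict, List, Tuple


class PoolRegistry:
    """Keeps one pool of physical connections per (user, database) pair."""

    def __init__(self) -> None:
        self._pools: Dict[Tuple[str, str], List[str]] = {}

    def pool_for(self, user: str, database: str) -> List[str]:
        # Create the pool lazily the first time this combination is seen.
        return self._pools.setdefault((user, database), [])

    def pool_count(self) -> int:
        return len(self._pools)


registry = PoolRegistry()
registry.pool_for("alice", "orders").append("conn-1")
registry.pool_for("alice", "orders").append("conn-2")
registry.pool_for("bob", "orders").append("conn-3")

# Two distinct (user, database) pairs were seen, so two pools exist.
print(registry.pool_count())  # 2
```

Because each pool is tied to one user and one database, a transaction routed to it has no need to switch user context first, which is the benefit the entry describes.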
Other improvements
- Introduces a strict deletion check for orphaned tablets to prevent erroneous data loss when the master issues DeleteTablets to tservers, with the feature guard master_enable_deletion_check_for_orphaned_tablets=true, ensuring upgrade and downgrade safety. #18332
- Simplifies reading of remotely fetched traces by introducing proper nesting levels and splitting multi-line trace entries into different lines. #19758
- Enables monitoring of inbound calls for read and write RPCs without any performance impact, by maintaining and updating WaitStateInfo during execution and annotating waits during I/O and lock/condition waiting. #19143
- Switches release packaging to use native libraries on lowest common version (centos7 for linux-x86) instead of linuxbrew libraries, introducing changes to the default calculation for linuxbrew builds in the 2.21 release. #19219
- Redefines release packaging to use native library build instead of linuxbrew, boosting compatibility with later OS versions. Changes the default setting for linuxbrew builds to false. Fixes shellcheck errors in compiler wrapper. #19219
- Redesigns build options parsing in Jenkins for better compatibility, switching from YB_BUILD_OPTS evaluation to YB_* environment variables, and mends shellcheck mistakes in compiler wrapper. #19219
- Corrects the Jenkins build error that occurred when YB_BUILD_OPTS was not set, ensuring smooth build operations even in the absence of YB_BUILD_OPTS. The change switches the packaging method to use native library build instead of Linuxbrew, offering better compatibility with later OS versions. #19219
- Ensures consistency at the time of stream creation in the CDC Consistent Snapshot feature by selecting a single common read point across all tablets within the input database. Additionally, guards changes with the TEST flag yb_enable_cdc_consistent_snapshot_streams, set to false by default. Also includes alteration to create stream workflow on the Master side and introduces retention barriers on Regular db, WAL, and IntentsDB. #19678
- Allows you to preserve information sources during stream creation until snapshot records and related changes are consumed by maintaining retention barriers on WAL/Intents/RegularDB. Also, ensures data consistency during failover scenarios by performing preparations as part of the Apply of Raft operation. Includes support for colocated tables during snapshot stream creation, with a filter to exclude WAL records with commit_time lower than or equal to the snapshot_time. Currently, changes are hidden behind the TEST flag, which will later be an autoflag. #19679
- Extends MiniCluster with YB Controller servers and introduces graceful shutdown feature, ensuring a smoother testing experience. #19849
- Extends MiniYBCluster to include YB Controller servers and allows for their graceful shutdown. #19849
- Introduces snapshot and streaming consumption changes as well as support for colocated tables in the context of consistent snapshot stream, allowing exhaustive and mutually exclusive snapshot and change records. #19680
- Enhances the yb-admin CLI to support the creation of consistent snapshot streams, increasing control over snapshot options like NOEXPORT_SNAPSHOT and USE_SNAPSHOT. #19682
- Introduces retention_barrier_no_revision_interval_secs gflag to avoid race conditions in setting retention barriers during stream creation, increasing the consistency of snapshot streams. #20145
- Introduces a generic task that runs tasks after all tablets are created on new tables and fixes issues that could leave the table in the RUNNING state or schedule tasks before updating the data on disk. #20577
Bug fixes
YSQL
- Allows for ALTER TYPE to run on temporary tables without blocking PG table rewrite, preventing data corruption and enabling smoother transaction handling. #18909
- Introduces a per-database PG new OID allocator, ensuring OID uniqueness within the database and enhancing horizontal scalability in multi-node and multi-tenancy environments. This new mechanism mitigates OID collisions and allows OID consistency in backup-restore scenarios across clusters. A new GFlag ysql_enable_pg_per_database_oid_allocator is provided to return to old OID allocator behavior if necessary. #16130
- Restarts the postmaster when a process is killed during its own initialization or cleanup to prevent potential mishandling of shared memory items. #19945
- Resolves a bug that incorrectly type-checks bound tuple IN conditions involving binary columns like UUID for releases 2.17.1 and higher, improving database consistency. #19753
- Adjusts the default values of yb_local_throughput_cost, yb_local_latency_cost, and yb_docdb_remote_filter_overhead_cycles, enhancing performance across most TAQO workloads. #20032
- Ensures consistent wait start times in pg_locks by tracking the RPC request start time for the waiter instead of the time-out in the wait-queue, providing a more accurate reflection of real progress. #18603, #20120
- Converts the "Unknown session" error into a FATAL error, allowing drivers to instantly finish a non-responsive connection, enhancing client connection management. #16445
- Corrects a backup failure issue by ensuring the function yb_catalog_version is introduced, especially in 2.4.x or 2.6.x clusters where it was previously missed due to a YSQL upgrade code bug. #18507
- Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. #20396
- Enhances distinct iteration to avoid missing live rows after detecting a deleted row, by making AdvanceToNextRow aware of whether a fetched row is deleted, thereby ensuring no rows are missed during distinct queries on tables with deleted tuples. #19911
- Enables cleanup after killed backends, fixing an issue where killing a background worker uses up a Proc struct, therefore preventing the webserver from failing after 8 attempts. #20154
- Releases memory to the operating system after processing each endpoint call, effectively managing large amounts of data produced by long and unique queries and preventing unnecessary accumulation of memory. #20040
- Eliminates segmentation fault in webserver SIGHUP handler at cleanup by ensuring MyLatch usage in all instances in order to manage process life cycle. #20309
- Adds a regression test for nested correlated subqueries to guard against reintroducing a previously fixed issue and ensures correct query results, with plans to backport it to relevant branches. #20316
- Corrects the lookup function in BNL (Block Nested Loop) to ensure matching outer tuples are found accurately when the join condition contains more than just hashable equality filters. #20531
- Marks BNL plannodes that sort results as unable to project, addressing a regression in sorted BNL's performance and ensuring the accuracy of sorting when a target list changes due to merged overhead projection operators. #20660
- Extends early termination of index scans for conditions with the form `index_column OP NULL` to additional btree operators `>/>=/</<=`, ensuring such conditions no longer send unnecessary data to DocDB. #20642
- Corrects an error in the aggregate scans' pushdown eligibility criteria to prevent wrong results from being returned when PG recheck is not expected, but YB preliminary check is required to filter additional rows. #20709
- Corrects the inaccurate detection of constants in distinct prefix computation during distinct index scans, ensuring reliable query results for batch nested loop joins. #20827
- Fixes a memory corruption issue that caused failures in creating a valid execution plan for `SELECT DISTINCT` queries. Enables successful execution of queries without errors and prevents server connection closures by disabling `distinct pushdown`. This fix improves the stability and effectiveness of SELECT DISTINCT queries. #20893
- Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries and improving performance. #21004
- Reduces the likelihood of incorrect behavior involving conflicts between single shard INSERT operations by ensuring read times are chosen after conflict resolution, enhancing data consistency. #19407
- Reduces the time spent on preparing read requests in queries with a large number of operands in the `IN` operator by avoiding O(n^2) complexity in list traversal when generating ybctids. #19329
- Refines parameter computation for Nested Loop joins in YSQL, removing the need to manually track relations that can't be batched parameters, thus mitigating bugs and simplifying logic. #19642, #19946
- Includes additional tests that capture and demonstrably rectify previously recurring errors from Batched Nested Loop Left Join due to incorrectly parameterized batched expressions in multiple loop scenarios. #19642, #19946, #20495
- Corrects the incrementation timing of pg_stat_user_indexes idx_scan column for LSM index for accurate stat generation, ensuring it no longer increments too early. #17495
- Reduces spinlock deadlock detection time by 75% for prompt handling of potential freezes and restarts Postmaster when a process holding a spinlock is killed, ensuring successful initiation of new connections. #18272, #18265
- Prevents potential postmaster crashes during cleanup of killed connections by using the killed process's ProcStruct to wait on an unavailable LWLock. #18000
- Overhauls the handling of DDL statements, preventing them from restarting in READ COMMITTED mode, better managing DDL transactions, and ensuring more immediate clean-up of DDL transactions. #18761
- Rectifies the issue of filters not binding to the request by amending the erroneous duplication-check of the bindings on the first column of the row element, enhancing query performance. #19308
- Resolves an issue by safely dropping all foreign key constraints in one pass, preventing errors when altering a column referenced by a foreign key in partitioned tables. #19063
- Fixes null constraint violations in ALTER TYPE operations and failures on tables with a range key, ensuring accurate operation and error reduction. #18911, #19382
- Restores previous conditions after test PgRegressIndex yb_index_scan fails due to a commit reversion. #19477
- Eliminates unnecessary file creation for views on temporary tables by checking if storage is actually needed. #19522
- Moves estimated seeks and `nexts` in the EXPLAIN plan from VERBOSE to DEBUG flag, enhancing Sequential Scan nodes to include these estimates. #19938
- Corrects DDL Atomicity by cleaning up failed `CREATE TABLE` operations, allowing for multiple sub-commands in `ALTER TABLE ALTER COLUMN TYPE`, adequately looking up Materialized views in PG schema, and addressing `order` field-dependency in DocDB columns. #19605
- Rectifies the serialization mismatch in YBBatchedNestLoop, reducing errors when Parallel Query is enabled. #19612
- Corrects an error that prevents the `ALTER TABLE SET TABLESPACE` command from executing successfully when the cluster has a `placement_uuid` set, by properly filling in the `placement_uuid` during validation. #14984
- Allows transfer of parameter values to and from background workers in Parallel Query by correcting the finalize_plan function, improving Nested Correlated Subquery results. #19694
- Enables running the postprocess script on alternate expected files in `pg_regress`, effectively fixing mismatches previously noticed due to its absence. #19737
- Reduces maintenance time by switching to a less complex implementation of SideBySideDiff.java, thereby eliminating errors from `SideBySideDiff.sanityCheckLinesMatch`. #19690
- Prevents PostgreSQL backend crashes induced by assert errors in the YbPgInheritsCache as it now correctly cleans up unreleased references, improving transaction reliability. #19807
- Safeguards against potential bugs by ensuring that `yb_transaction_priority_lower_bound` and `yb_transaction_priority_upper_bound` are disregarded in read committed isolation, irrespective of the `enable_wait_queue` status. #19921
- Adjusts the shared relcache init file invalidation to ensure correct refresh of the rel cache after executing DDL statements, ensuring consistency with Postgres results. #19955
- Streamlines the creation of a publication for all tables in per-database catalog version mode by making updates to `pg_yb_catalog_version` that bypass `CheckCmdReplicaIdentity` function, eliminating DDL errors. #19965
- Eliminates unnecessary catalog version incrementation on no-op GRANT DDL statements to enhance optimization by rectifying a previously missed case. #19981
- Allows successful dropping of table groups when DDL Atomicity is enabled by verifying if the tables within the group are marked for deletion, instead of ensuring the group is empty. #20002
- Revises YbSeqScan to send `ysql_catalog_version` in user-initiated system table requests, ensuring system table scans use an up-to-date catalog and reducing chances of TestPgRegressIndex failure. #20017
- Rectifies the assertion failure issue in the per-database catalog version mode. The fix updates the conditions for treating DDL statements, eliminating previous failures caused by treating some DDL statements as non-DDL statements. #19975
- Increases the delay when restarting the test cluster in tsan build to prevent occasional failures in unit test PgOidCollisionTest.TablespaceOidCollision/0. #20008
- Corrects the method for deriving `element_typeid` to prevent crashes when running aggregations with join by ensuring it's derived from the RHS of the index condition, not the LHS. #20003
- Resolves a bug ensuring `ddl_transaction_state` gets properly reset even if `YbIncrementMasterCatalogVersionTableEntry` throws an exception, preventing non-global DDL statements from being incorrectly handled as global ones. #20038
- Prevents a possible system crash in YSQL backends manager by ensuring essential checks are in place before using the job database object. #20060
- Enforces stricter locking mechanisms during concurrent updates on different columns of the same row, to maintain data consistency and prevent 'write-skew anomaly within a row'. Adds a new gflag `ysql_skip_row_lock_for_update` to toggle the new row-level locking behavior. #15196
- Ensures removal of both shared and per-database relation cache initialization files during postmaster startup to prevent the reusing of outdated files. #20125
- Disables CheckCmdReplicaIdentity for tables when yb_non_ddl_txn_for_sys_tables_allowed is set to true, preventing YSQL upgrades from failing during update/delete operations on system tables. #20085, #20143
- Eliminates the possibility of a segfault during the LWLock process when the postmaster cleans up a killed process, by using `KilledProcToCleanup` instead of `MyProc`. #20166
- Restores PostgreSQL 11 code to its original format, facilitating an easier merge with PG15. #20176
- Enhances visibility and debugging capabilities by introducing two boolean flags, which log every endpoint access and print basic `tcmalloc` stats after path handler and garbage collection. Now `yb_pg_metrics` handles the `SIGHUP` signal to update flags values. Also adds `:13000/memz` and `:13000/webserver-heap-prof` to expose memory usage with a new runtime variable to control tcmalloc sampling. #20157
- Introduces the `pg_stat_statements.yb_qtext_size_limit` flag, controlling the maximum file size read into memory, limiting potentially large or corrupt qtext files impacting system memory usage. #20211
- Exposes webserver memory usage through the creation of `:13000/memz` and `:13000/webserver-heap-prof` for printing tcmalloc stats and displaying current or peak allocations, respectively. #20157
- Rectifies an issue with corrupted state manipulation, caused by processes being killed during writing, by restarting the postmaster anytime a backend is extraordinarily killed in a critical section. This helps avoid infinite loops and CPU overuse, thereby enhancing database stability. #20255
- Caps retrieval of `beentry` from `localhost:13000/rpcz` to 1000 iterations, preventing indefinite waits and ensuring safety even in cases of inconsistent states. #20274
- Blocks new-version DDL statements in an invalid per-database catalog version configuration to avoid possible stale read/write RPCs and provide accurate results during cluster upgrades. #20300
- Moves the Active Session History (ASH) code from extension to core Postgres, eliminating the chance of partial feature activation and ensuring control solely through the `TEST_yb_enable_ash` gflag, enhancing the user's control over the ASH functionality. #20180
- Enables rollback from PostgreSQL 15 upgrade to preserve PostgreSQL 11 data directory, therefore preventing a loss of stored data such as statistics. #20319
- Renames the `debug` field in `ExplainState` to `yb_debug` and repositions it to the bottom of the struct for clarity purposes. #20366
- Reduces memory consumption during secondary index scans by introducing a separate arena for batch operations, lowering the risk of a node running out of memory. #20275
- Prevents background worker crashes caused by assertion failures in Active Session History (ASH) when `MyProcPort` is not established. #20338
- Adds an extra null check to avoid runtime errors when ASH is enabled by default and prevents the execution of ASH code while running initdb, fixing the `PcustomeriniAsh` test failure. #20362
- Reduces likelihood of `Restart read required` error during Cross-DB Concurrent DDLs with per-database catalog version enabled by initiating the function `YbInitPinnedCacheIfNeeded` before starting the DDL transaction. Also, improper usage of `yb_non_ddl_txn_for_sys_tables_allowed` with a DDL statement has been rectified. #20303
- Increases the schema version of the default partition whenever you create a new partition, preventing erroneous data insertion into the default partition due to cache refresh issues. #17942
- Enhances test environment on Mac by fixing clean-up issues, and introduces a rollback ability for stashed PG11 data during PG15 upgrade. #20319
- Adds PgClient session id to ASH metadata to support aggregations for tserver wait events based on client session id, controlled by `TEST_yb_enable_ash`. Safe to upgrade/downgrade. #20242
- Revamps the initialization of YbPgInheritsCache's hash table to use binary comparison with HASH_BLOBS flag, ensuring correct hash lookups, while also stopping marathon Java partitioning tests on TSAN to prevent timeouts and test failures. #20436
- Rectifies the mismatched sizes of various ASH fields, ensuring upgrade and downgrade safety, while providing new functionality without disturbing the existing one. Note that if you downgrade, ASH will become unavailable and it is guarded by TEST_yb_enable_ash. #20454
- Mitigates MISMATCHED_SCHEMA error in cross DB concurrent DDLs with per-database catalog version turned on, by ensuring backends only apply messages sent by themselves. #20340
- Eliminates tsan warnings in the MetricWatcher helper class by using MetricEntity class, preventing potential test failures. #20580
- Rectifies potential flakiness in `TestYbAsh testEmptyCircularBuffer` by ensuring buffer remains empty during idle cluster and excluding certain query samples. #20629
- Refines the Batch Nested Loop (BNL) first batch building logic to accurately handle scenarios when the provisional first batch size equals the outer table's size for correct query results. #20707
- Corrects the division by zero error occurring with certain queries when the `yb_enable_base_scans_cost_model` is activated and `yb_fetch_size_limit` is enforced, by setting a fixed size for result width when it equals zero. #20892
- Reduces PostgreSQL connection startup timeouts in geo-distributed clusters with a new `wait_for_ysql_backends_catalog_version_master_tserver_rpc_timeout_ms` GFlag, increasing the default timeout value to 60s from 30s. This alteration only impacts one specific RPC - WaitForYsqlBackendsCatalogVersion, not all RPCs, which should diminish time-out incidents. #18228
- Updates two column names in the yb_active_session_history view: `yql_endpoint_tserver_uuid` changes to `top_level_node_id` for intuition, and `session_id` changes to `ysql_session_id` for clarity. #20920
- Fixes YSQL upgrade failure from 2.16 to 2.21 by adding a 2-second delay before moving to the next connection if the previous script included a breaking DDL statement. #20842
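The early termination of `index_column OP NULL` scans noted in this list (#20642) follows from SQL's three-valued logic: a strict comparison against NULL evaluates to NULL rather than true, so no row can satisfy the condition. The following is a minimal Python sketch of that reasoning only, not YugabyteDB code. The helper names `sql_compare` and `filter_rows` are hypothetical, and `None` stands in for SQL NULL.

```python
import operator

# Hypothetical mapping of btree comparison operators to Python functions.
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def sql_compare(op, left, right):
    """Strict comparison under three-valued logic; None models SQL NULL."""
    if left is None or right is None:
        return None  # the result is unknown, which is not the same as False
    return OPS[op](left, right)


def filter_rows(values, op, constant):
    """Keep only values for which the condition is exactly true, as WHERE does."""
    return [v for v in values if sql_compare(op, v, constant) is True]


values = [1, 5, None, 9]
print(filter_rows(values, ">", 3))     # [5, 9]
print(filter_rows(values, ">", None))  # [] whatever the data, so no scan is needed
```

Because the second call returns an empty list for any input, a planner that recognizes the pattern can skip the storage request entirely.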
YCQL
- Solves a concurrency issue in the TestCQLServiceWithCassAuth.TestReadSystemTableAuthenticated unit test by adjusting the CQLServer's shared_pointer reset method. #17779
DocDB
- Resolves potential `WriteQuery` leak issue in CQL workloads, ensuring proper execution and destruction of queries, while preventing possible tablet shutdown blockages during conflict resolution failure. #19919
- Enhances error reporting of cross-cluster pollers, addressing persistence of stale or missed errors and simplifying the corresponding code. Now, instead of storing verbose detailed status, only error codes are stored for efficient memory usage. #19455
- Refines meta cache updates to avoid overwriting child tablets and consequently causing stale data, ensuring more accurate partition map refreshes. #18732
- Streamlines transaction processing by updating TabletState only for tablets engaged in writes and ignoring old statuses during transaction promotion, reducing failure errors and boosting consistency. #18081, #19535
- Resolves an inconsistency problem where indexes grow in size even after delete operations, causing slower query performance. The fix involves intelligent handling of backfill done events on the tablet server side. Note, it only works for newly created indexes and will not auto-recover from current buggy states. #19544
- Enables `wait-on-conflict` by default in release builds across all isolation levels. #19837
- Addresses potential deadlock during tablet shutdown when wait-queues are enabled by refactoring the Wait-Queue shutdown path to execute thread_pool_token_->Shutdown as part of WaitQueue::Impl::CompleteShutdown instead of StartShutdown. #19867
- Includes a script to ensure no index tables retain delete markers post-backfill, addressing a bug causing indexes to expand in size following row deletion, which slowed queries. The bug affected both YCQL and YSQL APIs for new indexes created with versions 2.14.x/2.16.x/2.18.x and led to increasing storage needs due to accumulated delete markers. This script negates these issues and boosts index performance. #19544
- Sets `kMinAutoFlagsConfigVersion` to 1, providing accurate configuration version comparison and reducing potential confusion. #19985
- Reduces the occurrence of `Transaction Metadata Missing` errors by accurately reporting deadlocked transactions that may result from multiple aborts in a deadlocked cycle. #20016
- Enables single shard waiters to progress after a blocking subtransaction rolls back, by applying the same logic used for distributed transactions. #20113
- Handles backfill responses getting interleaved across different operations more gracefully to prevent crashes caused by slow masters or network delays. #20510
- Reintroduces bloom filter use during multi-row insert, improving conflict resolution and rectifying missing conflict issues, while also addressing the problem reported in GH 20648. #20398, #20648
- Reschedules the resumption of contentious waiters on the same underlying `Scheduler::Impl::strand_`, which is used for executing incoming rpc calls, instead of reactor threads, thus preventing a fatal issue. #20651
- Reduces log warnings in normal situations by downgrading repeated waiter resumption alerts to VLOG(1), benefiting from the direct signaling of transaction resolution. #19573
- Disables the wait-on-conflict feature in 2.21.0 by default to fix a launch-blocking bug linked to multiple requests per session to a single tablet. #20978
- Reflects the actual columns locked in conflict resolution instead of the shared in-memory locks in `pg_locks`, providing more accurate output for waiting transactions. #18399
- Deactivates the packed row feature for colocated tables, averting potential write failure issues identified in 20638 during specific kinds of compactions. #21047
- Prevents segfaults originating from `pg_locks` queries when wait-queues are disabled by explicitly checking the existence of `server_->tablet_manager ->waiting_txn_registry` before its usage. #20772
- Fixes a race condition on kv_store_.colocation_to_table to prevent undefined behavior and re-enables packed row feature for colocated tables, enhancing data writing and compaction processes. #20638
- Modifies the `DocDB` system by shifting the acquisition of `submit_token_` of the `WriteQuery` to the post-conflict resolution phase to prevent DDL requests from being blocked, thus optimizing both reads and writes for continued performance and enhanced data consistency. #20730
- Corrects transaction queue behavior allowing multiple waiters for a single transaction per tablet, thereby resolving conflicts and enhancing transaction handling capability. #18394
- Restores the `wait-on-conflict` feature in the 2.21.0 branch that was previously disabled due to a bug, now resolved. #20978
- Filters out external intents beyond producer tablet range to address disparity in tablet partitions, ensuring each consumer tablet only receives relevant intents. This resolves the issue of potential hidden batch records due to erroneous starting of write_ids from zero. #19728
- Resolves the issue where transactions continue and commit despite supposed immediate abort after promotion, due to a timing gap between sending UpdateTransactionStatusLocation RPCs and reception of the first PROMOTED heartbeat. This update delays the sending of UpdateTransactionStatusLocation RPCs until the first PROMOTED heartbeat is acknowledged. #17319
- Refines the leaderless tablet detection logic to prevent incorrect reporting of tablets having recently undergone leader changes as leaderless, improving data consistency. #20124
- Prevents the deletion of active snapshots during a database backup, even if their corresponding tables are dropped, enhancing the reliability of backup operations. #17616
- Adjusts calculation of replication lag metrics for split tablet children by incorporating parent tablet's last sent/committed record time, promoting greater accuracy in metric results. #17025
- Addresses the bug where large transactions partially apply to regular RocksDB during tablet server restarts, thus ensuring consistent transaction data after restarts. #19359
- Allows setting all columns of a row to NULL, resulting in deletion instead of creating a row consisting of NULLs, rectifying an issue during compaction. #18157
- Corrects an issue where an invalid filter key negatively affected the performance of backwards scans, by improperly passing all SST files through the bloom filter. This update will be applied to versions 2.20 and 2.18. #19440
- Resolves issues of data validation failure and unreachable nodes by properly setting child checkpoints in `cdc_state` during tablet splits, curbing log amplification. #18540
- Allows tracing of outgoing calls only if the current RPC is being traced, reducing excessive memory consumption and logging. #19497
- Introduces retry logic to synchronize metadata and checkpoint creation during remote bootstrap initialization, reducing inconsistency risks associated with schema packing. #19546
- Stops Garbage Collection (GC) of schema packings that XCluster config references to avoid data loss during replication, taking into account network partitions and schema changes. #17229
- Removes a regression that could crash the TServer when replaying alter schema during local bootstrap by adding ANNOTATE_UNPROTECTED_WRITE to CqlPackedRowTest.RemoteBootstrap. #19546
- Corrects Master's tablet_overhead mem_tracker issue, ensuring it displays accurate memory consumption, addressing discrepancy in MemTracker metric names between TServer and Master. #19904
- Resolves a race condition in MasterChangeConfigTest.TestBlockRemoveServerWhenConfigHasTransitioningServer by ensuring the launched async thread operates on a copy of `ExternalMaster*` instead of the mutating `current_masters` vector. #19927
- Corrects intermittent index creation failure for empty YCQL tables by evaluating the result of `is_running` rather than checking index state directly, ensuring accurate `retain_delete_markers` and reducing potential performance issues. #19933
- Addresses a PITR restore issue by terminating all active transactions, ensuring inserted or updated data doesn't get omitted, and giving a clear signal about the non-application of such transactions. #14290
- Adds retries around the leader step down in the PgNamespaceTest.CreateNamespaceFromTemplateLeaderFailover test to allow the target leader time to properly catch up, preventing previous failures. #14316
- Disables the packed row feature for colocated tables, effectively preventing a possible encounter with the underlying issue in 21218 during debugging. #21218
- Prevents system crashes caused by the CallHome class calling a pure virtual function due to a timing issue during system shutdown. #18254
- Corrects an Xcluster Consumer shutdown issue encountered during testing by implementing a temporary mitigation that waits for the Flush with a timeout. #19402
- Amends `RaftGroupMetadata::CreateSubtabletMetadata` to update the log prefix, preventing the use of parent tablet ID in child tablet's metadata logging. #19375
- Resolves crashes in sys-catalog-tool linked with TabletBootstrap failing due to uninitialized transaction_participant_context, enhancing stability. #19412
- Corrects a previously non-retryable PGSQL operation, preventing errors from being returned back to PG layer during a parent tablet shutdown scenario. #19033
- Enables transaction promotion in TestPgWaitQueuesRegress for an enhanced testing process. #19575
- Restores the original behavior of not counting tablets on dead tservers towards the replica count, ensuring accurate representation of under-replicated tablets. #17867
- Ensures the correct in-memory state for the master coming out of shell mode by fetching the universe key from other masters, enabling proper decryption of the universe key registry. #19513
- Corrects a lock order inversion in the transaction loader to prevent potential deadlock scenarios. #19508
- Adds tests for handling indexes in colocated databases in transactional and non-transactional xCluster environments, enhancing database reliability and consistency. Also simplifies `WaitForReplicationDrain` test helper for easier usage. #18427, #16758
- Rectifies the issue causing the XClusterYsqlIndexTest.FailedCreateIndex test to fail by altering the over-aggressive DCHECK to an efficient SCHECK to allow for transient ALTER operations. #18967
- Rectifies the use-after-free issue in RefinedStream::Connected failure path by ensuring a status return rather than causing memory writes to a freed space. #19727
- Introduces macros that simplify the creation of comma-separated expression lists to a stream, reducing repetition. #19761
- Redefines the structure of thirdparty_archives.yml by eliminating redundant fields, implementing sensible default values, and introducing blank lines for improved readability between distinct third-party archive build sections. #19883
- Increases the visibility of Remote Bootstrap (RBS) sessions by adding a dedicated tserver page that lists all ongoing RBS sessions, including the remote log anchor sessions. Additionally, extends the `Last status` field on the tserver's tablets page to display the source a peer is or has been bootstrapping from. #19568
- Resolves a `maybe-uninitialized` compilation error in almalinux8 release gcc11, enhancing the reliability of the code by addressing both identified issues. #19987
- Rectifies the `TestYSQLDumpAsOfTime` compilation issue by replacing `<int64_t>` with `<PGUint64>`. #19992
- Eliminates the extra verbosity in MiniCluster logs by removing entries with `hk!!`. #20007
- Resolves an issue where the webserver may start prematurely and fail, by ensuring `cds::Initialize` is called before executing any function on `cds::threading::Manager`, minimizing race conditions. #20119
- Introduces an asynchronous interface for PgClient shared memory exchange, allowing for multiple requests and parallel query processing. #20151
- Displays the errno when unable to open `version_metadata.json` or `auto_flags.json` files, providing clarity on the nature of the IO error. #20250
- Deprecates the `enable_process_lifetime_heap_sampling` flag, simplifying tcmalloc sampling control to only setting `profiler_sample_freq_bytes`, which if <=0 disables sampling. #20236
- Prevents application crashes caused by an interrupted interprocess semaphore which previously threw an exception. #20325
- Allows early termination of old single statement read-committed transactions facing `kConflict` errors to enhance system throughput. #20329
- Eliminates premature shutdowns during transaction status resolution by ensuring the `rpcs_.Shutdown` only occurs after all status resolvers of the participant have ended, avoiding any in-progress status resolver rpc(s). #19823
- Reduces potential `request is too old` errors during YSQL DDLs by setting the SysCatalog tablet's retryable request retain duration to the maximum of YSQL and YQL client timeout. #20330
- Fixes `./yb_build.sh help` to correctly display the help command instead of an error message due to a mismatched function name. #20390
- Removes non-trivially destructible static initializations from the code, eliminating complexities that could lead to difficult to identify bugs. #20407
- Replaces the deprecated `exec_program` command with `execute_process` in CMake, resolving issue 20481 and eliminating potential warning CMP0153 for developers. #20481
- Allows bulk load time reduction by packing all values when inserting a row with multiple values into the PostgreSQL layer. Apply the preview flag -ysql_pack_inserted_value to enable this feature and note it currently uses v1 encoding. #20713
- Stores the first error from a failed setup replication to ensure more accurate feedback to the user, instead of a final generic error message like `Universe is being deleted`. #20689
- Changes the path in `yb_build.sh` to locate `generate_test_truststore.sh` in `$YB_BUILD_SUPPORT_DIR`, solving build failures on GitHub Actions. #20747
- Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. #20912
- Early aborts transactions that fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. #21328
- Eliminates a race condition that can occur when simultaneous calls to `SendAbortToOldStatusTabletIfNeeded` try to send the abort RPC, thus preventing avoidable FATALs for failed geo promotions. #17113
- Changes the initial remote log anchor request to be at the follower's last logged operation id index, reducing the probability of falling back to bootstrapping from the leader and improving the success rate of remote bootstraps. #19536
- Prevents concurrent heap profiles from running and problematic resetting of sampling frequency, allowing only one heap profile to run at a time. #19841
- Resolves use-after-move errors detected by clang-tidy's bugprone-use-after-move-check for increased code stability. #20435
- Resolves issues in the under-replicated endpoint algorithm, ensuring correct counting of replicas only when the block's minimum number of replicas has not been fulfilled yet, hence offering accurate replica tally for placement blocks. #20657
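Two entries in this list (#17867 and #20657) concern how replicas are counted when deciding whether a tablet is under-replicated: replicas hosted on dead tservers should not count toward the total. The sketch below illustrates that rule under a deliberately simplified model. The function name `under_replicated_tablets`, the `replication_factor` parameter, and the sample data are all hypothetical and do not mirror the master's actual data structures or its placement-block logic.

```python
def under_replicated_tablets(tablet_replicas, live_tservers, replication_factor):
    """Return ids of tablets whose live replica count is below the target.

    tablet_replicas: dict mapping tablet id -> list of tserver ids with a replica.
    live_tservers: set of tserver ids currently considered alive.
    replication_factor: number of live replicas each tablet should have.
    """
    result = []
    for tablet_id, tservers in sorted(tablet_replicas.items()):
        # Replicas on dead tservers are ignored when counting.
        live_count = sum(1 for ts in tservers if ts in live_tservers)
        if live_count < replication_factor:
            result.append(tablet_id)
    return result


# Assumed sample layout: ts4 is down, so tablet t2 has only two live replicas.
replicas = {
    "t1": ["ts1", "ts2", "ts3"],
    "t2": ["ts1", "ts2", "ts4"],
}
live = {"ts1", "ts2", "ts3"}
print(under_replicated_tablets(replicas, live, 3))  # ['t2']
```

Counting the replica on `ts4` would make `t2` look healthy, which is the kind of misreport the first of those entries describes.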
CDC
- Introduces an additional test case ensuring that only tablets belonging to a dropped table get deleted from the cdc_state table. #19196
- Eliminates deadlock during the deletion of namespace-level CDC streams, enabling the successful execution of the `ysqlsh drop database` command even when the database has multiple tables. #19879
- Resolves an issue preventing newly created tables from being added to the stream metadata and CDC state table after an existing table is dropped, by considering streams in `DELETING_METADATA` state as well as `ACTIVE` state during dynamic table addition. #20428
- Removes only non-active tablets from cdc_state in CleanUpCDCStreamsMetadata, including retaining parent split tablets, to preserve essential data during stream cleaning. #19348
- Fixes the issue of WAL garbage collection for tables added after stream creation by enabling WAL retention for each such tablet, reducing connector failure. #19385
- Reinstates the creation of CDC streams with old record types to ensure backwards compatibility and prevent `CDC error 9` when the `ALL` mode is utilized. #19929
- Fixes the decoding of NUMERIC value in CDC records to prevent precision loss by ensuring that the decoded string is not converted to scientific notation if its length is more than 20 characters. Additionally, the fix involves using the string representation with no limit on length and employing the Postgres numeric_out method for decoding, which is identical to the decoding of numerics in a PG query. #20414
- Rectifies an error within the CDCService side, where `Merger tried to set tablet safetime to a lower value`. Now, for non-consistent snapshot streams, the `commit_time_threshold` adjusts correctly to the `safe_hybrid_time` value as per the request, instead of always setting to zero. #20356
- Rectifies consistent snapshot stream creation by ensuring tablets complete their tasks and snapshot safe opids populate in the cdc_state table for proper initialization. #20477
- Allows continuation of tablet fetching, even if certain tables face errors, by logging a warning instead of sending unnecessary errors to the client. #19434
- Rectifies `pg_replication_slots` view failure prior to any cdc/xCluster stream creation by refining the logic to read the cdc_state_table only when a cdc stream exists. #20073
- Updates the CDCSDK stream metadata with consistent snapshot-related details and ensures its persistence in the sys_catalog, enhancing the stability and accuracy of data. #20202
- Corrects the AsyncYBClient method to pass the `explicit_cdc_sdk_opid` instead of a `null` value, ensuring proper snapshot checkpointing and enhancing snapshot resume functionality in EXPLICIT mode. #19394
- Alleviates a regression in the connector snapshot resume capability by adjusting the key population in `GetChangesRequest`, ensuring the key is populated only when it is not `null`. #19394
- Removes potential crash in DEBUG mode by ensuring each entry returned from the cdc_state_table iteration in `pg_replication_slots` view is checked with `RETURN_NOT_OK` before usage. #19894
- Increases the value of `FLAGS_update_min_cdc_indices_interval_secs` from 2 to 5, ensuring the CDC state table tablet has enough time to wait for a new leader and correctly update the log. #18156
- Corrects the calculation of the `cdcsdk_sent_lag` metric to prevent disproportionate growth, by updating the `last_sent_record_time` with each `SafePoint` record, reducing inconsistency between transactions. #15415
- Eliminates errors in streaming changes from child tablets in CDCSDK by accurately determining the slowest consumer and preventing unnecessary Garbage Collection of intents. #20284
- Allows propagation of RPC deadline from clients to YB-Master for CreateCDCStream, reducing unnecessary retries and correctly timing out requests. #20583
- Resolves memory leak errors in the asan environment caused by not freeing YBCStatus from YBCPgExecCreateReplicationSlot in case of AlreadyPresent or LimitReached errors. #20279
- Resolves CDCLog and CDCService test failures by setting FLAGS_cdcsdk_retention_barrier_no_revision_interval_secs to 0, ensuring upgrade and rollback safety. #20353
- Rectifies timing issues in the CDCSDKConsistentSnapshotTest.TestRetentionBarrierSettingRace, enhancing stability for TSAN builds via application of WaitFor with an adequate timeout. #20455
- Prevents write pausing on a tablet for an AlterSchema procedure that is solely setting retention barriers during consistent snapshot stream creation. #20620
- Stream creation failures now trigger a thorough cleanup to avoid resource misuse, resolving issues caused by late ALTER TABLE responses. #20725
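The NUMERIC decoding fix listed above (#20414) is about rendering large values as plain decimal strings rather than scientific notation. The following Python sketch only illustrates that idea with the standard decimal module; it is not the CDC code path, and the helper name decode_numeric_plain is hypothetical.

```python
from decimal import Decimal


def decode_numeric_plain(text: str) -> str:
    """Render a numeric string in plain positional notation (no exponent)."""
    # Decimal keeps every digit of the input string; the "f" format
    # spec then prints it without switching to scientific notation.
    return format(Decimal(text), "f")


# str() may use an exponent for large values, while the "f" format does not.
compact = str(Decimal("1E+25"))
plain = decode_numeric_plain("1E+25")

# Routing the same value through a binary float drops digits, which is
# the kind of precision loss a string-based decoding avoids.
long_value = "12345678901234567890.123456789"
via_float = str(float(long_value))
exact = decode_numeric_plain(long_value)
```

In this sketch, exact round-trips the input unchanged, while via_float no longer matches it.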
yugabyted
- Revises auth failure handling in Ysql Connection Manager to give accurate error messages, prevent broken control connections, and improve error packet handling. #17289, #19781, #19800
- Adjusts Ysql Conn Mgr Stats setting to align with Ysql Conn Mgr's status, maintaining FALSE setting even when Postgres process is created without a tablet server. #19998
- Resolves the hanging issue in Odyssey when incoming packet size exceeds a limit, by ensuring COPY_DATA and QUERY message types are fully received before processing. #19245, #19284
- Maintains sticky object count bi-directionally when creating new sub transactions or returning to parent transactions, aligning count with actual usage. #20071
- Allows usage of SET LOCAL query to set temporary session parameters for specific transactions, with values reverting after transaction completion. #19556
- Introduces a JSON endpoint at /api/v1/mem-trackers, enhancing data reliability by avoiding parsing of the HTML page at the /mem-trackers server endpoint for memory usage data. #18057
- Modifies yugabyted UI apiserver to acquire memory usage data from the new JSON endpoint /api/v1/mem-trackers instead of parsing HTML from /mem-trackers, ensuring more reliability. #18057
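As a rough illustration of why a JSON endpoint is easier to consume than an HTML page, the Python sketch below reads /api/v1/mem-trackers and decodes the body in one step. The host, port, and timeout are assumptions supplied by the caller, the function names are hypothetical, and the sketch makes no assumption about the layout of the returned JSON.

```python
import json
from urllib.request import urlopen


def mem_trackers_url(host: str, port: int) -> str:
    """Build the URL of the JSON memory-tracker endpoint for one server."""
    return f"http://{host}:{port}/api/v1/mem-trackers"


def parse_mem_trackers(body: bytes):
    """Decode the response body; no HTML scraping is involved."""
    return json.loads(body.decode("utf-8"))


def fetch_mem_trackers(host: str, port: int, timeout: float = 5.0):
    """Fetch and decode the endpoint (host and port are caller-supplied assumptions)."""
    with urlopen(mem_trackers_url(host, port), timeout=timeout) as resp:
        return parse_mem_trackers(resp.read())
```

A caller would pass the address of a server web UI, for example fetch_mem_trackers("127.0.0.1", 9000), and inspect whatever structure the server returns.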
Other fixes
- Ensures the tserver start and tserver stop scripts successfully terminate all running PG processes, regardless of PID length, enhancing process management. #19817
- Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. #20919
- Increases the max_stack_depth from 900kB to 950kB for proper execution and lessens the excessive logging triggered by inherits cache in yb_pg_errors.sql. #19443
- Reduces disruptions by throttling the master process log messages related to "tablet server has a pending delete" into 20-second intervals. #19331
- Prevents segmentation faults in the stats collector after a Postmaster reset, ensuring the stats collector's operations are uninterrupted even when a query is terminated. #19672
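The log throttling described above (#19331) follows a common pattern: emit a message at most once per interval and drop the repeats in between. The Python sketch below shows that pattern in general terms; the class name ThrottledLog and its injectable clock are hypothetical and are not taken from the YugabyteDB code base.

```python
import time
from typing import Callable, List, Optional


class ThrottledLog:
    """Emit a message at most once per interval; suppress the rest."""

    def __init__(self, interval_secs: float,
                 clock: Callable[[], float] = time.monotonic):
        self.interval_secs = interval_secs
        self.clock = clock  # injectable so the behavior can be tested
        self.last_emit: Optional[float] = None
        self.emitted: List[str] = []

    def log(self, message: str) -> bool:
        """Return True if the message was emitted, False if suppressed."""
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.interval_secs:
            return False
        self.last_emit = now
        self.emitted.append(message)
        return True
```

With a 20-second interval, a message logged at 0, 5, and 20 seconds is emitted at 0 and 20 and suppressed at 5.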
Other
- Streamlines code base by eliminating over 900 unnecessary includes, splitting oversized .proto files, enhancing the protoc-gen-yrpc to produce forward headers for protobuf, and upgrading precompiled headers. Also restructures MasterService, divides it into smaller services improving build times, and moves encryption-related classes. Updates now allow less system entropy drain via revised UUID generation. #10584
- Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. #8744
- Enables passing of username and password to the connect command akin to ysqlsh, permitting direct connection to the desired database/keyspace. #14869
- Introduces documentation for GFlags pertinent to the "bootstrap from closest peer" feature in the tserver flags page. #18061
- Corrects a nonfunctional link in the RBS GFlags description and adds documentation for the "bootstrap from closest peer" feature. #18061
- Reduces network requests when running ./yb_build.sh offline for a smoother rebuild process and adds helpful error messages for easier debugging. #19476
- Rectifies the issue where yugabyted crashes if yugabyted-ui binary doesn't exist, allowing the cluster to start with the UI disabled, similar to setting ui=false, and alerts the user with a warning. #16098
- Resolves the odyssey build failure on Ubuntu 23.04 when compiling using ./yb_build.sh release gcc13 by addressing -Werror=address issue. #19959
- Adjusts previously hardcoded ports such as master_rpc_port, tserver_webserver_port, and master_webserver_port to dynamically accommodate custom configurations, solving connectivity issues in multi-region/zone cluster setups. #15334
- Ensures better visibility into local calls by tracking them and allowing DumpRunningRpcs API to fetch them; if rolled back, this functionality becomes unavailable. #19697
- Transitions primary build and packaging from Centos7 to AlmaLinux8, discontinuing support for Linux OS's with glibc less than 2.28 for future integrations, while preserving it for versions 2.20 and earlier. #20173
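Because the build change above (#20173) sets glibc 2.28 as the minimum, it can help to check a host before installing. The Python sketch below compares a glibc version string against that minimum and, where the standard platform module can detect glibc, checks the local machine. The function names are hypothetical and this is not an official pre-install check.

```python
import platform
from typing import Optional, Tuple

MIN_GLIBC = (2, 28)  # minimum stated in the release note above


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.31' into (2, 31)."""
    return tuple(int(part) for part in text.split("."))


def glibc_meets_minimum(version: str,
                        minimum: Tuple[int, ...] = MIN_GLIBC) -> bool:
    """Compare numerically, so '2.9' is correctly older than '2.28'."""
    return parse_version(version) >= minimum


def local_glibc_ok() -> Optional[bool]:
    """Check this host; returns None when glibc cannot be detected."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return glibc_meets_minimum(version)
```

The comparison uses integer tuples rather than strings because a plain string comparison would rank "2.9" above "2.28".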