TServer fatals with schema packing not found error
Product | Affected Versions | Related Issues | Fixed In |
---|---|---|---|
YSQL | v2.20, v2024.1, v2024.2 | #25106 | v2024.1.6, v2024.2.3, v2.20.11.0 (upcoming) |
Description
Background
DocDB packed row format has a dependency on the schema version in order to read rows. If a table undergoes schema change, then the older schema versions are kept around as long as there are Sorted String Tables (SSTs) that refer to the older schema versions. These schema versions cannot be kept around forever as compactions garbage collect the unused schema versions metadata. The schema versions for each SST file is stored in frontiers.
Problem
When a large transaction inserts rows into a table, the data may be split into multiple RocksDB batches, each potentially flushing independently into separate SST files (for example, File A, File B, and File C). The current issue is that only the last batch updates the schema version in the frontier.
This behavior can lead to aggressive garbage collection of schema version and packing metadata under specific conditions:
- A schema version change occurs after earlier batches (for example, File A and File B) have been flushed.
- Files containing the schema packing metadata (for example, File C) are compacted independently, allowing the garbage collection mechanism to remove this metadata.
While this scenario is rare, it has been observed in lab stress tests.
Necessary conditions
It requires the confluence of YSQL (with Packed Row which is GA only for YSQL), schema modification operations, large transactions, and subsequent compaction that selectively processes later SST files (if all the files A, B and C were compacted together, then there would not be any issue).
Error
TServer fatals with schema packing not found error. The TServer logs would contain logs like "Cannot find packing with version 0 for table …" as follows:
F20241115 05:29:51 ../../src/yb/tablet/tablet.cc:1731] T 2fe6443354144bacabb250de4511d771 P cc3fe0b91eb748c294e6e1183087c7d9: Failed to write a batch with 0 operations into RocksDB: Corruption (yb/tablet/tablet_metadata.cc:387): Cannot find packing with version 0 for table tb_0 (table_id=00004000000030008000000000004115 schema version=5 cotable_id=00000000-0000-0000-0000-000000000000): Not found (yb/dockv/schema_packing.cc:745): Schema packing not found: 0, available_versions: [5, 3, 2, 4]
@ 0xaaaae32c7b5c google::LogMessage::SendToLog()
@ 0xaaaae32c8a00 google::LogMessage::Flush()
@ 0xaaaae32c909c google::LogMessageFatal::~LogMessageFatal()
@ 0xaaaae474ea30 yb::tablet::Tablet::WriteToRocksDB()
@ 0xaaaae474a53c yb::tablet::Tablet::ApplyIntents()
@ 0xaaaae48097f4 yb::tablet::TransactionParticipant::Impl::ProcessReplicated()
@ 0xaaaae472b164 yb::tablet::UpdateTxnOperation::DoReplicated()
@ 0xaaaae471e384 yb::tablet::Operation::Replicated()
@ 0xaaaae4720990 yb::tablet::OperationDriver::ReplicationFinished()
@ 0xaaaae37e287c yb::consensus::ConsensusRound::NotifyReplicationFinished()
@ 0xaaaae382db34 yb::consensus::ReplicaState::ApplyPendingOperationsUnlocked()
@ 0xaaaae382ceb0 yb::consensus::ReplicaState::AdvanceCommittedOpIdUnlocked()
@ 0xaaaae3817118 yb::consensus::RaftConsensus::UpdateReplica()
@ 0xaaaae37f9634 yb::consensus::RaftConsensus::Update()
@ 0xaaaae4abebd8 yb::tserver::ConsensusServiceImpl::UpdateConsensus()
@ 0xaaaae388bd7c std::__1::__function::__func<>::operator()()
@ 0xaaaae388ca84 yb::consensus::ConsensusServiceIf::Handle()
@ 0xaaaae4671444 yb::rpc::ServicePoolImpl::Handle()
@ 0xaaaae45bece0 yb::rpc::InboundCall::InboundCallTask::Run()
@ 0xaaaae46806a8 yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0xaaaae4f12dd8 yb::Thread::SuperviseThread()
@ 0xffff854478b8 start_thread
@ 0xffff854a3afc thread_start
Mitigation
-
If a single replica of a tablet is affected by this issue within a cluster (Replication Factor 3 or greater), the problem can be resolved by removing the affected tablet. Raft will then replicate a healthy copy from the remaining replicas.
-
A general workaround is to set the TServer flag enable_schema_packing_gc to false. While this action causes schema metadata to be retained indefinitely, the associated storage cost is minimal (calculated as: number_of_schema_changes_for_the_table * number_of_tablets_for_table * size_of_schema). This retained metadata will be reclaimed once the
enable_schema_packing_gc
flag is reset to true. -
In scenarios where all replicas of a tablet encounter this failure, self-remediation is not possible. Please contact Yugabyte Support for assistance.
Details
The issue was fixed by adding the missing call to FlushSchemaVersion
when writing the intermediate transaction batch as part of
#25106.