
[storage] Support hot state KV DB truncation on startup#19005

Open
wqfish wants to merge 3 commits into main from pr19005

Conversation


@wqfish wqfish commented Mar 9, 2026

Add truncation support for the hot state KV DB, reusing the existing
cold state truncation logic via a generic delete_state_value_and_index
that is parameterized on the value schema.

  • Add stale state value index CF to hot state KV DB column families
  • Make delete_state_value_and_index generic over the value schema
    (StateValueByKeyHashSchema for cold, HotStateValueByKeyHashSchema
    for hot)
  • Simplify the startup truncation guard to only check !readonly:
    !delete_on_restart was redundant (empty DB returns None progress),
    !is_hot is no longer needed now that truncation handles both
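The schema-generic truncation described above can be sketched as follows. This is a simplified, std-only toy, not the actual AptosDB code: the CFs are modeled as `BTreeMap`s, the `ValueSchema` trait and the `cfs` accessor are invented for illustration, and the real routine walks the stale-index CF in RocksDB rather than calling `retain`.

```rust
use std::collections::BTreeMap;

type Version = u64;
type KeyHash = u64;

/// Toy column families; the real CFs live in RocksDB.
type ValueCf = BTreeMap<(KeyHash, Version), Vec<u8>>;
type StaleIndexCf = BTreeMap<(Version, KeyHash), ()>;

#[derive(Default)]
struct StateKvDb {
    cold_values: ValueCf,
    cold_stale_index: StaleIndexCf,
    hot_values: ValueCf,
    hot_stale_index: StaleIndexCf,
}

/// Stand-in for "generic over the value schema": the schema type picks which
/// value CF (and matching stale-index CF) the truncation touches.
trait ValueSchema {
    fn cfs(db: &mut StateKvDb) -> (&mut ValueCf, &mut StaleIndexCf);
}

struct StateValueByKeyHashSchema; // cold
impl ValueSchema for StateValueByKeyHashSchema {
    fn cfs(db: &mut StateKvDb) -> (&mut ValueCf, &mut StaleIndexCf) {
        (&mut db.cold_values, &mut db.cold_stale_index)
    }
}

struct HotStateValueByKeyHashSchema; // hot
impl ValueSchema for HotStateValueByKeyHashSchema {
    fn cfs(db: &mut StateKvDb) -> (&mut ValueCf, &mut StaleIndexCf) {
        (&mut db.hot_values, &mut db.hot_stale_index)
    }
}

/// Delete every value written after `target_version`, together with its
/// stale-index entry, for whichever DB the schema `S` selects.
fn delete_state_value_and_index<S: ValueSchema>(db: &mut StateKvDb, target_version: Version) {
    let (values, stale_index) = S::cfs(db);
    values.retain(|&(_, v), _| v <= target_version);
    stale_index.retain(|&(v, _), _| v <= target_version);
}
```

The point of the generic parameter is that the exact same truncation body now serves both the cold and hot DBs; only the schema type argument changes at the call site.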

Stack created with Sapling. Best reviewed with ReviewStack.


Note

Medium Risk
Touches core storage commit/truncation paths and introduces a new on-disk schema for hot state, so mistakes could cause startup truncation or persistence issues despite added tests.

Overview
This PR wires hot state KV updates through the commit pipeline by extending ChunkToCommit with hot_state_updates and having AptosDB persist them into a dedicated hot state KV DB during save_transactions.

It adds proper RocksDB column families and a revised hot-state value schema (HotStateKvEntry storing the full StateKey plus occupied/vacant value data), plus startup truncation support for the hot state KV DB by reusing the existing state KV truncation logic via a schema-generic delete_state_value_and_index.

It also introduces utilities and tests to read back hot-state entries by version and to bulk-load the latest per-key hot-state set (including LRU pointer reconstruction), and updates hot-state metadata plumbing to be explicit per shard.

Written by Cursor Bugbot for commit e0707df. This will update automatically on new commits.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

```rust
        target_version + 1,
        &mut batch,
    )?;
}
```


Hot state truncation is a no-op due to missing stale indices

High Severity

The hot state KV DB truncation relies on iterating StaleStateValueIndexByKeyHashSchema entries to find values to delete, but put_hot_state_updates never writes any StaleStateValueIndexByKeyHashSchema entries — it only writes HotStateValueByKeyHashSchema entries. This means delete_state_value_and_index::<HotStateValueByKeyHashSchema> always iterates an empty CF and deletes nothing. After a crash with partially committed hot state data, the truncation is a no-op, leaving entries beyond the committed version. This will cause load_all_hot_state_entries to panic on the assertion hot_since_version <= committed_version when it encounters those un-truncated entries.
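The failure mode can be made concrete with a toy model (all names and types here are stand-ins, and the stale-index semantics are simplified: only superseding writes are modeled). Truncation discovers deletable values only through the stale index, so if the write path never populates that index, truncation deletes nothing.

```rust
use std::collections::BTreeMap;

type Version = u64;
type KeyHash = u64;

/// Toy hot state KV DB.
#[derive(Default)]
struct HotKvDb {
    /// (key_hash, version) -> entry bytes
    values: BTreeMap<(KeyHash, Version), Vec<u8>>,
    /// (stale_since_version, key_hash): the value written at
    /// `stale_since_version` superseded an older one for `key_hash`.
    stale_index: BTreeMap<(Version, KeyHash), ()>,
}

/// Write path. The reported bug corresponds to `write_stale_index == false`:
/// values land in the value CF but the stale-index CF stays empty.
fn put_hot_update(
    db: &mut HotKvDb,
    key: KeyHash,
    version: Version,
    bytes: Vec<u8>,
    write_stale_index: bool,
) {
    let superseded = db.values.range((key, 0)..(key, version)).next_back().is_some();
    db.values.insert((key, version), bytes);
    if write_stale_index && superseded {
        db.stale_index.insert((version, key), ());
    }
}

/// Truncation finds values to delete only by walking the stale index.
fn truncate(db: &mut HotKvDb, target_version: Version) {
    let to_delete: Vec<(Version, KeyHash)> = db
        .stale_index
        .range((target_version + 1, 0)..)
        .map(|(&k, _)| k)
        .collect();
    for (stale_since, key) in to_delete {
        db.values.remove(&(key, stale_since));
        db.stale_index.remove(&(stale_since, key));
    }
}
```

With an empty stale-index CF, `truncate` iterates nothing and entries beyond the target version survive, which is exactly the state that would later trip the `hot_since_version <= committed_version` assertion on load.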


wqfish and others added 3 commits March 11, 2026 00:03
Wire the hot state KV write path end-to-end so that hot state changes
are persisted to `hot_state_kv_db` during `pre_commit_ledger`.

The schema value now wraps insertions in `HotStateKvEntry` (carrying the
`StateKey` alongside the value, since the schema key only stores the
hash). Evictions are written as `None`. `HotStateShardUpdates` is enriched
with `value_version` on insertions and checkpoint version on evictions so
the DB entries carry all the information needed for reconstruction on load.

Hot state KV batches are built alongside cold state KV batches and
committed in parallel within the same rayon scope, mirroring the existing
cold state pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
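The parallel-commit shape described in this commit can be sketched as below. The PR commits the batches within a rayon scope; this std-only toy uses `std::thread::scope` to show the same pattern, and `Dbs` / `commit_chunk` are invented stand-ins, not the real AptosDB types.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::thread;

type Version = u64;

/// Stand-ins for the two physical KV DBs.
#[derive(Default)]
struct Dbs {
    cold_kv: Mutex<BTreeMap<(Vec<u8>, Version), Vec<u8>>>,
    hot_kv: Mutex<BTreeMap<(Vec<u8>, Version), Vec<u8>>>,
}

/// Commit the cold and hot state KV batches side by side in one scope,
/// mirroring "committed in parallel within the same rayon scope".
fn commit_chunk(
    dbs: &Dbs,
    cold_batch: Vec<((Vec<u8>, Version), Vec<u8>)>,
    hot_batch: Vec<((Vec<u8>, Version), Vec<u8>)>,
) {
    thread::scope(|s| {
        s.spawn(move || dbs.cold_kv.lock().unwrap().extend(cold_batch));
        s.spawn(move || dbs.hot_kv.lock().unwrap().extend(hot_batch));
    });
}
```

Scoped spawning keeps the commit call synchronous from the caller's point of view: the scope does not return until both batch writes have finished.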
Implements the loading side of hot state KV persistence (commit 2 of 2).

On startup, `StateKvDb::load_all_hot_state_entries()` iterates all hot state
KV shards in parallel, deduplicates to the latest version per key hash,
filters out evictions, and reconstructs the per-shard LRU linked lists.
`HotState::new_with_base()` then populates the base DashMaps from the loaded
entries so the hot state view is immediately queryable.

The `State::new_at_version()` and `StateWithSummary::new_at_version()` APIs
now accept `HotStateMetadata` (LRU head/tail/count per shard) so the loaded
state can carry correct metadata from the start.

Not wired into the startup path yet — `delete_on_restart` effectively remains
always-true. The integration is deferred until state KV / state merkle DB
consistency on restart is fully worked out.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
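The dedup-and-filter step of the loading side described in this commit can be sketched as follows. This is a single-shard toy with invented types; the real `StateKvDb::load_all_hot_state_entries()` additionally runs per shard in parallel and reconstructs the LRU linked lists, which are omitted here.

```rust
use std::collections::BTreeMap;

type Version = u64;
type KeyHash = u64;

/// Toy row: `Some(bytes)` is an insertion, `None` an eviction, as in the PR.
type Row = Option<Vec<u8>>;

/// Keep only the latest version per key hash, then drop keys whose latest
/// entry is an eviction. Returns (key_hash, version, value) triples.
fn load_latest_hot_entries(
    cf: &BTreeMap<(KeyHash, Version), Row>,
) -> Vec<(KeyHash, Version, Vec<u8>)> {
    let mut latest: BTreeMap<KeyHash, (Version, &Row)> = BTreeMap::new();
    // Iteration is ordered by (key, version), so a later row for the same
    // key always carries a higher version and overwrites the earlier one.
    for (&(key, version), row) in cf {
        latest.insert(key, (version, row));
    }
    latest
        .into_iter()
        .filter_map(|(key, (version, row))| row.as_ref().map(|v| (key, version, v.clone())))
        .collect()
}
```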
Add truncation support for the hot state KV DB, reusing the existing
cold state truncation logic via a generic `delete_state_value_and_index`
that is parameterized on the value schema.

- Add stale state value index CF to hot state KV DB column families
- Make `delete_state_value_and_index` generic over the value schema
  (`StateValueByKeyHashSchema` for cold, `HotStateValueByKeyHashSchema`
  for hot)
- Simplify the startup truncation guard to only check `!readonly`:
  `!delete_on_restart` was redundant (empty DB returns `None` progress),
  `!is_hot` is no longer needed now that truncation handles both
