[Smoke Tests] Tweaks to reduce total test time. #19019

Open
JoshLind wants to merge 1 commit into main from num_fullnode_failures

JoshLind commented Mar 10, 2026

Description

This PR makes a few small tweaks to the smoke tests to try to reduce their total runtime.

Testing Plan

Existing test infrastructure.


Note

Low Risk
Only adjusts test parameters and disables a handful of redundant tests; no production code paths are changed, with the main risk being reduced CI coverage.

Overview
Reduces smoke test runtime by shortening several test loops and epoch waits, including fewer on-chain config toggle iterations, fewer validator restart rounds, and smaller validator swarms in the DKG join/leave test.

Marks multiple redundant or no-longer-relevant smoke tests as #[ignore] (e.g., “no epoch changes” subsets and “no compression” variants now that compression is default), and temporarily ignores the execution pool window-size config test until deployment.

Written by Cursor Bugbot for commit 4db081b.
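For readers unfamiliar with the two mechanisms the overview mentions, here is a minimal sketch of what "fewer iterations" and `#[ignore]` typically look like in a Rust test file. Every name and value below (`NUM_TOGGLE_ITERATIONS`, `toggle_config`, `run_toggle_rounds`, the test names, the ignore reason) is hypothetical and chosen for illustration; none of it is taken from this PR's diff.

```rust
// Illustrative only: all names and values here are assumptions,
// not code from the PR.

/// Number of toggle rounds the test performs. Lowering a constant like
/// this is one way to shorten a test loop (assumed value).
pub const NUM_TOGGLE_ITERATIONS: usize = 2;

/// Stand-in for flipping an on-chain config flag.
pub fn toggle_config(enabled: bool) -> bool {
    !enabled
}

/// Flips the flag `iterations` times, starting from `false`,
/// and returns the final state.
pub fn run_toggle_rounds(iterations: usize) -> bool {
    let mut enabled = false;
    for _ in 0..iterations {
        enabled = toggle_config(enabled);
    }
    enabled
}

#[test]
fn test_config_toggle() {
    // An even number of flips returns the flag to its starting state.
    assert!(!run_toggle_rounds(NUM_TOGGLE_ITERATIONS));
}

#[test]
#[ignore = "redundant: covered by the default-configuration variant"]
fn test_config_toggle_redundant_variant() {
    // Skipped by a plain `cargo test`; still compiled, so it cannot rot silently.
    assert!(run_toggle_rounds(NUM_TOGGLE_ITERATIONS + 1));
}
```

Ignored tests are skipped by a plain `cargo test` but can still be run on demand with `cargo test -- --ignored` (only ignored tests) or `cargo test -- --include-ignored` (all tests), which is why ignoring a test is a softer change than deleting it.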

JoshLind added the CICD:run-e2e-tests label Mar 10, 2026
JoshLind force-pushed the num_fullnode_failures branch from 5f98d8b to 8a7baa3 on March 10, 2026 19:52
JoshLind force-pushed the num_fullnode_failures branch from 8a7baa3 to 94abb99 on March 10, 2026 20:55
JoshLind force-pushed the num_fullnode_failures branch from 94abb99 to 4db081b on March 10, 2026 21:30

@github-actions

✅ Forge suite compat success on 3c8902ab1d1562d395135ed890810b8f054fae0e ==> 4db081b3b3e02f925b5d434de8be0c954271ff74

Compatibility test results for 3c8902ab1d1562d395135ed890810b8f054fae0e ==> 4db081b3b3e02f925b5d434de8be0c954271ff74 (PR)
1. Check liveness of validators at old version: 3c8902ab1d1562d395135ed890810b8f054fae0e
compatibility::simple-validator-upgrade::liveness-check : committed: 11354.38 txn/s, latency: 2896.65 ms, (p50: 2500 ms, p70: 2700, p90: 3600 ms, p99: 12400 ms), latency samples: 435160
2. Upgrading first Validator to new version: 4db081b3b3e02f925b5d434de8be0c954271ff74
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6261.79 txn/s, latency: 5385.76 ms, (p50: 5900 ms, p70: 6000, p90: 6100 ms, p99: 6200 ms), latency samples: 220120
3. Upgrading rest of first batch to new version: 4db081b3b3e02f925b5d434de8be0c954271ff74
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6139.55 txn/s, latency: 5525.81 ms, (p50: 6100 ms, p70: 6200, p90: 6400 ms, p99: 6500 ms), latency samples: 213740
4. upgrading second batch to new version: 4db081b3b3e02f925b5d434de8be0c954271ff74
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 10421.81 txn/s, latency: 3150.16 ms, (p50: 3000 ms, p70: 3800, p90: 4300 ms, p99: 4600 ms), latency samples: 341880
5. check swarm health
Compatibility test for 3c8902ab1d1562d395135ed890810b8f054fae0e ==> 4db081b3b3e02f925b5d434de8be0c954271ff74 passed
Test Ok
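To compare throughput across the upgrade phases above without reading the numbers by eye, the `committed: <value> txn/s` field can be pulled out of each summary line. The following is a rough sketch that assumes the line layout shown in this comment stays stable; `parse_committed_tps` is a hypothetical helper and not part of Forge.

```rust
/// Extracts the committed throughput (txn/s) from one Forge summary line.
/// Returns `None` when the line has no "committed: <number> txn/s" field.
/// Hypothetical helper: it assumes the log layout shown in this comment.
pub fn parse_committed_tps(line: &str) -> Option<f64> {
    // Everything after the first "committed: " marker, if present.
    let rest = line.split("committed: ").nth(1)?;
    // The numeric text that precedes the " txn/s" unit.
    let value = rest.split(" txn/s").next()?;
    value.trim().parse::<f64>().ok()
}
```

Feeding it the liveness-check line and the single-validator-upgrade line gives the two throughput figures directly, so a ratio between phases can be computed in a follow-up step instead of by hand.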

@github-actions

❌ Forge suite realistic_env_max_load hard failure on 4db081b3b3e02f925b5d434de8be0c954271ff74

two traffics test: inner traffic : committed: 14618.53 txn/s, latency: 2568.32 ms, (p50: 2500 ms, p70: 2600, p90: 2700 ms, p99: 3300 ms), latency samples: 5444700
two traffics test : committed: 100.03 txn/s, latency: 1126.99 ms, (p50: 1000 ms, p70: 1200, p90: 1400 ms, p99: 1600 ms), latency samples: 1740
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.597, avg: 1.413", "ConsensusProposalToOrdered: max: 0.180, avg: 0.173", "ConsensusOrderedToCommit: max: 0.069, avg: 0.061", "ConsensusProposalToCommit: max: 0.247, avg: 0.234"]
Max non-epoch-change gap was: 1 rounds at version 33100 (avg 0.00) [limit 4], 1.14s no progress at version 33100 (avg 0.08s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.32s no progress at version 2101542 (avg 0.32s) [limit 16].
Test Failed: {"errors":[{"name":"Check no fullnode failures","error":"Error! The number of fullnode failures was > 0 (1), but must be 0!","causes":["Error! The number of fullnode failures was > 0 (1), but must be 0!"]}]}
Trailing Log Lines:
test CompositeNetworkTest ... FAILED
Error: {"errors":[{"name":"Check no fullnode failures","error":"Error! The number of fullnode failures was > 0 (1), but must be 0!","causes":["Error! The number of fullnode failures was > 0 (1), but must be 0!"]}]}

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="1" errors="0" uuid="c2a457af-974c-4835-9555-250638e0d31c">
    <testsuite name="local" tests="1" disabled="0" errors="0" failures="1">
        <testcase name="CompositeNetworkTest(network:multi-region-network-emulation(two traffics test)) with ">
            <failure message="{&quot;errors&quot;:[{&quot;name&quot;:&quot;Check no fullnode failures&quot;,&quot;error&quot;:&quot;Error! The number of fullnode failures was &gt; 0 (1), but must be 0!&quot;,&quot;causes&quot;:[&quot;Error! The number of fullnode failures was &gt; 0 (1), but must be 0!&quot;]}]}"/>
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===

Swarm logs can be found here: See fgi output for more information.
[2026-03-10T22:04:27Z INFO  aptos_forge::backend::k8s::cluster_helper] Deleting namespace forge-e2e-pr-19019: Some(NamespaceStatus { conditions: None, phase: Some("Terminating") })
[2026-03-10T22:04:27Z INFO  aptos_forge::backend::k8s::cluster_helper] aptos-node resources for Forge removed in namespace: forge-e2e-pr-19019
[2026-03-10T22:04:27Z INFO  ureq::unit] sending request POST http://vmagent-victoria-metrics-agent.victoria-metrics.svc:8429/api/v1/import/prometheus

failures:
    CompositeNetworkTest

test result: FAILED. 0 passed; 0 soft failed; 1 hard failed; 0 filtered out

Debugging output:
NAME                                         READY   STATUS      RESTARTS   AGE
aptos-fullnode-0                             1/1     Running     0          12m
aptos-node-0-fullnode-eforgece0a8702-0       1/1     Running     0          14m
aptos-node-0-validator-0                     1/1     Running     0          14m
aptos-node-1-fullnode-eforgece0a8702-0       1/1     Running     0          14m
aptos-node-1-validator-0                     1/1     Running     0          14m
aptos-node-2-validator-0                     1/1     Running     0          14m
aptos-node-3-validator-0                     1/1     Running     0          14m
aptos-node-4-validator-0                     1/1     Running     0          14m
aptos-node-5-validator-0                     1/1     Running     0          14m
aptos-node-6-validator-0                     1/1     Running     0          14m
forge-pfn-deployer-xwtfh                     0/1     Completed   0          14m
forge-testnet-deployer-l9h42                 0/1     Completed   0          14m
genesis-aptos-genesis-eforgece0a8702-8krdf   0/1     Completed   0          14m
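The hard failure above comes from a success criterion named "Check no fullnode failures", which rejects the run as soon as a single fullnode failure is counted. A minimal sketch of a check with that shape follows; the function name and signature are assumptions made for illustration, and only the error text is copied from the output above. This is not Forge's actual implementation.

```rust
/// Fails when any fullnode failure was observed during the run.
/// Hypothetical sketch: the name and signature are assumptions; only the
/// error wording mirrors the Forge output in this comment.
pub fn check_no_fullnode_failures(num_failures: u64) -> Result<(), String> {
    if num_failures > 0 {
        return Err(format!(
            "Error! The number of fullnode failures was > 0 ({}), but must be 0!",
            num_failures
        ));
    }
    Ok(())
}
```

With a zero-tolerance threshold like this, one transient fullnode failure is enough to fail the whole suite, even though every pod in the debugging output is `Running` with 0 restarts by the time the listing was taken.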

Labels

CICD:run-e2e-tests: when this label is present, GitHub Actions will run all land-blocking e2e tests from the PR
