
[Robustness] Compaction from failpoint injection missing in the report #20646

@serathius


Bug report criteria

What happened?

Run https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-etcd-robustness-main-s390x/1962827218790912000 reported the linearization as illegal.

After investigation we found a compaction(601) request that failed with "mvcc: required revision has been compacted", even though the report shows no compaction with a higher revision before it.
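
For context, etcd rejects a compaction at or below the already-compacted revision with exactly this error, which is why a failing compaction(601) implies that a compaction with a higher revision was applied first. A minimal sketch of that behavior against a live member (the endpoint and revisions are illustrative and assume the store has already reached revision 822):

    package main

    import (
        "context"
        "errors"
        "fmt"

        "go.etcd.io/etcd/api/v3/v3rpc/rpctypes"
        clientv3 "go.etcd.io/etcd/client/v3"
    )

    func main() {
        cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
        if err != nil {
            panic(err)
        }
        defer cli.Close()

        ctx := context.Background()
        // Compact to revision 822 first...
        if _, err := cli.Compact(ctx, 822); err != nil {
            panic(err)
        }
        // ...then compacting to the lower revision 601 fails; the client maps
        // the gRPC error to rpctypes.ErrCompacted, i.e. "mvcc: required
        // revision has been compacted".
        _, err = cli.Compact(ctx, 601)
        fmt.Println(errors.Is(err, rpctypes.ErrCompacted)) // true
    }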

Looking at the etcd WAL, we found that there was a compaction(822) entry:

go run ./tools/etcd-dump-logs/ --start-index 0 ./tests/robustness/testdata/1756809623277583161/server-TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0/ | grep compaction
   2           941      norm    header:<ID:15517320784389033912 > compaction:<revision:281 > 
   2          1237      norm    header:<ID:15517320784389034212 > compaction:<revision:822 > 
   2          1238      norm    header:<ID:15517320784389034211 > compaction:<revision:601 > 
   3          1244      norm    header:<ID:15517320784390884875 > compaction:<revision:821 > 
   3          1728      norm    header:<ID:15517320784390885363 > compaction:<revision:824 > 
   3          2152      norm    header:<ID:15517320784390885787 > compaction:<revision:1040 > 
   3          2544      norm    header:<ID:15517320784390886179 > compaction:<revision:1327 > 
   3          2995      norm    header:<ID:15517320784390886630 > compaction:<revision:1602 >

However, there is no such compaction in the client JSON reports. The missing piece was found in the etcd logs:

/tmp/etcd-main-failpoints/bin/etcd (TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0) (18235): {"level":"info","ts":"2025-09-02T10:40:18.502065Z","caller":"mvcc/index.go:194","msg":"compact tree index","revision":822}
/tmp/etcd-main-failpoints/bin/etcd (TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0) (18235): {"level":"warn","ts":"2025-09-02T10:40:18.502836Z","caller":"txn/util.go:93","msg":"apply request took too long","took":"2.349812622s","expected-duration":"100ms","prefix":"","request":"header:<ID:15517320784389034212 > compaction:<revision:822 > ","response":"size:5"}
/tmp/etcd-main-failpoints/bin/etcd (TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0) (18235): {"level":"warn","ts":"2025-09-02T10:40:18.503016Z","caller":"v3rpc/interceptor.go:202","msg":"request stats","start time":"2025-09-02T10:40:15.571559Z","time spent":"2.931448901s","remote":"127.0.0.1:33114","response type":"/etcdserverpb.KV/Compact","request count":-1,"request size":-1,"response count":-1,"response size":-1,"request content":""}
/tmp/etcd-main-failpoints/bin/etcd (TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0) (18235): {"level":"panic","ts":"2025-09-02T10:40:18.503000Z","caller":"schedule/schedule.go:202","msg":"execute job failed","job":"kvstore_compact","panic":"failpoint panic: {}","stacktrace":"go.etcd.io/etcd/pkg/v3/schedule.(*fifo).executeJob.func1\n\tgo.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:202\nruntime.gopanic\n\truntime/panic.go:792\ngo.etcd.io/gofail/runtime.actPanic\n\tgo.etcd.io/gofail@v0.2.0/runtime/terms.go:324\ngo.etcd.io/gofail/runtime.(*term).do\n\tgo.etcd.io/gofail@v0.2.0/runtime/terms.go:293\ngo.etcd.io/gofail/runtime.(*terms).eval\n\tgo.etcd.io/gofail@v0.2.0/runtime/terms.go:108\ngo.etcd.io/gofail/runtime.(*Failpoint).Acquire\n\tgo.etcd.io/gofail@v0.2.0/runtime/failpoint.go:45\ngo.etcd.io/etcd/server/v3/storage/mvcc.(*store).scheduleCompaction\n\tgo.etcd.io/etcd/server/v3/storage/mvcc/kvstore_compaction.go:64\ngo.etcd.io/etcd/server/v3/storage/mvcc.(*store).compact.func1\n\tgo.etcd.io/etcd/server/v3/storage/mvcc/kvstore.go:240\ngo.etcd.io/etcd/pkg/v3/schedule.job.Do\n\tgo.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:41\ngo.etcd.io/etcd/pkg/v3/schedule.(*fifo).executeJob\n\tgo.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:206\ngo.etcd.io/etcd/pkg/v3/schedule.(*fifo).run\n\tgo.etcd.io/etcd/pkg/v3@v3.6.0-alpha.0/schedule/schedule.go:187"}
/tmp/etcd-main-failpoints/bin/etcd (TestRobustnessExploratoryKubernetesHighTrafficClusterOfSize1-test-0) (18235): panic: failpoint panic: {} [recovered]

There was a compaction request for revision 822, but it was issued as part of failpoint injection. Such compaction requests are executed through a different code path than normal client traffic, so the code that collects the client reports most likely has a bug that drops them.

What did you expect to happen?

A compaction executed as part of failpoint injection should be preserved in the client report, allowing it to be included in linearization.
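
One way this could look (a minimal sketch under assumptions, not the actual robustness-framework code; RecordingClient and CompactOp are hypothetical names): route failpoint-injected compactions through the same recording wrapper that the traffic clients use, and record the call even when the server panics mid-compaction:

    package main

    import (
        "context"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    // CompactOp is a hypothetical entry for the client JSON report.
    type CompactOp struct {
        Revision int64
        Start    time.Time
        End      time.Time
        Err      error
    }

    // RecordingClient wraps clientv3.Client the way the robustness traffic
    // clients record their KV operations.
    type RecordingClient struct {
        c   *clientv3.Client
        ops []CompactOp // flushed into the client JSON report after the test
    }

    // Compact issues the compaction and records it whether or not it
    // succeeds, so a failpoint that panics the server mid-compaction (as in
    // the logs above) still leaves a trace for linearization.
    func (r *RecordingClient) Compact(ctx context.Context, rev int64) error {
        op := CompactOp{Revision: rev, Start: time.Now()}
        _, err := r.c.Compact(ctx, rev)
        op.End, op.Err = time.Now(), err
        r.ops = append(r.ops, op)
        return err
    }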

How can we reproduce it (as minimally and precisely as possible)?

n/a

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
# paste output here

$ etcdctl version
# paste output here

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output
