OPRUN-4392,OPRUN-4393: Add OLMv1 progress deadline QE tests + fixes by dtfranz · Pull Request #755 · openshift/operator-framework-operator-controller

dtfranz · 2026-06-23T13:06:49Z

Automate the ClusterExtension rollout failure coverage for OCP-88331 and OCP-88332 by building in-cluster bundle and catalog images for successful and failing bundle versions.

The new QE specs verify ProgressDeadlineExceeded on an initial failed rollout and ProbeFailure while upgrading to a bad revision under the BoxCutter runtime.

Supersedes #745

Additional changes:

- The test respects the CRD minimum of 10 minutes for the progress deadline. Upstream we would modify the CRD to allow a 1-minute deadline, but that is not possible downstream.
- Fixed the httpd script.

Test pass shown here: PR, CI run

Summary by CodeRabbit

Tests
- Added integration test coverage for ClusterExtension progress-deadline behavior, including validation of deadline-exceeded conditions and handling of rollout/upgrade scenarios with persistent failures.
New Features
- Added a helper option to set ClusterExtension ProgressDeadlineMinutes in test setups.
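
A helper like this is commonly written as a functional option. The following is a minimal sketch, assuming simplified stand-in types and an `int32` field; `ExtensionOption` and `NewClusterExtension` are hypothetical names used for illustration and may not match the helper package.

```go
package main

// ClusterExtensionSpec and ClusterExtension are simplified stand-ins for the
// real API types; only the field discussed in this PR is modeled.
type ClusterExtensionSpec struct {
	ProgressDeadlineMinutes int32 // field type is an assumption
}

type ClusterExtension struct {
	Spec ClusterExtensionSpec
}

// ExtensionOption mutates a ClusterExtension before it is created
// (hypothetical type name).
type ExtensionOption func(*ClusterExtension)

// WithProgressDeadlineMinutes sets the progress deadline and does nothing
// when the extension pointer is nil.
func WithProgressDeadlineMinutes(minutes int32) ExtensionOption {
	return func(ce *ClusterExtension) {
		if ce == nil {
			return
		}
		ce.Spec.ProgressDeadlineMinutes = minutes
	}
}

// NewClusterExtension applies the options in order (hypothetical constructor).
func NewClusterExtension(opts ...ExtensionOption) *ClusterExtension {
	ce := &ClusterExtension{}
	for _, opt := range opts {
		opt(ce)
	}
	return ce
}
```

A test setup would then call something like `NewClusterExtension(WithProgressDeadlineMinutes(10))`.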

openshift-ci-robot · 2026-06-23T13:06:54Z

@dtfranz: This pull request references OPRUN-4392 which is a valid jira issue.

This pull request references OPRUN-4393 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-06-23T13:07:09Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c0ea13ce-b5fe-4573-881d-1c3ae9ed6c36

📥 Commits

Reviewing files that changed from the base of the PR and between 1d56a97 and 41eb704.

📒 Files selected for processing (3)

openshift/tests-extension/.openshift-tests-extension/openshift_payload_olmv1.json
openshift/tests-extension/pkg/helpers/cluster_extension.go
openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go

🚧 Files skipped from review as they are similar to previous changes (1)

openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go

Walkthrough

Adds a QE Ginkgo test for ClusterExtension progress-deadline behavior and a helper option to set ProgressDeadlineMinutes. The test builds bundle and catalog images, installs a ClusterExtension, and asserts progress-state and probe conditions during failure and upgrade cases.

Changes

Progress deadline test flow

| Layer / File(s) | Summary |
| --- | --- |
| Progress-deadline option `openshift/tests-extension/pkg/helpers/cluster_extension.go` | Adds `WithProgressDeadlineMinutes`, which sets `ClusterExtension.Spec.ProgressDeadlineMinutes` when the extension pointer is non-nil. |
| Test scenarios and fixture bootstrap `openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` | Defines two Ginkgo cases under `NewOLMBoxCutterRuntime` and the fixture setup that creates namespaces, RBAC, a `ClusterCatalog`, and the `ClusterExtension` under test. |
| Image build and template generation `openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` | Builds bundle and catalog images from temporary tar archives and generated file maps, including placeholder substitution and catalog entry generation. |
| Polling and cleanup helpers `openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` | Adds polling helpers for `ClusterExtension` and `ClusterObjectSet` conditions, active revision checks, and delete cleanup that ignores `NotFound`. |

Estimated code review effort: 3 (Moderate) | ~25 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (14 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: new OLMv1 progress deadline QE tests with related fixes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PASS: The new Ginkgo Describe/It titles are fixed literals; no pod, namespace, UUID, timestamp, node, IP, or interpolated values appear in test names. |
| Test Structure And Quality | ✅ Passed | Each It covers one rollout-failure scenario, resources are cleaned up via DeferCleanup/cleanup helpers, and all Eventually waits have explicit timeouts with clear messages. |
| Microshift Test Compatibility | ✅ Passed | PASS: The new spec calls exutil.SkipMicroshift() in BeforeEach, so its BuildConfig/OLM/FeatureGate usage is gated off on MicroShift. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | The new QE tests only build images, create a namespace/ClusterCatalog, and assert conditions; they don't count nodes or require distinct hosts, so no SNO guard is needed. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | The patch adds QE test bundles and a progress-deadline helper only; no node selectors, affinity, spread constraints, tolerations, or topology-based replica logic were introduced. |
| Ote Binary Stdout Contract | ✅ Passed | No process-level stdout writes were added: the new spec only uses build/output helpers inside It blocks, and the helper option is side-effect free. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No IPv4-only assumptions or external connectivity found; it uses cluster-internal registry/images and binds httpd to :: for IPv6. |
| No-Weak-Crypto | ✅ Passed | Scanned the added helper and QE spec; they use no crypto packages, weak algorithms, custom crypto, or secret/token comparisons. |
| Container-Privileges | ✅ Passed | No added manifest uses privileged settings; the new CSV sets runAsNonRoot:true and allowPrivilegeEscalation:false, and the helper only sets progress deadline. |
| No-Sensitive-Data-In-Logs | ✅ Passed | The new test/helper code only logs Ginkgo step text and resource names; no passwords, tokens, API keys, PII, or customer data are logged. |


openshift-ci · 2026-06-23T13:07:42Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dtfranz
Once this PR has been reviewed and has the lgtm label, please assign perdasilva for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

DOWNSTREAM_OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go (1)
405-416: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Consider documenting the 12-minute timeout rationale.

The 12-minute timeout here is notably longer than other assertions (3-5 minutes). While this is correct for waiting beyond the 10-minute progress deadline set in test case 88331, a comment explaining the relationship would improve maintainability.
📝 Optional improvement
 func expectClusterObjectSetCondition(ctx context.Context, name, conditionType string, status metav1.ConditionStatus, reason string) {
+	// Timeout is 12 minutes to accommodate the 10-minute progress deadline in test case 88331,
+	// plus buffer time for controller processing and status updates.
 	eventually(func(g o.Gomega) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` around
lines 405 - 416, Add a comment above the eventually function call in
expectClusterObjectSetCondition to document why the 12-minute timeout is used.
The comment should explain that this timeout is intentionally longer than other
assertions (3-5 minutes) to wait beyond the 10-minute progress deadline
configured in test case 88331, establishing the relationship between the timeout
value and the deadline requirement for maintainability.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go`:
- Around line 263-264: The error message in the o.Expect assertion for the oc
start-build command includes the full output variable, which can contain
extensive build logs violating QE guidelines. Remove the output variable from
the o.Expect error message assertion at the line where start-build is run and
Run method is called, and instead log the output separately using g.By() before
the assertion or truncate the output to a reasonable size if the error needs to
include diagnostic information. This ensures large build logs are handled
through proper logging mechanisms rather than being embedded in the error
expectation message.

---

Nitpick comments:
In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go`:
- Around line 405-416: Add a comment above the eventually function call in
expectClusterObjectSetCondition to document why the 12-minute timeout is used.
The comment should explain that this timeout is intentionally longer than other
assertions (3-5 minutes) to wait beyond the 10-minute progress deadline
configured in test case 88331, establishing the relationship between the timeout
value and the deadline requirement for maintainability.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2af580ca-f800-4db0-8859-d59b8a61f185

📥 Commits

Reviewing files that changed from the base of the PR and between ecd140b and 1d56a97.

📒 Files selected for processing (2)

openshift/tests-extension/.openshift-tests-extension/openshift_payload_olmv1.json
openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go

coderabbitai · 2026-06-23T13:20:00Z

+	output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
+	o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Don't include large build output in error messages.

The output variable from oc start-build could contain extensive build logs (hundreds of lines). Including this directly in the o.Expect error message violates the QE guideline: "Don't put large log outputs in error messages (use proper log messages instead of o.Expect with large output)".

Consider logging the output separately with g.By() or truncating it:

♻️ Suggested fix

 output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
-o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)
+if err != nil {
+	g.By(fmt.Sprintf("Build output for %s:\n%s", name, output))
+	o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s", name)
+}

Or truncate the output:

 output, err := oc.AsAdmin().WithoutNamespace().Run("start-build").Args(name, "-n", namespace, "--from-archive="+archive, "--wait").Output()
-o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, output)
+truncated := output
+if len(truncated) > 500 {
+	truncated = truncated[:500] + "... (truncated)"
+}
+o.Expect(err).NotTo(o.HaveOccurred(), "failed to build image %s: %s", name, truncated)
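
Slicing a string by byte index can cut a multi-byte UTF-8 character in half. If the truncation route is taken, a small helper that backs up to a rune boundary avoids that. This is a sketch only; `truncateOutput` is a hypothetical name and not part of the PR:

```go
package main

import "unicode/utf8"

// truncateOutput returns s unchanged when it fits in max bytes. Otherwise it
// keeps at most max bytes, backs up to the nearest rune boundary so no
// character is split, and appends a marker.
func truncateOutput(s string, max int) string {
	if max < 0 {
		max = 0
	}
	if len(s) <= max {
		return s
	}
	cut := max
	// s[cut] is the first byte that would be dropped; if it is a continuation
	// byte, the cut point is inside a rune and must move left.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + "... (truncated)"
}
```

The assertion message would then use `truncateOutput(output, 500)` in place of the inline slice.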

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@openshift/tests-extension/test/qe/specs/olmv1_ce_progress_deadline.go` around lines 263 - 264, The error message in the o.Expect assertion for the oc start-build command includes the full output variable, which can contain extensive build logs violating QE guidelines. Remove the output variable from the o.Expect error message assertion at the line where start-build is run and Run method is called, and instead log the output separately using g.By() before the assertion or truncate the output to a reasonable size if the error needs to include diagnostic information. This ensures large build logs are handled through proper logging mechanisms rather than being embedded in the error expectation message.

Source: Coding guidelines

dtfranz · 2026-06-23T22:47:32Z

/retest

dtfranz · 2026-06-24T05:28:12Z