docs(design-proposals): add Terraform/OpenTofu backend for AGL#11

Draft
Maxim Kitsunoff (kitsunoff) wants to merge 1 commit into cozystack:main from kitsunoff:feat/agl-tofu-backends-design

Conversation

@kitsunoff

Summary

Adds a design proposal for a Terraform/OpenTofu backend for Cozystack's Application Generation Layer (AGL).

Today AGL maps user-facing kinds (Postgres, Kafka, Bucket, ...) to Flux HelmReleases. Helm is the only supported backend, which makes cloud-side primitives (VPCs, DNS zones, managed databases, IAM bindings, external buckets) awkward to express through the same abstraction. This proposal adds the Terraform custom resources from flux-iac/tofu-controller as a second backend, so platform engineers can describe cloud resources through AGL the same way they describe in-cluster workloads.

Contents

design-proposals/agl-tofu-backends/:

  • README.md — proposal overview, context, goals, non-goals, comparison of the two alternatives, recommendation, user-facing changes, upgrade/rollback, security, failure cases, testing, rollout, open questions.
  • draft-1-parallel-tofu-stack.md — alternative 1: a new TofuApplicationDefinition CRD with its own aggregation apiserver and reconciler, mirroring the existing Helm AGL one-to-one. Minimal regression risk, ~80% code duplication.
  • draft-2-pluggable-backend.md — alternative 2: refactor AGL so Helm and Terraform are two implementations of one Backend Go interface behind a single ApplicationDefinition CRD. Larger refactor of the hot path, much lower long-term cost.
  • presentation.md — Marp-flavoured slide deck summarising both drafts side-by-side.
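
To make the Draft 2 idea concrete, here is a minimal sketch of what "two implementations of one Backend Go interface" could look like. Everything in it is an assumption for illustration: the `Backend` method set, the `Application` and `ChildRef` types, and the `Select` helper are hypothetical and are not taken from the draft itself. Only the child kinds (`HelmRelease`, `Terraform`) and the `prefix + name` naming come from this PR.

```go
package main

import "fmt"

// Application is a trimmed, hypothetical stand-in for the AGL Application.
type Application struct {
	Name      string
	Namespace string
}

// ChildRef identifies the object a backend would create for an Application.
type ChildRef struct {
	Kind      string
	Name      string
	Namespace string
}

// Backend is a hypothetical interface; the real method set may differ.
type Backend interface {
	Render(app Application) ChildRef
}

type helmBackend struct{ prefix string }

func (h helmBackend) Render(app Application) ChildRef {
	return ChildRef{Kind: "HelmRelease", Name: h.prefix + app.Name, Namespace: app.Namespace}
}

type terraformBackend struct{ prefix string }

func (t terraformBackend) Render(app Application) ChildRef {
	return ChildRef{Kind: "Terraform", Name: t.prefix + app.Name, Namespace: app.Namespace}
}

// Select picks an implementation from the definition's backend type.
func Select(backendType, prefix string) (Backend, error) {
	switch backendType {
	case "Helm":
		return helmBackend{prefix}, nil
	case "Terraform":
		return terraformBackend{prefix}, nil
	}
	return nil, fmt.Errorf("unknown backend type %q", backendType)
}

func main() {
	app := Application{Name: "vpc", Namespace: "tenant-a"}
	for _, typ := range []string{"Helm", "Terraform"} {
		b, err := Select(typ, "demo-")
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %+v\n", typ, b.Render(app))
	}
}
```

The point of the sketch is that callers depend only on `Backend`, so the shared reconcile path does not need to know which child kind it is producing.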

Status

Draft. The two drafts are alternatives, not sibling proposals — the intent is to decide between them (or commit to the hybrid recommendation: Draft 1 as PoC, then Draft 2 refactor) before implementation starts.

Open questions for reviewers

  • Which alternative should land — Draft 1, Draft 2, or the hybrid path?
  • Group naming for Terraform-backed kinds: stay under apps.cozystack.io or use a distinct tofu.apps.cozystack.io?
  • Per-instance backend override (runner pod template / cloud identity per tenant) — needed in v1 or deferred?
  • Plan-approval UX when approvePlan is manual — status field + dashboard, or a separate approve subresource on Application?

Reading these documents

Rendered, browsable version (with navigation and the Marp slide deck as HTML): https://kitsunoff.github.io/cozystack-agl-backends/.

Source repository: https://github.com/kitsunoff/cozystack-agl-backends.

Add a design proposal for extending Cozystack's Application Generation
Layer to support Terraform/OpenTofu as a release backend alongside Helm,
enabling AGL to manage cloud-side primitives (VPCs, DNS zones, managed
services) under the same abstraction.

Two alternative designs are documented for review:
- Draft 1: a parallel TofuApplicationDefinition stack mirroring the
  existing Helm AGL one-to-one (minimal regression risk, high
  duplication).
- Draft 2: a pluggable Backend interface behind a single
  ApplicationDefinition CRD (larger refactor, lower long-term cost).

A short Marp slide deck summarising both drafts side-by-side is included.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: ZverGuy <maximbel2003@gmail.com>
@coderabbitai

coderabbitai Bot commented May 23, 2026

Important

Review skipped

Draft detected.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces two design proposals for integrating OpenTofu/Terraform into Cozystack's Application Generation Layer (AGL). Draft 1 outlines a parallel stack approach, while Draft 2 proposes a refactored pluggable backend architecture to unify different deployment technologies. Feedback highlights several design considerations, including the need for per-tenant identity isolation in runner pods, maintaining a unified API group to hide implementation details, and establishing clear conventions for variable mapping and status projection to ensure a consistent developer experience across backends.

Comment on lines +73 to +75
runnerPodTemplate: # cloud creds, IRSA, custom image
  spec:
    serviceAccountName: tofu-runner

medium

The runnerPodTemplate is defined at the cluster-scoped TofuApplicationDefinition level. In multi-tenant environments, different Application instances (e.g., belonging to different tenants) often require distinct cloud identities, such as different AWS IAM roles via IRSA. Hardcoding the serviceAccountName in the definition prevents per-tenant identity isolation. Consider allowing the Application spec to reference a local ServiceAccount or providing a mechanism to inject tenant-specific pod templates.
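
One way to read this suggestion is as a simple precedence rule: an instance-level ServiceAccount, when present, wins over the definition-level default. The sketch below illustrates that rule only. The types and the `RunnerServiceAccount` field are hypothetical; they are not tofu-controller or Cozystack types, and the proposal still lists per-instance overrides as an open question.

```go
package main

import "fmt"

// RunnerSpec is a simplified, hypothetical stand-in for the runner pod
// template fields discussed above.
type RunnerSpec struct {
	ServiceAccountName string
}

// Definition models the cluster-scoped default (hypothetical shape).
type Definition struct {
	Runner RunnerSpec
}

// Application models a namespaced instance with an optional override.
type Application struct {
	Namespace            string
	RunnerServiceAccount string // empty means "use the definition default"
}

// ResolveRunner returns the runner spec for one Application, preferring the
// instance-level ServiceAccount over the definition-level default.
func ResolveRunner(def Definition, app Application) RunnerSpec {
	out := def.Runner // value copy, so the shared definition is not mutated
	if app.RunnerServiceAccount != "" {
		out.ServiceAccountName = app.RunnerServiceAccount
	}
	return out
}

func main() {
	def := Definition{Runner: RunnerSpec{ServiceAccountName: "tofu-runner"}}
	tenantA := Application{Namespace: "tenant-a", RunnerServiceAccount: "tenant-a-runner"}
	tenantB := Application{Namespace: "tenant-b"}

	fmt.Println(tenantA.Namespace, ResolveRunner(def, tenantA).ServiceAccountName)
	fmt.Println(tenantB.Namespace, ResolveRunner(def, tenantB).ServiceAccountName)
}
```

With a rule like this, each tenant namespace could bind its own ServiceAccount to a distinct cloud identity, while definitions that do not need isolation keep working unchanged.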

### API server changes

- New binary `cmd/cozystack-tofu-api/main.go` (or a feature-flagged subcommand) — boots an aggregation API server identical in shape to `cozystack-api`.
- Group: `tofu.apps.cozystack.io/v1alpha1` (separate to avoid kind collisions with Helm-side AGL).

medium

Introducing a separate API group (tofu.apps.cozystack.io) for Terraform-backed resources exposes the implementation details to the end user. This fragmentation contradicts the goal of the Application Generation Layer, which is to provide a unified abstraction for infrastructure. Users should ideally interact with the same API group regardless of whether the backend is Helm, Terraform, or any other technology.

Application.Name → Terraform.Name = prefix + Application.Name
Application.Namespace → Terraform.Namespace
Application.Labels → Terraform.Labels (with LabelPrefix)
Application.Spec (RawExtension) → Terraform.Spec.Vars (flattened, top-level keys = vars)

medium

The proposal to flatten Application.Spec into Terraform.Spec.Vars needs a clearly defined strategy for handling nested objects. If the openAPISchema defines a hierarchical structure, a simple flattening might lead to key collisions or variable names that don't match the Terraform module's expectations. Explicitly documenting the flattening convention (e.g., using underscores for delimiters) or restricting Application.Spec to a flat map would avoid ambiguity.
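
As an illustration of the underscore convention mentioned above, the following sketch flattens a nested spec and rejects inputs where two different paths would produce the same variable name. It is an assumption about how such a convention might be implemented, not the proposal's actual behaviour; `Flatten` and `flattenInto` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Flatten turns a nested spec into a flat variable map, joining path segments
// with "_". It returns an error if two different paths produce the same key.
func Flatten(spec map[string]any) (map[string]any, error) {
	out := make(map[string]any)
	if err := flattenInto(out, "", spec); err != nil {
		return nil, err
	}
	return out, nil
}

func flattenInto(out map[string]any, prefix string, in map[string]any) error {
	// Visit keys in sorted order so that errors are deterministic.
	keys := make([]string, 0, len(in))
	for k := range in {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		name := k
		if prefix != "" {
			name = prefix + "_" + k
		}
		if child, ok := in[k].(map[string]any); ok {
			if err := flattenInto(out, name, child); err != nil {
				return err
			}
			continue
		}
		if _, dup := out[name]; dup {
			return fmt.Errorf("flattened key %q is produced by more than one path", name)
		}
		out[name] = in[k]
	}
	return nil
}

func main() {
	ok := map[string]any{
		"replicas": 3,
		"network":  map[string]any{"cidr": "10.0.0.0/16"},
	}
	fmt.Println(Flatten(ok))

	// "a_b" and a.b both flatten to "a_b", so this input is rejected.
	bad := map[string]any{"a_b": 1, "a": map[string]any{"b": 2}}
	fmt.Println(Flatten(bad))
}
```

Terraform input variables can also be declared with object types, so passing nested values through unchanged is another option; whichever convention is chosen, documenting it explicitly avoids the ambiguity raised here.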

TofuAppDef.Terraform.RunnerPod… → Terraform.Spec.RunnerPodTemplate
```

`Application.Spec` keys become Terraform input variables. A small validator on the way in: keys must match `^[a-z_][a-z0-9_]*$` (HCL identifier).

medium

The validator ^[a-z_][a-z0-9_]*$ enforces snake_case for input variables. Since Kubernetes properties typically use camelCase, this creates a naming convention conflict for package authors. Forcing users to use snake_case in their Application manifests to satisfy Terraform requirements breaks the Kubernetes-native feel of the AGL. An automatic translation layer (camelCase to snake_case) or an explicit mapping field would improve the developer experience.
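
A translation layer of the kind suggested here could be quite small. The sketch below converts camelCase keys to snake_case, then applies the same `^[a-z_][a-z0-9_]*$` check quoted above and rejects keys that collide after translation. The function names and the acronym handling are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// hclIdent is the validator pattern quoted in the proposal.
var hclIdent = regexp.MustCompile(`^[a-z_][a-z0-9_]*$`)

// ToSnakeCase converts a camelCase key to snake_case. An underscore is
// inserted before an upper-case letter that follows a lower-case letter or
// digit, and before the last letter of an acronym that starts a new word.
func ToSnakeCase(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		if !unicode.IsUpper(r) {
			b.WriteRune(r)
			continue
		}
		if i > 0 {
			prev := runes[i-1]
			afterLower := unicode.IsLower(prev) || unicode.IsDigit(prev)
			endsAcronym := unicode.IsUpper(prev) && i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if afterLower || endsAcronym {
				b.WriteByte('_')
			}
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}

// TranslateKeys renames top-level keys and validates the resulting names.
func TranslateKeys(spec map[string]any) (map[string]any, error) {
	out := make(map[string]any, len(spec))
	for k, v := range spec {
		name := ToSnakeCase(k)
		if !hclIdent.MatchString(name) {
			return nil, fmt.Errorf("key %q translates to %q, which is not a valid variable name", k, name)
		}
		if _, dup := out[name]; dup {
			return nil, fmt.Errorf("more than one key translates to %q", name)
		}
		out[name] = v
	}
	return out, nil
}

func main() {
	fmt.Println(ToSnakeCase("storageClass"))
	fmt.Println(ToSnakeCase("enableHTTPRoute"))
	fmt.Println(TranslateKeys(map[string]any{"storageClass": "fast", "replicas": 3}))
}
```

An explicit mapping field, the other option mentioned above, trades this convenience for full control when a module's variable names do not follow a predictable pattern.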

Comment on lines +186 to +192
type Backend struct {
	// +kubebuilder:validation:Enum=Helm;Terraform
	Type BackendType `json:"type"`

	Helm      *HelmBackend      `json:"helm,omitempty"`
	Terraform *TerraformBackend `json:"terraform,omitempty"`
}

medium

The Backend struct uses a discriminated union pattern but lacks validation to ensure that the field corresponding to the Type is actually populated. For example, if Type is set to Terraform, the Terraform field should be required. Adding +kubebuilder:validation rules or implementing a Validate method for the CRD would prevent invalid configurations from being persisted.
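
A `Validate` method along the lines suggested here might look like the sketch below. The `Backend` struct mirrors the quoted snippet; the empty `HelmBackend` and `TerraformBackend` placeholders, the constant names, and the error messages are assumptions, since the real field sets are not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

type BackendType string

const (
	BackendHelm      BackendType = "Helm"
	BackendTerraform BackendType = "Terraform"
)

// Placeholder configs; the real fields are not shown in the review thread.
type HelmBackend struct{}
type TerraformBackend struct{}

type Backend struct {
	Type      BackendType       `json:"type"`
	Helm      *HelmBackend      `json:"helm,omitempty"`
	Terraform *TerraformBackend `json:"terraform,omitempty"`
}

// Validate checks that exactly the member matching Type is populated.
func (b Backend) Validate() error {
	switch b.Type {
	case BackendHelm:
		if b.Helm == nil {
			return errors.New("helm must be set when type is Helm")
		}
		if b.Terraform != nil {
			return errors.New("terraform must not be set when type is Helm")
		}
	case BackendTerraform:
		if b.Terraform == nil {
			return errors.New("terraform must be set when type is Terraform")
		}
		if b.Helm != nil {
			return errors.New("helm must not be set when type is Terraform")
		}
	default:
		return fmt.Errorf("unsupported backend type %q", b.Type)
	}
	return nil
}

func main() {
	valid := Backend{Type: BackendTerraform, Terraform: &TerraformBackend{}}
	missing := Backend{Type: BackendTerraform}
	fmt.Println(valid.Validate())
	fmt.Println(missing.Validate())
}
```

The same invariant could probably also be expressed declaratively with a kubebuilder `XValidation` (CEL) marker on the struct, which would reject invalid objects at admission time rather than in controller code.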

Message string `json:"message,omitempty"`

// Backend-specific, raw JSON. Schema documented per backend.
Backend *runtime.RawExtension `json:"backend,omitempty"`

medium

Using runtime.RawExtension for backend-specific status data makes it difficult for generic consumers (like the Cozystack dashboard or CLI) to provide a consistent user experience without being aware of every backend's internal structure. To maintain the "Generation Layer" abstraction, consider defining a structured Outputs map or a standard set of status fields that all backends must populate, using RawExtension only for truly opaque, backend-specific metadata.
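
To show what a structured `Outputs` map next to the opaque field could buy a generic consumer, here is a small sketch. The `ApplicationStatus` shape, the `outputs` JSON key, the `Output` and `ParseStatus` helpers, and the sample `endpoint` and `planId` values are all hypothetical. `json.RawMessage` stands in for `runtime.RawExtension` only to keep the example free of Kubernetes dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ApplicationStatus is a hypothetical status shape, not the proposal's type.
type ApplicationStatus struct {
	Message string `json:"message,omitempty"`

	// Outputs is a backend-neutral map that dashboards and CLIs can read
	// without knowing which backend produced it.
	Outputs map[string]string `json:"outputs,omitempty"`

	// Backend carries opaque, backend-specific metadata.
	Backend json.RawMessage `json:"backend,omitempty"`
}

// Output looks up one well-known output without inspecting Backend.
func (s ApplicationStatus) Output(key string) (string, bool) {
	v, ok := s.Outputs[key]
	return v, ok
}

// ParseStatus decodes a status document.
func ParseStatus(raw []byte) (ApplicationStatus, error) {
	var st ApplicationStatus
	err := json.Unmarshal(raw, &st)
	return st, err
}

func main() {
	raw := []byte(`{
	  "message": "applied",
	  "outputs": {"endpoint": "db.example.internal:5432"},
	  "backend": {"planId": "plan-123"}
	}`)

	st, err := ParseStatus(raw)
	if err != nil {
		panic(err)
	}
	if endpoint, ok := st.Output("endpoint"); ok {
		fmt.Println("endpoint:", endpoint)
	}
	fmt.Println("opaque backend bytes:", len(st.Backend))
}
```

A consumer written against `Outputs` keeps working when a new backend is added, while anything that needs backend internals can still decode `Backend` explicitly.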
