
feat: Triton 26.06 + TensorRT 11 upgrade and CVE remediation #4

Open
davidamacey wants to merge 7 commits into feat/port-operational-hardening from feat/triton-2606-cve-refresh

Conversation

@davidamacey

Owner

Stacked on #3 (auto-retargets to main when it merges).

  • Triton 26.06 (CUDA 13.3 / TensorRT 11 / Ubuntu 24.04) per NVIDIA security bulletins advising ≥ r26.03; torch 2.12.1+cu130; runs as the image's non-root triton-server user; apt security upgrades at build.
  • BREAKING: TRT 10→11 invalidates every existing .plan engine — see docs/MIGRATION_TRITON_26.md for the re-export procedure.
  • tensorrt-cu13==11.0.0.114 pinned to exactly match the server's TRT.
  • Monitoring pins: prometheus v3.12.0, grafana 13.1.0, loki 3.6.12 (non-root), node-exporter v1.10.2 — all previously :latest. Promtail (EOL 2026-03-02) replaced by Grafana Alloy v1.17.1.
  • OpenSearch 3.6.0 (3.0–3.2 carry HIGH CVEs: CVE-2025-9624, Netty CVE-2026-33870/71).
  • make scan targets + GitHub Actions Trivy scanning (fs + API image, SARIF upload, weekly cron, fails on unfixed HIGH/CRITICAL).
  • Export scripts made TRT-11-ready (guarded EXPLICIT_BATCH via shared export/trt_utils.py, dead pre-TRT10 fallback removed).
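
The point about replacing `:latest` with fixed tags can be checked mechanically. The following is a minimal sketch, not part of this PR: `is_pinned` is a hypothetical helper, and the repository prefixes in the sample list (other than `grafana/alloy`) are assumptions, since the description gives only short names and versions.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference has a digest or a tag other than 'latest'."""
    if "@sha256:" in image_ref:
        return True  # digest references are immutable
    # Only the last path segment can carry a tag; an earlier colon is a registry port.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the runtime resolves to :latest
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


# Versions are taken from the PR description; repository prefixes are assumed.
images = [
    "prom/prometheus:v3.12.0",
    "grafana/grafana:13.1.0",
    "grafana/loki:3.6.12",
    "prom/node-exporter:v1.10.2",
    "grafana/alloy:v1.17.1",
]

unpinned = [ref for ref in images if not is_pinned(ref)]
if unpinned:
    raise SystemExit(f"unpinned images: {unpinned}")
```

A check like this could run in CI against the image references in the compose files, so a stray `:latest` fails the build instead of surfacing during a later audit.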

… cu130 torch, non-root

- Base nvcr.io/nvidia/tritonserver:25.10-py3 -> 26.06-py3 per NVIDIA
  security bulletins advising >= r26.03.
- torch 2.5.1/cu124 -> 2.12.1/cu130, torchvision 0.27.1, opencv-headless
  4.13.0.92.
- apt-get upgrade layer pulls Ubuntu point-release fixes the monthly NGC
  tag lags behind.
- Runs as the image's unprivileged triton-server user (uid 1000).

BREAKING: TensorRT 10 -> 11 invalidates every existing .plan engine;
re-export required (docs/MIGRATION_TRITON_26.md).

- Both image stages run apt-get upgrade for Debian point-release fixes.
- tensorrt-cu12==10.13.3.9 -> tensorrt-cu13==11.0.0.114 (exact match to
  the TRT bundled in tritonserver:26.06; engines are version-locked).
- pyproject deps synced with requirements.txt (tensorrt, ultralytics<8.4).
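
Because engines are version-locked, an export script could fail fast when the installed TensorRT wheel drifts from the pin. This is a sketch under assumptions: `installed_trt_version` and `check_trt_pin` are hypothetical helpers, not code from this PR, and the distribution name and version are taken from the pin above.

```python
from importlib import metadata
from typing import Optional

EXPECTED_TRT = "11.0.0.114"  # the pin stated in this PR


def installed_trt_version(dist: str = "tensorrt-cu13") -> Optional[str]:
    """Return the installed wheel version, or None if the wheel is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_trt_pin(installed: Optional[str], expected: str = EXPECTED_TRT) -> None:
    """Raise if the installed TensorRT does not exactly match the server's TRT."""
    if installed != expected:
        raise RuntimeError(
            f"TensorRT {installed!r} does not match pinned {expected!r}; "
            "engines built here may not load in the server"
        )
```

Calling `check_trt_pin(installed_trt_version())` at the top of an export script turns a version mismatch into an immediate error rather than a failed engine load at serving time.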
…fana Alloy

- prometheus v3.12.0, grafana 13.1.0, loki 3.6.12, node-exporter v1.10.2
  (previously all :latest — unpinned images defeat CVE auditing and
  reproducible deploys).
- Promtail (EOL 2026-03-02) replaced by grafana/alloy v1.17.1 with an
  equivalent docker-discovery pipeline; docker.sock now mounted
  read-only.
- Loki no longer forced to root; runs as the image's uid 10001
  (existing volumes need a one-time chown — see migration doc).
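
Before or after the one-time chown, it can help to list which paths in the Loki data volume are still owned by another user. The sketch below is an illustration only: `paths_not_owned_by` is a hypothetical helper, the volume location is not given here and must be supplied by the operator, and the migration doc remains the authoritative procedure.

```python
import os
from pathlib import Path

LOKI_UID = 10001  # uid stated in this PR for the Loki image


def paths_not_owned_by(root: str, uid: int) -> list[str]:
    """Return every path under root (including root) whose owner is not uid."""
    root_path = Path(root)
    candidates = [root_path, *root_path.rglob("*")]
    # lstat() inspects symlinks themselves instead of following them.
    return [str(p) for p in candidates if p.lstat().st_uid != uid]
```

An empty result from `paths_not_owned_by(volume_dir, LOKI_UID)` suggests the chown covered the whole volume; any listed path would still be unwritable for the non-root Loki process.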

OpenSearch 3.0.0-3.2.0 carry known HIGH CVEs (CVE-2025-9624 query_string
DoS; Netty CVE-2026-33870/33871). 3.6.0 is the current stable; 3.3->3.6
is same-major so existing indices roll forward.

- make scan / scan-api / scan-triton wrap the existing
  scripts/security-scan.sh (trivy, grype, dockle, syft, hadolint).
- New workflow runs a Trivy filesystem scan (SARIF upload) and an API
  image scan on PRs, pushes to main, and a weekly cron; fails on
  unfixed HIGH/CRITICAL. The 20 GB Triton image is scanned locally via
  make scan-triton before any publish.
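
The severity gate can be pictured as a filter over a Trivy JSON report. This is a sketch, not the workflow's actual implementation: `blocking_findings` is a hypothetical helper, the CVE identifiers in the sample are placeholders, and because the text above does not say whether findings without an available fix count toward the failure, that choice is exposed as a parameter.

```python
BLOCKING = {"HIGH", "CRITICAL"}


def blocking_findings(report: dict, ignore_unfixed: bool = False) -> list[str]:
    """Collect IDs of HIGH/CRITICAL vulnerabilities from a Trivy JSON report."""
    ids = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" may be missing or null for clean targets.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") not in BLOCKING:
                continue
            if ignore_unfixed and not vuln.get("FixedVersion"):
                continue  # optionally skip findings with no fixed version
            ids.append(vuln["VulnerabilityID"])
    return ids


# Placeholder report; the IDs are illustrative, not real findings.
sample = {
    "Results": [
        {
            "Target": "requirements.txt",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH", "FixedVersion": "1.2.3"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "CRITICAL"},
                {"VulnerabilityID": "CVE-0000-0003", "Severity": "LOW", "FixedVersion": "2.0.0"},
            ],
        }
    ]
}

print(blocking_findings(sample))
print(blocking_findings(sample, ignore_unfixed=True))
```

A CI step would exit non-zero whenever the returned list is non-empty.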

- New export/trt_utils.py create_explicit_network(): the EXPLICIT_BATCH
  creation flag was deprecated in TRT 10 and removed in newer majors;
  all six export scripts now go through the guarded helper.
- Drop the pre-TRT10 build_engine fallback (dead code under the
  tensorrt-cu13 11.x pin).

- docs/MIGRATION_TRITON_26.md: mandatory engine re-export, loki volume
  chown, promtail->Alloy conversion, /health semantics, pin matrix.
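
For readers unfamiliar with the guarded EXPLICIT_BATCH pattern mentioned in the export-script change, here is one way such a helper might look. It is a sketch under assumptions, not the contents of export/trt_utils.py: the signature is hypothetical, and the TensorRT module is passed in as `trt_module` only so the logic can be exercised without TensorRT installed.

```python
def explicit_batch_flags(trt_module) -> int:
    """Return the network creation bitmask, tolerating a missing EXPLICIT_BATCH."""
    flag_enum = getattr(trt_module, "NetworkDefinitionCreationFlag", None)
    flag = getattr(flag_enum, "EXPLICIT_BATCH", None)
    if flag is None:
        # Flag removed: networks are explicit-batch by default, so pass no flags.
        return 0
    # Where the flag still exists, set its bit in the creation mask.
    return 1 << int(flag)


def create_explicit_network(builder, trt_module):
    """Create a network through the builder using the guarded flag mask."""
    return builder.create_network(explicit_batch_flags(trt_module))
```

In an export script this would be called as `create_explicit_network(builder, trt)` after `import tensorrt as trt`, which keeps the version check in one place instead of repeating it in each of the six scripts.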