build(web_api): slim CPU/GPU extras + image size reduction#1263
magic-vladyslav wants to merge 6 commits into
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Coverage diff, main vs. #1263:
| Metric | main | #1263 | +/- |
|---|---|---|---|
| Coverage | 100.00% | 100.00% | |
| Files | 211 | 211 | |
| Lines | 11292 | 11314 | +22 |
| Hits | 11292 | 11314 | +22 |
This would break tensorrt, which requires CUDA 13. What needs CUDA 12.1?
I had to modify the dependency installation partially due to cutie (but not only). Also please make sure to rerun
I think it's because Cloud Run's drivers for the L4 GPU can't be updated: https://github.com/ZettaAI/zetta_utils/blob/main/web_api/gpu.Dockerfile#L3-L6
Replace the monolithic `modules` extra in both web_api images with new
`web_api` (CPU) and `web_api-gpu` extras that pull only what web_api/app and
internal.alignment actually import, dropping training/mazepa_addons/meshing/
skeletonization/chunkedgraph/segmentation-native and tensorrt from the
deploy images.
- pyproject: add `web-api-base` + `web-api`/`web-api-gpu` leaf extras and a
`cutie` sub-extra (still pulled by `segmentation`). Restore torch>=2.11 on
the non-web extras (training/alignment/montaging) while the web_api path
stays torch>=2.5 so the cu121 GPU base image keeps torch 2.5.1.
- web_api resolves CPU-only torch via the pytorch-cpu index ([tool.uv.sources]
+ conflicts), so installing without a GPU pulls no nvidia-* CUDA wheels.
- web_api-gpu omits the gpu/tensorrt extra: web_api only calls
convnet.load_model with tensorrt_enabled=False, so the CUDA-13 stack is dead
weight on the CUDA-12.1 base.
- Dockerfiles install the slim extras (CPU: pinned --no-deps + cpu torch index,
cchardet shim retained for cutie; GPU: resolution on the cu121 base). Drop the
abiss/waterz/lsd build machinery and the `zetta --help` check (the CLI pulls
kubernetes); smoke-import app.main instead.
- update_pinned_requirements.sh exports requirements.web_api{,_gpu}.txt;
install_zutils gains web_api/web_api_gpu modes; web_api/requirements.txt is
removed (single source of truth is the extra).
- CI: add web-api-extras-build (clean CPU install + smoke import + heavy-package
absence assert) and web-api-gpu-build (full GPU docker build) jobs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
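The "no nvidia-* CUDA wheels" claim for the CPU extra can be checked after install. Below is a minimal sketch, assuming it runs inside the environment where the slim CPU extra was installed. The helper names are illustrative and not part of this PR; the `+cpu` suffix is the local version tag the description mentions for the CPU torch build.

```python
from importlib import metadata


def cuda_wheel_names(dist_names):
    """Return the normalized names that look like NVIDIA CUDA wheels (nvidia-*)."""
    normalized = (name.lower().replace("_", "-") for name in dist_names)
    return sorted(n for n in normalized if n.startswith("nvidia-"))


def installed_distribution_names():
    """Names of every distribution visible to the current interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            names.append(name)
    return names


def is_cpu_torch(version):
    """True if a torch version string carries the "+cpu" local version tag."""
    return version.endswith("+cpu")
```

In a CI step one could assert that `cuda_wheel_names(installed_distribution_names())` is empty and that `is_cpu_torch(metadata.version("torch"))` holds.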
Run the two image variants concurrently by default (--no-parallel to opt out,
and automatic when only one variant is selected), streaming per-variant
prefixed, line-buffered output so the interleaved logs stay readable.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
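The build script itself is not shown in this conversation, so the following is only a sketch of how concurrent builds with per-variant prefixed, line-buffered output could be done. The function names and the `{name: command}` input shape are assumptions made for illustration; in the real script the commands would presumably be `docker build` invocations.

```python
import subprocess
import sys
import threading


def stream_prefixed(name, cmd, out=sys.stdout, lock=None):
    """Run `cmd`, writing each output line to `out` as "[name] line"; return the exit code."""
    lock = lock or threading.Lock()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so ordering within a variant is kept
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    for line in proc.stdout:
        text = line.rstrip("\n")
        with lock:  # write one whole line at a time so prefixes never mix mid-line
            out.write(f"[{name}] {text}\n")
            out.flush()
    return proc.wait()


def run_variants(commands, parallel=True, out=sys.stdout):
    """Run a {name: cmd} mapping concurrently (or sequentially); return {name: exit_code}."""
    lock = threading.Lock()
    results = {}

    def worker(name, cmd):
        results[name] = stream_prefixed(name, cmd, out=out, lock=lock)

    # Fall back to sequential runs when asked to, or when there is a single variant.
    if not parallel or len(commands) < 2:
        for name, cmd in commands.items():
            worker(name, cmd)
        return results

    threads = [threading.Thread(target=worker, args=item) for item in commands.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Holding a lock around each write is what keeps the interleaved log readable: lines from the two variants may alternate, but no line is ever split.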
- web-api-extras-build: set UV_INDEX_STRATEGY=unsafe-best-match so uv finds exact
  pins on PyPI even though the pytorch CPU index mirrors some packages (e.g.
  certifi) at older versions; default first-index strategy stopped there.
- gpu.Dockerfile: restore the cchardet stub + faust-cchardet shim before the
  resolution install, since cutie's cchardet>=2.1.7 does not build on the base
  image's Python.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bash assert step reported all packages absent but still exited 1
under bash -l {0} (login shell + conda). Replace it with an
importlib.metadata check + sys.exit so the result is deterministic and
self-documenting.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
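The replacement check is not quoted in this conversation. A sketch of what an `importlib.metadata`-based absence check could look like follows; the package list is the one named in the PR description, while the function names are illustrative.

```python
from importlib import metadata

# Packages the PR description says must be absent from the slim CPU install.
HEAVY_PACKAGES = (
    "lightning",
    "wandb",
    "torchmetrics",
    "mitmproxy",
    "awscli",
    "kubernetes",
    "tensorrt",
)


def installed_heavy_packages(names=HEAVY_PACKAGES):
    """Return the subset of `names` that resolve to an installed distribution."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


def main():
    """Print the result and return a process exit code (0 = all absent)."""
    found = installed_heavy_packages()
    if found:
        print("unexpected heavy packages installed: " + ", ".join(found))
        return 1
    print("ok: no heavy packages installed")
    return 0
```

A CI step would end with `sys.exit(main())`, so the exit status depends only on the explicit return value rather than on shell semantics.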
See comments, but a major one regarding the web_api/gpu build:
The R535 limitation on Cloud Run does not prevent you from using newer PyTorch / CUDA driver versions.
- Updating to PyTorch 2.12 + CUDA 12.6 in the requirements should be straightforward, thanks to minor-version compatibility. That already resolves the PyTorch 2.5 <-> 2.12 issue.
- CUDA 13.0 is also possible with `apt install cuda-compat-13-0` and prepending `LD_LIBRARY_PATH` with `/usr/local/cuda-13.0/compat`, which should contain the `libcuda.so`.
- I would drop the pytorch/pytorch image in either case and rely on the pinned requirements files to install the version zutils actually prefers. Otherwise we need to keep paying close attention to the requirements and update the base image version to stay in sync with the resolved dependencies.
    # backend; pip still pulls sub-package deps from the index at install time.
    [[tool.uv.dependency-metadata]]
    name = "tensorrt-cu13"
    requires-dist = []
The install scripts explicitly use --no-deps, so this change drops the tensorrt libs and bindings from the pinned requirements and breaks the main image. Try preserving the dependencies for the metapackage; that might still bypass the macOS issue.
    -# backend; pip still pulls sub-package deps from the index at install time.
    -[[tool.uv.dependency-metadata]]
    -name = "tensorrt-cu13"
    -requires-dist = []
    +# backend
    +[[tool.uv.dependency-metadata]]
    +name = "tensorrt-cu13"
    +requires-dist = ["tensorrt-cu13-libs", "tensorrt-cu13-bindings"]
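Because the install uses `--no-deps`, anything missing from the pinned file is simply not installed, so a regression like this could be caught by checking the exported file for the expected names. A rough sketch, assuming a plain `name==version` requirements export; the helper names are hypothetical and the parsing is deliberately simple (no extras or environment-marker handling).

```python
import re


def pinned_names(requirements_text):
    """Extract normalized project names from the lines of a pinned requirements file."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        # Skip blanks, comments, and pip options such as --hash or -r.
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize runs of "-", "_" and "." to "-" and lowercase the name.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def missing_pins(requirements_text, required):
    """Return the required (already normalized) names absent from the pinned file."""
    present = pinned_names(requirements_text)
    return sorted(name for name in required if name not in present)
```

Calling `missing_pins(text, ["tensorrt-cu13-libs", "tensorrt-cu13-bindings"])` on the main image's pinned file should return an empty list if the metapackage's dependencies were preserved.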
    # cutie declares cchardet>=2.1.7, which does not build on the base image's
    # Python. Install an empty stub to satisfy the requirement (so the resolution
    # install below does not try to build the real one) plus faust-cchardet to
    # provide the actual top-level `cchardet` module.
    RUN mkdir -p /tmp/cc_stub \
        && printf 'from setuptools import setup\nsetup(name="cchardet", version="2.1.7", py_modules=[])\n' > /tmp/cc_stub/setup.py \
        && pip install --no-deps /tmp/cc_stub \
        && rm -rf /tmp/cc_stub
    RUN --mount=type=cache,target=/root/.cache/pip pip install faust-cchardet
cchardet is declared by cutie, but never used. You need neither cchardet nor faust-cchardet.
    @@ -33,7 +32,8 @@ RUN --mount=type=cache,target=/root/.cache/pip \
        pip install faust-cchardet
cchardet is declared by cutie, but never used. You need neither cchardet nor faust-cchardet.
Nothing consumes this file. In gpu.Dockerfile you are relying on pip+pyproject.toml to resolve dependencies. Also note that this requirements file here pins torch==2.12 and CUDA 13 libraries, which you wanted to avoid in the web_api/gpu.Dockerfile.
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip install "cutie @ git+https://github.com/hkchengrex/Cutie.git"
        pip install --no-deps -r requirements.web_api.txt \
Switching to uv pip here (like the base image does via install_zutils.sh) may help with consistency/reproducibility. It's also faster than pip.
Could you please address Nico's input?
Summary
Carves the web_api deploy images away from the monolithic `modules` extra and onto purpose-built, slim `web_api` (CPU) / `web_api-gpu` extras that contain only what `web_api/app` and `internal.alignment` actually import. Also makes the CPU image ship CPU-only torch (no CUDA wheels) and the GPU image drop the unused TensorRT/CUDA-13 stack, and restores the newest torch for everyone else.
Net effect: dramatically smaller, faster web_api images, and a web_api dev on Apple Silicon no longer has to resolve the other team's tensorrt/training deps.
What web_api actually imports → covering extra
Verified by tracing every import in `web_api/app/*.py` and recursively through `internal.alignment` (sift → misalignment_detector → manual_correspondence → field → online_finetuner). Nothing reachable from web_api imports `mazepa`, `mazepa_addons`, `training`, `lightning`, `wandb`, `torchmetrics`, meshing/skeletonization/chunkedgraph/montaging/calcada, or TensorRT.
| web_api import | Covering extra |
|---|---|
| `task_management.*` (tasks.py) | `task_management` (→ databackends, sql, tenacity, pcg_skel → caveclient) |
| `db_annotations.*` (annotations/collections/layers/...) | `databackends` + `tenacity` (via `task_management`) |
| `layer.volumetric` + `.cloudvol` + `.annotation` (painting/precomputed) | `cloudvol` + `tensorstore` (volumetric `__init__` loads `.tensorstore`) + `tensor_ops` |
| `internal.alignment.sift` | `tensor_ops`; scipy declared explicitly |
| `internal.alignment.misalignment_detector` | `convnet` |
| `internal.alignment.manual_correspondence` / `field` / `online_finetuner` | `tensor_ops` |
| `segmentation.py` | `cutie` sub-extra; hydra/omegaconf explicit; gcs via base `cloud-files` |
| `main.py` | `google-cloud-iap` (web stack) |
Hidden / transitive gaps closed
- scipy (`internal/alignment/sift.py`, `alignment.py`): only transitive via scikit-image → declared explicitly.
- hydra-core / omegaconf (`segmentation.py`): only transitive via the git `cutie` → declared explicitly.
- […] `pcg_skel`, tenacity via `task_management`, gcs via `cloud-files`).
- […] `faust-cchardet`/stub shim.
New / changed extras (`pyproject.toml`)
- New `cutie` sub-extra; `segmentation` now references `zetta_utils[cutie]` (modules/segmentation semantics unchanged for the other team).
- `web-api-base` (shared deps) + `web-api` / `web-api-gpu` leaf extras.
- `web-api`: a `pytorch-cpu` index + `[tool.uv.sources]` bind torch to it for the `web-api` extra, with `[tool.uv]` `conflicts` between `web-api`/`web-api-gpu`. `requirements.web_api.txt` resolves `torch==…+cpu` with 0 `nvidia-*` CUDA wheels. This is uv-only: plain `pip install '.[web-api-gpu]'` ignores it, so the GPU image keeps its base cu121 torch.
- `web-api-gpu` omits the `gpu`/tensorrt extra: web_api only calls `convnet.load_model(tensorrt_enabled=False)`, so TensorRT is never imported; layering a CUDA-13 runtime on the CUDA-12.1 base was pure bloat + a version mismatch.
- Restored torch `>= 2.11` on `training`/`alignment`/`montaging`; kept `>= 2.5` on the web_api-path extras (`convnet`, `tensor-typing`, `web-api`) so the cu121 base's torch 2.5.1 is honored. `--resolution highest` still pins 2.12 everywhere else, so non-GPU consumers get the newest torch.
Removed from the web_api images
`training` (lightning/wandb/torchmetrics), `mazepa_addons` (kubernetes/awscli/mitmproxy/gcloud SDKs), `meshing`, `skeletonization`, `montaging`, `chunkedgraph`, `calcada`, and (CPU) the native `abiss`/`waterz`/`lsds` builds + their numpy==1.26.4/cython/nanobind machinery, plus the full `nvidia-*` CUDA stack (CPU) and `tensorrt-cu13` + CUDA-13 libs (GPU).
Image size wins
Dockerfiles
- `web_api/Dockerfile` (CPU): install `requirements.web_api.txt` `--no-deps` with `PIP_EXTRA_INDEX_URL=…/whl/cpu`; keep the cchardet shim; drop libboost/unixodbc + standalone cutie + abiss/waterz/lsd machinery; replace `RUN zetta --help` (pulls kubernetes) with a `python -c "import app.main"` smoke test.
- `web_api/gpu.Dockerfile`: `pip install '.[web-api-gpu]'` on the `pytorch/pytorch:2.5.1-cuda12.1-cudnn9-runtime` base, keeping `PIP_EXTRA_INDEX_URL=…/cu121`; drop numpy/cython/lsd/nanobind.
- Removed `web_api/requirements.txt`: single source of truth is the extra.
Scripts & CI
- `update_pinned_requirements.sh`: exports `requirements.web_api.txt` / `requirements.web_api_gpu.txt` (lock/prune/fork-strategy untouched).
- `install_zutils.py`: `--mode` gains `web_api` / `web_api_gpu`.
- `build_web_api.py`: builds the CPU & GPU variants in parallel by default (`--no-parallel` to opt out) with per-variant prefixed output.
- `.github/workflows/testing.yaml`: new `web-api-extras-build` job (py 3.11/3.12/3.14: clean CPU install with the cpu torch index → smoke-import `app.main` → assert lightning/wandb/torchmetrics/mitmproxy/awscli/kubernetes/tensorrt absent) and `web-api-gpu-build` job (full GPU `docker build`). Both added to `all-checks-test`.
Verification
- `web_api.app.main` imports on CPU (darwin) including `internal.alignment` + cutie + hydra/omegaconf/scipy; no hardcoded `.cuda()`.
- `web_api` → `torch==…+cpu`, 0 CUDA libs, no heavy pkgs; `web_api_gpu` → CUDA torch, no tensorrt; `modules`/`all` → CUDA torch + tensorrt unchanged (other team unaffected).
- `web-api-gpu` resolves cleanly with `torch==2.5.1`; `modules` now correctly rejects `torch==2.5.1` (requires `>=2.11`).
- `update_pinned_requirements.sh` runs on Apple Silicon without choking on tensorrt (static-metadata workaround).
Note for the other team
`pyproject.toml` is shared, but `modules`/`training`/`segmentation`/`all` semantics are preserved: `modules` still resolves with tensorrt + CUDA torch, and `segmentation` still includes cutie (via the `cutie` sub-extra). No undeclared `internal` dependency was added: scipy/hydra-core/omegaconf were already resolved transitively and are now declared explicitly in the web_api extra only.
🤖 Generated with Claude Code
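The import tracing described under "What web_api actually imports" is not shown in the PR. One way such a check might be approximated for a single file is with the standard `ast` module, as sketched below. The function names are illustrative, and the sketch does not follow imports recursively; that would additionally require resolving each module name to a source file.

```python
import ast


def imported_modules(source):
    """Return the set of absolute module names imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 keeps absolute imports only; relative ones are skipped here.
            modules.add(node.module)
    return modules


def forbidden_imports(source, forbidden):
    """Return imported modules with any dotted component in `forbidden`."""
    forbidden = set(forbidden)
    return sorted(
        module
        for module in imported_modules(source)
        if forbidden.intersection(module.split("."))
    )
```

Running `forbidden_imports` over each file under `web_api/app` with names such as `mazepa`, `training`, `lightning`, and `wandb` would flag any direct import of the packages the slim extras leave out.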