
[RNE Rewrite] feat: add semantic segmentation task #1275

Open
barhanc wants to merge 4 commits into rne-rewrite from @bh/semantic-segmentation

barhanc commented Jun 22, 2026
Contributor

Description

  • Adds semantic segmentation task and required native operations for it.
  • Adds computer-vision example app screen for semantic segmentation.
  • Fixes the argmax performance regression introduced in "[RNE Rewrite] feat!: add classification task and common utilities for manipulating tensors in CV tasks" (#1264). The initial PoC implementation was more efficient for the default axis=-1 case because its inner loop ran over contiguous elements. The linked PR changed the loop order so that the axis=0 case became more efficient, but since the default is axis=-1, this slowed down the semantic segmentation task.
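
To make the contiguity argument concrete, here is a minimal sketch of how a row-major shape splits around the reduced axis into the outer, axisDim and inner factors used by the argmax loops quoted later in the review. The AxisSplit struct and splitAroundAxis helper are hypothetical names for illustration and are not taken from the PR. The stride between consecutive elements along the axis equals inner, so axis=-1 gives inner = 1 (a contiguous scan), while axis=0 gives a stride equal to the product of all remaining dimensions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the PR): splits a row-major shape around
// `axis` so that element (o, d, i) lives at offset
// o * axisDim * inner + d * inner + i.
struct AxisSplit {
  std::size_t outer;   // product of the dimensions before the axis
  std::size_t axisDim; // size of the reduced dimension
  std::size_t inner;   // product of the dimensions after the axis (= stride along the axis)
};

AxisSplit splitAroundAxis(const std::vector<std::size_t> &shape, int axis) {
  const int rank = static_cast<int>(shape.size());
  if (axis < 0) {
    axis += rank; // axis = -1 refers to the last dimension
  }
  if (axis < 0 || axis >= rank) {
    throw std::out_of_range("splitAroundAxis: axis out of range");
  }
  AxisSplit split{1, shape[static_cast<std::size_t>(axis)], 1};
  for (int k = 0; k < axis; ++k) {
    split.outer *= shape[static_cast<std::size_t>(k)];
  }
  for (int k = axis + 1; k < rank; ++k) {
    split.inner *= shape[static_cast<std::size_t>(k)];
  }
  return split;
}
```

For example, a tensor of shape (224, 224, 21) reduced over axis=-1 yields inner = 1, whereas shape (21, 224, 224) reduced over axis=0 yields inner = 224 * 224; both shapes are illustrative only.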

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

  • Build the computer-vision example app.
  • Test the semantic segmentation screen.

Screenshots

Related issues

Closes #1242

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

barhanc self-assigned this Jun 22, 2026
barhanc added the refactoring and feature labels Jun 22, 2026
barhanc marked this pull request as ready for review June 23, 2026 12:58
barhanc requested a review from msluszniak June 23, 2026 12:59

barhanc commented Jun 23, 2026

Contributor Author

Two models are missing from models.ts: DeepLabV3 and FCN, since their forward returns an auxiliary tensor. I will re-export them, but I don't know which tag I should put them under. Is v0.10.0 ok?

@msluszniak

Member

> Two models are missing from models.ts: DeepLabV3 and FCN, since their forward returns an auxiliary tensor. I will re-export them, but I don't know which tag I should put them under. Is v0.10.0 ok?

Yeah, v0.10.0 sounds ok.

if (dst->dtype_ != rnexecutorch::core::types::DType::uint8) {
  throw jsi::JSError(rt, "applyColormap: dst must be uint8");
}
if (dst->numel_ != src->numel_ * 4) {

Member

I guess the literal 4 could be replaced, here and elsewhere in this file, with a named constant such as constexpr size_t numChannels = 4;

Comment on lines +693 to +696
dstData[i * 4 + 0] = lut[idx][0];
dstData[i * 4 + 1] = lut[idx][1];
dstData[i * 4 + 2] = lut[idx][2];
dstData[i * 4 + 3] = lut[idx][3];

Member

Maybe use this once you introduce numChannels? With a constexpr numChannels the compiler should unroll the loop automatically. Just fix the formatting.

Suggested change
- dstData[i * 4 + 0] = lut[idx][0];
- dstData[i * 4 + 1] = lut[idx][1];
- dstData[i * 4 + 2] = lut[idx][2];
- dstData[i * 4 + 3] = lut[idx][3];
+ for (size_t c = 0; c < numChannels; ++c) {
+   dstData[i * numChannels + c] = lut[idx][c];
+ }
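
Put together, the two suggestions above could look roughly like the following standalone sketch. It uses plain std::vector buffers instead of the library's tensor type, and the function name applyColormapSketch, the Rgba alias, the int32 label type, the RGBA interpretation of the four channels, and the bounds check are all assumptions made for illustration, not the PR's actual code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Named constant replacing the literal 4 (assumed here to mean RGBA).
constexpr std::size_t numChannels = 4;

using Rgba = std::array<std::uint8_t, numChannels>;

// Hypothetical standalone version of a colormap lookup: maps every class
// index in `labels` to the matching color in `lut` and writes the colors
// into an interleaved uint8 buffer of size labels.size() * numChannels.
std::vector<std::uint8_t> applyColormapSketch(const std::vector<std::int32_t> &labels,
                                              const std::vector<Rgba> &lut) {
  std::vector<std::uint8_t> dst(labels.size() * numChannels);
  for (std::size_t i = 0; i < labels.size(); ++i) {
    const std::int32_t label = labels[i];
    if (label < 0 || static_cast<std::size_t>(label) >= lut.size()) {
      throw std::out_of_range("applyColormapSketch: label outside the lookup table");
    }
    const Rgba &color = lut[static_cast<std::size_t>(label)];
    // The trip count is a compile-time constant, so compilers can typically unroll this loop.
    for (std::size_t c = 0; c < numChannels; ++c) {
      dst[i * numChannels + c] = color[c];
    }
  }
  return dst;
}
```

Sizing the output and indexing it through the same constant keeps the size check and the write loop from drifting apart if the channel count ever changes.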

Comment on lines 279 to +292
  int32_t *dstData = reinterpret_cast<int32_t *>(dst->data_.get());
- std::vector<float> maxVals(inner);

  for (size_t o = 0; o < outer; ++o) {
-   const float *srcSlab = srcData + o * axisDim * inner;
-   int32_t *dstRow = dstData + o * inner;
-
    for (size_t i = 0; i < inner; ++i) {
-     maxVals[i] = -std::numeric_limits<float>::infinity();
-     dstRow[i] = 0;
-   }
-
-   for (size_t d = 0; d < axisDim; ++d) {
-     const float *srcRow = srcSlab + d * inner;
-     for (size_t i = 0; i < inner; ++i) {
-       const float val = srcRow[i];
-       if (val > maxVals[i]) {
-         maxVals[i] = val;
-         dstRow[i] = static_cast<int32_t>(d);
+     float maxVal = -std::numeric_limits<float>::infinity();
+     int32_t maxIdx = 0;
+     for (size_t d = 0; d < axisDim; ++d) {
+       const float val = srcData[o * axisDim * inner + d * inner + i];
+       if (val > maxVal) {
+         maxVal = val;
+         maxIdx = static_cast<int32_t>(d);
+       }
+     }
+     dstData[o * inner + i] = maxIdx;

Member

Why are we changing this one after all? Didn't it give a performance gain?

Contributor Author

The change from #1264 caused a performance regression for the default axis=-1 case, as explained in the PR description.

Member

Oh, ok.

Member

Maybe we can add a comment here so that nobody "optimizes" it the wrong way in the future?
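
A guard comment of the kind suggested here might read as in the sketch below. It mirrors the per-element scan from the diff but runs on plain std::vector buffers; the function name argmaxSketch, its signature, the size check, and the comment wording are assumptions for illustration rather than the code that was merged.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical standalone argmax over the middle factor of a row-major
// buffer laid out as (outer, axisDim, inner).
std::vector<std::int32_t> argmaxSketch(const std::vector<float> &src,
                                       std::size_t outer,
                                       std::size_t axisDim,
                                       std::size_t inner) {
  if (src.size() != outer * axisDim * inner) {
    throw std::invalid_argument("argmaxSketch: buffer size does not match the shape");
  }
  std::vector<std::int32_t> dst(outer * inner, 0);
  for (std::size_t o = 0; o < outer; ++o) {
    for (std::size_t i = 0; i < inner; ++i) {
      // NOTE: keep the scan over `d` as the innermost loop. For the default
      // axis = -1 we have inner == 1, so this loop reads contiguous memory.
      // Reordering the loops to favor axis = 0 slows down the default case.
      float maxVal = -std::numeric_limits<float>::infinity();
      std::int32_t maxIdx = 0;
      for (std::size_t d = 0; d < axisDim; ++d) {
        const float val = src[o * axisDim * inner + d * inner + i];
        if (val > maxVal) { // strict comparison keeps the first maximum on ties
          maxVal = val;
          maxIdx = static_cast<std::int32_t>(d);
        }
      }
      dst[o * inner + i] = maxIdx;
    }
  }
  return dst;
}
```

With outer = 2, axisDim = 3, inner = 1 the function reduces each row of a 2x3 buffer, which corresponds to the axis=-1 case discussed above.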

import { validateModelSchema, SymbolicTensor } from '../../../core/modelSchema';
import { wrapAsync } from '../../../core/runtime';

import { type ImageBuffer } from '../image';

Member

Suggested change
- import { type ImageBuffer } from '../image';
+ import type { ImageBuffer } from '../image';


Development

Successfully merging this pull request may close these issues.

[RNE Rewrite] CV - add semantic segmentation pipeline implementation

2 participants