Skip to content

Embedder#1

Open
ArnabChatterjee20k wants to merge 19 commits into
masterfrom
embedder
Open

Embedder#1
ArnabChatterjee20k wants to merge 19 commits into
masterfrom
embedder

Conversation

@ArnabChatterjee20k
Copy link
Copy Markdown
Member

What does this PR do?

(Provide a description of what this PR does.)

Test Plan

(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your changes work.)

Related PRs and Issues

(If this PR is related to any other PR or resolves any issue or related to any issue link all related PR and issues here.)

Have you read the Contributing Guidelines on issues?

(Write your answer here.)

- Introduced `EmbeddingConfig` struct for managing embedding model configurations.
- Implemented `EmbeddingClient` for handling model initialization and embedding generation.
- Added environment variable support for model selection, cache directory, and pool size.
- Created a main function to demonstrate model loading and embedding generation with sample texts.
- Integrated `fastembed` library for embedding functionalities.
@ArnabChatterjee20k
Copy link
Copy Markdown
Member Author

@copilot create a pr fixing the dockerfile

Copy link
Copy Markdown

Copilot AI commented May 22, 2026

@ArnabChatterjee20k I've opened a new pull request, #2, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 6 commits May 22, 2026 12:33
Agent-Logs-Url: https://github.com/appwrite/embedding/sessions/4f5decfe-9143-42bd-b0e6-811ed2616fcb

Co-authored-by: ArnabChatterjee20k <83803257+ArnabChatterjee20k@users.noreply.github.com>
Agent-Logs-Url: https://github.com/appwrite/embedding/sessions/4f5decfe-9143-42bd-b0e6-811ed2616fcb

Co-authored-by: ArnabChatterjee20k <83803257+ArnabChatterjee20k@users.noreply.github.com>
Agent-Logs-Url: https://github.com/appwrite/embedding/sessions/5bc8a2a9-7dfa-400d-9834-850b8f96eb8d

Co-authored-by: ArnabChatterjee20k <83803257+ArnabChatterjee20k@users.noreply.github.com>
Set default runtime model cache path and fix Docker CI compatibility
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 26, 2026

Greptile Summary

This PR introduces a new Rust microservice that serves text embeddings over HTTP using the fastembed library with ONNX-backed models. It includes memory-aware pool sizing, round-robin instance dispatch, dynamic sub-batch computation, a Docker image, and a CI workflow.

  • src/embedding.rs: Implements EmbeddingClient with a per-model Vec<Arc<Mutex<TextEmbedding>>> pool, warmup-based memory measurement for pool sizing, and async batched inference split across tokio::task::spawn_blocking workers.
  • src/main.rs: Thin Axum server exposing a single POST /embed route; validates that texts is non-empty but has no upper-bound guard.
  • tests/embed_e2e.rs: Integration tests — the non-ignored embed_unknown_alias_returns_error test silently passes in CI without exercising its assertion because it uses .ok() to swallow a model-loading failure that occurs before the alias check.

Confidence Score: 4/5

Safe to merge with one test correctness fix — the alias-rejection test never executes its assertion in CI and should be restructured or marked #[ignore].

The alias-rejection test in tests/embed_e2e.rs swallows the model-loading error with .ok() and wraps its assertions in if let Some(client), so on a CI runner without a cached model the entire assertion block is skipped and the test passes vacuously. This means the alias-validation code path is not exercised in CI at all, despite appearing in the test suite.

tests/embed_e2e.rs — the embed_unknown_alias_returns_error test needs restructuring to actually run its assertion. src/model.rs — the dimension() function should add an explicit arm for EmbeddingGemma300M.

Important Files Changed

Filename Overview
src/embedding.rs Core embedding client: pool management, memory-aware sizing, and async batched inference. Previously flagged issues remain; no new logic bugs in this file.
src/main.rs Axum HTTP server with single /embed route; no upper bound on texts array size and no graceful-shutdown signal handler (previously flagged).
src/model.rs Model alias resolution and dimension lookup; EmbeddingGemma300M falls through to the 768 catch-all arm in dimension().
tests/embed_e2e.rs E2e integration tests; embed_unknown_alias_returns_error is not #[ignore]d but requires a real model download, so it vacuously passes in CI without exercising the assertion.
src/lib.rs Library crate root; re-exports the public surface cleanly, no issues.
Dockerfile Multi-stage build with non-root user and BuildKit cache mounts; looks correct.
.github/workflows/ci.yml CI workflow targets main branch on push, not master (previously flagged); otherwise fmt/clippy/test steps are well-structured.
Cargo.toml Dependencies pinned with explicit versions; tokio signal feature is included in preparation for graceful shutdown.

Reviews (2): Last reviewed commit: "feat: implement multi-model support in E..." | Re-trigger Greptile

Comment thread .github/workflows/ci.yml
Comment on lines +3 to +6
on:
push:
branches: [main]
pull_request:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 CI push trigger targets main, but the repository's default branch (and the base for this PR) is master. As-is, the test and docker jobs will never fire on a direct push to master, so merges to the primary branch go unvalidated by CI.

Suggested change
on:
push:
branches: [main]
pull_request:
on:
push:
branches: [master]
pull_request:

Comment thread .env Outdated
Comment thread src/embedding.rs Outdated
Comment thread src/embedding.rs
Comment thread src/main.rs
Comment on lines +64 to +85
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
tracing_subscriber::fmt()
.with_env_filter(
tracing_subscriber::EnvFilter::try_from_default_env()
.unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info")),
)
.init();

let config = EmbeddingConfig::from_env();
let client = Arc::new(EmbeddingClient::new(config)?);
let state = AppState { client };

let app = Router::new().route("/embed", post(embed)).with_state(state);

let addr = std::env::var("BIND_ADDR").unwrap_or_else(|_| "0.0.0.0:3000".to_string());
let listener = tokio::net::TcpListener::bind(&addr).await?;
tracing::info!("listening on {}", addr);
axum::serve(listener, app).await?;

Ok(())
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No graceful shutdown handling

axum::serve is awaited directly with no signal hook. When the container receives SIGTERM, the process is killed immediately and any in-flight embedding requests are dropped. Axum supports a .with_graceful_shutdown(signal) overload that drains connections before exit.

Comment thread tests/embed_e2e.rs
Comment on lines +58 to +66
#[tokio::test(flavor = "multi_thread")]
async fn embed_unknown_alias_returns_error() {
// No model loading on this path — fails the alias check first.
let client = EmbeddingClient::new(small_model_config(1)).ok();
if let Some(client) = client {
let err = client.embed("not-a-real-model", &["x"]).await.unwrap_err();
assert!(err.contains("unknown model alias"), "got: {}", err);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Test silently passes in CI without exercising the assertion

EmbeddingClient::new(small_model_config(1)) calls Self::load_model which downloads the AllMiniLML6V2 ONNX model — this is the same model loading path as the #[ignore]d tests. In a fresh CI environment (no cached model), construction fails and .ok() produces None, so the if let Some(client) block is never entered and the actual alias-rejection assertion never runs. The test passes vacuously every CI run, giving false assurance. The comment "No model loading on this path" is incorrect — the alias check lives in embed(), which is never reached if new() fails.

The cleanest fix is to test alias rejection through model::from_name directly (no client required), or add #[ignore] like the other e2e tests that need a real model.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants