diff --git a/.gitignore b/.gitignore
index fd70ee1ce5c..c128e8bb836 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,7 +28,6 @@ requestdata.json
 # Playwright MCP session artifacts (console logs, page snapshots, ad-hoc
 # screenshots) written during local UI debugging. Not source.
 .playwright-mcp/
-model-picker-open.png
 
 # Live cluster dumps from `kubectl get -o yaml > …`. NEVER commit:
 # Darwin's ConfigMap currently contains real secrets in plaintext (Slack
@@ -39,3 +38,9 @@ model-picker-open.png
 darwin-kubernetes/temp/
 k8s/overlays/*/secrets.env
 k8s/overlays/*/*.secrets.env
+# Velero Azure SP credentials (source for the cloud-credentials secret)
+k8s/overlays/prod-velero/credentials-velero
+# Velero notifier Slack bot token (source for the slack-notify secret)
+k8s/overlays/prod-velero/slack-notify.env
+# Ad-hoc export of web connector URLs (local only)
+web-connectors.csv
diff --git a/AGENTS.md b/AGENTS.md
index 93fb7815714..fe3c069bd08 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -432,6 +432,78 @@ liveness probes by design** (an aggressive one kills slow-but-healthy nodes);
 readiness probes on the Service-backed nodes (configserver/query/feed) gate
 traffic during the slow bootstrap.
 
+### 11. Slow indexing is the per-doc Vespa existence VISIT, not the crawl — and the content-hash dedup already exists
+
+When a connector (especially a big web one like docs.uipath) indexes slowly,
+the bottleneck is almost never the source fetch. For every document,
+`VespaIndex.index()` → `_clear_and_index_vespa_chunks()` →
+`_get_vespa_chunks_by_document_id` (`document_index/vespa/index.py`) hits Vespa's
+**Visit API** (`GET /document/v1/.../docid?selection=document_id==''&wantedDocumentCount=1000`)
+to find existing chunks before re-writing. That `selection` is a **corpus scan**,
+~**10–11 s per document** on the large prod index — and Vespa content nodes sit
+near-idle while it happens (it's scan/IO-bound, so scaling content nodes does
+NOT help). It's in the shared indexing path, so it slows every connector; large
+multi-doc connectors just make it obvious. The real fix is a keyed lookup
+(`document_id` as a `fast-search` attribute, or a point GET / search query)
+instead of the visit.
+
+**Do NOT "add" a Postgres content-hash dedup to skip this — it already exists.**
+`Document.indexed_content_hash` (`db/models.py`) +
+`get_doc_ids_to_update` (`indexing/indexing_pipeline.py`) skip a doc (no re-embed,
+no Vespa write) when the stored hash equals `doc.get_content_hash()`. The hash is
+written only AFTER a confirmed Vespa write. Why it can still re-index everything:
+
+- It's bypassed when `ignore_time_skip=True`, set on `from_beginning` full runs
+  (`background/indexing/run_indexing.py`).
+- Docs indexed before the hash feature have `indexed_content_hash = NULL`, so the
+  hash check can't fire and it falls back to a `doc_updated_at` timestamp compare.
+- The **web connector never sets `doc_updated_at`**, so that fallback can't skip
+  hash-less web docs either → they re-index every run (each paying the ~11 s
+  visit) UNTIL the run completes and backfills their hash. It is self-healing —
+  once hashes exist, later polls skip unchanged docs and run fast — but a full
+  run that times out before backfilling will keep re-doing the slow work.
+
+(Diagnosed 2026-06 on the docs.uipath automation-suite latest-N connector:
+~2889 of ~3161 docs had NULL hashes.)
+
+---
+
+### 12. NEVER build the web image locally on Apple Silicon
+
+The web image's `next build` step **SIGSEGVs** when built for `linux/amd64`
+under emulation on an arm64 Mac (Next.js build worker dies with `signal:
+SIGSEGV`). Building amd64 under emulation is the only way to produce a
+deployable image locally on Apple Silicon, so there is no working local web
+build there — don't try, and don't burn time "fixing" it. It is not a config /
+dependency / disk-space problem.
+
+Instead, build web on **darwinacr** (native-amd64 ACR build agents) and import
+the result into the prod registry. `k8s/scripts/build-deploy.sh` does this
+**automatically** on Apple Silicon — `build-deploy.sh deploy web` detects the
+host and routes web to `az acr build` + `az acr import`, no flags needed. The
+backend image has no native build step and still builds locally under emulation.
+
+If you ever need the raw commands (script unavailable / debugging):
+
+```bash
+# 1. build on darwinacr (native amd64)
+az acr build --registry darwinacr \
+  --image danswer/danswer-web-server:vha-N \
+  --build-arg NODE_BASE=darwinacr.azurecr.io/library/node:20-alpine \
+  --file web/Dockerfile ./web
+# 2. transfer darwinacr -> prod registry (different subscriptions, so blob-copy
+#    via pull/retag/push, NOT `az acr import`). Pure copy on the Mac, no SIGSEGV.
+az acr login --name darwinacr
+docker pull --platform linux/amd64 darwinacr.azurecr.io/danswer/danswer-web-server:vha-N
+docker tag darwinacr.azurecr.io/danswer/danswer-web-server:vha-N \
+  sfbrdevhelmweacr.azurecr.io/danswer/danswer-web-server:vha-N
+docker push sfbrdevhelmweacr.azurecr.io/danswer/danswer-web-server:vha-N
+```
+
+(`--file` is relative to the CWD, not the `./web` context — `web/Dockerfile`,
+not `Dockerfile`. `az acr build` on darwinacr needs **PIM Contributor**; the
+prod push uses the `~/.zshrc` ACR_USERNAME/ACR_PASSWORD admin creds.)
+
 ---
 
 ## Common workflows
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1e70f61b346..e9224d90849 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -112,21 +112,21 @@ playwright install
 #### Dependent Docker Containers
 First navigate to `danswer/deployment/docker_compose`, then start Postgres.
 
-The simplest path is the compose-managed pair (uses a named docker volume for
-Vespa's data; data lives until you `docker volume rm`):
+Start Postgres and Redis via compose under the `-p danswer-stack` project, so
+they share the `danswer-stack_default` network (Vespa is run separately, below,
+on the same network):
 
 ```bash
-docker compose -f docker-compose.dev.yml -p danswer-stack up -d relational_db
+docker compose -f docker-compose.dev.yml -p danswer-stack up -d relational_db redis
 ```
 
-If you'd rather pin Vespa's data + logs to host-mounted directories so you
-can inspect them outside Docker (and survive `docker compose down -v`),
-start Postgres via compose and Vespa via a manual `docker run` on the same
-network. Pick any host paths you like:
+Run Vespa via a manual `docker run` on that same `danswer-stack_default`
+network, with host-mounted data + logs dirs so you can inspect them outside
+Docker (and they survive `docker compose down -v`). Use this rather than the
+compose `index` service (it's unreliable locally); the `--network` flag is what
+keeps the manually-run Vespa on the shared network. Pick any host paths:
 
 ```bash
-docker compose -f docker-compose.dev.yml -p danswer-stack up -d relational_db
-
 export VESPA_VAR_STORAGE="${HOME}/danswer-vespa-data/var"
 export VESPA_LOG_STORAGE="${HOME}/danswer-vespa-data/logs"
 mkdir -p "$VESPA_VAR_STORAGE" "$VESPA_LOG_STORAGE"
@@ -142,7 +142,7 @@ docker run \
   --publish 19071:19071 \
   vespaengine/vespa:8.277.17
 
-# Sanity check: both containers should be on the danswer-stack_default network
+# Sanity check: all containers (Postgres, Redis, Vespa) on danswer-stack_default
 docker ps --format '{{ .ID }} {{ .Names }} {{ json .Networks }}'
 ```
 
@@ -150,6 +150,19 @@ docker ps --format '{{ .ID }} {{ .Names }} {{ json .Networks }}'
 `index` matters — Danswer reaches Vespa by that DNS name on the shared
 network.)
 
+Redis (caching + per-user rate limiting) comes up with the commands above as
+part of the `danswer-stack` project, so it's already on the shared
+`danswer-stack_default` network. To check it or manage it on its own:
+
+```bash
+docker compose -f docker-compose.dev.yml -p danswer-stack exec redis redis-cli ping  # -> PONG
+docker compose -f docker-compose.dev.yml -p danswer-stack stop redis
+```
+
+The container runs with no auth and publishes `6379` to the host, so a
+host-run backend connects with `REDIS_HOST=localhost`, `REDIS_PORT=6379`,
+`REDIS_PASSWORD=` (empty). (In-compose, the service name is `redis`.)
+
 #### Running Danswer
 To start the frontend, navigate to `danswer/web` and run:
 ```bash
@@ -325,7 +338,18 @@ export MODEL_SERVER_HOST=localhost
 export MODEL_SERVER_PORT=9000
 export INDEXING_MODEL_SERVER_HOST=localhost
 export INDEXING_MODEL_SERVER_PORT=9000
-export REDIS_HOST=cache  # matches the compose service name
+export REDIS_HOST=localhost  # backend runs on the host; reach Redis via the published 6379 port
+
+# Cross-encoder reranking, available locally. The model server (`dmo`) loads the
+# reranker IN-PROCESS (sentence-transformers, CPU) — no extra container. Uses the
+# small default model (mxbai-rerank-xsmall-v1); set RERANK_MODEL_NAME to try a
+# bigger one. Reranking still only runs for assistants / chats that opt in.
+# (Prod serves the reranker via a TEI container instead — see k8s/optional/tei-rerank.)
+export RERANK_ENABLED=true
+export LLM_RELEVANCE_FILTER_ENABLED=true  # LLM relevance filter; independent of rerank
+# Advanced: to mirror prod and offload the reranker to a local TEI container
+# instead of in-process, run TEI yourself and set:
+#   export RERANK_SERVER_URL=http://localhost:8086
 
 # ---------------------------------------------------------------------------
 # LLM (Generative AI) — UiPath LLM Gateway via OAuth client credentials
diff --git a/backend/alembic/versions/a8b9c0d1e2f3_persona_display_name.py b/backend/alembic/versions/a8b9c0d1e2f3_persona_display_name.py
new file mode 100644
index 00000000000..df35247256e
--- /dev/null
+++ b/backend/alembic/versions/a8b9c0d1e2f3_persona_display_name.py
@@ -0,0 +1,35 @@
+"""persona: add display_name (user-friendly chat label)
+
+Adds persona.display_name — an optional, admin-editable label shown in the chat
+UI. The immutable `name` stays the identifier; `display_name` is presentational
+only and the chat falls back to `name` when it's blank. Backfills existing rows
+with their `name` so nothing changes visually until an admin edits it. See
+db/models.py::Persona.
+
+Revision ID: a8b9c0d1e2f3
+Revises: f7a8b9c0d1e2
+Create Date: 2026-06-19
+
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = "a8b9c0d1e2f3"
+down_revision = "f7a8b9c0d1e2"
+branch_labels: None = None
+depends_on: None = None
+
+
+def upgrade() -> None:
+    op.add_column(
+        "persona",
+        sa.Column("display_name", sa.String(), nullable=True),
+    )
+    # Backfill: existing assistants keep showing their current name.
+    op.execute("UPDATE persona SET display_name = name WHERE display_name IS NULL")
+
+
+def downgrade() -> None:
+    op.drop_column("persona", "display_name")
diff --git a/backend/alembic/versions/b9c0d1e2f3a4_user_hidden_assistants.py b/backend/alembic/versions/b9c0d1e2f3a4_user_hidden_assistants.py
new file mode 100644
index 00000000000..af3206b20c5
--- /dev/null
+++ b/backend/alembic/versions/b9c0d1e2f3a4_user_hidden_assistants.py
@@ -0,0 +1,46 @@
+"""user: add hidden_assistants (opt-out assistant visibility)
+
+Adds user.hidden_assistants — the list of assistant (persona) ids a user has
+explicitly hidden from their chat picker. This flips assistant visibility from
+opt-IN (only assistants in `chosen_assistants` were shown) to opt-OUT: every
+accessible assistant is visible by default, so a newly created admin assistant
+appears for all users automatically; a user hides the ones they don't want.
+
+`chosen_assistants` now controls ORDER/default only, not visibility.
+
+No backfill: the chat experience hasn't been rolled out to end users yet, so
+there is no curated state to preserve — every existing user simply starts with
+an empty hidden list (= sees everything), which is the desired behavior. See
+db/models.py::User.
+
+Revision ID: b9c0d1e2f3a4
+Revises: a8b9c0d1e2f3
+Create Date: 2026-06-21
+
+"""
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+
+# revision identifiers, used by Alembic.
+revision = "b9c0d1e2f3a4"
+down_revision = "a8b9c0d1e2f3"
+branch_labels: None = None
+depends_on: None = None
+
+
+def upgrade() -> None:
+    op.add_column(
+        "user",
+        sa.Column(
+            "hidden_assistants",
+            postgresql.ARRAY(sa.Integer()),
+            nullable=False,
+            server_default="{}",
+        ),
+    )
+
+
+def downgrade() -> None:
+    op.drop_column("user", "hidden_assistants")
diff --git a/backend/alembic/versions/f6a7b8c9d0e1_persona_rerank_enabled.py b/backend/alembic/versions/f6a7b8c9d0e1_persona_rerank_enabled.py
new file mode 100644
index 00000000000..f26ec265572
--- /dev/null
+++ b/backend/alembic/versions/f6a7b8c9d0e1_persona_rerank_enabled.py
@@ -0,0 +1,39 @@
+"""persona: add rerank_enabled (per-assistant cross-encoder reranking opt-in)
+
+Per-assistant toggle for cross-encoder reranking. Only takes effect when
+reranking is globally available (RERANK_ENABLED + a GPU-backed model server);
+default false so existing assistants and the GPU-free local/default setup are
+unchanged. Lets reranking be rolled out incrementally / A-B compared per
+assistant before becoming the default. See db/models.py::Persona and
+search/preprocessing/preprocessing.py.
+
+Revision ID: f6a7b8c9d0e1
+Revises: e5f6a7b8c9d0
+Create Date: 2026-06-03
+
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = "f6a7b8c9d0e1"
+down_revision = "e5f6a7b8c9d0"
+branch_labels: None = None
+depends_on: None = None
+
+
+def upgrade() -> None:
+    op.add_column(
+        "persona",
+        sa.Column(
+            "rerank_enabled",
+            sa.Boolean(),
+            nullable=False,
+            server_default=sa.false(),
+        ),
+    )
+
+
+def downgrade() -> None:
+    op.drop_column("persona", "rerank_enabled")
diff --git a/backend/alembic/versions/f7a8b9c0d1e2_slack_bot_response_blocklist.py b/backend/alembic/versions/f7a8b9c0d1e2_slack_bot_response_blocklist.py
new file mode 100644
index 00000000000..b399449b521
--- /dev/null
+++ b/backend/alembic/versions/f7a8b9c0d1e2_slack_bot_response_blocklist.py
@@ -0,0 +1,62 @@
+"""slack bot: response blocklist (suppress responses for certain senders)
+
+Creates slack_bot_response_blocklist — senders (by email) whose Slack messages
+should NOT trigger a Darwin response. DB-driven so the list can change without a
+redeploy. Seeds the first entry (jr.bancel@uipath.com). See
+db/models.py::SlackBotResponseBlocklist and
+danswerbot/slack/handlers/handle_message.py.
+
+Revision ID: f7a8b9c0d1e2
+Revises: f6a7b8c9d0e1
+Create Date: 2026-06-17
+
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = "f7a8b9c0d1e2"
+down_revision = "f6a7b8c9d0e1"
+branch_labels: None = None
+depends_on: None = None
+
+
+def upgrade() -> None:
+    op.create_table(
+        "slack_bot_response_blocklist",
+        sa.Column("id", sa.Integer(), nullable=False),
+        sa.Column("email", sa.String(), nullable=False),
+        sa.Column(
+            "created_at",
+            sa.DateTime(timezone=True),
+            server_default=sa.func.now(),
+            nullable=False,
+        ),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    # Single unique index — mirrors `mapped_column(String, unique=True, index=True)`.
+    op.create_index(
+        op.f("ix_slack_bot_response_blocklist_email"),
+        "slack_bot_response_blocklist",
+        ["email"],
+        unique=True,
+    )
+
+    # Seed the initial blocked senders (stored lowercase; matched
+    # case-insensitively). Further additions are plain DB inserts — no migration.
+    op.execute(
+        sa.text(
+            "INSERT INTO slack_bot_response_blocklist (email) VALUES "
+            "('jr.bancel@uipath.com'), ('andrei.barbu@uipath.com') "
+            "ON CONFLICT (email) DO NOTHING"
+        )
+    )
+
+
+def downgrade() -> None:
+    op.drop_index(
+        op.f("ix_slack_bot_response_blocklist_email"),
+        table_name="slack_bot_response_blocklist",
+    )
+    op.drop_table("slack_bot_response_blocklist")
diff --git a/backend/danswer/auth/api_key.py b/backend/danswer/auth/api_key.py
index 15b372cc73e..6b4dd778d33 100644
--- a/backend/danswer/auth/api_key.py
+++ b/backend/danswer/auth/api_key.py
@@ -40,3 +40,35 @@ def validate_api_key(request: Request, db_session: Session = Depends(get_session
     # Cache it for future requests
     cache[api_key_value] = True
     return None
+
+
+def request_has_valid_api_key(request: Request, db_session: Session) -> bool:
+    """Return True if the request carries a valid X-API-Key.
+
+    These keys are service credentials for automation (they intentionally do NOT
+    map to a browser `User`). `current_user` uses this to authorize an api-key
+    request as an anonymous service caller instead of 403'ing it into the SSO
+    flow once AUTH_TYPE enforces auth (e.g. OIDC). Mirrors `validate_api_key`'s
+    lookup + cache exactly, so the two stay consistent.
+
+    NOTE: `db_session` is passed in (not a Depends) because the caller already
+    holds a session.
+    """
+    if _API_KEY_HEADER not in request.headers:
+        return False
+
+    api_key_value = request.headers.get(_API_KEY_HEADER)
+    if not api_key_value:
+        return False
+
+    if api_key_value in cache:
+        return True
+
+    api_key = db_session.scalar(
+        select(ApiKey).where(ApiKey.hashed_api_key == api_key_value)
+    )
+    if api_key is None:
+        return False
+
+    cache[api_key_value] = True
+    return True
diff --git a/backend/danswer/auth/noauth_user.py b/backend/danswer/auth/noauth_user.py
index 4744c4a6488..25dd11bf090 100644
--- a/backend/danswer/auth/noauth_user.py
+++ b/backend/danswer/auth/noauth_user.py
@@ -25,7 +25,7 @@ def load_no_auth_user_preferences(store: DynamicConfigStore) -> UserPreferences:
         )
         return UserPreferences(**preferences_data)
     except ConfigNotFoundError:
-        return UserPreferences(chosen_assistants=None)
+        return UserPreferences(chosen_assistants=None, hidden_assistants=[])
 
 
 def fetch_no_auth_user(store: DynamicConfigStore) -> UserInfo:
diff --git a/backend/danswer/auth/users.py b/backend/danswer/auth/users.py
index 5605fdbde35..ebc4ec92339 100644
--- a/backend/danswer/auth/users.py
+++ b/backend/danswer/auth/users.py
@@ -26,6 +26,7 @@
 from fastapi_users_db_sqlalchemy import SQLAlchemyUserDatabase
 from sqlalchemy.orm import Session
 
+from danswer.auth.api_key import request_has_valid_api_key
 from danswer.auth.invited_users import get_invited_users
 from danswer.auth.schemas import UserCreate
 from danswer.auth.schemas import UserRole
@@ -354,8 +355,19 @@ async def double_check_user(
 
 
 async def current_user(
+    request: Request,
     user: User | None = Depends(optional_user),
+    db_session: Session = Depends(get_session),
 ) -> User | None:
+    # API keys are service credentials for automation (not browser users) and
+    # intentionally don't map to a User. A request carrying a valid key is
+    # authorized as an anonymous service caller (user stays None — endpoints
+    # already handle that), rather than being 403'd into the SSO flow once
+    # AUTH_TYPE enforces auth (e.g. OIDC). Browser requests without a session
+    # still 403. Admin-only routes (current_admin_user) still require an admin
+    # user, so a key alone does not grant admin access.
+    if user is None and request_has_valid_api_key(request, db_session):
+        return None
     return await double_check_user(user)
 
 
diff --git a/backend/danswer/background/celery/celery_app.py b/backend/danswer/background/celery/celery_app.py
index ed21fc40576..077c3d2f07f 100644
--- a/backend/danswer/background/celery/celery_app.py
+++ b/backend/danswer/background/celery/celery_app.py
@@ -31,14 +31,18 @@
 from danswer.db.document import get_document_ids_for_connector_credential_pair
 from danswer.db.document import prepare_to_modify_documents
 from danswer.db.document_set import delete_document_set
+from danswer.db.document_set import document_set_sync_cursor_key
 from danswer.db.document_set import fetch_document_sets
 from danswer.db.document_set import fetch_document_sets_for_documents
 from danswer.db.document_set import fetch_documents_for_document_set_paginated
 from danswer.db.document_set import get_document_set_by_id
 from danswer.db.document_set import mark_document_set_as_synced
+from danswer.dynamic_configs.factory import get_dynamic_config_store
+from danswer.dynamic_configs.interface import ConfigNotFoundError
 from danswer.db.engine import build_connection_string
 from danswer.db.engine import get_sqlalchemy_engine
 from danswer.db.engine import SYNC_DB_API
+from danswer.db.tasks import get_stuck_deletion_cc_ids
 from danswer.db.models import DocumentSet
 from danswer.document_index.document_index_utils import get_both_index_names
 from danswer.document_index.factory import get_default_document_index
@@ -75,6 +79,14 @@
 
 
 _SYNC_BATCH_SIZE = 100
+# Cap on how many document-set syncs run at once. Each sync fans out
+# _NUM_THREADS (32) concurrent Vespa requests, so without a cap up to
+# worker-concurrency (10) syncs × 32 = ~320 simultaneous Vespa calls would
+# hammer the cluster. 3 keeps Vespa load predictable (≈96 concurrent calls
+# across the 3 content nodes — measured to have headroom: ~2-2.5 cores/node,
+# no 429s/timeouts at 2) while draining the backlog a bit faster; the rest
+# wait and are picked up on later ticks.
+_MAX_CONCURRENT_DOCUMENT_SET_SYNCS = 3
 
 
 #####
@@ -270,9 +282,24 @@ def _sync_document_batch(document_ids: list[str], db_session: Session) -> None:
         ]
         document_index.update(update_requests=update_requests)
 
+    kv_store = get_dynamic_config_store()
+    cursor_key = document_set_sync_cursor_key(document_set_id)
+
     with Session(get_sqlalchemy_engine()) as db_session:
         try:
-            cursor = None
+            # Resume from the last persisted cursor so a worker restart or the
+            # 6h soft_time_limit doesn't force a from-scratch re-sync. Without
+            # this, a set too large to finish in one window kept re-doing its
+            # first batches forever and never reached the rest.
+            try:
+                cursor = cast(str, kv_store.load(cursor_key))
+                logger.info(
+                    f"Resuming document set {document_set_id} sync after cursor "
+                    f"'{cursor}'"
+                )
+            except ConfigNotFoundError:
+                cursor = None
+
             while True:
                 document_id_batch, cursor = fetch_documents_for_document_set_paginated(
                     document_set_id=document_set_id,
@@ -287,6 +314,16 @@ def _sync_document_batch(document_ids: list[str], db_session: Session) -> None:
                 )
                 if cursor is None:
                     break
+                # Checkpoint progress after each fully-synced batch so an
+                # interruption resumes here (re-doing at most one batch, which
+                # is idempotent since updates are "assign").
+                kv_store.store(cursor_key, cursor)
+
+            # Completed a full pass — drop the resume cursor.
+            try:
+                kv_store.delete(cursor_key)
+            except ConfigNotFoundError:
+                pass
 
             # if there are no connectors, then delete the document set. Otherwise, just
             # mark it as successfully synced.
@@ -329,12 +366,29 @@ def check_for_document_sets_sync_task() -> None:
         document_set_info = fetch_document_sets(
             user_id=None, db_session=db_session, include_outdated=True
         )
+
+        # Bound how many syncs run concurrently (each fans out 32 Vespa
+        # threads). Count the ones already in flight, then only kick off enough
+        # new ones to reach _MAX_CONCURRENT_DOCUMENT_SET_SYNCS. The rest are
+        # left for a later tick. should_sync_doc_set() returns False for sets
+        # that are up-to-date OR already syncing, so an out-of-date set for
+        # which it returns False is one that's currently in flight.
+        live_syncs = 0
+        candidates = []
         for document_set, _ in document_set_info:
+            if document_set.is_up_to_date:
+                continue
             if should_sync_doc_set(document_set, db_session):
-                logger.info(f"Syncing the {document_set.name} document set")
-                sync_document_set_task.apply_async(
-                    kwargs=dict(document_set_id=document_set.id),
-                )
+                candidates.append(document_set)
+            else:
+                live_syncs += 1
+
+        open_slots = max(0, _MAX_CONCURRENT_DOCUMENT_SET_SYNCS - live_syncs)
+        for document_set in candidates[:open_slots]:
+            logger.info(f"Syncing the {document_set.name} document set")
+            sync_document_set_task.apply_async(
+                kwargs=dict(document_set_id=document_set.id),
+            )
 
 
 @celery_app.task(
@@ -414,6 +468,41 @@ def check_for_prune_task() -> None:
                 )
+
+
+@celery_app.task(
+    name="check_for_stuck_deletion_tasks",
+    soft_time_limit=JOB_TIMEOUT,
+)
+def check_for_stuck_deletion_tasks() -> None:
+    """Re-drive connector deletions orphaned by a lost broker message.
+
+    Connector deletion is the event-driven `cleanup_connector_credential_pair_task`
+    on the non-durable Redis broker. A Redis/worker restart while it's queued
+    loses the broker message but leaves the `task_queue_jobs` row PENDING, so
+    the connector is stuck "Deleting" forever — deletion, unlike sync/prune, is
+    never periodically rescheduled, and the delete API's dedup guard then blocks
+    re-submission. This re-enqueues any cleanup task whose latest row has been
+    non-terminal past JOB_TIMEOUT.
+
+    Safe to run repeatedly: the cleanup task's per-cc-pair advisory lock makes a
+    re-enqueue a no-op if a deletion is genuinely still running, and the fresh
+    row a re-enqueue creates stays "live" for JOB_TIMEOUT — so this self-throttles
+    to at most one re-drive per cc-pair per timeout window. A re-enqueue for an
+    already-deleted cc-pair simply fails fast (cc-pair not found -> FAILURE),
+    clearing the stale "Deleting" state."""
+    with Session(get_sqlalchemy_engine()) as db_session:
+        for connector_id, credential_id in get_stuck_deletion_cc_ids(db_session):
+            logger.info(
+                f"Re-driving orphaned connector deletion: "
+                f"connector_id={connector_id}, credential_id={credential_id}"
+            )
+            cleanup_connector_credential_pair_task.apply_async(
+                kwargs=dict(
+                    connector_id=connector_id,
+                    credential_id=credential_id,
+                )
+            )
+
+
 #####
 # Celery Beat (Periodic Tasks) Settings
 #####
@@ -425,9 +514,28 @@ def check_for_prune_task() -> None:
 }
 celery_app.conf.beat_schedule.update(
     {
+        # Was every 5s, but check_for_prune_task scans ALL cc-pairs (444+ here,
+        # with lazy-loaded connector/credential → N+1, ~8s/run). At a 5s cadence
+        # the runs overlapped and piled up until they saturated all worker
+        # threads, starving sync_document_set_task (doc sets stuck syncing).
+        # Pruning is governed by each connector's prune_freq (~daily), so a
+        # frequent check buys nothing — 15 min is plenty.
         "check-for-prune": {
             "task": "check_for_prune_task",
-            "schedule": timedelta(seconds=5),
+            "schedule": timedelta(minutes=15),
+        },
+    }
+)
+celery_app.conf.beat_schedule.update(
+    {
+        # Safety net for connector deletions orphaned by a lost broker message
+        # (Redis is non-durable; a restart strands the task_queue_jobs row
+        # PENDING and the connector sticks on "Deleting"). Re-drives any cleanup
+        # task non-terminal past JOB_TIMEOUT. 30-min cadence is fine — the
+        # orphan threshold is JOB_TIMEOUT (6h) and the re-drive self-throttles.
+        "check-for-stuck-deletions": {
+            "task": "check_for_stuck_deletion_tasks",
+            "schedule": timedelta(minutes=30),
         },
     }
 )
diff --git a/backend/danswer/background/update.py b/backend/danswer/background/update.py
index d00aadcf108..1c7790ac045 100755
--- a/backend/danswer/background/update.py
+++ b/backend/danswer/background/update.py
@@ -18,7 +18,9 @@
 from danswer.configs.app_configs import DASK_JOB_CLIENT_ENABLED
 from danswer.configs.app_configs import DISABLE_INDEX_UPDATE_ON_SWAP
 from danswer.configs.app_configs import NUM_INDEXING_WORKERS
+from danswer.configs.indexing_concurrency import cap_for_source
 from danswer.configs.indexing_concurrency import PER_SOURCE_CAP
+from danswer.configs.indexing_concurrency import PER_SOURCE_CAP_OVERRIDES
 from danswer.db.connector import fetch_connectors
 from danswer.db.embedding_model import get_current_db_embedding_model
 from danswer.db.embedding_model import get_secondary_db_embedding_model
@@ -272,6 +274,7 @@ def _build_running_view(
     in_progress_rows: list[IndexAttempt],
     dispatched_pre_completion_rows: list[IndexAttempt],
     per_source_cap: int,
+    cap_overrides: dict[str, int] | None = None,
 ) -> tuple[dict[str, int], set[tuple[int | None, int | None, int]]]:
     """Build the scheduler's running-attempt view.
 
@@ -288,6 +291,7 @@ def _build_running_view(
     once despite cap=1" + "lower-priority attempt running while a higher-priority
     attempt sits NOT_STARTED".
     """
+    overrides = cap_overrides or {}
     running_per_source: dict[str, int] = {}
     in_progress_cc_pair_keys: set[tuple[int | None, int | None, int]] = set()
     accounted: set[int] = set()
@@ -300,8 +304,8 @@ def _build_running_view(
         in_progress_cc_pair_keys.add(
             (ip.connector_id, ip.credential_id, ip.embedding_model_id)
         )
-        if per_source_cap > 0:
-            key = ip.connector.source.value
+        key = ip.connector.source.value
+        if cap_for_source(key, per_source_cap, overrides) > 0:
             running_per_source[key] = running_per_source.get(key, 0) + 1
     for d in dispatched_pre_completion_rows:
         if d.id in accounted:
@@ -312,8 +316,8 @@ def _build_running_view(
         in_progress_cc_pair_keys.add(
             (d.connector_id, d.credential_id, d.embedding_model_id)
         )
-        if per_source_cap > 0:
-            key = d.connector.source.value
+        key = d.connector.source.value
+        if cap_for_source(key, per_source_cap, overrides) > 0:
             running_per_source[key] = running_per_source.get(key, 0) + 1
     return running_per_source, in_progress_cc_pair_keys
 
@@ -323,6 +327,7 @@ def _evaluate_dispatch_for_attempt(
     running_per_source: dict[str, int],
     in_progress_cc_pair_keys: set[tuple[int | None, int | None, int]],
     per_source_cap: int,
+    cap_overrides: dict[str, int] | None = None,
 ) -> str:
     """Pure decision: should this attempt dispatch now, or defer?
 
@@ -339,9 +344,10 @@ def _evaluate_dispatch_for_attempt(
     )
     if cc_pair_key in in_progress_cc_pair_keys:
         return _DEFER_CC_PAIR
-    if per_source_cap > 0:
-        source_key = attempt.connector.source.value
-        if running_per_source.get(source_key, 0) >= per_source_cap:
+    source_key = attempt.connector.source.value
+    source_cap = cap_for_source(source_key, per_source_cap, cap_overrides or {})
+    if source_cap > 0:
+        if running_per_source.get(source_key, 0) >= source_cap:
             return _DEFER_SOURCE_CAP
         running_per_source[source_key] = running_per_source.get(source_key, 0) + 1
     in_progress_cc_pair_keys.add(cc_pair_key)
@@ -400,6 +406,7 @@ def kickoff_indexing_jobs(
         in_progress_rows,
         dispatched_pre_completion_rows,
         PER_SOURCE_CAP,
+        PER_SOURCE_CAP_OVERRIDES,
     )
 
     logger.info(f"Found {len(new_indexing_attempts)} new indexing tasks.")
@@ -437,6 +444,7 @@ def kickoff_indexing_jobs(
                 running_per_source,
                 in_progress_cc_pair_keys,
                 PER_SOURCE_CAP,
+                PER_SOURCE_CAP_OVERRIDES,
             )
             if decision == _DEFER_CC_PAIR:
                 logger.info(
@@ -450,7 +458,9 @@ def kickoff_indexing_jobs(
                     f"Deferring indexing attempt {attempt.id} for connector "
                     f"'{attempt.connector.name}' "
                     f"(source={attempt.connector.source.value}): "
-                    f"cap of {PER_SOURCE_CAP} reached. "
+                    f"cap of "
+                    f"{cap_for_source(attempt.connector.source.value, PER_SOURCE_CAP, PER_SOURCE_CAP_OVERRIDES)}"
+                    f" reached. "
                     "Will retry on next scheduler tick."
                 )
                 continue
diff --git a/backend/danswer/chat/personas.yaml b/backend/danswer/chat/personas.yaml
index ecb9d7cfe38..a9c1db3cd0c 100644
--- a/backend/danswer/chat/personas.yaml
+++ b/backend/danswer/chat/personas.yaml
@@ -18,7 +18,7 @@ personas:
     # Enable/Disable usage of the LLM chunk filter feature whereby each chunk is passed to the LLM to determine
     # if the chunk is useful or not towards the latest user query
     # This feature can be overriden for all personas via DISABLE_LLM_CHUNK_FILTER env variable
-    llm_relevance_filter: true
+    llm_relevance_filter: false
     # Enable/Disable usage of the LLM to extract query time filters including source type and time range filters
     llm_filter_extraction: true
     # Decay documents priority as they age, options are:
@@ -61,7 +61,7 @@ personas:
     prompts:
       - "OnlyLLM"
     num_chunks: 0
-    llm_relevance_filter: true
+    llm_relevance_filter: false
     llm_filter_extraction: true
     recency_bias: "auto"
     document_sets: []
@@ -74,7 +74,7 @@ personas:
     prompts:
       - "Paraphrase"
     num_chunks: 10
-    llm_relevance_filter: true
+    llm_relevance_filter: false
     llm_filter_extraction: true
     recency_bias: "auto"
     document_sets: []
diff --git a/backend/danswer/chat/process_message.py b/backend/danswer/chat/process_message.py
index b94264abd06..876636411e9 100644
--- a/backend/danswer/chat/process_message.py
+++ b/backend/danswer/chat/process_message.py
@@ -16,6 +16,7 @@
 from danswer.chat.models import StreamingError
 from danswer.configs.chat_configs import CHAT_TARGET_CHUNK_PERCENTAGE
 from danswer.configs.chat_configs import DISABLE_LLM_CHOOSE_SEARCH
+from danswer.configs.chat_configs import LLM_RELEVANCE_FILTER_ENABLED
 from danswer.configs.chat_configs import MAX_CHUNKS_FED_TO_CHAT
 from danswer.configs.constants import MessageType
 from danswer.configs.model_configs import GEN_AI_TEMPERATURE
@@ -80,6 +81,7 @@
 from danswer.tools.utils import explicit_tool_calling_supported
 from danswer.utils.logger import setup_logger
 from danswer.utils.timing import log_generator_function_time
+from shared_configs.configs import RERANK_ENABLED logger = setup_logger() @@ -93,15 +95,25 @@ def translate_citations( for db_doc in db_docs: if db_doc.document_id not in doc_id_to_saved_doc_id_map: doc_id_to_saved_doc_id_map[db_doc.document_id] = db_doc.id - #print(f'found doc id: {db_doc.id}') + # print(f'found doc id: {db_doc.id}') citation_to_saved_doc_id_map: dict[int, int] = {} for citation in citations_list: - #print(f'citation id {citation.document_id} for doc num {citation.citation_num}') + # print(f'citation id {citation.document_id} for doc num {citation.citation_num}') if citation.citation_num not in citation_to_saved_doc_id_map: - citation_to_saved_doc_id_map[ - citation.citation_num - ] = doc_id_to_saved_doc_id_map[citation.document_id] + saved_doc_id = doc_id_to_saved_doc_id_map.get(citation.document_id) + if saved_doc_id is None: + # The LLM can cite a document that isn't in this turn's + # reference docs — e.g. it references a doc from earlier in the + # conversation, or when chatting with a subset of selected + # documents. Skip the stray citation instead of failing the + # entire response. + logger.warning( + f"Citation {citation.citation_num} references unknown " + f"document_id '{citation.document_id}'; skipping" + ) + continue + citation_to_saved_doc_id_map[citation.citation_num] = saved_doc_id return citation_to_saved_doc_id_map @@ -425,6 +437,10 @@ def stream_chat_message_objects( if db_tool_model.in_code_tool_id: tool_cls = get_built_in_tool_by_id(db_tool_model.id, db_session) if tool_cls.__name__ == SearchTool.__name__ and not latest_query_files: + # Chat-page per-conversation toggles (default off, assistant + # settings intentionally ignored in chat). Each is gated by + # its global master switch; we pass explicit skip_* so + # retrieval_preprocessing uses these instead of the persona. 
search_tool = SearchTool( db_session=db_session, user=user, @@ -438,6 +454,11 @@ def stream_chat_message_objects( chunks_above=new_msg_req.chunks_above, chunks_below=new_msg_req.chunks_below, full_doc=new_msg_req.full_doc, + skip_rerank=not (RERANK_ENABLED and new_msg_req.use_reranking), + skip_llm_chunk_filter=not ( + LLM_RELEVANCE_FILTER_ENABLED + and new_msg_req.use_relevance_filter + ), ) tool_dict[db_tool_model.id] = [search_tool] elif tool_cls.__name__ == ImageGenerationTool.__name__: diff --git a/backend/danswer/configs/chat_configs.py b/backend/danswer/configs/chat_configs.py index e4b85f53cd8..6c8d3f59c41 100644 --- a/backend/danswer/configs/chat_configs.py +++ b/backend/danswer/configs/chat_configs.py @@ -33,6 +33,72 @@ DISABLE_LLM_CHUNK_FILTER = ( os.environ.get("DISABLE_LLM_CHUNK_FILTER", "").lower() == "true" ) +# Global master switch for the (one-shot, main-LLM) relevance filter, mirroring +# RERANK_ENABLED. When true the app may run the filter for assistants/chats that +# opt in; when false (default) it never runs regardless of per-assistant flags. +# Unlike reranking this is LLM-only — it needs NO GPU — so it can be enabled on +# its own as a cheaper quality tier. DISABLE_LLM_CHUNK_FILTER still hard-kills it. +LLM_RELEVANCE_FILTER_ENABLED = ( + os.environ.get("LLM_RELEVANCE_FILTER_ENABLED", "").lower() == "true" +) +# Source diversity at final doc selection: guarantee that up to +# SOURCE_DIVERSITY_RESERVED_SLOTS of the highest-ranked docs from PROTECTED_SOURCES +# survive into the LLM prompt, so curated KB/web content isn't crowded out by a +# chatty high-relevance source (e.g. Slack). Replaces the old two-query +# source-prioritization hack — always-on, global, operates on the single +# comparably-scored candidate set. Set RESERVED_SLOTS=0 to disable. 
+PROTECTED_SOURCES = [ + s.strip().lower() + for s in (os.environ.get("PROTECTED_SOURCES") or "web,sfkbarticles").split(",") + if s.strip() +] +SOURCE_DIVERSITY_RESERVED_SLOTS = int( + os.environ.get("SOURCE_DIVERSITY_RESERVED_SLOTS") or 2 +) +# Source-reserved RETRIEVAL (recall guarantee). SOURCE_DIVERSITY_RESERVED_SLOTS +# above only reserves FINAL-prompt slots among docs retrieval already surfaced — +# it cannot help when a chatty source (e.g. Slack) saturates the entire top-N and +# a curated PROTECTED_SOURCES doc (web/KB/OutSystems) never makes the candidate set +# at all. This runs ONE extra source-scoped retrieval pass (reusing the same ACL + +# persona doc-set fence) to guarantee up to N PROTECTED_SOURCES docs land in the +# candidate set, then the diversity reservation above carries them into the prompt. +# 0 = disabled (single-query behavior). Set per environment. +SOURCE_RESERVED_RETRIEVAL_SLOTS = int( + os.environ.get("SOURCE_RESERVED_RETRIEVAL_SLOTS") or 0 +) +# Per-source cap on the FINAL LLM prompt. A chatty source (e.g. a busy Slack +# channel) can contribute dozens of docs and monopolize what the LLM grounds in +# and CITES, drowning out curated sources even when those are present and +# front-ranked. This keeps the top-N (highest-ranked, after source-diversity +# promotion) docs per source and drops the rest before the token-budget cut, so +# the model sees a balanced set and cites across sources. 0 = disabled (no cap). +# Generic: applies to every assistant + both flows, and only binds when one +# source dominates (single-source assistants are unaffected). +MAX_PROMPT_DOCS_PER_SOURCE = int(os.environ.get("MAX_PROMPT_DOCS_PER_SOURCE") or 0) +# Verify-then-retain authoritative citations. Citations are the LLM's output and it +# inconsistently cites curated sources even when they're at the front of the prompt. 
+# After generation, if a promoted PROTECTED_SOURCES doc is in context but was NOT +# cited by the LLM, one extra (conditional, batched) LLM call checks whether it +# actually supports a statement in the answer; supporting docs are appended as an +# "Authoritative sources" footer. Additive (LLM's own citations are untouched), +# deduped (skips already-cited + same-document_id), honest (only retains on verified +# support). 0/false = disabled. Costs at most ONE extra call, and only on answers +# where an uncited authoritative doc is present. +AUTHORITATIVE_CITATION_RETENTION_ENABLED = ( + os.environ.get("AUTHORITATIVE_CITATION_RETENTION_ENABLED", "").lower() == "true" +) +# Versioned-docs dedup at final doc selection. Documentation sites publish the +# SAME page under one URL per product version (e.g. docs.uipath.com/.../2024.10/… +# and /.../2023.10/… and /.../2.2510/…). Retrieval then floods the LLM context +# with many near-identical copies of one page, crowding out distinct sources and +# (observed) making the LLM intermittently fail to cite -> DanswerBot skips the +# answer. When enabled, for each docs page we keep only the newest version's +# chunk(s) and drop the older-version duplicates, freeing context for diverse +# pages. Scoped to URLs containing DOCS_VERSION_DEDUP_URL_SUBSTR — all other +# sources are untouched. Set the substr empty to disable. 
+DOCS_VERSION_DEDUP_URL_SUBSTR = ( + os.environ.get("DOCS_VERSION_DEDUP_URL_SUBSTR") or "docs.uipath.com" +) # Whether the LLM should be used to decide if a search would help given the chat history DISABLE_LLM_CHOOSE_SEARCH = ( os.environ.get("DISABLE_LLM_CHOOSE_SEARCH", "").lower() == "true" diff --git a/backend/danswer/configs/constants.py b/backend/danswer/configs/constants.py index e04497a1e1c..b1f650444c2 100644 --- a/backend/danswer/configs/constants.py +++ b/backend/danswer/configs/constants.py @@ -107,6 +107,7 @@ class DocumentSource(str, Enum): GOOGLE_CLOUD_STORAGE = "google_cloud_storage" OCI_STORAGE = "oci_storage" HIGHSPOT = "highspot" + OUTSYSTEMS = "outsystems" class BlobType(str, Enum): diff --git a/backend/danswer/configs/indexing_concurrency.py b/backend/danswer/configs/indexing_concurrency.py index 5ddf5e22730..bb1e563398d 100644 --- a/backend/danswer/configs/indexing_concurrency.py +++ b/backend/danswer/configs/indexing_concurrency.py @@ -24,6 +24,27 @@ of the same source type because each cc-pair has its own credential): export INDEXING_PER_SOURCE_CAP=0 + +Per-source overrides +-------------------- +The global cap above is uniform: raising it lifts the cap for *every* +source, including ones with real external rate limits or a shared +credential (Slack, Jira, Confluence). When you only want to parallelize a +*specific* source — most commonly `web`, where every cc-pair has its own +dummy credential and mostly distinct domains — set a per-source override +instead. Comma-separated `source=cap` pairs; `source` is the +`DocumentSource` value (lowercase, e.g. `web`, `slack`); `cap` follows the +same convention as the global (0 = uncapped): + + # web runs uncapped (bounded only by NUM_INDEXING_WORKERS + the + # per-cc-pair lock); everything else stays at the global default of 1. 
+ export INDEXING_PER_SOURCE_CAP_OVERRIDES="web=0" + + # web up to 3 at a time, slack still 1 (explicit): + export INDEXING_PER_SOURCE_CAP_OVERRIDES="web=3,slack=1" + +A source not named in the override map falls back to `PER_SOURCE_CAP`. +Malformed entries are skipped (the source keeps the global default). """ from __future__ import annotations @@ -40,8 +61,48 @@ def _resolve_cap() -> int: return 1 +def _resolve_overrides() -> dict[str, int]: + """Parse `INDEXING_PER_SOURCE_CAP_OVERRIDES` into {source: cap}. + + Format: comma-separated `source=cap` pairs, e.g. `web=0,slack=1`. + Source keys are lowercased to match `DocumentSource(...).value`. + Malformed pairs are ignored so one typo can't wipe the whole map. + """ + raw = os.environ.get("INDEXING_PER_SOURCE_CAP_OVERRIDES", "").strip() + overrides: dict[str, int] = {} + if not raw: + return overrides + for part in raw.split(","): + part = part.strip() + if not part or "=" not in part: + continue + source, _, cap_str = part.partition("=") + source = source.strip().lower() + if not source: + continue + try: + overrides[source] = max(0, int(cap_str.strip())) + except ValueError: + continue + return overrides + + # Concurrent attempts per source type. 1 = at most one indexing attempt # per `DocumentSource` at a time (the generic rule). 0 = uncapped (skip # the slot logic entirely; rely solely on the per-cc-pair lock + # NUM_INDEXING_WORKERS). PER_SOURCE_CAP: int = _resolve_cap() + +# Optional per-source overrides on top of PER_SOURCE_CAP. Keyed by +# `DocumentSource` value (lowercase). A source absent here uses +# PER_SOURCE_CAP. See module docstring for the env format. +PER_SOURCE_CAP_OVERRIDES: dict[str, int] = _resolve_overrides() + + +def cap_for_source(source: str, default: int, overrides: dict[str, int]) -> int: + """Resolve the concurrency cap for a single `DocumentSource` value. + + Pure helper (default + overrides passed in) so the scheduler can stay + testable without reaching into module globals. 
+ """ + return overrides.get(source, default) diff --git a/backend/danswer/connectors/factory.py b/backend/danswer/connectors/factory.py index 4f33eb38d67..cd2cf3f51c5 100644 --- a/backend/danswer/connectors/factory.py +++ b/backend/danswer/connectors/factory.py @@ -24,6 +24,7 @@ from danswer.connectors.guru.connector import GuruConnector from danswer.connectors.highspot.connector import HighspotConnector from danswer.connectors.hubspot.connector import HubSpotConnector +from danswer.connectors.outsystems.connector import OutSystemsConnector from danswer.connectors.interfaces import BaseConnector from danswer.connectors.interfaces import EventConnector from danswer.connectors.interfaces import LoadConnector @@ -110,6 +111,7 @@ def identify_connector_class( DocumentSource.GOOGLE_CLOUD_STORAGE: BlobStorageConnector, DocumentSource.OCI_STORAGE: BlobStorageConnector, DocumentSource.HIGHSPOT: HighspotConnector, + DocumentSource.OUTSYSTEMS: OutSystemsConnector, } connector_by_source = connector_map.get(source, {}) diff --git a/backend/danswer/connectors/highspot/connector.py b/backend/danswer/connectors/highspot/connector.py index 7f29c12e96e..e320558f64e 100644 --- a/backend/danswer/connectors/highspot/connector.py +++ b/backend/danswer/connectors/highspot/connector.py @@ -243,18 +243,16 @@ def _ensure_browser() -> BrowserContext: logger.warning("Item without ID found, skipping") continue - item_details = self.client.get_item(item_id) - if not item_details: - logger.warning( - "Item %s details not found, skipping", - item_id, - ) - continue - - # Time-window filter (poll mode). + # Time-window filter (poll mode) — applied on the + # LIST item's `date_updated` BEFORE the per-item + # get_item() call, so incremental polls skip the + # detail fetch + content download/scrape for items + # unchanged within the window. The list response + # already carries `date_updated`, so this is a true + # delta fetch rather than enumerating every item. 
if start or end: parsed = _parse_doc_updated_at( - item_details.get("date_updated") + item.get("date_updated") ) if parsed is None: # No usable timestamp — skip in poll @@ -264,6 +262,14 @@ def _ensure_browser() -> BrowserContext: if (start and ts < start) or (end and ts > end): continue + item_details = self.client.get_item(item_id) + if not item_details: + logger.warning( + "Item %s details not found, skipping", + item_id, + ) + continue + content = self._get_item_content( item_details, scrape_context_factory=_ensure_browser, diff --git a/backend/danswer/connectors/highspot/sync.py b/backend/danswer/connectors/highspot/sync.py new file mode 100644 index 00000000000..a271f052a74 --- /dev/null +++ b/backend/danswer/connectors/highspot/sync.py @@ -0,0 +1,133 @@ +"""Sync Highspot Spots -> per-Spot connectors. + +Idempotent: ensures exactly one connector + cc-pair (named after the Spot) +exists for every Spot the given Highspot credential can see. Re-running picks up +newly added Spots and is a no-op for already-covered ones, so it's safe to call +repeatedly (e.g. from the admin endpoint in server/documents/connector.py). +""" +import re + +from sqlalchemy.orm import Session + +from danswer.configs.constants import DocumentSource +from danswer.connectors.highspot.client import HighspotClient +from danswer.connectors.models import InputType +from danswer.db.connector import connector_by_name_source_exists +from danswer.db.connector import create_connector +from danswer.db.connector_credential_pair import add_credential_to_connector +from danswer.db.models import Connector +from danswer.db.models import Credential +from danswer.server.documents.models import ConnectorBase +from danswer.utils.logger import setup_logger + +logger = setup_logger() + +# Monthly refresh; daily prune — mirrors the existing per-Spot Highspot connector. 
+HIGHSPOT_MONTHLY_REFRESH_FREQ = 2_592_000
+HIGHSPOT_DEFAULT_PRUNE_FREQ = 86_400
+
+
+def clean_spot_name(title: str) -> str:
+    """Tidy a raw Highspot Spot title for use as a connector / cc-pair display
+    name: drop control + zero-width characters and collapse internal whitespace.
+
+    NOTE: only the *display* name is cleaned. The original, unmodified title is
+    what goes into `spot_names`, because the connector matches Spots by their
+    real title (see connector.py::_fetch_spots_to_process); cleaning that would
+    break the match.
+    """
+    cleaned = re.sub(r"[\x00-\x1f\x7f\u200b-\u200f]", "", title)
+    cleaned = re.sub(r"\s+", " ", cleaned).strip()
+    return cleaned
+
+
+def sync_highspot_spots_to_connectors(
+    credential_id: int,
+    db_session: Session,
+    refresh_freq: int = HIGHSPOT_MONTHLY_REFRESH_FREQ,
+    prune_freq: int = HIGHSPOT_DEFAULT_PRUNE_FREQ,
+) -> list[int]:
+    """Ensure a per-Spot connector exists for every Spot the credential can see.
+
+    Returns the ids of newly created connectors. Spots already covered by an
+    existing Highspot connector's `spot_names` are skipped (idempotent).
+    """
+    credential = db_session.get(Credential, credential_id)
+    if credential is None or not isinstance(credential.credential_json, dict):
+        raise ValueError(f"Highspot credential {credential_id} not found")
+    cj = credential.credential_json
+    if not cj.get("highspot_key") or not cj.get("highspot_secret"):
+        raise ValueError(f"Credential {credential_id} is not a Highspot credential")
+
+    client = HighspotClient(
+        cj["highspot_key"],
+        cj["highspot_secret"],
+        base_url=(cj.get("highspot_url") or HighspotClient.BASE_URL),
+    )
+    spots = [s for s in client.get_spots() if s.get("title")]
+
+    # Spots already covered by any existing Highspot connector (case-insensitive
+    # on the ORIGINAL title, which is what's stored in spot_names).
+ covered: set[str] = set() + for connector in db_session.query(Connector).all(): + source = getattr(connector.source, "value", str(connector.source)) + if source.lower() != DocumentSource.HIGHSPOT.value: + continue + for name in (connector.connector_specific_config or {}).get("spot_names", []) or []: + covered.add(name.lower()) + + created: list[int] = [] + used_display_names: set[str] = set() + for spot in spots: + title = spot["title"] + if title.lower() in covered: + continue + + # Skip empty Spots — a connector for a Spot with no items would just run + # on schedule and index nothing. (Cheap check: counts_total via limit=1.) + if client.get_spot_items(spot["id"], offset=0, page_size=1).get("counts_total", 0) == 0: + logger.info(f"Skipping empty Highspot spot '{title}' (no items)") + continue + + display_name = clean_spot_name(title) or title + # Connector name must be unique per source; skip on any collision rather + # than letting create_connector raise mid-loop. + if display_name.lower() in used_display_names or connector_by_name_source_exists( + display_name, DocumentSource.HIGHSPOT, db_session + ): + logger.warning( + f"Skipping Highspot spot '{title}': connector name " + f"'{display_name}' already exists" + ) + continue + + connector = create_connector( + ConnectorBase( + name=display_name, + source=DocumentSource.HIGHSPOT, + input_type=InputType.POLL, + connector_specific_config={"spot_names": [title]}, + refresh_freq=refresh_freq, + prune_freq=prune_freq, + disabled=False, + ), + db_session, + ) + add_credential_to_connector( + connector_id=connector.id, + credential_id=credential_id, + cc_pair_name=display_name, + is_public=True, + user=None, + db_session=db_session, + ) + created.append(connector.id) + covered.add(title.lower()) + used_display_names.add(display_name.lower()) + logger.info( + f"Created Highspot connector {connector.id} for spot '{title}' " + f"(name='{display_name}')" + ) + + logger.info(f"Highspot spot sync complete: {len(created)} 
new connector(s)") + return created diff --git a/backend/danswer/connectors/outsystems/__init__.py b/backend/danswer/connectors/outsystems/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/danswer/connectors/outsystems/connector.py b/backend/danswer/connectors/outsystems/connector.py new file mode 100644 index 00000000000..977f3190330 --- /dev/null +++ b/backend/danswer/connectors/outsystems/connector.py @@ -0,0 +1,519 @@ +"""OutSystems connector. + +Named generically for the OutSystems low-code platform; currently tuned to the +inside.uipath.com Intranet app (its module/screen names are baked into the +screenservice paths below — `Intranet`/`PageTemplates.Page`). To support another +OutSystems app later, parameterize those paths via connector config; the auth + +extraction logic stays the same. + +INTERIM AUTH — to be enhanced once a service account exists. The target has no +API key / bearer token; the SPA authenticates with the SSO **session cookie** +plus an `x-csrftoken` header. So the credential here is currently a short-lived +browser session (cookie + csrf + apiVersion) captured from DevTools. + +Because that session expires in hours, this is intended for ONE-TIME indexing: +create the connector with no refresh schedule (refresh_freq=None) and trigger a +single index run while the cookie is fresh. The content is mostly static, so one +backfill is sufficient until the service account lands — at which point the +credential (and `_auth_headers`) swap to the service account with no change to +the enumeration/extraction logic below. + +Content model: pages are `/Page?PageId=` (sequential ints). Each page's +content is fetched from a JSON "screenservice" and is a tree of +sections -> widgets. Widget items (`PageWidgetItem`) come in two flavors, +discriminated by `PageFileId`: + - text widget (PageFileId == 0): `Text1` is HTML body content. + - document widget (PageFileId != 0): `Text1` is a FilePath (e.g. 
+ "IC_Content/Docs/foo.pdf"), `Text2` is the filename. The actual file lives + in SharePoint; we resolve it via the `ActionFileMetadata_Get` screenservice + -> a pre-authed `DownloadURL` (tempauth in the URL) -> download bytes -> + extract text (PDF/DOCX/...). Many policy pages are near-empty text with the + real content in the attached PDF, so file extraction is essential here. + +File download needs the W_Document action's own apiVersion (separate from the +page action's, and server-enforced), supplied as `outsystems_file_api_version`. +If that credential is absent, file download is skipped (page text only). +""" +import html +import io +import re +from concurrent.futures import ThreadPoolExecutor +from concurrent.futures import TimeoutError as FutureTimeout +from html.parser import HTMLParser +from typing import Any + +import requests + +from danswer.configs.app_configs import INDEX_BATCH_SIZE +from danswer.configs.constants import DocumentSource +from danswer.connectors.interfaces import GenerateDocumentsOutput +from danswer.connectors.interfaces import LoadConnector +from danswer.connectors.models import ConnectorMissingCredentialError +from danswer.connectors.models import Document +from danswer.connectors.models import Section +from danswer.file_processing.extract_file_text import extract_file_text +from danswer.utils.logger import setup_logger + +logger = setup_logger() + +_PAGE_DATA_ACTION = "/screenservices/Intranet/New/N_Page_Preview/DataActionGetPage" +_FILE_METADATA_ACTION = ( + "/screenservices/Intranet/PageWidgets/W_Document/ActionFileMetadata_Get" +) +_MODULE_VERSION_URL = "/moduleservices/moduleversioninfo" +_PAGE_VIEW_NAME = "PageTemplates.Page" +_DEFAULT_BASE_URL = "https://inside.uipath.com" +# (connect, read) timeouts on every outbound request. Without these a single +# hung connection freezes the whole index attempt indefinitely; with them a slow +# page/file times out, is logged, and skipped so the scan continues. 
+_HTTP_TIMEOUT = (10, 60)
+# Soft wall-clock bound on a single file's text extraction, enforced on a worker
+# THREAD: the dask indexing worker is a daemonic process and cannot spawn a
+# subprocess. A thread can't be force-killed, so on timeout the extraction is
+# abandoned (not terminated) and the file is skipped.
+_FILE_EXTRACT_TIMEOUT = 120
+# Skip files larger than this. Learned the hard way: a 249 MB .mp4 and several
+# 4–15 MB PDFs froze the indexing pipeline (huge text -> thousands of embedding
+# chunks). Big media has no useful text anyway; big PDFs are capped below too.
+_MAX_FILE_BYTES = 20 * 1024 * 1024
+# Per-file/-page text ceiling (generous; splitting below keeps it indexable).
+_MAX_DOC_CHARS = 1_000_000
+# Split a document's text into sections of at most this many chars. The chunker
+# tokenizes section-by-section, and the HF tokenizer is ~O(n^2) on a single huge
+# string, so one giant section freezes the worker. Bounded sections keep every
+# tokenize() call small (linear overall) AND preserve all content (no skip, no
+# truncation beyond the ceiling above).
+_MAX_SECTION_CHARS = 30_000
+# Any unbroken run of non-whitespace longer than this gets spaces inserted. PDFs
+# of dense tables extract as enormous space-free strings; the HF tokenizer is
+# ~O(n^2) per "word", so a 500K-char run tokenizes for MINUTES, holding the GIL
+# and freezing the dask worker. Breaking long runs makes tokenization linear.
+_MAX_TOKEN_RUN = 80
+# Pages whose attached file produces enormous text (151K–1.8M chars: drug
+# formularies, clinic-network spreadsheets, multi-plan benefit PDFs). Such a doc
+# tokenizes for minutes (~O(n^2)), holding the GIL and freezing the dask worker,
+# which can't force-kill it (daemonic process -> no subprocess). Even split into
+# bounded sections these stall the chunk/embed stage: each explodes into ~1.5K
+# chunks that the CPU-only embedder can't process in reasonable time (verified: a
+# split run upserted the rows then hung in chunk/embed with the model server
+# idle). Section-splitting (below) still protects against *moderately* large
+# docs; this list is the small set of genuinely-unindexable giants, skipped
+# entirely so a full sync can't stall. It is the COMPLETE list from a full
+# 1..2200 scan (scratchpad/full_scan.py: extract every file, measure chars;
+# threshold 150K). Re-run that scan and refresh the list if the source content
+# changes materially. To actually index these later, use a GPU embedder or
+# ingest them split out-of-band.
+_SKIP_PAGE_IDS: set[int] = {
+    585, 833, 835, 850, 901, 906, 907, 908, 940, 953,
+    1139, 1193, 1197, 1209, 1591, 1684, 1818, 1820, 2090, 2095,
+}
+# Minimum extracted-text length for a page to be treated as real content (skips
+# empty / unpublished / widget-only pages). Computed over page text + file text.
+_MIN_TEXT_LEN = 40
+# Only attempt download/extract for file types extract_file_text handles.
+_DOC_EXTENSIONS = (
+    ".pdf", ".docx", ".doc", ".pptx", ".xlsx", ".eml", ".epub", ".html", ".txt",
+)
+# Explicitly skip videos/media/archives: no useful text, and the originals are
+# huge (e.g. a 249 MB onboarding .mp4, many ~10 MB .webm walkthroughs). Skipped
+# with a log so it's auditable, before any download. (The allowlist above would
+# drop them too; this makes the intent explicit and visible.)
+_MEDIA_EXTENSIONS = ( + ".mp4", ".webm", ".mov", ".avi", ".mkv", ".wmv", ".flv", ".m4v", + ".mp3", ".wav", ".m4a", ".aac", ".ogg", + ".png", ".jpg", ".jpeg", ".gif", ".svg", ".bmp", ".webp", + ".zip", ".rar", ".7z", ".tar", ".gz", +) + + +class _TextExtractor(HTMLParser): + """Strip HTML to readable text, inserting newlines around block elements.""" + + _BLOCK = { + "p", "div", "br", "li", "ul", "ol", "tr", "td", "th", + "h1", "h2", "h3", "h4", "h5", "h6", "section", "table", + } + + def __init__(self) -> None: + super().__init__() + self._parts: list[str] = [] + + def handle_starttag(self, tag: str, attrs: Any) -> None: + if tag in self._BLOCK: + self._parts.append("\n") + + def handle_endtag(self, tag: str) -> None: + if tag in self._BLOCK: + self._parts.append("\n") + + def handle_data(self, data: str) -> None: + self._parts.append(data) + + def text(self) -> str: + joined = html.unescape("".join(self._parts)) + lines = [re.sub(r"[ \t ]+", " ", ln).strip() for ln in joined.splitlines()] + out: list[str] = [] + for ln in lines: + if ln or (out and out[-1]): + out.append(ln) + return "\n".join(out).strip() + + +def _strip_html(raw: str) -> str: + parser = _TextExtractor() + parser.feed(raw or "") + return parser.text() + + +def _break_long_tokens(text: str, n: int = _MAX_TOKEN_RUN) -> str: + """Insert a space into any whitespace-free run longer than `n` chars. Real + words are short; only pathological blobs (table dumps, base64) get broken, + which keeps the tokenizer linear without losing readable content.""" + if not text: + return text + return re.sub(r"\S{%d}" % n, lambda m: m.group(0) + " ", text) + + +def _split_sections(text: str, link: str, header: str | None = None) -> list[Section]: + """Turn one (possibly huge) text into several bounded-size Sections so the + chunker never tokenizes a giant string in one call. Splits on a newline near + each boundary when possible. `header` (e.g. 
a filename) prefixes the text.""" + text = (text or "").strip() + if not text: + return [] + text = text[:_MAX_DOC_CHARS] + if header: + text = f"{header}\n\n{text}" + out: list[Section] = [] + i, n = 0, len(text) + while i < n: + end = min(i + _MAX_SECTION_CHARS, n) + if end < n: # prefer a newline boundary in the back half of the window + nl = text.rfind("\n", i + _MAX_SECTION_CHARS // 2, end) + if nl != -1: + end = nl + chunk = text[i:end].strip() + if chunk: + out.append(Section(link=link, text=chunk)) + i = end + return out + + +def _extract_text_with_timeout(filename: str, content: bytes, timeout: int) -> str: + """Run extract_file_text under a soft wall-clock bound on a worker thread. + + NOTE: must be threads, not a subprocess — the dask indexing worker runs tasks + in a *daemonic* process, which cannot spawn children ("daemonic processes are + not allowed to have children"), so multiprocessing is unavailable here. On + timeout we abandon the thread (can't force-kill it) and return "" so the run + keeps moving. 
This is a backstop only: the real protection against the giant + documents that froze prod is the file-size + text caps below, which bound the + download and the embedding work in-process.""" + ex = ThreadPoolExecutor(max_workers=1) + try: + fut = ex.submit( + extract_file_text, + file_name=filename, + file=io.BytesIO(content), + break_on_unprocessable=False, + ) + return fut.result(timeout=timeout) + except FutureTimeout: + logger.warning(f"OutSystems extract timed out after {timeout}s: {filename}") + return "" + except Exception as e: + logger.warning(f"OutSystems extract error for {filename}: {e}") + return "" + finally: + ex.shutdown(wait=False) # never block on a hung extraction + + +def collect_widgets(obj: Any, html_blobs: list[str], files: list[tuple[str, str]]) -> None: + """Walk the page response and classify each PageWidgetItem: + - text widget (PageFileId falsy/"0"): Text1 HTML -> html_blobs + - document widget (PageFileId truthy) : (Text1 FilePath, Text2 filename) -> files + Keyed on the widget-item shape (a dict carrying both Text1 and PageFileId), + so a document widget's Text1 (a file path) is NOT mistaken for body text. 
+ """ + if isinstance(obj, dict): + if "Text1" in obj: + file_id = str(obj.get("PageFileId") or "0") + text1 = obj.get("Text1") + if file_id and file_id != "0": + filepath = (text1 or "").strip() + filename = (obj.get("Text2") or "").strip() or filepath.split("/")[-1] + if filepath: + files.append((filepath, filename)) + elif isinstance(text1, str) and text1.strip(): + html_blobs.append(text1) + # still recurse (widget items don't nest other widget items, but + # be safe / cheap) + for v in obj.values(): + collect_widgets(v, html_blobs, files) + elif isinstance(obj, list): + for v in obj: + collect_widgets(v, html_blobs, files) + + +def extract_page_text(data: dict) -> str: + """Plain text from the page's text widgets only (no file content).""" + html_blobs: list[str] = [] + files: list[tuple[str, str]] = [] + collect_widgets(data, html_blobs, files) + return "\n\n".join(_strip_html(b) for b in html_blobs if b).strip() + + +def extract_page_title(data: dict, body_text: str, page_id: int) -> str: + name = (data.get("Page2") or {}).get("Name") or "" + if name.strip(): + return name.strip() + for line in body_text.splitlines(): + if len(line.strip()) >= 3: + return line.strip()[:120] + return f"OutSystems Page {page_id}" + + +class OutSystemsConnector(LoadConnector): + def __init__( + self, + page_id_start: int = 1, + page_id_end: int = 600, + # Deliberately small (vs the default INDEX_BATCH_SIZE of 16). These pages + # carry large PDFs -> large text -> many embedding chunks per doc. Big + # batches made a single index_doc_batch tokenize/embed for minutes, + # holding the worker GIL until the run looked frozen. A small batch keeps + # each chunk/embed unit bounded so the worker stays responsive and the + # full range completes in one pass. 
+ batch_size: int = 4, + ) -> None: + self.page_id_start = int(page_id_start) + self.page_id_end = int(page_id_end) + self.batch_size = batch_size + + self.base_url = _DEFAULT_BASE_URL + self._cookie: str | None = None + self._csrf: str | None = None + self._api_version: str | None = None + self._file_api_version: str | None = None + self._module_version: str | None = None + + def load_credentials(self, credentials: dict[str, Any]) -> dict[str, Any] | None: + self.base_url = ( + credentials.get("outsystems_base_url") or _DEFAULT_BASE_URL + ).rstrip("/") + self._cookie = credentials.get("outsystems_cookie") + self._csrf = credentials.get("outsystems_csrf") + self._api_version = credentials.get("outsystems_api_version") + # Optional: enables downloading attached files (PDFs etc). Absent -> page + # text only. Server-enforced and distinct from the page apiVersion. + self._file_api_version = credentials.get("outsystems_file_api_version") + return None + + def _auth_headers(self) -> dict[str, str]: + # Swap point for the service account later. Never logged. 
+ return { + "cookie": self._cookie or "", + "x-csrftoken": self._csrf or "", + "content-type": "application/json; charset=UTF-8", + "accept": "application/json", + "referer": self.base_url + "/", + "origin": self.base_url, + } + + def _session(self) -> requests.Session: + if not self._cookie or not self._csrf: + raise ConnectorMissingCredentialError("OutSystems") + s = requests.Session() + s.headers.update(self._auth_headers()) + return s + + def _fetch_module_version(self, s: requests.Session) -> str: + if self._module_version: + return self._module_version + r = s.get(self.base_url + _MODULE_VERSION_URL, timeout=_HTTP_TIMEOUT) + r.raise_for_status() + token = r.json().get("versionToken", "") + if not token: + raise RuntimeError("Could not read moduleversioninfo.versionToken") + self._module_version = token + return token + + def _fetch_page(self, s: requests.Session, page_id: int) -> dict | None: + payload = { + "versionInfo": { + "moduleVersion": self._module_version, + "apiVersion": self._api_version, + }, + "viewName": _PAGE_VIEW_NAME, + "screenData": { + "variables": { + "PageId": str(page_id), + "_pageIdInDataFetchStatus": 1, + "PageTypeId": 1, + "_pageTypeIdInDataFetchStatus": 1, + "SubSiteId": "0", + "_subSiteIdInDataFetchStatus": 1, + "ShowShareButton": False, + "_showShareButtonInDataFetchStatus": 1, + } + }, + } + try: + r = s.post( + self.base_url + _PAGE_DATA_ACTION, json=payload, timeout=_HTTP_TIMEOUT + ) + except requests.RequestException as e: + # Hung / slow / dropped connection: skip this page so one bad page + # never freezes the whole scan. It can be re-picked on a later run. + logger.warning(f"OutSystems page {page_id}: request failed ({e}); skipping") + return None + if r.status_code in (401, 403): + # Session expired mid-run. There's no cursor, so the page range IS + # the resume point: re-run with page_id_start set to this PageId + # (and a fresh credential) to continue without re-doing earlier pages. 
+ raise ConnectorMissingCredentialError( + f"OutSystems session expired at PageId {page_id} — refresh the " + f"credential and re-run with page_id_start={page_id} to resume" + ) + if r.status_code != 200: + logger.warning(f"OutSystems page {page_id}: HTTP {r.status_code}") + return None + return r.json().get("data") + + def _file_download_url(self, s: requests.Session, filepath: str) -> str | None: + """Resolve a widget FilePath to a pre-authed SharePoint DownloadURL via + the W_Document screenservice. Returns None if disabled or unresolved.""" + if not self._file_api_version: + return None + payload = { + "versionInfo": { + "moduleVersion": self._module_version, + "apiVersion": self._file_api_version, + }, + "viewName": _PAGE_VIEW_NAME, + "inputParameters": { + "FilePath": filepath, + "IgnoreMediaRefreshToken": False, + "ForceRefresh": False, + "AzureId": "", + }, + } + r = s.post( + self.base_url + _FILE_METADATA_ACTION, json=payload, timeout=_HTTP_TIMEOUT + ) + if r.status_code != 200: + logger.warning(f"OutSystems file metadata {filepath}: HTTP {r.status_code}") + return None + return ((r.json().get("data") or {}).get("FileMetadata") or {}).get( + "DownloadURL" + ) or None + + def _download_file_text( + self, s: requests.Session, filepath: str, filename: str + ) -> str: + """Download an attached file and extract its text. Best-effort: any + failure logs and returns "" so one bad file never sinks the page.""" + low = filename.lower() + if low.endswith(_MEDIA_EXTENSIONS): + logger.info(f"OutSystems skip video/media (no text): {filename}") + return "" + if not low.endswith(_DOC_EXTENSIONS): + logger.info(f"OutSystems skip unsupported file type: {filename}") + return "" + try: + url = self._file_download_url(s, filepath) + if not url: + return "" + # NOTE: the DownloadURL carries its own tempauth; do NOT send the + # inside.uipath.com session cookie to sharepoint.com — use a bare + # request so no cross-domain credential leak occurs. 
Stream so we can + # bail on oversized files without buffering them fully. + fr = requests.get(url, timeout=_HTTP_TIMEOUT, stream=True) + if fr.status_code != 200: + logger.warning(f"OutSystems download {filename}: HTTP {fr.status_code}") + return "" + declared = int(fr.headers.get("content-length") or 0) + if declared > _MAX_FILE_BYTES: + logger.info( + f"OutSystems skip oversized file {filename} " + f"({declared // 1048576} MB > {_MAX_FILE_BYTES // 1048576} MB)" + ) + fr.close() + return "" + # Read with a hard byte cap (covers missing/false Content-Length). + chunks: list[bytes] = [] + total = 0 + for piece in fr.iter_content(chunk_size=262144): + chunks.append(piece) + total += len(piece) + if total > _MAX_FILE_BYTES: + logger.info(f"OutSystems skip oversized file {filename} (stream cap)") + fr.close() + return "" + content = b"".join(chunks) + text = _extract_text_with_timeout( + filename, content, _FILE_EXTRACT_TIMEOUT + ) + # Break pathological long runs (linear tokenization) then cap so a + # huge doc can't explode into thousands of chunks and stall embedding. + return _break_long_tokens(text.strip())[:_MAX_DOC_CHARS] + except Exception as e: + logger.warning(f"OutSystems file extract failed/skipped for {filename}: {e}") + return "" + + def load_from_state(self) -> GenerateDocumentsOutput: + s = self._session() + self._fetch_module_version(s) + + batch: list[Document] = [] + for page_id in range(self.page_id_start, self.page_id_end + 1): + # Progress breadcrumb: if the session expires, the logs show how far + # we got so the run can be resumed via page_id_start. 
+ if page_id % 50 == 0: + logger.info(f"OutSystems: scanning PageId {page_id}/{self.page_id_end}") + if page_id in _SKIP_PAGE_IDS: + logger.info(f"OutSystems: skipping known problem PageId {page_id}") + continue + data = self._fetch_page(s, page_id) + if not data: + continue + + html_blobs: list[str] = [] + file_refs: list[tuple[str, str]] = [] + collect_widgets(data, html_blobs, file_refs) + page_text = _break_long_tokens( + "\n\n".join(_strip_html(b) for b in html_blobs if b).strip() + ) + + url = f"{self.base_url}/Page?PageId={page_id}" + # Split each text into bounded sections so the chunker can't choke on + # a giant single string (keeps all content; no truncation/skip). + sections: list[Section] = _split_sections(page_text, url) + for filepath, filename in file_refs: + file_text = self._download_file_text(s, filepath, filename) + if file_text: + sections.extend(_split_sections(file_text, url, header=filename)) + + total_len = sum(len(sec.text) for sec in sections) + + # Skip pages with no usable content (neither page text nor files). 
+ if total_len < _MIN_TEXT_LEN or not sections: + continue + + title = extract_page_title(data, page_text, page_id) + batch.append( + Document( + id=f"OUTSYSTEMS__page_{page_id}", + sections=sections, + source=DocumentSource.OUTSYSTEMS, + semantic_identifier=title, + title=title, + metadata={"page_id": str(page_id)}, + ) + ) + if len(batch) >= self.batch_size: + yield batch + batch = [] + if batch: + yield batch diff --git a/backend/danswer/connectors/web/connector.py b/backend/danswer/connectors/web/connector.py index b1e3ff36715..f66346a9d15 100644 --- a/backend/danswer/connectors/web/connector.py +++ b/backend/danswer/connectors/web/connector.py @@ -1,6 +1,7 @@ import io import ipaddress import random +import re import socket import time from datetime import datetime @@ -127,8 +128,17 @@ def is_valid_url(url: str) -> bool: def get_internal_links( - base_url: str, url: str, soup: BeautifulSoup, should_ignore_pound: bool = True + base_url: str, + url: str, + soup: BeautifulSoup, + should_ignore_pound: bool = True, + allowed_prefixes: list[str] | None = None, ) -> set[str]: + # Recursion scope: a link is followed only if it lives under one of these + # prefixes. Defaults to [base_url]; for UiPath docs version-expansion this is + # the set of the latest-N version base URLs so all N subtrees are crawled + # (and older versions are excluded). 
+ prefixes = allowed_prefixes if allowed_prefixes else [base_url] internal_links = set() for link in cast(list[dict[str, Any]], soup.find_all("a")): href = cast(str | None, link.get("href")) @@ -142,7 +152,9 @@ def get_internal_links( # Relative path handling href = urljoin(url, href) - if urlparse(href).netloc == urlparse(url).netloc and base_url in href: + if urlparse(href).netloc == urlparse(url).netloc and any( + prefix in href for prefix in prefixes + ): internal_links.add(href) return internal_links @@ -279,6 +291,127 @@ def get_sitemap_url_from_base_url(base_url: str) -> str: return sitemap_url +# Concrete UiPath docs version segment, e.g. "2022.10", "2023.4", "2.2510". +# Deliberately matches only numeric dotted versions so the "latest" alias and +# non-version path segments are excluded. +_UIPATH_VERSION_RE = re.compile(r"^\d+(?:\.\d+)+$") + + +def _uipath_product_prefix(path: str) -> str: + """The path up to (but excluding) the first version-or-'latest' segment. + + /robot/standalone/latest -> /robot/standalone + /robot/standalone/2022.10 -> /robot/standalone + /apps/automation-suite/ -> /apps/automation-suite + /automation-cloud/automation-cloud/latest/x/y -> /automation-cloud/automation-cloud + """ + segs = [s for s in path.strip("/").split("/") if s] + cut = len(segs) + for i, seg in enumerate(segs): + if seg == "latest" or _UIPATH_VERSION_RE.match(seg): + cut = i + break + return "/" + "/".join(segs[:cut]) + + +def get_uipath_docs_version_base_urls(base_url: str, max_versions: int = 2) -> list[str]: + """Expand a docs.uipath.com product URL to the base URLs of its latest N + concrete versions. + + docs.uipath product pages render a version selector (newest-first) linking + each available version, e.g. /robot/standalone/2025.10. On-prem / standalone + products list many concrete versions; cloud / evergreen products expose only + the 'latest' alias. 
We take the first `max_versions` CONCRETE versions in the + selector's own order — the site already sorts newest-first, which correctly + handles the calendar -> MAJOR.YYMM scheme change (e.g. 2.2510 is newer than + 2023.10, which no naive numeric sort would get right) — skipping 'latest'. + + Returns: + - the latest N concrete version base URLs for versioned products, else + - [base_url] unchanged for non-docs.uipath.com URLs, or evergreen / single + pages already scoped to a version or '/latest'. + + RAISES (fail-safe, does NOT fall back to a product-base crawl) when the + selector can't be fetched after retries, or no concrete versions are found + AND base_url is a version-less product base — both cases would otherwise + recursively index EVERY version. + """ + if "docs.uipath.com" not in base_url: + return [base_url] + + prefix = _uipath_product_prefix(urlparse(base_url).path) + prefix_segs = [s for s in prefix.strip("/").split("/") if s] + path_segs = [s for s in urlparse(base_url).path.strip("/").split("/") if s] + # base_url already sits inside a specific version / 'latest' segment (so a + # recursive crawl stays within ONE version) iff a version/'latest' segment + # was found, i.e. the product prefix is shorter than the full path. + base_within_version = len(prefix_segs) < len(path_segs) + + # Fetch the version selector, with retries. A transient failure must NEVER + # silently fall back to crawling the bare product base — for a version-less + # base URL that recursively indexes EVERY version, the exact thing this + # feature exists to prevent. Fail the run instead; it retries cleanly. 
+ soup = None + last_err: Exception | None = None + for attempt in range(3): + try: + response = requests.get(base_url, timeout=30) + response.raise_for_status() + soup = BeautifulSoup(response.content, "html.parser") + break + except Exception as e: + last_err = e + time.sleep(2 * (attempt + 1)) + if soup is None: + raise RuntimeError( + f"Could not fetch the UiPath docs version selector for {base_url} " + f"after 3 attempts ({last_err}). Failing rather than falling back to " + "a full product-base crawl (which would index every version)." + ) + + # Collect version-root links in document order (the selector is + # newest-first), deduped: exactly the product prefix + one concrete version + # segment (no deeper path -> isolates selector links from content links). + versions: list[str] = [] + for link in cast(list[dict[str, Any]], soup.find_all("a")): + href = cast(str | None, link.get("href")) + if not href: + continue + parsed = urlparse(urljoin(base_url, href.split("#")[0])) + if parsed.netloc != "docs.uipath.com": + continue + segs = [s for s in parsed.path.strip("/").split("/") if s] + if ( + len(segs) == len(prefix_segs) + 1 + and segs[: len(prefix_segs)] == prefix_segs + and _UIPATH_VERSION_RE.match(segs[-1]) + and segs[-1] not in versions + ): + versions.append(segs[-1]) + + if versions: + latest = versions[:max_versions] + result = [f"https://docs.uipath.com{prefix}/{v}" for v in latest] + logger.info( + f"docs.uipath versioned product '{prefix}': crawling latest " + f"{len(result)} of {len(versions)} versions {latest}" + ) + return result + + # No concrete versions in the selector. Crawl base_url as-is ONLY if it's + # already scoped to a single version / 'latest' (e.g. an evergreen product + # like activities/other/latest, or a deep cloud page .../latest/admin/...). + # A version-less product base would span EVERY version under a recursive + # crawl, so refuse instead of leaking. 
+ if base_within_version: + return [base_url] + raise RuntimeError( + f"No concrete versions found for UiPath docs product '{prefix}' and " + f"'{base_url}' has no version/'latest' segment. Refusing a full " + "product-base crawl — point the connector at a specific version or /latest." + ) + + class WebConnector(LoadConnector, PollConnector): def __init__( self, @@ -286,16 +419,39 @@ def __init__( web_connector_type: str = WEB_CONNECTOR_VALID_SETTINGS.RECURSIVE.value, mintlify_cleanup: bool = True, # Mostly ok to apply to other websites as well batch_size: int = INDEX_BATCH_SIZE, + # OPT-IN, docs.uipath.com + RECURSIVE only. When True, the configured + # base_url is expanded to the latest `max_versions` concrete versions of + # that product (via the page's version selector), re-evaluated every + # indexing run so new releases are auto-picked-up and the oldest drops + # off. Leave False (default) for every other connector — auto-applying + # would make per-version connectors all crawl the same latest set. + uipath_latest_versions: bool = False, + max_versions: int = 2, ) -> None: self.base_url = base_url self.mintlify_cleanup = mintlify_cleanup self.batch_size = batch_size self.recursive = False self.web_connector_type = web_connector_type + self.uipath_latest_versions = uipath_latest_versions + self.max_versions = max_versions + # Recursion scope (RECURSIVE mode). Defaults to the configured base_url; + # overridden below to the latest-N version base URLs when version + # tracking is enabled for a versioned docs.uipath product. + self.recursive_prefixes: list[str] = [base_url] if web_connector_type == WEB_CONNECTOR_VALID_SETTINGS.RECURSIVE.value: self.recursive = True - self.to_visit_list = [_ensure_valid_url(base_url)] + if uipath_latest_versions: + # docs.uipath versioned product -> latest N version base URLs; + # returns [base_url] unchanged for evergreen / non-docs URLs. 
+ expanded = get_uipath_docs_version_base_urls( + base_url, max_versions=max_versions + ) + self.to_visit_list = [_ensure_valid_url(u) for u in expanded] + self.recursive_prefixes = self.to_visit_list + else: + self.to_visit_list = [_ensure_valid_url(base_url)] return elif web_connector_type == WEB_CONNECTOR_VALID_SETTINGS.SINGLE.value: @@ -431,7 +587,12 @@ def load_from_state(self, is_polling: bool = False) -> GenerateDocumentsOutput: soup = BeautifulSoup(content, "html.parser") if self.recursive and not is_polling: - for link in get_internal_links(base_url, current_url, soup): + for link in get_internal_links( + base_url, + current_url, + soup, + allowed_prefixes=self.recursive_prefixes, + ): if link not in visited_links: to_visit.append(link) @@ -544,6 +705,15 @@ def poll_source( break else: # RECURSIVE case for url, lastmod in urls_with_dates: + # Scope to the crawl prefixes. For a uipath_latest_versions + # connector these are the latest-N version base URLs, so the + # all-versions product sitemap (it lists EVERY version under + # the product) is filtered down to just those. Without this, + # poll re-indexes every version, bypassing the expansion. + # For a normal connector recursive_prefixes == [base_url], so + # this is the same scoping as before. + if not any(prefix in url for prefix in self.recursive_prefixes): + continue if lastmod and start_datetime <= lastmod <= end_datetime: urls_to_index.append(url) # If we don't have a lastmod date, we should check the page @@ -563,8 +733,14 @@ def poll_source( except Exception as e: logger.warning(f"Failed to use sitemap for polling: {e}") - # Fall back to regular indexing if sitemap fails - return self.load_from_state(is_polling=True) + # Fall back to a full RECURSIVE crawl (is_polling=False) when the + # sitemap is unavailable. 
The product sitemap URL is currently stale + # (404s), so this fallback is the common path; with is_polling=True + # recursion is disabled (see load_from_state) and only the seed + # URL(s) would be indexed — i.e. just landing pages. A full crawl + # also honors uipath_latest_versions (to_visit_list/recursive_prefixes + # are version-expanded in __init__). + return self.load_from_state(is_polling=False) if __name__ == "__main__": diff --git a/backend/danswer/danswerbot/slack/handlers/handle_message.py b/backend/danswer/danswerbot/slack/handlers/handle_message.py index d4b79ef0d2c..d7e263a94c7 100644 --- a/backend/danswer/danswerbot/slack/handlers/handle_message.py +++ b/backend/danswer/danswerbot/slack/handlers/handle_message.py @@ -50,6 +50,9 @@ from danswer.db.models import SlackBotConfig from danswer.db.models import SlackBotResponseType from danswer.db.persona import fetch_persona_by_id +from danswer.db.slack_response_blocklist import ( + get_slack_response_blocklisted_emails, +) from danswer.db.persona import get_persona_with_docset_and_prompts from danswer.db.persona import get_personas from danswer.db.users import add_slack_persona_for_user @@ -69,7 +72,6 @@ from danswer.search.models import OptionalSearchSetting from danswer.search.models import RetrievalDetails from danswer.utils.logger import setup_logger -from shared_configs.configs import ENABLE_RERANKING_ASYNC_FLOW logger_base = setup_logger() @@ -230,6 +232,38 @@ def handle_message( is_bot_dm = message_info.is_bot_dm persona_name = None + # Suppress responses for senders on the DB-driven blocklist (e.g. people whose + # messages should never trigger Darwin). Only hit the Slack API for the + # sender's email when the blocklist is non-empty, so the common case adds no + # per-message API call. 
+ if sender_id: + try: + with Session(get_sqlalchemy_engine()) as db_session: + blocklisted_emails = get_slack_response_blocklisted_emails( + db_session + ) + except Exception: + # Fail open: if the blocklist can't be read (e.g. the table doesn't + # exist yet mid-migration, or a transient DB error), respond as + # normal rather than dropping the message. + blocklisted_emails = set() + logger.warning("Could not load Slack response blocklist; proceeding") + if blocklisted_emails: + sender_email: str | None = None + try: + sender_email = ( + client.users_info(user=sender_id) + .data["user"]["profile"] # type: ignore + .get("email") + ) + except Exception: + logger.warning("Unable to fetch sender email for blocklist check") + if sender_email and sender_email.lower() in blocklisted_emails: + logger.info( + f"Ignoring Slack message from blocklisted sender {sender_email}" + ) + return False + # Check if channel config has JIRA integration enabled and title filter if channel_config and channel_config.channel_config: channel_conf = channel_config.channel_config @@ -659,7 +693,10 @@ def _get_answer(new_message_request: DirectQARequest) -> OneShotQAResponse | Non persona_id=persona.id if persona is not None else 0, retrieval_options=retrieval_details, chain_of_thought=not disable_cot, - skip_rerank=not ENABLE_RERANKING_ASYNC_FLOW, + # Leave None so retrieval_preprocessing resolves reranking from + # the global RERANK_ENABLED switch + the assistant's + # rerank_enabled flag — same logic as the chat flow. 
+ skip_rerank=None, ) ) except Exception as e: diff --git a/backend/danswer/db/chat.py b/backend/danswer/db/chat.py index 4c791bcfc2e..c2732fdbda5 100644 --- a/backend/danswer/db/chat.py +++ b/backend/danswer/db/chat.py @@ -72,6 +72,10 @@ def get_chat_sessions_by_user( deleted: bool | None, db_session: Session, include_one_shot: bool = False, + limit: int | None = None, + offset: int = 0, + start_time: datetime | None = None, + end_time: datetime | None = None, ) -> list[ChatSession]: stmt = select(ChatSession).where(ChatSession.user_id == user_id) @@ -81,6 +85,25 @@ def get_chat_sessions_by_user( if deleted is not None: stmt = stmt.where(ChatSession.deleted == deleted) + # Date-range bounds let the sidebar load each time bucket independently + # (recent / previous-30-days / older) without paging through the buckets + # above it. `start_time` is inclusive, `end_time` exclusive, so adjacent + # buckets partition cleanly without overlap. + if start_time is not None: + stmt = stmt.where(ChatSession.time_created >= start_time) + if end_time is not None: + stmt = stmt.where(ChatSession.time_created < end_time) + + # Newest first so the sidebar's "recent" page + lazy-loaded older pages + # walk backwards through history deterministically (offset pagination). + # id breaks ties when several sessions share the same time_created. 
+ stmt = stmt.order_by(ChatSession.time_created.desc(), ChatSession.id.desc()) + + if offset: + stmt = stmt.offset(offset) + if limit is not None: + stmt = stmt.limit(limit) + result = db_session.execute(stmt) chat_sessions = result.scalars().all() diff --git a/backend/danswer/db/document_set.py b/backend/danswer/db/document_set.py index cd68527a7b2..59b1590d3d9 100644 --- a/backend/danswer/db/document_set.py +++ b/backend/danswer/db/document_set.py @@ -7,6 +7,7 @@ from sqlalchemy import func from sqlalchemy import or_ from sqlalchemy import select +from sqlalchemy import text from sqlalchemy.orm import Session from danswer.db.document_set_cache import invalidate_document_sets_all @@ -20,6 +21,13 @@ from danswer.utils.variable_functionality import fetch_versioned_implementation +def document_set_sync_cursor_key(document_set_id: int) -> str: + """key_value_store key holding the resumable doc-set sync cursor (the last + document id synced). Lets sync_document_set_task resume after a restart or + the 6h soft_time_limit instead of re-syncing from scratch.""" + return f"document_set_sync_cursor__{document_set_id}" + + def _delete_document_set_cc_pairs__no_commit( db_session: Session, document_set_id: int, is_current: bool | None = None ) -> None: @@ -170,6 +178,13 @@ def update_document_set( document_set_row.is_up_to_date = False document_set_row.is_public = document_set_update_request.is_public + # A membership change needs a full re-tag of every doc, so discard any + # resume cursor left by a previous (interrupted) sync of this set. 
+ db_session.execute( + text("DELETE FROM key_value_store WHERE key = :k"), + {"k": document_set_sync_cursor_key(document_set_update_request.id)}, + ) + versioned_private_doc_set_fn = fetch_versioned_implementation( "danswer.db.document_set", "make_doc_set_private" ) diff --git a/backend/danswer/db/engine.py b/backend/danswer/db/engine.py index 1c3c0a69cae..da62d13eec0 100644 --- a/backend/danswer/db/engine.py +++ b/backend/danswer/db/engine.py @@ -87,6 +87,15 @@ def get_sqlalchemy_async_engine() -> AsyncEngine: connection_string, pool_size=POSTGRES_POOL_SIZE, max_overflow=POSTGRES_POOL_OVERFLOW, + # Mirror the sync engine: validate a pooled connection before use so + # a stale/closed one (server- or proxy-side idle timeout) is + # transparently reconnected instead of raising + # `asyncpg.InterfaceError: connection is closed`, which was surfacing + # as intermittent 500s on async endpoints (e.g. the /chat SSR + # fetches). pool_recycle proactively retires connections before the + # server's idle timeout can close them. + pool_pre_ping=True, + pool_recycle=3600, ) return _ASYNC_ENGINE diff --git a/backend/danswer/db/models.py b/backend/danswer/db/models.py index f99b0da8f24..cb69b89ec40 100644 --- a/backend/danswer/db/models.py +++ b/backend/danswer/db/models.py @@ -116,12 +116,21 @@ class User(SQLAlchemyBaseUserTableUUID, Base): putting here for simpicity """ - # if specified, controls the assistants that are shown to the user + their order - # if not specified, all assistants are shown + # if specified, controls the ORDER (and pinned default = position 0) of the + # assistants shown to the user. Visibility is governed by `hidden_assistants` + # below, NOT by membership here. If not specified, the natural order is used. chosen_assistants: Mapped[list[int]] = mapped_column( postgresql.ARRAY(Integer), nullable=True ) + # Assistants the user has explicitly hidden from their picker. 
Opt-OUT model: + # every accessible assistant is visible by default — so a newly created + # (e.g. admin) assistant appears for all users automatically — and a user + # removes the ones they don't want here. Empty list = nothing hidden. + hidden_assistants: Mapped[list[int]] = mapped_column( + postgresql.ARRAY(Integer), nullable=False, server_default="{}" + ) + # relationships credentials: Mapped[list["Credential"]] = relationship( "Credential", back_populates="user", lazy="joined" @@ -998,6 +1007,10 @@ class Persona(Base): id: Mapped[int] = mapped_column(primary_key=True) user_id: Mapped[UUID | None] = mapped_column(ForeignKey("user.id"), nullable=True) name: Mapped[str] = mapped_column(String) + # User-friendly label shown in the chat UI. The immutable `name` stays the + # identifier; `display_name` is presentational only and admin-editable. + # Backfilled to `name`; chat falls back to `name` if blank. + display_name: Mapped[str | None] = mapped_column(String, nullable=True) description: Mapped[str] = mapped_column(String) # Currently stored but unused, all flows use hybrid search_type: Mapped[SearchType] = mapped_column( @@ -1014,6 +1027,13 @@ class Persona(Base): recency_bias: Mapped[RecencyBiasSetting] = mapped_column( Enum(RecencyBiasSetting, native_enum=False) ) + # Per-assistant opt-in for cross-encoder reranking (beta). Only takes effect + # when reranking is globally available (RERANK_ENABLED + a GPU-backed model + # server); see search/preprocessing/preprocessing.py. Default off so existing + # assistants and the GPU-free local setup are unchanged until toggled. + rerank_enabled: Mapped[bool] = mapped_column( + Boolean, default=False, server_default="false" + ) # Allows the Persona to specify a different LLM version than is controlled # globablly via env variables. 
For flexibility, validity is not currently enforced # NOTE: only is applied on the actual response generation - is not used for things like @@ -1140,6 +1160,22 @@ class SlackBotConfig(Base): persona: Mapped[Persona | None] = relationship("Persona") +class SlackBotResponseBlocklist(Base): + """Senders (by email) whose Slack messages should NOT trigger a Darwin + response. DB-driven so the list can be managed without a redeploy. Checked + early in danswerbot/slack/handlers/handle_message.py::handle_message.""" + + __tablename__ = "slack_bot_response_blocklist" + + id: Mapped[int] = mapped_column(primary_key=True) + # Stored lowercase; matched case-insensitively against the Slack sender's + # profile email. + email: Mapped[str] = mapped_column(String, unique=True, index=True) + created_at: Mapped[datetime.datetime] = mapped_column( + DateTime(timezone=True), server_default=func.now() + ) + + class UserSlackPersona(Base): __tablename__ = "user_slack_persona" diff --git a/backend/danswer/db/persona.py b/backend/danswer/db/persona.py index e39eb47306b..600884f4499 100644 --- a/backend/danswer/db/persona.py +++ b/backend/danswer/db/persona.py @@ -10,11 +10,13 @@ from sqlalchemy import select from sqlalchemy import update from sqlalchemy.orm import joinedload +from sqlalchemy.orm import selectinload from sqlalchemy.orm import Session from danswer.auth.schemas import UserRole from danswer.db.constants import SLACK_BOT_PERSONA_PREFIX from danswer.db.engine import get_sqlalchemy_engine +from danswer.db.models import ConnectorCredentialPair from danswer.db.models import DocumentSet from danswer.db.models import Persona from danswer.db.models import Persona__User @@ -69,11 +71,13 @@ def create_update_persona( persona_id=persona_id, user=user, name=create_persona_request.name, + display_name=create_persona_request.display_name, description=create_persona_request.description, num_chunks=create_persona_request.num_chunks, 
llm_relevance_filter=create_persona_request.llm_relevance_filter, llm_filter_extraction=create_persona_request.llm_filter_extraction, recency_bias=create_persona_request.recency_bias, + rerank_enabled=create_persona_request.rerank_enabled, prompt_ids=create_persona_request.prompt_ids, tool_ids=create_persona_request.tool_ids, document_set_ids=create_persona_request.document_set_ids, @@ -157,6 +161,30 @@ def get_prompts( return db_session.scalars(stmt).all() +def _persona_snapshot_load_options() -> list: + """Eager-load every relationship PersonaSnapshot.from_model touches so + serializing personas (admin list, edit page) doesn't N+1 down + document_sets -> connector_credential_pairs -> connector/credential. Opt-in + via the `eager_load` flag — non-serializing callers (visibility toggle, + delete, slack matching) skip it and stay cheap. + """ + return [ + joinedload(Persona.user), + selectinload(Persona.prompts), + selectinload(Persona.tools), + selectinload(Persona.users), + selectinload(Persona.groups), + selectinload(Persona.document_sets) + .selectinload(DocumentSet.connector_credential_pairs) + .options( + joinedload(ConnectorCredentialPair.connector), + joinedload(ConnectorCredentialPair.credential), + ), + selectinload(Persona.document_sets).selectinload(DocumentSet.users), + selectinload(Persona.document_sets).selectinload(DocumentSet.groups), + ] + + def get_personas( # if user_id is `None` assume the user is an admin or auth is disabled user_id: UUID | None, @@ -164,8 +192,11 @@ def get_personas( include_default: bool = True, include_slack_bot_personas: bool = False, include_deleted: bool = False, + eager_load: bool = False, ) -> Sequence[Persona]: stmt = select(Persona).distinct() + if eager_load: + stmt = stmt.options(*_persona_snapshot_load_options()) if user_id is not None: # Subquery to find all groups the user belongs to user_groups_subquery = ( @@ -195,7 +226,9 @@ def get_personas( if not include_deleted: stmt = 
stmt.where(Persona.deleted.is_(False)) - return db_session.scalars(stmt).all() + # .unique() is required when eager_load joins collection relationships; + # harmless otherwise (rows are already distinct). + return db_session.scalars(stmt).unique().all() def mark_persona_as_deleted( @@ -342,6 +375,8 @@ def upsert_persona( starter_messages: list[StarterMessage] | None, is_public: bool, db_session: Session, + rerank_enabled: bool = False, + display_name: str | None = None, prompt_ids: list[int] | None = None, document_set_ids: list[int] | None = None, tool_ids: list[int] | None = None, @@ -388,11 +423,13 @@ def upsert_persona( check_user_can_edit_persona(user=user, persona=persona) persona.name = name + persona.display_name = display_name or name persona.description = description persona.num_chunks = num_chunks persona.llm_relevance_filter = llm_relevance_filter persona.llm_filter_extraction = llm_filter_extraction persona.recency_bias = recency_bias + persona.rerank_enabled = rerank_enabled persona.default_persona = default_persona persona.llm_model_provider_override = llm_model_provider_override persona.llm_model_version_override = llm_model_version_override @@ -419,11 +456,13 @@ def upsert_persona( user_id=user.id if user else None, is_public=is_public, name=name, + display_name=display_name or name, description=description, num_chunks=num_chunks, llm_relevance_filter=llm_relevance_filter, llm_filter_extraction=llm_filter_extraction, recency_bias=recency_bias, + rerank_enabled=rerank_enabled, default_persona=default_persona, prompts=prompts or [], document_sets=document_sets or [], @@ -570,8 +609,11 @@ def get_persona_by_id( db_session: Session, include_deleted: bool = False, is_for_edit: bool = True, # NOTE: assume true for safety + eager_load: bool = False, ) -> Persona: stmt = select(Persona).where(Persona.id == persona_id) + if eager_load: + stmt = stmt.options(*_persona_snapshot_load_options()) or_conditions = [] @@ -589,7 +631,9 @@ def get_persona_by_id( if 
not include_deleted: stmt = stmt.where(Persona.deleted.is_(False)) - result = db_session.execute(stmt) + # .unique() is required when eager_load joins collection relationships; + # harmless otherwise. + result = db_session.execute(stmt).unique() persona = result.scalar_one_or_none() if persona is None: diff --git a/backend/danswer/db/slack_response_blocklist.py b/backend/danswer/db/slack_response_blocklist.py new file mode 100644 index 00000000000..cf8f1a8d102 --- /dev/null +++ b/backend/danswer/db/slack_response_blocklist.py @@ -0,0 +1,37 @@ +"""Helpers for the Slack response blocklist — senders (by email) whose messages +should NOT trigger a Darwin response. DB-driven (see +db/models.py::SlackBotResponseBlocklist); consumed by +danswerbot/slack/handlers/handle_message.py. +""" +from sqlalchemy import select +from sqlalchemy.orm import Session + +from danswer.db.models import SlackBotResponseBlocklist + + +def get_slack_response_blocklisted_emails(db_session: Session) -> set[str]: + """Lowercased set of emails whose Slack messages must not trigger a response. + + Returns an empty set when nothing is blocklisted, which lets the caller skip + the per-message Slack `users_info` lookup entirely. 
+ """ + emails = db_session.scalars(select(SlackBotResponseBlocklist.email)).all() + return {email.lower() for email in emails} + + +def add_email_to_slack_response_blocklist( + email: str, db_session: Session +) -> SlackBotResponseBlocklist: + entry = SlackBotResponseBlocklist(email=email.strip().lower()) + db_session.add(entry) + db_session.commit() + return entry + + +def remove_email_from_slack_response_blocklist( + email: str, db_session: Session +) -> None: + db_session.query(SlackBotResponseBlocklist).filter( + SlackBotResponseBlocklist.email == email.strip().lower() + ).delete() + db_session.commit() diff --git a/backend/danswer/db/tasks.py b/backend/danswer/db/tasks.py index 58bd605aa36..fec667d6220 100644 --- a/backend/danswer/db/tasks.py +++ b/backend/danswer/db/tasks.py @@ -133,3 +133,60 @@ def check_task_is_live_and_not_timed_out( time_elapsed = current_db_time - last_update_time return time_elapsed.total_seconds() < timeout + + +# Prefix of the connector-deletion (cleanup) task name. Mirrors +# task_utils.name_cc_cleanup_task -> "cleanup_connector_credential_pair_{cid}_{credid}". +# Kept as a literal here to avoid a circular import (task_utils imports this module). +_CC_CLEANUP_TASK_PREFIX = "cleanup_connector_credential_pair_" + + +def get_stuck_deletion_cc_ids( + db_session: Session, + timeout: int = JOB_TIMEOUT, +) -> list[tuple[int, int]]: + """(connector_id, credential_id) for connector deletions that are orphaned. + + A deletion is the event-driven `cleanup_connector_credential_pair_task`, + enqueued on the (non-durable) Redis broker. If Redis or the worker restarts + while it's queued, the broker message is lost but the `task_queue_jobs` row + stays non-terminal (PENDING/STARTED) forever — the connector shows + "Deleting" indefinitely and nothing re-runs it (deletion, unlike sync/prune, + isn't periodically rescheduled). 
This returns the cc-pairs whose LATEST + cleanup row is non-terminal AND older than `timeout`, so the periodic + re-drive (`check_for_stuck_deletion_tasks`) can re-enqueue them. + """ + # latest task_queue_jobs row id per cleanup task_name + latest_ids_subq = ( + select(func.max(TaskQueueState.id).label("max_id")) + .where(TaskQueueState.task_name.like(f"{_CC_CLEANUP_TASK_PREFIX}%")) + .group_by(TaskQueueState.task_name) + .subquery() + ) + latest_rows = ( + db_session.execute( + select(TaskQueueState).join( + latest_ids_subq, TaskQueueState.id == latest_ids_subq.c.max_id + ) + ) + .scalars() + .all() + ) + + stuck: list[tuple[int, int]] = [] + for task in latest_rows: + # Live (recently (re)enqueued or genuinely running within timeout) — leave it. + if check_task_is_live_and_not_timed_out(task, db_session, timeout=timeout): + continue + # Terminal — deletion finished or genuinely failed; not orphaned. + if task.status in (TaskStatus.SUCCESS, TaskStatus.FAILURE): + continue + # Non-terminal AND timed out => orphaned. Recover the ids from the name. + suffix = task.task_name[len(_CC_CLEANUP_TASK_PREFIX) :] + try: + connector_id_str, credential_id_str = suffix.rsplit("_", 1) + stuck.append((int(connector_id_str), int(credential_id_str))) + except (ValueError, TypeError): + # malformed name — skip rather than crash the whole sweep + continue + return stuck diff --git a/backend/danswer/document_index/vespa/index.py b/backend/danswer/document_index/vespa/index.py index 8605a9bdd86..a07b292ca94 100644 --- a/backend/danswer/document_index/vespa/index.py +++ b/backend/danswer/document_index/vespa/index.py @@ -97,6 +97,11 @@ _NUM_THREADS = ( 32 # since Vespa doesn't allow batching of inserts / updates, we use threads ) +# How many document_ids to fold into a single visit/selection scan when looking +# up chunk IDs in bulk (see _get_vespa_chunk_ids_by_document_ids). Kept modest +# so the `... 
or ...` selection string stays well within Vespa's request-URI +# limit even for long (URL-style) document ids. +_VESPA_VISIT_DOC_ID_BATCH = 25 # up from 500ms for now, since we've seen quite a few timeouts # in the long term, we are looking to improve the performance of Vespa # so that we can bring this back to default @@ -218,7 +223,6 @@ def _get_vespa_chunks_by_document_id( ): continue document_chunks.append(document) - document_chunks.extend(response_data["documents"]) # Check for continuation token to handle pagination if "continuation" in response_data and response_data["continuation"]: @@ -241,6 +245,63 @@ def _get_vespa_chunk_ids_by_document_id( return [chunk["id"].split("::", 1)[-1] for chunk in document_chunks] +def _get_vespa_chunk_ids_by_document_ids( + document_ids: list[str], index_name: str +) -> dict[str, list[str]]: + """Fetch chunk IDs for MANY documents in a single Vespa visit. + + A visit with `document_id == 'X'` is a selection scan; doing it once per + document (the old behavior) meant N scans for N documents. Selecting all + the ids in one visit (`... or ... or ...`) returns every matching chunk in a + single scan, so bulk updates (doc-set sync especially) issue far fewer + scans. Callers batch `document_ids` (see _VESPA_VISIT_DOC_ID_BATCH) to keep + the selection string / request URL within Vespa's limits. Returns chunk IDs + grouped by their document_id field. 
+ """ + if not document_ids: + return {} + + url = DOCUMENT_ID_ENDPOINT.format(index_name=index_name) + id_filter = " or ".join( + f"{index_name}.document_id=='{document_id}'" for document_id in document_ids + ) + params: dict[str, str | int | None] = { + "selection": f"({id_filter})", + "continuation": None, + "wantedDocumentCount": 1_000, + "fieldSet": f"{index_name}:{DOCUMENT_ID}", + } + + chunk_ids_by_document: dict[str, list[str]] = {} + while True: + response = requests.get(url, params=params) + try: + response.raise_for_status() + except requests.HTTPError as e: + error_base = ( + f"Error getting chunk IDs for {len(document_ids)} documents " + f"in index {index_name}" + ) + logger.error(f"{error_base}: {response.status_code} {response.text}") + raise requests.HTTPError(error_base) from e + + response_data = response.json() + for document in response_data.get("documents", []): + doc_id = document.get("fields", {}).get(DOCUMENT_ID) + if doc_id is None: + continue + chunk_id = document["id"].split("::", 1)[-1] + chunk_ids_by_document.setdefault(doc_id, []).append(chunk_id) + + continuation = response_data.get("continuation") + if continuation: + params["continuation"] = continuation + else: + break + + return chunk_ids_by_document + + @retry(tries=3, delay=1, backoff=2) def _delete_vespa_doc_chunks( document_id: str, index_name: str, http_client: httpx.Client @@ -702,7 +763,9 @@ def query_vespa_helper(params): @retry(tries=3, delay=1, backoff=2) -def _query_vespa(query_params: Mapping[str, str | int | float]) -> list[InferenceChunk]: +def _query_vespa( + query_params: Mapping[str, str | int | float], +) -> list[InferenceChunk]: if "query" in query_params and not cast(str, query_params["query"]).strip(): raise ValueError("No/empty query received") @@ -715,48 +778,15 @@ def _query_vespa(query_params: Mapping[str, str | int | float]) -> list[Inferenc else {}, ) - # Get prioritized sources from filters, default to web and sfkbarticles if none - prioritized_sources 
= query_params.get("prioritized_sources") or [ - "web", - "sfkbarticles", - ] - # All records - params["hits"] = 50 - filtered_hits_all = query_vespa_helper(params) - - # Records from prioritized sources - params["hits"] = 10 - source_conditions = " or ".join( - f'source_type contains "{source}"' for source in prioritized_sources - ) - params["yql"] = params["yql"] + f" and ({source_conditions})" - filtered_hits_prioritized = query_vespa_helper(params) - - filtered_hits_final = filtered_hits_prioritized + filtered_hits_all - - inference_chunks = [ - _vespa_hit_to_inference_chunk(hit) for hit in filtered_hits_final - ] - # inplace sorting based on score - # inference_chunks.sort(key=lambda x: x.score, reverse=True) - - unique_chunks: dict[tuple[str, int], InferenceChunk] = {} - for chunk in inference_chunks: - key = (chunk.document_id, chunk.chunk_id) - if key not in unique_chunks: - unique_chunks[key] = chunk - continue - - stored_chunk_score = unique_chunks[key].score or 0 - this_chunk_score = chunk.score or 0 - if stored_chunk_score < this_chunk_score: - unique_chunks[key] = chunk - - inference_chunks = sorted( - unique_chunks.values(), key=lambda x: x.score or 0, reverse=True - ) - # Good Debugging Spot - return inference_chunks + # Single, all-sources query: every chunk is scored on ONE comparable + # normalize_linear scale (honors the caller's `hits`). Source diversity — + # making sure curated KB/web docs aren't crowded out by a chatty source — + # is handled later, at final doc selection (see llm/answering/doc_pruning.py + # ::ensure_source_diversity), rather than by a second, independently- + # normalized query (which inflated those scores). 
+ hits = query_vespa_helper(params) + inference_chunks = [_vespa_hit_to_inference_chunk(hit) for hit in hits] + return sorted(inference_chunks, key=lambda chunk: chunk.score or 0, reverse=True) @retry(tries=3, delay=1, backoff=2) @@ -979,29 +1009,40 @@ def update(self, update_requests: list[UpdateRequest]) -> None: index_names.append(self.secondary_index_name) chunk_id_start_time = time.monotonic() + all_document_ids = [ + document_id + for update_request in update_requests + for document_id in update_request.document_ids + ] + # Look up chunk IDs in batched visits (one selection scan per + # _VESPA_VISIT_DOC_ID_BATCH documents) instead of one scan per + # document — far fewer scans against Vespa for bulk updates. Batches + # (× indexes) still run concurrently across the thread pool. with concurrent.futures.ThreadPoolExecutor( max_workers=_NUM_THREADS ) as executor: - future_to_doc_chunk_ids = { + future_to_index = { executor.submit( - _get_vespa_chunk_ids_by_document_id, - document_id=document_id, + _get_vespa_chunk_ids_by_document_ids, + document_ids=list(id_batch), index_name=index_name, - ): (document_id, index_name) + ): index_name for index_name in index_names - for update_request in update_requests - for document_id in update_request.document_ids + for id_batch in batch_generator( + all_document_ids, _VESPA_VISIT_DOC_ID_BATCH + ) } - for future in concurrent.futures.as_completed(future_to_doc_chunk_ids): - document_id, index_name = future_to_doc_chunk_ids[future] + for future in concurrent.futures.as_completed(future_to_index): + index_name = future_to_index[future] try: - doc_chunk_ids = future.result() - if document_id not in all_doc_chunk_ids: - all_doc_chunk_ids[document_id] = [] - all_doc_chunk_ids[document_id].extend(doc_chunk_ids) + for document_id, doc_chunk_ids in future.result().items(): + all_doc_chunk_ids.setdefault(document_id, []).extend( + doc_chunk_ids + ) except Exception as e: logger.error( - f"Error retrieving chunk IDs for document 
{document_id} in index {index_name}: {e}" + f"Error retrieving chunk IDs (batched visit) in index " + f"{index_name}: {e}" ) logger.debug( f"Took {time.monotonic() - chunk_id_start_time:.2f} seconds to fetch all Vespa chunk IDs" @@ -1032,7 +1073,11 @@ def update(self, update_requests: list[UpdateRequest]) -> None: continue for document_id in update_request.document_ids: - for doc_chunk_id in all_doc_chunk_ids[document_id]: + # .get(): a document with no chunks in Vespa (e.g. an orphaned + # doc row that was never indexed / already removed) simply has + # nothing to update — skip it rather than KeyError. The old + # per-document path implicitly initialized an empty list here. + for doc_chunk_id in all_doc_chunk_ids.get(document_id, []): processed_updates_requests.append( _VespaUpdateRequest( document_id=document_id, diff --git a/backend/danswer/llm/answering/answer.py b/backend/danswer/llm/answering/answer.py index 1aceaa1f8f7..abda178a0ed 100644 --- a/backend/danswer/llm/answering/answer.py +++ b/backend/danswer/llm/answering/answer.py @@ -12,6 +12,10 @@ from danswer.chat.models import CitationInfo from danswer.chat.models import DanswerAnswerPiece from danswer.chat.models import LlmDoc +from danswer.configs.chat_configs import AUTHORITATIVE_CITATION_RETENTION_ENABLED +from danswer.llm.answering.authoritative_retention import ( + retained_authoritative_footer, +) from danswer.configs.chat_configs import QA_PROMPT_OVERRIDE from danswer.file_store.utils import InMemoryChatFile from danswer.llm.answering.models import AnswerStyleConfig @@ -497,7 +501,32 @@ def _stream() -> Iterator[str]: yield cast(str, message) yield from cast(Iterator[str], stream) - yield from process_answer_stream_fn(_stream()) + # Accumulate the answer text + which docs the LLM cited, so we can run + # the (additive, deduped) authoritative-citation retention afterwards. 
+ answer_parts: list[str] = [] + cited_doc_ids: set[str] = set() + for packet in process_answer_stream_fn(_stream()): + if isinstance(packet, DanswerAnswerPiece) and packet.answer_piece: + answer_parts.append(packet.answer_piece) + elif isinstance(packet, CitationInfo): + cited_doc_ids.add(packet.document_id) + yield packet + + # Verify-then-retain: for any relevant authoritative doc the LLM left + # UNCITED, a single batched call checks whether it's a relevant + # authoritative reference; relevant ones are appended as an + # "Authoritative sources" footer. No-op (and no LLM call) when there are + # no uncited authoritative docs in context. + if AUTHORITATIVE_CITATION_RETENTION_ENABLED and final_context_docs: + footer = retained_authoritative_footer( + answer="".join(answer_parts), + final_context_docs=final_context_docs, + already_cited_doc_ids=cited_doc_ids, + llm=self.llm, + question=self.question, + ) + if footer: + yield DanswerAnswerPiece(answer_piece=footer) processed_stream = [] for processed_packet in _process_stream(output_generator): diff --git a/backend/danswer/llm/answering/authoritative_retention.py b/backend/danswer/llm/answering/authoritative_retention.py new file mode 100644 index 00000000000..be57361f092 --- /dev/null +++ b/backend/danswer/llm/answering/authoritative_retention.py @@ -0,0 +1,180 @@ +"""Verify-then-retain authoritative citations. + +Citations are the LLM's own output, and it inconsistently cites curated / +authoritative sources even when they're promoted to the front of the prompt +(confirmed across chat + Slack, multiple runs). 
This module adds a post-generation +step that is: + +- **additive** — the LLM's own inline citations are untouched; +- **deduped** — only considers authoritative docs the LLM did NOT cite, deduped + by document_id (the same page often appears as several chunks); +- **honest** — appends a doc only if one batched LLM call confirms it actually + supports a statement in the answer (fail-closed on any error); +- **bounded** — at most ONE extra LLM call, and only when an uncited authoritative + doc is present in the context. + +Supporting docs are appended as a small "Authoritative sources" markdown footer, +which renders in both the chat UI and Slack. +""" +import json +import re + +from danswer.chat.models import LlmDoc +from danswer.configs.chat_configs import PROTECTED_SOURCES +from danswer.llm.interfaces import LLM +from danswer.llm.utils import message_to_string +from danswer.utils.logger import setup_logger + +logger = setup_logger() + + +_VERIFY_PROMPT = """\ +You are selecting authoritative reference documents to surface as sources for an \ +answer to a user's question. For each candidate you are given the passage that the \ +search matched. + +QUESTION: +{question} + +ANSWER: +{answer} + +CANDIDATE DOCUMENTS (matched passage shown): +{docs} + +For EACH candidate, decide whether its matched passage is genuinely relevant to BOTH \ +the question and the answer. If it is, include it; otherwise exclude it. Respond with \ +ONLY a JSON array of the numbers of the relevant documents (e.g. [1, 3]); if none \ +qualify, respond with []. +""" + + +def _source_value(doc: LlmDoc) -> str: + src = doc.source_type + return (src.value if hasattr(src, "value") else str(src)).lower() + + +def select_authoritative_candidates( + final_context_docs: list[LlmDoc], + already_cited_doc_ids: set[str], +) -> list[LlmDoc]: + """Authoritative (PROTECTED_SOURCES) docs in the prompt that the LLM did NOT + cite — candidates for the relevance-verify step. 
+ + We surface a relevant authoritative doc the LLM left out **even if it cited some + other authoritative source** — citing one KB article shouldn't suppress a + different, relevant docs/web page the answer also draws on. (We tried a tighter + "skip if any authoritative was cited" gate to save the verify call, but it hid + exactly these docs, so it was loosened.) Deduped by document_id; link required; + pure (no I/O). Empty → no verify call.""" + protected = set(PROTECTED_SOURCES) + out: list[LlmDoc] = [] + seen: set[str] = set() + for doc in final_context_docs: + if _source_value(doc) not in protected: + continue + if doc.document_id in already_cited_doc_ids or doc.document_id in seen: + continue + if not doc.link: + continue + seen.add(doc.document_id) + out.append(doc) + return out + + +def parse_supporting_indices(raw: str, n: int) -> list[int]: + """Parse the verify call's JSON array into 0-based indices in [0, n). Tolerant: + returns [] if nothing parseable (fail-closed → append nothing).""" + match = re.search(r"\[[^\[\]]*\]", raw) + if not match: + return [] + try: + values = json.loads(match.group(0)) + except (ValueError, TypeError): + return [] + out: list[int] = [] + for v in values: + if isinstance(v, bool): + continue + if isinstance(v, int) and 1 <= v <= n: + out.append(v - 1) + return out + + +def verify_supporting_docs( + answer: str, + candidates: list[LlmDoc], + llm: LLM, + question: str = "", + snippet_chars: int = 4000, + max_attempts: int = 2, +) -> list[LlmDoc]: + """One batched LLM call: which candidates are relevant authoritative references + for the answer? The judgment is made against each doc's MATCHED PASSAGE (the + retrieved chunk that scored against the query, i.e. LlmDoc.content) rather than a + short title/prefix — this is the reliable signal for relevance, so we pass the + full passage (capped generously). The candidate list is small (1-3 after the + gate/dedupe), so the extra tokens are bounded. 
Retries once on a transient + failure (the gateway non-streaming completion occasionally times out), then + fails closed (returns [] → no footer) so we never append on a real error.""" + if not candidates or not answer.strip(): + return [] + docs_str = "\n\n".join( + f"[{i + 1}] {d.semantic_identifier}\nMatched passage: {d.content[:snippet_chars]}" + for i, d in enumerate(candidates) + ) + prompt = _VERIFY_PROMPT.format( + question=(question or "(not provided)").strip(), answer=answer.strip(), docs=docs_str + ) + for attempt in range(1, max_attempts + 1): + try: + raw = message_to_string(llm.invoke(prompt)) + return [candidates[i] for i in parse_supporting_indices(raw, len(candidates))] + except Exception as e: + logger.warning( + "authoritative retention: verify call failed (attempt %d/%d): %s", + attempt, + max_attempts, + e, + ) + return [] + + +def build_authoritative_footer(docs: list[LlmDoc]) -> str: + """Markdown footer of verified authoritative sources (renders in chat + Slack). + + Rendered as its own labelled block rather than merged into the numbered Sources + cards: citation numbers ARE context positions and the LLM already owns the low + ones, so injecting into that list either collides (de-duped away) or can't be + placed at the top without renumbering the LLM's inline citations. A footer + sidesteps that and surfaces the link unambiguously.""" + if not docs: + return "" + lines = "\n".join(f"- [{d.semantic_identifier}]({d.link})" for d in docs) + return f"\n\n**Authoritative sources:**\n{lines}" + + +def retained_authoritative_footer( + answer: str, + final_context_docs: list[LlmDoc], + already_cited_doc_ids: set[str], + llm: LLM, + question: str = "", +) -> str: + """candidates → verify → footer. Returns "" (and makes NO LLM call) when there + are no uncited authoritative candidates. 
`question` anchors the relevance check + to what was actually asked (so docs that merely share a keyword/service with the + answer are excluded).""" + candidates = select_authoritative_candidates( + final_context_docs, already_cited_doc_ids + ) + if not candidates: + return "" + supporting = verify_supporting_docs(answer, candidates, llm, question=question) + if supporting: + logger.info( + "authoritative retention: appended %d source(s): %s", + len(supporting), + [d.semantic_identifier for d in supporting], + ) + return build_authoritative_footer(supporting) diff --git a/backend/danswer/llm/answering/doc_pruning.py b/backend/danswer/llm/answering/doc_pruning.py index 5a43ab3c6fd..62be8988470 100644 --- a/backend/danswer/llm/answering/doc_pruning.py +++ b/backend/danswer/llm/answering/doc_pruning.py @@ -1,12 +1,22 @@ import json +import re +from collections import defaultdict from copy import deepcopy from typing import TypeVar +from sqlalchemy import select +from sqlalchemy.orm import Session + from danswer.chat.models import ( LlmDoc, ) +from danswer.configs.chat_configs import DOCS_VERSION_DEDUP_URL_SUBSTR +from danswer.configs.chat_configs import MAX_PROMPT_DOCS_PER_SOURCE +from danswer.configs.chat_configs import PROTECTED_SOURCES +from danswer.configs.chat_configs import SOURCE_DIVERSITY_RESERVED_SLOTS from danswer.configs.constants import IGNORE_FOR_QA from danswer.configs.model_configs import DOC_EMBEDDING_CONTEXT_SIZE +from danswer.db.models import Document from danswer.llm.answering.models import DocumentPruningConfig from danswer.llm.answering.models import PromptConfig from danswer.llm.answering.prompts.citations_prompt import compute_max_document_tokens @@ -84,6 +94,236 @@ def reorder_docs( return reordered_docs +def ensure_source_diversity(docs: list[T]) -> list[T]: + """Guarantee that up to SOURCE_DIVERSITY_RESERVED_SLOTS of the highest-ranked + docs from PROTECTED_SOURCES survive final selection, so curated KB/web + content isn't crowded out of the 
prompt by a chatty high-relevance source + (e.g. Slack). Promotes those protected docs to the front (keeping their + relative order); everything else keeps its order. No-op when disabled + (reserved <= 0), when there are no protected sources, or when none are + present in `docs`. + """ + if SOURCE_DIVERSITY_RESERVED_SLOTS <= 0 or not PROTECTED_SOURCES: + return docs + + protected = set(PROTECTED_SOURCES) + promote_indices: list[int] = [] + for ind, doc in enumerate(docs): + source = doc.source_type + source_str = (source.value if hasattr(source, "value") else str(source)).lower() + if source_str in protected: + promote_indices.append(ind) + if len(promote_indices) >= SOURCE_DIVERSITY_RESERVED_SLOTS: + break + + if not promote_indices: + return docs + + promote_set = set(promote_indices) + promoted = [docs[i] for i in promote_indices] + rest = [doc for i, doc in enumerate(docs) if i not in promote_set] + return promoted + rest + + +def cap_docs_per_source(docs: list[T]) -> list[T]: + """Cap how many docs any single source contributes to the prompt, preserving + order (so the highest-ranked / source-diversity-promoted docs per source + survive and the rest are dropped). Prevents a chatty source from monopolizing + the context — and therefore the citations — when diverse sources are present. + No-op when disabled (cap <= 0). Run AFTER ensure_source_diversity so promoted + curated docs are kept. 
+ """ + cap = MAX_PROMPT_DOCS_PER_SOURCE + if cap <= 0: + return docs + + counts: dict[str, int] = defaultdict(int) + capped: list[T] = [] + for doc in docs: + source = doc.source_type + key = (source.value if hasattr(source, "value") else str(source)).lower() + if counts[key] >= cap: + continue + counts[key] += 1 + capped.append(doc) + return capped + + +_DOCS_VERSION_SEG_RE = re.compile(r"^(latest|\d+\.\d+)$") + + +def _docs_page_and_version(url: str | None) -> tuple[str, str] | None: + """For a versioned documentation URL, return (page_key, version_token) where + page_key is the URL with the version path-segment stripped, so the SAME page + across product versions collapses to one key. Returns None for non-docs URLs + (other sources are left untouched) and for docs URLs with no recognizable + version segment. Scoped by DOCS_VERSION_DEDUP_URL_SUBSTR (empty = disabled). + """ + if not url or not DOCS_VERSION_DEDUP_URL_SUBSTR: + return None + if DOCS_VERSION_DEDUP_URL_SUBSTR not in url: + return None + parts = url.split("/") + for i, seg in enumerate(parts): + if _DOCS_VERSION_SEG_RE.match(seg): + page_key = "/".join(parts[:i] + parts[i + 1 :]) + return page_key, seg + return None + + +def _versioned_url_parts(url: str | None) -> tuple[str, str, str] | None: + """(prefix, version, suffix) for a versioned docs URL, else None. The page is + identified by prefix + suffix (version segment stripped); any version's URL is + rebuilt as f"{prefix}/{version}/{suffix}". Scoped by DOCS_VERSION_DEDUP_URL_SUBSTR.""" + if not url or not DOCS_VERSION_DEDUP_URL_SUBSTR: + return None + if DOCS_VERSION_DEDUP_URL_SUBSTR not in url: + return None + parts = url.split("/") + for i, seg in enumerate(parts): + if _DOCS_VERSION_SEG_RE.match(seg): + return "/".join(parts[:i]), seg, "/".join(parts[i + 1 :]) + return None + + +# Version mentioned in a user question: "23.10", "2023.10", "24.10", etc. We treat +# a 2-digit year as 20YY ("23.10" -> "2023.10"). 
+_QUESTION_VERSION_RE = re.compile(r"\b(20\d{2}|\d{2})\.(\d{1,2})\b") + + +def parse_question_doc_version(question: str | None) -> str | None: + """Return a normalized docs version token (e.g. '2023.10') IFF the question names + exactly ONE version; None when zero or multiple are mentioned (ambiguous — e.g. + 'upgrade from 23.10 to 25.10' — so we don't guess and fall back to latest).""" + found: set[str] = set() + for m in _QUESTION_VERSION_RE.finditer(question or ""): + year = m.group(1) + if len(year) == 2: + year = "20" + year + found.add(f"{year}.{int(m.group(2))}") # int() drops any leading zero in minor + return next(iter(found)) if len(found) == 1 else None + + +def rewrite_docs_links( + docs: list[LlmDoc], db_session: Session, target_version: str | None = None +) -> None: + """Rewrite each versioned docs link to the right version of that SAME page (same + URL with the version path-segment stripped; slug kept): + + - `target_version` set (the question named one version): rewrite to THAT version + if it's indexed for the page — even if it's older than what was retrieved + (answering 'is X supported in 23.10' must point at the 23.10 doc, not latest). + If that version isn't indexed for the page, leave the link as-is. + - `target_version` None (version-agnostic question): rewrite to the NEWEST indexed + version, and only when it's newer than what was retrieved. + + Mutates `docs` in place; no-op for non-docs links. One PK-indexed prefix-scan per + distinct page-prefix. + """ + pages: set[tuple[str, str]] = set() + for doc in docs: + pv = _versioned_url_parts(doc.link) + if pv: + pages.add((pv[0], pv[2])) + if not pages: + return + + # (prefix, suffix) page -> {version_token: full_url} of every indexed version. 
+ page_versions: dict[tuple[str, str], dict[str, str]] = defaultdict(dict) + for prefix in {p for p, _ in pages}: + rows = db_session.execute( + select(Document.id).where(Document.id.like(f"{prefix}/%")) + ).all() + for (doc_id,) in rows: + pv = _versioned_url_parts(doc_id) + if pv is None: + continue + key = (pv[0], pv[2]) + if key in pages: + page_versions[key][pv[1]] = doc_id + + for doc in docs: + pv = _versioned_url_parts(doc.link) + if pv is None: + continue + versions = page_versions.get((pv[0], pv[2])) + if not versions: + continue + if target_version is not None: + if target_version in versions: + doc.link = versions[target_version] # exact match (may be older) + # else: requested version not indexed for this page -> leave as-is + else: + newest = max(versions, key=_docs_version_sort_key) + if _docs_version_sort_key(newest) > _docs_version_sort_key(pv[1]): + doc.link = versions[newest] + + +def _docs_version_sort_key(token: str) -> tuple[int, int, int]: + """Order doc versions newest-first. Tiered so we never have to decode the + exact meaning of the post-migration 'N.YYMM' scheme — we only need it to + outrank the frozen old 'YYYY.M' scheme, which always holds once a product has + migrated: + tier 3: 'latest' alias (always newest) + tier 2: new scheme N.YYMM e.g. 2.2510 (current) + tier 1: old scheme YYYY.M e.g. 2024.10 (frozen) + tier 0: unrecognized (sorts last) + Within a tier, compare the numeric components. 
+ """ + if token == "latest": + return (3, 0, 0) + m = re.match(r"^(\d+)\.(\d+)$", token) + if not m: + return (0, 0, 0) + major, minor = int(m.group(1)), int(m.group(2)) + if major >= 1000: # YYYY.M (old scheme) + return (1, major, minor) + return (2, major, minor) # N.YYMM (new scheme) -> ranks above all old + + +def dedupe_doc_versions( + docs: list[LlmDoc], doc_relevance_list: list[bool] | None +) -> tuple[list[LlmDoc], list[bool] | None]: + """Collapse the same documentation page repeated across product versions, + keeping only the newest version's chunk(s) so distinct pages aren't crowded + out of the LLM context. Only affects versioned docs URLs (see + DOCS_VERSION_DEDUP_URL_SUBSTR); every other source/doc passes through. + `doc_relevance_list` is filtered in lockstep (prune_documents requires the + two stay equal length). + """ + parsed = [_docs_page_and_version(doc.link or doc.document_id) for doc in docs] + + # Newest version token seen per docs page. + newest: dict[str, str] = {} + for pv in parsed: + if pv is None: + continue + page, ver = pv + if page not in newest or _docs_version_sort_key( + ver + ) > _docs_version_sort_key(newest[page]): + newest[page] = ver + + kept_docs: list[LlmDoc] = [] + kept_rel: list[bool] = [] + dropped = 0 + for i, doc in enumerate(docs): + pv = parsed[i] + # Keep non-docs / unversioned docs, and only the newest version per page. 
+ if pv is None or pv[1] == newest[pv[0]]: + kept_docs.append(doc) + if doc_relevance_list is not None: + kept_rel.append(doc_relevance_list[i]) + else: + dropped += 1 + + if dropped: + logger.info( + f"Deduped {dropped} older-version duplicate doc page(s) from the LLM context" + ) + return kept_docs, (kept_rel if doc_relevance_list is not None else None) + + def _remove_docs_to_ignore(docs: list[LlmDoc]) -> list[LlmDoc]: return [doc for doc in docs if not doc.metadata.get(IGNORE_FOR_QA)] @@ -103,6 +343,10 @@ def _apply_pruning( docs = reorder_docs(docs=docs, doc_relevance_list=doc_relevance_list) # remove docs that are explicitly marked as not for QA docs = _remove_docs_to_ignore(docs=docs) + # guarantee curated KB/web docs aren't crowded out before the token-budget cut + docs = ensure_source_diversity(docs) + # cap any single source so a chatty one can't monopolize the prompt + citations + docs = cap_docs_per_source(docs) tokens_per_doc: list[int] = [] final_doc_ind = None @@ -211,6 +455,10 @@ def prune_documents( if doc_relevance_list is not None: assert len(docs) == len(doc_relevance_list) + # Drop older-version duplicates of the same docs page before anything else, + # so the freed context slots get filled by distinct sources during pruning. + docs, doc_relevance_list = dedupe_doc_versions(docs, doc_relevance_list) + doc_token_limit = _compute_limit( prompt_config=prompt_config, llm_config=llm_config, diff --git a/backend/danswer/prompts/llm_chunk_filter.py b/backend/danswer/prompts/llm_chunk_filter.py index 623ae587703..8cec50b1985 100644 --- a/backend/danswer/prompts/llm_chunk_filter.py +++ b/backend/danswer/prompts/llm_chunk_filter.py @@ -25,6 +25,32 @@ """.strip() +# Listwise variant: judge ALL candidate sections in ONE call (cheaper + +# lower-latency than one call per chunk, and lets the model compare them). +# Run on the MAIN llm. {sections} is a numbered list; the model returns a JSON +# array of the USEFUL section numbers. 
+LISTWISE_CHUNK_FILTER_PROMPT = """ +You are given {count} numbered reference sections and a user query. For EACH +section, decide whether it is USEFUL for answering the query. It is NOT enough +to be related — the section must contain information USEFUL for answering. If a +section contains ANY useful information that counts; it need not fully answer +the query. + +Reference Sections: +{sections} + +User Query: +``` +{user_query} +``` + +Respond with EXACTLY AND ONLY a JSON array of the numbers of the useful +sections, in any order, e.g. [1, 3, 4]. If none are useful, respond with []. +""".strip() + + # Use the following for easy viewing of prompts if __name__ == "__main__": print(CHUNK_FILTER_PROMPT) + print("\n\n") + print(LISTWISE_CHUNK_FILTER_PROMPT) diff --git a/backend/danswer/prompts/prompt_utils.py b/backend/danswer/prompts/prompt_utils.py index 117de69cf9a..e7ca46b9d73 100644 --- a/backend/danswer/prompts/prompt_utils.py +++ b/backend/danswer/prompts/prompt_utils.py @@ -7,6 +7,7 @@ from danswer.chat.models import LlmDoc from danswer.configs.chat_configs import LANGUAGE_HINT from danswer.configs.chat_configs import MULTILINGUAL_QUERY_EXPANSION +from danswer.configs.chat_configs import PROTECTED_SOURCES from danswer.configs.constants import DocumentSource from danswer.db.models import Prompt from danswer.llm.answering.models import PromptConfig @@ -61,11 +62,42 @@ def build_task_prompt_reminders( language_hint_str: str = LANGUAGE_HINT, ) -> str: base_task = prompt.task_prompt - citation_or_nothing = citation_str if prompt.include_citations else "" + citation_or_nothing = ( + citation_str + build_authoritative_sources_reminder() + if prompt.include_citations + else "" + ) language_hint_or_nothing = language_hint_str.lstrip() if use_language_hint else "" return base_task + citation_or_nothing + language_hint_or_nothing +def build_authoritative_sources_reminder() -> str: + """A generic, global nudge: tell the LLM which sources are the authoritative + systems 
of record (derived from PROTECTED_SOURCES) and to prefer grounding + + citing them over other sources (e.g. chat discussions) when they support the + answer. Source names are rendered via clean_up_source so they match the + "Source: X" labels on the docs in the prompt. Empty when no protected sources + are configured. Applies to every assistant + both flows (it rides on the + shared citation reminder), so it needs no per-persona setup. + """ + if not PROTECTED_SOURCES: + return "" + names: list[str] = [] + for source in PROTECTED_SOURCES: + name = clean_up_source(source) + if name not in names: + names.append(name) + if not names: + return "" + listed = ", ".join(names) + return ( + f"\n\n{listed} are authoritative systems of record. When one of these documents " + f"supports a point in your answer, prefer citing it over a non-authoritative " + f"source (e.g. chat discussions) that makes the same point. Only cite sources " + f"that actually support your answer." + ) + + # Maps connector enum string to a more natural language representation for the LLM # If not on the list, uses the original but slightly cleaned up, see below CONNECTOR_NAME_MAP = { diff --git a/backend/danswer/search/pipeline.py b/backend/danswer/search/pipeline.py index 98b1a87161d..71c41c43d26 100644 --- a/backend/danswer/search/pipeline.py +++ b/backend/danswer/search/pipeline.py @@ -7,6 +7,9 @@ from sqlalchemy.orm import Session from danswer.configs.chat_configs import MULTILINGUAL_QUERY_EXPANSION +from danswer.configs.chat_configs import PROTECTED_SOURCES +from danswer.configs.chat_configs import SOURCE_RESERVED_RETRIEVAL_SLOTS +from danswer.configs.constants import DocumentSource from danswer.db.embedding_model import get_current_db_embedding_model from danswer.db.models import User from danswer.document_index.factory import get_default_document_index @@ -23,9 +26,46 @@ from danswer.search.postprocessing.postprocessing import search_postprocessing from 
danswer.search.preprocessing.preprocessing import retrieval_preprocessing from danswer.search.retrieval.search_runner import retrieve_chunks +from danswer.utils.logger import setup_logger from danswer.utils.threadpool_concurrency import run_functions_tuples_in_parallel +logger = setup_logger() + + +def protected_source_topup( + existing: list[InferenceChunk], + candidates: list[InferenceChunk], + reserved: int, + protected_sources: set[DocumentSource], +) -> list[InferenceChunk]: + """Pick up to `reserved` protected-source chunks from `candidates` that are not + already present in `existing`, to top the candidate set up to the reservation. + + Pure (no I/O) so it's unit-testable. Returns only the chunks to ADD (callers + append them). De-dupes by `unique_id` against `existing` and within the result. + """ + if reserved <= 0 or not protected_sources: + return [] + present = sum(1 for c in existing if c.source_type in protected_sources) + need = reserved - present + if need <= 0: + return [] + + seen = {c.unique_id for c in existing} + added: list[InferenceChunk] = [] + for c in candidates: + if len(added) >= need: + break + if c.source_type not in protected_sources: + continue + if c.unique_id in seen: + continue + seen.add(c.unique_id) + added.append(c) + return added + + class ChunkRange(BaseModel): chunk: InferenceChunk start: int @@ -273,7 +313,7 @@ def retrieved_chunks(self) -> list[InferenceChunk]: if self._retrieved_chunks is not None: return self._retrieved_chunks - self._retrieved_chunks = retrieve_chunks( + chunks = retrieve_chunks( query=self.search_query, document_index=self.document_index, db_session=self.db_session, @@ -282,8 +322,79 @@ def retrieved_chunks(self) -> list[InferenceChunk]: retrieval_metrics_callback=self.retrieval_metrics_callback, ) + # Recall guarantee for curated sources (see SOURCE_RESERVED_RETRIEVAL_SLOTS). 
+ chunks = self._supplement_protected_sources(chunks) + + self._retrieved_chunks = chunks return cast(list[InferenceChunk], self._retrieved_chunks) + def _supplement_protected_sources( + self, chunks: list[InferenceChunk] + ) -> list[InferenceChunk]: + """Ensure up to SOURCE_RESERVED_RETRIEVAL_SLOTS chunks from PROTECTED_SOURCES + are in the candidate set. A chatty source (e.g. Slack) can otherwise fill + every top hit, starving curated sources (web/KB/OutSystems) that retrieval + would surface only on a source-scoped query — so we run ONE extra retrieval + restricted to the protected sources and merge the top results in. + + Scope-safe: the supplemental query reuses the SAME filters as the main query + (ACL + persona document-set fence) and only ADDS a source_type restriction + (intersected with any existing source filter), so nothing outside the + query's existing scope is ever introduced. Universal — runs for every + persona in both the chat and Slack flows, since both funnel through here. + """ + reserved = SOURCE_RESERVED_RETRIEVAL_SLOTS + if reserved <= 0 or not PROTECTED_SOURCES: + return chunks + + # Map configured source strings -> DocumentSource (skip any unknowns). + valid = DocumentSource._value2member_map_ + protected_ds = [valid[s] for s in PROTECTED_SOURCES if s in valid] + if not protected_ds: + return chunks + protected_set = set(protected_ds) + + # Cheap exit if the reservation is already satisfied. + if sum(1 for c in chunks if c.source_type in protected_set) >= reserved: + return chunks + + # Restrict to protected sources, intersected with any existing source filter + # so we never widen scope. ACL + document_set fence are preserved as-is. + existing_src = self.search_query.filters.source_type + allowed = ( + [s for s in protected_ds if s in existing_src] + if existing_src + else protected_ds + ) + if not allowed: + return chunks + # SearchQuery / IndexFilters are immutable (frozen pydantic) — rebuild via + # copy(update=...) 
rather than mutating in place. + supp_filters = self.search_query.filters.copy(update={"source_type": allowed}) + supp_query = self.search_query.copy( + update={"filters": supp_filters, "num_hits": max(reserved * 4, 10)} + ) + + supp_chunks = retrieve_chunks( + query=supp_query, + document_index=self.document_index, + db_session=self.db_session, + hybrid_alpha=self.search_request.hybrid_alpha, + multilingual_expansion_str=MULTILINGUAL_QUERY_EXPANSION, + retrieval_metrics_callback=self.retrieval_metrics_callback, + ) + + added = protected_source_topup(chunks, supp_chunks, reserved, protected_set) + if added: + logger.info( + "Source-reserved retrieval: injected %d protected-source chunk(s) " + "from %s (reservation=%d)", + len(added), + [c.source_type.value for c in added], + reserved, + ) + return chunks + added + @property def retrieved_sections(self) -> list[InferenceSection]: # Calls retrieved_chunks inside @@ -300,7 +411,10 @@ def reranked_chunks(self) -> list[InferenceChunk]: self._postprocessing_generator = search_postprocessing( search_query=self.search_query, retrieved_chunks=self.retrieved_chunks, - llm=self.fast_llm, # use fast_llm for relevance, since it is a relatively easier task + # Use the MAIN llm (not fast_llm) for the relevance filter: it now + # judges all chunks in one listwise call, and the main model is more + # reliable at that structured multi-item judgment. 
+ llm=self.llm, rerank_metrics_callback=self.rerank_metrics_callback, ) self._reranked_chunks = cast( diff --git a/backend/danswer/search/postprocessing/postprocessing.py b/backend/danswer/search/postprocessing/postprocessing.py index 3b36bcff3a9..0aeb3dec7d2 100644 --- a/backend/danswer/search/postprocessing/postprocessing.py +++ b/backend/danswer/search/postprocessing/postprocessing.py @@ -17,7 +17,7 @@ from danswer.search.models import SearchQuery from danswer.search.models import SearchType from danswer.search.search_nlp_models import CrossEncoderEnsembleModel -from danswer.secondary_llm_flows.chunk_usefulness import llm_batch_eval_chunks +from danswer.secondary_llm_flows.chunk_usefulness import llm_eval_chunks_listwise from danswer.utils.logger import setup_logger from danswer.utils.threadpool_concurrency import FunctionCall from danswer.utils.threadpool_concurrency import run_functions_in_parallel @@ -141,7 +141,9 @@ def filter_chunks( Returns a list of the unique chunk IDs that were marked as relevant""" chunks_to_filter = chunks_to_filter[: query.max_llm_filter_chunks] - llm_chunk_selection = llm_batch_eval_chunks( + # One listwise call over all candidates (on the main LLM) rather than one + # call per chunk — cheaper, lower latency, and lets the model compare them. 
+ llm_chunk_selection = llm_eval_chunks_listwise( query=query.query, chunk_contents=[chunk.content for chunk in chunks_to_filter], llm=llm, diff --git a/backend/danswer/search/preprocessing/preprocessing.py b/backend/danswer/search/preprocessing/preprocessing.py index e59ef37c95e..7dfef923015 100644 --- a/backend/danswer/search/preprocessing/preprocessing.py +++ b/backend/danswer/search/preprocessing/preprocessing.py @@ -4,7 +4,9 @@ from danswer.configs.chat_configs import DISABLE_LLM_CHUNK_FILTER from danswer.configs.chat_configs import DISABLE_LLM_FILTER_EXTRACTION from danswer.configs.chat_configs import FAVOR_RECENT_DECAY_MULTIPLIER +from danswer.configs.chat_configs import LLM_RELEVANCE_FILTER_ENABLED from danswer.configs.chat_configs import NUM_RETURNED_HITS +from danswer.db.models import Persona from danswer.db.models import User from danswer.llm.interfaces import LLM from danswer.search.enums import QueryFlow @@ -23,11 +25,58 @@ from danswer.utils.threadpool_concurrency import run_functions_in_parallel from danswer.utils.timing import log_function_time from shared_configs.configs import ENABLE_RERANKING_REAL_TIME_FLOW +from shared_configs.configs import RERANK_ENABLED logger = setup_logger() +def _resolve_skip_rerank( + explicit_skip_rerank: bool | None, + persona: Persona | None, +) -> bool: + """Single source of truth for whether to skip cross-encoder reranking, + used by both the chat and Slack flows. + + Reranking runs only when the global master switch (RERANK_ENABLED — i.e. a + GPU-backed model server is deployed) AND the per-assistant opt-in + (Persona.rerank_enabled) are both on. If a caller set skip_rerank explicitly + we honor it. With RERANK_ENABLED off (the local / GPU-free default) + reranking never runs regardless of the per-assistant flag, so no GPU is + required. The legacy ENABLE_RERANKING_REAL_TIME_FLOW env flag is retained + only as a fallback for callers that haven't adopted the per-assistant model. 
+ """ + if explicit_skip_rerank is not None: + return explicit_skip_rerank + persona_opts_in = bool(persona and persona.rerank_enabled) + rerank = (RERANK_ENABLED and persona_opts_in) or ENABLE_RERANKING_REAL_TIME_FLOW + return not rerank + + +def _resolve_skip_llm_chunk_filter( + explicit_skip: bool | None, + persona: Persona | None, + disable_llm_chunk_filter: bool, +) -> bool: + """Single source of truth for whether to skip the LLM relevance filter. + + Independent of reranking (it's LLM-only, needs no GPU). The filter runs only + when the global master switch LLM_RELEVANCE_FILTER_ENABLED AND the + per-assistant opt-in (Persona.llm_relevance_filter) are both on. The global + DISABLE_LLM_CHUNK_FILTER kill-switch always wins. If a caller set skip + explicitly (e.g. the chat flow, which has already applied the global gate + + its per-conversation toggle), honor it — but the kill-switch still applies. + """ + if disable_llm_chunk_filter: + return True + if explicit_skip is not None: + return explicit_skip + use = LLM_RELEVANCE_FILTER_ENABLED and bool( + persona and persona.llm_relevance_filter + ) + return not use + + @log_function_time(print_only=True) def retrieval_preprocessing( search_request: SearchRequest, @@ -155,22 +204,10 @@ def retrieval_preprocessing( prioritized_sources=preset_filters.prioritized_sources, # Use prioritized_sources from filters ) - llm_chunk_filter = False - if search_request.skip_llm_chunk_filter is not None: - llm_chunk_filter = not search_request.skip_llm_chunk_filter - elif persona: - llm_chunk_filter = persona.llm_relevance_filter - - if disable_llm_chunk_filter: - if llm_chunk_filter: - logger.info( - "LLM chunk filtering would have run but has been globally disabled" - ) - llm_chunk_filter = False - - skip_rerank = search_request.skip_rerank - if skip_rerank is None: - skip_rerank = not ENABLE_RERANKING_REAL_TIME_FLOW + skip_llm_chunk_filter = _resolve_skip_llm_chunk_filter( + search_request.skip_llm_chunk_filter, persona, 
disable_llm_chunk_filter + ) + skip_rerank = _resolve_skip_rerank(search_request.skip_rerank, persona) # Decays at 1 / (1 + (multiplier * num years)) if persona and persona.recency_bias == RecencyBiasSetting.NO_DECAY: @@ -194,7 +231,7 @@ def retrieval_preprocessing( num_hits=limit if limit is not None else NUM_RETURNED_HITS, offset=offset or 0, skip_rerank=skip_rerank, - skip_llm_chunk_filter=not llm_chunk_filter, + skip_llm_chunk_filter=skip_llm_chunk_filter, chunks_above=search_request.chunks_above, chunks_below=search_request.chunks_below, full_doc=search_request.full_doc, diff --git a/backend/danswer/search/search_nlp_models.py b/backend/danswer/search/search_nlp_models.py index 761d9aa791f..13597c146c5 100644 --- a/backend/danswer/search/search_nlp_models.py +++ b/backend/danswer/search/search_nlp_models.py @@ -13,6 +13,7 @@ from danswer.utils.logger import setup_logger from shared_configs.configs import MODEL_SERVER_HOST from shared_configs.configs import MODEL_SERVER_PORT +from shared_configs.configs import RERANK_SERVER_URL from shared_configs.model_server_models import EmbedRequest from shared_configs.model_server_models import EmbedResponse from shared_configs.model_server_models import IntentRequest @@ -128,20 +129,44 @@ def __init__( self, model_server_host: str = MODEL_SERVER_HOST, model_server_port: int = MODEL_SERVER_PORT, + rerank_server_url: str = RERANK_SERVER_URL, ) -> None: + # When a TEI rerank server is configured, talk to it directly (its + # /rerank API). Otherwise fall back to our model server's + # /cross-encoder-scores (sentence-transformers) path. 
+ self.tei_rerank_endpoint = ( + f"{rerank_server_url}/rerank" if rerank_server_url else None + ) model_server_url = build_model_server_url(model_server_host, model_server_port) self.rerank_server_endpoint = model_server_url + "/encoder/cross-encoder-scores" def predict(self, query: str, passages: list[str]) -> list[list[float]]: - rerank_request = RerankRequest(query=query, documents=passages) + if self.tei_rerank_endpoint: + return [self._predict_tei(query, passages)] + rerank_request = RerankRequest(query=query, documents=passages) response = requests.post( self.rerank_server_endpoint, json=rerank_request.dict() ) response.raise_for_status() - return RerankResponse(**response.json()).scores + def _predict_tei(self, query: str, passages: list[str]) -> list[float]: + """Call a Hugging Face TEI /rerank server and return scores in the SAME + order as `passages`. TEI returns [{index, score}, ...] sorted by score, + so we scatter them back to the input order. raw_scores=true keeps the + cross-encoder logits (the downstream normalization expects a logit-like + scale, not a 0-1 probability).""" + response = requests.post( + self.tei_rerank_endpoint, + json={"query": query, "texts": passages, "raw_scores": True}, + ) + response.raise_for_status() + scores = [0.0] * len(passages) + for item in response.json(): + scores[item["index"]] = item["score"] + return scores + class IntentModel: def __init__( diff --git a/backend/danswer/secondary_llm_flows/chunk_usefulness.py b/backend/danswer/secondary_llm_flows/chunk_usefulness.py index 8148a37138c..6dff17beeab 100644 --- a/backend/danswer/secondary_llm_flows/chunk_usefulness.py +++ b/backend/danswer/secondary_llm_flows/chunk_usefulness.py @@ -1,9 +1,12 @@ +import json +import re from collections.abc import Callable from danswer.llm.interfaces import LLM from danswer.llm.utils import dict_based_prompt_to_langchain_prompt from danswer.llm.utils import message_to_string from danswer.prompts.llm_chunk_filter import 
CHUNK_FILTER_PROMPT +from danswer.prompts.llm_chunk_filter import LISTWISE_CHUNK_FILTER_PROMPT from danswer.prompts.llm_chunk_filter import NONUSEFUL_PAT from danswer.utils.logger import setup_logger from danswer.utils.threadpool_concurrency import run_functions_tuples_in_parallel @@ -11,6 +14,66 @@ logger = setup_logger() +def _parse_useful_indices(model_output: str, count: int) -> set[int] | None: + """Parse the listwise filter's reply into a set of 1-based useful indices. + + Returns None when no JSON array can be found (a parse failure → caller + should fail OPEN and keep all chunks). An explicitly empty array `[]` is a + valid "none useful" answer and returns an empty set (not None). + """ + match = re.search(r"\[[\s\d,]*\]", model_output) + if match is None: + return None + try: + parsed = json.loads(match.group(0)) + except json.JSONDecodeError: + return None + return { + int(n) for n in parsed if isinstance(n, (int, float)) and 1 <= int(n) <= count + } + + +def llm_eval_chunks_listwise( + query: str, chunk_contents: list[str], llm: LLM +) -> list[bool]: + """Judge all chunks in a SINGLE LLM call (vs one call per chunk). + + Returns a parallel list of booleans. Fails OPEN: on any error or an + unparseable reply, every chunk is kept (True) — same philosophy as the + per-chunk path ("better to trust the (re)ranking if the LLM fails"). 
+ """ + if not chunk_contents: + return [] + + sections = "\n\n".join( + f"Section {i + 1}:\n```\n{content}\n```" + for i, content in enumerate(chunk_contents) + ) + messages = [ + { + "role": "user", + "content": LISTWISE_CHUNK_FILTER_PROMPT.format( + count=len(chunk_contents), sections=sections, user_query=query + ), + } + ] + filled_prompt = dict_based_prompt_to_langchain_prompt(messages) + try: + model_output = message_to_string(llm.invoke(filled_prompt)) + except Exception: + logger.exception("Listwise relevance filter call failed — keeping all chunks") + return [True] * len(chunk_contents) + + useful = _parse_useful_indices(model_output, len(chunk_contents)) + if useful is None: + logger.warning( + "Could not parse listwise relevance filter output — keeping all chunks" + ) + return [True] * len(chunk_contents) + + return [(i + 1) in useful for i in range(len(chunk_contents))] + + def llm_eval_chunk(query: str, chunk_content: str, llm: LLM) -> bool: def _get_usefulness_messages() -> list[dict[str, str]]: messages = [ diff --git a/backend/danswer/server/documents/connector.py b/backend/danswer/server/documents/connector.py index 03e35cdb604..4f72c86f70a 100644 --- a/backend/danswer/server/documents/connector.py +++ b/backend/danswer/server/documents/connector.py @@ -21,6 +21,7 @@ from danswer.configs.app_configs import ENABLED_CONNECTOR_TYPES from danswer.configs.constants import DocumentSource from danswer.configs.constants import FileOrigin +from danswer.connectors.highspot.sync import sync_highspot_spots_to_connectors from danswer.connectors.gmail.connector_auth import delete_gmail_service_account_key from danswer.connectors.gmail.connector_auth import delete_google_app_gmail_cred from danswer.connectors.gmail.connector_auth import get_gmail_auth_url @@ -387,12 +388,24 @@ def list_highspot_spots( HighspotSpotResponse(id=s["id"], name=s.get("title", "")) for s in client.get_spots() ] - except HighspotAuthenticationError as e: + except 
HighspotAuthenticationError: + logger.exception("Highspot authentication failed while listing spots") raise HTTPException( - status_code=401, detail=f"Highspot authentication failed: {e}" + status_code=401, + detail=( + "Could not authenticate to Highspot. Please check the API key " + "and secret on this credential and try again." + ), + ) + except HighspotClientError: + logger.exception("Highspot API error while listing spots") + raise HTTPException( + status_code=502, + detail=( + "Highspot returned an error while listing spots. Please try " + "again, or contact an administrator if the problem persists." + ), ) - except HighspotClientError as e: - raise HTTPException(status_code=502, detail=f"Highspot API error: {e}") @router.post("/admin/connector/file/upload") @@ -579,6 +592,27 @@ def create_connector_from_model( raise HTTPException(status_code=400, detail=str(e)) +@router.post("/admin/connector/highspot/sync-spots") +def sync_highspot_spots_endpoint( + credential_id: int, + _: User = Depends(current_admin_user), + db_session: Session = Depends(get_session), +) -> StatusResponse[list[int]]: + """Create a per-Spot connector (named after the Spot) for every Highspot + Spot the given credential can see, on a monthly schedule. Idempotent — safe + to re-run to pick up newly added Spots. See connectors/highspot/sync.py. 
+ """ + try: + created = sync_highspot_spots_to_connectors(credential_id, db_session) + except ValueError as e: + raise HTTPException(status_code=400, detail=str(e)) + return StatusResponse( + success=True, + message=f"Created {len(created)} Highspot connector(s)", + data=created, + ) + + @router.patch("/admin/connector/{connector_id}") def update_connector_from_model( connector_id: int, diff --git a/backend/danswer/server/features/document_set/api.py b/backend/danswer/server/features/document_set/api.py index de49346f969..6c52ec50611 100644 --- a/backend/danswer/server/features/document_set/api.py +++ b/backend/danswer/server/features/document_set/api.py @@ -1,6 +1,5 @@ from fastapi import APIRouter from fastapi import Depends -from fastapi import HTTPException from sqlalchemy.orm import Session from danswer.auth.api_key import validate_api_key @@ -19,10 +18,20 @@ from danswer.server.features.document_set.models import DocumentSet from danswer.server.features.document_set.models import DocumentSetCreationRequest from danswer.server.features.document_set.models import DocumentSetUpdateRequest - +from danswer.server.utils import user_facing_http_exception router = APIRouter(prefix="/manage", dependencies=[Depends(validate_api_key)]) +# Surfaced when a document-set write hits an IntegrityError. The common cause +# is a document set with both an is_current=True and is_current=False row for +# the same cc-pair (a known data inconsistency); flipping current→outdated then +# collides on the (document_set_id, cc_pair_id, is_current) primary key. +_DOC_SET_INTEGRITY_DETAIL = ( + "This document set has duplicate or conflicting connector entries, so it " + "couldn't be saved. This is a known data inconsistency — please contact an " + "administrator to clean it up." 
+) + @router.post("/admin/document-set") def create_document_set( @@ -37,7 +46,9 @@ def create_document_set( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception( + e, "create the document set", integrity_detail=_DOC_SET_INTEGRITY_DETAIL + ) return document_set_db_model.id @@ -53,7 +64,9 @@ def patch_document_set( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception( + e, "update the document set", integrity_detail=_DOC_SET_INTEGRITY_DETAIL + ) @router.delete("/admin/document-set/{document_set_id}") @@ -67,7 +80,7 @@ def delete_document_set( document_set_id=document_set_id, db_session=db_session ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception(e, "delete the document set") @router.get("/admin/document-set") diff --git a/backend/danswer/server/features/folder/api.py b/backend/danswer/server/features/folder/api.py index 754e3693dab..b03a0108a7e 100644 --- a/backend/danswer/server/features/folder/api.py +++ b/backend/danswer/server/features/folder/api.py @@ -1,6 +1,5 @@ from fastapi import APIRouter from fastapi import Depends -from fastapi import HTTPException from fastapi import Path from sqlalchemy.orm import Session @@ -24,6 +23,7 @@ from danswer.server.features.folder.models import GetUserFoldersResponse from danswer.server.models import DisplayPriorityRequest from danswer.server.query_and_chat.models import ChatSessionDetails +from danswer.server.utils import user_facing_http_exception router = APIRouter(prefix="/folder", dependencies=[Depends(validate_api_key)]) @@ -103,7 +103,7 @@ def patch_folder_endpoint( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception(e, "rename the folder") @router.delete("/{folder_id}") @@ -122,7 +122,7 @@ def 
delete_folder_endpoint( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception(e, "delete the folder") @router.post("/{folder_id}/add-chat-session") @@ -148,7 +148,7 @@ def add_chat_to_folder_endpoint( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception(e, "add the chat to the folder") @router.post("/{folder_id}/remove-chat-session/") @@ -174,4 +174,4 @@ def remove_chat_from_folder_endpoint( db_session=db_session, ) except Exception as e: - raise HTTPException(status_code=400, detail=str(e)) + raise user_facing_http_exception(e, "remove the chat from the folder") diff --git a/backend/danswer/server/features/persona/api.py b/backend/danswer/server/features/persona/api.py index 73f0cd6e3e8..87a3435d4e9 100644 --- a/backend/danswer/server/features/persona/api.py +++ b/backend/danswer/server/features/persona/api.py @@ -77,6 +77,7 @@ def list_personas_admin( db_session=db_session, user_id=None, # user_id = None -> give back all personas include_deleted=include_deleted, + eager_load=True, # serialized via PersonaSnapshot -> avoid N+1 ) ] @@ -187,6 +188,7 @@ def get_persona( user=user, db_session=db_session, is_for_edit=False, + eager_load=True, # serialized via PersonaSnapshot -> avoid N+1 ) ) diff --git a/backend/danswer/server/features/persona/models.py b/backend/danswer/server/features/persona/models.py index aee39e72af0..6ae3f59aef7 100644 --- a/backend/danswer/server/features/persona/models.py +++ b/backend/danswer/server/features/persona/models.py @@ -17,12 +17,17 @@ class CreatePersonaRequest(BaseModel): name: str + # Optional user-friendly label shown in chat; defaults to `name` if omitted. 
+ display_name: str | None = None description: str num_chunks: float llm_relevance_filter: bool is_public: bool llm_filter_extraction: bool recency_bias: RecencyBiasSetting + # Per-assistant cross-encoder reranking opt-in (beta). Defaults False so + # older clients that omit it keep current behavior. + rerank_enabled: bool = False prompt_ids: list[int] document_set_ids: list[int] # e.g. ID of SearchTool or ImageGenerationTool or @@ -39,6 +44,7 @@ class PersonaSnapshot(BaseModel): id: int owner: MinimalUserSnapshot | None name: str + display_name: str | None is_visible: bool is_public: bool display_priority: int | None @@ -46,6 +52,7 @@ class PersonaSnapshot(BaseModel): num_chunks: float | None llm_relevance_filter: bool llm_filter_extraction: bool + rerank_enabled: bool llm_model_provider_override: str | None llm_model_version_override: str | None starter_messages: list[StarterMessage] | None @@ -70,6 +77,7 @@ def from_model( return PersonaSnapshot( id=persona.id, name=persona.name, + display_name=persona.display_name, owner=( MinimalUserSnapshot(id=persona.user.id, email=persona.user.email) if persona.user @@ -82,6 +90,7 @@ def from_model( num_chunks=persona.num_chunks, llm_relevance_filter=persona.llm_relevance_filter, llm_filter_extraction=persona.llm_filter_extraction, + rerank_enabled=persona.rerank_enabled, llm_model_provider_override=persona.llm_model_provider_override, llm_model_version_override=persona.llm_model_version_override, starter_messages=persona.starter_messages, diff --git a/backend/danswer/server/features/tool/api.py b/backend/danswer/server/features/tool/api.py index f403633ce11..19709b3e7ce 100644 --- a/backend/danswer/server/features/tool/api.py +++ b/backend/danswer/server/features/tool/api.py @@ -17,6 +17,7 @@ from danswer.db.tools import get_tools from danswer.db.tools import update_tool from danswer.server.features.tool.models import ToolSnapshot +from danswer.server.utils import user_facing_http_exception from 
danswer.tools.custom.openapi_parsing import MethodSpec from danswer.tools.custom.openapi_parsing import openapi_to_method_specs from danswer.tools.custom.openapi_parsing import validate_openapi_schema @@ -92,8 +93,17 @@ def delete_custom_tool( except ValueError as e: raise HTTPException(status_code=404, detail=str(e)) except Exception as e: - # handles case where tool is still used by an Assistant - raise HTTPException(status_code=400, detail=str(e)) + # Most commonly a FK IntegrityError: the tool is still referenced by an + # Assistant. Give an actionable message instead of leaking raw SQL. + raise user_facing_http_exception( + e, + "delete the tool", + integrity_detail=( + "This tool can't be deleted because one or more assistants " + "still use it. Remove it from those assistants first, then try " + "again." + ), + ) class ValidateToolRequest(BaseModel): diff --git a/backend/danswer/server/manage/models.py b/backend/danswer/server/manage/models.py index a5a1268eaa4..c7f18d4ee92 100644 --- a/backend/danswer/server/manage/models.py +++ b/backend/danswer/server/manage/models.py @@ -34,6 +34,7 @@ class AuthTypeResponse(BaseModel): class UserPreferences(BaseModel): chosen_assistants: list[int] | None + hidden_assistants: list[int] | None = None class UserInfo(BaseModel): @@ -54,7 +55,12 @@ def from_model(cls, user: "UserModel") -> "UserInfo": is_superuser=user.is_superuser, is_verified=user.is_verified, role=user.role, - preferences=(UserPreferences(chosen_assistants=user.chosen_assistants)), + preferences=( + UserPreferences( + chosen_assistants=user.chosen_assistants, + hidden_assistants=user.hidden_assistants or [], + ) + ), ) diff --git a/backend/danswer/server/manage/users.py b/backend/danswer/server/manage/users.py index 8e505755861..3e3bc484eb7 100644 --- a/backend/danswer/server/manage/users.py +++ b/backend/danswer/server/manage/users.py @@ -298,3 +298,35 @@ def update_user_assistant_list( .values(chosen_assistants=request.chosen_assistants) ) 
db_session.commit() + + +class HiddenAssistantsRequest(BaseModel): + hidden_assistants: list[int] + + +@router.patch("/user/hidden-assistants") +def update_user_hidden_assistants( + request: HiddenAssistantsRequest, + user: User | None = Depends(current_user), + db_session: Session = Depends(get_session), +) -> None: + # Opt-out visibility: the assistants a user has explicitly hidden from their + # picker. Everything not in this list is visible by default (so new admin + # assistants surface for everyone). Mirrors `update_user_assistant_list`. + if user is None: + if AUTH_TYPE == AuthType.DISABLED: + store = get_dynamic_config_store() + + no_auth_user = fetch_no_auth_user(store) + no_auth_user.preferences.hidden_assistants = request.hidden_assistants + set_no_auth_user_preferences(store, no_auth_user.preferences) + return + else: + raise RuntimeError("This should never happen") + + db_session.execute( + update(User) + .where(User.id == user.id) # type: ignore + .values(hidden_assistants=request.hidden_assistants) + ) + db_session.commit() diff --git a/backend/danswer/server/query_and_chat/chat_backend.py b/backend/danswer/server/query_and_chat/chat_backend.py index c8eff35bfba..1aee9c3d3d4 100644 --- a/backend/danswer/server/query_and_chat/chat_backend.py +++ b/backend/danswer/server/query_and_chat/chat_backend.py @@ -1,5 +1,6 @@ import io import uuid +from datetime import datetime from typing import cast from fastapi import APIRouter @@ -131,19 +132,41 @@ def _reject_if_text_too_long(text: str, filename: str | None) -> None: @router.get("/get-user-chat-sessions") def get_user_chat_sessions( + limit: int | None = None, + offset: int = 0, + start_time: datetime | None = None, + end_time: datetime | None = None, user: User | None = Depends(current_user), db_session: Session = Depends(get_session), ) -> ChatSessionsResponse: user_id = user.id if user is not None else None + # The sidebar loads one time bucket at a time, newest-first: it passes a + # [start_time, 
end_time) window plus `limit`/`offset` to page within that + # bucket. When `limit` is omitted the endpoint keeps its legacy behavior and + # returns every session. To report whether more (older) sessions remain in + # the window, fetch one extra row and trim it off. + fetch_limit = limit + 1 if limit is not None else None + try: chat_sessions = get_chat_sessions_by_user( - user_id=user_id, deleted=False, db_session=db_session + user_id=user_id, + deleted=False, + db_session=db_session, + limit=fetch_limit, + offset=offset, + start_time=start_time, + end_time=end_time, ) except ValueError: raise ValueError("Chat session does not exist or has been deleted") + has_more = False + if limit is not None and len(chat_sessions) > limit: + has_more = True + chat_sessions = chat_sessions[:limit] + return ChatSessionsResponse( sessions=[ ChatSessionDetails( @@ -156,7 +179,8 @@ def get_user_chat_sessions( current_alternate_model=chat.current_alternate_model, ) for chat in chat_sessions - ] + ], + has_more=has_more, ) diff --git a/backend/danswer/server/query_and_chat/models.py b/backend/danswer/server/query_and_chat/models.py index ea1ce1ff680..bf6bf7b924f 100644 --- a/backend/danswer/server/query_and_chat/models.py +++ b/backend/danswer/server/query_and_chat/models.py @@ -113,6 +113,13 @@ class CreateChatMessageRequest(ChunkContext): # used for seeded chats to kick off the generation of an AI answer use_existing_user_message: bool = False + # Per-conversation toggles for the search-quality features, default OFF and + # independent of the assistant's own settings (the chat page exposes these + # so a user can opt in for just this conversation). Each still requires its + # global master switch (RERANK_ENABLED / LLM_RELEVANCE_FILTER_ENABLED). 
+ use_reranking: bool = False + use_relevance_filter: bool = False + @root_validator def check_search_doc_ids_or_retrieval_options(cls: BaseModel, values: dict) -> dict: search_doc_ids, retrieval_options = values.get("search_doc_ids"), values.get( @@ -156,6 +163,9 @@ class ChatSessionDetails(BaseModel): class ChatSessionsResponse(BaseModel): sessions: list[ChatSessionDetails] + # True when more (older) sessions exist beyond this page. Omitted/False + # means the caller has reached the end of the history. + has_more: bool = False class SearchFeedbackRequest(BaseModel): diff --git a/backend/danswer/server/settings/models.py b/backend/danswer/server/settings/models.py index 15e3d86b4ac..86a0a4607c1 100644 --- a/backend/danswer/server/settings/models.py +++ b/backend/danswer/server/settings/models.py @@ -22,6 +22,13 @@ class Settings(BaseModel): # here so the chat UI pre-checks against the SAME value the backend enforces # instead of a hardcoded duplicate. chat_file_max_size_mb: int = 25 + # Env-driven (RERANK_ENABLED / LLM_RELEVANCE_FILTER_ENABLED), injected in + # load_settings — surfaced so the chat + assistant UIs hide the + # per-conversation and per-assistant rerank / relevance toggles when the + # feature is disabled cluster-wide. Effective values (relevance also respects + # the DISABLE_LLM_CHUNK_FILTER kill-switch). 
+ rerank_enabled: bool = False + llm_relevance_filter_enabled: bool = False def check_validity(self) -> None: chat_page_enabled = self.chat_page_enabled diff --git a/backend/danswer/server/settings/store.py b/backend/danswer/server/settings/store.py index 29293afaab8..c626577f085 100644 --- a/backend/danswer/server/settings/store.py +++ b/backend/danswer/server/settings/store.py @@ -1,9 +1,12 @@ from typing import cast from danswer.configs.app_configs import CHAT_FILE_MAX_SIZE_MB +from danswer.configs.chat_configs import DISABLE_LLM_CHUNK_FILTER +from danswer.configs.chat_configs import LLM_RELEVANCE_FILTER_ENABLED from danswer.dynamic_configs.factory import get_dynamic_config_store from danswer.dynamic_configs.interface import ConfigNotFoundError from danswer.server.settings.models import Settings +from shared_configs.configs import RERANK_ENABLED _SETTINGS_KEY = "danswer_settings" @@ -21,6 +24,15 @@ def load_settings() -> Settings: # chat UI pre-check matches the backend's CHAT_FILE_MAX_SIZE_MB. settings.chat_file_max_size_mb = CHAT_FILE_MAX_SIZE_MB + # Cluster-level enablement of the search-quality features, surfaced so the UI + # can hide the rerank/relevance toggles when they're disabled cluster-wide. + # These mirror the backend's own gating exactly (relevance also off when the + # DISABLE_LLM_CHUNK_FILTER kill-switch is set). 
+ settings.rerank_enabled = RERANK_ENABLED + settings.llm_relevance_filter_enabled = ( + LLM_RELEVANCE_FILTER_ENABLED and not DISABLE_LLM_CHUNK_FILTER + ) + return settings diff --git a/backend/danswer/server/utils.py b/backend/danswer/server/utils.py index bf535661878..4c57ba8a3be 100644 --- a/backend/danswer/server/utils.py +++ b/backend/danswer/server/utils.py @@ -1,6 +1,60 @@ import json from typing import Any +from fastapi import HTTPException +from sqlalchemy.exc import IntegrityError + +from danswer.utils.logger import setup_logger + +logger = setup_logger() + + +def user_facing_http_exception( + exc: Exception, + action: str, + *, + status_code: int = 400, + integrity_detail: str | None = None, +) -> HTTPException: + """Convert an exception into an HTTPException safe to show the user. + + Endpoints historically did ``raise HTTPException(detail=str(e))``, which + leaks raw psycopg2 / SQLAlchemy text (full SQL statements, parameters, + constraint names) to the browser whenever a broad ``except Exception`` + catches a database error. This centralizes the translation: + + - ``ValueError`` is treated as an intentional, already-friendly domain + error and its message is surfaced verbatim. + - ``IntegrityError`` (unique/foreign-key violations, etc.) gets a generic + "conflicts with existing data" message, or a caller-supplied + ``integrity_detail`` for a more specific hint. + - Anything else is logged server-side and replaced with a generic message. + + ``action`` is a short verb phrase, e.g. "update the document set". + """ + if isinstance(exc, ValueError): + return HTTPException(status_code=status_code, detail=str(exc)) + + logger.exception(f"Failed to {action}") + if isinstance(exc, IntegrityError): + return HTTPException( + status_code=status_code, + detail=( + integrity_detail + or f"Could not {action} because it conflicts with existing " + "data (a duplicate entry, or something still referencing it). 
" + "Adjust the conflicting item and try again, or contact an " + "administrator if the problem persists." + ), + ) + return HTTPException( + status_code=status_code, + detail=( + f"Something went wrong while trying to {action}. Please try again, " + "or contact an administrator if the problem persists." + ), + ) + def get_json_line(json_dict: dict) -> str: return json.dumps(json_dict) + "\n" diff --git a/backend/danswer/tools/search/search_tool.py b/backend/danswer/tools/search/search_tool.py index 75770e69f62..6b5e7d533de 100644 --- a/backend/danswer/tools/search/search_tool.py +++ b/backend/danswer/tools/search/search_tool.py @@ -13,7 +13,9 @@ from danswer.db.models import Persona from danswer.db.models import User from danswer.dynamic_configs.interface import JSON_ro +from danswer.llm.answering.doc_pruning import parse_question_doc_version from danswer.llm.answering.doc_pruning import prune_documents +from danswer.llm.answering.doc_pruning import rewrite_docs_links from danswer.llm.answering.models import DocumentPruningConfig from danswer.llm.answering.models import PreviousMessage from danswer.llm.answering.models import PromptConfig @@ -78,6 +80,12 @@ def __init__( chunks_below: int = 0, full_doc: bool = False, bypass_acl: bool = False, + # Per-request reranking / relevance-filter overrides. None => let + # retrieval_preprocessing decide from the global flags + the assistant's + # settings (Slack / default path). The chat flow passes explicit values + # derived from the per-conversation toggles + global flags. 
+ skip_rerank: bool | None = None, + skip_llm_chunk_filter: bool | None = None, ) -> None: self.user = user self.persona = persona @@ -93,6 +101,8 @@ def __init__( self.chunks_below = chunks_below self.full_doc = full_doc self.bypass_acl = bypass_acl + self.skip_rerank = skip_rerank + self.skip_llm_chunk_filter = skip_llm_chunk_filter self.db_session = db_session def name(self) -> str: @@ -211,6 +221,8 @@ def run(self, **kwargs: str) -> Generator[ToolResponse, None, None]: chunks_above=self.chunks_above, chunks_below=self.chunks_below, full_doc=self.full_doc, + skip_rerank=self.skip_rerank, + skip_llm_chunk_filter=self.skip_llm_chunk_filter, ), user=self.user, llm=self.llm, @@ -263,6 +275,14 @@ def run(self, **kwargs: str) -> Generator[ToolResponse, None, None]: question=query, document_pruning_config=self.pruning_config, ) + # Rewrite versioned-docs links to the right version of the same page: the + # version the question asked about if it named one, else the newest indexed + # (retrieval can surface an arbitrary/stale version when several exist). 
+ rewrite_docs_links( + final_context_documents, + self.db_session, + target_version=parse_question_doc_version(query), + ) yield ToolResponse(id=FINAL_CONTEXT_DOCUMENTS, response=final_context_documents) def final_result(self, *args: ToolResponse) -> JSON_ro: diff --git a/backend/model_server/main.py b/backend/model_server/main.py index 1aaf9567874..c1fa2a5c653 100644 --- a/backend/model_server/main.py +++ b/backend/model_server/main.py @@ -20,6 +20,8 @@ from shared_configs.configs import MIN_THREADS_ML_MODELS from shared_configs.configs import MODEL_SERVER_ALLOWED_HOST from shared_configs.configs import MODEL_SERVER_PORT +from shared_configs.configs import RERANK_ENABLED +from shared_configs.configs import RERANK_SERVER_URL os.environ["TOKENIZERS_PARALLELISM"] = "false" os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1" @@ -41,7 +43,15 @@ async def lifespan(app: FastAPI) -> AsyncGenerator: if not INDEXING_ONLY: warm_up_intent_model() - if ENABLE_RERANKING_REAL_TIME_FLOW or ENABLE_RERANKING_ASYNC_FLOW: + # Only load the cross-encoder here when reranking is enabled AND it's + # NOT served by an external TEI server (RERANK_SERVER_URL). With TEI, + # the reranker lives in that container, so this server stays embedding + + # intent only. + if ( + RERANK_ENABLED + or ENABLE_RERANKING_REAL_TIME_FLOW + or ENABLE_RERANKING_ASYNC_FLOW + ) and not RERANK_SERVER_URL: warm_up_cross_encoders() else: logger.info("This model server should only run document indexing.") diff --git a/backend/scripts/test_uipath_version_discovery.py b/backend/scripts/test_uipath_version_discovery.py new file mode 100644 index 00000000000..5167a77ad5a --- /dev/null +++ b/backend/scripts/test_uipath_version_discovery.py @@ -0,0 +1,34 @@ +"""Ad-hoc check of get_uipath_docs_version_base_urls against the agreed test set. 
+Run: (venv active, cwd=backend) python scripts/test_uipath_version_discovery.py +Hits live docs.uipath.com pages.""" +from danswer.connectors.web.connector import get_uipath_docs_version_base_urls as expand + +TESTS = [ + ("https://docs.uipath.com/robot/standalone/latest", 3, "versioned standalone"), + ("https://docs.uipath.com/apps/automation-suite/", 3, "versioned on-prem"), + ("https://docs.uipath.com/activities/other/latest", 1, "evergreen (only latest)"), + ( + "https://docs.uipath.com/automation-cloud/automation-cloud/latest/admin-guide/using-the-migration-tool", + 1, + "cloud/evergreen deep page", + ), + ( + "https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/jobs-deployment?view=azure-pipelines", + 1, + "non-docs.uipath (domain-gated out)", + ), +] + +ok = True +for url, expected_count, label in TESTS: + res = expand(url, max_versions=3) + status = "PASS" if len(res) == expected_count else "FAIL" + if status == "FAIL": + ok = False + print(f"\n[{status}] {label}") + print(f" in : {url}") + print(f" exp: {expected_count} url(s) got: {len(res)}") + for u in res: + print(f" -> {u}") + +print("\n==== ALL PASS ====" if ok else "\n==== FAILURES PRESENT ====") diff --git a/backend/shared_configs/configs.py b/backend/shared_configs/configs.py index aeeb9cf2ae9..36b0aad8cb7 100644 --- a/backend/shared_configs/configs.py +++ b/backend/shared_configs/configs.py @@ -21,14 +21,29 @@ DOC_EMBEDDING_CONTEXT_SIZE = 512 # Cross Encoder Settings +# Global master switch for cross-encoder reranking. When true, the reranker is +# available and the app will rerank for assistants that opt in +# (Persona.rerank_enabled). When false (default) reranking is never attempted +# regardless of per-assistant flags. 
+RERANK_ENABLED = os.environ.get("RERANK_ENABLED", "").lower() == "true" +# If set, reranking is served by a Hugging Face Text-Embeddings-Inference (TEI) +# container at this base URL (its /rerank endpoint) — a CPU-optimized, +# full-precision way to host the cross-encoder WITHOUT a GPU. When set, our own +# model server does NOT load the cross-encoder (TEI owns it). Empty => use the +# legacy in-model-server sentence-transformers path. +RERANK_SERVER_URL = (os.environ.get("RERANK_SERVER_URL") or "").rstrip("/") ENABLE_RERANKING_ASYNC_FLOW = ( os.environ.get("ENABLE_RERANKING_ASYNC_FLOW", "").lower() == "true" ) ENABLE_RERANKING_REAL_TIME_FLOW = ( os.environ.get("ENABLE_RERANKING_REAL_TIME_FLOW", "").lower() == "true" ) -# Only using one cross-encoder for now -CROSS_ENCODER_MODEL_ENSEMBLE = ["mixedbread-ai/mxbai-rerank-xsmall-v1"] +# Only using one cross-encoder for now. Env-overridable so a GPU-backed prod +# deployment can select a stronger reranker (e.g. BAAI/bge-reranker-v2-m3) +# without a code change; local/dev keeps the small default. +CROSS_ENCODER_MODEL_ENSEMBLE = [ + os.environ.get("RERANK_MODEL_NAME") or "mixedbread-ai/mxbai-rerank-xsmall-v1" +] CROSS_EMBED_CONTEXT_SIZE = 512 # This controls the minimum number of pytorch "threads" to allocate to the embedding diff --git a/backend/tests/integration/test_auth_api_key_and_sso.py b/backend/tests/integration/test_auth_api_key_and_sso.py new file mode 100644 index 00000000000..7990f328265 --- /dev/null +++ b/backend/tests/integration/test_auth_api_key_and_sso.py @@ -0,0 +1,153 @@ +"""Integration tests for the auth gate — both flows must keep working: + + 1. SSO / session: a browser request with no session is rejected (403), which + is what drives the OIDC login flow. This must NOT be weakened. + 2. API key: these are service credentials for *automation* and intentionally + do NOT map to a `User`. 
A request carrying a valid `X-API-Key` is authorized + as an anonymous service caller (`user=None`, which endpoints already handle) + instead of being 403'd into the SSO flow. + +Regression context: enabling OIDC (AUTH_TYPE=oidc) flipped DISABLE_AUTH off, so +`current_user` started 403'ing api-key requests (they have no session and the +keys don't resolve to a user). These tests lock the contract so future changes +to auth or SSO don't silently break either flow. + +Style matches the other tests in this dir: drive the real dependency functions +directly with stubs (no live server / DB). +""" +import asyncio +from types import SimpleNamespace + +import pytest +from fastapi import HTTPException + +from starlette.datastructures import Headers + +from danswer.auth import api_key as api_key_mod +from danswer.auth import users as users_mod +from danswer.auth.api_key import request_has_valid_api_key +from danswer.auth.schemas import UserRole + + +def _run(coro): # avoid a hard pytest-asyncio dependency + return asyncio.run(coro) + + +def _request(headers: dict[str, str]) -> SimpleNamespace: + # Real Starlette Headers => case-insensitive lookup, exactly like a request + # (the Postman collection sends lowercase "x-api-key"). 
+ return SimpleNamespace(headers=Headers(headers)) + + +class _StubDB: + """Stands in for the Session: `.scalar()` returns a row iff `found`.""" + + def __init__(self, found: bool) -> None: + self._found = found + + def scalar(self, *_args, **_kwargs): # type: ignore[no-untyped-def] + return SimpleNamespace(id=1, user_id="00000000-0000-0000-0000-000000000000") if self._found else None + + +def _user(*, verified: bool = True, role: UserRole = UserRole.BASIC) -> SimpleNamespace: + return SimpleNamespace(is_verified=verified, role=role, email="svc@example.com") + + +@pytest.fixture(autouse=True) +def _clear_api_key_cache(): + # validate_api_key / request_has_valid_api_key share a module-level TTLCache; + # clear it so tests don't leak validity into each other. + api_key_mod.cache.clear() + yield + api_key_mod.cache.clear() + + +@pytest.fixture +def auth_enforced(monkeypatch): + """Force the 'auth enabled' world (e.g. OIDC) deterministically. + + `double_check_user`'s `optional` default is captured from DISABLE_AUTH at + import time, so we pin it to False here regardless of the test env's + AUTH_TYPE. `current_admin_user` reads DISABLE_AUTH at call time, so patch + that too. + """ + monkeypatch.setattr(users_mod.double_check_user, "__defaults__", (False,)) + monkeypatch.setattr(users_mod, "DISABLE_AUTH", False) + + +# --------------------------------------------------------------------------- # +# request_has_valid_api_key — the validity check itself +# --------------------------------------------------------------------------- # +def test_valid_api_key_header_is_accepted(): + assert request_has_valid_api_key(_request({"X-API-Key": "k"}), _StubDB(found=True)) is True + + +def test_lowercase_header_is_accepted(): + # Postman sends "x-api-key"; header lookup must be case-insensitive. 
+ assert request_has_valid_api_key(_request({"x-api-key": "k"}), _StubDB(found=True)) is True + + +def test_unknown_api_key_is_rejected(): + assert request_has_valid_api_key(_request({"X-API-Key": "nope"}), _StubDB(found=False)) is False + + +def test_missing_header_is_not_valid(): + assert request_has_valid_api_key(_request({}), _StubDB(found=True)) is False + + +def test_empty_header_value_is_not_valid(): + assert request_has_valid_api_key(_request({"X-API-Key": ""}), _StubDB(found=True)) is False + + +# --------------------------------------------------------------------------- # +# current_user — the gate endpoints actually depend on (auth enforced) +# --------------------------------------------------------------------------- # +def test_valid_api_key_authorizes_as_anonymous_service_caller(auth_enforced): + # The fix: valid key, no session -> authorized with user=None (no 403/SSO). + result = _run(current_user_call(_request({"x-api-key": "k"}), user=None, db=_StubDB(found=True))) + assert result is None + + +def test_invalid_api_key_is_rejected(auth_enforced): + with pytest.raises(HTTPException) as exc: + _run(current_user_call(_request({"x-api-key": "bad"}), user=None, db=_StubDB(found=False))) + assert exc.value.status_code == 403 + + +def test_no_session_and_no_api_key_is_rejected_so_sso_still_triggers(auth_enforced): + # The SSO guard: browser request, no session, no key -> 403 (drives login). + with pytest.raises(HTTPException) as exc: + _run(current_user_call(_request({}), user=None, db=_StubDB(found=False))) + assert exc.value.status_code == 403 + + +def test_session_user_still_authenticates(auth_enforced): + # A real (verified) OIDC/session user must still pass, key or not. 
+ u = _user(verified=True) + assert _run(current_user_call(_request({}), user=u, db=_StubDB(found=False))) is u + + +# --------------------------------------------------------------------------- # +# current_admin_user — a key alone must NOT grant admin +# --------------------------------------------------------------------------- # +def test_api_key_does_not_grant_admin(auth_enforced): + # api-key request resolves to user=None -> admin gate must still 403. + with pytest.raises(HTTPException) as exc: + _run(users_mod.current_admin_user(user=None)) + assert exc.value.status_code == 403 + + +def test_admin_user_passes_admin_gate(auth_enforced): + admin = _user(role=UserRole.ADMIN) + assert _run(users_mod.current_admin_user(user=admin)) is admin + + +def test_basic_user_blocked_from_admin_gate(auth_enforced): + with pytest.raises(HTTPException) as exc: + _run(users_mod.current_admin_user(user=_user(role=UserRole.BASIC))) + assert exc.value.status_code == 403 + + +# helper: call current_user with explicit args (bypassing FastAPI DI defaults) +async def current_user_call(request, user, db): # noqa: ANN001 + return await users_mod.current_user(request=request, user=user, db_session=db) diff --git a/backend/tests/integration/test_filter_chunks_listwise.py b/backend/tests/integration/test_filter_chunks_listwise.py new file mode 100644 index 00000000000..87e1b3d5d19 --- /dev/null +++ b/backend/tests/integration/test_filter_chunks_listwise.py @@ -0,0 +1,48 @@ +"""Integration test: the one-shot LLM relevance filter end-to-end (filter_chunks). + +Drives the real filter_chunks → llm_eval_chunks_listwise → _parse_useful_indices +path with a stub LLM (we can't run a real chat model locally), verifying the +listwise selection maps to the right chunk ids and that it fails OPEN. 
+""" +from types import SimpleNamespace + +from danswer.search.postprocessing.postprocessing import filter_chunks + + +class _StubLLM: + """Minimal LLM whose .invoke returns a message with the given content + (message_to_string only needs `.content` to be a str).""" + + def __init__(self, reply: str) -> None: + self._reply = reply + + def invoke(self, *_: object, **__: object) -> SimpleNamespace: + return SimpleNamespace(content=self._reply) + + +def _chunks(n: int) -> list[SimpleNamespace]: + return [ + SimpleNamespace(content=f"chunk number {i}", unique_id=f"u{i}") + for i in range(n) + ] + + +def _query() -> SimpleNamespace: + return SimpleNamespace(query="some question", max_llm_filter_chunks=15) + + +def test_listwise_selection_maps_to_chunk_ids() -> None: + # LLM says sections 1 and 3 are useful (1-based) → chunks u0 and u2. + out = filter_chunks(_query(), _chunks(3), _StubLLM("[1, 3]")) + assert out == ["u0", "u2"] + + +def test_none_useful() -> None: + out = filter_chunks(_query(), _chunks(3), _StubLLM("[]")) + assert out == [] + + +def test_fail_open_on_unparseable_reply() -> None: + # No JSON array → keep everything (fail open). + out = filter_chunks(_query(), _chunks(3), _StubLLM("I'm not sure.")) + assert out == ["u0", "u1", "u2"] diff --git a/backend/tests/integration/test_reranking_cpu_model.py b/backend/tests/integration/test_reranking_cpu_model.py new file mode 100644 index 00000000000..4489cbc7cae --- /dev/null +++ b/backend/tests/integration/test_reranking_cpu_model.py @@ -0,0 +1,67 @@ +"""Integration test: cross-encoder reranking with a REAL model on CPU. + +Loads a small cross-encoder (ms-marco-MiniLM-L-6-v2 — tiny, CPU-friendly) and +runs it through the real rerank_chunks / semantic_reranking ordering logic to +prove a relevant chunk gets reordered to the top even when retrieval put an +irrelevant one first. 
Uses the small MiniLM model (not the prod +bge-reranker-v2-m3) because the *ordering logic* is what's under test, and it +keeps the download light. Skips cleanly if the model can't be fetched (offline). +""" +from types import SimpleNamespace + +import pytest + +pytest.importorskip("sentence_transformers") + + +def _load_cross_encoder(): # type: ignore[no-untyped-def] + from sentence_transformers import CrossEncoder + + try: + return CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2") + except Exception as exc: # network / offline / disk + pytest.skip(f"cross-encoder model unavailable: {exc}") + + +def _chunk(content: str, score: float) -> SimpleNamespace: + # SimpleNamespace stands in for InferenceChunk — semantic_reranking only + # does attribute access (content, boost, recency_bias, score). + return SimpleNamespace( + content=content, + boost=0, + recency_bias=1.0, + score=score, + document_id="d", + chunk_id=0, + source_links=None, + ) + + +def test_reranker_reorders_relevant_chunk_to_top(monkeypatch) -> None: # type: ignore[no-untyped-def] + model = _load_cross_encoder() + + import danswer.search.postprocessing.postprocessing as pp + + class _RealEnsemble: + def __init__(self, *_: object, **__: object) -> None: + pass + + def predict(self, query: str, passages: list[str]) -> list[list[float]]: + scores = model.predict([(query, p) for p in passages]) + return [[float(s) for s in scores]] + + monkeypatch.setattr(pp, "CrossEncoderEnsembleModel", _RealEnsemble) + + # Deliberately WRONG initial order: the off-topic chunk has the top score. + chunks = [ + _chunk("Bananas are a good source of potassium.", score=0.9), + _chunk("The capital of France is Paris.", score=0.5), + _chunk("Python is a programming language.", score=0.4), + ] + query = SimpleNamespace(query="What is the capital of France?", num_rerank=15) + + ranked = pp.rerank_chunks(query=query, chunks_to_rerank=chunks) + + # After reranking, the France/Paris chunk should be on top, not the banana. 
+ assert "Paris" in ranked[0].content + assert ranked[0].content != "Bananas are a good source of potassium." diff --git a/backend/tests/integration/test_tei_rerank_client.py b/backend/tests/integration/test_tei_rerank_client.py new file mode 100644 index 00000000000..178c5f0eea6 --- /dev/null +++ b/backend/tests/integration/test_tei_rerank_client.py @@ -0,0 +1,64 @@ +"""Integration test for the TEI rerank transport in CrossEncoderEnsembleModel. + +When RERANK_SERVER_URL is configured, the client talks to a Hugging Face TEI +server's /rerank endpoint. TEI returns [{index, score}, ...] sorted by score, +so the client must scatter the scores back into the INPUT passage order and +wrap them as list[list[float]] (the shape semantic_reranking expects). +""" + +import danswer.search.search_nlp_models as nlp + + +class _FakeResponse: + def __init__(self, payload: object) -> None: + self._payload = payload + + def raise_for_status(self) -> None: + pass + + def json(self) -> object: + return self._payload + + +def test_tei_path_scatters_scores_back_to_passage_order(monkeypatch) -> None: # type: ignore[no-untyped-def] + captured: dict = {} + + # TEI replies sorted by score (best first), referencing the input by index. 
+ def fake_post(url: str, json: dict | None = None, **_: object) -> _FakeResponse: + captured["url"] = url + captured["json"] = json + return _FakeResponse( + [ + {"index": 2, "score": 9.0}, + {"index": 0, "score": 1.0}, + {"index": 1, "score": -3.0}, + ] + ) + + monkeypatch.setattr(nlp.requests, "post", fake_post) + + model = nlp.CrossEncoderEnsembleModel(rerank_server_url="http://tei-rerank:80") + out = model.predict("the query", ["a", "b", "c"]) + + # one ensemble entry, scores back in passage order [a, b, c] + assert out == [[1.0, -3.0, 9.0]] + assert captured["url"] == "http://tei-rerank:80/rerank" + assert captured["json"]["texts"] == ["a", "b", "c"] + assert captured["json"]["raw_scores"] is True + + +def test_legacy_path_used_when_no_tei_url(monkeypatch) -> None: # type: ignore[no-untyped-def] + captured: dict = {} + + def fake_post(url: str, json: dict | None = None, **_: object) -> _FakeResponse: + captured["url"] = url + return _FakeResponse({"scores": [[0.1, 0.2]]}) + + monkeypatch.setattr(nlp.requests, "post", fake_post) + + model = nlp.CrossEncoderEnsembleModel(rerank_server_url="") + out = model.predict("q", ["a", "b"]) + + assert out == [[0.1, 0.2]] + # legacy model-server endpoint, NOT /rerank + assert captured["url"].endswith("/encoder/cross-encoder-scores") diff --git a/backend/tests/unit/danswer/background/test_indexing_scheduler.py b/backend/tests/unit/danswer/background/test_indexing_scheduler.py index 93ff426ff8b..5b6b21ab6e5 100644 --- a/backend/tests/unit/danswer/background/test_indexing_scheduler.py +++ b/backend/tests/unit/danswer/background/test_indexing_scheduler.py @@ -30,11 +30,13 @@ """ from __future__ import annotations +import os import random import unittest from dataclasses import dataclass from dataclasses import field from enum import Enum +from unittest import mock from danswer.background.update import _build_running_view from danswer.background.update import _DEFER_CC_PAIR @@ -48,6 +50,7 @@ class _Source(str, Enum): GITHUB = 
"github" CONFLUENCE = "confluence" SALESFORCE = "salesforce" + WEB = "web" @dataclass @@ -703,5 +706,96 @@ def test_crash_recovery_does_not_leak_cap(self) -> None: ) +class TestPerSourceCapOverrides(unittest.TestCase): + """The per-source override (`INDEXING_PER_SOURCE_CAP_OVERRIDES`) lets a + single source run at a different cap than the global default without + affecting other sources. Web is the motivating case: lift its cap while + Slack/Confluence stay at 1. + """ + + def _tick( + self, + candidates: list[_FakeAttempt], + in_progress: list[_FakeAttempt], + per_source_cap: int, + overrides: dict[str, int], + ) -> tuple[list[int], list[tuple[int, str]]]: + running_per_source, keys = _build_running_view( + in_progress, [], per_source_cap, overrides + ) + dispatched: list[int] = [] + deferred: list[tuple[int, str]] = [] + for attempt in candidates: + decision = _evaluate_dispatch_for_attempt( + attempt, running_per_source, keys, per_source_cap, overrides + ) + if decision == _DISPATCH: + dispatched.append(attempt.id) + else: + deferred.append((attempt.id, decision)) + return dispatched, deferred + + def test_override_uncaps_one_source_only(self) -> None: + # Default cap 1, web uncapped (0). Three distinct web cc-pairs + + # two distinct slack cc-pairs queued on an idle scheduler. + candidates = [ + _attempt(id=1, source=_Source.WEB), + _attempt(id=2, source=_Source.WEB), + _attempt(id=3, source=_Source.WEB), + _attempt(id=4, source=_Source.SLACK), + _attempt(id=5, source=_Source.SLACK), + ] + dispatched, deferred = self._tick( + candidates, in_progress=[], per_source_cap=1, overrides={"web": 0} + ) + # All three web attempts go (uncapped); only one slack goes (cap 1). + self.assertEqual(set(dispatched), {1, 2, 3, 4}) + self.assertEqual(deferred, [(5, _DEFER_SOURCE_CAP)]) + + def test_override_with_finite_cap(self) -> None: + # web=2: at most two web at once; the third defers. 
+ candidates = [ + _attempt(id=1, source=_Source.WEB), + _attempt(id=2, source=_Source.WEB), + _attempt(id=3, source=_Source.WEB), + ] + dispatched, deferred = self._tick( + candidates, in_progress=[], per_source_cap=1, overrides={"web": 2} + ) + self.assertEqual(set(dispatched), {1, 2}) + self.assertEqual(deferred, [(3, _DEFER_SOURCE_CAP)]) + + def test_non_overridden_source_keeps_global_default(self) -> None: + # Override only names web; confluence must still honor the global 1. + candidates = [ + _attempt(id=1, source=_Source.CONFLUENCE), + _attempt(id=2, source=_Source.CONFLUENCE), + ] + dispatched, deferred = self._tick( + candidates, in_progress=[], per_source_cap=1, overrides={"web": 0} + ) + self.assertEqual(dispatched, [1]) + self.assertEqual(deferred, [(2, _DEFER_SOURCE_CAP)]) + + def test_per_cc_pair_lock_still_holds_when_uncapped(self) -> None: + # Even uncapped, the same cc-pair never runs twice concurrently. + running = _attempt(id=1, source=_Source.WEB, connector_id=99) + dup = _attempt(id=2, source=_Source.WEB, connector_id=99) + dispatched, deferred = self._tick( + [dup], in_progress=[running], per_source_cap=1, overrides={"web": 0} + ) + self.assertEqual(dispatched, []) + self.assertEqual(deferred, [(2, _DEFER_CC_PAIR)]) + + def test_resolve_overrides_parsing(self) -> None: + from danswer.configs.indexing_concurrency import _resolve_overrides + + with mock.patch.dict( + os.environ, + {"INDEXING_PER_SOURCE_CAP_OVERRIDES": " web = 0 , slack=2 ,bad,=3,x=y"}, + ): + self.assertEqual(_resolve_overrides(), {"web": 0, "slack": 2}) + + if __name__ == "__main__": unittest.main() diff --git a/backend/tests/unit/danswer/chat/test_translate_citations.py b/backend/tests/unit/danswer/chat/test_translate_citations.py new file mode 100644 index 00000000000..ecb08c50e98 --- /dev/null +++ b/backend/tests/unit/danswer/chat/test_translate_citations.py @@ -0,0 +1,49 @@ +"""Regression tests for translate_citations. 
+ +Guards the selected-docs `KeyError` fix: the LLM can emit a citation for a +document that isn't in this turn's reference docs (e.g. when chatting with a +subset of selected documents, it cites a doc from earlier in the conversation). +That must be skipped gracefully, not crash the whole response (which previously +surfaced as the misleading "Failed to parse LLM output"). +""" +from types import SimpleNamespace + +from danswer.chat.models import CitationInfo +from danswer.chat.process_message import translate_citations + + +def _db_doc(db_id: int, document_id: str) -> SimpleNamespace: + # translate_citations only reads .id and .document_id off each db_doc. + return SimpleNamespace(id=db_id, document_id=document_id) + + +def test_maps_citation_num_to_saved_doc_id() -> None: + db_docs = [_db_doc(101, "DOC_A"), _db_doc(102, "DOC_B")] + citations = [ + CitationInfo(citation_num=1, document_id="DOC_A"), + CitationInfo(citation_num=2, document_id="DOC_B"), + ] + assert translate_citations(citations, db_docs) == {1: 101, 2: 102} + + +def test_skips_citation_for_unknown_document_id() -> None: + # The fix: an unknown document_id is skipped, not a KeyError. + db_docs = [_db_doc(101, "DOC_A")] + citations = [ + CitationInfo(citation_num=1, document_id="DOC_A"), + CitationInfo(citation_num=2, document_id="DOC_NOT_IN_THIS_TURN"), + ] + assert translate_citations(citations, db_docs) == {1: 101} + + +def test_all_unknown_citations_yields_empty_map_without_error() -> None: + db_docs = [_db_doc(101, "DOC_A")] + citations = [CitationInfo(citation_num=1, document_id="GHOST")] + assert translate_citations(citations, db_docs) == {} + + +def test_first_db_doc_for_a_document_id_wins() -> None: + # Duplicate document_id across db_docs -> the first (UI order) is cited. 
+ db_docs = [_db_doc(101, "DOC_A"), _db_doc(202, "DOC_A")] + citations = [CitationInfo(citation_num=1, document_id="DOC_A")] + assert translate_citations(citations, db_docs) == {1: 101} diff --git a/backend/tests/unit/danswer/configs/test_indexing_concurrency.py b/backend/tests/unit/danswer/configs/test_indexing_concurrency.py new file mode 100644 index 00000000000..d5d1afc04d0 --- /dev/null +++ b/backend/tests/unit/danswer/configs/test_indexing_concurrency.py @@ -0,0 +1,53 @@ +"""Unit tests for the per-source indexing concurrency cap helpers.""" +import pytest + +from danswer.configs.indexing_concurrency import _resolve_overrides +from danswer.configs.indexing_concurrency import cap_for_source + + +def test_cap_for_source_uses_override_when_present() -> None: + assert cap_for_source("web", default=1, overrides={"web": 3}) == 3 + + +def test_cap_for_source_falls_back_to_default() -> None: + assert cap_for_source("slack", default=1, overrides={"web": 3}) == 1 + + +def test_cap_for_source_zero_override_is_respected() -> None: + # 0 = uncapped; must not be treated as falsy/absent. + assert cap_for_source("web", default=1, overrides={"web": 0}) == 0 + + +def test_resolve_overrides_parses_pairs(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("INDEXING_PER_SOURCE_CAP_OVERRIDES", "web=0,slack=1") + assert _resolve_overrides() == {"web": 0, "slack": 1} + + +def test_resolve_overrides_lowercases_source(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("INDEXING_PER_SOURCE_CAP_OVERRIDES", "WEB=3") + assert _resolve_overrides() == {"web": 3} + + +def test_resolve_overrides_skips_malformed_entries( + monkeypatch: pytest.MonkeyPatch, +) -> None: + # "web" (no =), "=5" (empty source), "slack=abc" (bad int) all skipped; + # one typo must not wipe the valid entries. 
+ monkeypatch.setenv( + "INDEXING_PER_SOURCE_CAP_OVERRIDES", "web, =5, slack=abc, confluence=2" + ) + assert _resolve_overrides() == {"confluence": 2} + + +def test_resolve_overrides_clamps_negative_to_zero( + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setenv("INDEXING_PER_SOURCE_CAP_OVERRIDES", "web=-2") + assert _resolve_overrides() == {"web": 0} + + +def test_resolve_overrides_empty_when_unset( + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.delenv("INDEXING_PER_SOURCE_CAP_OVERRIDES", raising=False) + assert _resolve_overrides() == {} diff --git a/backend/tests/unit/danswer/connectors/highspot/test_clean_spot_name.py b/backend/tests/unit/danswer/connectors/highspot/test_clean_spot_name.py new file mode 100644 index 00000000000..5fb146b11f9 --- /dev/null +++ b/backend/tests/unit/danswer/connectors/highspot/test_clean_spot_name.py @@ -0,0 +1,41 @@ +"""Regression tests for clean_spot_name. + +The cleaned value is used as the connector / cc-pair display name when syncing +Highspot Spots; the original Spot title is what's stored in `spot_names` (used to +match the real Spot), so cleaning must only tidy whitespace/control chars and +never alter meaningful characters. +""" +from danswer.connectors.highspot.sync import clean_spot_name + + +def test_collapses_internal_whitespace() -> None: + assert ( + clean_spot_name("Automation for Good - Sustainability at UiPath") + == "Automation for Good - Sustainability at UiPath" + ) + + +def test_trims_leading_and_trailing_whitespace() -> None: + assert ( + clean_spot_name(" Healthcare & Life Sciences (HLS) Spot ") + == "Healthcare & Life Sciences (HLS) Spot" + ) + + +def test_preserves_meaningful_punctuation() -> None: + title = "Track Your Impact: See How You're Progressing" + assert clean_spot_name(title) == title + + +def test_strips_zero_width_chars_keeping_real_spaces() -> None: + # zero-width space (U+200B) removed; the normal space is preserved. 
+ assert clean_spot_name("Gen​AI GTM") == "GenAI GTM" + + +def test_strips_control_chars() -> None: + # control chars (e.g. tab) are removed entirely, not turned into spaces. + assert clean_spot_name("Sales\tAMER") == "SalesAMER" + + +def test_already_clean_title_unchanged() -> None: + assert clean_spot_name("Sales AMER") == "Sales AMER" diff --git a/backend/tests/unit/danswer/connectors/outsystems/test_extract.py b/backend/tests/unit/danswer/connectors/outsystems/test_extract.py new file mode 100644 index 00000000000..12ba9ceddea --- /dev/null +++ b/backend/tests/unit/danswer/connectors/outsystems/test_extract.py @@ -0,0 +1,178 @@ +"""Tests for outsystems page text/title extraction. + +The connector fetches each page as an OutSystems screenservice JSON and pulls +content from `PageWidgetItem.Text1` (HTML) anywhere in the section/widget tree, +with the title from `Page2.Name` (falling back to the first heading). These guard +that extraction against schema drift and HTML edge cases. +""" +import time + +from danswer.connectors.outsystems import connector as os_connector +from danswer.connectors.outsystems.connector import collect_widgets +from danswer.connectors.outsystems.connector import extract_page_text +from danswer.connectors.outsystems.connector import extract_page_title +from danswer.connectors.outsystems.connector import _extract_text_with_timeout + + +def _page(*text1_blobs: str, name: str = "") -> dict: + items = [{"PageWidgetItem": {"Text1": b}} for b in text1_blobs] + return { + "Page2": {"Name": name}, + "StrPageSectionRowList": { + "List": [ + {"StrPageSectionList": {"List": [{"WidgetItems": {"List": items}}]}} + ] + }, + } + + +def test_extracts_and_strips_html_across_widgets() -> None: + data = _page( + "

<h2>How to report</h2><p>Email ethics@uipath.com &amp; call.</p>

", + "
<ul><li>Step one</li><li>Step two</li></ul>
", + name="Reporting an ethics concern", + ) + body = extract_page_text(data) + assert "How to report" in body + assert "ethics@uipath.com & call." in body # entity unescaped, tags gone + assert "Step one" in body and "Step two" in body + assert "<" not in body and ">" not in body + + +def test_title_from_page2_name() -> None: + data = _page("

<p>body text here</p>

", name="Reporting an ethics concern") + assert ( + extract_page_title(data, extract_page_text(data), 314) + == "Reporting an ethics concern" + ) + + +def test_title_falls_back_to_first_heading_when_name_blank() -> None: + data = _page("

<h1>Welcome to the Policy Hub</h1><p>more</p>

", name="") + assert extract_page_title(data, extract_page_text(data), 7) == ( + "Welcome to the Policy Hub" + ) + + +def test_title_final_fallback_uses_page_id() -> None: + assert extract_page_title({"Page2": {"Name": ""}}, "", 42) == "OutSystems Page 42" + + +def test_empty_page_yields_empty_text() -> None: + assert extract_page_text({"Page2": {"Name": ""}}) == "" + assert extract_page_text(_page(" ")) == "" + + +def test_ignores_non_text1_string_fields() -> None: + # Only PageWidgetItem.Text1 is content; other strings must be ignored. + data = { + "Page2": {"Name": "x", "GUID": "should-not-appear", "CustomURL": "nope"}, + "Junk": {"SomeOtherField": "also should not appear"}, + "StrPageSectionRowList": { + "List": [ + { + "StrPageSectionList": { + "List": [ + {"W": {"List": [{"PageWidgetItem": {"Text1": "real body"}}]}} + ] + } + } + ] + }, + } + body = extract_page_text(data) + assert body == "real body" + + +# ---- widget classification: text widgets vs document(file) widgets ---------- +def _widget(text1: str, page_file_id: str = "0", text2: str = "") -> dict: + item = {"PageWidgetItem": {"Text1": text1, "PageFileId": page_file_id, "Text2": text2}} + return { + "StrPageSectionRowList": { + "List": [{"StrPageSectionList": {"List": [{"W": {"List": [item]}}]}}] + } + } + + +def test_text_widget_goes_to_html_blobs_not_files() -> None: + html_blobs: list[str] = [] + files: list[tuple[str, str]] = [] + collect_widgets(_widget("

<p>hello world</p>
", page_file_id="0"), html_blobs, files) + assert html_blobs == ["
<p>hello world</p>

"] + assert files == [] + + +def test_document_widget_routes_to_files_with_path_and_name() -> None: + # Document widget: Text1 is a FilePath, Text2 the filename, PageFileId != 0. + html_blobs: list[str] = [] + files: list[tuple[str, str]] = [] + data = _widget( + "IC_Content/Docs/conflict-of-interest-policy.pdf", + page_file_id="1482", + text2="conflict-of-interest-policy.pdf", + ) + collect_widgets(data, html_blobs, files) + assert files == [ + ("IC_Content/Docs/conflict-of-interest-policy.pdf", + "conflict-of-interest-policy.pdf") + ] + # the file path must NOT be treated as body text (latent-bug guard) + assert html_blobs == [] + assert "IC_Content" not in extract_page_text(data) + + +def test_document_widget_filename_falls_back_to_path_basename() -> None: + html_blobs: list[str] = [] + files: list[tuple[str, str]] = [] + collect_widgets(_widget("Lib/Docs/policy.pdf", page_file_id="9"), html_blobs, files) + assert files == [("Lib/Docs/policy.pdf", "policy.pdf")] + + +# ---- process-isolated extraction timeout (the fix for GIL-bound parse hangs) -- +def test_extract_timeout_returns_empty_and_is_bounded(monkeypatch) -> None: + """A pathological parse that never returns must be hard-killed within the + timeout and yield "" — proving the index attempt can't be frozen by one file.""" + def _hang(**_kwargs): + time.sleep(60) + return "never" + + monkeypatch.setattr(os_connector, "extract_file_text", _hang) + start = time.time() + out = _extract_text_with_timeout("bad.pdf", b"data", timeout=2) + elapsed = time.time() - start + assert out == "" + assert elapsed < 10 # killed near the 2s bound, not hung + + +def test_extract_fast_path_returns_text(monkeypatch) -> None: + monkeypatch.setattr(os_connector, "extract_file_text", lambda **_k: "hello body") + assert _extract_text_with_timeout("a.pdf", b"data", timeout=10) == "hello body" + + +# ---- long-token sanitizer (prevents O(n^2) tokenizer hang on table dumps) ----- +def test_break_long_tokens_bounds_runs() -> None: + 
from danswer.connectors.outsystems.connector import _break_long_tokens + out = _break_long_tokens("X" * 500_000) + assert max(len(w) for w in out.split()) <= 80 + + +def test_break_long_tokens_leaves_normal_text() -> None: + from danswer.connectors.outsystems.connector import _break_long_tokens + s = "Reporting an ethics concern: email ethics@uipath.com for help." + assert _break_long_tokens(s) == s + + +# ---- section splitting (keeps giant docs indexable without a tokenizer hang) -- +def test_split_sections_bounds_size_and_keeps_content() -> None: + from danswer.connectors.outsystems.connector import _split_sections, _MAX_SECTION_CHARS + secs = _split_sections("word " * 200_000, "http://x", header="big.pdf") + assert len(secs) > 1 + assert all(len(s.text) <= _MAX_SECTION_CHARS for s in secs) + assert secs[0].text.startswith("big.pdf") + assert all(s.link == "http://x" for s in secs) + + +def test_split_sections_small_text_single_section() -> None: + from danswer.connectors.outsystems.connector import _split_sections + secs = _split_sections("just a little text", "http://x") + assert len(secs) == 1 and secs[0].text == "just a little text" diff --git a/backend/tests/unit/danswer/db/test_persona_eager_load.py b/backend/tests/unit/danswer/db/test_persona_eager_load.py new file mode 100644 index 00000000000..b38f88aecbc --- /dev/null +++ b/backend/tests/unit/danswer/db/test_persona_eager_load.py @@ -0,0 +1,64 @@ +"""Guards for the persona-serialization eager-load (N+1 fix). + +PersonaSnapshot.from_model walks document_sets -> connector_credential_pairs -> +connector/credential, all lazy by default. get_personas / get_persona_by_id take +an opt-in `eager_load` flag that selectin-loads that whole chain so serializing +the admin assistants list / edit page doesn't fire hundreds of queries. + +These tests are DB-free. They guard the two ways this fix silently breaks: + 1. 
A relationship gets renamed -> the eager-load chain references a dead + attribute (would raise, or quietly revert to N+1). Asserting the exact + relationship names exist (and that building the options doesn't raise) + catches that. + 2. `eager_load` stops defaulting to False -> the visibility toggle / delete + paths (which call get_persona_by_id) start paying for the heavy load they + don't need. Asserting the default stays False catches that. +""" +import inspect + +from sqlalchemy import inspect as sa_inspect +from sqlalchemy import select + +from danswer.db.models import ConnectorCredentialPair +from danswer.db.models import DocumentSet +from danswer.db.models import Persona +from danswer.db.persona import _persona_snapshot_load_options +from danswer.db.persona import get_persona_by_id +from danswer.db.persona import get_personas + + +def test_load_options_builds_without_error() -> None: + # Constructing the options dereferences every relationship in the chain + # (Persona.document_sets, DocumentSet.connector_credential_pairs, ...), so a + # rename would raise here. 
+ opts = _persona_snapshot_load_options() + assert len(opts) == 8 + + +def test_options_compose_onto_a_persona_select() -> None: + stmt = select(Persona).options(*_persona_snapshot_load_options()) + assert stmt is not None + + +def test_eager_chain_relationships_exist() -> None: + persona_rels = sa_inspect(Persona).relationships + for rel in ["user", "prompts", "tools", "users", "groups", "document_sets"]: + assert rel in persona_rels, f"Persona.{rel} relationship missing" + + docset_rels = sa_inspect(DocumentSet).relationships + for rel in ["connector_credential_pairs", "users", "groups"]: + assert rel in docset_rels, f"DocumentSet.{rel} relationship missing" + + ccpair_rels = sa_inspect(ConnectorCredentialPair).relationships + for rel in ["connector", "credential"]: + assert rel in ccpair_rels, f"ConnectorCredentialPair.{rel} relationship missing" + + +def test_eager_load_defaults_to_false() -> None: + # Must stay False so the visibility toggle / delete paths don't pay for the + # heavy serialization load they never use. 
+ assert get_personas.__defaults__ is not None + assert inspect.signature(get_personas).parameters["eager_load"].default is False + assert ( + inspect.signature(get_persona_by_id).parameters["eager_load"].default is False + ) diff --git a/backend/tests/unit/danswer/document_index/__init__.py b/backend/tests/unit/danswer/document_index/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/document_index/vespa/__init__.py b/backend/tests/unit/danswer/document_index/vespa/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/llm/__init__.py b/backend/tests/unit/danswer/llm/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/llm/answering/__init__.py b/backend/tests/unit/danswer/llm/answering/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/llm/answering/test_authoritative_retention.py b/backend/tests/unit/danswer/llm/answering/test_authoritative_retention.py new file mode 100644 index 00000000000..c47755a812f --- /dev/null +++ b/backend/tests/unit/danswer/llm/answering/test_authoritative_retention.py @@ -0,0 +1,147 @@ +"""Unit tests for verify-then-retain authoritative citations. + +Covers the pure pieces (candidate selection / dedupe / index parsing / footer) and +the verify step with a stub LLM. The retention is additive + deduped + fail-closed. 
+""" +from types import SimpleNamespace + +import pytest + +from danswer.llm.answering import authoritative_retention as ar + + +def doc(doc_id, source, name="Doc", content="c", link="http://x"): + return SimpleNamespace( + document_id=doc_id, source_type=source, semantic_identifier=name, + content=content, link=link, + ) + + +@pytest.fixture(autouse=True) +def _cfg(monkeypatch): + monkeypatch.setattr(ar, "PROTECTED_SOURCES", ["web", "outsystems"]) + + +# ---- select_authoritative_candidates ------------------------------------------- + +def test_returns_authoritative_when_none_cited(): + docs = [ + doc("os1", "outsystems"), + doc("s1", "slack"), # not protected + doc("w1", "web"), + ] + out = ar.select_authoritative_candidates(docs, already_cited_doc_ids={"s1"}) + # slack cited is fine (not authoritative); both authoritative docs are candidates + assert [d.document_id for d in out] == ["os1", "w1"] + + +def test_surfaces_uncited_authoritative_even_when_another_is_cited(): + # Citing one authoritative doc must NOT suppress a different uncited one. 
+ docs = [doc("os1", "outsystems"), doc("os2", "outsystems"), doc("w1", "web")] + out = ar.select_authoritative_candidates(docs, already_cited_doc_ids={"os2"}) + assert [d.document_id for d in out] == ["os1", "w1"] # os2 cited; os1, w1 surfaced + + +def test_dedupes_same_document_id(): + docs = [doc("os1", "outsystems"), doc("os1", "outsystems"), doc("os2", "outsystems")] + out = ar.select_authoritative_candidates(docs, already_cited_doc_ids=set()) + assert [d.document_id for d in out] == ["os1", "os2"] + + +def test_skips_docs_without_link(): + docs = [doc("os1", "outsystems", link=None), doc("os2", "outsystems", link="http://y")] + out = ar.select_authoritative_candidates(docs, already_cited_doc_ids=set()) + assert [d.document_id for d in out] == ["os2"] + + +def test_handles_enum_source_type(): + docs = [doc("os1", SimpleNamespace(value="outsystems"))] + out = ar.select_authoritative_candidates(docs, already_cited_doc_ids=set()) + assert [d.document_id for d in out] == ["os1"] + + +# ---- parse_supporting_indices -------------------------------------------------- + +@pytest.mark.parametrize("raw,n,expected", [ + ("[1, 3]", 3, [0, 2]), + ("sure: [2]", 3, [1]), + ("[]", 3, []), + ("none", 3, []), + ("[1, 9]", 3, [0]), # out-of-range dropped + ("[1, 1]", 3, [0, 0]), # parser doesn't dedupe ids (caller controls candidates) +]) +def test_parse_supporting_indices(raw, n, expected): + assert ar.parse_supporting_indices(raw, n) == expected + + +# ---- verify_supporting_docs (stub LLM) ----------------------------------------- + +class _StubLLM: + def __init__(self, reply): self._reply = reply + def invoke(self, prompt): return SimpleNamespace(content=self._reply) + + +class _BoomLLM: + def __init__(self): self.calls = 0 + def invoke(self, prompt): + self.calls += 1 + raise RuntimeError("llm down") + + +class _FlakyLLM: + """Fails the first N invokes, then returns `reply`.""" + def __init__(self, fail_times, reply): + self._fail_times = fail_times + self._reply = reply + 
self.calls = 0 + def invoke(self, prompt): + self.calls += 1 + if self.calls <= self._fail_times: + raise RuntimeError("transient timeout") + return SimpleNamespace(content=self._reply) + + +def test_verify_returns_supporting_docs(): + cands = [doc("os1", "outsystems", "Forma"), doc("os2", "outsystems", "Tax")] + out = ar.verify_supporting_docs("answer", cands, _StubLLM("[1]")) + assert [d.document_id for d in out] == ["os1"] + + +def test_verify_fail_closed_after_exhausting_retries(): + cands = [doc("os1", "outsystems")] + llm = _BoomLLM() + assert ar.verify_supporting_docs("answer", cands, llm, max_attempts=2) == [] + assert llm.calls == 2 # tried twice, then gave up + + +def test_verify_retries_then_succeeds_on_transient_failure(): + cands = [doc("os1", "outsystems"), doc("os2", "outsystems")] + llm = _FlakyLLM(fail_times=1, reply="[2]") # first call times out, retry works + out = ar.verify_supporting_docs("answer", cands, llm, max_attempts=2) + assert [d.document_id for d in out] == ["os2"] + assert llm.calls == 2 + + +def test_verify_noop_without_candidates_or_answer(): + assert ar.verify_supporting_docs("answer", [], _StubLLM("[1]")) == [] + assert ar.verify_supporting_docs(" ", [doc("os1", "outsystems")], _StubLLM("[1]")) == [] + + +# ---- retained_authoritative_footer (footer block, stub LLM) -------------------- + +def test_footer_appended_for_verified_doc(): + fcd = [doc("os1", "outsystems", "Forma", link="http://f"), doc("s1", "slack")] + out = ar.retained_authoritative_footer("ans", fcd, set(), _StubLLM("[1]")) + assert "Authoritative sources" in out and "[Forma](http://f)" in out + + +def test_footer_empty_and_no_llm_call_when_authoritative_already_cited(): + fcd = [doc("os1", "outsystems"), doc("s1", "slack")] + llm = _BoomLLM() # would raise if the verify call ran + assert ar.retained_authoritative_footer("ans", fcd, {"os1"}, llm) == "" + assert llm.calls == 0 # gated out before any LLM call + + +def test_footer_empty_when_verify_rejects(): + fcd = 
[doc("os1", "outsystems"), doc("s1", "slack")] + assert ar.retained_authoritative_footer("ans", fcd, set(), _StubLLM("[]")) == "" diff --git a/backend/tests/unit/danswer/llm/answering/test_cap_docs_per_source.py b/backend/tests/unit/danswer/llm/answering/test_cap_docs_per_source.py new file mode 100644 index 00000000000..719db0173ee --- /dev/null +++ b/backend/tests/unit/danswer/llm/answering/test_cap_docs_per_source.py @@ -0,0 +1,79 @@ +"""Unit tests for cap_docs_per_source — caps how many docs any single source +contributes to the final prompt, so a chatty source (e.g. a busy Slack channel) +can't monopolize grounding/citations and drown out curated sources. +""" +from types import SimpleNamespace + +import pytest + +from danswer.llm.answering import doc_pruning as dp + + +def _doc(name: str, source: str) -> SimpleNamespace: + return SimpleNamespace(semantic_identifier=name, source_type=source) + + +def _ids(docs: list[SimpleNamespace]) -> list[str]: + return [d.semantic_identifier for d in docs] + + +@pytest.fixture(autouse=True) +def _cfg(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(dp, "MAX_PROMPT_DOCS_PER_SOURCE", 2) + + +def test_caps_dominant_source_keeping_order() -> None: + # 5 Slack + 2 OutSystems (already promoted to front) -> Slack capped to 2. 
+ docs = [ + _doc("os1", "outsystems"), + _doc("os2", "outsystems"), + _doc("s1", "slack"), + _doc("s2", "slack"), + _doc("s3", "slack"), + _doc("s4", "slack"), + _doc("s5", "slack"), + ] + out = dp.cap_docs_per_source(docs) + assert _ids(out) == ["os1", "os2", "s1", "s2"] + + +def test_caps_each_source_independently() -> None: + docs = [ + _doc("s1", "slack"), + _doc("w1", "web"), + _doc("s2", "slack"), + _doc("w2", "web"), + _doc("s3", "slack"), + _doc("w3", "web"), + ] + out = dp.cap_docs_per_source(docs) + # first 2 of each source, in original order + assert _ids(out) == ["s1", "w1", "s2", "w2"] + + +def test_keeps_top_n_by_position() -> None: + # cap keeps the FIRST N (highest-ranked / promoted), drops the tail + docs = [_doc(f"s{i}", "slack") for i in range(6)] + assert _ids(dp.cap_docs_per_source(docs)) == ["s0", "s1"] + + +def test_under_cap_unchanged() -> None: + docs = [_doc("s1", "slack"), _doc("os1", "outsystems")] + assert _ids(dp.cap_docs_per_source(docs)) == ["s1", "os1"] + + +def test_disabled_when_cap_zero(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(dp, "MAX_PROMPT_DOCS_PER_SOURCE", 0) + docs = [_doc(f"s{i}", "slack") for i in range(5)] + assert dp.cap_docs_per_source(docs) is docs + + +def test_handles_enum_like_source_type(monkeypatch: pytest.MonkeyPatch) -> None: + # source_type may be a DocumentSource enum (has .value) rather than a str. 
+ monkeypatch.setattr(dp, "MAX_PROMPT_DOCS_PER_SOURCE", 1) + docs = [ + _doc("a", SimpleNamespace(value="slack")), + _doc("b", SimpleNamespace(value="slack")), + _doc("c", SimpleNamespace(value="outsystems")), + ] + assert _ids(dp.cap_docs_per_source(docs)) == ["a", "c"] diff --git a/backend/tests/unit/danswer/llm/answering/test_dedupe_doc_versions.py b/backend/tests/unit/danswer/llm/answering/test_dedupe_doc_versions.py new file mode 100644 index 00000000000..3df2087cc11 --- /dev/null +++ b/backend/tests/unit/danswer/llm/answering/test_dedupe_doc_versions.py @@ -0,0 +1,134 @@ +"""Tests for versioned-docs dedup in the final LLM source-doc selection. + +Documentation sites (docs.uipath.com) publish the same page under one URL per +product version. Retrieval floods the LLM context with near-identical copies, +crowding out distinct sources and making the bot intermittently fail to cite. +dedupe_doc_versions keeps only the newest version's chunk(s) per page, scoped to +docs URLs — every other source must pass through untouched. 
+""" +from datetime import datetime + +from danswer.chat.models import LlmDoc +from danswer.configs.constants import DocumentSource +from danswer.llm.answering import doc_pruning +from danswer.llm.answering.doc_pruning import _docs_page_and_version +from danswer.llm.answering.doc_pruning import _docs_version_sort_key +from danswer.llm.answering.doc_pruning import dedupe_doc_versions + +AS = "https://docs.uipath.com/automation-suite/automation-suite" +PAGE = "installation-guide/how-to-delete-images-from-the-old-installer-after-upgrade" + + +def _doc(url: str, source: DocumentSource = DocumentSource.WEB) -> LlmDoc: + return LlmDoc( + document_id=url, + content="content", + blurb="blurb", + semantic_identifier="How to delete images", + source_type=source, + metadata={}, + updated_at=None, + link=url, + source_links={0: url}, + ) + + +# ---- version ordering ------------------------------------------------------ +def test_version_sort_new_scheme_outranks_old() -> None: + # 2.2510 (new scheme, current) must be newer than 2024.10 (old scheme). 
+ assert _docs_version_sort_key("2.2510") > _docs_version_sort_key("2024.10") + + +def test_version_sort_old_scheme_internal_order() -> None: + keys = [ + _docs_version_sort_key(v) + for v in ["2022.4", "2022.10", "2023.4", "2023.10", "2024.10"] + ] + assert keys == sorted(keys) # already newest-last -> strictly increasing + + +def test_version_sort_latest_is_highest() -> None: + for v in ["2.2510", "2024.10", "2023.10", "9.9999"]: + assert _docs_version_sort_key("latest") > _docs_version_sort_key(v) + + +def test_version_sort_unrecognized_is_lowest() -> None: + assert _docs_version_sort_key("nonsense") < _docs_version_sort_key("2022.4") + + +# ---- page/version parsing -------------------------------------------------- +def test_parse_strips_version_segment() -> None: + page, ver = _docs_page_and_version(f"{AS}/2024.10/{PAGE}") + assert ver == "2024.10" + assert "2024.10" not in page + # same page, different version -> identical page_key + page2, _ = _docs_page_and_version(f"{AS}/2.2510/{PAGE}") + assert page == page2 + + +def test_parse_non_docs_url_returns_none() -> None: + assert _docs_page_and_version("https://example.com/foo/2024.10/bar") is None + + +def test_parse_docs_url_without_version_returns_none() -> None: + assert _docs_page_and_version("https://docs.uipath.com/overview") is None + + +# ---- dedup behavior -------------------------------------------------------- +def test_keeps_only_newest_version_of_a_page() -> None: + docs = [ + _doc(f"{AS}/2023.10/{PAGE}"), + _doc(f"{AS}/2024.10/{PAGE}"), + _doc(f"{AS}/2.2510/{PAGE}"), # newest + _doc(f"{AS}/2022.4/{PAGE}"), + ] + kept, _ = dedupe_doc_versions(docs, None) + assert [d.document_id for d in kept] == [f"{AS}/2.2510/{PAGE}"] + + +def test_distinct_pages_all_survive() -> None: + a = f"{AS}/2024.10/guide/page-a" + b = f"{AS}/2024.10/guide/page-b" + kept, _ = dedupe_doc_versions([_doc(a), _doc(b)], None) + assert {d.document_id for d in kept} == {a, b} + + +def 
test_keeps_multiple_chunks_of_newest_version() -> None: + # Two distinct chunks of the same newest-version page both survive. + d1 = _doc(f"{AS}/2.2510/{PAGE}") + d2 = _doc(f"{AS}/2.2510/{PAGE}") + d2.content = "different chunk" + old = _doc(f"{AS}/2024.10/{PAGE}") + kept, _ = dedupe_doc_versions([d1, old, d2], None) + assert len(kept) == 2 + assert all(d.document_id == f"{AS}/2.2510/{PAGE}" for d in kept) + + +def test_non_docs_sources_untouched() -> None: + slack = _doc("slack-msg-123", source=DocumentSource.SLACK) + web = _doc("https://other.com/x/2024.10/y", source=DocumentSource.WEB) + docs = [_doc(f"{AS}/2024.10/{PAGE}"), _doc(f"{AS}/2.2510/{PAGE}"), slack, web] + kept, _ = dedupe_doc_versions(docs, None) + ids = [d.document_id for d in kept] + assert "slack-msg-123" in ids + assert "https://other.com/x/2024.10/y" in ids + assert f"{AS}/2024.10/{PAGE}" not in ids # older docs version dropped + + +def test_relevance_list_filtered_in_lockstep() -> None: + docs = [ + _doc(f"{AS}/2024.10/{PAGE}"), # dropped + _doc(f"{AS}/2.2510/{PAGE}"), # kept (relevant) + _doc("slack-1", source=DocumentSource.SLACK), # kept + ] + rel = [True, True, False] + kept, kept_rel = dedupe_doc_versions(docs, rel) + assert len(kept) == len(kept_rel) == 2 + assert kept_rel == [True, False] + + +def test_disabled_when_substr_empty(monkeypatch) -> None: + monkeypatch.setattr(doc_pruning, "DOCS_VERSION_DEDUP_URL_SUBSTR", "") + docs = [_doc(f"{AS}/2023.10/{PAGE}"), _doc(f"{AS}/2.2510/{PAGE}")] + kept, _ = dedupe_doc_versions(docs, None) + assert len(kept) == 2 # no dedup when disabled diff --git a/backend/tests/unit/danswer/llm/answering/test_docs_version_rewrite.py b/backend/tests/unit/danswer/llm/answering/test_docs_version_rewrite.py new file mode 100644 index 00000000000..ef55cfe5cbb --- /dev/null +++ b/backend/tests/unit/danswer/llm/answering/test_docs_version_rewrite.py @@ -0,0 +1,112 @@ +"""Unit tests for query-time docs version rewrite — rewrite a retrieved versioned +docs link to the 
newest version of the same page that exists in the index.""" +from types import SimpleNamespace + +import pytest + +from danswer.llm.answering import doc_pruning as dp + +BASE = "https://docs.uipath.com/orchestrator/standalone" +PAGE = "installation-guide/maintenance-considerations" + + +def ldoc(link): + return SimpleNamespace(link=link) + + +class _FakeResult: + def __init__(self, rows): + self._rows = rows + + def all(self): + return self._rows + + +class _FakeDB: + """Ignores the WHERE (the Python-side page match is what we're testing) and + returns every indexed id as a 1-tuple row, like SQLAlchemy .execute().all().""" + + def __init__(self, ids): + self._ids = ids + self.calls = 0 + + def execute(self, _stmt): + self.calls += 1 + return _FakeResult([(i,) for i in self._ids]) + + +@pytest.fixture(autouse=True) +def _cfg(monkeypatch): + monkeypatch.setattr(dp, "DOCS_VERSION_DEDUP_URL_SUBSTR", "docs.uipath.com") + + +def test_versioned_url_parts(): + assert dp._versioned_url_parts(f"{BASE}/2023.10/{PAGE}") == (BASE, "2023.10", PAGE) + assert dp._versioned_url_parts("https://docs.uipath.com/no/version/here") is None + assert dp._versioned_url_parts("https://example.com/2024.10/x") is None # not docs + + +def test_rewrites_to_latest_indexed_version(): + index = [f"{BASE}/2023.4/{PAGE}", f"{BASE}/2023.10/{PAGE}", f"{BASE}/2025.10/{PAGE}"] + docs = [ldoc(f"{BASE}/2023.4/{PAGE}")] + dp.rewrite_docs_links(docs, _FakeDB(index)) + assert docs[0].link == f"{BASE}/2025.10/{PAGE}" + + +def test_noop_when_already_latest(): + index = [f"{BASE}/2023.10/{PAGE}", f"{BASE}/2025.10/{PAGE}"] + docs = [ldoc(f"{BASE}/2025.10/{PAGE}")] + dp.rewrite_docs_links(docs, _FakeDB(index)) + assert docs[0].link == f"{BASE}/2025.10/{PAGE}" + + +def test_does_not_cross_slug_variants(): + # "is-maintenance-considerations" is a different page than "maintenance-considerations" + other = "installation-guide/is-maintenance-considerations" + index = [f"{BASE}/2025.10/{other}", 
f"{BASE}/2023.4/{PAGE}"] + docs = [ldoc(f"{BASE}/2023.4/{PAGE}")] + dp.rewrite_docs_links(docs, _FakeDB(index)) + # no newer version of THIS slug -> unchanged + assert docs[0].link == f"{BASE}/2023.4/{PAGE}" + + +def test_non_docs_link_untouched(): + docs = [ldoc("https://Product.slack.com/archives/C1/p123")] + db = _FakeDB([]) + dp.rewrite_docs_links(docs, db) + assert docs[0].link == "https://Product.slack.com/archives/C1/p123" + assert db.calls == 0 # no docs pages -> no query + + +def test_new_scheme_outranks_old(): + index = [f"{BASE}/2025.10/{PAGE}", f"{BASE}/2.2510/{PAGE}"] # 2.2510 = new scheme + docs = [ldoc(f"{BASE}/2025.10/{PAGE}")] + dp.rewrite_docs_links(docs, _FakeDB(index)) + assert docs[0].link == f"{BASE}/2.2510/{PAGE}" + + +# ---- version-aware targeting --------------------------------------------------- + +def test_parse_question_doc_version(): + assert dp.parse_question_doc_version("is this supported in 23.10?") == "2023.10" + assert dp.parse_question_doc_version("on Orchestrator 2024.10 standalone") == "2024.10" + # multiple versions -> ambiguous -> None (fall back to latest) + assert dp.parse_question_doc_version("upgrade from 23.10 to 25.10") is None + # no version + assert dp.parse_question_doc_version("how many records can AuditLogEntities hold") is None + # not confused by '2.9 million' + assert dp.parse_question_doc_version("the table has 2.9 million records") is None + + +def test_targets_requested_version_even_if_older(): + index = [f"{BASE}/2023.10/{PAGE}", f"{BASE}/2025.10/{PAGE}"] + docs = [ldoc(f"{BASE}/2025.10/{PAGE}")] # retrieved latest + dp.rewrite_docs_links(docs, _FakeDB(index), target_version="2023.10") + assert docs[0].link == f"{BASE}/2023.10/{PAGE}" # downgraded to the asked version + + +def test_target_version_not_indexed_leaves_link_as_is(): + index = [f"{BASE}/2023.10/{PAGE}", f"{BASE}/2025.10/{PAGE}"] + docs = [ldoc(f"{BASE}/2025.10/{PAGE}")] + dp.rewrite_docs_links(docs, _FakeDB(index), target_version="2022.4") # not 
indexed + assert docs[0].link == f"{BASE}/2025.10/{PAGE}" # unchanged diff --git a/backend/tests/unit/danswer/llm/answering/test_source_diversity.py b/backend/tests/unit/danswer/llm/answering/test_source_diversity.py new file mode 100644 index 00000000000..71ecd5267af --- /dev/null +++ b/backend/tests/unit/danswer/llm/answering/test_source_diversity.py @@ -0,0 +1,68 @@ +"""Unit tests for ensure_source_diversity — guarantees curated KB/web docs +aren't crowded out of the final prompt by a chatty high-relevance source. + +Promotes up to SOURCE_DIVERSITY_RESERVED_SLOTS of the highest-ranked +protected-source docs to the front, preserving the rest of the order. This is +the replacement for the old two-query source-prioritization hack. +""" +from types import SimpleNamespace + +import pytest + +from danswer.llm.answering import doc_pruning as dp + + +def _doc(name: str, source: str) -> SimpleNamespace: + return SimpleNamespace(semantic_identifier=name, source_type=source) + + +def _ids(docs: list[SimpleNamespace]) -> list[str]: + return [d.semantic_identifier for d in docs] + + +@pytest.fixture(autouse=True) +def _cfg(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(dp, "PROTECTED_SOURCES", ["web", "sfkbarticles"]) + monkeypatch.setattr(dp, "SOURCE_DIVERSITY_RESERVED_SLOTS", 2) + + +def test_promotes_top_protected_docs_to_front() -> None: + # Slack dominates the top; KB/web are ranked lower and would be cut. + docs = [ + _doc("s1", "slack"), + _doc("s2", "slack"), + _doc("kb1", "sfkbarticles"), + _doc("s3", "slack"), + _doc("w1", "web"), + ] + out = dp.ensure_source_diversity(docs) + # top 2 protected hoisted (in their original relative order); rest preserved + assert _ids(out) == ["kb1", "w1", "s1", "s2", "s3"] + + +def test_caps_at_reserved_slots() -> None: + # 3 protected present, but only the top 2 are promoted. 
+ docs = [ + _doc("s1", "slack"), + _doc("kb1", "web"), + _doc("kb2", "web"), + _doc("kb3", "sfkbarticles"), + ] + out = dp.ensure_source_diversity(docs) + assert _ids(out) == ["kb1", "kb2", "s1", "kb3"] + + +def test_noop_when_no_protected_docs() -> None: + docs = [_doc("s1", "slack"), _doc("s2", "slack")] + assert dp.ensure_source_diversity(docs) is docs + + +def test_disabled_when_reserved_zero(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(dp, "SOURCE_DIVERSITY_RESERVED_SLOTS", 0) + docs = [_doc("kb1", "web"), _doc("s1", "slack")] + assert dp.ensure_source_diversity(docs) is docs + + +def test_already_at_front_unchanged() -> None: + docs = [_doc("kb1", "web"), _doc("kb2", "web"), _doc("s1", "slack")] + assert _ids(dp.ensure_source_diversity(docs)) == ["kb1", "kb2", "s1"] diff --git a/backend/tests/unit/danswer/prompts/__init__.py b/backend/tests/unit/danswer/prompts/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/prompts/test_authoritative_sources.py b/backend/tests/unit/danswer/prompts/test_authoritative_sources.py new file mode 100644 index 00000000000..7d59437f3d9 --- /dev/null +++ b/backend/tests/unit/danswer/prompts/test_authoritative_sources.py @@ -0,0 +1,64 @@ +"""Unit tests for the authoritative-sources citation nudge. + +Generic, global instruction (derived from PROTECTED_SOURCES) telling the LLM to +prefer grounding + citing authoritative systems of record over chat discussions. +Source names must match the `Source: X` labels build_doc_context_str puts on docs +(i.e. clean_up_source), so the model can map the nudge to specific docs. 
+""" +import pytest + +from danswer.prompts import prompt_utils as pu + + +def test_lists_protected_sources_with_doc_label_names( + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setattr(pu, "PROTECTED_SOURCES", ["web", "outsystems", "highspot"]) + out = pu.build_authoritative_sources_reminder() + # clean_up_source: web -> "Website" (CONNECTOR_NAME_MAP), others title-cased. + assert "Website" in out + assert "Outsystems" in out + assert "Highspot" in out + # matches the doc labels and instructs preference over discussions. + assert "authoritative" in out.lower() + assert "chat discussions" in out.lower() + + +def test_names_match_clean_up_source(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pu, "PROTECTED_SOURCES", ["sfkbarticles", "web"]) + out = pu.build_authoritative_sources_reminder() + # Whatever build_doc_context_str would label these, the nudge must use the same. + assert pu.clean_up_source("sfkbarticles") in out + assert pu.clean_up_source("web") in out + + +def test_empty_when_no_protected_sources(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pu, "PROTECTED_SOURCES", []) + assert pu.build_authoritative_sources_reminder() == "" + + +def test_dedupes_repeated_display_names(monkeypatch: pytest.MonkeyPatch) -> None: + # If two keys cleaned to the same display name, list it once. + monkeypatch.setattr(pu, "PROTECTED_SOURCES", ["web", "web"]) + out = pu.build_authoritative_sources_reminder() + assert out.count("Website") == 1 + + +def test_only_appended_to_task_prompt_when_citations_enabled( + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setattr(pu, "PROTECTED_SOURCES", ["outsystems"]) + + class _P: + task_prompt = "BASE." + include_citations = True + + class _PNo: + task_prompt = "BASE." 
+ include_citations = False + + with_cite = pu.build_task_prompt_reminders(_P(), use_language_hint=False, citation_str="CITE.") + no_cite = pu.build_task_prompt_reminders(_PNo(), use_language_hint=False, citation_str="CITE.") + assert "authoritative systems of record" in with_cite + assert "Outsystems" in with_cite + assert "authoritative systems of record" not in no_cite diff --git a/backend/tests/unit/danswer/search/__init__.py b/backend/tests/unit/danswer/search/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/search/preprocessing/__init__.py b/backend/tests/unit/danswer/search/preprocessing/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_llm_chunk_filter.py b/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_llm_chunk_filter.py new file mode 100644 index 00000000000..07eb2a2ed45 --- /dev/null +++ b/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_llm_chunk_filter.py @@ -0,0 +1,70 @@ +"""Unit tests for _resolve_skip_llm_chunk_filter — decides whether to skip the +LLM relevance filter. + +Rule: the filter runs only when the global master switch +LLM_RELEVANCE_FILTER_ENABLED AND the per-assistant opt-in +(Persona.llm_relevance_filter) are both on. The global DISABLE_LLM_CHUNK_FILTER +kill-switch always wins. An explicit skip (from the chat flow) is honored unless +the kill-switch is set. Independent of reranking (LLM-only, no GPU). 
+""" +from types import SimpleNamespace + +import pytest + +from danswer.search.preprocessing import preprocessing as pp + + +def _persona(llm_relevance_filter: bool) -> SimpleNamespace: + return SimpleNamespace(llm_relevance_filter=llm_relevance_filter) + + +@pytest.fixture(autouse=True) +def _reset(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", False) + + +def _resolve(explicit, persona, disable=False): # type: ignore[no-untyped-def] + return pp._resolve_skip_llm_chunk_filter(explicit, persona, disable) + + +# --- kill-switch wins --- +def test_kill_switch_forces_skip(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", True) + assert _resolve(False, _persona(True), disable=True) is True + + +# --- explicit override (chat) honored when not killed --- +def test_explicit_false_runs_filter() -> None: + assert _resolve(False, _persona(False)) is False + + +def test_explicit_true_skips() -> None: + assert _resolve(True, _persona(True)) is True + + +# --- global x per-assistant matrix (explicit None) --- +def test_global_off_persona_on_skips() -> None: + assert _resolve(None, _persona(True)) is True + + +def test_global_on_persona_off_skips(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", True) + assert _resolve(None, _persona(False)) is True + + +def test_global_on_no_persona_skips(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", True) + assert _resolve(None, None) is True + + +def test_global_on_persona_on_runs(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", True) + assert _resolve(None, _persona(True)) is False + + +# --- independence from reranking: relevance filter on with rerank irrelevant --- +def test_independent_of_rerank(monkeypatch: pytest.MonkeyPatch) -> None: + # Relevance filter enabled while reranking is 
globally OFF (no GPU path). + monkeypatch.setattr(pp, "LLM_RELEVANCE_FILTER_ENABLED", True) + monkeypatch.setattr(pp, "RERANK_ENABLED", False) + assert _resolve(None, _persona(True)) is False # filter still runs diff --git a/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_rerank.py b/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_rerank.py new file mode 100644 index 00000000000..1a54be7677f --- /dev/null +++ b/backend/tests/unit/danswer/search/preprocessing/test_resolve_skip_rerank.py @@ -0,0 +1,75 @@ +"""Unit tests for _resolve_skip_rerank — the single resolver that decides +whether a query skips cross-encoder reranking. + +The rule: rerank runs only when the global master switch (RERANK_ENABLED, i.e. +a GPU-backed model server is deployed) AND the per-assistant opt-in +(Persona.rerank_enabled) are both on. An explicit skip_rerank is honored as-is. +A legacy ENABLE_RERANKING_REAL_TIME_FLOW=true forces rerank as a fallback. + +The function reads RERANK_ENABLED / ENABLE_RERANKING_REAL_TIME_FLOW as module +globals at call time, so we monkeypatch them on the module. +""" +from types import SimpleNamespace + +import pytest + +from danswer.search.preprocessing import preprocessing as pp + + +def _persona(rerank_enabled: bool) -> SimpleNamespace: + # Stands in for a Persona; _resolve_skip_rerank only reads .rerank_enabled. + return SimpleNamespace(rerank_enabled=rerank_enabled) + + +@pytest.fixture(autouse=True) +def _reset_flags(monkeypatch: pytest.MonkeyPatch) -> None: + # Default both global flags off so each test sets only what it needs. 
+ monkeypatch.setattr(pp, "RERANK_ENABLED", False) + monkeypatch.setattr(pp, "ENABLE_RERANKING_REAL_TIME_FLOW", False) + + +# --- explicit skip_rerank is always honored, regardless of globals/persona --- + + +def test_explicit_true_is_honored(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "RERANK_ENABLED", True) + assert pp._resolve_skip_rerank(True, _persona(True)) is True + + +def test_explicit_false_is_honored(monkeypatch: pytest.MonkeyPatch) -> None: + # Even with everything off, an explicit "don't skip" wins. + assert pp._resolve_skip_rerank(False, _persona(False)) is False + + +# --- the global x per-assistant matrix (explicit None) --- + + +def test_global_off_persona_on_skips(monkeypatch: pytest.MonkeyPatch) -> None: + # Local / GPU-free default: global off => never rerank, even if opted in. + assert pp._resolve_skip_rerank(None, _persona(True)) is True + + +def test_global_on_persona_off_skips(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "RERANK_ENABLED", True) + assert pp._resolve_skip_rerank(None, _persona(False)) is True + + +def test_global_on_no_persona_skips(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(pp, "RERANK_ENABLED", True) + assert pp._resolve_skip_rerank(None, None) is True + + +def test_global_on_persona_on_reranks(monkeypatch: pytest.MonkeyPatch) -> None: + # The one combination that actually reranks. + monkeypatch.setattr(pp, "RERANK_ENABLED", True) + assert pp._resolve_skip_rerank(None, _persona(True)) is False + + +# --- legacy fallback flag --- + + +def test_legacy_realtime_flag_forces_rerank(monkeypatch: pytest.MonkeyPatch) -> None: + # Back-compat: the old env flag reranks even without the per-assistant opt-in. 
+ monkeypatch.setattr(pp, "ENABLE_RERANKING_REAL_TIME_FLOW", True) + assert pp._resolve_skip_rerank(None, _persona(False)) is False + assert pp._resolve_skip_rerank(None, None) is False diff --git a/backend/tests/unit/danswer/search/test_protected_source_topup.py b/backend/tests/unit/danswer/search/test_protected_source_topup.py new file mode 100644 index 00000000000..a59e09163f9 --- /dev/null +++ b/backend/tests/unit/danswer/search/test_protected_source_topup.py @@ -0,0 +1,63 @@ +"""Unit tests for the source-reserved retrieval top-up. + +`protected_source_topup` is the pure core of the recall guarantee: given the main +candidate set and a source-scoped supplemental retrieval, it decides which (if any) +protected-source chunks to inject so the candidate set holds up to `reserved` of +them. Tested with stub chunks (only `.source_type` and `.unique_id` are read). +""" +from types import SimpleNamespace + +from danswer.configs.constants import DocumentSource +from danswer.search.pipeline import protected_source_topup + +OS = DocumentSource.OUTSYSTEMS +SLACK = DocumentSource.SLACK +WEB = DocumentSource.WEB +PROTECTED = {OS, WEB} + + +def chunk(uid: str, source: DocumentSource) -> SimpleNamespace: + return SimpleNamespace(unique_id=uid, source_type=source) + + +def uids(chunks: list) -> list[str]: + return [c.unique_id for c in chunks] + + +def test_injects_up_to_reserved_when_none_present() -> None: + existing = [chunk(f"s{i}", SLACK) for i in range(50)] + candidates = [chunk(f"o{i}", OS) for i in range(5)] + added = protected_source_topup(existing, candidates, reserved=3, protected_sources=PROTECTED) + assert uids(added) == ["o0", "o1", "o2"] + + +def test_tops_up_only_the_shortfall_when_some_present() -> None: + existing = [chunk("o_present", OS)] + [chunk(f"s{i}", SLACK) for i in range(10)] + candidates = [chunk(f"o{i}", OS) for i in range(5)] + added = protected_source_topup(existing, candidates, reserved=3, protected_sources=PROTECTED) + assert uids(added) == 
["o0", "o1"] # 1 already present -> need 2 more + + +def test_no_injection_when_reservation_already_met() -> None: + existing = [chunk(f"o{i}", OS) for i in range(3)] + [chunk("s", SLACK)] + candidates = [chunk("o_extra", OS)] + assert protected_source_topup(existing, candidates, reserved=3, protected_sources=PROTECTED) == [] + + +def test_skips_non_protected_and_dedupes_against_existing() -> None: + existing = [chunk("dup", OS), chunk("s0", SLACK)] + candidates = [ + chunk("dup", OS), # already present -> skip + chunk("s1", SLACK), # not protected -> skip + chunk("o_new", OS), # inject + chunk("w_new", WEB), # protected (web) -> inject + ] + added = protected_source_topup(existing, candidates, reserved=3, protected_sources=PROTECTED) + assert uids(added) == ["o_new", "w_new"] # 1 present (dup) -> need 2 + + +def test_disabled_when_reserved_zero_or_no_protected_sources() -> None: + existing = [chunk("s", SLACK)] + candidates = [chunk("o", OS)] + assert protected_source_topup(existing, candidates, reserved=0, protected_sources=PROTECTED) == [] + assert protected_source_topup(existing, candidates, reserved=3, protected_sources=set()) == [] diff --git a/backend/tests/unit/danswer/secondary_llm_flows/__init__.py b/backend/tests/unit/danswer/secondary_llm_flows/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/backend/tests/unit/danswer/secondary_llm_flows/test_listwise_chunk_filter.py b/backend/tests/unit/danswer/secondary_llm_flows/test_listwise_chunk_filter.py new file mode 100644 index 00000000000..4f2deca80a3 --- /dev/null +++ b/backend/tests/unit/danswer/secondary_llm_flows/test_listwise_chunk_filter.py @@ -0,0 +1,45 @@ +"""Unit tests for the listwise (one-shot) relevance filter parsing. + +The filter asks the LLM, in a single call, to return a JSON array of the useful +section numbers. 
Parsing must: extract the array even amid prose, treat an empty +array as a valid "none useful", clamp out-of-range numbers, and FAIL OPEN (keep +all chunks) when no array can be parsed. +""" +from danswer.secondary_llm_flows.chunk_usefulness import _parse_useful_indices +from danswer.secondary_llm_flows.chunk_usefulness import llm_eval_chunks_listwise + + +def test_parses_array() -> None: + assert _parse_useful_indices("[1, 3, 4]", count=5) == {1, 3, 4} + + +def test_parses_array_amid_prose() -> None: + assert _parse_useful_indices("Sure! The useful ones are [2, 5].", count=5) == {2, 5} + + +def test_empty_array_means_none_useful() -> None: + # Explicit [] is a valid answer (not a parse failure) → empty set. + assert _parse_useful_indices("[]", count=5) == set() + + +def test_out_of_range_clamped() -> None: + assert _parse_useful_indices("[0, 2, 9]", count=3) == {2} + + +def test_no_array_is_parse_failure() -> None: + # None signals the caller to fail OPEN. + assert _parse_useful_indices("I cannot decide.", count=3) is None + + +def test_empty_input_returns_empty() -> None: + assert llm_eval_chunks_listwise("q", [], llm=None) == [] # type: ignore[arg-type] + + +def test_fail_open_on_llm_error() -> None: + class _BoomLLM: + def invoke(self, *_args: object, **_kwargs: object) -> object: + raise RuntimeError("model down") + + # Any exception → keep all chunks (all True). + out = llm_eval_chunks_listwise("q", ["a", "b", "c"], llm=_BoomLLM()) # type: ignore[arg-type] + assert out == [True, True, True] diff --git a/backend/tests/unit/danswer/server/manage/test_user_preferences.py b/backend/tests/unit/danswer/server/manage/test_user_preferences.py new file mode 100644 index 00000000000..277f9d9151a --- /dev/null +++ b/backend/tests/unit/danswer/server/manage/test_user_preferences.py @@ -0,0 +1,47 @@ +"""Unit tests for opt-out assistant visibility serialization. + +`UserInfo.from_model` is what the frontend reads to learn which assistants a +user has hidden. 
The opt-out model lives in `hidden_assistants`; these tests pin +its serialization (including the None -> [] normalization) without a DB. +""" +from types import SimpleNamespace + +from danswer.auth.schemas import UserRole +from danswer.server.manage.models import UserInfo +from danswer.server.manage.models import UserPreferences + + +def _fake_user( + chosen_assistants: list[int] | None, + hidden_assistants: list[int] | None, +) -> SimpleNamespace: + # from_model only reads these attributes off the ORM user. + return SimpleNamespace( + id="00000000-0000-0000-0000-000000000001", + email="user@example.com", + is_active=True, + is_superuser=False, + is_verified=True, + role=UserRole.BASIC, + chosen_assistants=chosen_assistants, + hidden_assistants=hidden_assistants, + ) + + +def test_from_model_exposes_hidden_assistants() -> None: + info = UserInfo.from_model(_fake_user(chosen_assistants=[2, 1], hidden_assistants=[3])) + assert info.preferences.hidden_assistants == [3] + assert info.preferences.chosen_assistants == [2, 1] + + +def test_from_model_normalizes_null_hidden_to_empty_list() -> None: + # Existing rows / no-preference users must serialize as "nothing hidden" + # (i.e. everything visible) rather than null. + info = UserInfo.from_model(_fake_user(chosen_assistants=None, hidden_assistants=None)) + assert info.preferences.hidden_assistants == [] + + +def test_user_preferences_hidden_defaults_to_none() -> None: + # The field is optional on the wire so older clients/payloads still parse. 
+ prefs = UserPreferences(chosen_assistants=None) + assert prefs.hidden_assistants is None diff --git a/docs/aks-managed-backup.md b/docs/aks-managed-backup.md new file mode 100644 index 00000000000..6469c999608 --- /dev/null +++ b/docs/aks-managed-backup.md @@ -0,0 +1,172 @@ +# Azure-managed AKS Backup — weekly Vespa backup (runbook) + +Goal: weekly, managed backup of **only the Vespa state that can't be cheaply +rebuilt**, using **Azure Backup for AKS** (the `azure-aks-backup` cluster +extension + a Backup vault). Managed identity is used throughout, so there is +**no service-principal secret to expire** — which is exactly what broke the +previous self-managed Velero (`AADSTS7000215: invalid client secret`). + +## What gets backed up + +Postgres (the real source of truth) is **external + Azure-managed** +(`darwin-postgres` flexible server) and is backed up separately. In-cluster, +the only durable, not-cheaply-rebuildable state is Vespa. + +Selected by the label `backup=vespa` (already applied): + +| PVC | Size | Why | +|---|---|---| +| `vespa-var-vespa-content-{0,1,2}` | 3×100Gi | the search index (rebuild = full re-index, hours–days) | +| `vespa-var1-vespa-configserver-{0,1,2}` | 3×5Gi | deployed app package/schema — pairs with content for a turnkey restore | + +Everything else is intentionally excluded: model caches (`indexing-model-*`, +`inference-model-pvc`) re-download on boot; `vespa-logs*` are logs; +`vespa-workspace*` is deploy scratch; `dynamic-pvc`/`file-connector-pvc` are +unmounted/unused (Postgres+Blob file store); Redis is cache. + +**Consistency caveat:** CSI/disk snapshots are point-in-time *per volume*, not +coordinated across the 3 content nodes. Vespa's redundancy + startup recovery +usually makes a restore fine, but treat this as a **fast-DR accelerator** — the +authoritative recovery path is still re-indexing from Postgres. 
+ +## Environment (real values) + +``` +SUBSCRIPTION 202e5d15-5356-4826-bc61-ebd449d12e34 (Internal-Production-EA) +TENANT d8353d2a-b153-4d17-8827-902c51f72357 +AKS name=darwin rg=darwin nodeRG=MC_darwin_darwin_westeurope identity=SystemAssigned +LOCATION westeurope +NAMESPACE darwin +PVC SELECTOR backup=vespa (already labeled on the 6 PVCs) +``` + +### Two hard constraints from this environment + +1. **The `darwin` RG has a `CanNotDelete` lock.** Backup retention *deletes* + old snapshots + the snapshot-RG contents, so the **snapshot resource group + and the backup storage account must live in a SEPARATE, unlocked RG** + (below: `darwin-backup`). Never point the snapshot RG at `darwin`. +2. **You (PIM Contributor) cannot do role assignments or Trusted Access + bindings.** Steps tagged **[OWNER]** require Owner / User Access + Administrator. Contributor-only steps are tagged **[CONTRIB]**. + +## Steps + +```bash +SUB=202e5d15-5356-4826-bc61-ebd449d12e34 +LOC=westeurope +BKP_RG=darwin-backup # unlocked RG for backup artifacts + snapshots +SA=darwinaksbkp$RANDOM # storage account (globally-unique, <=24 chars, lowercase) +VAULT=darwin-bkp-vault +AKS_ID=$(az aks show -g darwin -n darwin --query id -o tsv) + +# 1. [CONTRIB] providers +az provider register --namespace Microsoft.KubernetesConfiguration +az provider register --namespace Microsoft.DataProtection + +# 2. [CONTRIB] unlocked RG + storage for backup metadata (Velero store) + snapshots +az group create -n $BKP_RG -l $LOC +az storage account create -n $SA -g $BKP_RG -l $LOC --sku Standard_LRS --min-tls-version TLS1_2 +az storage container create -n aks-backup --account-name $SA --auth-mode login + +# 3. 
[CONTRIB] install the managed backup extension (runs Velero in dataprotection-microsoft ns) +az k8s-extension create -g darwin -c darwin --cluster-type managedClusters \ + --extension-type Microsoft.DataProtection.Kubernetes --name azure-aks-backup \ + --release-train stable --scope cluster \ + --config blobContainer=aks-backup storageAccount=$SA \ + storageAccountResourceGroup=$BKP_RG storageAccountSubscriptionId=$SUB + +# 4. [OWNER] grant the extension MSI write access to the backup storage account +EXT_MSI=$(az k8s-extension show -g darwin -c darwin --cluster-type managedClusters \ + -n azure-aks-backup --query aksAssignedIdentity.principalId -o tsv) +SA_ID=$(az storage account show -n $SA -g $BKP_RG --query id -o tsv) +az role assignment create --assignee $EXT_MSI --role "Storage Account Contributor" --scope $SA_ID + +# 5. [CONTRIB] Backup vault (system-assigned identity) +az dataprotection backup-vault create -g $BKP_RG --vault-name $VAULT -l $LOC \ + --type SystemAssigned \ + --storage-settings datastore-type="VaultStore" type="LocallyRedundant" +VAULT_ID=$(az dataprotection backup-vault show -g $BKP_RG --vault-name $VAULT --query id -o tsv) +VAULT_MSI=$(az dataprotection backup-vault show -g $BKP_RG --vault-name $VAULT --query identity.principalId -o tsv) + +# 6. [OWNER] Trusted Access rolebinding (cluster <-> vault) — creates a role assignment +az aks trustedaccess rolebinding create -g darwin --cluster-name darwin \ + -n darwin-backup-binding --source-resource-id $VAULT_ID \ + --roles Microsoft.DataProtection/backupVaults/backup-operator + +# 7. 
[OWNER] vault MSI roles: read the cluster, snapshot the disks, write blobs +az role assignment create --assignee $VAULT_MSI --role "Reader" --scope $AKS_ID +az role assignment create --assignee $VAULT_MSI --role "Disk Snapshot Contributor" \ + --scope $(az group show -n $BKP_RG --query id -o tsv) +az role assignment create --assignee $VAULT_MSI --role "Storage Account Contributor" --scope $SA_ID +# extension MSI also needs read on the source disks' node RG: +az role assignment create --assignee $EXT_MSI --role "Contributor" \ + --scope $(az group show -n MC_darwin_darwin_westeurope --query id -o tsv) + +# 8. [CONTRIB] WEEKLY policy (edit the default template's schedule -> weekly + retention) +az dataprotection backup-policy get-default-policy-template \ + --datasource-type AzureKubernetesService > policy.json +# edit policy.json: trigger schedule -> "R/2024-01-07T02:00:00+00:00/P1W" (weekly, Sun 02:00), +# default retention e.g. 4–8 weeks. Then: +az dataprotection backup-policy create -g $BKP_RG --vault-name $VAULT \ + -n weekly-vespa --policy policy.json + +# 9. 
[CONTRIB] backup instance — namespace=darwin, label=backup=vespa, snapshots ON, +# snapshot RG = the UNLOCKED $BKP_RG (NOT darwin) +POLICY_ID=$(az dataprotection backup-policy show -g $BKP_RG --vault-name $VAULT \ + -n weekly-vespa --query id -o tsv) +az dataprotection backup-instance initialize-backupconfig \ + --datasource-type AzureKubernetesService \ + --label-selectors backup=vespa --included-namespaces darwin \ + --snapshot-volumes true --include-cluster-scope-resources false > backupconfig.json +az dataprotection backup-instance initialize \ + --datasource-id $AKS_ID --datasource-location $LOC \ + --datasource-type AzureKubernetesService --policy-id $POLICY_ID \ + --backup-configuration ./backupconfig.json \ + --friendly-name darwin-vespa-weekly \ + --snapshot-resource-group-name $BKP_RG > backupinstance.json +az dataprotection backup-instance create -g $BKP_RG --vault-name $VAULT \ + --backup-instance backupinstance.json + +# 10. [CONTRIB] trigger an on-demand backup to validate end-to-end +az dataprotection backup-instance adhoc-backup -g $BKP_RG --vault-name $VAULT \ + --backup-instance-name <backup-instance-name> \ + --rule-name <rule-name> +``` + +## Validation + +```bash +# extension healthy + Velero pods up +az k8s-extension show -g darwin -c darwin --cluster-type managedClusters -n azure-aks-backup \ + --query provisioningState -o tsv +kubectl get pods -n dataprotection-microsoft +# backup instance status +az dataprotection backup-instance list -g darwin-backup --vault-name darwin-bkp-vault \ + --query "[].{name:name, status:properties.protectionStatus.status}" -o table +``` + +## Owner hand-off (the [OWNER] steps, minimal set) + +Hand these to whoever holds Owner / User Access Administrator on sub +`202e5d15-…`.
All scoped to the backup artifacts, none touch the locked +`darwin` RG's existing resources: + +- `Storage Account Contributor` for the **extension MSI** on the backup SA (step 4) +- `aks trustedaccess rolebinding` cluster↔vault (step 6) +- vault MSI: `Reader` on AKS, `Disk Snapshot Contributor` on `darwin-backup`, + `Storage Account Contributor` on the backup SA (step 7) +- extension MSI: `Contributor` on `MC_darwin_darwin_westeurope` (to snapshot the + source disks) (step 7) + +## Notes + +- The 6 source PVCs are already labeled `backup=vespa` + (`kubectl get pvc -n darwin -l backup=vespa`). Adjust scope by relabeling. +- Old self-managed Velero (namespace `velero`, CRDs, schedule `daily-backups`, + 373 backup CRs) and its 130 stale Azure disk snapshots in + `MC_darwin_darwin_westeurope` were removed on 2026-06-11. The old backup + *blobs* in storage account `darwinaksbackup` are NOT deleted by that — drop + that storage account separately if it's no longer needed. + + diff --git a/docs/how-darwin-answers-questions.md b/docs/how-darwin-answers-questions.md new file mode 100644 index 00000000000..a54c125effa --- /dev/null +++ b/docs/how-darwin-answers-questions.md @@ -0,0 +1,310 @@ +# How Darwin Answers a Question + +A visual tour of how Darwin turns a question (in **Slack** or the **web chat**) +into a **grounded, cited answer** — and the knobs that change that behavior. +Written for a mixed audience: enough pictures for a stakeholder conversation, +enough specifics that engineers trust it. + +> **TL;DR** — Darwin is **not** a "stuff some chunks into a prompt" toy RAG. +> Every question runs through hybrid retrieval → recency-aware ranking → an +> optional neural reranker → an LLM relevance filter → answer-grounding +> guardrails — all configurable **per assistant**. That depth is where answer +> quality and trust come from. 
+> +> Engineering deep-dive (file:line, exact ranking math, rollout plan): +> [`search-quality-reranking-and-recency.md`](./search-quality-reranking-and-recency.md). + +--- + +## 1. The moving parts + +```mermaid +flowchart LR + SL["💬 Slack"]:::surface + WEB["🖥️ Web chat"]:::surface + + subgraph CORE["🧩 Answering core"] + direction TB + API["⚙️ API server<br/>orchestration"]:::core + PIPE["🔎 Search pipeline"]:::core + API --> PIPE + end + + MS["🧠 Inference model server<br/>embeddings · intent · reranker"]:::model + VES[("📚 Vespa<br/>vector + keyword index")]:::store + LLM["✨ LLM provider"]:::llm + PG[("🗄️ Postgres")]:::store + REDIS[("⚡ Redis<br/>cache · rate-limit")]:::store + + SL --> API + WEB --> API + PIPE -->|"embed query"| MS + PIPE -->|"hybrid search"| VES + PIPE -->|"generate"| LLM + API -.-> PG + API -.-> REDIS + + subgraph ING["📥 Ingestion · runs continuously"] + direction LR + CONN["🔌 Connectors<br/>Slack · Jira · Web · Salesforce …"]:::surface + DASK["🧵 Dask workers<br/>chunk + embed"]:::core + CONN --> DASK + end + DASK -->|"write chunks + vectors"| VES + + classDef surface fill:#dbeafe,stroke:#2563eb,color:#1e3a8a,stroke-width:1px + classDef core fill:#ede9fe,stroke:#7c3aed,color:#4c1d95,stroke-width:1px + classDef model fill:#fef3c7,stroke:#d97706,color:#7c2d12,stroke-width:1px + classDef store fill:#f1f5f9,stroke:#64748b,color:#334155,stroke-width:1px + classDef llm fill:#fae8ff,stroke:#c026d3,color:#701a75,stroke-width:1px + style CORE fill:#faf5ff,stroke:#a78bfa,color:#4c1d95 + style ING fill:#f0fdf4,stroke:#86efac,color:#14532d +``` + +Two independent halves: +- **📥 Ingestion (always on):** connectors pull documents → Dask workers chunk & + embed them → everything lands in **Vespa**. (Smart enough to *skip* + re-indexing unchanged content.) +- **🧩 Answering (per question):** the path in §2. + +--- + +## 2. The journey of a question + +The same pipeline serves **both** Slack and web chat — they differ only in entry +point and presentation, not in how retrieval/ranking work. + +```mermaid +sequenceDiagram + autonumber + actor U as 👤 User + participant API as ⚙️ API + participant MS as 🧠 Models + participant V as 📚 Vespa + participant R as 🎯 Reranker + participant L as ✨ LLM + + U->>API: question + chosen assistant + Note over API: preprocess — filters,<br/>recency, rerank decision + + rect rgb(220, 252, 231) + Note over API,V: ① RETRIEVE + API->>MS: embed the question + API->>V: hybrid search<br/>(meaning + keywords, recency-weighted) + V-->>API: ≈ 50 candidate chunks + end + + rect rgb(254, 243, 199) + Note over API,R: ② RERANK · only if enabled for this assistant + API->>R: re-score top 15<br/>(question + chunk text together) + R-->>API: reordered by true relevance + end + + rect rgb(250, 232, 255) + Note over API,L: ③ GENERATE + API->>API: relevance filter → keep best ≈ 10 + API->>L: prompt grounded in those chunks + L-->>API: streamed answer + citations + end + + rect rgb(254, 226, 226) + Note over API: 🛡️ guardrail — no citations ⇒ don't answer + end + API-->>U: ✅ grounded answer with sources +``` + +--- + +## 3. The retrieval funnel — where quality comes from + +A question doesn't get "the top chunks." It gets **progressively narrowed** by +increasingly precise (and expensive) stages — broad recall first, sharp +precision last: + +``` + hundreds of thousands of indexed chunks + ████████████████████████████████████████████████ KNOWLEDGE BASE + │ ① hybrid retrieval (meaning + keywords + recency) + ██████████████████████████ ≈ 50 candidates + │ ② cross-encoder rerank (question + chunk together) + ██████████████ top 15, re-scored + │ ③ LLM relevance filter + token budget + ████████ ≈ 10 best chunks + │ ④ grounded generation + citation check + ▼ + ✅ 1 trustworthy answer +``` + +```mermaid +flowchart TB + CORP["📚 Entire knowledge base
(100k+ chunks)"]:::broad + CORP --> S1["① Hybrid retrieval · Vespa
vector + keyword, recency-weighted"]:::retrieve + S1 --> C50(["≈ 50 candidates"]):::count + C50 --> S2["② Cross-encoder reranker
reads question + chunk together"]:::rerank + S2 --> C15(["top 15 re-scored"]):::count + C15 --> S3["③ LLM relevance filter
+ token budget"]:::filter + S3 --> C10(["≈ 10 best chunks"]):::count + C10 --> S4["④ Grounded LLM generation"]:::llm + S4 --> ANS(["✅ 1 cited answer"]):::answer + + classDef broad fill:#f1f5f9,stroke:#64748b,color:#334155 + classDef retrieve fill:#dcfce7,stroke:#16a34a,color:#14532d + classDef rerank fill:#fef3c7,stroke:#d97706,color:#7c2d12 + classDef filter fill:#ffedd5,stroke:#ea580c,color:#7c2d12 + classDef llm fill:#fae8ff,stroke:#c026d3,color:#701a75 + classDef count fill:#ffffff,stroke:#94a3b8,color:#0f172a,stroke-dasharray:3 3 + classDef answer fill:#bbf7d0,stroke:#15803d,color:#14532d,stroke-width:2px +``` + +Why each stage matters: + +| Stage | What it does | Why it's not trivial | +|---|---|---| +| **① Hybrid retrieval** | Finds candidates by **meaning** (vector) *and* **exact terms** (keyword/BM25), fused, then weighted by recency | Pure vector misses exact IDs/error codes; pure keyword misses paraphrases. Fusion catches both. | +| **② Reranking** | A neural **cross-encoder** reads the question and each chunk *together* and re-scores | Retrieval embeds the chunk *before* it sees your question; the reranker judges actual relevance and fixes "right doc, ranked too low." | +| **③ Relevance filter** | An LLM pass drops off-topic chunks before answering | Stops near-misses from diluting the prompt → fewer confident-but-wrong answers. | +| **④ Grounded generation** | Answer built only from selected chunks, with citations | If it can't cite, Darwin **stays silent** rather than hallucinate. | + +--- + +## 4. Hybrid search, in one picture + +Every candidate's score blends two signals, then is nudged by freshness and human +feedback: + +```mermaid +flowchart LR + SEM["🧭 Semantic similarity
meaning match (vectors)"]:::sem + KW["🔤 Keyword match
exact terms (BM25)"]:::kw + SEM -->|"× α"| MIX(("➕ blend")):::mix + KW -->|"× (1 − α)"| MIX + MIX --> BOOST["👍 feedback boost
(promote / bury docs)"]:::boost + BOOST --> REC["🕒 recency factor
(newer scores higher)"]:::rec + REC --> SCORE(["⭐ final relevance score"]):::score + + classDef sem fill:#dbeafe,stroke:#2563eb,color:#1e3a8a + classDef kw fill:#dcfce7,stroke:#16a34a,color:#14532d + classDef mix fill:#ede9fe,stroke:#7c3aed,color:#4c1d95 + classDef boost fill:#fef3c7,stroke:#d97706,color:#7c2d12 + classDef rec fill:#cffafe,stroke:#0891b2,color:#155e75 + classDef score fill:#bbf7d0,stroke:#15803d,color:#14532d,stroke-width:2px +``` + +> `score = ( α · semantic + (1 − α) · keyword ) × feedback_boost × recency` +> — **α** is the dial between meaning and exact terms (default leans semantic). + +This is the *search engine* under the chatbot — a real ranking system, not a +nearest-neighbor lookup. + +--- + +## 5. Recency — preferring fresh knowledge + +Two equally-relevant docs shouldn't tie when one is from last week and one is +three years old. Darwin multiplies each score by a **recency factor** that decays +with document age: + +> `recency_factor = max( 1 / (1 + decay × age_in_years), 0.75 )` + +``` +score multiplier by document age (1.00 = full score) + + new 6 mo 1 yr 2 yr+ +default ██████ █████▍ █████ ████▌ 1.00 → 0.89 → 0.80 → 0.75 (floor) +favor- ██████ █████ ████▌ ████▌ 1.00 → 0.80 → 0.75 → 0.75 +recent +``` + +- A **soft nudge**, not a cliff — a clearly-better old doc can still win. +- The **0.75 floor** caps the penalty at 25% (tunable). Decay strength is set + **per assistant** (`favor_recent` is more aggressive) or globally. +- The lever for "prefer recent" without discarding authoritative older content. + +--- + +## 6. The knobs — same engine, different behavior + +Darwin's behavior is **configured, not hardcoded** — globally and **per +assistant** (each assistant has its own knowledge scope, prompt, and ranking +behavior). 
+ +| Knob | Where | Default | When on | +|---|---|---|---| +| **Neural reranking** | global switch **and** per-assistant toggle | raw hybrid order | top candidates reordered by a cross-encoder (sharper relevance) | +| **Recency preference** | per assistant | mild decay | stronger tilt toward recent docs | +| **LLM relevance filter** | per assistant | — | off-topic chunks dropped pre-answer | +| **Citations required** | per Slack channel | — | no citations ⇒ **no answer** (won't bluff) | +| **Source diversity** | automatic (global) | on | guarantees curated KB/web docs aren't crowded out of the prompt by a chatty source | +| **Guardrails** | built in | — | ACL filtering · rate limiting · retry/backoff | + +```mermaid +flowchart LR + QQ(["❓ Same question"]):::q + QQ --> A1["🅰️ Assistant A
rerank OFF · broad scope"]:::dim + QQ --> A2["🅱️ Assistant B
rerank ON · favor-recent · curated"]:::bright + A1 --> R1(["fast · recall-oriented"]):::dimout + A2 --> R2(["sharper · fresher · more precise"]):::brightout + + classDef q fill:#e0e7ff,stroke:#4f46e5,color:#312e81,stroke-width:2px + classDef dim fill:#f1f5f9,stroke:#94a3b8,color:#475569 + classDef bright fill:#fef3c7,stroke:#d97706,color:#7c2d12,stroke-width:2px + classDef dimout fill:#f8fafc,stroke:#cbd5e1,color:#64748b + classDef brightout fill:#dcfce7,stroke:#16a34a,color:#14532d,stroke-width:2px +``` + +This is what makes a **controlled rollout** possible: enable a capability on +*one* assistant, compare answers side-by-side, then make it the default — no +big-bang switch. + +--- + +## 7. Why this isn't a toy RAG + +A weekend RAG demo is: embed docs → nearest-neighbor → stuff prompt. Darwin adds +the parts that decide whether answers are **trustworthy at scale**: + +```mermaid +flowchart TB + subgraph TOY["🧪 Toy RAG"] + direction TB + T1["vector nearest-neighbor"]:::t --> T2["stuff into prompt"]:::t --> T3["hope it's right"]:::t + end + subgraph DARWIN["🦾 Darwin"] + direction TB + D1["hybrid retrieval · meaning + keywords"]:::d + D2["two-stage ranking · recall → precision"]:::d + D3["recency-aware scoring"]:::d + D4["LLM relevance filtering"]:::d + D5["grounded + cited · suppress if unsure"]:::d + D6["per-assistant config · ACL · rate-limit · retention"]:::d + D1 --> D2 --> D3 --> D4 --> D5 --> D6 + end + classDef t fill:#fee2e2,stroke:#dc2626,color:#7f1d1d + classDef d fill:#dcfce7,stroke:#16a34a,color:#14532d + style TOY fill:#fef2f2,stroke:#fca5a5,color:#7f1d1d + style DARWIN fill:#f0fdf4,stroke:#86efac,color:#14532d +``` + +- **Two-signal hybrid retrieval** (meaning *and* keywords), not just vectors. +- **Two-stage ranking** — cheap recall (bi-encoder) then expensive precision + (cross-encoder) — the standard of serious search systems. +- **Recency-aware ranking** so stale content doesn't masquerade as current. 
+- **LLM relevance filtering** before generation. +- **Grounded, cited answers with hallucination suppression** — would rather say + nothing than make something up. +- **Per-assistant configurability** — different teams, knowledge scopes, prompts, + and ranking behavior from one platform. +- **Enterprise plumbing** — permissions/ACL, rate limiting, retention, a broad + connector ecosystem, and a horizontally-scaled indexing pipeline. +- **Operational depth** — separate embedding/indexing/reranking model servers, + Redis caching, GPU-served reranking (TEI), incremental & measurable rollouts. + +Each layer is a deliberate quality or trust decision. That's the "meat": the gap +between a chatbot that *sounds* right and one you can put in front of the +business. + +--- + +*Companion deep-dive:* +[`search-quality-reranking-and-recency.md`](./search-quality-reranking-and-recency.md) +*(architecture, exact ranking math, file references, rollout plan).* diff --git a/docs/search-quality-reranking-and-recency.md b/docs/search-quality-reranking-and-recency.md new file mode 100644 index 00000000000..6c20f19097d --- /dev/null +++ b/docs/search-quality-reranking-and-recency.md @@ -0,0 +1,508 @@ +# Search Answer Quality: Reranking, Recency, and Retrieval Prioritization + +> Design + investigation notes for branch **`feature/improve-queries`**. +> Goal: improve the **reliability of answers** (chat + Slack), give a path to +> **prioritize recent documents**, and do it **incrementally / A-B-comparably** +> rather than flipping a global switch. All findings below were verified against +> the code on this branch; file:line anchors are approximate (they drift as the +> code changes) but point at the right place. + +--- + +## 1. What problem this branch solves + +The fork retrieves well but **ranks and reranks poorly by default**, which costs +answer quality: + +1. 
**Cross-encoder reranking is OFF by default** — the LLM receives chunks in raw + bi-encoder hybrid order, so a genuinely-relevant chunk ranked #12 by vector + similarity never reaches the prompt (only ~10 chunks fit). +2. **A fork-specific "prioritized source" hack biases retrieval** — it runs a + second, source-filtered Vespa query and merges it, but because Vespa's + `normalize_linear` scoring is *relative to each query's candidate set*, the + narrow second query's scores are inflated and `web`/`sfkbarticles` get lifted + to the top regardless of true relevance. Default-on for every query. (This was + removed — but the *recall* goal it served was later restored with a bounded, + scope-safe pass that doesn't depend on the inflation artifact; see §5.1.) +3. **Recency is only a gentle decay** (and the `auto` setting is effectively + dead — see §4), so there's no real "prefer recent" behavior. +4. There was **no way to roll any of this out gradually** or compare old vs new. + +The branch adds a **two-level rerank gate** (global infra flag + per-assistant +toggle), splits retrieval so reranking gets clean (unbiased) candidates, and +documents the recency levers — without changing behavior for un-opted assistants +or the GPU-free local setup. + +--- + +## 2. The query/answer path (verified) + +Both surfaces converge on the same pipeline: + +``` +Chat: /chat send-message ─┐ + ├─► SearchTool.run ─► SearchPipeline +Slack: handle_message ─────┘ │ + ├─ retrieval_preprocessing (builds SearchQuery) + ├─ VespaIndex.hybrid_retrieval → _query_vespa (retrieve) + ├─ search_postprocessing (rerank + LLM relevance filter) + └─ prune_documents (token budget → ~10 chunks → LLM) +``` + +Key files: +- `danswer/tools/search/search_tool.py` — builds `SearchRequest` (sets `persona`, + leaves `skip_rerank=None`), runs `SearchPipeline`. +- `danswer/search/pipeline.py` — orchestrates the stages. 
+- `danswer/search/preprocessing/preprocessing.py` — `retrieval_preprocessing` + builds the `SearchQuery`; resolves filters, recency multiplier, and + `skip_rerank` (see §3, §5). +- `danswer/search/retrieval/search_runner.py` — `doc_index_retrieval` calls + `hybrid_retrieval`. +- `danswer/document_index/vespa/index.py` — `hybrid_retrieval` → `_query_vespa` + (the Vespa query); `danswer_chunk.sd` is the rank profile. +- `danswer/search/postprocessing/postprocessing.py` — `semantic_reranking`, + `rerank_chunks`, `filter_chunks`. +- `danswer/danswerbot/slack/handlers/handle_message.py` — Slack entry; now passes + `skip_rerank=None` so it shares the chat resolver. + +### Slack-specific guardrails worth knowing +- No-citations ⇒ **answer suppressed** (primary hallucination guard). +- `@retry(tries=5)` on answer generation; up-to-5× full re-execution on missing + citations (latency/cost + orphaned chat sessions — a known cost). +- Slack **bypasses ACL** for channels with document sets. + +--- + +## 3. Reranking: what it actually does + +**Bi-encoder (default, today):** Vespa scores each chunk as +`alpha·vector_similarity + (1-alpha)·BM25`, then `× document_boost × recency_bias`. +`vector_similarity` is cosine between the query embedding and each chunk's +*pre-computed* embedding — fast, but it can't model fine-grained query↔chunk +interaction. + +**Cross-encoder (reranking):** feeds the query **and** each chunk's *text* +together through a transformer (`mxbai-rerank-xsmall-v1` by default) so attention +runs across both → a far more accurate relevance judgment. Standard +retrieve-broad-then-rerank. + +**The real flow (corrected mental model):** +1. Vespa returns `NUM_RETURNED_HITS = 50` (a single all-sources query) — **not 15**. +2. `rerank_chunks` reranks only the **top 15** (`NUM_RERANKED_RESULTS`, + `chunks_to_rerank[:num_rerank]`); the rest get `score=None`, appended behind. +3. 
`semantic_reranking` computes the cross-encoder score, then + **`boosted = cross_encoder_score × document_boost × recency_bias`** + (`postprocessing.py:~74`) and re-sorts. So recency/boost are **re-applied** on + top of the cross-encoder score — it's not pure cross-encoder. +4. Token-budget prune → ~10 chunks to the LLM (LLM-relevant ones hoisted first). + +So **Vespa's score still selects *which* 15 are candidates**; the cross-encoder +reorders within that set. This is why a biased candidate set (see §5) matters. + +**Flags (both default `false` → rerank off everywhere):** +- Slack: `skip_rerank = not ENABLE_RERANKING_ASYNC_FLOW` +- Chat: `skip_rerank = not ENABLE_RERANKING_REAL_TIME_FLOW` + +This branch supersedes both with `RERANK_ENABLED` + per-assistant (see §6); the +old flags remain as a fallback. + +--- + +## 4. Recency / freshness + +The decay machinery **already exists and is identical to upstream Onyx** — there +is nothing to "catch up" on; the lever is *tuning* it. + +Vespa rank profile (`danswer_chunk.sd`): +``` +document_age = max(if(isNan(doc_updated_at),7890000, now()-doc_updated_at)/31536000, 0) # years +recency_bias = max(1 / (1 + query(decay_factor) * document_age), 0.75) # floored at 0.75 +# global-phase: (alpha·norm(vector) + (1-alpha)·norm(keyword)) * document_boost * recency_bias +``` +- `decay_factor = DOC_TIME_DECAY(0.5) × recency_bias_multiplier`. +- **Floor 0.75** ⇒ an old doc loses *at most 25%* of its score. It only *decays* + old docs; it never *boosts* fresh ones. + +`recency_bias_multiplier` per persona (`preprocessing.py:~176`): `no_decay`→0, +`base_decay`→0.5, `favor_recent`→1.0, `auto`→LLM-predicted. 
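
To make the decay arithmetic concrete, here is a minimal Python sketch of the `recency_bias` expression above. It is an illustration only: the function name, the `MULTIPLIERS` table, and the constants are hypothetical stand-ins that mirror the values quoted in this section (`DOC_TIME_DECAY = 0.5`, floor `0.75`), and `auto` is left out because its multiplier is LLM-predicted.

```python
# Hypothetical constants mirroring the values quoted in this section.
DOC_TIME_DECAY = 0.5
RECENCY_FLOOR = 0.75
MULTIPLIERS = {"no_decay": 0.0, "base_decay": 0.5, "favor_recent": 1.0}


def recency_bias(age_years: float, setting: str = "base_decay") -> float:
    """Score multiplier for a document of the given age (in years)."""
    # decay_factor = DOC_TIME_DECAY x recency_bias_multiplier
    decay_factor = DOC_TIME_DECAY * MULTIPLIERS[setting]
    age = max(age_years, 0.0)  # the rank profile clamps negative ages to 0
    # Hyperbolic decay, floored so an old doc loses at most 25% of its score.
    return max(1.0 / (1.0 + decay_factor * age), RECENCY_FLOOR)


if __name__ == "__main__":
    for setting in MULTIPLIERS:
        row = [round(recency_bias(age, setting), 2) for age in (0.0, 0.5, 1.0, 2.0)]
        print(f"{setting:>12}: {row}")
```

Under these assumptions the base setting reaches the floor between one and two years of age, while `favor_recent` reaches it within the first year, which is why lowering the floor is listed as a separate lever.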
+ +**Gotcha — `auto` is effectively dead.** The LLM time-filter auto-detection +(`enable_auto_detect_filters`) is **never threaded through**: `handle_message` +sets it on `RetrievalDetails`, but `SearchTool` doesn't copy it into +`SearchRequest` and `pipeline.py` doesn't pass it to `retrieval_preprocessing` +(whose param defaults `False`). So personas set to `recency_bias: "auto"` (the +seeded default) fall back to *base* decay and never favor recent. → **use +`favor_recent` for a deterministic recency preference.** + +**Levers to actually prefer recent (by effort):** +| Lever | Effect | Cost | +|---|---|---| +| persona `recency_bias: favor_recent` | doubles decay rate | config only | +| lower the `0.75` floor in `danswer_chunk.sd` | old docs decay further | Vespa schema redeploy | +| raise `DOC_TIME_DECAY` env | sharper 0–2yr decay | env only (floor-capped) | +| add a real `freshness()` boost term | lifts new docs | schema change + **reindex** | + +**Reranking weakens recency further** (a 0.75–1.0 multiplier barely moves a wide +cross-encoder score spread). So if recency matters, tune decay **separately** +from the rerank rollout, and measure independently. + +--- + +## 5. Source prioritization & authoritative citations + +**The requirement:** a chatty source (e.g. a busy Slack channel) shouldn't crowd +authoritative **curated** content (`web`/docs, `sfkbarticles`, `highspot`, +`outsystems`) out of the answer — neither out of the prompt nor out of the +**citations** the user sees. + +This is a **layered pipeline**, all global + config-gated (no per-assistant knob), +all keyed off **`PROTECTED_SOURCES`** (prod: `web,sfkbarticles,highspot,outsystems`). +Each layer was added because the prior one was necessary-but-insufficient — they +move a doc from *retrieved* → *in the prompt* → *cited*. Everything below applies to +**both** the chat and Slack flows (they share `SearchPipeline` and the `Answer` +object) and **every** assistant. 
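
As orientation for the two final-selection layers detailed in §5.2 and §5.3, the sketch below shows front-promotion followed by a per-source cap on a ranked list. It is a simplified illustration under stated assumptions: `Doc`, `promote_protected`, and `cap_per_source` are hypothetical names, and the real `ensure_source_diversity` / `cap_docs_per_source` in `doc_pruning.py` may differ in detail.

```python
from collections import Counter
from typing import NamedTuple


class Doc(NamedTuple):
    """Hypothetical stand-in for a ranked document."""
    doc_id: str
    source: str


def promote_protected(docs: list[Doc], protected: set[str], reserved_slots: int) -> list[Doc]:
    """Move up to `reserved_slots` of the highest-ranked protected docs to the front."""
    if reserved_slots <= 0:  # 0 disables the layer
        return list(docs)
    promoted = [d for d in docs if d.source in protected][:reserved_slots]
    promoted_ids = {d.doc_id for d in promoted}
    # Everything else keeps its original relative order.
    rest = [d for d in docs if d.doc_id not in promoted_ids]
    return promoted + rest


def cap_per_source(docs: list[Doc], max_per_source: int) -> list[Doc]:
    """Keep only the top `max_per_source` docs of each source, in rank order."""
    if max_per_source <= 0:  # 0 disables the layer
        return list(docs)
    seen: Counter[str] = Counter()
    kept = []
    for d in docs:
        if seen[d.source] < max_per_source:
            kept.append(d)
            seen[d.source] += 1
    return kept


ranked = [
    Doc("s1", "slack"), Doc("s2", "slack"), Doc("w1", "web"),
    Doc("s3", "slack"), Doc("k1", "sfkbarticles"), Doc("s4", "slack"),
]
front = promote_protected(ranked, {"web", "sfkbarticles"}, reserved_slots=3)
final = cap_per_source(front, max_per_source=2)
```

The order matters in this sketch: promotion runs first so the cap is applied to the already-reordered list, matching the sequence described for `_apply_pruning`.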
+ +> Historical note: the fork once ran a **two-query union** in `_query_vespa` (an +> all-sources query plus a source-filtered one). It was removed because the rank +> profile's `normalize_linear(...)` is min-max **relative to each query's candidate +> set**, so the narrow second query over-promoted its docs regardless of true +> relevance. The *recall goal* it served is now met by §5.1 — without that bug. + +### 5.1 Recall — `SOURCE_RESERVED_RETRIEVAL_SLOTS` (retrieval) +Final-selection promotion (§5.2) can only reorder docs retrieval already returned. +When a chatty source saturates the top-`NUM_RETURNED_HITS=50`, a relevant curated +doc may not be in the candidate set at all. `SearchPipeline._supplement_protected_sources` +(pure core `protected_source_topup`) runs **one extra source-scoped retrieval** when +fewer than N protected-source chunks are present, and merges the top results in. It +reuses the **same filters** as the main query (ACL + persona document-set fence) and +only **adds** a `source_type` restriction — so it never widens scope; it guarantees +**presence** (ordering is §5.2, so it doesn't rely on the old normalize artifact). +prod `=6`, code default `0` (off). *Note: 6 because a relevant protected doc can rank +#4 among protected sources and miss a smaller cut.* + +### 5.2 Prompt position — `SOURCE_DIVERSITY_RESERVED_SLOTS` (final selection) +`ensure_source_diversity` in `doc_pruning.py` (in `_apply_pruning`, after the +relevance reorder, before the token cut) promotes up to N of the highest-ranked +`PROTECTED_SOURCES` docs to the **front** of the prompt, preserving the rest of the +order. prod `=3` (was 2). Disable with `=0`. + +### 5.3 Prompt balance — `MAX_PROMPT_DOCS_PER_SOURCE` (final selection) +Even with curated docs at the front, a prompt of `3 curated + 49 Slack` lets the LLM +ground every claim in the dominant source. 
`cap_docs_per_source` (in `_apply_pruning`, +after `ensure_source_diversity`) keeps the top-N docs **per source** and drops the +rest before the token cut. prod `=8`, code default `0`. Only binds when a source +dominates (single-source assistants unaffected). + +### 5.4 Citation preference — authoritative-sources nudge (prompt) +A soft, global instruction (`build_authoritative_sources_reminder` in +`prompt_utils.py`, appended to the shared `CITATION_REMINDER` via +`build_task_prompt_reminders`, derived from `PROTECTED_SOURCES`) asks the model to +prefer citing authoritative sources over chat discussions when they support the +point. Soft — it nudges, it doesn't guarantee. + +### 5.5 Citation guarantee — verify-then-retain (`AUTHORITATIVE_CITATION_RETENTION_ENABLED`) +**The hard lesson:** presence/position/balance (5.1–5.3) reliably get curated docs +*into the prompt and into the answer's content*, but **citation attribution is a +separate, harder problem**. With curated docs at prompt positions [1][2][3], the LLM +still cited the near-duplicate Slack threads — and *no* prompt lever (soft nudge, +mandatory "you MUST cite", grouped output) reliably flipped it (the grouped variant +even mislabeled). Citations are the LLM's output; a prompt is a request it can ignore. + +So we add a **deterministic post-generation step** in `Answer._process_stream` +(`authoritative_retention.py`): for any **uncited** authoritative doc in context, one +batched LLM call checks whether it's relevant, and relevant ones are appended as an +**"Authoritative sources" footer** (markdown links; renders in chat + Slack). 
It is: +- **additive** (the LLM's own inline citations are untouched); +- **gated** to *uncited* authoritative docs — citing one KB doesn't suppress surfacing + another relevant docs/web page; +- **verified on the matched chunk** (`LlmDoc.content`, the retrieved passage, passed + whole) against **both the question and the answer** — relevance to the *question* + (not just the answer) is what excludes topically-adjacent docs (e.g. an Azure-SignalR + or "Automation Cloud cannot be accessed" KB on an "is there AI?" question), while a + same-subject doc with a scary "error" title (e.g. a "Migration failed … on upgrade" + KB whose body is about pre-upgrade table cleanup) is correctly kept; +- **conditional + bounded** — at most ONE extra call, only when an uncited + authoritative doc is present; retries once on a transient gateway timeout; fail-closed. + +> Why a footer and not merged into the numbered "Sources" cards: `citation_num` is +> the doc's context position and the LLM already owns the low numbers, so injecting a +> retained doc collides (de-duped away by `translate_citations`, first-wins) and can't +> be placed "at the top" without renumbering the LLM's inline `[[n]]`. The footer +> sidesteps that. The footer/`final_context` links *are* rewritten (§5.6); the LLM's +> inline citation **cards** come from the reference-doc snapshot and are left as-is. + +### 5.6 Docs versioning — `rewrite_docs_links` (version-aware) +The docs.uipath.com connector indexes ~6 versions of every page (2022.4 … 2025.10, +plus slug variants), with near-identical content — so which version gets retrieved is +~arbitrary, and the query-time version dedup (`dedupe_doc_versions`) only collapses +versions that were *retrieved*. 
`rewrite_docs_links` (in `doc_pruning.py`, called from +`search_tool` after prune) resolves each versioned docs link to the right version of +the same page (URL with the version segment stripped, slug kept): +- if the question names exactly one version (`parse_question_doc_version`: "23.10" → + `2023.10`; multiple = ambiguous → None), resolve to **that** version even if older — + "is X supported in 23.10?" must point at the 23.10 doc; +- otherwise resolve to the **newest indexed** version. +One PK-indexed prefix-scan per page; no reindex. + +--- + +## 6. The incremental design (what was built) + +**Two-level gate — rerank runs iff `RERANK_ENABLED` (global) AND +`persona.rerank_enabled` (per-assistant).** + +- **Global** `RERANK_ENABLED` (env, default false): the master switch. When on, + the reranker is available (served by **TEI on GPU** in prod, see §7, or loaded + in-process when `RERANK_SERVER_URL` is unset) and + the app *may* rerank. Off (local / default) ⇒ reranking never runs. +- **Per-assistant** `Persona.rerank_enabled` (bool, default false): which + assistants actually rerank. Lets you enable it on one assistant, compare + answers against an un-toggled copy in chat **or** Slack, and flip the default + once convinced. +- **Single resolver** `_resolve_skip_rerank(explicit, persona)` in + `preprocessing.py` is the one place both chat and Slack decide reranking + (`rerank = (RERANK_ENABLED and persona.rerank_enabled) or + ENABLE_RERANKING_REAL_TIME_FLOW`). Slack now passes `skip_rerank=None` so it + shares this logic. An explicit `skip_rerank` is honored as-is. + +Both chat and Slack respect it because `SearchTool` builds `SearchRequest` with +`skip_rerank=None` + `persona=`, and preprocessing reads +`search_request.persona`. + +--- + +## 6b. The two assistant knobs + valid combinations + +There are **two per-assistant search-quality knobs**, each gated the same way +(global master switch × per-assistant flag, with a per-conversation chat toggle +that ignores the assistant). 
Source diversity (§5) is **not** a knob — it's +automatic and globally configured. + +| Knob | Global flag | Per-assistant | Chat toggle (default) | Needs GPU? | +|---|---|---|---|---| +| **Reranking** (cross-encoder) | `RERANK_ENABLED` | `Persona.rerank_enabled` | off | **Yes** in prod: TEI on GPU (§7) | +| **LLM relevance filter** (one-shot, **main LLM**) | `LLM_RELEVANCE_FILTER_ENABLED` | `Persona.llm_relevance_filter` | off | **No** (LLM-only) | + +- **LLM relevance filter** is a **single listwise call on the main LLM** + (`llm_eval_chunks_listwise`, fails open on parse/error), not 15 fast-LLM + calls. It needs **no GPU**, so it's a cheaper quality tier on its own. +- **Source prioritization & authoritative citations** (§5) is **not** a per-assistant + knob either — it's the global, always-on layered pipeline (`PROTECTED_SOURCES` + + `SOURCE_RESERVED_RETRIEVAL_SLOTS`, `SOURCE_DIVERSITY_RESERVED_SLOTS`, + `MAX_PROMPT_DOCS_PER_SOURCE`, the authoritative nudge, + `AUTHORITATIVE_CITATION_RETENTION_ENABLED`, and the version-aware docs rewrite). No + per-assistant or chat decision. +- **Resolution precedence** for the two knobs: chat per-conversation toggle (if + the request set it) → assistant flag → default. Slack/one-shot always use the + assistant flag. + +**Reranking × relevance filter — all four combinations work** (source diversity +applies underneath all of them): + +| `rerank_enabled` | `llm_relevance_filter` | Behavior | Infra | +|---|---|---|---| +| off | off | raw hybrid order (today's default) | none | +| **on** | off | cross-encoder reordering | **TEI on GPU** | +| off | **on** | LLM relevance filter only (1 LLM call) | **none / no GPU** | +| **on** | **on** | rerank **then** relevance-filter | **TEI on GPU** | + +The "relevance-filter-only, no GPU" row is the cheap middle tier (just one extra +LLM call); the "rerank-on" rows need the **GPU** TEI reranker (CPU was measured at +24–98 s/query — see §7). + +--- + +## 7. 
Reranker serving: TEI on GPU (NVIDIA T4) + +**Decision: serve the reranker on GPU via the upstream Hugging Face TEI image.** + +We first tried **CPU** (the cluster had no GPU). It was functionally correct but +**not interactive-viable**: measured live on prod (`bge-reranker-v2-m3`, 568M, +fp32, 4 vCPU) reranking 15 chunks took **24 s** (passages ≤512 tok) to **98 s** +(longer passages) — ~1.6 s/passage — and the long blocking inference starved the +`/health` endpoint, so the liveness probe killed the pod (503s under load). +`--auto-truncate` caps the tail but not the ~24 s floor; replicas add concurrency, +not single-query speed; fp16 doesn't help on CPU (x86 has no native fp16 matmul — +ORT upcasts to fp32); int8 is only ~2–4×. So reranking moved to **GPU**, where the +same model reranks 15 chunks in **~20–80 ms**. + +Why the **upstream image, no custom build**: TEI's **GPU** backend is Candle + +**safetensors**, which `bge-reranker-v2-m3` ships — so the model loads directly. +The custom-image/ONNX-export dance was *only* needed by TEI's **CPU** runtime +(ONNX-Runtime-based; the model has no ONNX weights). On GPU that constraint is +gone, so we point straight at `ghcr.io/huggingface/text-embeddings-inference:turing-1.5` +(`turing` == T4, compute 7.5; use `86-1.5` for A10, `1.5` for A100, `hopper-1.5` for H100). + +How it's wired: +- A **`tei-rerank`** Deployment (upstream GPU image, `--dtype float16`) on the + tainted GPU node pool — it tolerates `gpu=true:NoSchedule` and requests + `nvidia.com/gpu: 1` (only GPU nodes advertise it, which also pins scheduling). + Model weights download once into a PVC-backed HF cache (`/data`); restarts reuse + it (no re-download), so no custom image is needed to avoid re-downloads. +- The app's `CrossEncoderEnsembleModel` calls TEI's `/rerank` when + **`RERANK_SERVER_URL`** is set (scattering TEI's score-sorted reply back to + passage order); otherwise it uses the legacy model-server path. 
When TEI is in + use, our own model server does **not** load the cross-encoder. +- **No reindex** to adopt or switch rerankers — cross-encoders score chunk *text* + at query time; only changing the *embedding* model forces a reindex. (Avoid + late-interaction/ColBERT-style models, which would need indexing changes.) + +Node pool: **`Standard_NC4as_T4_v3`** (1× T4 16 GB, 4 vCPU, ~$480/mo) tainted +`gpu=true:NoSchedule`. The reranker needs only ~1.5–2 GB VRAM, so the T4 is ample +and GPU compute is never the bottleneck. TEI is stateless → scale with replicas / +an HPA for throughput. + +k8s: **`k8s/optional/tei-rerank/`** component (PVC + Deployment + Service, GPU +request + taint toleration, `/health` probes). Included by the **prod** overlay +(local dev loads the cross-encoder in-process — no TEI container). The overlay's +`env.properties` sets `RERANK_ENABLED=true`, +`RERANK_SERVER_URL=http://tei-rerank-service:80`, and +`LLM_RELEVANCE_FILTER_ENABLED=true`. (The earlier CPU `tei-rerank` image and its +ONNX-export Dockerfile were removed in favor of this.) + +> If a different GPU SKU is ever needed (lower latency, or T4 capacity runs out), the same model runs on any other **CUDA** +> node (e.g. `NV6ads_A10_v5`, a fractional NVIDIA A10; *not* `NV8as_v4`, whose GPU +> is AMD and unusable by TEI/PyTorch). Not needed for current scale. + +--- + +## 8. Implementation map (files changed on this branch) + +Backend: +- `db/models.py` — `Persona.rerank_enabled` (server_default false). +- `alembic/versions/f6a7b8c9d0e1_persona_rerank_enabled.py` — migration + (down_revision `e5f6a7b8c9d0`). +- `db/persona.py` — thread `rerank_enabled` through `upsert_persona` / + `create_update_persona`. +- `server/features/persona/models.py` — `CreatePersonaRequest` + + `PersonaSnapshot` + `from_model`. +- `shared_configs/configs.py` — `RERANK_ENABLED`, env-selectable + `CROSS_ENCODER_MODEL_ENSEMBLE` via `RERANK_MODEL_NAME`. +- `search/preprocessing/preprocessing.py` — `_resolve_skip_rerank` (single + resolver) + `Persona` import. 
+- `danswerbot/slack/handlers/handle_message.py` — `skip_rerank=None` (+ dropped + the now-unused `ENABLE_RERANKING_ASYNC_FLOW` import). +- `model_server/main.py` — warm cross-encoder when `RERANK_ENABLED`. +- `document_index/vespa/index.py` — `_query_vespa` simplified to a **single + all-sources query** (removed the two-query union). + +LLM relevance filter (independent gate, one-shot, main LLM): +- `configs/chat_configs.py` — `LLM_RELEVANCE_FILTER_ENABLED`. +- `preprocessing.py` — `_resolve_skip_llm_chunk_filter` resolver. +- `prompts/llm_chunk_filter.py` + `secondary_llm_flows/chunk_usefulness.py` — + `LISTWISE_CHUNK_FILTER_PROMPT` + `llm_eval_chunks_listwise` (+ `_parse_useful_indices`). +- `search/pipeline.py` — relevance filter now uses the **main** llm (not fast). +- `search/postprocessing/postprocessing.py` — `filter_chunks` → listwise call. + +Source prioritization & authoritative citations (automatic, global — §5): +- `configs/chat_configs.py` — `PROTECTED_SOURCES`, `SOURCE_DIVERSITY_RESERVED_SLOTS`, + `SOURCE_RESERVED_RETRIEVAL_SLOTS`, `MAX_PROMPT_DOCS_PER_SOURCE`, + `AUTHORITATIVE_CITATION_RETENTION_ENABLED`. +- `search/pipeline.py` — `_supplement_protected_sources` + pure + `protected_source_topup` (recall, §5.1). +- `llm/answering/doc_pruning.py` — `ensure_source_diversity` (§5.2), + `cap_docs_per_source` (§5.3); `rewrite_docs_links` + `parse_question_doc_version` + + `_versioned_url_parts` (version-aware docs links, §5.6); `dedupe_doc_versions` / + `_docs_version_sort_key` (version dedup). +- `prompts/prompt_utils.py` — `build_authoritative_sources_reminder` (nudge, §5.4), + appended in `build_task_prompt_reminders`. +- `llm/answering/authoritative_retention.py` — verify-then-retain footer + (`select_authoritative_candidates`, `verify_supporting_docs`, + `retained_authoritative_footer`), hooked in `llm/answering/answer.py` + `_process_stream` (§5.5). 
+- `tools/search/search_tool.py` — calls `rewrite_docs_links(..., + parse_question_doc_version(query))` after `prune_documents`. + +Chat per-conversation toggles + TEI serving: +- `server/query_and_chat/models.py` — `use_reranking` / `use_relevance_filter` + on `CreateChatMessageRequest`. +- `tools/search/search_tool.py` + `chat/process_message.py` — thread the + per-conversation skips into the `SearchRequest`. +- `shared_configs/configs.py` — `RERANK_SERVER_URL`; `search_nlp_models.py` + `CrossEncoderEnsembleModel` TEI `/rerank` path; `model_server/main.py` skips + loading the cross-encoder when TEI serves it. + +Web: +- `admin/assistants/{interfaces,lib,AssistantEditor}.tsx` — "Rerank results" + checkbox (relevance filter reuses the existing "Apply LLM Relevance Filter"). +- `chat/{lib.tsx,ChatPage.tsx,input/ChatInputBar.tsx}` — two per-conversation + chat toggles (Rerank, Relevance). + +Infra: +- `k8s/optional/tei-rerank/` — GPU TEI reranker component (upstream TEI image on a + T4 node pool); included by the **prod** overlay (sets `RERANK_ENABLED` / + `RERANK_SERVER_URL` / `LLM_RELEVANCE_FILTER_ENABLED`). **Local** loads the + reranker in-process (no TEI, no GPU), via `RERANK_ENABLED` with + `RERANK_SERVER_URL` unset. + +Tests: +- `tests/unit/.../test_resolve_skip_rerank.py`, `test_resolve_skip_llm_chunk_filter.py` + — the two gating matrices. +- `tests/unit/.../test_listwise_chunk_filter.py` — listwise parser. +- `tests/unit/.../test_source_diversity.py` — diversity promotion / caps / disable. +- `tests/unit/.../test_source_reserved_topup.py` — recall top-up / dedupe / scope. +- `tests/unit/.../test_cap_docs_per_source.py` — per-source cap. +- `tests/unit/.../test_authoritative_sources.py` — the nudge text from PROTECTED_SOURCES. +- `tests/unit/.../test_authoritative_retention.py` — candidate gate, chunk-based + verify, fail-closed/retry, footer. +- `tests/unit/.../test_docs_version_rewrite.py` — version parse + version-aware / + latest rewrite. 
+- `tests/integration/` — TEI rerank transport (mocked), **real CPU cross-encoder + reordering** (MiniLM), and `filter_chunks` with a stub LLM. + +--- + +## 9. How to enable in prod (when ready) + +1. `alembic upgrade head` (adds `persona.rerank_enabled`) → bounce `dapi` + `dbe`. +2. Add a GPU node pool tainted `gpu=true:NoSchedule` (prod uses + `Standard_NC4as_T4_v3`). The prod overlay already includes + `../../optional/tei-rerank` and sets `RERANK_ENABLED` / `RERANK_SERVER_URL` / + `LLM_RELEVANCE_FILTER_ENABLED` in `env.properties` — `kubectl apply -k + k8s/overlays/prod`. TEI pulls the upstream GPU image and downloads the model + once into its PVC-backed cache. +3. Per assistant (admin editor): toggle **Rerank results** and/or **Apply LLM + Relevance Filter**. Or use the **chat-page toggles** to A/B per conversation. + Compare answers, then flip the defaults once satisfied. Source prioritization & + authoritative citations (§5) are automatic and already on in prod — + `PROTECTED_SOURCES=web,sfkbarticles,highspot,outsystems`, + `SOURCE_RESERVED_RETRIEVAL_SLOTS=6`, `SOURCE_DIVERSITY_RESERVED_SLOTS=3`, + `MAX_PROMPT_DOCS_PER_SOURCE=8`, `AUTHORITATIVE_CITATION_RETENTION_ENABLED=true`. + +Reranking needs the GPU node; the **relevance filter alone needs no GPU**. Local +exercises reranking in-process (no GPU) for dev. + +--- + +## 10. Open / sequenced follow-ups + +- **Recency tuning is a separate experiment** from reranking — don't bundle. + Start with `favor_recent` on the test assistant (config); lower the `0.75` + floor only if needed (schema redeploy). Measure independently. +- **Graceful rerank fallback:** if the TEI reranker errors, search currently + degrades rather than falling back to bi-encoder order — worth adding before + enabling rerank by default (see §3 / reliability). +- **`enable_auto_detect_filters` is dead** globally (§4) — fixing it would restore + LLM time/source filter extraction *and* the `auto` recency path; tracked + separately. 
+- **No min relevance-score cutoff** (`SEARCH_DISTANCE_CUTOFF=0` unused) — weak + chunks still fill the context window; candidate for a follow-up. +- **The 5× re-execution on missing citations** (Slack) — latency/cost + orphaned + chat sessions; candidate for a cap. +- Consider a **stronger/larger reranker** or hosted (Cohere) if `bge-reranker-v2-m3` + isn't enough — reindex-free either way. +- **Externalize the tuned prompts to a configmap** (verify prompt §5.5, nudge §5.4) + so prompt iteration doesn't need an image rebuild — file-mounted configmap, read + with the in-code default as fallback (hot-reload via mounted-file sync). Scoped but + not yet built; only worth it for the actively-tuned prompts, and watch for + config-vs-git drift (repo file stays canonical). +- **Authoritative citation: only the footer / `final_context` docs links are + version-rewritten (§5.6); the LLM's inline citation cards (reference-doc snapshot) + are not** — deliberate, but means inline cards can show a different version than + the footer. Revisit only if that inconsistency matters. +- **Indexed-content freshness:** relevance (and the answer) are judged against the + *indexed* copy of a doc; an edited KB/docs page can diverge from what we indexed, + so a citation may point to a live doc whose current content differs. Connector + re-sync cadence, not a verify-logic issue. +- **DB-backed / admin-editable prompts** — the "real" version of prompt + externalization if tuning becomes frequent or multi-owner; bigger build (table + + endpoints + UI), not warranted yet. diff --git a/docs/web-deploy.md b/docs/web-deploy.md new file mode 100644 index 00000000000..850ff9c1f04 --- /dev/null +++ b/docs/web-deploy.md @@ -0,0 +1,113 @@ +# Deploying the web image (`danswer-web-server`) + +**TL;DR — do NOT use `build-deploy.sh deploy web` on an Apple-Silicon Mac.** The +web `next build` segfaults under Docker's amd64 emulation (see below). 
Build the +image natively on Azure with `az acr build`, copy it into the prod registry, then +bump the tag and apply. + +The **backend** image is unaffected — `k8s/scripts/build-deploy.sh deploy backend` +works fine locally (no V8/SWC compile). + +--- + +## Why the local web build fails + +`docker build --platform linux/amd64 … next build` crashes with +`Next.js build worker exited … signal SIGSEGV` ~11s into the compile, on Apple +Silicon. As of 2026‑06‑11 this happens **even with Docker's Apple Virtualization +framework + Rosetta enabled** (it used to work with Rosetta; a Docker +Desktop / macOS / containerd-snapshotter change regressed it). None of these fix +it: build-cache prune, Docker restart, disabling the containerd image store, +`next.config` worker tweaks. `sharp` is a runtime dep, so you also can't dodge it +by mixing a glibc builder with the alpine runtime (libc mismatch). + +So: **build off the Mac.** + +--- + +## Registries & subscriptions + +| Registry | Subscription | Purpose | +|---|---|---| +| `darwinacr.azurecr.io` | `Internal-Production-EA` (`202e5d15-…`), RG `darwin` | **Build here** (native amd64, same sub as the AKS) | +| `sfbrdevhelmweacr.azurecr.io` | a different sub (`d10f042e-…`) | **Prod pull registry** — what `k8s/overlays/prod` references | + +Image tags follow `vha-N` (bump from the current `newTag` in +`k8s/overlays/prod/kustomization.yaml`). + +--- + +## Build + deploy steps + +Set `TAG` to the next `vha-N` (current web tag + 1): + +```bash +TAG=vha-79 # <- bump from kustomization.yaml + +# 1. Build natively on Azure (no emulation, no SIGSEGV). +# Requires Contributor RBAC on darwinacr — see "Permissions" below. +az acr build --registry darwinacr \ + --image danswer/danswer-web-server:$TAG \ + --file web/Dockerfile web + +# 2. Copy the image into the prod registry (pure blob copy — safe on the Mac, +# no execution). darwinacr login via AAD; sfbrdevhelmweacr via admin creds. 
+az acr login --name darwinacr +docker login sfbrdevhelmweacr.azurecr.io -u "$ACR_USERNAME" -p "$ACR_PASSWORD" # creds from ~/.zshrc +docker pull --platform linux/amd64 darwinacr.azurecr.io/danswer/danswer-web-server:$TAG +docker tag darwinacr.azurecr.io/danswer/danswer-web-server:$TAG \ + sfbrdevhelmweacr.azurecr.io/danswer/danswer-web-server:$TAG +docker push sfbrdevhelmweacr.azurecr.io/danswer/danswer-web-server:$TAG + +# 3. Point the prod overlay at the new tag and apply (context must be `darwin`). +# Edit k8s/overlays/prod/kustomization.yaml: danswer-web-server newTag -> $TAG +kubectl apply -k k8s/overlays/prod + +# 4. Verify. +bash k8s/scripts/build-deploy.sh verify # live tags == manifest + pod health +kubectl get pods -n darwin -l app=web-server # expect 2/2 Running on $TAG +``` + +Live sanity check (the app is OIDC-gated, so curl `/auth/login` — served by our +Next app — and confirm the build markers are present): + +```bash +curl -s https://darwin.westeurope.cloudapp.azure.com/auth/login | grep -o "darwin-theme" +``` + +Finally, commit the `kustomization.yaml` tag bump (the script does not auto-commit it). + +--- + +## Permissions + +- **`az acr build` on darwinacr** needs ARM `…/registries/listBuildSourceUploadUrl/action` + — i.e. **Contributor** on the registry. The `~/.zshrc` `ACR_USERNAME`/`ACR_PASSWORD` + are registry **push admin** only; they do **not** satisfy this. +- Today this is obtained via a **PIM time-bound activation** of Contributor + (it expires — re-activate per build session). +- **Pushing to `sfbrdevhelmweacr`** uses the admin creds in `~/.zshrc` (no ARM/PIM). 
+ +### Optional: unattended / CI builds via a service principal + +So web deploys don't depend on a human's PIM activation, an **admin** (with Entra +app-registration rights + `User Access Administrator`/`Owner`) creates an SP: + +```bash +az ad sp create-for-rbac --name darwin-web-ci \ + --role Contributor \ + --scopes /subscriptions/202e5d15-5356-4826-bc61-ebd449d12e34/resourceGroups/darwin/providers/Microsoft.ContainerRegistry/registries/darwinacr +# -> store appId / password / tenant as CI secrets +``` + +Then in CI (or locally): + +```bash +az login --service-principal -u -p --tenant +az acr build --registry darwinacr --image danswer/danswer-web-server:$TAG -f web/Dockerfile web +# ... then the copy + apply steps above +``` + +> Note: the repo's existing `.github/workflows/docker-build-push-web-container-on-tag.yml` +> pushes to **Docker Hub** (upstream Onyx), not this ACR — it is not wired to this +> deploy path. diff --git a/k8s/optional/tei-rerank/kustomization.yaml b/k8s/optional/tei-rerank/kustomization.yaml new file mode 100644 index 00000000000..3e9c9109bc6 --- /dev/null +++ b/k8s/optional/tei-rerank/kustomization.yaml @@ -0,0 +1,26 @@ +# Kustomize Component — GPU-hosted cross-encoder reranker via Hugging Face TEI. +# +# Adds a PVC + `tei-rerank` Deployment + Service that serves BAAI/bge-reranker-v2-m3 +# on GPU (NVIDIA T4) in fp16: ~20–80 ms for 15 chunks, vs 24–98 s on CPU. The +# cluster needs a GPU node pool tainted gpu=true:NoSchedule (see tei-rerank.yaml). +# +# Opting an overlay into this component is the "global enable" half of +# reranking. You MUST ALSO set, in the including overlay's env.properties (so +# they reach the api-server / background pods via env-configmap): +# +# RERANK_ENABLED=true +# RERANK_SERVER_URL=http://tei-rerank-service:80 +# +# With those set, the app reranks for assistants whose `rerank_enabled` flag is +# on (and for chat conversations whose Rerank toggle is on). Our own model +# server then does NOT load the cross-encoder — TEI owns it.
+# +# Opt in from an overlay: +# components: +# - ../../optional/tei-rerank +# # + RERANK_ENABLED=true and RERANK_SERVER_URL=... in env.properties +apiVersion: kustomize.config.k8s.io/v1alpha1 +kind: Component + +resources: + - tei-rerank.yaml diff --git a/k8s/optional/tei-rerank/tei-rerank.yaml b/k8s/optional/tei-rerank/tei-rerank.yaml new file mode 100644 index 00000000000..cc68144ea71 --- /dev/null +++ b/k8s/optional/tei-rerank/tei-rerank.yaml @@ -0,0 +1,147 @@ +# HuggingFace Text-Embeddings-Inference (TEI) serving the cross-encoder reranker +# on GPU (NVIDIA T4). The app reaches it via RERANK_SERVER_URL → /rerank. +# +# Why the UPSTREAM image (no custom build): TEI's GPU backend is Candle + +# safetensors, and BAAI/bge-reranker-v2-m3 ships safetensors — so the model +# loads directly, no ONNX export needed. (That dance was only required by TEI's +# *CPU* runtime, which is ONNX-Runtime-based and the model has no ONNX weights.) +# CPU serving was also far too slow (~1.6s/passage, 24–98s for 15 chunks), so +# reranking lives on GPU (~20–80ms). +# +# Why NO istio sidecar: TEI is a leaf service that needs no mesh features, mTLS +# is PERMISSIVE (the api-server still reaches it), and — critically — an istio +# sidecar isn't running during the init phase, so the model-prefetch init +# container below couldn't reach the network with the sidecar injected. +# +# Why prefetch the model in an init container (vs letting TEI download it): TEI's +# Rust hf-hub client fails on Hugging Face's redirect with "relative URL without +# a base". The Python huggingface_hub client resolves it fine, so we prefetch the +# weights into the PVC and run TEI offline against the local path. 
+apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: tei-rerank-model-cache +spec: + accessModes: ["ReadWriteOnce"] + # No storageClassName → cluster default SC (AKS managed-csi, + # volumeBindingMode=WaitForFirstConsumer, so the disk lands in the GPU node's + # zone — no cross-zone attach conflict). + resources: + requests: + storage: 10Gi +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: tei-rerank-deployment +spec: + replicas: 1 + strategy: + # RWO model-cache PVC can't be mounted by an old + new pod simultaneously. + type: Recreate + selector: + matchLabels: + app: tei-rerank + template: + metadata: + labels: + app: tei-rerank + annotations: + # Leaf service — keep it out of the mesh (see header). + sidecar.istio.io/inject: "false" + spec: + # Tolerate the GPU pool's taint (gpu=true:NoSchedule). + tolerations: + - key: gpu + operator: Equal + value: "true" + effect: NoSchedule + initContainers: + # Prefetch the model into the PVC with the (robust) Python HF client. + # Idempotent: skips the download when the weights are already cached. + - name: fetch-model + image: python:3.11-slim + command: ["sh", "-c"] + args: + - | + set -e + if [ ! -f /data/model/model.safetensors ]; then + pip install --no-cache-dir huggingface_hub + python -c "from huggingface_hub import snapshot_download; snapshot_download('BAAI/bge-reranker-v2-m3', local_dir='/data/model')" + else + echo "model already cached at /data/model" + fi + volumeMounts: + - name: model-cache + mountPath: /data + containers: + - name: tei-rerank + # Upstream TEI GPU image, pinned. `turing-1.5` == NVIDIA T4 (compute + # cap 7.5). For an A10 node switch to `86-1.5`; A100 → `1.5`; H100 → `hopper-1.5`. + image: ghcr.io/huggingface/text-embeddings-inference:turing-1.5 + args: + # Local path — prefetched by the init container; no runtime download.
+ - "--model-id" + - "/data/model" + - "--dtype" + - "float16" # GPU: fp16, ~no quality loss vs fp32 for this model + - "--port" + - "80" + - "--max-concurrent-requests" + - "64" + - "--max-batch-tokens" + - "16384" + - "--max-client-batch-size" + - "32" + env: + - name: HF_HUB_OFFLINE + value: "1" # belt-and-suspenders: never touch the hub at runtime + ports: + - containerPort: 80 + protocol: TCP + resources: + requests: + cpu: "2" + memory: 6Gi + limits: + # Requesting a GPU forces this pod onto the GPU node pool. + nvidia.com/gpu: 1 + memory: 12Gi + volumeMounts: + - name: model-cache + mountPath: /data + readinessProbe: + httpGet: + path: /health + port: 80 + periodSeconds: 10 + livenessProbe: + httpGet: + path: /health + port: 80 + periodSeconds: 30 + # First boot prefetches the model (init) then loads it; generous. + startupProbe: + httpGet: + path: /health + port: 80 + periodSeconds: 10 + failureThreshold: 60 + volumes: + - name: model-cache + persistentVolumeClaim: + claimName: tei-rerank-model-cache +--- +apiVersion: v1 +kind: Service +metadata: + name: tei-rerank-service +spec: + ports: + - name: tei-rerank-port + port: 80 + protocol: TCP + targetPort: 80 + selector: + app: tei-rerank + type: ClusterIP diff --git a/k8s/overlays/local/env.properties b/k8s/overlays/local/env.properties index b75f1d918ed..ee84eb01f3c 100644 --- a/k8s/overlays/local/env.properties +++ b/k8s/overlays/local/env.properties @@ -37,6 +37,13 @@ ASYM_QUERY_PREFIX= ASYM_PASSAGE_PREFIX= ENABLE_RERANKING_REAL_TIME_FLOW= ENABLE_RERANKING_ASYNC_FLOW= +# Cross-encoder reranking. Local loads the reranker IN-PROCESS in the model +# server (no TEI container; RERANK_SERVER_URL intentionally unset). Assistants/ +# chats still opt in per their own flag. (Prod serves it via the tei-rerank +# component instead.) +RERANK_ENABLED=true +# LLM relevance filter (LLM-based). Independent of reranking. 
+LLM_RELEVANCE_FILTER_ENABLED=true # --- LLM --- GEN_AI_MODEL_PROVIDER=custom diff --git a/k8s/overlays/prod-velero/README.md b/k8s/overlays/prod-velero/README.md new file mode 100644 index 00000000000..a1c4ee98c4b --- /dev/null +++ b/k8s/overlays/prod-velero/README.md @@ -0,0 +1,105 @@ +# Velero — weekly Vespa backup (self-managed, prod) + +Standalone Velero install for the `darwin` AKS cluster. **Deliberately separate +from the app overlay** (like `k8s/overlays/prod-vespa`): `kubectl apply -k +k8s/overlays/prod` never touches it. This tree owns Velero, its weekly backup +schedule, the failure alerts, and the Slack notifier — nothing else. + +## What it backs up, and why + +Only the Vespa state that can't be cheaply rebuilt — the 6 PVCs labeled +`backup=vespa` in namespace `darwin`: + +- `vespa-var-vespa-content-{0,1,2}` (3×100Gi) — the search index +- `vespa-var1-vespa-configserver-{0,1,2}` (3×5Gi) — deployed app package/schema + +Everything else is intentionally excluded (model caches re-download; logs and +workspace are scratch; `dynamic-pvc`/`file-connector-pvc` are unused; Redis is +cache). Postgres — the real source of truth — is external + Azure-managed and +backed up separately. + +**Consistency caveat:** disk snapshots are point-in-time *per volume*, not +coordinated across the 3 content nodes. Treat this as a fast-DR accelerator; +the authoritative recovery path is re-indexing from Postgres. + +## Schedule & retention + +- Weekly, Sundays 02:00 UTC (`schedule.yaml`). +- `ttl: 504h` (21 days) → the **last 3 weekly backups** are retained; the + 21-day-old one is garbage-collected (snapshots GC'd too). Worst-case data + loss on a crash is up to one week; max rollback is 21 days. + +## Azure wiring (reused, no new role assignments) + +- Auth: SP `155fae3c-6bce-46be-8c15-25adf84b4c4e`, which already holds + `Contributor` on `darwin-backups`, `MC_darwin_darwin_westeurope`, and + `darwin` — so **no Owner/UAA ticket is needed**. Only a fresh client secret. 
+- Backup metadata store (`BackupStorageLocation`): SA `darwinaksbackup` / + container `darwinaksbackup` in RG `darwin-backups` (the old store, purged). +- Disk snapshots (`VolumeSnapshotLocation`): node RG + `MC_darwin_darwin_westeurope` — NOT the lock-protected `darwin` RG, so TTL GC + can delete them. + +## Alerting (two layers) + +1. **PrometheusRule + ServiceMonitor** (`release: robusta` label → picked up by + the existing robusta Prometheus/Alertmanager). Real-time failure + staleness. + `VeleroNoRecentSuccessfulBackup` is the safety net that watches the *outcome* + (time since last success) — it would have caught the previous silent + ~year-long failure regardless of cause. +2. **Notifier CronJob** (`notifier.yaml`) — Sundays 03:00 UTC, posts the latest + run's result (✅/⚠️/❌, both success and failure) to **#darwin-devs** via the + Slack bot token. + +## One-time setup (in order) + +```bash +# context must be darwin +kubectl config current-context # -> darwin + +# 1. CRDs (version plumbing, not config — bootstrap once). Either: +velero install --crds-only # if the velero CLI is handy, OR +kubectl apply -f https://raw.githubusercontent.com/vmware-tanzu/velero/v1.13.2/config/crd/v1/crds/crds.yaml + +# 2. Azure SP secret — create a FRESH client secret on app 155fae3c +# (as an OWNER OF THE APP REGISTRATION; NOT subscription Owner/UAA), then: +cp credentials-velero.example credentials-velero +# edit credentials-velero -> paste AZURE_CLIENT_SECRET + +# 3. Slack notifier — reuse the existing bot token, and INVITE the bot to +# #darwin-devs (the bot-token path posts by channel, not the email address): +cp slack-notify.env.example slack-notify.env +kubectl get secret danswer-secrets -n darwin \ + -o jsonpath='{.data.DANSWER_BOT_SLACK_BOT_TOKEN}' | base64 -d # -> SLACK_BOT_TOKEN + +# 4. Apply +kubectl apply -k k8s/overlays/prod-velero + +# 5. 
Validate +kubectl -n velero get pods # velero Running +kubectl -n velero get backupstoragelocation # PHASE Available (NOT Unavailable) +velero backup create --from-schedule vespa-weekly # on-demand test run (or wait for Sunday) +kubectl -n velero get backups.velero.io # PHASE Completed +``` + +`credentials-velero` and `slack-notify.env` are gitignored — never commit them. + +## Restore (outline) + +```bash +velero restore create --from-backup \ + --include-namespaces darwin --selector backup=vespa +``` +Restoring recreates the PVCs from snapshots. Coordinate with the Vespa +StatefulSets (scale down content/configserver, restore, scale up) — see +k8s/scripts and the Vespa overlay. + +## History + +The previous self-managed Velero (same SP, same SA) silently failed for ~a year +(volume snapshots stopped 2025-06-10; BSL auth died on an expired SP secret). +Removed 2026-06-11 along with its 130 stale snapshots and 63 backup blobs. This +install is the revival — same approach, now with **failure + staleness alerts** +and a **success/failure notifier** so it can't fail silently again. Set a +long-lived SP secret (and a renewal reminder); nothing rotates it automatically. + diff --git a/k8s/overlays/prod-velero/backupstoragelocation.yaml b/k8s/overlays/prod-velero/backupstoragelocation.yaml new file mode 100644 index 00000000000..70a20ad99ed --- /dev/null +++ b/k8s/overlays/prod-velero/backupstoragelocation.yaml @@ -0,0 +1,18 @@ +# Where Velero stores backup metadata (object storage). Reuses the existing +# storage account + container the old Velero used (now purged) — the SP +# 155fae3c already holds Contributor on RG darwin-backups. 
+apiVersion: velero.io/v1 +kind: BackupStorageLocation +metadata: + name: default + namespace: velero + labels: + app.kubernetes.io/name: velero +spec: + provider: velero.io/azure + objectStorage: + bucket: darwinaksbackup + config: + resourceGroup: darwin-backups + storageAccount: darwinaksbackup + subscriptionId: 202e5d15-5356-4826-bc61-ebd449d12e34 diff --git a/k8s/overlays/prod-velero/credentials-velero.example b/k8s/overlays/prod-velero/credentials-velero.example new file mode 100644 index 00000000000..06d907e1e78 --- /dev/null +++ b/k8s/overlays/prod-velero/credentials-velero.example @@ -0,0 +1,17 @@ +# Velero Azure credentials — TEMPLATE. Copy to `credentials-velero` (gitignored) +# and fill in AZURE_CLIENT_SECRET with a FRESH client secret for app +# 155fae3c-6bce-46be-8c15-25adf84b4c4e (create it as an OWNER OF THE APP +# REGISTRATION — this is a directory permission, NOT subscription Owner/UAA). +# +# The previous Velero broke because this secret EXPIRED. When you create the new +# one, set a long expiry (or a calendar reminder), because nothing here rotates +# it automatically — the alerts will catch the failure, but renewing is manual. +# +# AZURE_RESOURCE_GROUP must be the RG where Velero CREATES disk snapshots +# (the node RG), matching the VolumeSnapshotLocation. 
+AZURE_SUBSCRIPTION_ID=202e5d15-5356-4826-bc61-ebd449d12e34 +AZURE_TENANT_ID=d8353d2a-b153-4d17-8827-902c51f72357 +AZURE_CLIENT_ID=155fae3c-6bce-46be-8c15-25adf84b4c4e +AZURE_CLIENT_SECRET= +AZURE_RESOURCE_GROUP=MC_darwin_darwin_westeurope +AZURE_CLOUD_NAME=AzurePublicCloud diff --git a/k8s/overlays/prod-velero/deployment.yaml b/k8s/overlays/prod-velero/deployment.yaml new file mode 100644 index 00000000000..a2c4cc4e65f --- /dev/null +++ b/k8s/overlays/prod-velero/deployment.yaml @@ -0,0 +1,78 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: velero + namespace: velero + labels: + app.kubernetes.io/name: velero + component: velero +spec: + replicas: 1 + selector: + matchLabels: + app.kubernetes.io/name: velero + template: + metadata: + labels: + app.kubernetes.io/name: velero + component: velero + annotations: + prometheus.io/scrape: "true" + prometheus.io/port: "8085" + prometheus.io/path: /metrics + spec: + serviceAccountName: velero + # Azure plugin is loaded as an init container into the shared /plugins dir. + initContainers: + - name: velero-plugin-for-microsoft-azure + image: velero/velero-plugin-for-microsoft-azure + imagePullPolicy: IfNotPresent + volumeMounts: + - name: plugins + mountPath: /target + containers: + - name: velero + image: velero/velero + imagePullPolicy: IfNotPresent + command: + - /velero + args: + - server + ports: + - name: http-monitoring + containerPort: 8085 + env: + - name: VELERO_SCRATCH_DIR + value: /scratch + - name: VELERO_NAMESPACE + valueFrom: + fieldRef: + fieldPath: metadata.namespace + - name: LD_LIBRARY_PATH + value: /plugins + # INI credentials file consumed by the azure plugin (key `cloud` + # of the cloud-credentials secret). 
+ - name: AZURE_CREDENTIALS_FILE + value: /credentials/cloud + volumeMounts: + - name: plugins + mountPath: /plugins + - name: scratch + mountPath: /scratch + - name: cloud-credentials + mountPath: /credentials + resources: + requests: + cpu: 250m + memory: 128Mi + limits: + cpu: "1" + memory: 512Mi + volumes: + - name: plugins + emptyDir: {} + - name: scratch + emptyDir: {} + - name: cloud-credentials + secret: + secretName: cloud-credentials diff --git a/k8s/overlays/prod-velero/kustomization.yaml b/k8s/overlays/prod-velero/kustomization.yaml new file mode 100644 index 00000000000..0b7711f7bab --- /dev/null +++ b/k8s/overlays/prod-velero/kustomization.yaml @@ -0,0 +1,52 @@ +# Velero (self-managed) apply target for PROD — deliberately SEPARATE from the +# app overlay, exactly like k8s/overlays/prod-vespa. +# +# kubectl apply -k k8s/overlays/prod-velero # context: darwin +# +# `kubectl apply -k k8s/overlays/prod` does NOT touch Velero. This tree owns the +# Velero install, its backup schedule, and the failure alerts — nothing else. +# +# Backs up ONLY the PVCs labeled `backup=vespa` in the `darwin` namespace +# (the Vespa index + configserver state). See README.md for the why, the +# one-time CRD bootstrap, and the required SP-secret step. +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +namespace: velero + +resources: + - namespace.yaml + - rbac.yaml + - deployment.yaml + - service.yaml + - backupstoragelocation.yaml + - volumesnapshotlocation.yaml + - schedule.yaml + - servicemonitor.yaml + - prometheusrule.yaml + - notifier.yaml + +# Image pins (bump deliberately; keep velero <-> azure-plugin compatible: +# velero 1.13.x <-> plugin-for-microsoft-azure 1.9.x). +images: + - name: velero/velero + newTag: v1.13.2 + - name: velero/velero-plugin-for-microsoft-azure + newTag: v1.9.2 + +# cloud-credentials: the Azure SP credentials Velero authenticates with. 
Built +# from the gitignored `credentials-velero` file (copy credentials-velero.example +# and fill in a FRESH client secret for app 155fae3c — see README). Same name + +# key (`cloud`) the deployment mounts. Hash suffix disabled so the static +# secretName resolves. +secretGenerator: + - name: cloud-credentials + files: + - cloud=credentials-velero + # slack-notify: bot token + channel for the post-backup notifier CronJob. + # Built from the gitignored slack-notify.env (copy slack-notify.env.example). + - name: slack-notify + envs: + - slack-notify.env +generatorOptions: + disableNameSuffixHash: true diff --git a/k8s/overlays/prod-velero/namespace.yaml b/k8s/overlays/prod-velero/namespace.yaml new file mode 100644 index 00000000000..448c7c9ab77 --- /dev/null +++ b/k8s/overlays/prod-velero/namespace.yaml @@ -0,0 +1,6 @@ +apiVersion: v1 +kind: Namespace +metadata: + name: velero + labels: + app.kubernetes.io/name: velero diff --git a/k8s/overlays/prod-velero/notifier.yaml b/k8s/overlays/prod-velero/notifier.yaml new file mode 100644 index 00000000000..a284d96af7f --- /dev/null +++ b/k8s/overlays/prod-velero/notifier.yaml @@ -0,0 +1,119 @@ +# Post-backup notifier: every week, AFTER the backup window, read the latest +# vespa-weekly backup's status and post the outcome (success AND failure) to +# #darwin-devs via the Slack bot token. This is the "did the weekly backup +# happen and did it work?" heartbeat to the team channel. +# +# (The PrometheusRule alerts are the real-time failure safety net via robusta; +# this CronJob is the explicit per-run success/failure report to darwin-devs.) +apiVersion: v1 +kind: ServiceAccount +metadata: + name: velero-backup-notifier + namespace: velero + labels: + app.kubernetes.io/name: velero +--- +# Minimal read access to Backup CRs (NOT cluster-admin like velero itself). 
+apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: velero-backup-notifier + namespace: velero + labels: + app.kubernetes.io/name: velero +rules: + - apiGroups: ["velero.io"] + resources: ["backups"] + verbs: ["get", "list"] +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: velero-backup-notifier + namespace: velero + labels: + app.kubernetes.io/name: velero +subjects: + - kind: ServiceAccount + name: velero-backup-notifier + namespace: velero +roleRef: + kind: Role + name: velero-backup-notifier + apiGroup: rbac.authorization.k8s.io +--- +apiVersion: batch/v1 +kind: CronJob +metadata: + name: velero-backup-notifier + namespace: velero + labels: + app.kubernetes.io/name: velero +spec: + # 1h after the Sun 02:00 backup, so the run has finished. + schedule: "0 3 * * 0" + concurrencyPolicy: Forbid + successfulJobsHistoryLimit: 3 + failedJobsHistoryLimit: 3 + jobTemplate: + spec: + backoffLimit: 2 + template: + metadata: + labels: + app.kubernetes.io/name: velero + spec: + serviceAccountName: velero-backup-notifier + restartPolicy: Never + containers: + - name: notifier + # bundles kubectl + curl + jq + bash + image: alpine/k8s:1.29.4 + command: ["/bin/bash", "-c"] + args: + - | + set -euo pipefail + items=$(kubectl get backups.velero.io -n velero \ + -l velero.io/schedule-name=vespa-weekly -o json) + latest=$(echo "$items" | jq -r '.items | sort_by(.metadata.creationTimestamp) | last') + if [ "$latest" = "null" ] || [ -z "$latest" ]; then + name="(none)"; phase="NoBackupFound"; errs="-"; comp="-" + else + name=$(echo "$latest" | jq -r '.metadata.name') + phase=$(echo "$latest" | jq -r '.status.phase // "Unknown"') + errs=$(echo "$latest" | jq -r '.status.errors // 0') + comp=$(echo "$latest" | jq -r '.status.completionTimestamp // "-"') + fi + case "$phase" in + Completed) emoji="✅" ;; + PartiallyFailed) emoji="⚠️" ;; + *) emoji="❌" ;; + esac + text=$(printf '%s Vespa weekly backup *%s*\nbackup: `%s`\nerrors: 
%s\ncompleted: %s' \ + "$emoji" "$phase" "$name" "$errs" "$comp") + payload=$(jq -nc --arg c "$SLACK_CHANNEL" --arg t "$text" '{channel:$c,text:$t}') + resp=$(curl -sf -X POST https://slack.com/api/chat.postMessage \ + -H "Authorization: Bearer $SLACK_BOT_TOKEN" \ + -H "Content-type: application/json; charset=utf-8" \ + --data "$payload") + echo "$resp" + echo "$resp" | jq -e '.ok == true' >/dev/null \ + || { echo "ERROR: slack post failed"; exit 1; } + env: + - name: SLACK_BOT_TOKEN + valueFrom: + secretKeyRef: + name: slack-notify + key: SLACK_BOT_TOKEN + - name: SLACK_CHANNEL + valueFrom: + secretKeyRef: + name: slack-notify + key: SLACK_CHANNEL + resources: + requests: + cpu: 50m + memory: 64Mi + limits: + cpu: 250m + memory: 256Mi diff --git a/k8s/overlays/prod-velero/prometheusrule.yaml b/k8s/overlays/prod-velero/prometheusrule.yaml new file mode 100644 index 00000000000..ed76d2d24aa --- /dev/null +++ b/k8s/overlays/prod-velero/prometheusrule.yaml @@ -0,0 +1,66 @@ +# Backup-failure alerts. `release: robusta` is REQUIRED to match the +# Prometheus ruleSelector (matchLabels release=robusta); robusta forwards +# firing alerts to Slack. +# +# The alert that matters most is VeleroNoRecentSuccessfulBackup — it fires on +# the EXACT failure mode that went unnoticed for ~a year last time (auth +# broke -> backups silently failing). It watches the OUTCOME (time since last +# success), so it catches every cause: bad credentials, BSL down, velero +# crashed, schedule disabled. +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: velero-backup-alerts + namespace: velero + labels: + app.kubernetes.io/name: velero + release: robusta +spec: + groups: + - name: velero-backups + rules: + # SAFETY NET: no successful weekly backup in >8 days (weekly + 1 day slack). 
+ - alert: VeleroNoRecentSuccessfulBackup + expr: (time() - velero_backup_last_successful_timestamp{schedule="vespa-weekly"}) > 8 * 24 * 3600 + for: 1h + labels: + severity: critical + annotations: + summary: "Velero: no successful vespa-weekly backup in over 8 days" + description: "The last successful 'vespa-weekly' backup is older than 8 days. The weekly schedule is failing or not running. Check `velero backup get` and the velero pod logs." + # velero down / not scraped / never ran -> metric absent. + - alert: VeleroBackupMetricsAbsent + expr: absent(velero_backup_last_successful_timestamp{schedule="vespa-weekly"}) + for: 6h + labels: + severity: critical + annotations: + summary: "Velero: vespa-weekly backup metric absent" + description: "No metric for the 'vespa-weekly' schedule for 6h — velero may be down, unscraped, or no backup has ever completed." + # A backup attempt failed. + - alert: VeleroBackupFailure + expr: increase(velero_backup_failure_total{schedule="vespa-weekly"}[1h]) > 0 + for: 0m + labels: + severity: critical + annotations: + summary: "Velero: vespa-weekly backup FAILED" + description: "A 'vespa-weekly' backup failed in the last hour. Inspect it with `velero backup describe --details`." + # A backup partially failed (some items errored). + - alert: VeleroBackupPartialFailure + expr: increase(velero_backup_partial_failure_total{schedule="vespa-weekly"}[1h]) > 0 + for: 0m + labels: + severity: warning + annotations: + summary: "Velero: vespa-weekly backup partially failed" + description: "A 'vespa-weekly' backup partially failed in the last hour (some resources/snapshots errored)." + # A volume (disk) snapshot failed — the index data didn't get captured. + - alert: VeleroVolumeSnapshotFailure + expr: increase(velero_volume_snapshot_failure_total[1h]) > 0 + for: 0m + labels: + severity: critical + annotations: + summary: "Velero: volume snapshot failed" + description: "A PV snapshot failed in the last hour — the Vespa disk data may not be in the latest backup."
diff --git a/k8s/overlays/prod-velero/rbac.yaml b/k8s/overlays/prod-velero/rbac.yaml new file mode 100644 index 00000000000..be8e9e775a9 --- /dev/null +++ b/k8s/overlays/prod-velero/rbac.yaml @@ -0,0 +1,26 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: velero + namespace: velero + labels: + app.kubernetes.io/name: velero +--- +# Velero needs broad access to snapshot/restore arbitrary resources. This +# mirrors the upstream `velero install` default (cluster-admin). Scope it down +# later if desired, but the backup/restore controllers touch many resource +# types across the cluster. +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: velero + labels: + app.kubernetes.io/name: velero +subjects: + - kind: ServiceAccount + name: velero + namespace: velero +roleRef: + kind: ClusterRole + name: cluster-admin + apiGroup: rbac.authorization.k8s.io diff --git a/k8s/overlays/prod-velero/schedule.yaml b/k8s/overlays/prod-velero/schedule.yaml new file mode 100644 index 00000000000..68cfe4d978e --- /dev/null +++ b/k8s/overlays/prod-velero/schedule.yaml @@ -0,0 +1,31 @@ +# WEEKLY backup of ONLY the Vespa state worth keeping. +# +# Scope: namespace `darwin`, resources labeled `backup=vespa` (the 6 PVCs: +# vespa-var-vespa-content-{0,1,2} + vespa-var1-vespa-configserver-{0,1,2}), +# with their PVs snapshotted as Azure disk snapshots. +# +# Retention: ttl 504h = 21 days. Weekly cadence => the last 3 backups are +# retained (ages ~0/7/14d; the 21-day-old one is garbage-collected). This +# matches "retain last 3, max 21 days back, OK to lose up to a week on crash". +apiVersion: velero.io/v1 +kind: Schedule +metadata: + name: vespa-weekly + namespace: velero + labels: + app.kubernetes.io/name: velero +spec: + # Sundays 02:00 UTC. 
+ schedule: "0 2 * * 0" + useOwnerReferencesInBackup: false + template: + includedNamespaces: + - darwin + labelSelector: + matchLabels: + backup: vespa + snapshotVolumes: true + storageLocation: default + volumeSnapshotLocations: + - default + ttl: 504h0m0s diff --git a/k8s/overlays/prod-velero/service.yaml b/k8s/overlays/prod-velero/service.yaml new file mode 100644 index 00000000000..a9d79e893b5 --- /dev/null +++ b/k8s/overlays/prod-velero/service.yaml @@ -0,0 +1,16 @@ +# Exposes Velero's :8085 metrics endpoint so the robusta Prometheus (via the +# ServiceMonitor) can scrape it. +apiVersion: v1 +kind: Service +metadata: + name: velero + namespace: velero + labels: + app.kubernetes.io/name: velero +spec: + selector: + app.kubernetes.io/name: velero + ports: + - name: http-monitoring + port: 8085 + targetPort: http-monitoring diff --git a/k8s/overlays/prod-velero/servicemonitor.yaml b/k8s/overlays/prod-velero/servicemonitor.yaml new file mode 100644 index 00000000000..986284bcded --- /dev/null +++ b/k8s/overlays/prod-velero/servicemonitor.yaml @@ -0,0 +1,19 @@ +# Makes the robusta Prometheus scrape Velero's metrics. The `release: robusta` +# label is REQUIRED — it matches that Prometheus's serviceMonitorSelector +# (matchLabels release=robusta); without it the target is silently ignored. +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: velero + namespace: velero + labels: + app.kubernetes.io/name: velero + release: robusta +spec: + selector: + matchLabels: + app.kubernetes.io/name: velero + endpoints: + - port: http-monitoring + path: /metrics + interval: 30s diff --git a/k8s/overlays/prod-velero/slack-notify.env.example b/k8s/overlays/prod-velero/slack-notify.env.example new file mode 100644 index 00000000000..25e257bf5e5 --- /dev/null +++ b/k8s/overlays/prod-velero/slack-notify.env.example @@ -0,0 +1,12 @@ +# Slack notifier credentials — TEMPLATE. Copy to `slack-notify.env` (gitignored) +# and fill in. 
+# +# SLACK_BOT_TOKEN: reuse the existing bot token already in the cluster: +# kubectl get secret danswer-secrets -n darwin \ +# -o jsonpath='{.data.DANSWER_BOT_SLACK_BOT_TOKEN}' | base64 -d +# The bot needs the `chat:write` scope and MUST be invited to #darwin-devs +# (the bot-token path posts by channel name/ID, not the channel email). +# +# SLACK_CHANNEL: the channel name or ID the bot posts to. +SLACK_BOT_TOKEN=xoxb-REPLACE_ME +SLACK_CHANNEL=darwin-devs diff --git a/k8s/overlays/prod-velero/volumesnapshotlocation.yaml b/k8s/overlays/prod-velero/volumesnapshotlocation.yaml new file mode 100644 index 00000000000..e58357377e6 --- /dev/null +++ b/k8s/overlays/prod-velero/volumesnapshotlocation.yaml @@ -0,0 +1,16 @@ +# Where Velero creates the Azure Managed Disk snapshots for the backed-up PVs. +# Snapshots land in the AKS node RG (MC_darwin_darwin_westeurope) — NOT the +# lock-protected `darwin` RG — so TTL-driven snapshot GC can delete them. The +# SP 155fae3c holds Contributor there. +apiVersion: velero.io/v1 +kind: VolumeSnapshotLocation +metadata: + name: default + namespace: velero + labels: + app.kubernetes.io/name: velero +spec: + provider: velero.io/azure + config: + resourceGroup: MC_darwin_darwin_westeurope + subscriptionId: 202e5d15-5356-4826-bc61-ebd449d12e34 diff --git a/k8s/overlays/prod/env.properties b/k8s/overlays/prod/env.properties index 331889a5472..f48d1bdeb28 100644 --- a/k8s/overlays/prod/env.properties +++ b/k8s/overlays/prod/env.properties @@ -75,6 +75,43 @@ ASYM_QUERY_PREFIX= ASYM_PASSAGE_PREFIX= ENABLE_RERANKING_REAL_TIME_FLOW= ENABLE_RERANKING_ASYNC_FLOW= +# Cross-encoder reranking + LLM relevance filter — DISABLED cluster-wide. +# These are the single source of truth: when false the backend never runs the +# feature (regardless of per-assistant flags) AND the chat/assistant UIs hide +# the toggles (the values are surfaced via /settings). 
Flip to true (and add a +# GPU node + the tei-rerank component for reranking) to re-enable — code is +# intact, this is purely config-driven. The GPU node pool can be removed while +# RERANK_ENABLED=false. +RERANK_ENABLED=false +# RERANK_SERVER_URL=http://tei-rerank-service:80 # only used when RERANK_ENABLED=true +LLM_RELEVANCE_FILTER_ENABLED=false +# Source diversity at final doc selection: reserve up to N front slots for the +# highest-ranked docs from PROTECTED_SOURCES so curated content isn't crowded +# out of the prompt by a chatty source (e.g. Slack). These match the code +# defaults — set explicitly here for visibility/tuning (set SLOTS=0 to disable). +PROTECTED_SOURCES=web,sfkbarticles,highspot,outsystems +SOURCE_DIVERSITY_RESERVED_SLOTS=3 +# Recall guarantee: run an extra source-scoped retrieval so up to N PROTECTED_SOURCES +# docs always reach the candidate set, even when a chatty source (Slack) saturates the +# top hits. Complements SOURCE_DIVERSITY_RESERVED_SLOTS (which only reserves final-prompt +# slots among already-retrieved docs). Applies to all assistants + chat & Slack flows. +SOURCE_RESERVED_RETRIEVAL_SLOTS=6 +# Cap docs-per-source in the final LLM prompt so a chatty source (e.g. a busy Slack +# channel) can't contribute dozens of docs and monopolize grounding/citations, +# drowning out curated sources. Keeps top-N per source (after diversity promotion). +MAX_PROMPT_DOCS_PER_SOURCE=8 +# Verify-then-retain authoritative citations: after generation, if a promoted +# authoritative (PROTECTED_SOURCES) doc was NOT cited by the LLM, one batched LLM +# call checks whether it supports the answer; supporting docs are appended as an +# "Authoritative sources" footer. Additive + deduped + fail-closed. Conditional — +# only costs an extra call when an uncited authoritative doc is in context. 
+AUTHORITATIVE_CITATION_RETENTION_ENABLED=true +# Versioned-docs dedup at final doc selection: for URLs containing this substring +# (documentation that publishes the same page per product version), keep only the +# newest version's chunk(s) and drop older-version duplicates, so the LLM context +# isn't flooded with copies of one page. Other sources are untouched. Empty to +# disable. +DOCS_VERSION_DEDUP_URL_SUBSTR=docs.uipath.com # --- LLM --- GEN_AI_MODEL_PROVIDER=custom @@ -111,6 +148,13 @@ DISABLE_GENERATIVE_AI= # --- Indexing --- NUM_INDEXING_WORKERS=2 +# Per-source indexing concurrency. Global default is 1 attempt per source +# (protects rate-limited / shared-credential sources like Slack, Jira, +# Confluence). Override lifts the cap for `web` only — every web cc-pair has +# its own dummy credential and mostly distinct domains, so it parallelizes +# safely. 0 = uncapped (bounded by NUM_INDEXING_WORKERS + the per-cc-pair +# lock). Enforced in the scheduler — see configs/indexing_concurrency.py. +INDEXING_PER_SOURCE_CAP_OVERRIDES=web=0 ENABLED_CONNECTOR_TYPES= DISABLE_INDEX_UPDATE_ON_SWAP= DASK_JOB_CLIENT_ENABLED=true diff --git a/k8s/overlays/prod/kustomization.yaml b/k8s/overlays/prod/kustomization.yaml index 7e492172b1c..c6f327e5f98 100644 --- a/k8s/overlays/prod/kustomization.yaml +++ b/k8s/overlays/prod/kustomization.yaml @@ -16,6 +16,11 @@ resources: components: - ../../optional/background-scaling + # GPU cross-encoder reranker. DISABLED (RERANK_ENABLED=false in env.properties), + # so the tei-rerank component + GPU node pool are removed. To re-enable: set + # RERANK_ENABLED=true + RERANK_SERVER_URL, re-add the component below, and add a + # GPU node pool. 
+ # - ../../optional/tei-rerank namespace: darwin @@ -25,10 +30,10 @@ namespace: darwin images: - name: danswer-backend newName: sfbrdevhelmweacr.azurecr.io/danswer/danswer-backend - newTag: vha-147 + newTag: vha-204 - name: danswer-web-server newName: sfbrdevhelmweacr.azurecr.io/danswer/danswer-web-server - newTag: vha-77 + newTag: vha-103 - name: danswer-model-server newName: danswer/danswer-model-server newTag: v0.3.94 diff --git a/k8s/scripts/build-deploy.sh b/k8s/scripts/build-deploy.sh index fdb0a515bf7..ce8c9fe9ce2 100755 --- a/k8s/scripts/build-deploy.sh +++ b/k8s/scripts/build-deploy.sh @@ -31,6 +31,13 @@ # docker push $REGISTRY/danswer-backend:vha-N # docker push $REGISTRY/danswer-web-server:vha-M # +# Apple Silicon: the web image is NEVER built locally. Its `next build` step +# SIGSEGVs under linux/amd64 emulation on arm64 Macs, so on Apple Silicon this +# script builds web on darwinacr (native amd64 ACR build agents) and imports the +# result into the prod registry — automatically, no flags needed. Backend still +# builds locally (pure Python, no native build step). Force the old local web +# build with FORCE_LOCAL_WEB_BUILD=1 (only useful on an amd64 host). +# # Safety: # - `deploy` refuses unless the kubectl context is the prod cluster # ($PROD_CONTEXT) — the prod overlay targets it. Override with FORCE=1. @@ -74,6 +81,18 @@ img_build_extra() { case "$1" in web) echo "--load";; *) echo "";; esac; } # which live deployment to read the running tag from, for `verify` img_verify_deploy(){ case "$1" in backend) echo api-server-deployment;; web) echo web-server-deployment;; esac; } +# ---- Apple Silicon / web build routing ------------------------------------ +# The web image's `next build` step SIGSEGVs when built for linux/amd64 under +# emulation on Apple Silicon (the only way to produce an amd64 image locally on +# an arm64 Mac). 
So on Apple Silicon we NEVER build web locally — we build it on +# darwinacr's native-amd64 ACR build agents and import the result into the prod +# registry. Backend is pure Python (no native build step) and builds fine under +# emulation, so it stays local. Escape hatch: FORCE_LOCAL_WEB_BUILD=1. +CLOUD_BUILD_REGISTRY="darwinacr" # native amd64 build agents +NODE_BASE_MIRROR="darwinacr.azurecr.io/library/node:20-alpine" # avoids docker.io pulls in ACR build +is_apple_silicon() { [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; } +web_uses_cloud_build() { is_apple_silicon && [ "${FORCE_LOCAL_WEB_BUILD:-0}" != "1" ]; } + # ---- logging -------------------------------------------------------------- log() { printf '\033[1;34m==>\033[0m %s\n' "$*"; } ok() { printf '\033[1;32m ok\033[0m %s\n' "$*"; } @@ -235,11 +254,50 @@ for c in "${COMPONENTS[@]}"; do printf ' %-8s %s -> %s\n' "$c" "$cur" "$(next_tag "$cur")" done +# Build the web image on the cloud (darwinacr, native amd64). Used on Apple +# Silicon, where a local amd64 `next build` SIGSEGVs under emulation. CWD is +# $REPO_ROOT (set in the build section), so ./web + web/Dockerfile resolve. +cloud_build_web_image() { + local tag="$1" + run az acr build --registry "$CLOUD_BUILD_REGISTRY" \ + --image "danswer/danswer-web-server:$tag" \ + --build-arg NODE_BASE="$NODE_BASE_MIRROR" \ + --file web/Dockerfile ./web \ + || die "cloud web build failed on $CLOUD_BUILD_REGISTRY (az acr build)" +} + +# Copy a cloud-built web image from darwinacr into the prod registry. darwinacr +# and the prod registry are in DIFFERENT subscriptions, so we transfer by +# pull -> retag -> push rather than `az acr import` (cross-sub import auth is +# unreliable here). This is a pure blob copy on the Mac — no `next`/V8 executes, +# so no SIGSEGV; the digest is preserved. Prod push uses the docker login from +# registry_login (ACR_USERNAME/ACR_PASSWORD); the darwinacr pull needs an +# `az acr login` (PIM Contributor). 
+import_web_to_prod() { + local tag="$1" + local src="$CLOUD_BUILD_REGISTRY.azurecr.io/danswer/danswer-web-server:$tag" + local dst="$REGISTRY/danswer-web-server:$tag" + run az acr login --name "$CLOUD_BUILD_REGISTRY" \ + || die "az acr login to $CLOUD_BUILD_REGISTRY failed — activate PIM Contributor and retry" + run docker pull --platform linux/amd64 "$src" || die "pull $src failed" + run docker tag "$src" "$dst" + run docker push "$dst" || die "push $dst failed" +} + # ---- build ---------------------------------------------------------------- ensure_disk_space log "BUILD (linux/amd64)" cd "$REPO_ROOT" for c in "${COMPONENTS[@]}"; do + # On Apple Silicon the web image is built on the cloud (native amd64) and + # never locally — see web_uses_cloud_build / the routing comment above. + if [ "$c" = "web" ] && web_uses_cloud_build; then + log "build web on $CLOUD_BUILD_REGISTRY (cloud, native amd64) — local 'next build' SIGSEGVs under Apple Silicon emulation" + cloud_build_web_image "$(component_next_tag web)" + WEB_CLOUD_BUILT=1 + ok "built web on $CLOUD_BUILD_REGISTRY" + continue + fi local_tag="$(img_local "$c"):latest" log "build $c -> $local_tag" # shellcheck disable=SC2046,SC2086 @@ -253,6 +311,15 @@ done log "PUSH -> $REGISTRY" registry_login # docker login using $ACR_USERNAME/$ACR_PASSWORD (see helper) for c in "${COMPONENTS[@]}"; do + # Cloud-built web (Apple Silicon) is already in darwinacr — import it into the + # prod registry instead of docker-pushing a local image that doesn't exist. 
+ if [ "$c" = "web" ] && [ "${WEB_CLOUD_BUILT:-0}" = "1" ]; then + web_tag="$(component_next_tag web)" + log "import web $web_tag: $CLOUD_BUILD_REGISTRY -> $REGISTRY_HOST" + import_web_to_prod "$web_tag" + ok "imported $REGISTRY/danswer-web-server:$web_tag" + continue + fi local_tag="$(img_local "$c"):latest" remote_tag="$REGISTRY/$(img_logical "$c"):$(component_next_tag "$c")" run docker tag "$local_tag" "$remote_tag" @@ -281,6 +348,24 @@ for c in "${COMPONENTS[@]}"; do APPLIED+=("$c=$nxt") ok "kustomization newTag $(img_logical "$c") -> $nxt" done + +# Safety gate: never `kubectl apply` a manifest that points at an image which +# isn't actually in the registry. An interrupted/killed run (e.g. a backgrounded +# deploy whose build/push was SIGKILLed) can leave the tag bumped without the +# image pushed; applying that rolls the deployment onto an unpullable tag -> +# ImagePullBackOff, and for the dask scheduler it cascades the whole indexing +# pipeline down. Verify each tag we're about to apply first (docker is already +# logged in to $REGISTRY from the push stage). +log "verify target images exist in $REGISTRY before applying" +for entry in "${APPLIED[@]}"; do + c="${entry%%=*}"; tag="${entry##*=}" + ref="$REGISTRY/$(img_logical "$c"):$tag" + if ! docker manifest inspect "$ref" >/dev/null 2>&1; then + die "image $ref is NOT in the registry — refusing to apply (would cause ImagePullBackOff). Re-run the 'push' stage (the manifest is bumped but the cluster is untouched)." + fi + ok "registry has $ref" +done + log "kubectl apply -k $OVERLAY_DIR (ns=$NAMESPACE, context=$ctx)" run kubectl apply -k "$OVERLAY_DIR" ok "applied. Rollout status:" diff --git a/web/Dockerfile b/web/Dockerfile index 585b1fb756a..9087d058b1d 100644 --- a/web/Dockerfile +++ b/web/Dockerfile @@ -1,4 +1,9 @@ -FROM node:20-alpine AS base +# Base node image. Overridable so CI can pull from a registry that isn't +# subject to Docker Hub's unauthenticated pull-rate limit (e.g. 
our ACR mirror, +# `az acr build --build-arg NODE_BASE=darwinacr.azurecr.io/library/node:20-alpine`). +# Defaults to Docker Hub so local builds are unchanged. +ARG NODE_BASE=node:20-alpine +FROM ${NODE_BASE} AS base LABEL com.danswer.maintainer="founders@danswer.ai" LABEL com.danswer.description="This image is the web/frontend container of Danswer which \ diff --git a/web/package-lock.json b/web/package-lock.json index b61c964e258..b8abe518fd2 100644 --- a/web/package-lock.json +++ b/web/package-lock.json @@ -52,7 +52,8 @@ "@tailwindcss/typography": "^0.5.10", "eslint": "^8.48.0", "eslint-config-next": "^14.2.35", - "prettier": "2.8.8" + "prettier": "2.8.8", + "vitest": "^1.6.1" } }, "node_modules/@alloc/quick-lru": { @@ -70,6 +71,7 @@ "version": "2.3.0", "resolved": "https://registry.npmjs.org/@ampproject/remapping/-/remapping-2.3.0.tgz", "integrity": "sha512-30iZtAPgz+LTIYoeivqYo853f02jBYSd5uGnGpkFV0M3xOt9aN73erkgYAmZU43x4VfqcnLxW9Kpg3R5LC4YYw==", + "peer": true, "dependencies": { "@jridgewell/gen-mapping": "^0.3.5", "@jridgewell/trace-mapping": "^0.3.24" @@ -96,6 +98,7 @@ "version": "7.24.4", "resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.24.4.tgz", "integrity": "sha512-vg8Gih2MLK+kOkHJp4gBEIkyaIi00jgWot2D9QOmmfLC8jINSOzmCLta6Bvz/JSBCqnegV0L80jhxkol5GWNfQ==", + "peer": true, "engines": { "node": ">=6.9.0" } @@ -134,6 +137,7 @@ "version": "6.3.1", "resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz", "integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==", + "peer": true, "bin": { "semver": "bin/semver.js" } @@ -167,6 +171,7 @@ "version": "7.23.6", "resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.23.6.tgz", "integrity": "sha512-9JB548GZoQVmzrFgp8o7KxdgkTGm6xs9DW0o/Pim72UDjzr5ObUQ6ZzYPqA+g9OTS2bBQoctLJrky0RDCAWRgQ==", + "peer": true, "dependencies": { "@babel/compat-data": "^7.23.5", 
"@babel/helper-validator-option": "^7.23.5", @@ -182,6 +187,7 @@ "version": "5.1.1", "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-5.1.1.tgz", "integrity": "sha512-KpNARQA3Iwv+jTA0utUVVbrh+Jlrr1Fv0e56GGzAFOXN7dk/FviaDW8LHmK52DlcH4WP2n6gI8vN1aesBFgo9w==", + "peer": true, "dependencies": { "yallist": "^3.0.2" } @@ -190,6 +196,7 @@ "version": "6.3.1", "resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz", "integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==", + "peer": true, "bin": { "semver": "bin/semver.js" } @@ -240,6 +247,7 @@ "version": "7.24.5", "resolved": "https://registry.npmjs.org/@babel/helper-module-transforms/-/helper-module-transforms-7.24.5.tgz", "integrity": "sha512-9GxeY8c2d2mdQUP1Dye0ks3VDyIMS98kt/llQ2nUId8IsWqTF0l1LkSX0/uP7l7MCDrzXS009Hyhe2gzTiGW8A==", + "peer": true, "dependencies": { "@babel/helper-environment-visitor": "^7.22.20", "@babel/helper-module-imports": "^7.24.3", @@ -266,6 +274,7 @@ "version": "7.24.5", "resolved": "https://registry.npmjs.org/@babel/helper-simple-access/-/helper-simple-access-7.24.5.tgz", "integrity": "sha512-uH3Hmf5q5n7n8mz7arjUlDOCbttY/DW4DYhE6FUsjKJ/oYC1kQQUvwEQWxRwUpX9qQKRXeqLwWxrqilMrf32sQ==", + "peer": true, "dependencies": { "@babel/types": "^7.24.5" }, @@ -306,6 +315,7 @@ "version": "7.23.5", "resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.23.5.tgz", "integrity": "sha512-85ttAOMLsr53VgXkTbkx8oA6YTfT4q7/HzXSLEYmjcSTJPMPQtvq1BD79Byep5xMUYbGRzEpDsjUf3dyp54IKw==", + "peer": true, "engines": { "node": ">=6.9.0" } @@ -315,6 +325,7 @@ "resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.29.2.tgz", "integrity": "sha512-HoGuUs4sCZNezVEKdVcwqmZN8GoHirLUcLaYVNBK2J0DadGtdcqgr3BCbvH8+XUo4NGjNl3VOtSjEKNzqfFgKw==", "license": "MIT", + "peer": true, "dependencies": { "@babel/template": "^7.28.6", "@babel/types": "^7.29.0" @@ -431,7 +442,6 @@ "version": "6.1.0", "resolved": 
"https://registry.npmjs.org/@dnd-kit/core/-/core-6.1.0.tgz", "integrity": "sha512-J3cQBClB4TVxwGo3KEjssGEXNJqGVWx17aRTZ1ob0FliR5IjYgTxl5YJbKTzA6IzrtelotH19v6y7uoIRUZPSg==", - "peer": true, "dependencies": { "@dnd-kit/accessibility": "^3.1.0", "@dnd-kit/utilities": "^3.2.2", @@ -497,10 +507,401 @@ "resolved": "https://registry.npmjs.org/@emotion/stylis/-/stylis-0.8.5.tgz", "integrity": "sha512-h6KtPihKFn3T9fuIrwvXXUOwlx3rfUvfZIcP5a6rh8Y7zjE3O06hT5Ss4S/YI1AYhuZ1kjaE/5EaOOI2NqSylQ==" }, - "node_modules/@emotion/unitless": { - "version": "0.7.5", - "resolved": "https://registry.npmjs.org/@emotion/unitless/-/unitless-0.7.5.tgz", - "integrity": "sha512-OWORNpfjMsSSUBVrRBVGECkhWcULOAJz9ZW8uK9qgxD+87M7jHRcvh/A96XXNhXTLmKcoYSQtBEX7lHMO7YRwg==" + "node_modules/@emotion/unitless": { + "version": "0.7.5", + "resolved": "https://registry.npmjs.org/@emotion/unitless/-/unitless-0.7.5.tgz", + "integrity": "sha512-OWORNpfjMsSSUBVrRBVGECkhWcULOAJz9ZW8uK9qgxD+87M7jHRcvh/A96XXNhXTLmKcoYSQtBEX7lHMO7YRwg==" + }, + "node_modules/@esbuild/aix-ppc64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.21.5.tgz", + "integrity": "sha512-1SDgH6ZSPTlggy1yI6+Dbkiz8xzpHJEVAlF/AM1tHPLsf5STom9rwtjE4hKAF20FfXXNTFqEYXyJNWh1GiZedQ==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "aix" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-arm": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.21.5.tgz", + "integrity": "sha512-vCPvzSjpPHEi1siZdlvAlsPxXl7WbOVUBBAowWug4rJHb68Ox8KualB+1ocNvT5fjv6wpkX6o/iEpbDrf68zcg==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.21.5.tgz", + 
"integrity": "sha512-c0uX9VAUBQ7dTDCjq+wdyGLowMdtR/GoC2U5IYk/7D1H1JYC0qseD7+11iMP2mRLN9RcCMRcjC4YMclCzGwS/A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.21.5.tgz", + "integrity": "sha512-D7aPRUUNHRBwHxzxRvp856rjUHRFW1SdQATKXH2hqA0kAZb1hKmi02OpYRacl0TxIGz/ZmXWlbZgjwWYaCakTA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/darwin-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.21.5.tgz", + "integrity": "sha512-DwqXqZyuk5AiWWf3UfLiRDJ5EDd49zg6O9wclZ7kUMv2WRFr4HKjXp/5t8JZ11QbQfUS6/cRCKGwYhtNAY88kQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/darwin-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.21.5.tgz", + "integrity": "sha512-se/JjF8NlmKVG4kNIuyWMV/22ZaerB+qaSi5MdrXtd6R08kvs2qCN4C09miupktDitvh8jRFflwGFBQcxZRjbw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/freebsd-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.21.5.tgz", + "integrity": "sha512-5JcRxxRDUJLX8JXp/wcBCy3pENnCgBR9bN6JsY4OmhfUtIHe3ZW0mawA7+RDAcMLrMIZaf03NlQiX9DGyB8h4g==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/freebsd-x64": { + "version": "0.21.5", + 
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.21.5.tgz", + "integrity": "sha512-J95kNBj1zkbMXtHVH29bBriQygMXqoVQOQYA+ISs0/2l3T9/kj42ow2mpqerRBxDJnmkUDCaQT/dfNXWX/ZZCQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-arm": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.21.5.tgz", + "integrity": "sha512-bPb5AHZtbeNGjCKVZ9UGqGwo8EUu4cLq68E95A53KlxAPRmUyYv2D6F0uUI65XisGOL1hBP5mTronbgo+0bFcA==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.21.5.tgz", + "integrity": "sha512-ibKvmyYzKsBeX8d8I7MH/TMfWDXBF3db4qM6sy+7re0YXya+K1cem3on9XgdT2EQGMu4hQyZhan7TeQ8XkGp4Q==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-ia32": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.21.5.tgz", + "integrity": "sha512-YvjXDqLRqPDl2dvRODYmmhz4rPeVKYvppfGYKSNGdyZkA01046pLWyRKKI3ax8fbJoK5QbxblURkwK/MWY18Tg==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-loong64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.21.5.tgz", + "integrity": "sha512-uHf1BmMG8qEvzdrzAqg2SIG/02+4/DHB6a9Kbya0XDvwDEKCoC8ZRWI5JJvNdUjtciBGFQ5PuBlpEOXQj+JQSg==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } 
+ }, + "node_modules/@esbuild/linux-mips64el": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.21.5.tgz", + "integrity": "sha512-IajOmO+KJK23bj52dFSNCMsz1QP1DqM6cwLUv3W1QwyxkyIWecfafnI555fvSGqEKwjMXVLokcV5ygHW5b3Jbg==", + "cpu": [ + "mips64el" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-ppc64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.21.5.tgz", + "integrity": "sha512-1hHV/Z4OEfMwpLO8rp7CvlhBDnjsC3CttJXIhBi+5Aj5r+MBvy4egg7wCbe//hSsT+RvDAG7s81tAvpL2XAE4w==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-riscv64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.21.5.tgz", + "integrity": "sha512-2HdXDMd9GMgTGrPWnJzP2ALSokE/0O5HhTUvWIbD3YdjME8JwvSCnNGBnTThKGEB91OZhzrJ4qIIxk/SBmyDDA==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-s390x": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.21.5.tgz", + "integrity": "sha512-zus5sxzqBJD3eXxwvjN1yQkRepANgxE9lgOW2qLnmr8ikMTphkjgXu1HR01K4FJg8h1kEEDAqDcZQtbrRnB41A==", + "cpu": [ + "s390x" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.21.5.tgz", + "integrity": "sha512-1rYdTpyv03iycF1+BhzrzQJCdOuAOtaqHTWJZCWvijKD2N5Xu0TtVC8/+1faWqcP9iBCWOmjmhoH94dH82BxPQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": 
"MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/netbsd-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.21.5.tgz", + "integrity": "sha512-Woi2MXzXjMULccIwMnLciyZH4nCIMpWQAs049KEeMvOcNADVxo0UBIQPfSmxB3CWKedngg7sWZdLvLczpe0tLg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/openbsd-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.21.5.tgz", + "integrity": "sha512-HLNNw99xsvx12lFBUwoT8EVCsSvRNDVxNpjZ7bPn947b8gJPzeHWyNVhFsaerc0n3TsbOINvRP2byTZ5LKezow==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/sunos-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.21.5.tgz", + "integrity": "sha512-6+gjmFpfy0BHU5Tpptkuh8+uw3mnrvgs+dSPQXQOv3ekbordwnzTVEb4qnIvQcYXq6gzkyTnoZ9dZG+D4garKg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "sunos" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.21.5.tgz", + "integrity": "sha512-Z0gOTd75VvXqyq7nsl93zwahcTROgqvuAcYDUr+vOv8uHhNSKROyU961kgtCD1e95IqPKSQKH7tBTslnS3tA8A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-ia32": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.21.5.tgz", + "integrity": 
"sha512-SWXFF1CL2RVNMaVs+BBClwtfZSvDgtL//G/smwAc5oVK/UPu2Gu9tIaRgFmYFFKrmg3SyAjSrElf0TiJ1v8fYA==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.21.5.tgz", + "integrity": "sha512-tQd/1efJuzPC6rCFwEvLtci/xNFcTZknmXs98FYDfGE4wP9ClFV98nyKrzJKVPMhdDnjzLhdUyMX4PsQAPjwIw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } }, "node_modules/@eslint-community/eslint-utils": { "version": "4.9.1", @@ -727,6 +1128,19 @@ "url": "https://github.com/chalk/strip-ansi?sponsor=1" } }, + "node_modules/@jest/schemas": { + "version": "29.6.3", + "resolved": "https://registry.npmjs.org/@jest/schemas/-/schemas-29.6.3.tgz", + "integrity": "sha512-mo5j5X+jIZmJQveBKeS/clAueipV7KgiX1vMgCxam1RNYiqE1w62n0/tJJnHtjW8ZHcQco5gY85jA3mi0L+nSA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@sinclair/typebox": "^0.27.8" + }, + "engines": { + "node": "^14.15.0 || ^16.10.0 || >=18.0.0" + } + }, "node_modules/@jridgewell/gen-mapping": { "version": "0.3.5", "resolved": "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.5.tgz", @@ -757,9 +1171,10 @@ } }, "node_modules/@jridgewell/sourcemap-codec": { - "version": "1.4.15", - "resolved": "https://registry.npmjs.org/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.4.15.tgz", - "integrity": "sha512-eF2rxCRulEKXHTRiDrDy6erMYWqNw4LPdQ8UQA4huuxaQsVeRPFl2oM8oDGxMFhJUWZf9McpLtJasDDZb/Bpeg==" + "version": "1.5.5", + "resolved": "https://registry.npmjs.org/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.5.5.tgz", + "integrity": "sha512-cYQ9310grqxueWbl+WuIUIaiUaDcj7WOq5fVhEljNVgRfOUhY9fy2zTvfoqWsnebh8Sl70VScFbICvJnLKB0Og==", + "license": "MIT" }, "node_modules/@jridgewell/trace-mapping": { "version": "0.3.25", 
@@ -1561,13 +1976,402 @@ } } }, - "node_modules/@radix-ui/rect": { - "version": "1.0.1", - "resolved": "https://registry.npmjs.org/@radix-ui/rect/-/rect-1.0.1.tgz", - "integrity": "sha512-fyrgCaedtvMg9NK3en0pnOYJdtfwxUcNolezkNPUsoX57X8oQk+NkqcvzHXD2uKNij6GXmWU9NDru2IWjrO4BQ==", - "dependencies": { - "@babel/runtime": "^7.13.10" - } + "node_modules/@radix-ui/rect": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/@radix-ui/rect/-/rect-1.0.1.tgz", + "integrity": "sha512-fyrgCaedtvMg9NK3en0pnOYJdtfwxUcNolezkNPUsoX57X8oQk+NkqcvzHXD2uKNij6GXmWU9NDru2IWjrO4BQ==", + "dependencies": { + "@babel/runtime": "^7.13.10" + } + }, + "node_modules/@rollup/rollup-android-arm-eabi": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.62.0.tgz", + "integrity": "sha512-IPIQ55ythEHkfEd9jMEi32OQ7SxURsGA43JI22lj01OLZNt2NUbJX8YUHxkVWyQ6daHPNn0truF5nSj3DQp6YQ==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-android-arm64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.62.0.tgz", + "integrity": "sha512-M6s9cr10MibETyo8JsOkq+Lo1+lU6hcvb1MApnUql5qte/5hMEgzlN8/ReIKNfRV8rrqX50W1BX9zoUhC192RA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-darwin-arm64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.62.0.tgz", + "integrity": "sha512-BqCoMoIbn0keKys+dEAdBa70EtOwV1bEsQCUgU9FdiZmmMge/Zk7LlkYGqbrdHR+Frnt0E1FOanly+rlwvvQzw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-darwin-x64": { + "version": "4.62.0", + "resolved": 
"https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.62.0.tgz", + "integrity": "sha512-SIMzST3VFNXDAbeIWDWiFCNM5qncUBDWaEV7NfE7oZbDt2mgfW4MvbKdbYiGOLoM32gbTv608UMd0XktEYSD7w==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-freebsd-arm64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.62.0.tgz", + "integrity": "sha512-ezjfSQMP7ArdUsbBwbQIfwAlhE84I2iVnzQNCFSveqV42q+BmKlzVpf7mxv5EchLcoWU4y6/heFzVg1F+hodUQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-freebsd-x64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.62.0.tgz", + "integrity": "sha512-9+qTWGW9AZRhnUgwtTwzNwcPlL87ngkeN0LA+q1bADvmY9aNvWaF2TFW8BZgnQPYxpDI7+rMVLivcd4V737TAQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-linux-arm-gnueabihf": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.62.0.tgz", + "integrity": "sha512-T1dMEQhXA/jkJ/jyMIw9IovK8bSUq7A8kLIlvZTb/6YIVsp2zLavr4F3oyllHWo7eIVJRyE5n3tUjQJEbE1IuQ==", + "cpu": [ + "arm" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm-musleabihf": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.62.0.tgz", + "integrity": "sha512-2as0LgT7qQpyceQq6VUJYnumUMUrgGQCWIiDIN9DE0/tglsk6o66uCB4f3djRawAltvfCNLyZZrsqbPA6inCsA==", + "cpu": [ + "arm" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + 
] + }, + "node_modules/@rollup/rollup-linux-arm64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.62.0.tgz", + "integrity": "sha512-bVURMg+6eNN9C/yc0aVjooZcwTTtYF4YW3xta5pP0//r3o1V8gXEHXWCndj47w/HhwsFroZrFhR+6uQP5T0n0g==", + "cpu": [ + "arm64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm64-musl": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.62.0.tgz", + "integrity": "sha512-Ful8pM/2yYI83PViWdFdpZhdI8HJ5qsXANe5atypbHDf+KIBBDsZsbyy8hbXnULVvW9NsTh5DHwbcBftyLTfiw==", + "cpu": [ + "arm64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-gnu/-/rollup-linux-loong64-gnu-4.62.0.tgz", + "integrity": "sha512-9Gp/DgrkzfUBmNPVTyPTvay+4xEP7M/clXpj3efXBcm6uTIVIgDg4rqUpqKXvLEuFRVuEpSAOkhgNeecvaZ4Cg==", + "cpu": [ + "loong64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-musl": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-musl/-/rollup-linux-loong64-musl-4.62.0.tgz", + "integrity": "sha512-m9tsJz54LUXkSYM8+8PG81B9IKK5r+2T0clMq4QrS16xFosufU7firBDAZEsDheDs7wTlP7h3++S7lMsU955HA==", + "cpu": [ + "loong64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-ppc64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-gnu/-/rollup-linux-ppc64-gnu-4.62.0.tgz", + "integrity": 
"sha512-3UvJ5PNVU16aJf6M3tFI24pWzAl2/ynfbyRN3ICyQajK1lSkrnVYNnLz3v04J32qKa0FczJc22zeToc0lr2A3w==", + "cpu": [ + "ppc64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-ppc64-musl": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-musl/-/rollup-linux-ppc64-musl-4.62.0.tgz", + "integrity": "sha512-vRWUAbYLGHBZS6Q8Msb2sfnf1fvJf+47t8l/TwOerM2qArzy+IeNMTHrYLHXh95h8MoatPHI5hhSZNs+mGXKPg==", + "cpu": [ + "ppc64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.62.0.tgz", + "integrity": "sha512-c00T5SYENHAt86cfW47URaP3Us5vLC/4QO7GYud1G5VNRffCwwCuBspwqYrriuJB+5m0WFzClCn9wed0FBjKvg==", + "cpu": [ + "riscv64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-musl": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-musl/-/rollup-linux-riscv64-musl-4.62.0.tgz", + "integrity": "sha512-krrCDilhXOwFkSkO3Wm9I/f9H0L92XHHwy2fwxjukxIbh0dem8gZqOW5Y8BsHrpJv5qwlRBV+Wl4ZFyRWhUpwg==", + "cpu": [ + "riscv64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-s390x-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.62.0.tgz", + "integrity": "sha512-7pfYFSTc4/rUC/FtAI0Qp6QthDBCIi6/AuP1xYqFk5vanI6KnL5dWKP60OM/05LOsbwTmIcvr6eXC4CJuJ75IA==", + "cpu": [ + "s390x" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, 
+ "node_modules/@rollup/rollup-linux-x64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.62.0.tgz", + "integrity": "sha512-7SDIalKeIpG0Ifogbbdn58HmSotYMlf23K3dCJEmiVd9Fg36Vmni82iPQec27N3wY4Bvbxftkxz6vSx9OcouTg==", + "cpu": [ + "x64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-x64-musl": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.62.0.tgz", + "integrity": "sha512-eRZevouTH2i1HeAVLqJuLnt256krQkGY0TN6WsTmsIhuzbh457HuWDMakKwmi0Cjadux983CoSr8Lim2QhUIFw==", + "cpu": [ + "x64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-openbsd-x64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openbsd-x64/-/rollup-openbsd-x64-4.62.0.tgz", + "integrity": "sha512-3oVS7FLGa4U1qcvao9ylGxrjXZyUQqR8UwxEcnUEyPX53O/C/mKDZegNXTdHCP+h3e6ta/f1EN38Yif1mmZHYg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ] + }, + "node_modules/@rollup/rollup-openharmony-arm64": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openharmony-arm64/-/rollup-openharmony-arm64-4.62.0.tgz", + "integrity": "sha512-yTB9TgfWj5wHe5QgktAgXTLLot1gvEjl1NiPPAUiCs4oPrIWFl5V4nC3GrkNdj9LaAU4s94nVrGbGOCqUpyWsg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ] + }, + "node_modules/@rollup/rollup-win32-arm64-msvc": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.62.0.tgz", + "integrity": "sha512-5LOhoaesY3doG1c+ac/2JtgREpKoJr5bUHH8tKY0V8di7+uSV6BwLs2PlR0/yzefGOkR+wE7ZolZphHCsyG5Rw==", + "cpu": [ + "arm64" 
+ ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-ia32-msvc": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.62.0.tgz", + "integrity": "sha512-yYkWHhmbhRTWTnWos5HC4GcPQfjlzzCNbM9e/+GXrLuaBXYA3qSDR9f0Vgufd5S8yX81U8jPKp7ZnAjZFMtRnw==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-gnu": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-gnu/-/rollup-win32-x64-gnu-4.62.0.tgz", + "integrity": "sha512-SoTb6lPg25xZlA2ibwQ++ahCCnH+FP0qmEuafMJ4gznZKOlXioKEAeJLgCrqjM98ACziXM9V1amFjICVL4IFoA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-msvc": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.62.0.tgz", + "integrity": "sha512-5L+T1fMX4RIEBoZzT0+sQ0PhTS36NULFmMXtl1TZo44TMAROIMHbZufSOjVWt/Y622BtxgxtaNOokbTDvfsrZA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] }, "node_modules/@rtsao/scc": { "version": "1.1.0", @@ -1583,6 +2387,13 @@ "dev": true, "license": "MIT" }, + "node_modules/@sinclair/typebox": { + "version": "0.27.10", + "resolved": "https://registry.npmjs.org/@sinclair/typebox/-/typebox-0.27.10.tgz", + "integrity": "sha512-MTBk/3jGLNB2tVxv6uLlFh1iu64iYOQ2PbdOSK3NW8JZsmlaOh2q6sdtKowBhfw8QFLmYNzTW4/oK4uATIi6ZA==", + "dev": true, + "license": "MIT" + }, "node_modules/@swc/counter": { "version": "0.1.3", "resolved": "https://registry.npmjs.org/@swc/counter/-/counter-0.1.3.tgz", @@ -1721,9 +2532,10 @@ } }, "node_modules/@types/estree": { - "version": "1.0.5", - "resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.5.tgz", - 
"integrity": "sha512-/kYRxGDLWzHOB7q+wtSUQlFrtcdUccpfy+X+9iMBpHK8QLLhx2wIPYuS5DYtR9Wa/YlZAbIovy7qVdB1Aq6Lyw==" + "version": "1.0.9", + "resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.9.tgz", + "integrity": "sha512-GhdPgy1el4/ImP05X05Uw4cw2/M93BCUmnEvWZNStlCzEKME4Fkk+YpoA5OiHNQmoS7Cafb8Xa3Pya8m1Qrzeg==", + "license": "MIT" }, "node_modules/@types/estree-jsx": { "version": "1.0.5", @@ -1798,7 +2610,6 @@ "version": "18.0.32", "resolved": "https://registry.npmjs.org/@types/react/-/react-18.0.32.tgz", "integrity": "sha512-gYGXdtPQ9Cj0w2Fwqg5/ak6BcK3Z15YgjSqtyDizWUfx7mQ8drs0NBUzRRsAdoFVTO8kJ8L2TL8Skm7OFPnLUw==", - "peer": true, "dependencies": { "@types/prop-types": "*", "@types/scheduler": "*", @@ -1809,7 +2620,6 @@ "version": "18.0.11", "resolved": "https://registry.npmjs.org/@types/react-dom/-/react-dom-18.0.11.tgz", "integrity": "sha512-O38bPbI2CWtgw/OoQoY+BRelw7uysmXbWvw3nLWO21H1HSh+GOlqPuXshJfjmpNlKiiSDG9cc1JZAaMmVdcTlw==", - "peer": true, "dependencies": { "@types/react": "*" } @@ -1874,7 +2684,6 @@ "integrity": "sha512-TI1XGwKbDpo9tRW8UDIXCOeLk55qe9ZFGs8MTKU6/M08HWTw52DD/IYhfQtOEhEdPhLMT26Ka/x7p70nd3dzDg==", "dev": true, "license": "MIT", - "peer": true, "dependencies": { "@typescript-eslint/scope-manager": "8.59.0", "@typescript-eslint/types": "8.59.0", @@ -2117,12 +2926,115 @@ "resolved": "https://registry.npmjs.org/@ungap/structured-clone/-/structured-clone-1.2.0.tgz", "integrity": "sha512-zuVdFrMJiuCDQUMCzQaD6KL28MjnqqN8XnAqiEq9PNm/hCPTSGfrXCOfwj1ow4LFb/tNymJPwsNbVePc1xFqrQ==" }, + "node_modules/@vitest/expect": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/@vitest/expect/-/expect-1.6.1.tgz", + "integrity": "sha512-jXL+9+ZNIJKruofqXuuTClf44eSpcHlgj3CiuNihUF3Ioujtmc0zIa3UJOW5RjDK1YLBJZnWBlPuqhYycLioog==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/spy": "1.6.1", + "@vitest/utils": "1.6.1", + "chai": "^4.3.10" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + 
"node_modules/@vitest/runner": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/@vitest/runner/-/runner-1.6.1.tgz", + "integrity": "sha512-3nSnYXkVkf3mXFfE7vVyPmi3Sazhb/2cfZGGs0JRzFsPFvAMBEcrweV1V1GsrstdXeKCTXlJbvnQwGWgEIHmOA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/utils": "1.6.1", + "p-limit": "^5.0.0", + "pathe": "^1.1.1" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/runner/node_modules/p-limit": { + "version": "5.0.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-5.0.0.tgz", + "integrity": "sha512-/Eaoq+QyLSiXQ4lyYV23f14mZRQcXnxfHrN0vCai+ak9G0pp9iEQukIIZq5NccEvwRB8PUnZT0KsOoDCINS1qQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "yocto-queue": "^1.0.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@vitest/runner/node_modules/yocto-queue": { + "version": "1.2.2", + "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-1.2.2.tgz", + "integrity": "sha512-4LCcse/U2MHZ63HAJVE+v71o7yOdIe4cZ70Wpf8D/IyjDKYQLV5GD46B+hSTjJsvV5PztjvHoU580EftxjDZFQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12.20" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@vitest/snapshot": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/@vitest/snapshot/-/snapshot-1.6.1.tgz", + "integrity": "sha512-WvidQuWAzU2p95u8GAKlRMqMyN1yOJkGHnx3M1PL9Raf7AQ1kwLKg04ADlCa3+OXUZE7BceOhVZiuWAbzCKcUQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "magic-string": "^0.30.5", + "pathe": "^1.1.1", + "pretty-format": "^29.7.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/spy": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/@vitest/spy/-/spy-1.6.1.tgz", + "integrity": 
"sha512-MGcMmpGkZebsMZhbQKkAf9CX5zGvjkBTqf8Zx3ApYWXr3wG+QvEu2eXWfnIIWYSJExIp4V9FCKDEeygzkYrXMw==", + "dev": true, + "license": "MIT", + "dependencies": { + "tinyspy": "^2.2.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/utils": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/@vitest/utils/-/utils-1.6.1.tgz", + "integrity": "sha512-jOrrUvXM4Av9ZWiG1EajNto0u96kWAhJ1LmPmJhXXQx/32MecEKd10pOLYgS2BQx1TgkGhloPU1ArDW2vvaY6g==", + "dev": true, + "license": "MIT", + "dependencies": { + "diff-sequences": "^29.6.3", + "estree-walker": "^3.0.3", + "loupe": "^2.3.7", + "pretty-format": "^29.7.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, "node_modules/acorn": { - "version": "8.11.3", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.11.3.tgz", - "integrity": "sha512-Y9rRfJG5jcKOE0CLisYbojUjIrIEE7AGMzA/Sm4BslANhbS+cDMpgBdcPT91oJ7OuJ9hYJBx59RjbhxVnrF8Xg==", + "version": "8.17.0", + "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.17.0.tgz", + "integrity": "sha512-xRQbDb9BnwDafYNn6Vwl839DYVjqXYb1XVGtWAZ1kcDc6iwAL4hg3B1dZlRiuENFeO2H53gFG3in621AdERVAg==", "dev": true, - "peer": true, + "license": "MIT", "bin": { "acorn": "bin/acorn" }, @@ -2139,6 +3051,19 @@ "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, + "node_modules/acorn-walk": { + "version": "8.3.5", + "resolved": "https://registry.npmjs.org/acorn-walk/-/acorn-walk-8.3.5.tgz", + "integrity": "sha512-HEHNfbars9v4pgpW6SO1KSPkfoS0xVOM/9UzkJltjlsHZmJasxg8aXkuZa7SMf8vKGIBhpUsPluQSqhJFCqebw==", + "dev": true, + "license": "MIT", + "dependencies": { + "acorn": "^8.11.0" + }, + "engines": { + "node": ">=0.4.0" + } + }, "node_modules/ajv": { "version": "6.14.0", "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.14.0.tgz", @@ -2387,6 +3312,16 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/assertion-error": { + "version": "1.1.0", + "resolved": 
"https://registry.npmjs.org/assertion-error/-/assertion-error-1.1.0.tgz", + "integrity": "sha512-jgsaNduz+ndvGyFt3uSuWqvy4lCnIJiovtouQN5JZHOKCS2QuhEdbcQHFhVksz2N2U9hXJo8odG7ETyWlEeuDw==", + "dev": true, + "license": "MIT", + "engines": { + "node": "*" + } + }, "node_modules/ast-types-flow": { "version": "0.0.8", "resolved": "https://registry.npmjs.org/ast-types-flow/-/ast-types-flow-0.0.8.tgz", @@ -2695,7 +3630,6 @@ "url": "https://github.com/sponsors/ai" } ], - "peer": true, "dependencies": { "caniuse-lite": "^1.0.30001587", "electron-to-chromium": "^1.4.668", @@ -2743,6 +3677,16 @@ "node": ">=10.16.0" } }, + "node_modules/cac": { + "version": "6.7.14", + "resolved": "https://registry.npmjs.org/cac/-/cac-6.7.14.tgz", + "integrity": "sha512-b6Ilus+c3RrdDk+JhLKUAQfzzgLEPy6wcXqS7f/xe1EETvsDP6GORG7SFuOs6cID5YkqchW/LXZbX5bc8j7ZcQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, "node_modules/call-bind": { "version": "1.0.9", "resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.9.tgz", @@ -2846,6 +3790,25 @@ "url": "https://github.com/sponsors/wooorm" } }, + "node_modules/chai": { + "version": "4.5.0", + "resolved": "https://registry.npmjs.org/chai/-/chai-4.5.0.tgz", + "integrity": "sha512-RITGBfijLkBddZvnn8jdqoTypxvqbOLYQkGGxXzeFjVHvudaPw0HNFD9x928/eUwYWd2dPCugVqspGALTZZQKw==", + "dev": true, + "license": "MIT", + "dependencies": { + "assertion-error": "^1.1.0", + "check-error": "^1.0.3", + "deep-eql": "^4.1.3", + "get-func-name": "^2.0.2", + "loupe": "^2.3.6", + "pathval": "^1.1.1", + "type-detect": "^4.1.0" + }, + "engines": { + "node": ">=4" + } + }, "node_modules/chalk": { "version": "4.1.2", "resolved": "https://registry.npmjs.org/chalk/-/chalk-4.1.2.tgz", @@ -2898,6 +3861,19 @@ "url": "https://github.com/sponsors/wooorm" } }, + "node_modules/check-error": { + "version": "1.0.3", + "resolved": "https://registry.npmjs.org/check-error/-/check-error-1.0.3.tgz", + "integrity": 
"sha512-iKEoDYaRmd1mxM90a2OEfWhjsjPpYPuQ+lMYsoxB126+t8fw7ySEO48nmDg5COTjxDI65/Y2OWpeEHk3ZOe8zg==", + "dev": true, + "license": "MIT", + "dependencies": { + "get-func-name": "^2.0.2" + }, + "engines": { + "node": "*" + } + }, "node_modules/chokidar": { "version": "3.6.0", "resolved": "https://registry.npmjs.org/chokidar/-/chokidar-3.6.0.tgz", @@ -3010,10 +3986,18 @@ "integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg==", "dev": true }, + "node_modules/confbox": { + "version": "0.1.8", + "resolved": "https://registry.npmjs.org/confbox/-/confbox-0.1.8.tgz", + "integrity": "sha512-RMtmw0iFkeR4YV+fUOSucriAQNb9g8zFR52MWCtl+cCZOFRNL6zeB395vPzFhEjjn4fMxXudmELnl/KF/WrK6w==", + "dev": true, + "license": "MIT" + }, "node_modules/convert-source-map": { "version": "2.0.0", "resolved": "https://registry.npmjs.org/convert-source-map/-/convert-source-map-2.0.0.tgz", - "integrity": "sha512-Kvp459HrV2FEJ1CAsi1Ku+MY3kasH19TFykTz2xWmMeq6bk2NU3XXvfJ+Q61m0xktWwt+1HSYf3JZsTms3aRJg==" + "integrity": "sha512-Kvp459HrV2FEJ1CAsi1Ku+MY3kasH19TFykTz2xWmMeq6bk2NU3XXvfJ+Q61m0xktWwt+1HSYf3JZsTms3aRJg==", + "peer": true }, "node_modules/cross-spawn": { "version": "7.0.6", @@ -3238,7 +4222,6 @@ "version": "3.6.0", "resolved": "https://registry.npmjs.org/date-fns/-/date-fns-3.6.0.tgz", "integrity": "sha512-fRHTG8g/Gif+kSh50gaGEdToemgfj74aRX3swtiouboip5JDLAyDE9F11nHMIcvOaXeOC6D7SpNhi7uFyB7Uww==", - "peer": true, "funding": { "type": "github", "url": "https://github.com/sponsors/kossnocorp" @@ -3292,6 +4275,19 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/deep-eql": { + "version": "4.1.4", + "resolved": "https://registry.npmjs.org/deep-eql/-/deep-eql-4.1.4.tgz", + "integrity": "sha512-SUwdGfqdKOwxCPeVYjwSyRpJ7Z+fhpwIAtmCUdZIWZ/YP5R9WAsyuSgpLVDi9bjWoN2LXHNss/dk3urXtdQxGg==", + "dev": true, + "license": "MIT", + "dependencies": { + "type-detect": "^4.0.0" + }, + "engines": { + "node": ">=6" + } + }, 
"node_modules/deep-extend": { "version": "0.6.0", "resolved": "https://registry.npmjs.org/deep-extend/-/deep-extend-0.6.0.tgz", @@ -3387,6 +4383,16 @@ "resolved": "https://registry.npmjs.org/didyoumean/-/didyoumean-1.2.2.tgz", "integrity": "sha512-gxtyfqMg7GKyhQmb056K7M3xszy/myH8w+B4RT+QXBQsvAOdc3XymqDDPHx1BgPgsdAA5SIifona89YtRATDzw==" }, + "node_modules/diff-sequences": { + "version": "29.6.3", + "resolved": "https://registry.npmjs.org/diff-sequences/-/diff-sequences-29.6.3.tgz", + "integrity": "sha512-EjePK1srD3P08o2j4f0ExnylqRs5B9tJjcp9t1krH2qRi8CCdsYfwe9JgSLurFBWwq4uOlipzfk5fHNvwFKr8Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": "^14.15.0 || ^16.10.0 || >=18.0.0" + } + }, "node_modules/dlv": { "version": "1.1.3", "resolved": "https://registry.npmjs.org/dlv/-/dlv-1.1.3.tgz", @@ -3652,6 +4658,45 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/esbuild": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.21.5.tgz", + "integrity": "sha512-mg3OPMV4hXywwpoDxu3Qda5xCKQi+vCTZq8S9J/EpkhB2HzKXq4SNFZE3+NK93JYxc8VMSep+lOUSC/RVKaBqw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "esbuild": "bin/esbuild" + }, + "engines": { + "node": ">=12" + }, + "optionalDependencies": { + "@esbuild/aix-ppc64": "0.21.5", + "@esbuild/android-arm": "0.21.5", + "@esbuild/android-arm64": "0.21.5", + "@esbuild/android-x64": "0.21.5", + "@esbuild/darwin-arm64": "0.21.5", + "@esbuild/darwin-x64": "0.21.5", + "@esbuild/freebsd-arm64": "0.21.5", + "@esbuild/freebsd-x64": "0.21.5", + "@esbuild/linux-arm": "0.21.5", + "@esbuild/linux-arm64": "0.21.5", + "@esbuild/linux-ia32": "0.21.5", + "@esbuild/linux-loong64": "0.21.5", + "@esbuild/linux-mips64el": "0.21.5", + "@esbuild/linux-ppc64": "0.21.5", + "@esbuild/linux-riscv64": "0.21.5", + "@esbuild/linux-s390x": "0.21.5", + "@esbuild/linux-x64": "0.21.5", + "@esbuild/netbsd-x64": "0.21.5", + "@esbuild/openbsd-x64": "0.21.5", + "@esbuild/sunos-x64": 
"0.21.5", + "@esbuild/win32-arm64": "0.21.5", + "@esbuild/win32-ia32": "0.21.5", + "@esbuild/win32-x64": "0.21.5" + } + }, "node_modules/escalade": { "version": "3.1.2", "resolved": "https://registry.npmjs.org/escalade/-/escalade-3.1.2.tgz", @@ -3677,7 +4722,6 @@ "resolved": "https://registry.npmjs.org/eslint/-/eslint-8.57.0.tgz", "integrity": "sha512-dZ6+mexnaTIbSBZWgou51U6OmzIhYM2VcNdtiTtI7qPNZm35Akpr0f6vtw3w1Kmn5PYo+tZVfh13WrhpS6oLqQ==", "dev": true, - "peer": true, "dependencies": { "@eslint-community/eslint-utils": "^4.2.0", "@eslint-community/regexpp": "^4.6.1", @@ -3834,7 +4878,6 @@ "integrity": "sha512-whOE1HFo/qJDyX4SnXzP4N6zOWn79WhnCUY/iDR0mPfQZO8wcYE4JClzI2oZrhBnnMUCBCHZhO6VQyoBU95mZA==", "dev": true, "license": "MIT", - "peer": true, "dependencies": { "@rtsao/scc": "^1.1.0", "array-includes": "^3.1.9", @@ -4103,6 +5146,16 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/estree-walker": { + "version": "3.0.3", + "resolved": "https://registry.npmjs.org/estree-walker/-/estree-walker-3.0.3.tgz", + "integrity": "sha512-7RUKfXgSMMkzt6ZuXmqapOurLGPPfgj6l9uRZ7lRGolvk0y2yocc35LdcxKC5PQZdn2DMqioAQ2NoWcrTKmm6g==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/estree": "^1.0.0" + } + }, "node_modules/esutils": { "version": "2.0.3", "resolved": "https://registry.npmjs.org/esutils/-/esutils-2.0.3.tgz", @@ -4126,6 +5179,30 @@ "bare-events": "^2.7.0" } }, + "node_modules/execa": { + "version": "8.0.1", + "resolved": "https://registry.npmjs.org/execa/-/execa-8.0.1.tgz", + "integrity": "sha512-VyhnebXciFV2DESc+p6B+y0LjSm0krU4OgJN44qFAhBY0TJ+1V61tYD2+wHusZ6F9n5K+vl8k0sTy7PEfV4qpg==", + "dev": true, + "license": "MIT", + "dependencies": { + "cross-spawn": "^7.0.3", + "get-stream": "^8.0.1", + "human-signals": "^5.0.0", + "is-stream": "^3.0.0", + "merge-stream": "^2.0.0", + "npm-run-path": "^5.1.0", + "onetime": "^6.0.0", + "signal-exit": "^4.1.0", + "strip-final-newline": "^3.0.0" + }, + "engines": { + "node": ">=16.17" + }, + 
"funding": { + "url": "https://github.com/sindresorhus/execa?sponsor=1" + } + }, "node_modules/expand-template": { "version": "2.0.3", "resolved": "https://registry.npmjs.org/expand-template/-/expand-template-2.0.3.tgz", @@ -4420,10 +5497,21 @@ "version": "1.0.0-beta.2", "resolved": "https://registry.npmjs.org/gensync/-/gensync-1.0.0-beta.2.tgz", "integrity": "sha512-3hN7NaskYvMDLQY55gnW3NQ+mesEAepTqlg+VEbj7zzqEMBVNhzcGYYeqFo/TlYz6eQiFcp1HcsCZO+nGgS8zg==", + "peer": true, "engines": { "node": ">=6.9.0" } }, + "node_modules/get-func-name": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/get-func-name/-/get-func-name-2.0.2.tgz", + "integrity": "sha512-8vXOvuE167CtIc3OyItco7N/dpRtBbYOsPsXCz7X/PMnlGjYjSGuZJgM1Y7mmew7BKf9BqvLX2tnOVy1BBUsxQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": "*" + } + }, "node_modules/get-intrinsic": { "version": "1.3.0", "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz", @@ -4471,6 +5559,19 @@ "node": ">= 0.4" } }, + "node_modules/get-stream": { + "version": "8.0.1", + "resolved": "https://registry.npmjs.org/get-stream/-/get-stream-8.0.1.tgz", + "integrity": "sha512-VaUJspBffn/LMCJVoMvSAdmscJyS1auj5Zulnn5UoYcY531UWmdwhRWkcGKnGU93m5HSXP9LP2usOryrBtQowA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=16" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/get-symbol-description": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/get-symbol-description/-/get-symbol-description-1.1.0.tgz", @@ -4891,6 +5992,16 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/human-signals": { + "version": "5.0.0", + "resolved": "https://registry.npmjs.org/human-signals/-/human-signals-5.0.0.tgz", + "integrity": "sha512-AXcZb6vzzrFAUE61HnN4mpLqd/cSIwNQjtNWR0euPm6y0iqx3G4gOXaIDdtdDwZmhwe82LA6+zinmW4UBWVePQ==", + "dev": true, + "license": "Apache-2.0", + "engines": { + "node": ">=16.17.0" + } + }, 
"node_modules/ieee754": { "version": "1.2.1", "resolved": "https://registry.npmjs.org/ieee754/-/ieee754-1.2.1.tgz", @@ -5373,6 +6484,19 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/is-stream": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/is-stream/-/is-stream-3.0.0.tgz", + "integrity": "sha512-LnQR4bZ9IADDRSkvpqMGvt/tEJWclzklNgSw48V5EAaAeDd6qGvN8ei6k5p0tvxSR171VmGyHuTiAOfxAbr8kA==", + "dev": true, + "license": "MIT", + "engines": { + "node": "^12.20.0 || ^14.13.1 || >=16.0.0" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/is-string": { "version": "1.1.1", "resolved": "https://registry.npmjs.org/is-string/-/is-string-1.1.1.tgz", @@ -5582,6 +6706,7 @@ "version": "2.2.3", "resolved": "https://registry.npmjs.org/json5/-/json5-2.2.3.tgz", "integrity": "sha512-XmOWe7eyHYH14cLdVPoyg+GOH3rYX++KpzrylJwSW98t3Nk+U8XOl8FWKOgwtzdb8lXGf6zYwDUzeHMWfxasyg==", + "peer": true, "bin": { "json5": "lib/cli.js" }, @@ -5660,6 +6785,23 @@ "resolved": "https://registry.npmjs.org/lines-and-columns/-/lines-and-columns-1.2.4.tgz", "integrity": "sha512-7ylylesZQ/PV29jhEDl3Ufjo6ZX7gCqJr5F7PKrqc93v7fzSymt1BpwEU8nAUXs8qzzvqhbjhK5QZg6Mt/HkBg==" }, + "node_modules/local-pkg": { + "version": "0.5.1", + "resolved": "https://registry.npmjs.org/local-pkg/-/local-pkg-0.5.1.tgz", + "integrity": "sha512-9rrA30MRRP3gBD3HTGnC6cDFpaE1kVDWxWgqWJUN0RvDNAo+Nz/9GxB+nHOH0ifbVFy0hSA1V6vFDvnx54lTEQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "mlly": "^1.7.3", + "pkg-types": "^1.2.1" + }, + "engines": { + "node": ">=14" + }, + "funding": { + "url": "https://github.com/sponsors/antfu" + } + }, "node_modules/locate-path": { "version": "6.0.0", "resolved": "https://registry.npmjs.org/locate-path/-/locate-path-6.0.0.tgz", @@ -5725,6 +6867,16 @@ "loose-envify": "cli.js" } }, + "node_modules/loupe": { + "version": "2.3.7", + "resolved": "https://registry.npmjs.org/loupe/-/loupe-2.3.7.tgz", + "integrity": 
"sha512-zSMINGVYkdpYSOBmLi0D1Uo7JU9nVdQKrHxC8eYlV+9YKK9WePqAlL7lSlorG/U2Fw1w0hTBmaa/jrQ3UbPHtA==", + "dev": true, + "license": "MIT", + "dependencies": { + "get-func-name": "^2.0.1" + } + }, "node_modules/lru-cache": { "version": "10.2.2", "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-10.2.2.tgz", @@ -5733,6 +6885,16 @@ "node": "14 || >=16.14" } }, + "node_modules/magic-string": { + "version": "0.30.21", + "resolved": "https://registry.npmjs.org/magic-string/-/magic-string-0.30.21.tgz", + "integrity": "sha512-vd2F4YUyEXKGcLHoq+TEyCjxueSeHnFxyyjNp80yg0XV4vUhnDer/lvvlqM/arB5bXQN5K2/3oinyCRyx8T2CQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/sourcemap-codec": "^1.5.5" + } + }, "node_modules/markdown-table": { "version": "3.0.3", "resolved": "https://registry.npmjs.org/markdown-table/-/markdown-table-3.0.3.tgz", @@ -6019,6 +7181,13 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/merge-stream": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/merge-stream/-/merge-stream-2.0.0.tgz", + "integrity": "sha512-abv/qOcuPfk3URPfDzmZU1LKmuw8kT+0nIHvKrKgFrwifol/doWcdA4ZqsWQ8ENrFKkd67Mfpo/LovbIUsbt3w==", + "dev": true, + "license": "MIT" + }, "node_modules/merge2": { "version": "1.4.1", "resolved": "https://registry.npmjs.org/merge2/-/merge2-1.4.1.tgz", @@ -6575,6 +7744,19 @@ "node": ">=8.6" } }, + "node_modules/mimic-fn": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/mimic-fn/-/mimic-fn-4.0.0.tgz", + "integrity": "sha512-vqiC06CuhBTUdZH+RYl8sFrL096vA45Ok5ISO6sE/Mr1jRbGH4Csnhi8f3wKVl7x8mO4Au7Ir9D3Oyv1VYMFJw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/mimic-response": { "version": "3.1.0", "resolved": "https://registry.npmjs.org/mimic-response/-/mimic-response-3.1.0.tgz", @@ -6621,6 +7803,26 @@ "resolved": 
"https://registry.npmjs.org/mkdirp-classic/-/mkdirp-classic-0.5.3.tgz", "integrity": "sha512-gKLcREMhtuZRwRAfqP3RFW+TK4JqApVBtOIftVgjuABpAtpxhPGaDcfvbhNvD0B8iD1oUr/txX35NjcaY6Ns/A==" }, + "node_modules/mlly": { + "version": "1.8.2", + "resolved": "https://registry.npmjs.org/mlly/-/mlly-1.8.2.tgz", + "integrity": "sha512-d+ObxMQFmbt10sretNDytwt85VrbkhhUA/JBGm1MPaWJ65Cl4wOgLaB1NYvJSZ0Ef03MMEU/0xpPMXUIQ29UfA==", + "dev": true, + "license": "MIT", + "dependencies": { + "acorn": "^8.16.0", + "pathe": "^2.0.3", + "pkg-types": "^1.3.1", + "ufo": "^1.6.3" + } + }, + "node_modules/mlly/node_modules/pathe": { + "version": "2.0.3", + "resolved": "https://registry.npmjs.org/pathe/-/pathe-2.0.3.tgz", + "integrity": "sha512-WUjGcAqP1gQacoQe+OBJsFA7Ld4DyXuUIjZ5cc75cLHvJ7dtNsTugphxIADwspS+AraAUePCKrSVtPLFj/F88w==", + "dev": true, + "license": "MIT" + }, "node_modules/ms": { "version": "2.1.3", "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", @@ -6638,9 +7840,9 @@ } }, "node_modules/nanoid": { - "version": "3.3.11", - "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.11.tgz", - "integrity": "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w==", + "version": "3.3.13", + "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.13.tgz", + "integrity": "sha512-sPdqC6ByMVVGvF1ynvvMo0/o+oD1VX7DaHhijt1bFgjvBkHBib4t49GoNDhf2NDta4oeUNlaGbSt5K7qjZ955Q==", "funding": [ { "type": "github", @@ -6969,6 +8171,35 @@ "node": "^18.17.0 || >=20.5.0" } }, + "node_modules/npm-run-path": { + "version": "5.3.0", + "resolved": "https://registry.npmjs.org/npm-run-path/-/npm-run-path-5.3.0.tgz", + "integrity": "sha512-ppwTtiJZq0O/ai0z7yfudtBpWIoxM8yE6nHi1X47eFR2EWORqfbu6CnPlNsjeN683eT0qG6H/Pyf9fCcvjnnnQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "path-key": "^4.0.0" + }, + "engines": { + "node": "^12.20.0 || ^14.13.1 || >=16.0.0" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + 
"node_modules/npm-run-path/node_modules/path-key": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/path-key/-/path-key-4.0.0.tgz", + "integrity": "sha512-haREypq7xkM7ErfgIyA0z+Bj4AGKlMSdlQE2jvJo6huWD1EdkKYV+G/T4nq0YEF2vgTT8kqMFKo1uHn950r4SQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/npm/node_modules/@isaacs/cliui": { "version": "8.0.2", "inBundle": true, @@ -8541,7 +9772,6 @@ "version": "4.0.3", "inBundle": true, "license": "MIT", - "peer": true, "engines": { "node": ">=12" }, @@ -9269,6 +10499,22 @@ "wrappy": "1" } }, + "node_modules/onetime": { + "version": "6.0.0", + "resolved": "https://registry.npmjs.org/onetime/-/onetime-6.0.0.tgz", + "integrity": "sha512-1FlR+gjXK7X+AsAHso35MnyN5KqGwJRi/31ft6x0M194ht7S+rWAvd7PHss9xSKMzE0asv1pyIHaJYq+BbacAQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "mimic-fn": "^4.0.0" + }, + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/optionator": { "version": "0.9.4", "resolved": "https://registry.npmjs.org/optionator/-/optionator-0.9.4.tgz", @@ -9438,6 +10684,23 @@ "url": "https://github.com/sponsors/isaacs" } }, + "node_modules/pathe": { + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/pathe/-/pathe-1.1.2.tgz", + "integrity": "sha512-whLdWMYL2TwI08hn8/ZqAbrVemu0LNaNNJZX73O6qaIdCTfXutsLhMkjdENX0qhsQ9uIimo4/aQOmXkoon2nDQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/pathval": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/pathval/-/pathval-1.1.1.tgz", + "integrity": "sha512-Dp6zGqpTdETdR63lehJYPeIOqpiNBNtc7BpWSLrOje7UaIsE5aY92r/AunQA7rsXvet3lrJ3JnZX29UPTKXyKQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": "*" + } + }, "node_modules/picocolors": { "version": "1.1.1", "resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz", 
@@ -9472,6 +10735,25 @@ "node": ">= 6" } }, + "node_modules/pkg-types": { + "version": "1.3.1", + "resolved": "https://registry.npmjs.org/pkg-types/-/pkg-types-1.3.1.tgz", + "integrity": "sha512-/Jm5M4RvtBFVkKWRu2BLUTNP8/M2a+UwuAX+ae4770q1qVGtfjG+WTCupoZixokjmHiry8uI+dlY8KXYV5HVVQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "confbox": "^0.1.8", + "mlly": "^1.7.4", + "pathe": "^2.0.1" + } + }, + "node_modules/pkg-types/node_modules/pathe": { + "version": "2.0.3", + "resolved": "https://registry.npmjs.org/pathe/-/pathe-2.0.3.tgz", + "integrity": "sha512-WUjGcAqP1gQacoQe+OBJsFA7Ld4DyXuUIjZ5cc75cLHvJ7dtNsTugphxIADwspS+AraAUePCKrSVtPLFj/F88w==", + "dev": true, + "license": "MIT" + }, "node_modules/possible-typed-array-names": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/possible-typed-array-names/-/possible-typed-array-names-1.1.0.tgz", @@ -9483,9 +10765,9 @@ } }, "node_modules/postcss": { - "version": "8.4.38", - "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.38.tgz", - "integrity": "sha512-Wglpdk03BSfXkHoQa3b/oulrotAkwrlLDRSOb9D0bN86FdRyE9lppSp33aHNPgBa0JKCoB+drFLZkQoRRYae5A==", + "version": "8.5.15", + "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.15.tgz", + "integrity": "sha512-FfR8sjd4em2T6fb3I2MwAJU7HWVMr9zba+enmQeeWFfCbm+UOC/0X4DS8XtpUTMwWMGbjKYP7xjfNekzyGmB3A==", "funding": [ { "type": "opencollective", @@ -9500,11 +10782,11 @@ "url": "https://github.com/sponsors/ai" } ], - "peer": true, + "license": "MIT", "dependencies": { - "nanoid": "^3.3.7", - "picocolors": "^1.0.0", - "source-map-js": "^1.2.0" + "nanoid": "^3.3.12", + "picocolors": "^1.1.1", + "source-map-js": "^1.2.1" }, "engines": { "node": "^10 || ^12 || >=14" @@ -9713,6 +10995,41 @@ "url": "https://github.com/prettier/prettier?sponsor=1" } }, + "node_modules/pretty-format": { + "version": "29.7.0", + "resolved": "https://registry.npmjs.org/pretty-format/-/pretty-format-29.7.0.tgz", + "integrity": 
"sha512-Pdlw/oPxN+aXdmM9R00JVC9WVFoCLTKJvDVLgmJ+qAffBMxsV85l/Lu7sNx4zSzPyoL2euImuEwHhOXdEgNFZQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jest/schemas": "^29.6.3", + "ansi-styles": "^5.0.0", + "react-is": "^18.0.0" + }, + "engines": { + "node": "^14.15.0 || ^16.10.0 || >=18.0.0" + } + }, + "node_modules/pretty-format/node_modules/ansi-styles": { + "version": "5.2.0", + "resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-5.2.0.tgz", + "integrity": "sha512-Cxwpt2SfTzTtXcfOlzGEee8O+c+MmUgGrNiBcXnuWxuFJHe6a5Hz7qwhwe5OgaSYI0IJvkLqWX1ASG+cJOkEiA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/ansi-styles?sponsor=1" + } + }, + "node_modules/pretty-format/node_modules/react-is": { + "version": "18.3.1", + "resolved": "https://registry.npmjs.org/react-is/-/react-is-18.3.1.tgz", + "integrity": "sha512-/LLMVyas0ljjAtoYiPqYiL8VWXzUUdThrmU5+n20DZv+a+ClRoevUzw5JxU+Ieh5/c87ytoTBV9G1FiKfNJdmg==", + "dev": true, + "license": "MIT" + }, "node_modules/prismjs": { "version": "1.30.0", "resolved": "https://registry.npmjs.org/prismjs/-/prismjs-1.30.0.tgz", @@ -9809,7 +11126,6 @@ "version": "18.3.1", "resolved": "https://registry.npmjs.org/react/-/react-18.3.1.tgz", "integrity": "sha512-wS+hAgJShR0KhEvPJArfuPVN1+Hz1t0Y6n5jLrGQbkb4urgPE/0Rve+1kMB1v/oWgHgm4WIcV+i7F2pTVj+2iQ==", - "peer": true, "dependencies": { "loose-envify": "^1.1.0" }, @@ -9834,7 +11150,6 @@ "version": "18.3.1", "resolved": "https://registry.npmjs.org/react-dom/-/react-dom-18.3.1.tgz", "integrity": "sha512-5m4nQKp+rZRb09LNH59GM4BxTh9251/ylbKIbpe7TpGxfJ+9kv6BLkLBXIjjspbgbnIBNqlI23tRnTWT0snUIw==", - "peer": true, "dependencies": { "loose-envify": "^1.1.0", "scheduler": "^0.23.2" @@ -10327,6 +11642,51 @@ "url": "https://github.com/sponsors/isaacs" } }, + "node_modules/rollup": { + "version": "4.62.0", + "resolved": "https://registry.npmjs.org/rollup/-/rollup-4.62.0.tgz", + "integrity": 
"sha512-nc72Wgq62I7rtDV4izT5/aaS0zxy3kttkinf9586ApknY3jZO9NYsmtc24fUckA0X7Q2v+ML4a15pdUlV5V/jA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/estree": "1.0.9" + }, + "bin": { + "rollup": "dist/bin/rollup" + }, + "engines": { + "node": ">=18.0.0", + "npm": ">=8.0.0" + }, + "optionalDependencies": { + "@rollup/rollup-android-arm-eabi": "4.62.0", + "@rollup/rollup-android-arm64": "4.62.0", + "@rollup/rollup-darwin-arm64": "4.62.0", + "@rollup/rollup-darwin-x64": "4.62.0", + "@rollup/rollup-freebsd-arm64": "4.62.0", + "@rollup/rollup-freebsd-x64": "4.62.0", + "@rollup/rollup-linux-arm-gnueabihf": "4.62.0", + "@rollup/rollup-linux-arm-musleabihf": "4.62.0", + "@rollup/rollup-linux-arm64-gnu": "4.62.0", + "@rollup/rollup-linux-arm64-musl": "4.62.0", + "@rollup/rollup-linux-loong64-gnu": "4.62.0", + "@rollup/rollup-linux-loong64-musl": "4.62.0", + "@rollup/rollup-linux-ppc64-gnu": "4.62.0", + "@rollup/rollup-linux-ppc64-musl": "4.62.0", + "@rollup/rollup-linux-riscv64-gnu": "4.62.0", + "@rollup/rollup-linux-riscv64-musl": "4.62.0", + "@rollup/rollup-linux-s390x-gnu": "4.62.0", + "@rollup/rollup-linux-x64-gnu": "4.62.0", + "@rollup/rollup-linux-x64-musl": "4.62.0", + "@rollup/rollup-openbsd-x64": "4.62.0", + "@rollup/rollup-openharmony-arm64": "4.62.0", + "@rollup/rollup-win32-arm64-msvc": "4.62.0", + "@rollup/rollup-win32-ia32-msvc": "4.62.0", + "@rollup/rollup-win32-x64-gnu": "4.62.0", + "@rollup/rollup-win32-x64-msvc": "4.62.0", + "fsevents": "~2.3.2" + } + }, "node_modules/run-parallel": { "version": "1.2.0", "resolved": "https://registry.npmjs.org/run-parallel/-/run-parallel-1.2.0.tgz", @@ -10613,6 +11973,13 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/siginfo": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/siginfo/-/siginfo-2.0.0.tgz", + "integrity": "sha512-ybx0WO1/8bSBLEWXZvEd7gMW3Sn3JFlW3TvX1nREbDLRNQNaeNN8WK0meBwPdAaOI7TtRRRJn/Es1zhrrCHu7g==", + "dev": true, + "license": "ISC" + }, 
"node_modules/signal-exit": { "version": "4.1.0", "resolved": "https://registry.npmjs.org/signal-exit/-/signal-exit-4.1.0.tgz", @@ -10676,9 +12043,10 @@ } }, "node_modules/source-map-js": { - "version": "1.2.0", - "resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.0.tgz", - "integrity": "sha512-itJW8lvSA0TXEphiRoawsCksnlf8SyvmFzIhltqAHluXd88pkCd+cXJVHTDwdCr0IzwptSm035IHQktUu1QUMg==", + "version": "1.2.1", + "resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz", + "integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==", + "license": "BSD-3-Clause", "engines": { "node": ">=0.10.0" } @@ -10692,6 +12060,20 @@ "url": "https://github.com/sponsors/wooorm" } }, + "node_modules/stackback": { + "version": "0.0.2", + "resolved": "https://registry.npmjs.org/stackback/-/stackback-0.0.2.tgz", + "integrity": "sha512-1XMJE5fQo1jGH6Y/7ebnwPOBEkIEnT4QF32d5R1+VXdXveM0IBMJt8zfaxX1P3QhVwrYe+576+jkANtSS2mBbw==", + "dev": true, + "license": "MIT" + }, + "node_modules/std-env": { + "version": "3.10.0", + "resolved": "https://registry.npmjs.org/std-env/-/std-env-3.10.0.tgz", + "integrity": "sha512-5GS12FdOZNliM5mAOxFRg7Ir0pWz8MdpYm6AY6VPkGpbA7ZzmbzNcBJQ0GPvvyWgcY7QAhCgf9Uy89I03faLkg==", + "dev": true, + "license": "MIT" + }, "node_modules/stop-iteration-iterator": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/stop-iteration-iterator/-/stop-iteration-iterator-1.1.0.tgz", @@ -10956,6 +12338,19 @@ "node": ">=4" } }, + "node_modules/strip-final-newline": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/strip-final-newline/-/strip-final-newline-3.0.0.tgz", + "integrity": "sha512-dOESqjYr96iWYylGObzd39EuNTa5VJxyvVAEm5Jnh7KGo75V43Hk1odPQkNDyXNmUR6k+gEiDVXnjB8HJ3crXw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/strip-json-comments": { 
"version": "3.1.1", "resolved": "https://registry.npmjs.org/strip-json-comments/-/strip-json-comments-3.1.1.tgz", @@ -10968,6 +12363,26 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/strip-literal": { + "version": "2.1.1", + "resolved": "https://registry.npmjs.org/strip-literal/-/strip-literal-2.1.1.tgz", + "integrity": "sha512-631UJ6O00eNGfMiWG78ck80dfBab8X6IVFB51jZK5Icd7XAs60Z5y7QdSd/wGIklnWvRbUNloVzhOKKmutxQ6Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "js-tokens": "^9.0.1" + }, + "funding": { + "url": "https://github.com/sponsors/antfu" + } + }, + "node_modules/strip-literal/node_modules/js-tokens": { + "version": "9.0.1", + "resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-9.0.1.tgz", + "integrity": "sha512-mxa9E9ITFOt0ban3j6L5MpjwegGz6lBQmM1IJkWeBZGcMxto50+eWdjC/52xDbS2vy0k7vIMK0Fe2wfL9OQSpQ==", + "dev": true, + "license": "MIT" + }, "node_modules/style-to-object": { "version": "1.0.6", "resolved": "https://registry.npmjs.org/style-to-object/-/style-to-object-1.0.6.tgz", @@ -10980,7 +12395,6 @@ "version": "5.3.11", "resolved": "https://registry.npmjs.org/styled-components/-/styled-components-5.3.11.tgz", "integrity": "sha512-uuzIIfnVkagcVHv9nE0VPlHPSCmXIUGKfJ42LNjxCCTDTL5sgnJ8Z7GZBq0EnLYGln77tPpEpExt2+qa+cZqSw==", - "peer": true, "dependencies": { "@babel/helper-module-imports": "^7.0.0", "@babel/traverse": "^7.4.5", @@ -11127,7 +12541,6 @@ "version": "3.4.3", "resolved": "https://registry.npmjs.org/tailwindcss/-/tailwindcss-3.4.3.tgz", "integrity": "sha512-U7sxQk/n397Bmx4JHbJx/iSOOv5G+II3f1kpLpY2QeUv5DcPdcTsYLlusZfq1NthHS1c1cZoyFmmkex1rzke0A==", - "peer": true, "dependencies": { "@alloc/quick-lru": "^5.2.0", "arg": "^5.0.2", @@ -11264,6 +12677,13 @@ "resolved": "https://registry.npmjs.org/tiny-warning/-/tiny-warning-1.0.3.tgz", "integrity": "sha512-lBN9zLN/oAf68o3zNXYrdCt1kP8WsiGW8Oo2ka41b2IM5JL/S1CTyX1rW0mb/zSuJun0ZUrDxx4sqvYS2FWzPA==" }, + "node_modules/tinybench": { + "version": "2.9.0", + 
"resolved": "https://registry.npmjs.org/tinybench/-/tinybench-2.9.0.tgz", + "integrity": "sha512-0+DUvqWMValLmha6lr4kD8iAMK1HzV0/aKnCtWb9v9641TnP/MFb7Pc2bxoxQjTXAErryXVgUOfv2YqNllqGeg==", + "dev": true, + "license": "MIT" + }, "node_modules/tinyglobby": { "version": "0.2.16", "resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.16.tgz", @@ -11305,7 +12725,6 @@ "integrity": "sha512-QP88BAKvMam/3NxH6vj2o21R6MjxZUAd6nlwAS/pnGvN9IVLocLHxGYIzFhg6fUQ+5th6P4dv4eW9jX3DSIj7A==", "dev": true, "license": "MIT", - "peer": true, "engines": { "node": ">=12" }, @@ -11313,6 +12732,26 @@ "url": "https://github.com/sponsors/jonschlinkert" } }, + "node_modules/tinypool": { + "version": "0.8.4", + "resolved": "https://registry.npmjs.org/tinypool/-/tinypool-0.8.4.tgz", + "integrity": "sha512-i11VH5gS6IFeLY3gMBQ00/MmLncVP7JLXOw1vlgkytLmJK7QnEr7NXf0LBdxfmNPAeyetukOk0bOYrJrFGjYJQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.0.0" + } + }, + "node_modules/tinyspy": { + "version": "2.2.1", + "resolved": "https://registry.npmjs.org/tinyspy/-/tinyspy-2.2.1.tgz", + "integrity": "sha512-KYad6Vy5VDWV4GH3fjpseMQ/XU2BhIYP7Vzd0LG44qRWm/Yt2WCOTicFdvmgo6gWaqooMQCawTtILVQJupKu7A==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.0.0" + } + }, "node_modules/to-regex-range": { "version": "5.0.1", "resolved": "https://registry.npmjs.org/to-regex-range/-/to-regex-range-5.0.1.tgz", @@ -11419,6 +12858,16 @@ "node": ">= 0.8.0" } }, + "node_modules/type-detect": { + "version": "4.1.0", + "resolved": "https://registry.npmjs.org/type-detect/-/type-detect-4.1.0.tgz", + "integrity": "sha512-Acylog8/luQ8L7il+geoSxhEkazvkslg7PSNKOX59mbB9cOveP5aq9h74Y7YU8yDpJwetzQQrfIwtf4Wp4LKcw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=4" + } + }, "node_modules/type-fest": { "version": "0.20.2", "resolved": "https://registry.npmjs.org/type-fest/-/type-fest-0.20.2.tgz", @@ -11513,7 +12962,6 @@ "version": "5.0.3", "resolved": 
"https://registry.npmjs.org/typescript/-/typescript-5.0.3.tgz", "integrity": "sha512-xv8mOEDnigb/tN9PSMTwSEqAnUvkoXMQlicOb0IUVDBSQCgBSaAAROUZYy2IcUy5qU6XajK5jjjO7TMWqBTKZA==", - "peer": true, "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" @@ -11522,6 +12970,13 @@ "node": ">=12.20" } }, + "node_modules/ufo": { + "version": "1.6.4", + "resolved": "https://registry.npmjs.org/ufo/-/ufo-1.6.4.tgz", + "integrity": "sha512-JFNbkD1Svwe0KvGi8GOeLcP4kAWQ609twvCdcHxq1oSL8svv39ZuSvajcD8B+5D0eL4+s1Is2D/O6KN3qcTeRA==", + "dev": true, + "license": "MIT" + }, "node_modules/unbox-primitive": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/unbox-primitive/-/unbox-primitive-1.1.0.tgz", @@ -11811,6 +13266,155 @@ "d3-timer": "^3.0.1" } }, + "node_modules/vite": { + "version": "5.4.21", + "resolved": "https://registry.npmjs.org/vite/-/vite-5.4.21.tgz", + "integrity": "sha512-o5a9xKjbtuhY6Bi5S3+HvbRERmouabWbyUcpXXUA1u+GNUKoROi9byOJ8M0nHbHYHkYICiMlqxkg1KkYmm25Sw==", + "dev": true, + "license": "MIT", + "dependencies": { + "esbuild": "^0.21.3", + "postcss": "^8.4.43", + "rollup": "^4.20.0" + }, + "bin": { + "vite": "bin/vite.js" + }, + "engines": { + "node": "^18.0.0 || >=20.0.0" + }, + "funding": { + "url": "https://github.com/vitejs/vite?sponsor=1" + }, + "optionalDependencies": { + "fsevents": "~2.3.3" + }, + "peerDependencies": { + "@types/node": "^18.0.0 || >=20.0.0", + "less": "*", + "lightningcss": "^1.21.0", + "sass": "*", + "sass-embedded": "*", + "stylus": "*", + "sugarss": "*", + "terser": "^5.4.0" + }, + "peerDependenciesMeta": { + "@types/node": { + "optional": true + }, + "less": { + "optional": true + }, + "lightningcss": { + "optional": true + }, + "sass": { + "optional": true + }, + "sass-embedded": { + "optional": true + }, + "stylus": { + "optional": true + }, + "sugarss": { + "optional": true + }, + "terser": { + "optional": true + } + } + }, + "node_modules/vite-node": { + "version": "1.6.1", + "resolved": 
"https://registry.npmjs.org/vite-node/-/vite-node-1.6.1.tgz", + "integrity": "sha512-YAXkfvGtuTzwWbDSACdJSg4A4DZiAqckWe90Zapc/sEX3XvHcw1NdurM/6od8J207tSDqNbSsgdCacBgvJKFuA==", + "dev": true, + "license": "MIT", + "dependencies": { + "cac": "^6.7.14", + "debug": "^4.3.4", + "pathe": "^1.1.1", + "picocolors": "^1.0.0", + "vite": "^5.0.0" + }, + "bin": { + "vite-node": "vite-node.mjs" + }, + "engines": { + "node": "^18.0.0 || >=20.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/vitest": { + "version": "1.6.1", + "resolved": "https://registry.npmjs.org/vitest/-/vitest-1.6.1.tgz", + "integrity": "sha512-Ljb1cnSJSivGN0LqXd/zmDbWEM0RNNg2t1QW/XUhYl/qPqyu7CsqeWtqQXHVaJsecLPuDoak2oJcZN2QoRIOag==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/expect": "1.6.1", + "@vitest/runner": "1.6.1", + "@vitest/snapshot": "1.6.1", + "@vitest/spy": "1.6.1", + "@vitest/utils": "1.6.1", + "acorn-walk": "^8.3.2", + "chai": "^4.3.10", + "debug": "^4.3.4", + "execa": "^8.0.1", + "local-pkg": "^0.5.0", + "magic-string": "^0.30.5", + "pathe": "^1.1.1", + "picocolors": "^1.0.0", + "std-env": "^3.5.0", + "strip-literal": "^2.0.0", + "tinybench": "^2.5.1", + "tinypool": "^0.8.3", + "vite": "^5.0.0", + "vite-node": "1.6.1", + "why-is-node-running": "^2.2.2" + }, + "bin": { + "vitest": "vitest.mjs" + }, + "engines": { + "node": "^18.0.0 || >=20.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + }, + "peerDependencies": { + "@edge-runtime/vm": "*", + "@types/node": "^18.0.0 || >=20.0.0", + "@vitest/browser": "1.6.1", + "@vitest/ui": "1.6.1", + "happy-dom": "*", + "jsdom": "*" + }, + "peerDependenciesMeta": { + "@edge-runtime/vm": { + "optional": true + }, + "@types/node": { + "optional": true + }, + "@vitest/browser": { + "optional": true + }, + "@vitest/ui": { + "optional": true + }, + "happy-dom": { + "optional": true + }, + "jsdom": { + "optional": true + } + } + }, "node_modules/web-namespaces": { "version": 
"2.0.1", "resolved": "https://registry.npmjs.org/web-namespaces/-/web-namespaces-2.0.1.tgz", @@ -11923,6 +13527,23 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/why-is-node-running": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/why-is-node-running/-/why-is-node-running-2.3.0.tgz", + "integrity": "sha512-hUrmaWBdVDcxvYqnyh09zunKzROWjbZTiNy8dBEjkS7ehEDQibXJ7XvlmtbwuTclUiIyN+CyXQD4Vmko8fNm8w==", + "dev": true, + "license": "MIT", + "dependencies": { + "siginfo": "^2.0.0", + "stackback": "0.0.2" + }, + "bin": { + "why-is-node-running": "cli.js" + }, + "engines": { + "node": ">=8" + } + }, "node_modules/word-wrap": { "version": "1.2.5", "resolved": "https://registry.npmjs.org/word-wrap/-/word-wrap-1.2.5.tgz", @@ -12034,7 +13655,8 @@ "node_modules/yallist": { "version": "3.1.1", "resolved": "https://registry.npmjs.org/yallist/-/yallist-3.1.1.tgz", - "integrity": "sha512-a4UGQaWPH59mOXUYnAG2ewncQS4i4F43Tv3JoAM+s2VDAmS9NsK8GpDMLrCHPksFT7h3K6TOoUNn2pb7RoXx4g==" + "integrity": "sha512-a4UGQaWPH59mOXUYnAG2ewncQS4i4F43Tv3JoAM+s2VDAmS9NsK8GpDMLrCHPksFT7h3K6TOoUNn2pb7RoXx4g==", + "peer": true }, "node_modules/yaml": { "version": "2.8.3", diff --git a/web/package.json b/web/package.json index a9bb458b8b7..f47495d3d87 100644 --- a/web/package.json +++ b/web/package.json @@ -6,7 +6,8 @@ "dev": "next dev", "build": "next build", "start": "next start", - "lint": "next lint" + "lint": "next lint", + "test": "vitest run" }, "dependencies": { "@dnd-kit/core": "^6.1.0", @@ -53,6 +54,7 @@ "@tailwindcss/typography": "^0.5.10", "eslint": "^8.48.0", "eslint-config-next": "^14.2.35", - "prettier": "2.8.8" + "prettier": "2.8.8", + "vitest": "^1.6.1" } } diff --git a/web/src/app/admin/assistants/AssistantEditor.tsx b/web/src/app/admin/assistants/AssistantEditor.tsx index c58cdcdadf9..3e1e2020336 100644 --- a/web/src/app/admin/assistants/AssistantEditor.tsx +++ b/web/src/app/admin/assistants/AssistantEditor.tsx @@ -17,7 +17,7 @@ import { useRouter } 
from "next/navigation"; import { usePopup } from "@/components/admin/connectors/Popup"; import { Persona, StarterMessage } from "./interfaces"; import Link from "next/link"; -import { useEffect, useState } from "react"; +import { useContext, useEffect, useState } from "react"; import { BooleanFormField, SelectorFormField, @@ -82,6 +82,9 @@ export function AssistantEditor({ const { popup, setPopup } = usePopup(); const isPaidEnterpriseFeaturesEnabled = usePaidEnterpriseFeaturesEnabled(); + // Cluster-level enablement — hide the rerank / relevance assistant settings + // entirely when the feature is disabled cluster-wide. + const globalSettings = useContext(SettingsContext)?.settings; // EE only const { data: userGroups, isLoading: userGroupsIsLoading } = useUserGroups(); @@ -168,6 +171,7 @@ export function AssistantEditor({ const initialValues = { name: existingPersona?.name ?? "", + display_name: existingPersona?.display_name ?? "", description: existingPersona?.description ?? "", system_prompt: existingPrompt?.system_prompt ?? "", task_prompt: existingPrompt?.task_prompt ?? "", @@ -178,6 +182,7 @@ export function AssistantEditor({ num_chunks: existingPersona?.num_chunks ?? null, include_citations: existingPersona?.prompts[0]?.include_citations ?? true, llm_relevance_filter: existingPersona?.llm_relevance_filter ?? false, + rerank_enabled: existingPersona?.rerank_enabled ?? false, llm_model_provider_override: existingPersona?.llm_model_provider_override ?? 
null, llm_model_version_override: @@ -213,6 +218,7 @@ export function AssistantEditor({ num_chunks: Yup.number().nullable(), include_citations: Yup.boolean().required(), llm_relevance_filter: Yup.boolean().required(), + rerank_enabled: Yup.boolean().required(), llm_model_version_override: Yup.string().nullable(), llm_model_provider_override: Yup.string().nullable(), starter_messages: Yup.array().of( @@ -401,6 +407,12 @@ export function AssistantEditor({ subtext="Users will be able to select this Assistant based on this name." /> + + 0 ? ( ( -
-
- {documentSets.map((documentSet) => { - const ind = - values.document_set_ids.indexOf( - documentSet.id - ); - let isSelected = ind !== -1; - return ( - { - if (isSelected) { - arrayHelpers.remove(ind); - } else { - arrayHelpers.push( - documentSet.id - ); - } - }} - /> - ); - })} + render={(arrayHelpers: ArrayHelpers) => { + const selectedDocumentSets = + documentSets.filter((ds) => + values.document_set_ids.includes(ds.id) + ); + const availableDocumentSets = + documentSets.filter( + (ds) => + !values.document_set_ids.includes( + ds.id + ) + ); + return ( +
+
+ Selected ( + {selectedDocumentSets.length}) +
+ {selectedDocumentSets.length > 0 ? ( +
+ {selectedDocumentSets.map( + (documentSet) => ( + { + const ind = + values.document_set_ids.indexOf( + documentSet.id + ); + if (ind !== -1) { + arrayHelpers.remove(ind); + } + }} + /> + ) + )} +
+ ) : ( + + None selected — pick from the + available sets below. + + )} + +
+ Available ( + {availableDocumentSets.length}) +
+ {availableDocumentSets.length > 0 ? ( +
+ {availableDocumentSets.map( + (documentSet) => ( + + arrayHelpers.push( + documentSet.id + ) + } + /> + ) + )} +
+ ) : ( + + All document sets are selected. + + )}
-
- )} + ); + }} /> ) : ( @@ -572,13 +630,25 @@ export function AssistantEditor({ - + {globalSettings?.llm_relevance_filter_enabled && ( + + )} + + {globalSettings?.rerank_enabled && ( + + )}
); -} \ No newline at end of file +} diff --git a/web/src/app/admin/assistants/PersonaTable.tsx b/web/src/app/admin/assistants/PersonaTable.tsx index 73fcbeb2b6a..e18c1d9c86e 100644 --- a/web/src/app/admin/assistants/PersonaTable.tsx +++ b/web/src/app/admin/assistants/PersonaTable.tsx @@ -11,6 +11,7 @@ import { DraggableTable } from "@/components/table/DraggableTable"; import { deletePersona, personaComparator } from "./lib"; import { FiEdit2 } from "react-icons/fi"; import { TrashIcon } from "@/components/icons/icons"; +import { Tooltip } from "@/components/tooltip/Tooltip"; function PersonaTypeDisplay({ persona }: { persona: Persona }) { if (persona.default_persona) { @@ -98,9 +99,21 @@ export function PersonasTable({ personas }: { personas: Persona[] }) { } /> )} -

- {persona.name} -

+ +

+ {persona.name} +

+
,

+ + + + + + +); + +const credentialValidation = Yup.object().shape({ + outsystems_cookie: Yup.string().required("Please paste the session cookie"), + outsystems_csrf: Yup.string().required("Please paste the x-csrftoken value"), + outsystems_api_version: Yup.string().required("Please paste the apiVersion"), + outsystems_file_api_version: Yup.string().optional(), + outsystems_base_url: Yup.string().optional(), +}); + +const MainSection = () => { + const { mutate } = useSWRConfig(); + const [isEditingCredential, setIsEditingCredential] = useState(false); + + const { + data: connectorIndexingStatuses, + isLoading: isConnectorIndexingStatusesLoading, + error: isConnectorIndexingStatusesError, + } = useSWR[]>( + "/api/manage/admin/connector/indexing-status", + fetcher + ); + + const { + data: credentialsData, + isLoading: isCredentialsLoading, + error: isCredentialsError, + refreshCredentials, + } = usePublicCredentials(); + + if ( + (!connectorIndexingStatuses && isConnectorIndexingStatusesLoading) || + (!credentialsData && isCredentialsLoading) + ) { + return ; + } + + if (isConnectorIndexingStatusesError || !connectorIndexingStatuses) { + return

Failed to load connectors
; + } + + if (isCredentialsError || !credentialsData) { + return
Failed to load credentials
; + } + + const outsystemsConnectorIndexingStatuses: ConnectorIndexingStatus< + OutSystemsConfig, + OutSystemsCredentialJson + >[] = connectorIndexingStatuses.filter( + (status) => status.connector.source === "outsystems" + ); + + const outsystemsCredential: Credential | undefined = + credentialsData.find((credential) => credential.credential_json?.outsystems_csrf); + + return ( + <> + + The OutSystems connector indexes pages from an OutSystems app + (currently the UiPath Intranet, inside.uipath.com). It is an interim, one-time connector: it + authenticates with a short-lived browser session, so create it with no + refresh schedule and run a single index while the session is fresh. It + will be replaced by a service-account version later. + + + + Step 1: Provide an OutSystems session + + + From a browser logged in to inside.uipath.com, open DevTools → Network, + click any page, find a POST to DataActionGetPage, and copy + its cookie and x-csrftoken request headers and + the request body's versionInfo.apiVersion. + + + {outsystemsCredential ? ( + <> +
+ Existing session credential + + +
+ {isEditingCredential && ( + + + Re-paste a fresh session (cookie / csrf / apiVersion). Needed + whenever the previous session has expired. + + + existingCredentialId={outsystemsCredential.id} + formBody={credentialFormBody} + validationSchema={credentialValidation} + initialValues={{ + outsystems_cookie: "", + outsystems_csrf: "", + outsystems_api_version: "", + outsystems_file_api_version: "", + outsystems_base_url: + outsystemsCredential.credential_json.outsystems_base_url || "", + }} + onSubmit={(isSuccess) => { + if (isSuccess) { + setIsEditingCredential(false); + refreshCredentials(); + } + }} + extraActions={ + + } + /> + + )} + + ) : ( + + + formBody={credentialFormBody} + validationSchema={credentialValidation} + initialValues={{ + outsystems_cookie: "", + outsystems_csrf: "", + outsystems_api_version: "", + outsystems_file_api_version: "", + outsystems_base_url: "", + }} + onSubmit={(isSuccess) => { + if (isSuccess) { + refreshCredentials(); + } + }} + /> + + )} + + + Step 2: Index OutSystems pages + + + {outsystemsConnectorIndexingStatuses.length > 0 && ( + <> + + The connector below was created. Trigger an index run from its row + (it has no refresh schedule, so it only runs when you ask it to). + +
+ + connectorIndexingStatuses={outsystemsConnectorIndexingStatuses} + liveCredential={outsystemsCredential} + getCredential={(credential) => + credential.credential_json.outsystems_csrf ? "session set" : "" + } + specialColumns={[ + { + header: "Page range", + key: "page_range", + getValue: (ccPairStatus) => { + const cfg = + ccPairStatus.connector.connector_specific_config; + return `${cfg.page_id_start ?? 1} – ${cfg.page_id_end ?? 600}`; + }, + }, + ]} + onUpdate={() => + mutate("/api/manage/admin/connector/indexing-status") + } + onCredentialLink={async (connectorId) => { + if (outsystemsCredential) { + await linkCredential(connectorId, outsystemsCredential.id); + mutate("/api/manage/admin/connector/indexing-status"); + } + }} + /> +
+ + {/* Edit the page range to RESUME after a session expiry, or to run in + cookie-sized chunks. page_id_start is the cursor: set it to the + PageId from the failure message/logs and use the cc-pair page's + "Run Update" (NOT "Run Complete Re-Indexing") so already-indexed + pages aren't re-embedded. Lowering page_id_start to the resume + point means earlier pages aren't re-fetched at all. */} + {outsystemsConnectorIndexingStatuses.map((status) => ( + +

Edit page range (resume / chunk)

+ + To resume after a session expiry, set First PageId to the + PageId in the failure message, save, then use{" "} + "Run Update" on the connector's page. + + + nameBuilder={() => "OutSystemsConnector"} + existingConnector={status.connector} + formBody={ + <> + + + + } + validationSchema={Yup.object().shape({ + page_id_start: Yup.number() + .min(1) + .required("Please enter a start PageId"), + page_id_end: Yup.number() + .min(1) + .required("Please enter an end PageId"), + })} + onSubmit={(isSuccess) => { + if (isSuccess) { + mutate("/api/manage/admin/connector/indexing-status"); + } + }} + /> +
+ ))} + + + )} + + {outsystemsCredential ? ( + +

Create the OutSystems connector

+ + nameBuilder={() => "OutSystemsConnector"} + source="outsystems" + inputType="load_state" + formBody={ + <> + + + + } + validationSchema={Yup.object().shape({ + page_id_start: Yup.number() + .min(1) + .required("Please enter a start PageId"), + page_id_end: Yup.number() + .min(1) + .required("Please enter an end PageId"), + })} + initialValues={{ page_id_start: 1, page_id_end: 600 }} + credentialId={outsystemsCredential.id} + // null => one-time, refresh_freq=None (never auto-re-indexed; the + // session credential would be expired by then anyway). + refreshFreq={null} + /> +
+ ) : ( + + Please provide a session credential in Step 1 first. + + )} + + ); +}; + +export default function Page() { + return ( +
+
+ +
+ + } title="OutSystems" /> + + +
+ ); +} diff --git a/web/src/app/admin/connectors/web/page.tsx b/web/src/app/admin/connectors/web/page.tsx index 70a16ba184a..a0d72932ea5 100644 --- a/web/src/app/admin/connectors/web/page.tsx +++ b/web/src/app/admin/connectors/web/page.tsx @@ -12,6 +12,7 @@ import { import { errorHandlingFetcher } from "@/lib/fetcher"; import { ErrorCallout } from "@/components/ErrorCallout"; import { + BooleanFormField, SelectorFormField, TextFormField, } from "@/components/admin/connectors/Field"; @@ -103,6 +104,18 @@ export default function Web() { ]} /> + + } validationSchema={Yup.object().shape({ @@ -112,10 +125,14 @@ export default function Web() { web_connector_type: Yup.string() .oneOf(["recursive", "single", "sitemap"]) .optional(), + uipath_latest_versions: Yup.boolean().optional(), + max_versions: Yup.number().min(1).optional(), })} initialValues={{ base_url: "", web_connector_type: undefined, + uipath_latest_versions: false, + max_versions: 2, }} refreshFreq={60 * 60 * 24} // 1 day pruneFreq={0} // Don't prune diff --git a/web/src/app/admin/documents/sets/DocumentSetCreationForm.tsx b/web/src/app/admin/documents/sets/DocumentSetCreationForm.tsx index 5ccfb4c488d..a2c83a3619e 100644 --- a/web/src/app/admin/documents/sets/DocumentSetCreationForm.tsx +++ b/web/src/app/admin/documents/sets/DocumentSetCreationForm.tsx @@ -1,5 +1,6 @@ "use client"; +import { useState } from "react"; import { ArrayHelpers, FieldArray, Form, Formik } from "formik"; import * as Yup from "yup"; import { PopupSpec } from "@/components/admin/connectors/Popup"; @@ -12,10 +13,16 @@ import { } from "@/lib/types"; import { BooleanFormField, + Label, + SubLabel, TextFormField, } from "@/components/admin/connectors/Field"; import { ConnectorTitle } from "@/components/admin/connectors/ConnectorTitle"; -import { SearchMultiSelectDropdown } from "@/components/Dropdown"; +import { + SearchMultiSelectDropdown, + DefaultDropdown, +} from "@/components/Dropdown"; +import { getSourceMetadata } from 
"@/lib/sources"; import { Button, Divider, Text } from "@tremor/react"; import { FiPlus, FiUsers, FiX } from "react-icons/fi"; import { usePaidEnterpriseFeaturesEnabled } from "@/components/settings/usePaidEnterpriseFeaturesEnabled"; @@ -62,6 +69,26 @@ export const DocumentSetCreationForm = ({ const isUpdate = existingDocumentSet !== undefined; + // Optional connector-type filter for the picker below. Defaults to "all" so + // every connector shows; narrows the picker to a single source when chosen. + const [sourceFilter, setSourceFilter] = useState("all"); + const sourceCounts = ccPairs.reduce>((acc, ccPair) => { + const source = ccPair.connector.source as string; + acc[source] = (acc[source] ?? 0) + 1; + return acc; + }, {}); + const sourceOptions = [ + { name: `All connectors (${ccPairs.length})`, value: "all" }, + ...Array.from(new Set(ccPairs.map((ccPair) => ccPair.connector.source))) + .map((source) => ({ + name: `${getSourceMetadata(source).displayName} (${ + sourceCounts[source as string] + })`, + value: source as string, + })) + .sort((a, b) => a.name.localeCompare(b.name)), + ]; + return (
- {({ isSubmitting, values }) => ( + {({ isSubmitting, values, setFieldValue }) => (
-

- Pick your connectors: -

-

+ + All documents indexed by the selected connectors will be a part of - this document set. Search by connector name and click to add; - click a selected connector to remove it. -

+ this document set. Filter by connector type, then search by name + and click to add; click a selected connector to remove it. + + { @@ -161,19 +187,35 @@ export const DocumentSetCreationForm = ({ ); const availableOptions = ccPairs .filter( - (ccPair) => !values.cc_pair_ids.includes(ccPair.cc_pair_id) + (ccPair) => + !values.cc_pair_ids.includes(ccPair.cc_pair_id) && + (sourceFilter === "all" || + ccPair.connector.source === sourceFilter) ) - .map((ccPair) => ({ - name: ccPair.name?.toString() || "", - value: ccPair.cc_pair_id?.toString() ?? "", - metadata: { - ccPairId: ccPair.cc_pair_id, - connector: ccPair.connector, - configSummary: summarizeConnectorConfig( - ccPair.connector.connector_specific_config - ), - }, - })); + .map((ccPair) => { + const configSummary = summarizeConnectorConfig( + ccPair.connector.connector_specific_config + ); + return { + name: ccPair.name?.toString() || "", + value: ccPair.cc_pair_id?.toString() ?? "", + // Make the connector's source + config (which for web + // connectors holds the URL) searchable, not just the + // display name — so typing a URL fragment finds it. + searchableText: [ + ccPair.name, + ccPair.connector?.source, + configSummary, + ] + .filter(Boolean) + .join(" "), + metadata: { + ccPairId: ccPair.cc_pair_id, + connector: ccPair.connector, + configSummary, + }, + }; + }); return (
{selectedCCPairs.length > 0 && ( @@ -207,6 +249,15 @@ export const DocumentSetCreationForm = ({ })}
)} +
+ + setSourceFilter((value as string) ?? "all") + } + /> +
{ @@ -249,6 +300,37 @@ export const DocumentSetCreationForm = ({ ); }} /> +
+ + {values.cc_pair_ids.length > 0 && ( + + )} +
); }} diff --git a/web/src/app/admin/documents/sets/page.tsx b/web/src/app/admin/documents/sets/page.tsx index 25c19de7a26..8e8d1a995c8 100644 --- a/web/src/app/admin/documents/sets/page.tsx +++ b/web/src/app/admin/documents/sets/page.tsx @@ -124,7 +124,12 @@ const DocumentSetTable = ({ .map((documentSet) => { return ( - + {/* No overflow-hidden here: the EditRow renders an absolutely + positioned "Cannot update while syncing" tooltip that must + escape the cell. Column width is held by table-fixed, and + the name itself is clipped by the inner `truncate` span, so + dropping overflow-hidden doesn't break the layout. */} +
diff --git a/web/src/app/admin/indexing/status/CCPairIndexingStatusTable.tsx b/web/src/app/admin/indexing/status/CCPairIndexingStatusTable.tsx index 11f9d576196..267ff7fe2a3 100644 --- a/web/src/app/admin/indexing/status/CCPairIndexingStatusTable.tsx +++ b/web/src/app/admin/indexing/status/CCPairIndexingStatusTable.tsx @@ -110,13 +110,17 @@ function ClickableTableRow({ export function CCPairIndexingStatusTable({ ccPairsIndexingStatuses, onRefresh, + initialStatusFilter = "all", }: { ccPairsIndexingStatuses: ConnectorIndexingStatus[]; onRefresh?: () => void; + // Seeds the status dropdown so a deep-link (e.g. ?status=active) lands + // straight in the filtered view without the user touching the filters. + initialStatusFilter?: string; }) { const [page, setPage] = useState(1); const [sourceFilter, setSourceFilter] = useState("all"); - const [statusFilter, setStatusFilter] = useState("all"); + const [statusFilter, setStatusFilter] = useState(initialStatusFilter); const [nameSearch, setNameSearch] = useState(""); // Sort state for the "Last Indexed" column. `none` falls back to the // page's default ordering (source-name ascending, set in page.tsx). @@ -153,7 +157,13 @@ export function CCPairIndexingStatusTable({ if (sourceFilter !== "all") { rows = rows.filter((s) => s.connector.source === sourceFilter); } - if (statusFilter !== "all") { + if (statusFilter === "active") { + // Composite lens: currently running + queued waiting for a worker + // slot — the scheduler's live activity in one view. + rows = rows.filter((s) => + ["in_progress", "not_started"].includes(effectiveStatus(s)) + ); + } else if (statusFilter !== "all") { rows = rows.filter((s) => effectiveStatus(s) === statusFilter); } const q = nameSearch.trim().toLowerCase(); @@ -181,6 +191,16 @@ export function CCPairIndexingStatusTable({ lastIndexedSort, ]); + // Keep the status filter in sync with the ?status= deep-link. 
The sidebar + // tabs (Existing Connectors / Indexing Activity / Failed Indexing) are the + // same route with different ?status= values; navigating between them is + // client-side and does NOT remount this component, so without this the URL + // changes but statusFilter (seeded once on mount) never updates — the click + // appears to do nothing. + useEffect(() => { + setStatusFilter(initialStatusFilter); + }, [initialStatusFilter]); + // Reset page + selection when any filter changes so we never act on hidden rows. useEffect(() => { setPage(1); @@ -306,6 +326,7 @@ export function CCPairIndexingStatusTable({ onChange={(e) => setStatusFilter(e.target.value)} > + diff --git a/web/src/app/admin/indexing/status/page.tsx b/web/src/app/admin/indexing/status/page.tsx index f5b8170dbcd..498eb95cc79 100644 --- a/web/src/app/admin/indexing/status/page.tsx +++ b/web/src/app/admin/indexing/status/page.tsx @@ -1,6 +1,7 @@ "use client"; -import { useState } from "react"; +import { Suspense, useState } from "react"; +import { useSearchParams } from "next/navigation"; import useSWR from "swr"; import { LoadingAnimation } from "@/components/Loading"; @@ -28,6 +29,10 @@ function buildIndexingStatusUrl(show: ShowFilter): string { } function Main() { + // Deep-link shortcut: ?status=active (or any status value) seeds the + // table's status filter so links/bookmarks land straight in that view. + const searchParams = useSearchParams(); + const initialStatusFilter = searchParams.get("status") ?? "all"; // Default to "enabled" so the initial load is small. Switching the // dropdown changes the SWR key (different URL) so SWR re-fetches // and caches each variant separately. @@ -137,6 +142,7 @@ function Main() { refetchIndexAttempt()} + initialStatusFilter={initialStatusFilter} /> )} @@ -157,7 +163,11 @@ export default function Status() { } /> -
+ {/* useSearchParams() in Main requires a Suspense boundary, or the + production build fails ("should be wrapped in a suspense boundary"). */} + }> +
+ ); } diff --git a/web/src/app/admin/settings/interfaces.ts b/web/src/app/admin/settings/interfaces.ts index c1f4da5b2cc..17dd2da2a94 100644 --- a/web/src/app/admin/settings/interfaces.ts +++ b/web/src/app/admin/settings/interfaces.ts @@ -5,6 +5,11 @@ export interface Settings { maximum_chat_retention_days: number | null; // Byte cap for chat file uploads (mirrors backend CHAT_FILE_MAX_SIZE_MB). chat_file_max_size_mb?: number; + // Cluster-level enablement (mirrors backend RERANK_ENABLED / + // LLM_RELEVANCE_FILTER_ENABLED). When false the chat + assistant UIs hide the + // corresponding rerank / relevance toggles. + rerank_enabled?: boolean; + llm_relevance_filter_enabled?: boolean; } export interface EnterpriseSettings { diff --git a/web/src/app/admin/tools/ToolEditor.tsx b/web/src/app/admin/tools/ToolEditor.tsx index 89046d21f1f..4b6b34e19a9 100644 --- a/web/src/app/admin/tools/ToolEditor.tsx +++ b/web/src/app/admin/tools/ToolEditor.tsx @@ -143,7 +143,7 @@ function ToolForm({

Available methods

- +
diff --git a/web/src/app/assistants/gallery/AssistantsGallery.tsx b/web/src/app/assistants/gallery/AssistantsGallery.tsx index cfae8b122b0..f2139307603 100644 --- a/web/src/app/assistants/gallery/AssistantsGallery.tsx +++ b/web/src/app/assistants/gallery/AssistantsGallery.tsx @@ -51,9 +51,9 @@ import { AssistantIcon } from "@/components/assistants/AssistantIcon"; import { Bubble } from "@/components/Bubble"; import { usePopup } from "@/components/admin/connectors/Popup"; import { - addAssistantToList, - reorderAssistantList, - removeAssistantFromList, + hideAssistant, + unhideAssistant, + setHiddenAssistants, } from "@/lib/assistants/updateAssistantPreferences"; import { checkUserOwnsAssistant } from "@/lib/assistants/checkOwnership"; import { AssistantsPageTitle } from "../AssistantsPageTitle"; @@ -225,7 +225,7 @@ function GalleryCard({ assistant, user, isAdded, onAdd, onRemove }: CardProps) { )} {/* Footer row: author (or built-in subtle text) + Add/Remove */} -
+
{isBuiltIn ? ( Bundled assistant @@ -381,16 +381,13 @@ export function AssistantsGallery({ }; const { popup, setPopup } = usePopup(); - // Mirrors the Manage page: no preference = every accessible assistant - // is "in the picker" by default. - const initialChosen: number[] = - user?.preferences?.chosen_assistants ?? assistants.map((a) => a.id); - const [chosenAssistants, setChosenAssistants] = - useState(initialChosen); - const chosenSet = useMemo( - () => new Set(chosenAssistants), - [chosenAssistants] + // Opt-out model: an assistant is "added" (in the picker) unless the user has + // explicitly hidden it. So new assistants are added for everyone by default. + const [hiddenIds, setHiddenIds] = useState( + user?.preferences?.hidden_assistants ?? [] ); + const hiddenSet = useMemo(() => new Set(hiddenIds), [hiddenIds]); + const isAdded = (id: number) => !hiddenSet.has(id); // ---- filter / sort state ------------------------------------------------- @@ -416,8 +413,8 @@ export function AssistantsGallery({ .toLowerCase(); if (!hay.includes(q)) return false; } - if (availability === "added" && !chosenSet.has(a.id)) return false; - if (availability === "available" && chosenSet.has(a.id)) return false; + if (availability === "added" && hiddenSet.has(a.id)) return false; + if (availability === "available" && !hiddenSet.has(a.id)) return false; return true; }); @@ -430,7 +427,7 @@ export function AssistantsGallery({ } // "featured" = preserve API order (admins curate via display_priority). 
return out; - }, [assistants, search, availability, sortMode, chosenSet]); + }, [assistants, search, availability, sortMode, hiddenSet]); // ---- derived: sections --------------------------------------------------- @@ -492,26 +489,26 @@ export function AssistantsGallery({ if (!hay.includes(q)) continue; } all++; - if (chosenSet.has(a.id)) added++; + if (!hiddenSet.has(a.id)) added++; else available++; } return { all, added, available }; - }, [assistants, search, chosenSet]); + }, [assistants, search, hiddenSet]); // ---- optimistic add/remove (mirrors Manage page persistOrder) ----------- - const persistChosen = async ( + const persistHidden = async ( next: number[], { successMsg, - undoToOrder, - }: { successMsg?: string; undoToOrder?: number[] } = {} + undoToHidden, + }: { successMsg?: string; undoToHidden?: number[] } = {} ): Promise => { - const prev = chosenAssistants; - setChosenAssistants(next); - const ok = await reorderAssistantList(next); + const prev = hiddenIds; + setHiddenIds(next); + const ok = await setHiddenAssistants(next); if (!ok) { - setChosenAssistants(prev); + setHiddenIds(prev); setPopup({ message: "Couldn't update your assistant list — please try again.", type: "error", @@ -523,10 +520,10 @@ export function AssistantsGallery({ message: successMsg, type: "success", undo: - undoToOrder !== undefined + undoToHidden !== undefined ? { onClick: async () => { - await persistChosen(undoToOrder); + await persistHidden(undoToHidden); }, } : undefined, @@ -536,18 +533,16 @@ export function AssistantsGallery({ return true; }; + // Add = unhide (remove from hidden_assistants). const handleAdd = async (a: Persona) => { if (!user) return; - if (chosenSet.has(a.id)) return; // already added — no-op - const prev = chosenAssistants; - const next = [...prev, a.id]; - // Use addAssistantToList specifically (idempotent) rather than the - // generic reorder helper — both PATCH the same endpoint, but this - // signals intent at the call-site. 
- setChosenAssistants(next); - const ok = await addAssistantToList(a.id, prev); + if (isAdded(a.id)) return; // already visible — no-op + const prev = hiddenIds; + const next = prev.filter((id) => id !== a.id); + setHiddenIds(next); + const ok = await unhideAssistant(a.id, prev); if (!ok) { - setChosenAssistants(prev); + setHiddenIds(prev); setPopup({ message: `Couldn't add "${a.name}". Try again?`, type: "error", @@ -559,16 +554,18 @@ export function AssistantsGallery({ type: "success", undo: { onClick: async () => { - await persistChosen(prev); + await persistHidden(prev); }, }, }); router.refresh(); }; + // Remove = hide (add to hidden_assistants). Keep at least one visible. const handleRemove = async (a: Persona) => { if (!user) return; - if (chosenAssistants.length === 1 && chosenAssistants[0] === a.id) { + const visibleCount = assistants.filter((x) => !hiddenSet.has(x.id)).length; + if (visibleCount <= 1 && isAdded(a.id)) { setPopup({ message: "You need at least one visible assistant — can't remove the last one.", @@ -576,12 +573,12 @@ export function AssistantsGallery({ }); return; } - const prev = chosenAssistants; - const next = prev.filter((id) => id !== a.id); - setChosenAssistants(next); - const ok = await removeAssistantFromList(a.id, prev); + const prev = hiddenIds; + const next = [...prev, a.id]; + setHiddenIds(next); + const ok = await hideAssistant(a.id, prev); if (!ok) { - setChosenAssistants(prev); + setHiddenIds(prev); setPopup({ message: `Couldn't remove "${a.name}". 
Try again?`, type: "error", @@ -593,7 +590,7 @@ export function AssistantsGallery({ type: "success", undo: { onClick: async () => { - await persistChosen(prev); + await persistHidden(prev); }, }, }); @@ -777,7 +774,7 @@ export function AssistantsGallery({ key={assistant.id} assistant={assistant} user={user} - isAdded={chosenSet.has(assistant.id)} + isAdded={isAdded(assistant.id)} onAdd={handleAdd} onRemove={handleRemove} /> diff --git a/web/src/app/assistants/gallery/page.tsx b/web/src/app/assistants/gallery/page.tsx index 86b9a6a8b57..3fe3c8e3508 100644 --- a/web/src/app/assistants/gallery/page.tsx +++ b/web/src/app/assistants/gallery/page.tsx @@ -24,6 +24,7 @@ export default async function GalleryPage({ const { user, chatSessions, + hasMoreChatSessions, availableSources, documentSets, assistants, @@ -44,6 +45,7 @@ export default async function GalleryPage({ value={{ user, chatSessions, + hasMoreChatSessions, availableSources, availableDocumentSets: documentSets, availablePersonas: assistants, @@ -56,6 +58,7 @@ export default async function GalleryPage({
a.id); - - const [chosenOrder, setChosenOrder] = useState(initialChosen); + // Opt-out model: `chosenOrder` controls ORDER only (and default = position + // 0); `hiddenIds` controls VISIBILITY. Anything not hidden is visible — so a + // newly created assistant shows up here (and in chat) automatically. + const [chosenOrder, setChosenOrder] = useState( + user?.preferences?.chosen_assistants ?? [] + ); + const [hiddenIds, setHiddenIds] = useState( + user?.preferences?.hidden_assistants ?? [] + ); const [search, setSearch] = useState(""); const [selected, setSelected] = useState>(new Set()); const [sharingAssistantId, setSharingAssistantId] = useState( @@ -565,20 +568,29 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { () => new Map(assistants.map((a) => [a.id, a])), [assistants] ); - const chosenSet = useMemo(() => new Set(chosenOrder), [chosenOrder]); + const hiddenSet = useMemo(() => new Set(hiddenIds), [hiddenIds]); + // Visible = everything NOT hidden, ordered by chosenOrder first, then the + // rest in their incoming (backend) order. 
const visibleAssistants: Persona[] = useMemo(() => { - const out: Persona[] = []; - for (const id of chosenOrder) { - const a = assistantsById.get(id); - if (a) out.push(a); - } - return out; - }, [chosenOrder, assistantsById]); + const notHidden = assistants.filter((a) => !hiddenSet.has(a.id)); + const orderMap = new Map(chosenOrder.map((id, i) => [id, i])); + return notHidden + .map((a, i) => ({ a, i })) + .sort((x, y) => { + const ox = orderMap.get(x.a.id); + const oy = orderMap.get(y.a.id); + if (ox !== undefined && oy !== undefined) return ox - oy; + if (ox !== undefined) return -1; + if (oy !== undefined) return 1; + return x.i - y.i; + }) + .map(({ a }) => a); + }, [assistants, hiddenSet, chosenOrder]); const hiddenAssistants: Persona[] = useMemo( - () => assistants.filter((a) => !chosenSet.has(a.id)), - [assistants, chosenSet] + () => assistants.filter((a) => hiddenSet.has(a.id)), + [assistants, hiddenSet] ); const matchesSearch = (a: Persona) => { @@ -592,15 +604,14 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { const filteredVisible = visibleAssistants.filter(matchesSearch); const filteredHidden = hiddenAssistants.filter(matchesSearch); - // The default is just position 0 of chosen_assistants. If the user has - // no preference at all, there's no notion of "default yet" — leave it - // unset so no row shows the accent until the user picks. - const defaultId = - user?.preferences?.chosen_assistants && chosenOrder.length > 0 - ? chosenOrder[0] - : null; + // Default = position 0 of the explicit order. If the user has never set an + // order, there's no default yet — leave it unset so no row shows the accent. + const defaultId = chosenOrder.length > 0 ? chosenOrder[0] : null; // ---- persistence with optimistic + undo -------------------------------- + // Two independent arrays now: `chosen_assistants` (order) and + // `hidden_assistants` (visibility). 
persistOrder writes the former, + // persistHidden the latter; each is optimistic with rollback + undo. const persistOrder = async ( nextOrder: number[], @@ -640,15 +651,54 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { return true; }; + const persistHidden = async ( + nextHidden: number[], + { + successMsg, + undoToHidden, + }: { successMsg?: string; undoToHidden?: number[] } = {} + ): Promise => { + const prev = hiddenIds; + setHiddenIds(nextHidden); + const ok = await setHiddenAssistants(nextHidden); + if (!ok) { + setHiddenIds(prev); + setPopup({ + message: "Couldn't update your assistant list — please try again.", + type: "error", + }); + return false; + } + if (successMsg) { + setPopup({ + message: successMsg, + type: "success", + undo: + undoToHidden !== undefined + ? { + onClick: async () => { + await persistHidden(undoToHidden); + }, + } + : undefined, + }); + } + router.refresh(); + return true; + }; + // ---- handlers ---------------------------------------------------------- const handleDragEnd = (event: DragEndEvent) => { const { active, over } = event; if (!over || active.id === over.id) return; - const oldIndex = chosenOrder.indexOf(Number(active.id)); - const newIndex = chosenOrder.indexOf(Number(over.id)); + // Reorder operates over the *visible* list (which may include assistants + // not yet in chosenOrder); we persist the resulting full visible order. 
+ const visibleIds = visibleAssistants.map((a) => a.id); + const oldIndex = visibleIds.indexOf(Number(active.id)); + const newIndex = visibleIds.indexOf(Number(over.id)); if (oldIndex < 0 || newIndex < 0) return; - const next = arrayMove(chosenOrder, oldIndex, newIndex); + const next = arrayMove(visibleIds, oldIndex, newIndex); void persistOrder(next, { successMsg: "Order updated.", undoToOrder: chosenOrder, @@ -656,38 +706,28 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { }; const handleSetDefault = async (id: number) => { - if (chosenOrder[0] === id) return; - const prev = chosenOrder; - const ok = await persistOrder( - [id, ...chosenOrder.filter((x) => x !== id)], - { - successMsg: `Default assistant updated.`, - undoToOrder: prev, - } - ); - if (!ok) { - // persistOrder already showed the error toast. - } else { - // setDefaultAssistant also handles the case where id wasn't in - // chosen_assistants; persistOrder above already prepended it. - void setDefaultAssistant(id, prev); // best-effort idempotent confirmation - } + const visibleIds = visibleAssistants.map((a) => a.id); + if (visibleIds[0] === id) return; + await persistOrder([id, ...visibleIds.filter((x) => x !== id)], { + successMsg: `Default assistant updated.`, + undoToOrder: chosenOrder, + }); }; const handleToggleVisibility = async (id: number, makeVisible: boolean) => { - const prev = chosenOrder; + const assistant = assistantsById.get(id); if (makeVisible) { - // Add to end so reorder isn't surprising. - const next = [...chosenOrder, id]; - const assistant = assistantsById.get(id); - await persistOrder(next, { - successMsg: assistant - ? `"${assistant.name}" added to your picker.` - : "Added to your picker.", - undoToOrder: prev, - }); + await persistHidden( + hiddenIds.filter((x) => x !== id), + { + successMsg: assistant + ? 
`"${assistant.name}" shown in your picker.` + : "Shown in your picker.", + undoToHidden: hiddenIds, + } + ); } else { - if (chosenOrder.length === 1 && chosenOrder[0] === id) { + if (visibleAssistants.length <= 1) { setPopup({ message: "You need at least one visible assistant — can't hide the last one.", @@ -695,13 +735,11 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { }); return; } - const next = chosenOrder.filter((x) => x !== id); - const assistant = assistantsById.get(id); - await persistOrder(next, { + await persistHidden([...hiddenIds, id], { successMsg: assistant ? `"${assistant.name}" hidden from your picker.` : "Hidden from your picker.", - undoToOrder: prev, + undoToHidden: hiddenIds, }); } }; @@ -718,67 +756,41 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { const clearSelection = () => setSelected(new Set()); const handleBulkShow = async () => { - const ids = Array.from(selected); - const prev = chosenOrder; - const ok = await bulkAddToList(ids, chosenOrder); - if (!ok) { - setPopup({ message: "Couldn't show selected.", type: "error" }); - return; - } - // Mirror the optimistic update locally — the helper PATCHed the - // server; we just need to align local state. - const existing = new Set(chosenOrder); - const toAppend = ids.filter((id) => !existing.has(id)); - setChosenOrder([...chosenOrder, ...toAppend]); - setPopup({ - message: `${ids.length} assistant${ids.length === 1 ? "" : "s"} shown.`, - type: "success", - undo: { - onClick: async () => { - await persistOrder(prev); - }, - }, + const ids = new Set(selected); + const next = hiddenIds.filter((id) => !ids.has(id)); + const ok = await persistHidden(next, { + successMsg: `${ids.size} assistant${ids.size === 1 ? 
"" : "s"} shown.`, + undoToHidden: hiddenIds, }); - clearSelection(); - router.refresh(); + if (ok) clearSelection(); }; const handleBulkHide = async () => { const ids = Array.from(selected); + const idSet = new Set(ids); // Don't let the user hide every visible row at once. - const remaining = chosenOrder.filter((id) => !ids.includes(id)); - if (remaining.length === 0 && chosenOrder.length > 0) { + const remainingVisible = assistants.filter( + (a) => !hiddenSet.has(a.id) && !idSet.has(a.id) + ); + if (remainingVisible.length === 0) { setPopup({ message: "Can't hide every visible assistant — keep at least one.", type: "error", }); return; } - const prev = chosenOrder; - const ok = await bulkRemoveFromList(ids, chosenOrder); - if (!ok) { - setPopup({ message: "Couldn't hide selected.", type: "error" }); - return; - } - setChosenOrder(remaining); - setPopup({ - message: `${ids.length} assistant${ids.length === 1 ? "" : "s"} hidden.`, - type: "success", - undo: { - onClick: async () => { - await persistOrder(prev); - }, - }, + const next = Array.from(new Set([...hiddenIds, ...ids])); + const ok = await persistHidden(next, { + successMsg: `${ids.length} assistant${ids.length === 1 ? "" : "s"} hidden.`, + undoToHidden: hiddenIds, }); - clearSelection(); - router.refresh(); + if (ok) clearSelection(); }; - // "Remove" is the same backend op as Hide today — both just remove the - // ids from chosen_assistants. The label distinction is a UX hint: Hide - // is reversible by toggling the switch back on (or Undo); Remove - // implies "I don't want to see this any more." Functionally identical - // until we have a true "remove access" path. + // "Remove" is the same op as Hide today — both add the ids to + // hidden_assistants. The label distinction is a UX hint: Hide is reversible + // by toggling back on (or Undo); Remove implies "I don't want to see this." + // Functionally identical until we have a true "remove access" path. 
const handleBulkRemove = handleBulkHide; // ---- DnD plumbing ------------------------------------------------------- @@ -885,7 +897,7 @@ export function AssistantsList({ user, assistants }: AssistantsListProps) { modifiers={[restrictToVerticalAxis]} > a.id)} + items={filteredVisible.map((a) => String(a.id))} strategy={verticalListSortingStrategy} > {filteredVisible.map((assistant) => ( diff --git a/web/src/app/assistants/mine/page.tsx b/web/src/app/assistants/mine/page.tsx index c1cac2273ef..1aff6588774 100644 --- a/web/src/app/assistants/mine/page.tsx +++ b/web/src/app/assistants/mine/page.tsx @@ -25,6 +25,7 @@ export default async function GalleryPage({ const { user, chatSessions, + hasMoreChatSessions, availableSources, documentSets, assistants, @@ -45,6 +46,7 @@ export default async function GalleryPage({ value={{ user, chatSessions, + hasMoreChatSessions, availableSources, availableDocumentSets: documentSets, availablePersonas: assistants, @@ -57,6 +59,7 @@ export default async function GalleryPage({
( -
{title}
-
{description}
+// Minimal chat landing: a centered greeting that sits directly above the input +// on the empty state (the logo and the Assistant/Filters/Model, Knowledge Sets, +// and Connected Sources tiles were intentionally removed for a cleaner, wider, +// reference-style landing). +export function ChatIntro() { + return ( +
+

+ Search your company's internal knowledge base +

); } - -function StepCard({ - icon: Icon, - title, - subtitle, - onClick, -}: { - icon: IconType; - title: string; - subtitle: string; - onClick: () => void; -}) { - return ( - - ); -} - -function OnboardingSteps({ - setConfigModalActiveTab, -}: { - setConfigModalActiveTab: (tab: string | null) => void; -}) { - return ( -
- setConfigModalActiveTab("assistants")} - /> - setConfigModalActiveTab("filters")} - /> - setConfigModalActiveTab("llms")} - /> -
- ); -} - -export function ChatIntro({ - availableSources, - selectedPersona, - setConfigModalActiveTab, -}: { - availableSources: ValidSources[]; - selectedPersona: Persona; - setConfigModalActiveTab?: (tab: string | null) => void; -}) { - const availableSourceMetadata = getSourceMetadataForSources(availableSources); - - const [displaySources, setDisplaySources] = useState(false); - - return ( - <> -
-
-
-
- - -
- {selectedPersona?.name || "How can I help you today?"} -
- {selectedPersona && ( -
{selectedPersona.description}
- )} -
-
- - {setConfigModalActiveTab && ( - - )} - - {selectedPersona && selectedPersona.num_chunks !== 0 && ( - <> - -
- {selectedPersona.document_sets.length > 0 && ( -
-

- Knowledge Sets:{" "} -

-
- {selectedPersona.document_sets.map((documentSet) => ( -
- -
- -
- {documentSet.name} - - } - popupContent={ -
- -
- {documentSet.description} -
-
- } - direction="top" - /> -
- ))} -
-
- )} - - {availableSources.length > 0 && ( -
-

- Connected Sources:{" "} -

-
- {availableSourceMetadata.map((sourceMetadata) => ( - -
- {sourceMetadata.icon({})} -
-
- {sourceMetadata.displayName} -
-
- ))} -
-
- )} -
- - )} -
-
- - ); -} diff --git a/web/src/app/chat/ChatPage.tsx b/web/src/app/chat/ChatPage.tsx index dd0d34cb341..da432589d2d 100644 --- a/web/src/app/chat/ChatPage.tsx +++ b/web/src/app/chat/ChatPage.tsx @@ -65,6 +65,7 @@ import { useChatContext } from "@/components/context/ChatContext"; import { UserDropdown } from "@/components/UserDropdown"; import { v4 as uuidv4 } from "uuid"; import { orderAssistantsForUser } from "@/lib/assistants/orderAssistants"; +import { assistantDisplayName } from "@/lib/assistants/displayName"; import { ChatPopup } from "./ChatPopup"; import { ChatBanner } from "./ChatBanner"; import { TbLayoutSidebarRightExpand } from "react-icons/tb"; @@ -89,6 +90,7 @@ export function ChatPage({ let { user, chatSessions, + hasMoreChatSessions, availableSources, availableDocumentSets, availablePersonas, @@ -140,10 +142,14 @@ export function ChatPage({ chatSessionIdRef.current = existingChatSessionId; textAreaRef.current?.focus(); - // only clear things if we're going from one chat session to another - const isChatSessionSwitch = - chatSessionIdRef.current !== null && - existingChatSessionId !== priorChatSessionId; + // Clear per-session state whenever the active session actually changes — + // including when starting a brand-new chat (existingChatSessionId === null). + // The previous guard (`chatSessionIdRef.current !== null`) skipped the + // New Chat case, so selected documents / filters leaked into the next + // session; the stale search_doc_ids then belong to the old session and the + // backend rejects them ("Invalid reference doc, not from this chat + // session"), failing the next message. + const isChatSessionSwitch = existingChatSessionId !== priorChatSessionId; if (isChatSessionSwitch) { // de-select documents clearSelectedDocuments(); @@ -363,6 +369,12 @@ export function ChatPage({ ); const [isStreaming, setIsStreaming] = useState(false); + // Per-conversation search-quality toggles (default OFF, independent of the + // assistant's own settings). 
Each is also gated server-side by its global + // master switch. Reset per page load — "default off" is intentional. + const [useReranking, setUseReranking] = useState(false); + const [useRelevanceFilter, setUseRelevanceFilter] = useState(false); + // uploaded files const [currentMessageFiles, setCurrentMessageFiles] = useState< FileDescriptor[] @@ -798,6 +810,8 @@ export function ChatPage({ systemPromptOverride: searchParams.get(SEARCH_PARAM_NAMES.SYSTEM_PROMPT) || undefined, useExistingUserMessage: isSeededChat, + useReranking: useReranking, + useRelevanceFilter: useRelevanceFilter, }); const updateFn = (messages: Message[]) => { const replacementsMap = finalMessage @@ -995,7 +1009,7 @@ export function ChatPage({ // a blank session with no explanation of why. setPopup({ message: - `Started a new chat with "${persona.name}", as each chat is bound to a single assistant.` + + `Started a new chat with "${assistantDisplayName(persona)}", as each chat is bound to a single assistant.` + (hadFiles ? " Please re-upload any files you'd attached." : ""), type: "success", }); @@ -1087,7 +1101,10 @@ export function ChatPage({ router.push("/search"); } - const [showDocSidebar, setShowDocSidebar] = useState(true); // State to track if sidebar is open + // Collapsed by default — the retrieved-documents sidebar stays hidden until + // the user opens it (there's a toggle button), so the landing isn't cluttered + // by an empty sidebar before any results exist. + const [showDocSidebar, setShowDocSidebar] = useState(false); const toggleSidebar = () => { if (sidebarElementRef.current) { @@ -1121,6 +1138,29 @@ export function ChatPage({ const sidebarElementRef = useRef(null); const innerSidebarElementRef = useRef(null); + // Auto-open the retrieved-documents panel when a fresh answer comes back with + // documents (it starts collapsed on the landing — see showDocSidebar above). 
+ // Tracked per-message via the ref so the user can still manually hide it + // without it springing back open on the next render. + const autoOpenedDocsForRef = useRef(null); + useEffect(() => { + if (!retrievalEnabled) { + return; + } + const lastWithDocs = [...messageHistory] + .reverse() + .find( + (m) => m.type === "assistant" && !!m.documents && m.documents.length > 0 + ); + if ( + lastWithDocs && + autoOpenedDocsForRef.current !== lastWithDocs.messageId + ) { + autoOpenedDocsForRef.current = lastWithDocs.messageId; + setShowDocSidebar(true); + } + }, [messageHistory, retrievalEnabled]); + const currentPersona = selectedAssistant || livePersona; const updateSelectedAssistant = (newAssistant: Persona | null) => { @@ -1141,9 +1181,13 @@ export function ChatPage({ Only used in the EE version of the app. */} -
+
( <>
{/* */} + {/* Atmospheric backdrop behind the empty-state landing only + (removed once a conversation starts so it never sits behind + messages). Lives on this full-height container — not the + inner scroll area — so it also covers the bottom input + padding region, otherwise that strip renders as a black + band. -z-10 stays contained by `isolate`. */} + {messageHistory.length === 0 && + !isFetchingChatMessages && + !isStreaming && ( +
+ )} +
- {/* ChatBanner is a custom banner that displays a admin-specified message at + {/* ChatBanner is a custom banner that displays a admin-specified message at the top of the chat page. Only used in the EE version of the app. */} @@ -1253,16 +1309,6 @@ export function ChatPage({
)} - {messageHistory.length === 0 && - !isFetchingChatMessages && - !isStreaming && ( - - )} -
{message.message} @@ -1496,7 +1542,7 @@ export function ChatPage({ selectedAssistant } messageId={null} - personaName={livePersona.name} + personaName={assistantDisplayName(livePersona)} content={
+ {/* Empty-state landing: greeting sits directly above + the input, and the whole group is vertically centered + (the wrapper above switches from bottom-pinned to + center). Once a conversation starts it snaps to the + bottom as usual. */} + {messageHistory.length === 0 && + !isFetchingChatMessages && + !isStreaming && } {aboveHorizon && (
diff --git a/web/src/app/chat/ChatPopup.tsx b/web/src/app/chat/ChatPopup.tsx index 252b38a4d8c..9725c2c0bff 100644 --- a/web/src/app/chat/ChatPopup.tsx +++ b/web/src/app/chat/ChatPopup.tsx @@ -35,7 +35,7 @@ export function ChatPopup() { <> (
diff --git a/web/src/app/chat/documentSidebar/ChatDocumentDisplay.tsx b/web/src/app/chat/documentSidebar/ChatDocumentDisplay.tsx index 3243c15896f..9fb196aefce 100644 --- a/web/src/app/chat/documentSidebar/ChatDocumentDisplay.tsx +++ b/web/src/app/chat/documentSidebar/ChatDocumentDisplay.tsx @@ -6,10 +6,7 @@ import { DocumentUpdatedAtBadge } from "@/components/search/DocumentUpdatedAtBad import { DanswerDocument } from "@/lib/search/interfaces"; import { FiInfo, FiRadio } from "react-icons/fi"; import { DocumentSelector } from "./DocumentSelector"; -import { - DocumentMetadataBlock, - buildDocumentSummaryDisplay, -} from "@/components/search/DocumentDisplay"; +import { DocumentMetadataBlock } from "@/components/search/DocumentDisplay"; interface DocumentDisplayProps { document: DanswerDocument; @@ -57,7 +54,7 @@ export function ChatDocumentDisplay({ {isAIPick && (
} + mainContent={} popupContent={
@@ -103,7 +100,8 @@ export function ChatDocumentDisplay({

- {buildDocumentSummaryDisplay(document.match_highlights, document.blurb)} + {/* Keyword highlighting intentionally omitted — show the plain blurb. */} + {document.blurb}

{/* diff --git a/web/src/app/chat/documentSidebar/DocumentSidebar.tsx b/web/src/app/chat/documentSidebar/DocumentSidebar.tsx index aa962a1f9ea..54bd6646248 100644 --- a/web/src/app/chat/documentSidebar/DocumentSidebar.tsx +++ b/web/src/app/chat/documentSidebar/DocumentSidebar.tsx @@ -88,7 +88,7 @@ export const DocumentSidebar = forwardRef(
void; + useRelevanceFilter: boolean; + setUseRelevanceFilter: (value: boolean) => void; selectedAssistant: Persona; alternativeAssistant: Persona | null; files: FileDescriptor[]; setFiles: (files: FileDescriptor[]) => void; handleFileUpload: (files: File[]) => void; setConfigModalActiveTab: (tab: string) => void; + configModalActiveTab: string | null; textAreaRef: React.RefObject; }) { // handle re-sizing of the text area @@ -104,6 +117,9 @@ export function ChatInputBar({ }; const { llmProviders } = useChatContext(); + // Cluster-level enablement — hide the per-conversation rerank/relevance + // toggles entirely when the feature is disabled cluster-wide. + const settings = useContext(SettingsContext)?.settings; const [_, llmName] = getFinalLLM(llmProviders, selectedAssistant, null); const suggestionsRef = useRef(null); @@ -202,7 +218,7 @@ export function ChatInputBar({ return (
-
+
-
+
{filteredPersonas.map((currentPersona, index) => ( + )} + + {!retrievalDisabled && settings?.llm_relevance_filter_enabled && ( + + )} +
-
+
diff --git a/web/src/app/chat/interfaces.ts b/web/src/app/chat/interfaces.ts index ae3a7dd3e37..83ce8ad7c27 100644 --- a/web/src/app/chat/interfaces.ts +++ b/web/src/app/chat/interfaces.ts @@ -48,6 +48,10 @@ export interface ToolCallFinalResult { tool_result: Record; } +// Number of chat sessions loaded per "page" in the sidebar history. The first +// page is rendered server-side; older pages are lazy-loaded on scroll. +export const CHAT_SESSION_PAGE_SIZE = 30; + export interface ChatSession { id: number; name: string; diff --git a/web/src/app/chat/lib.tsx b/web/src/app/chat/lib.tsx index 606b206f37b..03fab8898a0 100644 --- a/web/src/app/chat/lib.tsx +++ b/web/src/app/chat/lib.tsx @@ -96,6 +96,8 @@ export async function* sendMessage({ systemPromptOverride, useExistingUserMessage, alternateAssistantId, + useReranking, + useRelevanceFilter, }: { message: string; fileDescriptors: FileDescriptor[]; @@ -116,6 +118,11 @@ export async function* sendMessage({ // and will ignore the specified `message` useExistingUserMessage?: boolean; alternateAssistantId?: number; + // Per-conversation search-quality toggles (default off). Each is also gated + // server-side by its global master switch (RERANK_ENABLED / + // LLM_RELEVANCE_FILTER_ENABLED). + useReranking?: boolean; + useRelevanceFilter?: boolean; }) { const documentsAreSelected = selectedDocumentIds && selectedDocumentIds.length > 0; @@ -161,6 +168,8 @@ export async function* sendMessage({ } : null, use_existing_user_message: useExistingUserMessage, + use_reranking: useReranking ?? false, + use_relevance_filter: useRelevanceFilter ?? 
false, }), }); if (!sendMessageResponse.ok) { @@ -316,6 +325,27 @@ export function getCitedDocumentsFromMessage(message: Message) { return documentsWithCitationKey; } +// Cutoff timestamps (ISO) matching the date buckets in +// groupSessionsByDateRange, for lazy-loading each bucket from the backend: +// today (Today): time_created >= oneDayAgo +// prev7 (Previous 7 Days): sevenDaysAgo <= time_created < oneDayAgo +// prev30 (Previous 30 Days): thirtyDaysAgo <= time_created < sevenDaysAgo +// older (Over 30 days ago): time_created < thirtyDaysAgo +export function getChatHistoryBoundaries(): { + oneDayAgo: string; + sevenDaysAgo: string; + thirtyDaysAgo: string; +} { + const today = new Date(); + today.setHours(0, 0, 0, 0); + const day = 1000 * 3600 * 24; + return { + oneDayAgo: new Date(today.getTime() - 1 * day).toISOString(), + sevenDaysAgo: new Date(today.getTime() - 7 * day).toISOString(), + thirtyDaysAgo: new Date(today.getTime() - 30 * day).toISOString(), + }; +} + export function groupSessionsByDateRange(chatSessions: ChatSession[]) { const today = new Date(); today.setHours(0, 0, 0, 0); // Set to start of today for accurate comparison diff --git a/web/src/app/chat/message/Messages.tsx b/web/src/app/chat/message/Messages.tsx index 84cbe2e2368..e0f090851f5 100644 --- a/web/src/app/chat/message/Messages.tsx +++ b/web/src/app/chat/message/Messages.tsx @@ -39,6 +39,7 @@ import Prism from "prismjs"; import "prismjs/themes/prism-tomorrow.css"; import "./custom-code-styles.css"; import { Persona } from "@/app/admin/assistants/interfaces"; +import { assistantDisplayName } from "@/lib/assistants/displayName"; import { Button } from "@tremor/react"; import { AssistantIcon } from "@/components/assistants/AssistantIcon"; @@ -169,7 +170,9 @@ export const AIMessage = ({ return (
-
+ {/* px-4 matches the input bar's inset so the answer text lines up + exactly with the search box (previously ml-8 shifted it ~16px right). */} +
{alternativeAssistant - ? alternativeAssistant.name + ? assistantDisplayName(alternativeAssistant) : personaName || "Darwin"}
@@ -187,7 +190,7 @@ export const AIMessage = ({ handleShowRetrieved !== undefined && isCurrentlyShowingRetrieved !== undefined && !retrievalDisabled && ( -
+
-
+
{(!toolCall || toolCall.tool_name === SEARCH_TOOL_NAME) && ( <> {query !== undefined && @@ -265,7 +268,11 @@ export const AIMessage = ({ {typeof content === "string" ? ( (forced text-default + // below) was readable and everything else stayed prose-grey. + className="prose dark:prose-invert max-w-full" components={{ a: (props) => { const { node, ...rest } = props; @@ -455,7 +462,9 @@ export const HumanMessage = ({ onMouseLeave={() => setIsHovered(false)} >
-
+ {/* px-4 matches the input bar inset (see AIMessage) so the user's + message lines up exactly with the search box. */} +
@@ -465,8 +474,8 @@ export const HumanMessage = ({
You
-
-
+
+
{isEditing ? ( @@ -563,7 +572,7 @@ export const HumanMessage = ({
) : typeof content === "string" ? ( -
+
{content}
) : ( diff --git a/web/src/app/chat/modal/ModalWrapper.tsx b/web/src/app/chat/modal/ModalWrapper.tsx index 9b0f56433be..065435a78af 100644 --- a/web/src/app/chat/modal/ModalWrapper.tsx +++ b/web/src/app/chat/modal/ModalWrapper.tsx @@ -13,7 +13,7 @@ export const ModalWrapper = ({
onClose && onClose()} className={ - "fixed inset-0 bg-black bg-opacity-30 backdrop-blur-sm " + + "fixed inset-0 bg-black bg-opacity-60 backdrop-blur-sm " + "flex items-center justify-center z-50 " + (bgClassName || "") } diff --git a/web/src/app/chat/modal/configuration/AssistantsTab.tsx b/web/src/app/chat/modal/configuration/AssistantsTab.tsx index 08589144abd..5367c6e122d 100644 --- a/web/src/app/chat/modal/configuration/AssistantsTab.tsx +++ b/web/src/app/chat/modal/configuration/AssistantsTab.tsx @@ -1,7 +1,8 @@ import { Persona } from "@/app/admin/assistants/interfaces"; +import { assistantDisplayName } from "@/lib/assistants/displayName"; import { Bubble } from "@/components/Bubble"; import { AssistantIcon } from "@/components/assistants/AssistantIcon"; -import React from "react"; +import React, { useState } from "react"; import { FiBookmark, FiImage, FiSearch } from "react-icons/fi"; interface AssistantsTabProps { @@ -15,11 +16,39 @@ export function AssistantsTab({ availableAssistants, onSelect, }: AssistantsTabProps) { + const [query, setQuery] = useState(""); + const q = query.trim().toLowerCase(); + const filteredAssistants = availableAssistants.filter( + (assistant) => + !q || + assistantDisplayName(assistant).toLowerCase().includes(q) || + assistant.name.toLowerCase().includes(q) || + (assistant.description?.toLowerCase().includes(q) ?? false) + ); + return ( <>

Choose Assistant

+ +
+ + setQuery(e.target.value)} + placeholder="Search assistants…" + className="w-full bg-transparent text-sm text-default placeholder:text-subtle focus:outline-none" + /> +
+
- {availableAssistants.map((assistant) => ( + {filteredAssistants.length === 0 && ( +
+ No assistants match “{query}”. +
+ )} + {filteredAssistants.map((assistant) => (
- {assistant.name} + {assistantDisplayName(assistant)}
{assistant.tools.length > 0 && ( diff --git a/web/src/app/chat/modal/configuration/FiltersTab.tsx b/web/src/app/chat/modal/configuration/FiltersTab.tsx index 86adbbfcc2c..2a9c4f0a158 100644 --- a/web/src/app/chat/modal/configuration/FiltersTab.tsx +++ b/web/src/app/chat/modal/configuration/FiltersTab.tsx @@ -110,7 +110,7 @@ export function FiltersTab({
setDocSetFilter(e.target.value)} diff --git a/web/src/app/chat/page.tsx b/web/src/app/chat/page.tsx index 325da0d0cf1..fd20ec155e3 100644 --- a/web/src/app/chat/page.tsx +++ b/web/src/app/chat/page.tsx @@ -24,6 +24,7 @@ export default async function Page({ const { user, chatSessions, + hasMoreChatSessions, ccPairs, availableSources, documentSets, @@ -52,6 +53,7 @@ export default async function Page({ value={{ user, chatSessions, + hasMoreChatSessions, availableSources, availableDocumentSets: documentSets, availablePersonas: assistants, diff --git a/web/src/app/chat/sessionSidebar/AssistantsTab.tsx b/web/src/app/chat/sessionSidebar/AssistantsTab.tsx index 62b5d6d58b4..1b168069469 100644 --- a/web/src/app/chat/sessionSidebar/AssistantsTab.tsx +++ b/web/src/app/chat/sessionSidebar/AssistantsTab.tsx @@ -62,7 +62,7 @@ export function AssistantsTab({ ); return ( -
+
Select an Assistant below to begin a new chat with them! diff --git a/web/src/app/chat/sessionSidebar/ChatSessionDisplay.tsx b/web/src/app/chat/sessionSidebar/ChatSessionDisplay.tsx index 20c76145fa7..2824af90034 100644 --- a/web/src/app/chat/sessionSidebar/ChatSessionDisplay.tsx +++ b/web/src/app/chat/sessionSidebar/ChatSessionDisplay.tsx @@ -164,7 +164,10 @@ export function ChatSessionDisplay({ {chatName || `Chat ${chatSession.id}`}

{chatSession.time_created && ( -

+

{timeAgo(chatSession.time_created)}

)} @@ -175,7 +178,7 @@ export function ChatSessionDisplay({
@@ -184,7 +187,7 @@ export function ChatSessionDisplay({ setChatName(chatSession.name); setIsRenamingChat(false); }} - className={`hover:bg-black/10 p-1 -m-1 rounded ml-2`} + className={`hover:bg-black/10 dark:hover:bg-white/10 p-1 -m-1 rounded ml-2`} >
@@ -206,7 +209,7 @@ export function ChatSessionDisplay({ setIsMoreOptionsDropdownOpen(open) } content={ -
+
} @@ -232,7 +235,7 @@ export function ChatSessionDisplay({
setIsDeletionModalVisible(true)} - className={`hover:bg-black/10 p-1 -m-1 rounded ml-2`} + className={`hover:bg-black/10 dark:hover:bg-white/10 p-1 -m-1 rounded ml-2`} >
@@ -243,7 +246,7 @@ export function ChatSessionDisplay({
)} {!isSelected && !delayedSkipGradient && ( -
+
)} diff --git a/web/src/app/chat/sessionSidebar/ChatSidebar.tsx b/web/src/app/chat/sessionSidebar/ChatSidebar.tsx index b6e094f6393..d3a7f8f6665 100644 --- a/web/src/app/chat/sessionSidebar/ChatSidebar.tsx +++ b/web/src/app/chat/sessionSidebar/ChatSidebar.tsx @@ -32,11 +32,13 @@ import { HeaderTitle } from "@/components/header/Header"; export const ChatSidebar = ({ existingChats, + hasMoreChats, currentChatSession, folders, openedFolders, }: { existingChats: ChatSession[]; + hasMoreChats: boolean; currentChatSession: ChatSession | null | undefined; folders: Folder[]; openedFolders: { [key: number]: boolean }; @@ -83,12 +85,10 @@ export const ChatSidebar = ({ w-64 flex flex-none - bg-background-weak + bg-background 3xl:w-72 - border-r - border-border - flex - flex-col + flex + flex-col h-screen transition-transform`} id="chat-sidebar" @@ -195,6 +195,7 @@ export const ChatSidebar = ({ s.id)); + const merged = [...prev]; + for (const s of next) { + if (!seen.has(s.id)) { + merged.push(s); + } + } + return merged; +} export function ChatTab({ existingChats, + hasMoreChats, currentChatId, folders, openedFolders, }: { existingChats: ChatSession[]; + hasMoreChats: boolean; currentChatId?: number; folders: Folder[]; openedFolders: { [key: number]: boolean }; }) { - const groupedChatSessions = groupSessionsByDateRange(existingChats); const { setPopup } = usePopup(); const router = useRouter(); const [isDragOver, setIsDragOver] = useState(false); + // Date cutoffs that mirror groupSessionsByDateRange. Computed once so the + // window doesn't shift mid-session. + const [boundaries] = useState(() => getChatHistoryBoundaries()); + const oneDayMs = new Date(boundaries.oneDayAgo).getTime(); + + // Keep only genuinely-today sessions from the server seed (drops the + // current-chat merge fetchChatData prepends, which may be an older session). 
+ const todaySeed = useCallback( + (chats: ChatSession[]) => + chats.filter((c) => new Date(c.time_created).getTime() >= oneDayMs), + [oneDayMs] + ); + + const [buckets, setBuckets] = useState>(() => ({ + today: { + sessions: todaySeed(existingChats), + pagesLoaded: 1, + hasMore: hasMoreChats, + loading: false, + loaded: true, + }, + prev7: { ...EMPTY_BUCKET }, + prev30: { ...EMPTY_BUCKET }, + older: { ...EMPTY_BUCKET }, + })); + const [expanded, setExpanded] = useState>({ + today: true, + prev7: false, + prev30: false, + older: false, + }); + + // Mirror buckets to a ref so click handlers read the freshest offset/loaded + // without being re-created (and re-bound) on every render. + const bucketsRef = useRef(buckets); + useEffect(() => { + bucketsRef.current = buckets; + }, [buckets]); + + // Re-seed Today when the server prop changes (router.refresh on rename / new + // chat / chat switch). Older buckets are left alone so an expanded section + // isn't collapsed (and re-fetched) by an unrelated refresh. 
+ useEffect(() => { + setBuckets((prev) => ({ + ...prev, + today: { + sessions: todaySeed(existingChats), + pagesLoaded: 1, + hasMore: hasMoreChats, + loading: false, + loaded: true, + }, + })); + }, [existingChats, hasMoreChats, todaySeed]); + + const queryFor = useCallback( + (key: BucketKey, offset: number): string => { + const params = new URLSearchParams({ + limit: String(CHAT_SESSION_PAGE_SIZE), + offset: String(offset), + }); + if (key === "today") { + params.set("start_time", boundaries.oneDayAgo); + } else if (key === "prev7") { + params.set("start_time", boundaries.sevenDaysAgo); + params.set("end_time", boundaries.oneDayAgo); + } else if (key === "prev30") { + params.set("start_time", boundaries.thirtyDaysAgo); + params.set("end_time", boundaries.sevenDaysAgo); + } else { + params.set("end_time", boundaries.thirtyDaysAgo); + } + return params.toString(); + }, + [boundaries] + ); + + const loadBucket = useCallback( + async (key: BucketKey) => { + const current = bucketsRef.current[key]; + if (current.loading) { + return; + } + const offset = current.pagesLoaded * CHAT_SESSION_PAGE_SIZE; + setBuckets((p) => ({ ...p, [key]: { ...p[key], loading: true } })); + try { + const res = await fetch( + `/api/chat/get-user-chat-sessions?${queryFor(key, offset)}` + ); + if (!res.ok) { + throw new Error(`status ${res.status}`); + } + const body = await res.json(); + setBuckets((p) => ({ + ...p, + [key]: { + sessions: dedupeAppend(p[key].sessions, body.sessions ?? []), + pagesLoaded: p[key].pagesLoaded + 1, + hasMore: body.has_more ?? 
false, + loading: false, + loaded: true, + }, + })); + } catch (error) { + setBuckets((p) => ({ + ...p, + [key]: { ...p[key], loading: false, hasMore: false, loaded: true }, + })); + setPopup({ message: "Failed to load chats", type: "error" }); + } + }, + [queryFor, setPopup] + ); + + const toggleBucket = (key: BucketKey) => { + const willExpand = !expanded[key]; + setExpanded((p) => ({ ...p, [key]: willExpand })); + if (willExpand && !bucketsRef.current[key].loaded) { + loadBucket(key); + } + }; + const handleDropToRemoveFromFolder = async ( event: React.DragEvent ) => { @@ -49,8 +216,21 @@ export function ChatTab({ } }; + const renderSessions = (sessions: ChatSession[]) => + sessions + .filter((chat) => chat.folder_id === null) + .map((chat) => ( +
+ +
+ )); + return ( -
+
{folders.length > 0 && (
@@ -75,33 +255,66 @@ export function ChatTab({ isDragOver ? "bg-hover" : "" } rounded-md`} > - {Object.entries(groupedChatSessions).map( - ([dateRange, chatSessions]) => { - if (chatSessions.length > 0) { - return ( -
+ {BUCKETS.map(({ key, title, collapsible }) => { + const bucket = buckets[key]; + const isOpen = expanded[key]; + const loose = bucket.sessions.filter((c) => c.folder_id === null); + return ( +
+ {collapsible ? ( + + ) : ( + loose.length > 0 && (
- {dateRange} + {title}
- {chatSessions - .filter((chat) => chat.folder_id === null) - .map((chat) => { - const isSelected = currentChatId === chat.id; - return ( -
- -
- ); - })} -
- ); - } - } - )} + ) + )} + + {isOpen && ( + <> + {renderSessions(bucket.sessions)} + + {bucket.loading && ( +
+ +
+ )} + + {!bucket.loading && bucket.hasMore && ( + + )} + + {collapsible && + bucket.loaded && + !bucket.loading && + loose.length === 0 && ( +
+ No chats in this range +
+ )} + + )} +
+ ); + })}
); diff --git a/web/src/app/globals.css b/web/src/app/globals.css index 45f0c1a6f9e..e6cf38d0b45 100644 --- a/web/src/app/globals.css +++ b/web/src/app/globals.css @@ -2,6 +2,147 @@ @tailwind components; @tailwind utilities; +/* Dark mode palette. Only the dark overrides live here — light mode uses the + * fallbacks baked into the token definitions (tailwind-themes/tailwind.config.js). + * Scoped to any `.dark` ancestor (CSS vars cascade), so dark mode applies only + * where the `.dark` class is set (currently: the chat page container). */ +.dark { + /* Render native form controls (checkboxes, date pickers, default input + backgrounds) and scrollbars in dark — fixes bright white checkboxes/inputs. */ + color-scheme: dark; + --background: #0f1117; + --background-weak: #171a21; + --background-emphasis: #1c1f27; + --background-strong: #232733; + --background-subtle: #2a2f3a; + --background-search: #171a21; + --background-custom-header: #171a21; + + --text-light: #3a3f4b; + --text-subtle: #9aa3b2; + --text-default: #c7ccd5; + --text-emphasis: #e4e7ec; + --text-strong: #f5f6f8; + + --border: #2a2f3a; + --border-light: #232733; + --border-medium: #353b48; + --border-strong: #4a5160; + + --hover-light: #1c1f27; + --hover: #232733; + --hover-emphasis: #2f3542; +} + +/* Atmosphere: a soft accent glow at the top of the chat canvas for depth + * (subtle in light, a touch more present in dark). */ +#chat-root { + background-image: radial-gradient( + 1100px 560px at 50% -12%, + rgba(102, 113, 208, 0.05), + transparent 68% + ); +} +.dark #chat-root { + background-image: radial-gradient( + 1100px 560px at 50% -10%, + rgba(120, 134, 255, 0.1), + transparent 64% + ); +} + +/* High-impact entrance motion (used for the empty state + message reveals). 
*/ +@keyframes da-fade-up { + from { + opacity: 0; + transform: translateY(10px); + } + to { + opacity: 1; + transform: translateY(0); + } +} +.da-fade-up { + animation: da-fade-up 0.55s cubic-bezier(0.2, 0.7, 0.2, 1) both; +} +@media (prefers-reduced-motion: reduce) { + .da-fade-up { + animation: none; + } +} + +/* A tight glow hugging ONLY the chat box on the empty landing — a small, + * wide-but-short ellipse centered on the input (≈50% x / 51% y of the chat + * area), fading quickly to the plain dark page background so everything else + * (the heading, the corners, top and bottom) stays black. Pure CSS, no image + * asset. The ::before layer breathes slowly for a subtle sense of life. */ +.chat-landing-bg { + background: radial-gradient( + 36% 18% at 50% 51%, + rgba(99, 102, 241, 0.12), + transparent 76% + ); + animation: da-fade-up 0.7s ease both; +} +.dark .chat-landing-bg { + background: radial-gradient( + 36% 17% at 50% 51%, + rgba(110, 114, 250, 0.38) 0%, + rgba(76, 91, 232, 0.12) 52%, + transparent 78% + ); + animation: da-fade-up 0.7s ease both; +} +/* Slow breathing halo — gentle scale + opacity drift, centered on the box. */ +.dark .chat-landing-bg::before { + content: ""; + position: absolute; + inset: 0; + pointer-events: none; + transform-origin: 50% 51%; + background: radial-gradient( + 28% 13% at 50% 51%, + rgba(132, 121, 255, 0.22), + transparent 74% + ); + animation: da-glow-breathe 16s ease-in-out infinite alternate; +} +@keyframes da-glow-breathe { + from { + opacity: 0.5; + transform: scale(0.96); + } + to { + opacity: 0.9; + transform: scale(1.06); + } +} +@media (prefers-reduced-motion: reduce) { + .chat-landing-bg { + animation: none; + } + .dark .chat-landing-bg::before { + animation: none; + } +} + +/* Tailwind Typography (`prose`) hard-codes a near-black color for bold text + * and headings. The chat renders markdown on a DARK surface but only forces the + * light token onto

<p>, so <strong>/<b>/headings (e.g. the bold lead-in labels + * of bullet points) came out dark-on-dark and nearly invisible. In dark mode, + * let them inherit the surrounding (light) text color instead. Scoped to + * `.dark` so light mode keeps the default dark bold. */ +.dark .prose :is(strong, b, h1, h2, h3, h4, h5, h6) { + color: inherit; +} + +/* Brand the native checkbox/radio accent so the checked state matches the app + * accent in both themes (pairs with color-scheme above for the box itself). */ +input[type="checkbox"], +input[type="radio"] { + accent-color: #6671d0; +} + @layer utilities { /* Hide scrollbar for Chrome, Safari and Opera */ .no-scrollbar::-webkit-scrollbar { @@ -46,6 +187,21 @@ height: 8px; /* Horizontal scrollbar height */ } +/* Dark mode: the custom scrollbar colors above are tuned for light mode and + render as a bright bar in dark mode — it looked like a thick white border + between the chat and the document panel (the chat scroll container's track + sits right on that boundary). Use a transparent track + a subtle dark thumb + so scrollbars stay usable without painting a light edge.
*/ +.dark ::-webkit-scrollbar-track { + background: transparent; +} +.dark ::-webkit-scrollbar-thumb { + background: #353b48; /* border-medium — subtle, visible on hover/scroll */ +} +.dark ::-webkit-scrollbar-thumb:hover { + background: #4a5160; /* border-strong */ +} + /* Used to create alternatie to React Markdown */ .preserve-lines { white-space: pre-wrap; /* Preserves whitespace and wraps text */ diff --git a/web/src/app/layout.tsx b/web/src/app/layout.tsx index 18b42b01d35..08a31a15716 100644 --- a/web/src/app/layout.tsx +++ b/web/src/app/layout.tsx @@ -1,15 +1,17 @@ import "./globals.css"; -import { Inter } from "next/font/google"; +import { IBM_Plex_Sans } from "next/font/google"; import { getCombinedSettings } from "@/components/settings/lib"; import { CUSTOM_ANALYTICS_ENABLED } from "@/lib/constants"; import { SettingsProvider } from "@/components/settings/SettingsProvider"; import { Metadata } from "next"; import { buildClientUrl } from "@/lib/utilsSS"; -const inter = Inter({ +// Body / UI: IBM Plex Sans — a refined, characterful humanist sans (not Inter). +const plexSans = IBM_Plex_Sans({ subsets: ["latin"], - variable: "--font-inter", + weight: ["400", "500", "600", "700"], + variable: "--font-sans", }); export async function generateMetadata(): Promise { @@ -39,22 +41,29 @@ export default async function RootLayout({ const combinedSettings = await getCombinedSettings({}); return ( - - {CUSTOM_ANALYTICS_ENABLED && combinedSettings.customAnalyticsScript && ( - + + + {/* Dark is the server-rendered default (className="dark" above), so it + survives React hydration. This pre-paint script only OPTS OUT for an + explicit "light" choice — removing the class before paint (no flash). + (Previously the script *added* dark to a class-less , which + React then clobbered on hydration, leaving prod stuck in light.) */} +

Name