208 changes: 208 additions & 0 deletions medcat-trainer/docs/plugins.md
@@ -0,0 +1,208 @@
# Plugins & Extensions

MedCATtrainer can be extended with **enterprise / third-party plugins**. A plugin
is an ordinary Python package that is discovered at startup via the
`mct.plugins` [entry-point group](https://packaging.python.org/en/latest/specifications/entry-points/)
and installed as a Django app. Plugins can register backend hooks/signals and
contribute frontend menu items, routes, and UI slots.

This page describes **how plugins work** and, importantly, the **security/trust
model** you must understand before installing any plugin.

## Security & trust model

!!! danger "A plugin runs as fully-trusted, in-process application code"
    There is **no sandbox**. An installed `mct.plugins` package executes at
    import time and in `AppConfig.ready()` with the same privileges as
    MedCATtrainer itself. A plugin can read and write the entire database, read
    the filesystem and environment (including secrets), receive clinical
    document/annotation data via signals, register permission hooks that grant
    project-admin access, and expose new API endpoints.

    **Treat plugins like kernel modules: only install packages you trust and
    have reviewed.**

### What a plugin can do

| Capability | How |
| --- | --- |
| Run arbitrary code at startup | Module import + `AppConfig.ready()` |
| Read/write all application data | Django ORM access |
| Receive document/annotation/user data | Subscribing to `api.extensions` signals |
| Grant project-admin to users | `register_permission_hook('is_project_admin', ...)` |
| Add backend API endpoints | A plugin `urls.py` mounted at `/api/ee/<app_label>/` |
| Add frontend menu items / routes / UI | `register_menu_extension` / `register_route` / `PluginSlot` |

### What the scaffold guarantees (and does not)

The core app does not provide a security boundary against plugins, but it does provide:

- **Grant-only permission hooks.** A hook returning `True` *grants* a
permission; `None`/`False` abstains. Hooks **cannot revoke** access the OSS
code already grants, so a plugin cannot lock legitimate users out.
- **URL validation.** `route`/`href`/`path` values registered for the frontend
bootstrap are validated to be relative paths or `http(s)` URLs. Dangerous
schemes (`javascript:`, `data:`, `vbscript:`, `file:` …) and
protocol-relative (`//host`) values are rejected, so a plugin cannot inject a
script-executing link into an authenticated user's browser.
- **Signal isolation.** Core signals are emitted with
`api.extensions.dispatch()` (backed by `Signal.send_robust`). An exception in
a plugin's receiver is logged and ignored — it cannot break document
submission, annotation persistence, or OIDC login.

None of the above protects against a plugin that is *deliberately* malicious —
it already runs in-process. These measures only reduce the blast radius of
careless or buggy plugins.

### Operator guidance

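- Only install plugins from sources you trust; review the code and pin exact
versions and hashes (list `pkg==x.y.z --hash=sha256:<digest>` in a requirements
file and install with `pip install --require-hashes -r requirements.txt`; pip
cannot take hashes for a bare `pkg==x.y.z` on the command line).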
- Prefer building a dedicated image per deployment with a known, vetted set of
plugins rather than installing plugins into a shared/long-lived environment.
- Be aware that plugins receiving annotation/document signals or the
`user_oidc_resolved` `id_token` have access to potentially sensitive (PHI)
data; ensure plugin authors handle it accordingly.

## How discovery works

1. A plugin package declares an entry point:

```toml
# plugin's pyproject.toml
[project.entry-points."mct.plugins"]
my_plugin = "my_plugin.apps.MyPluginConfig"
```

2. The `AppConfig` opts in with `is_mct_plugin = True`:

```python
# my_plugin/apps.py
from django.apps import AppConfig

class MyPluginConfig(AppConfig):
    name = "my_plugin"
    is_mct_plugin = True

    def ready(self):
        # Register hooks/signals here.
        from my_plugin import hooks  # noqa: F401
```

3. At startup `core.plugin_discovery` imports the `AppConfig`, verifies it is an
`AppConfig` subclass with `is_mct_plugin = True`, and appends it to
`INSTALLED_APPS`. If the plugin ships a `urls.py`, it is mounted at
`/api/ee/<app_label>/`.
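
The three steps above can be sketched roughly as follows. This is not the code in `core.plugin_discovery`; it is a minimal illustration that assumes Python 3.10+ (for `entry_points(group=...)`), uses a stand-in `AppConfig` class so it runs without Django, and assumes that entry points failing the check are skipped rather than treated as fatal. `discover_plugins` and `FakeEntryPoint` are hypothetical names.

```python
from importlib.metadata import entry_points


class AppConfig:
    """Stand-in for django.apps.AppConfig so the sketch runs without Django."""


def discover_plugins(eps=None, group="mct.plugins"):
    """Return the entry-point values of AppConfigs that opt in as plugins."""
    if eps is None:
        # Keyword selection by group needs Python 3.10 or newer.
        eps = entry_points(group=group)
    accepted = []
    for ep in eps:
        obj = ep.load()  # imports the module, so plugin code already runs here
        is_config = isinstance(obj, type) and issubclass(obj, AppConfig)
        # Assumption: non-matching entry points are skipped, not fatal.
        if is_config and getattr(obj, "is_mct_plugin", False) is True:
            accepted.append(ep.value)
    return accepted


# Hypothetical stand-in for an installed entry point, for demonstration only.
class FakeEntryPoint:
    def __init__(self, value, obj):
        self.value = value
        self._obj = obj

    def load(self):
        return self._obj


class MyPluginConfig(AppConfig):
    is_mct_plugin = True


class NotAPlugin(AppConfig):
    pass


found = discover_plugins([
    FakeEntryPoint("my_plugin.apps.MyPluginConfig", MyPluginConfig),
    FakeEntryPoint("other.apps.NotAPlugin", NotAPlugin),
    FakeEntryPoint("junk.module.value", object()),
])
print(found)
```

Note that `ep.load()` imports the plugin module before any check is made, which is why the opt-in flag is a sanity check and not a security control.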

## Backend extension points

All backend extension points live in `api.extensions`. The module shape is
contract-tested, so these signatures are stable.

### Signals

Emitted by the core app; plugins connect receivers (in `AppConfig.ready()`):

| Signal | When | kwargs |
| --- | --- | --- |
| `pre_document_submit` / `post_document_submit` | around document submit | `project`, `document`, `user` |
| `annotation_created` / `annotation_updated` / `annotation_deleted` | annotation row change | `annotation`, `project`, `document`, (`user`) |
| `project_group_created` / `project_group_updated` | project group change | `project_group` |
| `user_oidc_resolved` | after OIDC user resolution | `user`, `id_token`, `created` |

Receivers should be cheap and must not assume they can block the core flow —
exceptions are logged and swallowed.
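
To illustrate what "logged and swallowed" means for a receiver, here is a pure-Python stand-in for the robust-dispatch pattern. It is not Django's `Signal.send_robust` and not the real `api.extensions.dispatch`; the function names are reused only for readability, the receiver list is passed explicitly, and the failure count returned here is an addition for the demonstration.

```python
import logging

logger = logging.getLogger("mct.plugins.sketch")


def send_robust(receivers, **kwargs):
    """Call every receiver; capture exceptions instead of raising them."""
    results = []
    for receiver in receivers:
        try:
            results.append((receiver, receiver(**kwargs)))
        except Exception as exc:
            results.append((receiver, exc))
    return results


def dispatch(receivers, **kwargs):
    """Log receiver failures and carry on, so the caller never sees them."""
    failures = 0
    for receiver, response in send_robust(receivers, **kwargs):
        if isinstance(response, Exception):
            failures += 1
            logger.error("receiver %r raised %s", receiver, response)
    return failures


seen = []


def audit_receiver(**kwargs):
    # A cheap receiver: record the document and return immediately.
    seen.append(kwargs["document"])


def broken_receiver(**kwargs):
    raise RuntimeError("plugin bug")


failed = dispatch([broken_receiver, audit_receiver], document="doc-1")
```

The broken receiver does not prevent the later receiver from running, and the caller gets no exception. The flip side is that a receiver cannot veto an operation by raising.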

### Permission hooks

```python
from api.extensions import register_permission_hook

def grant_from_oidc_group(user, project):
    # Return True to grant; None/False to abstain. Cannot deny.
    # user_in_admin_group is a helper the plugin is assumed to define (not shown).
    if user_in_admin_group(user):
        return True
    return None

register_permission_hook("is_project_admin", grant_from_oidc_group)
```
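
The grant-only rule can be pictured as an OR over the core decision and every hook result. The sketch below is an assumption about how such a combination might look, not the actual `is_project_admin` implementation; `is_granted` and its `core_allows` argument are hypothetical, and whether the core compares with `is True` or plain truthiness is not stated in this document.

```python
def is_granted(user, project, core_allows, hooks):
    """Combine the core decision with grant-only plugin hooks."""
    if core_allows:
        # Hooks are never consulted to revoke what the core already grants.
        return True
    for hook in hooks:
        # Only an explicit True grants; None and False both abstain.
        if hook(user, project) is True:
            return True
    return False


def always_abstain(user, project):
    return None


def always_false(user, project):
    return False


def always_grant(user, project):
    return True
```

Because abstaining and returning `False` are treated identically, a hook has no way to express "deny".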

### Frontend bootstrap registries

```python
from api.extensions import register_feature, register_menu_extension, register_route

register_feature("adjudication")
register_menu_extension({"id": "adj", "label": "Adjudication", "route": "/ee/adj"})
register_route({"path": "/ee/adj", "component": "Adjudication"})
```

`route`/`href`/`path` must be relative paths or `http(s)` URLs; other schemes
raise `ValueError` at registration time.
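
The following standalone sketch mirrors the validation logic that this PR adds as `_validate_safe_url` in `api/extensions.py`, with the leading underscores dropped and the error messages shortened. `is_accepted` is a hypothetical helper added only to make the accepted and rejected cases easy to try.

```python
import re

ALLOWED_URL_SCHEMES = ("http", "https")
URL_SCHEME_RE = re.compile(r"^([a-zA-Z][a-zA-Z0-9+.\-]*):")
# Control characters and whitespace are stripped before the scheme is checked.
URL_STRIP_RE = re.compile(r"[\x00-\x20\x7f]")


def validate_safe_url(value, *, field):
    """Raise TypeError/ValueError for values that must not reach the frontend."""
    if not isinstance(value, str):
        raise TypeError(f"{field} must be a string")
    cleaned = URL_STRIP_RE.sub("", value)
    if cleaned == "":
        raise ValueError(f"{field} must not be empty")
    if cleaned.startswith("//"):
        raise ValueError(f"{field} must not be a protocol-relative URL")
    match = URL_SCHEME_RE.match(cleaned)
    if match is None:
        return  # no scheme: relative path, fragment, or query
    if match.group(1).lower() not in ALLOWED_URL_SCHEMES:
        raise ValueError(f"{field} has a disallowed URL scheme")


def is_accepted(value):
    """Hypothetical convenience wrapper: True if the value passes validation."""
    try:
        validate_safe_url(value, field="route")
    except (TypeError, ValueError):
        return False
    return True
```

Note that non-string values raise `TypeError`, not `ValueError`, so a caller that wants to handle every rejection needs to catch both.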

## Backend API endpoints — required auth pattern

Plugin URLs are mounted on the **same origin** as the core app under
`/api/ee/<app_label>/`. The scaffold does **not** add authentication for you —
**every plugin view must enforce its own authentication and authorisation.**

Use DRF's `IsAuthenticated` at minimum, and reuse `api.permissions.is_project_admin`
for project-scoped operations:

```python
# my_plugin/urls.py
from django.urls import path
from . import views

urlpatterns = [
path("adjudication/<int:project_id>/", views.adjudication_summary),
]
```

```python
# my_plugin/views.py
from rest_framework.decorators import api_view, permission_classes
from rest_framework import permissions
from rest_framework.response import Response

from api.models import ProjectAnnotateEntities
from api.permissions import is_project_admin


@api_view(["GET"])
@permission_classes([permissions.IsAuthenticated]) # REQUIRED: no anonymous access
def adjudication_summary(request, project_id):
    try:
        project = ProjectAnnotateEntities.objects.get(id=project_id)
    except ProjectAnnotateEntities.DoesNotExist:
        return Response({"error": "Project not found"}, status=404)

    # REQUIRED for project-scoped data: enforce project-admin access.
    if not is_project_admin(request.user, project):
        return Response({"error": "Forbidden"}, status=403)

    return Response({"project": project.id, "summary": "..."})
```

!!! warning "Do not expose unauthenticated endpoints"
    Because plugin routes share the trainer's origin and session/token context,
    an endpoint that omits `permission_classes([IsAuthenticated])` is reachable
    by anyone who can reach the trainer. Always gate views explicitly, and add
    project-level checks with `is_project_admin` for any project-scoped data.

## Frontend extension points

- **Menu items** registered via `register_menu_extension` appear in the top nav.
- **Routes** are added either at build time (a bundled Vue plugin calling
`registerPlugin({ routes: [...] })`) or described via `register_route` for the
bootstrap payload.
- **UI slots** let a build-time plugin inject components at named slots, e.g.
`home:after-projects`, `project-admin:tabs`, `train-annotations:sidebar`:

```ts
import { registerPlugin } from "@/plugins/registry";
registerPlugin({ slots: { "home:after-projects": MyWidget } });
```

Build-time frontend plugins are bundled into the SPA and are therefore also
fully trusted; the same "only ship code you trust" rule applies.
1 change: 1 addition & 0 deletions medcat-trainer/mkdocs.yml
@@ -47,6 +47,7 @@ nav:
- Advanced usage: advanced_usage.md
- Maintenance: maintenance.md
- Provisioning: provisioning.md
- Plugins & extensions: plugins.md
- Client API: client.md

plugins:
103 changes: 103 additions & 0 deletions medcat-trainer/webapp/api/api/extensions.py
@@ -13,15 +13,37 @@
The shape of this module is contract-tested by ``test_extensions.py`` and the
``contract-tests`` CI job; changes that break the documented shape are
breaking changes.

Security / trust model
----------------------
An installed ``mct.plugins`` package runs as **first-class Django application
code**: it executes at import time and in ``AppConfig.ready()`` with full access
to the database, filesystem, environment, and request/session data. There is no
sandbox. Treat plugins like kernel modules — only install packages you trust and
have vetted. See ``docs/plugins.md`` for the full trust model and secure
plugin-authoring guidance.

The registry helpers below apply light input validation (URL-scheme validation on
menu/route entries) and signal emission is isolated via :func:`dispatch` so that
a plugin receiver cannot break core request flows. These are
validation measures, **not** a security boundary against malicious code
running in-process.
"""
from __future__ import annotations


import copy

import logging
import re

from collections.abc import Callable, Iterable
from typing import Any, Optional

from django.dispatch import Signal

logger = logging.getLogger(__name__)

# ---------------------------------------------------------------------------
# Signals
# ---------------------------------------------------------------------------
@@ -67,6 +89,74 @@
user_oidc_resolved = Signal()


# ---------------------------------------------------------------------------
# Signal dispatch (plugin-isolating)
# ---------------------------------------------------------------------------

def dispatch(signal: Signal, **kwargs: Any) -> None:
    """Emit a plugin-facing signal without letting receivers break core flows.

    Core OSS code only ever *emits* the signals in this module; all receivers
    are third-party plugin code. We therefore use :meth:`Signal.send_robust`,
    which isolates and returns any exception raised by a receiver instead of
    propagating it into the request path. Receiver failures are logged and then
    ignored so a plugin cannot block document submission,
    annotation persistence, or OIDC login.
    """
    for receiver, response in signal.send_robust(**kwargs):
        if isinstance(response, Exception):
            logger.error(
                "MCT plugin signal receiver %r raised %s: %s",
                getattr(receiver, "__qualname__", receiver),
                type(response).__name__,
                response,
                exc_info=response,
            )


# ---------------------------------------------------------------------------
# URL validation for bootstrap-exposed entries
# ---------------------------------------------------------------------------
#
# Menu ``href``/``route`` and route ``path`` values are served via
# ``GET /api/bootstrap/`` and rendered into the authenticated SPA. A
# plugin must not be able to inject ``javascript:``/``data:`` (etc.)
# URLs that would execute in the browser of any logged-in user, nor
# protocol-relative ("//host") references that navigate off-origin.

_ALLOWED_URL_SCHEMES = ("http", "https")
_URL_SCHEME_RE = re.compile(r"^([a-zA-Z][a-zA-Z0-9+.\-]*):")
# Browsers ignore ASCII control chars / whitespace inside a scheme
# (e.g. "java\tscript:..."), so strip them before inspecting the scheme.
_URL_STRIP_RE = re.compile(r"[\x00-\x20\x7f]")


def _validate_safe_url(value: Any, *, field: str) -> None:
    """Reject dangerous URL values for bootstrap-exposed link fields.

    Allowed: same-document/relative references (no scheme, e.g. ``/ee/adj``,
    ``./x``, ``#frag``, ``?q=1``) and absolute ``http(s)`` URLs. Rejected:
    any other scheme (``javascript:``, ``data:``, ``vbscript:``, ``file:`` …)
    and protocol-relative ``//host`` references.
    """
    if not isinstance(value, str):
        raise TypeError(f"{field} must be a string")
    cleaned = _URL_STRIP_RE.sub("", value)
    if cleaned == "":
        raise ValueError(f"{field} must not be empty")
    if cleaned.startswith("//"):
        raise ValueError(f"{field} must not be a protocol-relative URL")
    match = _URL_SCHEME_RE.match(cleaned)
    if match is None:
        # No scheme => relative path / fragment / query. Allowed.
        return
    if match.group(1).lower() not in _ALLOWED_URL_SCHEMES:
        raise ValueError(
            f"{field} has a disallowed URL scheme '{match.group(1)}'; "
            f"only {', '.join(_ALLOWED_URL_SCHEMES)} and relative paths are permitted"
        )


# ---------------------------------------------------------------------------
# Permission hook registry
# ---------------------------------------------------------------------------
@@ -123,11 +213,19 @@ def register_menu_extension(item: dict[str, Any]) -> None:
    ``item`` MUST contain ``id`` (str) and ``label`` (str). It SHOULD
    contain ``route`` (str) or ``href`` (str). Additional keys pass through
    verbatim to the frontend.

    Any ``route``/``href`` value is validated to be a relative path or an
    ``http(s)`` URL (see :func:`_validate_safe_url`) so a plugin cannot inject
    a ``javascript:`` link that runs in an authenticated user's browser.
    """
    if not isinstance(item, dict):
        raise TypeError("menu extension item must be a dict")
    if "id" not in item or "label" not in item:
        raise ValueError("menu extension item requires 'id' and 'label'")
    if "route" in item:
        _validate_safe_url(item["route"], field="menu extension 'route'")
    if "href" in item:
        _validate_safe_url(item["href"], field="menu extension 'href'")
    _menu_extensions.append(copy.deepcopy(item))


@@ -140,11 +238,15 @@ def register_route(route: dict[str, Any]) -> None:

    ``route`` MUST contain ``path`` (str) and ``component`` (str: module
    specifier or registered component name resolved by the frontend).

    ``path`` is validated with :func:`_validate_safe_url`: relative paths and
    ``http(s)`` URLs are accepted; any other scheme and protocol-relative
    ``//host`` values are rejected.
    """
    if not isinstance(route, dict):
        raise TypeError("route must be a dict")
    if "path" not in route or "component" not in route:
        raise ValueError("route requires 'path' and 'component'")
    _validate_safe_url(route["path"], field="route 'path'")
    _plugin_routes.append(copy.deepcopy(route))


@@ -188,6 +290,7 @@ def clear_registries() -> None:
"project_group_created",
"project_group_updated",
"user_oidc_resolved",
"dispatch",
"register_permission_hook",
"get_permission_hooks",
"clear_permission_hooks",
5 changes: 3 additions & 2 deletions medcat-trainer/webapp/api/api/oidc_utils.py
@@ -1,7 +1,7 @@
from django.contrib.auth import get_user_model
import secrets

-from .extensions import user_oidc_resolved
+from .extensions import dispatch, user_oidc_resolved


def get_user_by_email(request, id_token):
@@ -45,7 +45,8 @@ def get_user_by_email(request, id_token):

     user.save()

-    user_oidc_resolved.send(
+    dispatch(
+        user_oidc_resolved,
         sender=User,
         user=user,
         id_token=id_token,