AllenNeuralDynamics · arjunsridhar12345 · Jun 25, 2026 · Jun 12, 2026
diff --git a/.github/workflows/tag_and_publish.yml b/.github/workflows/tag_and_publish.yml
@@ -24,7 +24,7 @@ jobs:
       with:
         python-version: '3.10'
     - name: Install dependencies
-      run: uv sync
+      run: uv sync --extra full
     - name: Get Python version and Update README.md
       run: |
         python_version=$(grep "requires-python" pyproject.toml | grep -o ">=[^\"]*")

diff --git a/.github/workflows/test_and_lint.yml b/.github/workflows/test_and_lint.yml
@@ -21,7 +21,7 @@ jobs:
       with:
         python-version: ${{ matrix.python-version }}
     - name: Install dependencies
-      run: uv sync
+      run: uv sync --extra full
     - name: Run linter checks
       run: uv run ruff check . && uv run ruff format --check . && uv run interrogate --verbose .
     - name: Run tests and coverage

diff --git a/README.md b/README.md
@@ -87,6 +87,18 @@ This project uses [uv](https://docs.astral.sh/uv/). From the root directory, run
 ```bash
 uv sync
 ```
+
+> [!IMPORTANT]
+> The base install includes **only** the raw data loader and does **not** pull in
+> `aind-data-schema` or `matplotlib`. The quality-control (`qc`) module requires
+> both, so importing `dynamic_foraging_processing.qc` will fail on a base install.
+> To use the QC module, install the `qc` (or equivalent `full`) extra:
+> ```bash
+> pip install -e ".[qc]"
+> ```
+> The `full` extra is currently an alias for `qc`.
+
+The `uv sync` command above is enough
 to create the environment and install the package. To include the development
 dependencies (linting, tests, docs), run
 ```bash
@@ -101,6 +113,10 @@ uv run ruff check . && uv run ruff format --check .   # lint + format
 uv run interrogate -v .                               # docstring coverage
 uv run coverage run -m pytest && uv run coverage report
 ```
+To include the QC dependencies with uv, run
+```bash
+uv sync --extra qc
+```
 
 ## Release Status
 GitHub's tags and Release features can be used to indicate a Release status.

diff --git a/docs/qc_upgrade_plan.md b/docs/qc_upgrade_plan.md
@@ -70,6 +70,7 @@ any dataset on disk.
 | `left_lick_times` | `np.ndarray` of seconds | `B_LeftLickTime` |
 | `right_lick_times` | `np.ndarray` of seconds | `B_RightLickTime` |
 | `animal_response` | `np.ndarray` of `{0,1,2}` per trial | `B_AnimalResponseHistory` |
+| `side_bias` | `np.ndarray` per trial (right minus left, `nan` on no-response) | `B_Bias` |
 | `go_cue_times` | `np.ndarray` of seconds | `B_GoCueTimeSoundCard` |
 | `rewarded_history` | `pd.DataFrame` with `left`/`right` boolean columns | `B_RewardedHistory` |
 | `stage_positions` | `pd.DataFrame` with `x`/`y`/`z` columns per trial | `B_StagePositions` |
@@ -78,8 +79,8 @@ any dataset on disk.
 
 - `drop_frames_tag`, `frame_num`, `trigger_length` — dropped-frames check.
 - `Experimenter`, `dirty_files`, `repo_dirty_flag` — basic-configuration check.
-- `B_Bias`, `B_Bias_CI` — pre-computed side bias; recompute from
-  `animal_response` instead (rolling fraction of right vs. left choices).
+- `B_Bias_CI` — side-bias confidence interval; dropped (the bias trace plots
+  the per-trial `side_bias` column directly, with no CI band).
 
 ## 3. Metrics in the new capsule
 
@@ -88,10 +89,12 @@ Keep only what maps cleanly. All metrics get `stage=Stage.RAW` and
 
 ### Side bias (`tags={"behavior": "average side bias"}`)
 
-- Input: `animal_response: np.ndarray` (`0=left`, `1=right`, `2=ignore`).
-- Average bias = `mean(is_right) - mean(is_left)` over responded trials (or
-  the rolling form, matching the old `B_Bias`).
-- Metric: `"average side bias"`, pass when `abs(mean_bias) < 0.5`.
+- Input: `side_bias: np.ndarray` — the per-trial side bias read directly from
+  the trial table (right minus left; `nan` on no-response trials). It is *not*
+  recomputed from `animal_response`.
+- Average bias = `nanmean(side_bias)` over the session.
+- Metric: `"average side bias"`, pass when `abs(mean_bias) < 0.5`. An empty or
+  all-`nan` column yields `nan`, which fails.
 - `reference="side_bias.png"`.
 
 ### Lick intervals
@@ -118,8 +121,8 @@ All carry `reference="lick_intervals.png"`.
   (`left licks`, `right licks`, `left to right licks`, `right to left licks`,
   `all licks`); inputs are `left_lick_times` and `right_lick_times`.
 - `side_bias.png` — four-panel figure:
-  - Side bias trace (with confidence interval band) — rolling `B_Bias` /
-    `B_Bias_CI` recomputed from `animal_response`.
+  - Side bias trace — the per-trial `side_bias` column read from the trial
+    table (no confidence-interval band).
   - Lickspout position over trials — `stage_positions` (x / y1 / y2 / z,
     relative to session start, in mm).
   - Behavior event raster — `animal_response` (L/R choice, ignore),
@@ -210,3 +213,4 @@ test_suite
 | --- | --- | --- | --- |
 | 2026-06-03 | metrics | Confirmed kept QC metrics: side bias, lick intervals, and Harp/contract QA via `make_qc_runner`. Dropped checks tied to old `behavior.json` (dropped frames, basic configuration). | Meeting with Alex. |
 | 2026-06-03 | qa | Adopt contraqctor `qc.Runner` output (`make_qc_runner(dataset)`) as the source for Harp / camera / contract / DynamicForaging QA, converted into `QCMetric`s. | Meeting with Alex. |
+| 2026-06-22 | metrics, data inputs, plots | Side bias is read from the precomputed per-trial `side_bias` column (averaged via `nanmean`) instead of being recomputed from `animal_response`; dropped the `B_Bias_CI` confidence-interval band. | Reflect implemented `side_bias_result` / `plot_side_bias`. |
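
The side-bias rule described in the plan above (average the per-trial `side_bias` column while ignoring `nan`, then pass when `abs(mean_bias) < 0.5`) can be sketched as follows. This is a minimal standard-library illustration, not the implemented `side_bias_result`: the function name `average_side_bias` and the constant `BIAS_THRESHOLD` are hypothetical, and a plain filter-and-average stands in for `nanmean`.

```python
import math
from typing import Iterable, Tuple

# Pass threshold stated in the plan; the constant name is hypothetical.
BIAS_THRESHOLD = 0.5


def average_side_bias(side_bias: Iterable[float]) -> Tuple[float, bool]:
    """Return (mean_bias, passed) for a per-trial side-bias sequence.

    Hypothetical helper: nan entries (no-response trials) are skipped,
    mirroring what a nan-aware mean would do.
    """
    valid = [float(b) for b in side_bias if not math.isnan(b)]
    # An empty or all-nan input has no valid trials, so the mean is nan.
    mean_bias = sum(valid) / len(valid) if valid else math.nan
    # Any comparison with nan is False, so a nan mean fails the check.
    passed = abs(mean_bias) < BIAS_THRESHOLD
    return mean_bias, passed


mixed_mean, mixed_passed = average_side_bias([0.2, math.nan, -0.1, 0.5])
empty_mean, empty_passed = average_side_bias([math.nan, math.nan])
```

Because `abs(nan) < 0.5` evaluates to `False`, no special case is needed for the "all-`nan` fails" behavior; it follows directly from the comparison.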
diff --git a/examples/qc_example.ipynb b/examples/qc_example.ipynb
diff --git a/pyproject.toml b/pyproject.toml
@@ -21,6 +21,15 @@ dependencies = [
     "ipykernel",
 ]
 
+[project.optional-dependencies]
+qc = [
+    "aind-data-schema>=2.4.1",
+    "matplotlib",
+]
+full = [
+    "dynamic-foraging-processing[qc]",
+]
+
 [dependency-groups]
 dev = [
     'ruff',
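
Since the `qc` extra is optional, a caller may want to check whether its dependencies are present before importing `dynamic_foraging_processing.qc`. The sketch below is one way to do that with the standard library; the helper name `missing_modules` and the tuple `QC_MODULES` are hypothetical and not part of this package. The import names are taken from the extra's contents (`aind-data-schema` is imported as `aind_data_schema`).

```python
import importlib.util
from typing import List, Sequence

# Top-level import names for the packages listed in the `qc` extra.
QC_MODULES = ("aind_data_schema", "matplotlib")


def missing_modules(names: Sequence[str] = QC_MODULES) -> List[str]:
    """Return the top-level module names that cannot be found.

    Hypothetical helper: find_spec returns None for a top-level module
    that is not installed, without importing the module itself.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print(f"QC extra not installed; missing: {', '.join(missing)}")
    else:
        print("QC dependencies available")
```

If anything is reported missing, installing the extra with `uv sync --extra qc` (or `--extra full`) should resolve it.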

diff --git a/src/dynamic_foraging_processing/qc/__init__.py b/src/dynamic_foraging_processing/qc/__init__.py
@@ -0,0 +1,65 @@
+"""Quality control for dynamic foraging datasets.
+
+Builds an ``aind_data_schema`` ``QualityControl`` object from primitive behavior
+data (lick times, per-trial choices) plus the contraqctor-based contract QA.
+
+The module is organized by QC stage:
+
+- :mod:`~dynamic_foraging_processing.qc._core` -- the shared stage interface,
+  schema helpers, per-check result type, and ``QualityControl`` assembler.
+- :mod:`~dynamic_foraging_processing.qc.raw` -- the raw-data (contract QA) stage.
+- :mod:`~dynamic_foraging_processing.qc.processed` -- the processed-data
+  (behavior metrics) stage.
+"""
+
+from dynamic_foraging_processing.qc._core import (
+    DEFAULT_GROUPING,
+    STATUS_CONVERTER,
+    BaseQC,
+    QCResult,
+    bool_to_status,
+    build_quality_control,
+    make_metric,
+    now_seattle,
+    now_utc,
+    to_builtin,
+    to_metrics,
+)
+from dynamic_foraging_processing.qc.processed import (
+    ProcessedQC,
+    behavior_qc_results,
+    calculate_lick_intervals,
+    lick_interval_results,
+    plot_lick_intervals,
+    plot_side_bias,
+    side_bias_result,
+)
+from dynamic_foraging_processing.qc.raw import (
+    RawQC,
+    contract_qc_metrics,
+    results_to_metrics,
+)
+
+__all__ = [
+    "DEFAULT_GROUPING",
+    "STATUS_CONVERTER",
+    "BaseQC",
+    "ProcessedQC",
+    "QCResult",
+    "RawQC",
+    "behavior_qc_results",
+    "bool_to_status",
+    "build_quality_control",
+    "calculate_lick_intervals",
+    "contract_qc_metrics",
+    "lick_interval_results",
+    "make_metric",
+    "now_seattle",
+    "now_utc",
+    "plot_lick_intervals",
+    "plot_side_bias",
+    "results_to_metrics",
+    "side_bias_result",
+    "to_builtin",
+    "to_metrics",
+]
diff --git a/src/dynamic_foraging_processing/qc/_core/__init__.py b/src/dynamic_foraging_processing/qc/_core/__init__.py
@@ -0,0 +1,34 @@
+"""Shared QC infrastructure: the stage interface, schema helpers, the per-check
+result type, and the ``QualityControl`` assembler.
+
+These pieces are stage-agnostic; the raw and processed stages build on them.
+"""
+
+from dynamic_foraging_processing.qc._core.base import BaseQC
+from dynamic_foraging_processing.qc._core.builder import (
+    DEFAULT_GROUPING,
+    build_quality_control,
+)
+from dynamic_foraging_processing.qc._core.result import QCResult, to_metrics
+from dynamic_foraging_processing.qc._core.schema import (
+    STATUS_CONVERTER,
+    bool_to_status,
+    make_metric,
+    now_seattle,
+    now_utc,
+    to_builtin,
+)
+
+__all__ = [
+    "DEFAULT_GROUPING",
+    "STATUS_CONVERTER",
+    "BaseQC",
+    "QCResult",
+    "bool_to_status",
+    "build_quality_control",
+    "make_metric",
+    "now_seattle",
+    "now_utc",
+    "to_builtin",
+    "to_metrics",
+]
diff --git a/src/dynamic_foraging_processing/qc/_core/base.py b/src/dynamic_foraging_processing/qc/_core/base.py
@@ -0,0 +1,33 @@
+"""Shared interface for the dynamic foraging QC stages.
+
+``BaseQC`` is an optional base class so the QC stages (raw, processed) share a
+single ``run`` interface. Each concrete stage computes its checks and returns
+them as schema ``QCMetric`` objects, which a caller assembles into one
+``QualityControl`` (see :func:`build_quality_control`).
+"""
+
+import abc
+import typing as t
+
+from aind_data_schema.core.quality_control import QCMetric
+
+
+class BaseQC(abc.ABC):
+    """Common interface for a QC stage.
+
+    A QC stage takes some slice of a session's data and produces a flat list of
+    ``QCMetric`` objects. Subclasses define what slice ``run`` consumes (raw
+    acquisition data, processed tables, ...); the return type is shared so the
+    metrics can be collected uniformly.
+    """
+
+    @abc.abstractmethod
+    def run(self, *args: t.Any, **kwargs: t.Any) -> t.List[QCMetric]:
+        """Run this stage's checks and return them as metrics.
+
+        Returns
+        -------
+        list of QCMetric
+            One metric per check, ready to assemble into a ``QualityControl``.
+        """
+        raise NotImplementedError
diff --git a/src/dynamic_foraging_processing/qc/_core/builder.py b/src/dynamic_foraging_processing/qc/_core/builder.py
@@ -0,0 +1,50 @@
+"""Assemble the dynamic foraging ``QualityControl`` object.
+
+Collects a flat list of metrics (behavior + contract QA) into a single
+``QualityControl``, wiring up ``default_grouping`` so the QC portal lays out
+``behavior`` and ``test_suite`` as sibling top-level groups.
+"""
+
+import typing as t
+
+from aind_data_schema.core.quality_control import QCMetric, QualityControl
+
+#: Tag keys laid out as siblings at the top level of the QC portal.
+DEFAULT_GROUPING = ["behavior", "test_suite"]
+
+
+def build_quality_control(
+    metrics: t.List[QCMetric],
+    *,
+    default_grouping: t.Optional[t.List[str]] = None,
+    allow_tag_failures: t.Optional[t.List[str]] = None,
+    key_experimenters: t.Optional[t.List[str]] = None,
+    notes: t.Optional[str] = None,
+) -> QualityControl:
+    """Wrap a flat list of metrics into a ``QualityControl`` object.
+
+    Parameters
+    ----------
+    metrics : list of QCMetric
+        All metrics (behavior + contract QA).
+    default_grouping : list of str, optional
+        Tag keys the portal groups by. Defaults to ``["behavior", "test_suite"]``.
+    allow_tag_failures : list of str, optional
+        Tag values whose metric failures should not fail the overall QC.
+    key_experimenters : list of str, optional
+        Experimenters associated with the session.
+    notes : str, optional
+        Free-text notes.
+
+    Returns
+    -------
+    QualityControl
+        The assembled quality-control object.
+    """
+    return QualityControl(
+        metrics=metrics,
+        default_grouping=default_grouping if default_grouping is not None else DEFAULT_GROUPING,
+        allow_tag_failures=allow_tag_failures if allow_tag_failures is not None else [],
+        key_experimenters=key_experimenters,
+        notes=notes,
+    )
diff --git a/src/dynamic_foraging_processing/qc/_core/result.py b/src/dynamic_foraging_processing/qc/_core/result.py
@@ -0,0 +1,75 @@
+"""The per-check ``QCResult`` and its conversion to a schema ``QCMetric``.
+
+A ``QCResult`` is the raw outcome of one behavior QC check (a name, value, and
+pass/fail). It converts into an ``aind_data_schema`` ``QCMetric`` via
+``to_metric``; the assembly step collects those metrics into a
+``QualityControl``.
+"""
+
+import dataclasses
+import typing as t
+
+from aind_data_schema.core.quality_control import QCMetric
+
+from dynamic_foraging_processing.qc._core.schema import bool_to_status, make_metric
+
+
+@dataclasses.dataclass(frozen=True)
+class QCResult:
+    """The raw outcome of a single QC check.
+
+    Attributes
+    ----------
+    name : str
+        Metric name.
+    value : Any
+        The computed value.
+    passed : bool
+        Whether the check passed.
+    description : str, optional
+        Human-readable description.
+    reference : str, optional
+        Relative path to a supporting asset (e.g. a plot).
+    tags : dict of str to str
+        Grouping tags (e.g. ``{"behavior": name}``).
+    """
+
+    name: str
+    value: t.Any
+    passed: bool
+    description: t.Optional[str] = None
+    reference: t.Optional[str] = None
+    tags: t.Dict[str, str] = dataclasses.field(default_factory=dict)
+
+    def to_metric(self) -> QCMetric:
+        """Convert this result into a schema ``QCMetric``.
+
+        Returns
+        -------
+        QCMetric
+            A metric carrying this result's value, pass/fail status, and tags.
+        """
+        return make_metric(
+            name=self.name,
+            value=self.value,
+            status=bool_to_status(self.passed),
+            description=self.description,
+            reference=self.reference,
+            tags=self.tags,
+        )
+
+
+def to_metrics(results: t.Sequence[QCResult]) -> t.List[QCMetric]:
+    """Convert a sequence of ``QCResult`` into schema ``QCMetric`` objects.
+
+    Parameters
+    ----------
+    results : sequence of QCResult
+        The per-check results to convert.
+
+    Returns
+    -------
+    list of QCMetric
+        One metric per result.
+    """
+    return [result.to_metric() for result in results]
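
Two properties of the `QCResult` dataclass above are worth seeing in isolation: `frozen=True` blocks attribute reassignment, and `default_factory=dict` gives every instance its own `tags` dict. The sketch below uses a stand-in class, `ResultSketch`, which is hypothetical and only copies the field layout so that it runs without `aind_data_schema`. The example values are illustrative, not real session results.

```python
import dataclasses
import typing as t


@dataclasses.dataclass(frozen=True)
class ResultSketch:
    """Stand-in with the same fields as QCResult (no schema dependency)."""

    name: str
    value: t.Any
    passed: bool
    description: t.Optional[str] = None
    reference: t.Optional[str] = None
    tags: t.Dict[str, str] = dataclasses.field(default_factory=dict)


first = ResultSketch(name="average side bias", value=0.2, passed=True)
second = ResultSketch(name="left licks", value=0.1, passed=False)

# frozen=True stops rebinding a field, but the dict it holds is still
# mutable; each instance has its own dict thanks to default_factory.
first.tags["behavior"] = "average side bias"
tags_are_independent = second.tags == {}

try:
    first.passed = False  # type: ignore[misc]
    rebinding_blocked = False
except dataclasses.FrozenInstanceError:
    rebinding_blocked = True
```

The practical consequence is that a result's name, value, and pass/fail outcome cannot be changed after construction, while its tags can still be edited in place, so callers that need fully immutable results should avoid mutating `tags`.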